Conversational Coaching Agents: Beyond the screen

Agents Galore: Need something? There’s a chatbot for that!

The date is November, 2023.

Developers gained unprecedented access to models and architectures from OpenAI, once confined to the backend servers of sophisticated AI systems like ChatGPT. A gold rush followed: enthusiasts and professionals alike began crafting chatbots for a myriad of applications, transforming them from simple digital assistants into complex, multifaceted entities. Chatbots rapidly evolved from mere API wrappers into genuine digital companions and assistants tackling ever more nuanced roles. One could find a chatbot for virtually any need or scenario: a comforting friend, a savvy legal advisor, an analytics guru, an imaginative screenplay writer. This wave will likely reshape the freelancing industry; for a start, the number of writing tasks on Upwork has already dropped significantly.

February 2024 saw the launch of Gemini 1.5 with its enormous context window, and March 2024 brought Claude 3's comprehensive reasoning and task-completion capabilities. Transformational developments now arrive on the scale of months. When I meet my friends to share notes on model architectures and performance, we often conclude with the same thought: "We live in such fascinating times; it's Y2K all over again." But to make sense of today's advancements, which can seem overwhelming, we need to trace how we made incremental progress on the scale of decades.


Intentional Empathy: The art of designing chatbots that understand and reply, not just reply

The date is September, 1965.

Joseph Weizenbaum has just submitted his article to Communications of the ACM. "ELIZA" was written in MAD-SLIP, utilising a rule-based approach to simulate human conversation. In one of his curious recollections, his secretary was interacting with the program when she looked up to find him reading over her shoulder. To his surprise, even though she knew the system wasn't understanding what she was saying, she wanted to keep talking to it. She told him, "Would you mind leaving the room, please?".

This interesting experiment in natural language processing opened the door to the possibility of machines understanding and responding to human language. However, the true breakthrough, as Weizenbaum observed, wasn't just in the mechanics of conversation, but in the human longing to be heard and understood, a longing that even the simplest rule-based system could fulfil. We, as a species, crave to talk, and to be heard. Even if the agent, like ELIZA, is in a roundabout way making us talk to ourselves, we feel heard, just like Joseph's secretary and many of his students did.
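To make the mechanics concrete, here is a minimal sketch of the kind of keyword-and-reassembly rules ELIZA relied on. The patterns and canned responses below are invented for illustration; they are in the spirit of Weizenbaum's DOCTOR script, not his actual MAD-SLIP rules.

```python
import re

# Illustrative keyword -> response-template rules. Each rule captures part
# of the user's utterance and echoes it back, reframed as a question.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE),
     "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE),
     "How long have you been {0}?"),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE),
     "Tell me more about your {0}."),
]
DEFAULT = "Please, go on."  # fallback when no rule matches

def respond(utterance: str) -> str:
    """Return the first matching rule's template, filled with captured text."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return DEFAULT
```

The trick, as the anecdote above suggests, is that the system never understands anything: it simply reflects the speaker's own words back, and the speaker supplies the meaning.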

Tracing the lineage of chat systems through the 1970s, there was PARRY (famously known as "ELIZA with an attitude"), developed by Kenneth Colby to simulate a patient with schizophrenia. There is a fascinatingly hilarious transcript of these two systems talking to each other at ICCC 1972, when Vinton Cerf connected them via ARPANET (the precursor to the Internet). Naturally, since these systems predate the modern NLP era, the conversation ends absurdly. As we progressed through the 1980s and 1990s, the chatbot landscape blossomed with systems like SBCS and A.L.I.C.E., adding layers of complexity and hinting at the future of human-AI interaction. The 2000s marked a significant leap: chatbots burgeoned across websites as chat widgets in the bot era of the Internet. The 2010s witnessed the transition from widgets to voice assistants like Siri, Google Assistant, and Amazon Alexa, pushing chatbot technology from textual dialogues into the realm of voice and further integrating NLU into the fabric of daily life. Yet all the aforementioned systems still struggled to truly understand the user, a limitation that constrained how far LEX, Dialogflow, SiriKit, and RASA could be stretched.


Conversational Design: Why behavioural research will be central to conversation architecture

Understanding human language is an immense challenge, the linguistic equivalent of navigating the depths of the Mariana Trench. The intricacies involved in comprehending sentiments and linguistic nuances are vast and complex. AI systems, despite their advancements, often grapple with the subtleties of context, cultural references, and the multifaceted nature of human emotions. This difficulty stems from the inherent variability and richness of human communication, where meanings shift with tone, context, and even unspoken implications. This is where I lean on behavioural science and expert systems, alongside language and vision models, for JEDi.

In 1965, Joseph was limited by the IBM 7094. It recalls a moment from Marvel's Iron Man 2, where Tony Stark watches an archived video of his late father, Howard, speaking to him and teaching him through a monologue: "I'm limited by the technology of my time". I see a similar moment with the late Professor Weizenbaum talking to us, urging us to think beyond NLU and NLP. In 2024, with vector databases, with Elasticsearch, with MoE-architecture models, with recommendation engines, we are limited only by what we can imagine for this use case. As developers, we need to think beyond intelligent systems and start designing emotional ones. Passing the Turing test shouldn't be our only aspiration. In fact, sentience should be tested by a more nuanced and multifaceted test, one that evaluates social learning, tool use, creativity, cognition, and practicality. These dimensions encompass not just the ability to process and respond to language but to understand and adapt to the complexities of human behaviour and emotion.

In 2024, we need to learn behavioural science hand in hand with language and vision AI, which is exactly what we do at Fitterfly when working on JEDi. We delve into the realms of user-centric design, intertwining it with our technological expertise. For every .py there's a .fig going hand in hand. This interdisciplinary approach, supplemented with a clinical bias, allows us to create conversational agents that don't just process requests but understand the user's emotional state, providing responses that are not only relevant but also empathetic and supportive. We envision a future where conversational coaching agents like JEDi become integral to daily routines, helping us manage our health, offering support, and even providing companionship: an AI that embodies the spirit of intentional empathy. Try it out in the Fitterfly app, and let us know your thoughts!
