Contributors: Nicole Ren (GovTech), Ng Wei Cheng (GovTech)
VICA (Virtual Intelligent Chat Assistant) is GovTech's Virtual Assistant platform that leverages Artificial Intelligence (AI) to allow users to create, train and deploy chatbots on their websites. At the time of writing, VICA supports over 100 chatbots and handles over 700,000 user queries a month.
Behind the scenes, VICA's NLP engine uses a variety of technologies and frameworks, ranging from traditional intent-matching techniques to generative AI frameworks like Retrieval Augmented Generation (RAG). By keeping up to date with state-of-the-art technologies, our engine is constantly evolving, ensuring that every citizen's query gets matched to the best possible answer.
Beyond simple Question-And-Answer (Q&A) capabilities, VICA aims to supercharge chatbots through conversational transactions. Our goal is to say goodbye to the robotic, form-like experience within a chatbot, and say hello to personalised conversations with human-like assistance.
This article is the first in a two-part series sharing more about the generative AI features we have built into VICA. In this article, we will focus on how LLM agents can help improve the transaction process in chatbots using LangChain's Agent Framework.
- Introduction
- All about LangChain
- LangChain in production
- Challenges of productionizing LangChain
- Use cases of LLM Agents
- Conclusion
- Find out more about VICA
- Acknowledgements
- References
Transaction-based chatbots are conversational agents designed to facilitate and execute specific transactions for users. These chatbots go beyond simple Q&A interactions by allowing users to perform tasks such as booking, purchasing, or form submission directly within the chatbot interface.
In order to perform transactions, chatbots have to be customized on the backend to handle additional user flows and make API calls.
The rise of Large Language Models (LLMs) has opened new avenues for simplifying and enhancing the development of these features for chatbots. LLMs can greatly improve a chatbot's ability to comprehend and respond to a wide range of queries, helping to manage complex transactions more effectively.
Though intent-matching chatbot systems already exist to guide users through predefined transaction flows, LLMs offer significant advantages by maintaining context over multi-turn interactions and handling a wide range of inputs and language variations. Previously, interactions often felt awkward and stilted, as users were required to select options from premade cards or type specific phrases in order to trigger a transaction flow. For example, a slight variation from "Can I make a payment?" to "Let me pay, please" could prevent the transaction flow from triggering. In contrast, LLMs can adapt to various communication styles, allowing them to interpret user input that doesn't fit neatly into predefined intents.
Recognizing this potential, our team decided to leverage LLMs for transaction processing, enabling users to enter transaction flows more naturally and flexibly by breaking down and understanding their intentions. Given that LangChain offers a framework for implementing agentic workflows, we chose to use its agent framework to create an intelligent system to process transactions.
In this article, we will also share two use cases we developed that utilize LLM Agents, namely the Department of Statistics (DOS) Statistical Table Builder and the Natural Conversation Facility Booking chatbot.
Before we cover how we made use of LLM Agents to perform transactions, we will first share what LangChain is and why we opted to experiment with this framework.
What’s LangChain?
LangChain is an open-source Python framework designed to help builders in constructing AI powered functions leveraging LLMs.
Why use LangChain?
The framework helps to simplify the event course of by offering abstractions and templates that allow fast software constructing, saving time and decreasing the necessity for our growth staff to code all the things from scratch. This permits for us to concentrate on higher-level performance and enterprise logic somewhat than low-level coding particulars. An instance of that is how LangChain helps to streamline third get together integration with well-liked service suppliers like MongoDB, OpenAI, and AWS, facilitating faster prototyping and decreasing the complexity of integrating varied companies. These abstractions not solely speed up growth but additionally enhance collaboration by offering a constant construction, permitting our staff to effectively construct, check, and deploy AI functions.
What’s LangChain’s Agent Framework?
One of many fundamental options of utilizing Langchain is their agent framework. The framework permits for administration of clever brokers that work together with LLMs and different instruments to carry out complicated duties.
The three fundamental elements of the framework are
Brokers act as a reasoning engine as they determine the suitable actions to take and the order to take these actions. They make use of an LLM to make the choices for them. An agent has an AgentExecutor that calls the agent and executes the instruments the agent chooses. It additionally takes the output of the motion and passes it to the agent till the ultimate final result is reached.
Instruments are interfaces that the agent could make use of. In an effort to create a instrument, a reputation and outline must be supplied. The outline and identify of the instrument are vital as it will likely be added into the agent immediate. Because of this the agent will determine the instrument to make use of primarily based on the identify and outline supplied.
A series check with sequences of calls. The chain will be coded out steps or only a name to an LLM or a instrument. Chains will be personalized or be used off-the-shelf primarily based on what LangChain supplies. A easy instance of a series is LLMChain, a series that run queries in opposition to LLMs.
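To make these components concrete, here is a minimal sketch of how a tool, a ReAct agent, and its AgentExecutor fit together (import paths vary across LangChain versions, and the weather tool here is a purely illustrative stub, not one of VICA's tools):

```python
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import Tool
from langchain_openai import ChatOpenAI

def get_weather(query: str) -> str:
    # Illustrative stub; a real tool would call a weather API here.
    return "31 Degrees Celsius, Sunny"

# The tool's name and description are injected into the agent prompt,
# so the agent chooses tools based on them.
weather_tool = Tool(
    name="Weather Tool",
    description="Returns the current weather for a given query.",
    func=get_weather,
)

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Pull LangChain's base ReAct prompt, build the agent (the reasoning
# engine), then wrap it in an AgentExecutor, the chain that runs the
# chosen tools and loops until a final answer is produced.
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm, [weather_tool], prompt)
executor = AgentExecutor(agent=agent, tools=[weather_tool], verbose=True)

print(executor.invoke({"input": "What's the weather today?"})["output"])
```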
How did we use LangChain in VICA?
In VICA, we set up a microservice for LangChain that is invoked through a REST API. This facilitates integration by allowing different components of VICA to communicate with LangChain independently. As a result, we can build our LLM agent efficiently without being affected by changes or development in other components of the system.
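As an illustrative sketch, the microservice pattern can be as simple as wrapping the AgentExecutor behind an HTTP endpoint. We use FastAPI below for the example; the route name and payload shape are assumptions, not VICA's actual API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Query(BaseModel):
    message: str

@app.post("/agent/invoke")
def invoke_agent(query: Query) -> dict:
    # `executor` is the AgentExecutor from the previous sketch; other
    # components only ever talk to this HTTP interface.
    result = executor.invoke({"input": query.message})
    return {"answer": result["output"]}
```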
LangChain as a framework is fairly extensive when it comes to the LLM space, covering retrieval methods, agents, and LLM evaluation. Here are the components we made use of when creating our LLM Agent.
ReAct Agent
In VICA, we made use of a single agent system. The agent uses ReAct logic to determine the sequence of actions to take (Yao et al., 2022). This prompt engineering technique helps generate the following:
- Thought (reasoning taken before choosing the action)
- Action (the action to take, often a tool)
- Action Input (the input to the action)
- Observation (the observation from the tool output)
- Final Answer (the generative final answer that the agent returns)
> Entering new AgentExecutor chain…
The user wants to know the weather today
Action: Weather Tool
Action Input: "Weather today"
Observation: Answer: "31 Degrees Celsius, Sunny"
Thought: I now know the final answer.
Final Answer: The weather today is sunny at 31 degrees Celsius.
> Finished chain.
In the example above, the agent was able to understand the user's intention before choosing the tool to use. Verbal reasoning is also generated, which helps the model plan the sequence of actions to take. If an observation is insufficient to answer the given question, the agent can cycle to a different action in order to get closer to the final answer.
In VICA, we edited the agent prompt to better suit our use case. The base prompt provided by LangChain (link here) is generally sufficient for most common use cases and serves as an effective starting point. However, it can be modified to enhance performance and ensure greater relevance to specific applications. This can be done by supplying a custom prompt as a parameter to create_react_agent (this might differ based on your version of LangChain).
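As a sketch of what such customization can look like, a custom ReAct prompt only needs to retain the variables create_react_agent expects ({tools}, {tool_names}, {input}, {agent_scratchpad}); the surrounding instructions are free to change. The wording below is illustrative, not our production prompt:

```python
from langchain.agents import create_react_agent
from langchain_core.prompts import PromptTemplate

# {tools}, {tool_names}, {input} and {agent_scratchpad} are required by
# create_react_agent; everything else can be tailored to the use case.
CUSTOM_PROMPT = PromptTemplate.from_template("""\
You are a transaction assistant for a Singapore government chatbot.
Answer the question as best you can. You have access to these tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Question: {input}
Thought: {agent_scratchpad}""")

# llm and tools as defined previously.
agent = create_react_agent(llm, tools, CUSTOM_PROMPT)
```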
To determine whether our custom prompt was an improvement, we employed an iterative prompt engineering approach: Write, Evaluate and Refine (more details here). This process ensured that the prompt generalized effectively across a broad range of test cases. Additionally, we used the base prompt provided by LangChain as a benchmark to evaluate our custom prompts, enabling us to assess their performance with varying additional context across various transaction scenarios.
Custom Tools & Chains (Prompt Chaining)
For the two custom chatbot features in this article, we built custom tools that our Agent can use to perform transactions. Our custom tools use prompt chaining to break down and understand a user's request before deciding what to do within the particular tool.
Prompt chaining is a technique where multiple prompts are used in sequence to handle complex tasks or queries. It involves starting with an initial prompt and using its output as input for subsequent prompts, allowing for iterative refinement and contextual continuity. This method enhances the handling of intricate queries, improves accuracy, and maintains coherence by progressively narrowing down the focus.
For each transaction use case, we broke the process into multiple steps, allowing us to give the LLM clearer instructions at each stage. This improves accuracy by making tasks more specific and manageable. We can also inject localized context into the prompts, which clarifies the objectives and enhances the LLM's understanding. Based on the LLM's reasoning, our custom chains then make requests to external APIs to gather the data needed to perform the transaction.
At every step of prompt chaining, it is crucial to implement error handling, as LLMs can sometimes produce hallucinations or inaccurate responses. By incorporating error handling mechanisms such as validation checks, we identified and addressed inconsistencies or errors in the outputs. This allowed us to generate fallback responses that explain to users what the LLM failed to reason about.
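Putting these ideas together, a single chain step might look like the sketch below: one narrow extraction prompt, followed by a validation check that triggers a fallback instead of passing bad output downstream. The field names, service list, and fallback message are illustrative assumptions:

```python
import json

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Step 1 of a chain: one narrow extraction prompt, with localized
# context (the list of known services) injected into it.
extract_prompt = ChatPromptTemplate.from_template(
    "Extract the transaction fields from the user message as JSON with "
    'keys "service" and "date". Known services: {services}.\n'
    "Message: {message}"
)

def run_extraction_step(message: str, services: list[str]) -> dict | None:
    raw = (extract_prompt | llm).invoke(
        {"message": message, "services": ", ".join(services)}
    ).content
    # Validation check: reject malformed or hallucinated output rather
    # than passing it to the next chain in the sequence.
    try:
        fields = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if fields.get("service") not in services:
        return None
    return fields

fields = run_extraction_step("I want to pay my parking fine tomorrow",
                             ["parking fine", "season parking"])
if fields is None:
    # Fallback response explaining what could not be understood.
    print("Sorry, I couldn't work out which service you were asking about.")
```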
Lastly, in our custom tools, we refrained from simply using the LLM-generated output as the final response because of the risk of hallucination. As a citizen-facing chatbot, it is crucial to prevent our chatbots from disseminating any misleading or inaccurate information. Therefore, we ensure that all responses to user queries are derived from actual data points retrieved through our custom chains. We then format these data points into pre-defined responses, ensuring that users do not see any direct output generated by the LLM.
Challenges of using LLMs
Challenge #1: Prompt chaining leads to slow inference time
One problem with LLMs is their inference time. LLMs have high computational demands due to their large number of parameters and the need to be called repeatedly for real-time processing, leading to relatively slow inference times (several seconds per prompt). VICA is a chatbot that gets 700,000 queries a month, so to ensure a good user experience, we aim to provide our responses as quickly as possible while ensuring accuracy.
Prompt chaining increases the consistency, controllability, and reliability of LLM outputs. However, each additional chain we incorporate significantly slows down our solution as it necessitates an extra LLM request. To balance simplicity with efficiency, we set a hard limit on the number of chains to prevent excessive wait times for users. We also opted against higher-performing but slower LLM models such as GPT-4, choosing faster yet generally well-performing LLMs instead.
Challenge #2: Hallucination
As seen in the recent incident with Google's AI Overview feature, having LLMs generate outputs can lead to inaccurate or non-factual details. Though grounding the LLM makes it more consistent and less likely to hallucinate, it does not eliminate hallucination.
As mentioned above, we used prompt chaining to perform reasoning tasks for transactions by breaking them down into smaller, easier-to-understand tasks. By chaining LLMs, we are able to extract the information needed to process complex queries. However, for the final output, we crafted non-generative messages as the final response from the reasoning tasks that the LLM performs. This means that in VICA, our users don't see generated responses from our LLM Agent.
Challenges of productionizing LangChain
Challenge #1: Too much abstraction
The main challenge with LangChain is that the framework abstracts away too many details, making it very difficult to customize applications for specific real-world use cases.
To overcome such limitations, we had to delve into the package and customize certain classes to better suit our use case. For instance, we modified the AgentExecutor class to route the ReAct agent's action input into the tool that was chosen. This gave our custom tools additional context that helped with extracting information from user queries.
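We won't reproduce our internal patch here, but as a rough, version-agnostic sketch of the idea, wrapping a tool's function lets it receive extra context (such as the raw user query) alongside the action input the agent chose, without touching LangChain internals:

```python
from langchain_core.tools import Tool

def with_context(func, context: dict):
    # Wrap a tool function so it also receives shared request context.
    def wrapped(action_input: str) -> str:
        # The agent supplies `action_input`; we pass along the raw user
        # query (and anything else in `context`) the tool may need.
        return func(action_input, user_query=context["user_query"])
    return wrapped

def booking_search(action_input: str, user_query: str) -> str:
    # Illustrative body: both strings are available for extraction.
    return f"Searched '{action_input}' (original query: '{user_query}')"

context = {"user_query": "Any badminton courts at Fengshan tomorrow 9.30am?"}
booking_tool = Tool(
    name="Booking Search",
    description="Searches for available booking slots.",
    func=with_context(booking_search, context),
)
```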
Challenge #2: Lack of documentation
The second challenge is the lack of documentation and the constantly evolving framework. This makes development difficult, as it takes time to understand how the framework works by reading the package code. There is also a lack of consistency in how things work, making it hard to pick things up as you go. And with constant updates to existing classes, a version upgrade can result in previously working code suddenly breaking.
If you are planning to use LangChain in production, our advice would be to pin your production version and test before upgrading.
Use case #1: Department of Statistics (DOS) Table Builder
When it comes to statistical data about Singapore, users can find it difficult to locate and analyze the information they are looking for. To address this, we came up with a POC that extracts and presents statistical data in a table format as a feature in our chatbot.
As DOS's API is open for public use, we made use of the API documentation provided on their website. Using the LLM's natural language understanding capabilities, we passed the API documentation into the prompt. The LLM was then tasked with picking the correct API endpoint based on the statistical data the user was asking for. This meant that users could ask for annual/half-yearly/quarterly/monthly statistics in percentage change/absolute values within a given time filter. For example, we are able to query specific information such as "GDP for Construction in 2022" or "CPI in quarter 1 for the past 3 years".
We then did further prompt chaining to break the task down even more, allowing for greater consistency in our final output. The queries were then processed to generate the statistics presented in a table. As all the information is obtained from the API, none of the numbers displayed are generated by LLMs, thus avoiding any risk of spreading non-factual information.
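As a simplified sketch of that first chain step, the prompt pairs the user's question with excerpts of the API documentation and asks the LLM to name an endpoint and its parameters. The documentation snippet, endpoint paths, and base URL below are stand-ins, not DOS's real API:

```python
import json

import requests
from langchain_core.prompts import ChatPromptTemplate

# Stand-in excerpt; the real prompt carried the relevant documentation
# from the DOS website.
API_DOCS = """\
/gdp/annual     params: industry, year                 (absolute values)
/cpi/quarterly  params: quarter, start_year, end_year  (percentage change)
"""

select_prompt = ChatPromptTemplate.from_template(
    "API documentation:\n{docs}\n"
    "Pick the endpoint and parameters that answer the user's question. "
    'Reply as JSON: {{"endpoint": ..., "params": {{...}}}}\n'
    "Question: {question}"
)

def fetch_statistics(question: str) -> dict:
    # llm as defined previously.
    raw = (select_prompt | llm).invoke(
        {"docs": API_DOCS, "question": question}
    ).content
    choice = json.loads(raw)  # validated further by later chain steps
    # Every number shown to the user comes from this API response,
    # never from LLM-generated text.
    resp = requests.get("https://api.example.gov.sg" + choice["endpoint"],
                        params=choice["params"], timeout=10)
    return resp.json()
```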
Use case #2: Natural Conversation Facility Booking Chatbot
In today's digital age, the majority of bookings are done through online websites. Depending on the user interface, booking can be a tedious process, as you may have to sift through numerous dates to find an available slot.
Booking through natural conversation could simplify this process. By typing a single line such as "I want to book a badminton court at Fengshan at 9.30 am", you would be able to get a booking or recommendations from a virtual assistant.
When it comes to booking a facility, there are three pieces of information we need from a user:
- The facility type (e.g. Badminton, Meeting room, Soccer)
- Location (e.g. Ang Mo Kio, Maple Tree Business Centre, Hive)
- Date (e.g. this week, 26 Feb, today)
Once we are able to detect this information from natural language, we can create a custom booking chatbot that is reusable across multiple use cases (e.g. booking of hotdesks, booking of sports facilities, etc.).
Take the example of a user inquiring about the availability of a soccer field at 2.30pm. The user has left out a required piece of information: the date. The chatbot therefore asks a clarifying question to obtain the missing date. Once the user provides it, the chatbot processes this multi-turn conversation and attempts to find available booking slots that match the user's request. If there is a booking slot matching the user's exact description, the chatbot presents this information as a table.
If there are no available booking slots, our facility booking chatbot expands the search, exploring different timeslots or increasing the search date range. It can also recommend available booking slots based on the user's previous query if that query yields no available bookings. This enhances the user experience by eliminating the need to filter out unavailable dates when making a booking, saving users time and effort.
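A condensed sketch of this slot-filling logic is shown below: extract the three required fields, and if any are missing, return a clarifying question instead of searching. The prompt wording and the search helper are illustrative assumptions:

```python
import json

from langchain_core.prompts import ChatPromptTemplate

REQUIRED_SLOTS = ("facility", "location", "date")

slot_prompt = ChatPromptTemplate.from_template(
    "From the conversation so far, extract the booking details as JSON "
    'with keys "facility", "location" and "date". '
    "Use null for anything the user has not mentioned yet.\n"
    "Conversation: {conversation}"
)

def handle_booking_turn(conversation: str) -> str:
    # llm as defined previously.
    slots = json.loads(
        (slot_prompt | llm).invoke({"conversation": conversation}).content
    )
    missing = [s for s in REQUIRED_SLOTS if not slots.get(s)]
    if missing:
        # Multi-turn behaviour: ask only for what's missing.
        return f"Could you let me know the {' and '.join(missing)}?"
    return search_available_slots(slots)  # hypothetical search helper
```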
Because we use LLMs as our reasoning engine, an additional benefit is their multilingual capability, which enables them to reason about and respond to users writing in different languages.
In one example, the chatbot was able to accurately process the correct facility, dates, and location from a user's message written in Korean, and to provide the appropriate non-generative response even though there were no available slots for the date range provided.
What we demonstrated was a brief example of how our LLM Agent handles facility booking transactions. In reality, the actual solution is much more complex, being able to offer multiple available bookings across multiple locations, handle postal codes, handle locations too far from the stated one, and so on. Although we needed to make some modifications to the package to fit our specific use case, LangChain's Agent Framework was useful in helping us chain multiple prompts together and use their outputs in the ReAct Agent.
Additionally, we designed this customized solution to be easily extendable to any similar system that requires booking through natural language.
In this first part of our series, we explored how GovTech's Virtual Intelligent Chat Assistant (VICA) leverages LLM Agents to enhance chatbot capabilities, particularly for transaction-based chatbots.
By integrating LangChain's Agent Framework into VICA's architecture, we demonstrated its potential through the Department of Statistics (DOS) Table Builder and Facility Booking Chatbot use cases. These examples highlight how LangChain can streamline complex transaction interactions, enabling chatbots to handle transaction-related tasks like data retrieval and booking through natural conversation.
LangChain offers features to quickly develop and prototype sophisticated chatbot solutions, allowing developers to harness the power of large language models efficiently. However, challenges like insufficient documentation and excessive abstraction can lead to increased maintenance effort, as customizing the framework to fit specific needs may require significant time and resources. Therefore, evaluating an in-house solution might offer greater long-term customizability and stability.
In the next article, we will cover how chatbot engines can be improved through understanding multi-turn conversations.
Curious about the potential of AI chatbots? If you are a Singapore public service officer, you can visit our website at https://www.vica.gov.sg/ to create your own custom chatbot and find out more!
Special thanks to Wei Jie Kong for establishing the requirements for the Facility Booking Chatbot. We would also like to thank Justin Wang and Samantha Yom, our hardworking interns, for their initial work on the DOS Table Builder.
Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629.