Integrating LLMs into today's web applications is becoming the norm. In addition, there is a growing number of AI-native companies that build autonomous agents, putting the LLM at the center and giving it tools that allow it to perform actions on different systems.

In this post I'll present a new project called Offload, which lets you move all that processing to the users' devices, increasing their data privacy and reducing inference costs.
There are two big concerns when integrating AI into an application: cost and user data privacy.
1. Cost. The usual way to connect to an LLM is through a third-party API, such as OpenAI, Anthropic, or one of the many alternatives on the market. These APIs are very convenient: with just an HTTP request you can integrate an LLM into your application. However, they are expensive at scale. Providers are putting a lot of effort into reducing costs, but if you make many API calls per user per day, the bill becomes huge.
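To make the cost point concrete, here is a minimal sketch of what that "just an HTTP request" integration typically looks like. The endpoint and payload follow OpenAI's Chat Completions format; the model name and prompt are placeholders, and the request-building is separated from sending so the sketch stays self-contained.

```typescript
// A typical third-party LLM integration: one HTTP request per call.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request; every call like this is billed by the provider,
// and the message contents leave the user's device.
function buildChatRequest(model: string, messages: ChatMessage[]) {
  return {
    url: "https://api.openai.com/v1/chat/completions",
    body: JSON.stringify({ model, messages }),
  };
}

const req = buildChatRequest("gpt-4o-mini", [
  { role: "user", content: "Summarize this note for me." },
]);

// In the app this would be sent with fetch() plus an Authorization header:
// fetch(req.url, { method: "POST", headers: { Authorization: `Bearer ${key}`,
//   "Content-Type": "application/json" }, body: req.body })
```

Each of these requests costs money and ships user content to the provider, which is exactly the pair of problems discussed here.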
2. User data privacy. Using third-party APIs for inference is not the best option if you work with sensitive user data. These APIs often use the data you send to continue training the model, which can expose your confidential data. The data might also become visible at some point after it reaches the third-party provider (for example, in a logging system). This is not only a problem for companies, but also for users who may not want to send their data to these API providers.
Offload addresses both problems at once. The application "invokes" the LLM through an SDK that, behind the scenes, runs the model directly on each user's device instead of calling a third-party API. This saves money on the inference bill, because you don't pay for API usage, and it keeps user data on each user's device, with no need to send it to any API.
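The post doesn't show Offload's actual API, but the shape of such an SDK can be sketched. Every name below (`OffloadClient`, `LocalModel`, `invoke`) is hypothetical, and the model is a trivial stub so the sketch stays self-contained; a real implementation would run an actual model on-device (for example via WebGPU or a native runtime).

```typescript
// Hypothetical sketch of an on-device inference SDK in the spirit of Offload.
// All names here are invented for illustration; the real SDK may differ.

interface LocalModel {
  generate(prompt: string): Promise<string>;
}

// Stand-in for a model running on the user's device.
// It just echoes the prompt so this example has no external dependencies.
class StubModel implements LocalModel {
  async generate(prompt: string): Promise<string> {
    return `local answer to: ${prompt}`;
  }
}

class OffloadClient {
  constructor(private model: LocalModel) {}

  // The prompt never leaves the device: inference happens in-process,
  // so there is no per-call API bill and no data sent to a third party.
  async invoke(prompt: string): Promise<string> {
    return this.model.generate(prompt);
  }
}

const client = new OffloadClient(new StubModel());
client.invoke("Summarize this note.").then((answer) => console.log(answer));
```

The key design point is the last one: because `invoke` resolves locally, swapping a hosted API for a client-side model is an implementation detail hidden behind the SDK, and the application code does not change.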