Summarization as Compression in Search Engine Architecture
Serving a large full-text search index in production can be expensive. These costs grow proportionally with the size of the dataset being indexed, and grow further as queries per second increase. Because of the way search works, conventional disk compression isn't a good option, leaving architects in a situation where they must balance cost and latency.
Keeping index sizes as small as possible offers performance, cost, and environmental benefits. Smaller indexes can be searched more quickly on the same hardware and can often be held in memory instead of on disk. An effective compression algorithm for search could drastically reduce the hardware requirements of many search applications.
Enter summarization as compression. The advent of AI/ML and models such as BERT, Titan, and Llama allows for the rapid and affordable summarization of large text datasets, shrinking the size of the index at the cost of some precision. Adding a compression step to a search engine's indexing pipeline can dramatically alter the performance of a search application.
How does it work? Each document is submitted to a summarization model, and the summary is stored in the index rather than the original text. Queries will be faster but less precise. Consider a full-text index of Wikipedia: summarization as compression could dramatically reduce the index size while preserving the ability to quickly direct people to articles.
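A minimal sketch of that indexing pipeline, assuming a `summarize` function that in practice would call a summarization model (a BERT, Titan, or Llama endpoint); here it is stubbed as "keep the first two sentences" so the example runs:

```python
from collections import defaultdict

def summarize(text: str, max_sentences: int = 2) -> str:
    # Placeholder for a real summarization model call:
    # keep only the first few sentences of the document.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return ". ".join(sentences[:max_sentences]) + "."

def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    # Inverted index built over summaries, not the full documents,
    # so the index only grows with summary size.
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in summarize(text).lower().split():
            index[token.strip(".,")].add(doc_id)
    return index
```

Terms that appear only in the parts of a document the summarizer drops never reach the index, which is exactly where the loss of precision comes from.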
What if I need precise results? Keeping an index of the original text might still be necessary, used alongside the compressed indexes. The original-text index could be offered as an "expand your search" option. With the compressed indexes acting as a triage step that catches most queries, a low-QPS original-text index becomes feasible.
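The two-tier arrangement could look like the following sketch. Both indexes are assumed to be plain token-to-document-id mappings; the choice to consult the full-text index only on a triage miss with `expand=True` is one possible policy, not the only one:

```python
def lookup(query: str, index: dict) -> set:
    # Return documents matching every query term (AND semantics).
    terms = query.lower().split()
    results = [index.get(t, set()) for t in terms]
    return set.intersection(*results) if results else set()

def search(query: str, summary_index: dict, full_index: dict,
           expand: bool = False) -> set:
    # Triage: the compressed (summary) index serves most traffic.
    hits = lookup(query, summary_index)
    if not hits and expand:
        # "Expand your search": fall through to the low-QPS
        # original-text index only when the user asks for it.
        hits = lookup(query, full_index)
    return hits
```

Because most queries resolve against the summary index, the full-text index sees only the expanded tail of traffic and can run on far cheaper hardware.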
What about decompression? Retrieving the original text from the database or S3 bucket where it is stored effectively serves as decompression.
This is a very new idea, made possible by the low cost of summarization with ML models. I've done some preliminary experiments with it and the results seem promising, and I'm looking for an opportunity to use this technique in production.