Hybrid search
While semantic search using vector embeddings works well for capturing rephrased or paraphrased meaning, it tends to do poorly on queries that contain rare terms or jargon. In these cases, combining semantic search with more traditional sparse retrieval methods (such as BM25 or TF-IDF), which account for factors like keyword frequency, often improves retrieval. To combine the two mechanisms, you can assign each chunk both scores and compute the final score as a weighted combination of the two, or you can use sparse retrieval as a first-pass filter followed by semantic search.
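As a rough illustration of the weighted-combination option, here is a minimal sketch. It assumes you already have a sparse score (for example from BM25) and a dense similarity score for each chunk, keyed by a chunk id. The function names, the min-max normalization, and the `alpha` weight are illustrative choices, not part of any particular library.

```python
def min_max_normalize(scores):
    """Rescale a dict of chunk_id -> score into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so none of them carries ranking information.
        return {cid: 0.0 for cid in scores}
    return {cid: (s - lo) / (hi - lo) for cid, s in scores.items()}


def hybrid_scores(sparse, dense, alpha=0.5):
    """Combine sparse and dense scores into one ranked list.

    alpha is an assumed tuning knob: 1.0 means dense only, 0.0 means sparse only.
    Chunks missing from one of the two result sets get 0.0 for that side.
    """
    sparse_n = min_max_normalize(sparse)
    dense_n = min_max_normalize(dense)
    chunk_ids = set(sparse_n) | set(dense_n)
    combined = {
        cid: alpha * dense_n.get(cid, 0.0) + (1 - alpha) * sparse_n.get(cid, 0.0)
        for cid in chunk_ids
    }
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three chunks.
sparse = {"a": 10.0, "b": 2.0, "c": 0.0}   # e.g. BM25 scores
dense = {"a": 0.2, "b": 0.9, "c": 0.1}     # e.g. cosine similarities
ranking = hybrid_scores(sparse, dense, alpha=0.5)
```

The two scores are normalized first because sparse scores such as BM25 are unbounded while cosine similarities are not, so adding them raw would let one side dominate. The right value of `alpha` depends on your data and is worth tuning against real queries.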
Reranking – the final step
Once you have run the initial search to retrieve relevant chunks, a final step of ranking these results helps ensure that the most useful information is presented to the user. The reason is that although the retrieved chunks may be technically similar to the query, they may not be the most helpful answer to it.
There are a few different ways reranking is done in practice. One approach is to apply heuristics to certain metadata of the chunks, such as the author, date, or source reliability. A benefit of this approach is that it is usually computationally cheap and fast.
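A metadata heuristic of this kind might look like the sketch below. Everything specific in it is an assumption made for illustration: the `Chunk` fields, the hand-assigned `source_reliability` rating, the weights, and the one-year recency half-life. In a real system these would come from your own data and tuning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    text: str
    similarity: float          # score from the initial search, assumed in [0, 1]
    published: date
    source_reliability: float  # hypothetical hand-assigned rating in [0, 1]


def rerank(chunks, today, recency_weight=0.2, reliability_weight=0.3,
           half_life_days=365):
    """Reorder retrieved chunks by similarity plus simple metadata boosts."""
    def score(chunk):
        age_days = max((today - chunk.published).days, 0)
        # Recency decays from 1.0 toward 0.0, halving every half_life_days.
        recency = 0.5 ** (age_days / half_life_days)
        return (chunk.similarity
                + recency_weight * recency
                + reliability_weight * chunk.source_reliability)
    return sorted(chunks, key=score, reverse=True)


# Made-up example: an older, slightly more similar chunk from a weak source
# versus a fresh chunk from a source rated as reliable.
candidates = [
    Chunk("old", 0.80, date(2020, 1, 1), 0.2),
    Chunk("new", 0.75, date(2024, 1, 1), 0.9),
]
reranked = rerank(candidates, today=date(2024, 1, 1))
```

Because the score is only a few arithmetic operations per chunk, this step adds very little latency, which is the main appeal of heuristic reranking.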