Hybrid and Vector Searches with Elasticsearch

Semantic search finds relevant information even when there are no classic keyword matches. Recent research has shown that combining semantic and classic keyword search often outperforms either approach on its own.

Elasticsearch provides Hybrid Retrieval to run both searches in a single request and get the best of both options.

My previous posts describe semantic searches:

IBM provides the new Elasticsearch capabilities in the new offering Watsonx Discovery.

Overview

It is known that lexical retrievers (such as BM25) and semantic retrievers (such as ELSER) are somewhat complementary. Specifically, combining the results of the retrieval methods improves relevance if one assumes that more matches occur between the relevant documents they retrieve than between the irrelevant documents they retrieve.

The following diagram shows the performance improvements:

Reciprocal Rank Fusion

Reciprocal Rank Fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set. RRF requires no tuning, and the different relevance indicators do not have to be related to each other to achieve high-quality results. The only drawback is that currently the query latency is increased as the two queries are performed sequentially in Elasticsearch. This is mitigated by the fact that BM25 retrieval is typically faster than semantic retrieval.

Essentially, RRF makes the two searches comparable by using each document's rank in each result list instead of its raw score, and it sums the reciprocal ranks to create one ordered list of search results. If the same documents/passages/chunks are returned by both searches, they are ranked higher. Hybrid searches can be configured with the parameters ‘rank_constant’ and ‘window_size’, e.g., to ensure that documents that are returned by only one search don’t get lost.
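To make the ranking idea concrete, here is a minimal sketch of rank fusion in plain Python. It is not Elasticsearch's implementation. It assumes that each document's fused score is the sum of 1 / (rank_constant + rank) over the result lists it appears in, with ranks starting at 1, and that only the top window_size hits of each list are considered. The function name rrf_fuse and the document IDs are made up for illustration.

```python
from collections import defaultdict


def rrf_fuse(result_lists, rank_constant, window_size):
    """Fuse ranked lists of document IDs into one list (illustrative sketch)."""
    scores = defaultdict(float)
    for results in result_lists:
        # Only the top `window_size` hits of each list take part in the fusion.
        for rank, doc_id in enumerate(results[:window_size], start=1):
            # A document's contribution depends on its rank, not on its raw score.
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists, best hit first.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
elser_hits = ["doc_c", "doc_a", "doc_d"]

fused = rrf_fuse([bm25_hits, elser_hits], rank_constant=20, window_size=50)
print(fused)
```

In this sketch, doc_a and doc_c appear in both lists and therefore end up ahead of doc_b and doc_d, which each appear in only one list but are still part of the fused result.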

Query

In the following example, two sub searches are run independently of each other. The question placeholder stands for the user's query text.

  
GET elser-index/_search
{
  "sub_searches": [
    {
      "query": {
        "multi_match": {
          "query": "<question>",
          "type": "most_fields",
          "fields": ["title", "text"]
        }
      }
    },
    {
      "query": {
        "text_expansion": {
          "my_embeddings.tokens": {
            "model_id": ".elser_model_1",
            "model_text": "<question>"
          }
        }
      }
    }
  ],
  "rank": {
    "rrf": {
      "window_size": 50,
      "rank_constant": 20
    }
  }
}
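The request above leaves the query text as a placeholder. As a rough sketch of how an application might fill it in, the following Python builds the same request body for a given question and serializes it to JSON. The function name build_hybrid_query and the sample question are hypothetical. Sending the body to the elser-index/_search endpoint is left out, since that depends on the client and cluster setup.

```python
import json


def build_hybrid_query(question, window_size=50, rank_constant=20):
    """Return the hybrid search request body for the given question text."""
    return {
        "sub_searches": [
            {
                # Lexical (BM25) sub search over the title and text fields.
                "query": {
                    "multi_match": {
                        "query": question,
                        "type": "most_fields",
                        "fields": ["title", "text"],
                    }
                }
            },
            {
                # Semantic sub search using the ELSER model from the example.
                "query": {
                    "text_expansion": {
                        "my_embeddings.tokens": {
                            "model_id": ".elser_model_1",
                            "model_text": question,
                        }
                    }
                }
            },
        ],
        # Combine both result lists with Reciprocal Rank Fusion.
        "rank": {"rrf": {"window_size": window_size, "rank_constant": rank_constant}},
    }


# Hypothetical question; json.dumps takes care of escaping the quotes.
body = build_hybrid_query('What is "hybrid" search?')
payload = json.dumps(body, indent=2)
print(payload)
```

Building the body as a dictionary and serializing it with json.dumps avoids pasting the question into a JSON string by hand, which would break as soon as the text contains quotes.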

Next Steps

To learn more, check out the Watsonx.ai documentation and the Watsonx.ai landing page.