Gemini

Qdrant is compatible with the Gemini Embedding Model API and its official Python SDK, which can be installed like any other package.

Gemini is a family of Google models released in December 2023 as the successor to PaLM. Its new embedding models succeed the previous Gecko Embedding Model.

In the latest models, an additional parameter, task_type, can be passed to the API call. It designates the intended downstream use of the embeddings.

The Embedding Model API supports various task types, outlined as follows:

  1. retrieval_query: Specifies the given text is a query in a search/retrieval setting.
  2. retrieval_document: Specifies the given text is a document from the corpus being searched.
  3. semantic_similarity: Specifies the given text will be used for Semantic Text Similarity.
  4. classification: Specifies that the given text will be classified.
  5. clustering: Specifies that the embeddings will be used for clustering.
  6. task_type_unspecified: Unset value, which will default to one of the other values.

If you’re building a semantic search application, such as RAG, you should use task_type="retrieval_document" for the indexed documents and task_type="retrieval_query" for the search queries.

The following example shows how to do this with Qdrant:

Setup

pip install google-generativeai

Let’s see how to use the Embedding Model API to embed a document for retrieval.

The following example shows how to embed a document with the models/embedding-001 with the retrieval_document task type:

Embedding a document

import pathlib
import google.generativeai as genai
import qdrant_client

GEMINI_API_KEY = "YOUR GEMINI API KEY"  # add your key here

genai.configure(api_key=GEMINI_API_KEY)

result = genai.embed_content(
    model="models/embedding-001",
    content="Qdrant is the best vector search engine to use with Gemini",
    task_type="retrieval_document",
    title="Qdrant x Gemini",
)

The returned result is a dictionary with an embedding key, whose value is a list of floats representing the embedding of the document.
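As a minimal sketch of that shape (the dict below is a hand-written stand-in, not a real API response; actual models/embedding-001 vectors have 768 dimensions):

```python
# Stand-in for the dict returned by genai.embed_content (illustrative values only):
result = {"embedding": [0.013, -0.042, 0.007]}

vector = result["embedding"]
assert isinstance(vector, list)
assert all(isinstance(x, float) for x in vector)
print(len(vector))  # 3 here; a real models/embedding-001 vector has 768 entries
```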

Indexing documents with Qdrant

from qdrant_client.http.models import Batch, Distance, VectorParams

qdrant_client = qdrant_client.QdrantClient()
qdrant_client.create_collection(
    collection_name="GeminiCollection",
    # models/embedding-001 produces 768-dimensional vectors
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)
qdrant_client.upsert(
    collection_name="GeminiCollection",
    points=Batch(
        ids=[1],
        # Batch expects a list of vectors, so the single embedding is wrapped in a list
        vectors=[
            genai.embed_content(
                model="models/embedding-001",
                content="Qdrant is the best vector search engine to use with Gemini",
                task_type="retrieval_document",
                title="Qdrant x Gemini",
            )["embedding"]
        ],
    ),
)

Searching for documents with Qdrant

Once the documents are indexed, you can search for the most relevant documents using the same model with the retrieval_query task type:

qdrant_client.search(
    collection_name="GeminiCollection",
    query_vector=genai.embed_content(
        model="models/embedding-001",
        content="What is the best vector database to use with Gemini?",
        task_type="retrieval_query",
    )["embedding"],
)

Using Gemini Embedding Models with Binary Quantization

You can use Gemini Embedding Models with Binary Quantization - a technique that reduces the memory footprint of the embeddings by a factor of 32 with only a modest loss in search quality.
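The factor of 32 comes from replacing each 32-bit float with a single bit. A quick sanity check of the arithmetic for a 768-dimensional models/embedding-001 vector:

```python
DIMS = 768            # models/embedding-001 output dimensionality
FLOAT32_BITS = 32

original_bytes = DIMS * FLOAT32_BITS // 8  # 3072 bytes per float32 vector
binary_bytes = DIMS // 8                   # 96 bytes at 1 bit per dimension
print(original_bytes // binary_bytes)      # 32
```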

The table below compares search results for the models/embedding-001 model with Binary Quantization against the original (unquantized) vectors. Each cell is the recall against the exact nearest neighbors for a given oversampling factor and rescore setting:

| limit | oversampling=1, rescore=False | oversampling=1, rescore=True | oversampling=2, rescore=False | oversampling=2, rescore=True | oversampling=3, rescore=False | oversampling=3, rescore=True |
|------:|------------------------------:|-----------------------------:|------------------------------:|-----------------------------:|------------------------------:|-----------------------------:|
| 10    | 0.523333 | 0.831111 | 0.523333 | 0.915556 | 0.523333 | 0.950000 |
| 20    | 0.510000 | 0.836667 | 0.510000 | 0.912222 | 0.510000 | 0.937778 |
| 50    | 0.489111 | 0.841556 | 0.489111 | 0.913333 | 0.488444 | 0.947111 |
| 100   | 0.485778 | 0.846556 | 0.485556 | 0.929000 | 0.486000 | 0.956333 |

At an oversampling of 3 and a limit of 100, we reach roughly 95% recall against the exact nearest neighbors with rescore enabled.
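Binary Quantization is enabled as a collection-level setting, while oversampling and rescoring are per-search parameters. A configuration sketch using the qdrant-client models module (the collection name "GeminiBinaryCollection" is hypothetical, and a running Qdrant instance is assumed):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient()  # assumes a Qdrant instance on localhost:6333

client.create_collection(
    collection_name="GeminiBinaryCollection",  # hypothetical name for this sketch
    vectors_config=models.VectorParams(size=768, distance=models.Distance.COSINE),
    quantization_config=models.BinaryQuantization(
        binary=models.BinaryQuantizationConfig(always_ram=True),  # keep quantized vectors in RAM
    ),
)

# At search time, oversampling and rescoring control the speed/quality trade-off,
# corresponding to the parameters varied in the table above:
search_params = models.SearchParams(
    quantization=models.QuantizationSearchParams(
        rescore=True,      # re-rank candidates with the original float vectors
        oversampling=3.0,  # fetch 3x the limit before rescoring
    ),
)
```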

That’s it! You can now use Gemini Embedding Models with Qdrant!