Time: 10 min | Level: Beginner

Gemini

Qdrant is compatible with the Gemini Embedding Model API and its official Python SDK, which can be installed like any other package (see Setup below).

Gemini is a family of Google models, released in December 2023, that succeeds PaLM. Its embedding models succeed the earlier Gecko Embedding Model.

The latest models accept an additional parameter, task_type, in the API call. It specifies the intended use of the resulting embeddings.

The Embedding Model API supports the following task types:

  1. retrieval_query: query in a search/retrieval setting
  2. retrieval_document: document from the corpus being searched
  3. semantic_similarity: semantic text similarity
  4. classification: embeddings to be used for text classification
  5. clustering: the generated embeddings will be used for clustering
  6. task_type_unspecified: Unset value, which will default to one of the other values.

If you’re building a semantic search application, such as the retrieval step of a RAG pipeline, use task_type="retrieval_document" for the indexed documents and task_type="retrieval_query" for the search queries.
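The rule above can be sketched as a small helper. This is only an illustration: embed_kwargs and the role names "document" and "query" are hypothetical and not part of either SDK; the model name and the keyword names (model, content, task_type) are the ones used in the example further down this page.

```python
# Hypothetical helper (not part of the Gemini or Qdrant SDKs):
# maps the role of a text to the task type recommended above.
TASK_TYPE_BY_ROLE = {
    "document": "retrieval_document",  # texts stored in the index
    "query": "retrieval_query",        # texts used to search the index
}


def embed_kwargs(text: str, role: str, model: str = "models/embedding-001") -> dict:
    """Build keyword arguments for an embed_content-style call."""
    if role not in TASK_TYPE_BY_ROLE:
        raise ValueError(f"role must be one of {sorted(TASK_TYPE_BY_ROLE)}, got {role!r}")
    return {"model": model, "content": text, "task_type": TASK_TYPE_BY_ROLE[role]}


doc_kwargs = embed_kwargs("Qdrant is a vector database.", role="document")
query_kwargs = embed_kwargs("What is Qdrant?", role="query")
print(doc_kwargs["task_type"])    # retrieval_document
print(query_kwargs["task_type"])  # retrieval_query
```

The resulting dictionaries could then be unpacked into the embedding call, so that indexing and querying code cannot accidentally swap the two task types.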

The following example shows how to do this with Qdrant:

Setup

pip install google-generativeai

The following example embeds documents for retrieval with the models/embedding-001 model and the retrieval_document task type:

Embedding a document

import google.generativeai as gemini_client
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

collection_name = "example_collection"

GEMINI_API_KEY = "YOUR GEMINI API KEY"  # add your key here

client = QdrantClient(url="http://localhost:6333")
gemini_client.configure(api_key=GEMINI_API_KEY)
texts = [
    "Qdrant is a vector database that is compatible with Gemini.",
    "Gemini is a new family of Google PaLM models, released in December 2023.",
]

results = [
    gemini_client.embed_content(
        model="models/embedding-001",
        content=sentence,
        task_type="retrieval_document",
        title="Qdrant x Gemini",
    )
    for sentence in texts
]

Creating Qdrant Points and Indexing documents with Qdrant

Creating Qdrant Points

points = [
    PointStruct(
        id=idx,
        vector=response['embedding'],
        payload={"text": text},
    )
    for idx, (response, text) in enumerate(zip(results, texts))
]

Create Collection

client.create_collection(
    collection_name,
    vectors_config=VectorParams(
        size=768,  # must match the length of the embedding vectors
        distance=Distance.COSINE,
    ),
)
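Because the collection is created with a fixed size of 768, every vector upserted into it must have exactly that length. The check below is a minimal sketch of how this could be verified before upserting; check_dimensions is a hypothetical helper, and the stand-in responses use made-up values shaped like the dictionaries in results (each with an "embedding" key).

```python
EXPECTED_DIM = 768  # the `size` the collection is created with


def check_dimensions(responses: list[dict], expected: int = EXPECTED_DIM) -> None:
    """Raise ValueError if any embedding does not have the expected length."""
    for i, response in enumerate(responses):
        actual = len(response["embedding"])
        if actual != expected:
            raise ValueError(f"embedding {i} has {actual} dimensions, expected {expected}")


# Stand-in responses with made-up values, shaped like the `results` list.
fake_results = [{"embedding": [0.0] * 768}, {"embedding": [0.1] * 768}]
check_dimensions(fake_results)  # passes silently when all lengths match
```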

Add the points to the collection

client.upsert(collection_name, points)

Searching for documents with Qdrant

Once the documents are indexed, you can search for the most relevant documents using the same model with the retrieval_query task type:

client.search(
    collection_name=collection_name,
    query_vector=gemini_client.embed_content(
        model="models/embedding-001",
        content="Is Qdrant compatible with Gemini?",
        task_type="retrieval_query",
    )["embedding"],
)

Using Gemini Embedding Models with Binary Quantization

You can use Gemini Embedding Models with Binary Quantization, a technique that stores each vector component as a single bit instead of a 32-bit float. This reduces the size of the embeddings by a factor of 32 with only a modest loss in search quality.
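To make the factor of 32 concrete, here is a minimal pure-Python sketch of sign-based binarization. It is an illustration of the general idea under the assumption that positive components map to 1 and all others to 0, not Qdrant's actual implementation; binarize and hamming are hypothetical names, and the vector values are made up.

```python
import struct


def binarize(vector: list[float]) -> bytes:
    """Pack one bit per dimension: 1 if the component is positive, else 0."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    n_bytes = (len(vector) + 7) // 8
    return bits.to_bytes(n_bytes, "big")


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two packed vectors."""
    return bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")


vector = [0.5, -0.2, 0.1, -0.9] * 192  # 768 made-up components
float32_size = len(struct.pack(f"{len(vector)}f", *vector))  # 4 bytes per component
binary_size = len(binarize(vector))                          # 1 bit per component
print(float32_size, binary_size, float32_size // binary_size)  # 3072 96 32
```

Comparing two binarized vectors then reduces to counting differing bits (the Hamming distance), which is much cheaper than a floating-point similarity computation.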

The following table shows the recall of search with the models/embedding-001 model and Binary Quantization, compared with the original, unquantized vectors:

At an oversampling of 3 and a limit of 100, recall against the exact nearest neighbors is about 95% with rescoring enabled.

Oversampling  1         1         2         2         3         3
Rescore       False     True      False     True      False     True
Limit
10            0.523333  0.831111  0.523333  0.915556  0.523333  0.950000
20            0.510000  0.836667  0.510000  0.912222  0.510000  0.937778
50            0.489111  0.841556  0.489111  0.913333  0.488444  0.947111
100           0.485778  0.846556  0.485556  0.929000  0.486000  0.956333
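The roles of oversampling, rescoring, and recall can be illustrated with a brute-force sketch on random toy data. This is an assumption-laden illustration rather than Qdrant's index-based search: all function names are hypothetical, similarity is a plain dot product, and the numbers it prints are not comparable to the table above.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sign_bits(v):
    """One bit per component: 1 if positive, else 0."""
    return [1 if x > 0 else 0 for x in v]


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def exact_top_k(query, vectors, k):
    """Ids of the k vectors with the highest dot product (the ground truth)."""
    return sorted(range(len(vectors)), key=lambda i: -dot(query, vectors[i]))[:k]


def quantized_top_k(query, vectors, codes, k, oversampling=1, rescore=False):
    """Rank by Hamming distance on the binary codes, keeping k * oversampling
    candidates; optionally re-rank those candidates with the original vectors."""
    q_code = sign_bits(query)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q_code, codes[i]))
    candidates = ranked[: k * oversampling]
    if rescore:
        candidates.sort(key=lambda i: -dot(query, vectors[i]))
    return candidates[:k]


def recall(approx_ids, exact_ids):
    """Fraction of the exact neighbors that the approximate search returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


rng = random.Random(0)  # fixed seed so runs are repeatable
vectors = [[rng.gauss(0, 1) for _ in range(64)] for _ in range(500)]
codes = [sign_bits(v) for v in vectors]
query = [rng.gauss(0, 1) for _ in range(64)]

truth = exact_top_k(query, vectors, k=10)
for oversampling in (1, 2, 3):
    for rescore in (False, True):
        found = quantized_top_k(query, vectors, codes, 10, oversampling, rescore)
        print(oversampling, rescore, recall(found, truth))
```

In this sketch, oversampling only helps when rescoring is enabled, because the extra candidates are discarded again unless they are re-ranked with the original vectors. The table shows the same pattern: the columns without rescoring barely change as oversampling grows.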

That’s it! You can now use Gemini Embedding Models with Qdrant!
