Jina Embeddings

Qdrant is compatible with Jina AI embeddings. You can get a free trial API key from Jina AI to generate embeddings.

Qdrant users can receive a 10% discount on Jina AI APIs by using the code QDRANT.

Technical Summary

| Model | Dimension | Language | MRL (Matryoshka) | Context |
|---|---|---|---|---|
| jina-embeddings-v3 | 1024 | Multilingual (89 languages) | Yes | 8192 |
| jina-embeddings-v2-base-en | 768 | English | No | 8192 |
| jina-embeddings-v2-base-de | 768 | German & English | No | 8192 |
| jina-embeddings-v2-base-es | 768 | Spanish & English | No | 8192 |
| jina-embeddings-v2-base-zh | 768 | Chinese & English | No | 8192 |

Jina recommends using jina-embeddings-v3 as it is the latest and most performant embedding model released by Jina AI.

On top of the backbone, jina-embeddings-v3 has been trained with 5 task-specific adapters for different embedding uses. Include task in your request to optimize your downstream application:

  • retrieval.query: Used to encode user queries or questions in retrieval tasks.
  • retrieval.passage: Used to encode large documents in retrieval tasks at indexing time.
  • classification: Used to encode text for text classification tasks.
  • text-matching: Used to encode text for similarity matching, such as measuring similarity between two sentences.
  • separation: Used for clustering or reranking tasks.
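As a sketch of how the task parameter shapes an API call, the hypothetical helper below assembles the JSON body for a request to the Jina Embeddings endpoint (nothing is sent here; the helper name is ours, not part of the API):

```python
# Hypothetical helper: assemble the JSON body for a POST to
# https://api.jina.ai/v1/embeddings. No network call is made.
def build_embedding_request(texts, task, model="jina-embeddings-v3"):
    return {"model": model, "task": task, "input": texts}

# Documents are encoded at indexing time with the passage adapter...
index_body = build_embedding_request(
    ["Qdrant is a vector database."], "retrieval.passage"
)
# ...while user queries are encoded at search time with the query adapter.
query_body = build_embedding_request(["What is Qdrant?"], "retrieval.query")
```

Using the matching adapter pair (retrieval.passage for indexing, retrieval.query for search) is what aligns the two sides of an asymmetric retrieval task.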

jina-embeddings-v3 supports Matryoshka Representation Learning, allowing users to control the embedding dimension with minimal performance loss.
Include dimensions in your request to select the desired dimension.
By default, dimensions is set to 1024, and a number between 256 and 1024 is recommended.
You can reference the table below for hints on dimension vs. performance:

| Dimension | 32 | 64 | 128 | 256 | 512 | 768 | 1024 |
|---|---|---|---|---|---|---|---|
| Average Retrieval Performance (nDCG@10) | 52.54 | 58.54 | 61.64 | 62.72 | 63.16 | 63.3 | 63.35 |
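To illustrate what Matryoshka Representation Learning implies (this is a local sketch, not an API call, and it assumes the embeddings are normalized): the leading components of a full 1024-dimensional vector already form a usable lower-dimensional embedding once re-normalized.

```python
import numpy as np

# Stand-in for a full embedding returned by the API (assumed normalized).
rng = np.random.default_rng(0)
full = rng.normal(size=1024)
full /= np.linalg.norm(full)

# With an MRL-trained model, the first 256 components can serve as a
# 256-d embedding after re-normalization for cosine/dot-product search.
truncated = full[:256] / np.linalg.norm(full[:256])
```

In practice you would simply set dimensions in the API request instead of truncating client-side; the sketch only shows why smaller dimensions lose so little performance.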

jina-embeddings-v3 supports Late Chunking, a technique that leverages the model's long-context capabilities to generate contextual chunk embeddings. Include late_chunking=True in your request to enable contextual chunked representation. When set to true, the Jina AI API concatenates all sentences in the input field and feeds them to the model as a single string. Internally, the model embeds this long concatenated string and then performs late chunking, returning a list of embeddings that matches the size of the input list.
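The key shape invariant of late chunking is that the input list holds the chunks of one document and the response contains one contextual embedding per chunk. A minimal request body illustrating this (texts are placeholders; nothing is sent here):

```python
# Sketch of a late-chunking request body: the input list holds the chunks
# of ONE document; the API returns one contextual embedding per chunk.
chunks = [
    "Berlin is the capital of Germany.",
    "The city has a population of about 3.7 million.",
    "It is known for its vibrant art scene.",
]
body = {
    "model": "jina-embeddings-v3",
    "task": "retrieval.passage",
    "late_chunking": True,
    "input": chunks,
}
# The response's "data" list matches len(chunks), but each embedding is
# computed with the full document as context, not the chunk in isolation.
```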

Example

The code below demonstrates how to use jina-embeddings-v3 together with Qdrant:

import requests

import qdrant_client
from qdrant_client.models import Distance, VectorParams, Batch

# Provide Jina API key and choose one of the available models.
JINA_API_KEY = "jina_xxxxxxxxxxx"
MODEL = "jina-embeddings-v3"
DIMENSIONS = 1024 # Or choose your desired output vector dimensionality.
TASK = "retrieval.passage"  # For indexing; use "retrieval.query" when querying

# Get embeddings from the API
url = "https://api.jina.ai/v1/embeddings"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {JINA_API_KEY}",
}

data = {
    "input": ["Your text string goes here", "You can send multiple texts"],
    "model": MODEL,
    "dimensions": DIMENSIONS,
    "task": TASK,
    "late_chunking": True,
}

response = requests.post(url, headers=headers, json=data)
embeddings = [d["embedding"] for d in response.json()["data"]]


# Index the embeddings into Qdrant
client = qdrant_client.QdrantClient(":memory:")
client.create_collection(
    collection_name="MyCollection",
    vectors_config=VectorParams(size=DIMENSIONS, distance=Distance.DOT),
)


client.upsert(
    collection_name="MyCollection",
    points=Batch(
        ids=list(range(len(embeddings))),
        vectors=embeddings,
    ),
)
