The points are the central entity that Qdrant operates with. A point is a record consisting of a vector and an optional payload.
You can search among the points grouped in one collection based on vector similarity. This procedure is described in more detail in the search and filtering sections.
This section explains how to create and manage vectors.
Any point modification operation is asynchronous and takes place in two steps. In the first step, the operation is written to the write-ahead log.
From this moment, the service will not lose the data, even if the machine loses power.
Awaiting result
If the API is called with the &wait=false parameter, or if the parameter is not explicitly specified, the client receives an acknowledgment that the data was received:
{
"result": {
"operation_id": 123,
"status": "acknowledged"
},
"status": "ok",
"time": 0.000206061
}
This response does not yet mean that the data is available for retrieval, as it is only added to the collection in the second step. Actual addition to the collection happens in the background, and if you are doing initial vector loading, we recommend using asynchronous requests to take advantage of pipelining.
If your application logic requires a guarantee that the vector is available for searching immediately after the API call, use the flag ?wait=true.
In this case, the API returns the result only after the operation is finished:
{
"result": {
"operation_id": 0,
"status": "completed"
},
"status": "ok",
"time": 0.000206061
}
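As a minimal sketch of how a client might act on these two responses, the helper below inspects the status field of a parsed response. The function name is_searchable is hypothetical and not part of any Qdrant client; only the response shapes are taken from the examples above.

```python
import json


def is_searchable(response_text: str) -> bool:
    """Return True only if the operation has finished ("completed").

    An "acknowledged" status means the operation was written to the
    write-ahead log but may not yet be applied to the collection.
    """
    response = json.loads(response_text)
    return response["result"]["status"] == "completed"


acknowledged = (
    '{"result": {"operation_id": 123, "status": "acknowledged"},'
    ' "status": "ok", "time": 0.000206061}'
)
completed = (
    '{"result": {"operation_id": 0, "status": "completed"},'
    ' "status": "ok", "time": 0.000206061}'
)

print(is_searchable(acknowledged))  # False
print(is_searchable(completed))     # True
```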
Point IDs
Qdrant supports both 64-bit unsigned integers and UUIDs as identifiers for points.
Examples of UUID string representations:
- simple: 936DA01F9ABD4d9d80C702AF85C822A8
- hyphenated: 550e8400-e29b-41d4-a716-446655440000
- urn: urn:uuid:F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4
This means that a UUID string can be used in place of a numerical id in every request. Example:
PUT /collections/{collection_name}/points
{
"points": [
{
"id": "5c56c793-69f3-4fbf-87e6-c4bf54c28c26",
"payload": {"color": "red"},
"vector": [0.9, 0.1, 0.1]
}
]
}
from qdrant_client import QdrantClient
from qdrant_client.http import models
client = QdrantClient("localhost", port=6333)
client.upsert(
collection_name="{collection_name}",
points=[
models.PointStruct(
id="5c56c793-69f3-4fbf-87e6-c4bf54c28c26",
payload={
"color": "red",
},
vector=[0.9, 0.1, 0.1],
),
]
)
and
PUT /collections/{collection_name}/points
{
"points": [
{
"id": 1,
"payload": {"color": "red"},
"vector": [0.9, 0.1, 0.1]
}
]
}
client.upsert(
collection_name="{collection_name}",
points=[
models.PointStruct(
id=1,
payload={
"color": "red",
},
vector=[0.9, 0.1, 0.1],
),
]
)
are both possible.
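The following sketch shows one way an application might validate IDs on the client side before sending them. The helper normalize_point_id is a hypothetical name introduced for illustration; it relies only on Python's standard uuid module, which happens to accept the simple, hyphenated, and urn forms listed above.

```python
import uuid

MAX_U64 = 2**64 - 1  # largest 64-bit unsigned integer


def normalize_point_id(value):
    """Return an int in [0, 2**64 - 1] or a hyphenated lowercase UUID string."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        if 0 <= value <= MAX_U64:
            return value
        raise ValueError(f"integer id out of 64-bit unsigned range: {value}")
    if isinstance(value, str):
        # uuid.UUID raises ValueError for strings that are not valid UUIDs.
        return str(uuid.UUID(value))
    raise TypeError(f"unsupported id type: {type(value).__name__}")


print(normalize_point_id(1))
print(normalize_point_id("936DA01F9ABD4d9d80C702AF85C822A8"))
print(normalize_point_id("urn:uuid:F9168C5E-CEB2-4faa-B6BF-329BF39FA1E4"))
```

Normalizing to one string form on the client is a design choice of this sketch, not something the API requires.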
Upload points
To optimize performance, Qdrant supports batch loading of points. I.e., you can load several points into the service in one API call. Batching allows you to minimize the overhead of creating a network connection.
The Qdrant API supports two ways of creating batches - record-oriented and column-oriented. Internally, these options do not differ and are made only for the convenience of interaction.
Create points with the REST API:
PUT /collections/{collection_name}/points
{
"batch": {
"ids": [1, 2, 3],
"payloads": [
{"color": "red"},
{"color": "green"},
{"color": "blue"}
],
"vectors": [
[0.9, 0.1, 0.1],
[0.1, 0.9, 0.1],
[0.1, 0.1, 0.9]
]
}
}
client.upsert(
collection_name="{collection_name}",
points=models.Batch(
ids=[1, 2, 3],
payloads=[
{"color": "red"},
{"color": "green"},
{"color": "blue"},
],
vectors=[
[0.9, 0.1, 0.1],
[0.1, 0.9, 0.1],
[0.1, 0.1, 0.9],
]
),
)
or the record-oriented equivalent:
PUT /collections/{collection_name}/points
{
"points": [
{
"id": 1,
"payload": {"color": "red"},
"vector": [0.9, 0.1, 0.1]
},
{
"id": 2,
"payload": {"color": "green"},
"vector": [0.1, 0.9, 0.1]
},
{
"id": 3,
"payload": {"color": "blue"},
"vector": [0.1, 0.1, 0.9]
}
]
}
client.upsert(
collection_name="{collection_name}",
points=[
models.PointStruct(
id=1,
payload={
"color": "red",
},
vector=[0.9, 0.1, 0.1],
),
models.PointStruct(
id=2,
payload={
"color": "green",
},
vector=[0.1, 0.9, 0.1],
),
models.PointStruct(
id=3,
payload={
"color": "blue",
},
vector=[0.1, 0.1, 0.9],
),
]
)
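To illustrate that the two batch layouts carry the same information, here is a small sketch that converts a column-oriented batch into the record-oriented form. The function batch_to_records is hypothetical and operates on plain dictionaries shaped like the request bodies above; it does not call Qdrant.

```python
def batch_to_records(batch: dict) -> list:
    """Convert {"ids": [...], "vectors": [...], "payloads": [...]} into a list of points."""
    ids = batch["ids"]
    vectors = batch["vectors"]
    # Payloads are optional; fall back to "no payload" for every point.
    payloads = batch.get("payloads") or [None] * len(ids)
    if not (len(ids) == len(vectors) == len(payloads)):
        raise ValueError("ids, vectors, and payloads must have equal length")

    records = []
    for point_id, vector, payload in zip(ids, vectors, payloads):
        record = {"id": point_id, "vector": vector}
        if payload is not None:
            record["payload"] = payload
        records.append(record)
    return records


batch = {
    "ids": [1, 2, 3],
    "payloads": [{"color": "red"}, {"color": "green"}, {"color": "blue"}],
    "vectors": [[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]],
}
records = batch_to_records(batch)
print(records[0])
```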
All APIs in Qdrant, including point loading, are idempotent: executing the same method several times in a row is equivalent to a single execution.
In this case, it means that points with the same id are overwritten when re-uploaded.
Idempotence is useful if you use, for example, a message queue that doesn't provide an exactly-once guarantee. Even with such a system, Qdrant ensures data consistency.
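The effect of idempotent uploads can be pictured with a toy model. The sketch below uses an in-memory dictionary as a stand-in for a collection; it is not how Qdrant stores points, and the upsert function here is a hypothetical local helper, not the client method.

```python
def upsert(store: dict, points: list) -> None:
    """Insert or overwrite points keyed by id (toy model, not Qdrant code)."""
    for point in points:
        store[point["id"]] = point


store = {}
message = [{"id": 1, "payload": {"color": "red"}, "vector": [0.9, 0.1, 0.1]}]

upsert(store, message)
after_first_delivery = dict(store)

# An at-least-once queue may deliver the same message again.
upsert(store, message)

print(store == after_first_delivery)  # True
print(len(store))                     # 1
```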
Available since v0.10.0
If the collection was created with multiple vectors, the data for each vector should be provided under that vector's name:
PUT /collections/{collection_name}/points
{
"points": [
{
"id": 1,
"vector": {
"image": [0.9, 0.1, 0.1, 0.2],
"text": [0.4, 0.7, 0.1, 0.8, 0.1, 0.1, 0.9, 0.2]
}
},
{
"id": 2,
"vector": {
"image": [0.2, 0.1, 0.3, 0.9],
"text": [0.5, 0.2, 0.7, 0.4, 0.7, 0.2, 0.3, 0.9]
}
}
]
}
client.upsert(
collection_name="{collection_name}",
points=[
models.PointStruct(
id=1,
vector={
"image": [0.9, 0.1, 0.1, 0.2],
"text": [0.4, 0.7, 0.1, 0.8, 0.1, 0.1, 0.9, 0.2],
},
),
models.PointStruct(
id=2,
vector={
"image": [0.2, 0.1, 0.3, 0.9],
"text": [0.5, 0.2, 0.7, 0.4, 0.7, 0.2, 0.3, 0.9],
},
),
]
)
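Before uploading named vectors, an application may want to check them against the collection's configuration. The sketch below assumes a configuration with an "image" vector of size 4 and a "text" vector of size 8, inferred from the example above; EXPECTED_SIZES and check_named_vectors are hypothetical names used only for illustration.

```python
# Assumed collection configuration: vector name -> dimension.
EXPECTED_SIZES = {"image": 4, "text": 8}


def check_named_vectors(point: dict, expected_sizes: dict) -> list:
    """Return a list of problems; an empty list means no problem was found."""
    vectors = point.get("vector")
    if not isinstance(vectors, dict):
        return ["vector must be an object keyed by vector name"]

    problems = []
    for name, values in vectors.items():
        if name not in expected_sizes:
            problems.append(f"unknown vector name: {name}")
        elif len(values) != expected_sizes[name]:
            problems.append(
                f"{name}: expected {expected_sizes[name]} values, got {len(values)}"
            )
    return problems


good = {"id": 1, "vector": {"image": [0.9, 0.1, 0.1, 0.2],
                            "text": [0.4, 0.7, 0.1, 0.8, 0.1, 0.1, 0.9, 0.2]}}
bad = {"id": 2, "vector": {"image": [0.2, 0.1, 0.3]}}

print(check_named_vectors(good, EXPECTED_SIZES))
print(check_named_vectors(bad, EXPECTED_SIZES))
```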
Modify points
You can modify a point in two ways. The first is to modify the vector; currently, this requires re-uploading the point.
The second is to modify the payload, for which there are several methods.
Set payload
REST API (Schema):
POST /collections/{collection_name}/points/payload
{
"payload": {
"property1": "string",
"property2": "string"
},
"points": [
0, 3, 100
]
}
client.set_payload(
collection_name="{collection_name}",
payload={
"property1": "string",
"property2": "string",
},
points=[0, 3, 100],
)
Delete payload keys
REST API (Schema):
POST /collections/{collection_name}/points/payload/delete
{
"keys": ["color", "price"],
"points": [0, 3, 100]
}
client.delete_payload(
collection_name="{collection_name}",
keys=["color", "price"],
points=[0, 3, 100],
)
Clear payload
This method removes all payload keys from the specified points.
REST API (Schema):
POST /collections/{collection_name}/points/payload/clear
{
"points": [0, 3, 100]
}
client.clear_payload(
collection_name="{collection_name}",
points_selector=models.PointIdsList(
points=[0, 3, 100],
)
)
Delete points
REST API (Schema):
POST /collections/{collection_name}/points/delete
{
"points": [0, 3, 100]
}
client.delete(
collection_name="{collection_name}",
points_selector=models.PointIdsList(
points=[0, 3, 100],
),
)
An alternative way to specify which points to remove is to use a filter:
POST /collections/{collection_name}/points/delete
{
"filter": {
"must": [
{
"key": "color",
"match": {
"value": "red"
}
}
]
}
}
client.delete(
collection_name="{collection_name}",
points_selector=models.FilterSelector(
filter=models.Filter(
must=[
models.FieldCondition(
key="color",
match=models.MatchValue(value="red"),
),
],
)
),
)
This example removes all points with { "color": "red" } from the collection.
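Filter bodies of this shape are easy to mistype by hand, so it can help to build them programmatically. The following sketch assembles the same "must" / "match" structure as a plain dictionary; match_filter is a hypothetical helper, and it covers only exact-value matches as used in this section.

```python
import json


def match_filter(**conditions) -> dict:
    """Build a filter requiring key == value for every keyword argument."""
    return {
        "must": [
            {"key": key, "match": {"value": value}}
            for key, value in conditions.items()
        ]
    }


body = {"filter": match_filter(color="red")}
print(json.dumps(body, indent=2))
```

Serializing with json.dumps guarantees syntactically valid JSON, which rules out problems such as a missing comma between fields.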
Retrieve points
There is a method for retrieving points by their ids.
REST API (Schema):
POST /collections/{collection_name}/points
{
"ids": [0, 3, 100]
}
client.retrieve(
collection_name="{collection_name}",
ids=[0, 3, 100],
)
This method has the additional parameters with_vector and with_payload.
Using these parameters, you can select which parts of the point you want in the result.
Excluding the parts you do not need avoids wasting traffic on unused data.
A single point can also be retrieved via the API:
REST API (Schema):
GET /collections/{collection_name}/points/{point_id}
Scroll points
Sometimes it might be necessary to get all stored points without knowing their ids, or to iterate over the points that correspond to a filter.
REST API (Schema):
POST /collections/{collection_name}/points/scroll
{
"filter": {
"must": [
{
"key": "color",
"match": {
"value": "red"
}
}
]
},
"limit": 1,
"with_payload": true,
"with_vector": false
}
client.scroll(
collection_name="{collection_name}",
scroll_filter=models.Filter(
must=[
models.FieldCondition(
key="color",
match=models.MatchValue(value="red")
),
]
),
limit=1,
with_payload=True,
with_vector=False,
)
Returns all points with color = red.
{
"result": {
"next_page_offset": 1,
"points": [
{
"id": 0,
"payload": {
"color": "red"
}
}
]
},
"status": "ok",
"time": 0.0001
}
The Scroll API returns all points that match the filter page by page.
All resulting points are sorted by ID. To query the next page, specify the largest seen ID in the offset field.
For convenience, this ID is also returned in the next_page_offset field.
If the value of the next_page_offset field is null, the last page has been reached.
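The paging loop described above can be sketched as follows. The fetch_page argument is a hypothetical callable that takes an offset and returns the "result" object of a scroll response; fake_fetch_page is an in-memory stand-in written for this example, and its handling of the offset is an assumption chosen to be consistent with the sample response above.

```python
def scroll_all(fetch_page):
    """Yield every point, following next_page_offset until it is None."""
    offset = None
    while True:
        result = fetch_page(offset)
        yield from result["points"]
        offset = result["next_page_offset"]
        if offset is None:
            break


# In-memory stand-in for the Scroll API; points are kept sorted by id.
STORED = [{"id": 0}, {"id": 1}, {"id": 5}]


def fake_fetch_page(offset, limit=1):
    candidates = [p for p in STORED if offset is None or p["id"] >= offset]
    page, rest = candidates[:limit], candidates[limit:]
    return {
        "points": page,
        "next_page_offset": rest[0]["id"] if rest else None,
    }


print([p["id"] for p in scroll_all(fake_fetch_page)])  # [0, 1, 5]
```

In a real application, fetch_page would wrap a call to the scroll endpoint and pass the offset through unchanged.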
Counting points
Available since v0.8.4
Sometimes it can be useful to know how many points fit the filter conditions without doing a real search.
Typical scenarios include:
- Evaluation of results size for faceted search
- Determining the number of pages for pagination
- Debugging the query execution speed
REST API (Schema):
POST /collections/{collection_name}/points/count
{
"filter": {
"must": [
{
"key": "color",
"match": {
"value": "red"
}
}
]
},
"exact": true
}
client.count(
    collection_name="{collection_name}",
    count_filter=models.Filter(
        must=[
            models.FieldCondition(
                key="color",
                match=models.MatchValue(value="red")
            ),
        ]
    ),
    exact=True,
)
Returns the number of points matching the given filtering conditions:
{
"count": 3811
}
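For the pagination scenario mentioned above, the count translates into a number of pages with a ceiling division. This is a small sketch; page_count and the page size of 100 are illustrative choices, not API parameters.

```python
import math


def page_count(total_points: int, page_size: int) -> int:
    """Number of pages needed to show total_points items, page_size per page."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_points / page_size)


print(page_count(3811, 100))  # 39
```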