Valkey-Search (BSD-3-Clause), provided as a Valkey module, is a high-performance Vector Similarity Search engine optimized for AI-driven workloads. It delivers single-digit millisecond latency and high QPS, and can handle billions of vectors with over 99% recall.
Valkey-Search allows users to create indexes and perform similarity searches, incorporating complex filters. It supports Approximate Nearest Neighbor (ANN) search with HNSW and exact matching using K-Nearest Neighbors (KNN). Users can index data using either Valkey Hash or Valkey-JSON data types.
While Valkey-Search currently focuses on Vector Search, its goal is to extend Valkey into a full-fledged search engine, supporting full-text search and additional indexing options.
Valkey-Search’s ability to search billions of vectors with millisecond latencies makes it ideal for real-time applications.
The following commands are supported:
FT.CREATE
FT.DROPINDEX
FT.INFO
FT._LIST
FT.SEARCH
For a detailed description of the supported commands, examples and configuration options, see the Command Reference.
Valkey-Search supports both Standalone and Cluster modes. Query processing and ingestion scale linearly with CPU cores in both modes. For large storage requirements, users can leverage Cluster mode for horizontal scaling of the keyspace.
If replica lag is acceptable, users can achieve horizontal query scaling by directing clients to read from replicas.
Valkey-Search supports hybrid queries, combining vector similarity search with filtering on indexed fields, such as Numeric and Tag indexes.
Tags are text fields that are interpreted as a list of tags delimited by a separator character. Tag fields typically hold a small, finite set of possible values, such as a color, book genre, city name, or author.
Below are some examples of building a filter query on a field named color. Here { and } are part of the syntax, and | is used as an OR operator to match any of several tags. The general syntax is:
@<field_name>:{<tag>}
or
@<field_name>:{<tag1> | <tag2>}
or
@<field_name>:{<tag1> | <tag2> | ...}
For example, the following query will return documents with blue OR black OR green color.
@color:{blue | black | green}
As another example, the following query will return documents tagged “hello world” or “hello universe”:
@color:{hello world | hello universe}
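The tag syntax above can be generated programmatically. The following is a minimal sketch, assuming a hypothetical helper named tag_filter that is not part of Valkey-Search; it only joins tags with the | operator and does not escape special characters.

```python
def tag_filter(field, tags):
    """Build a tag filter expression such as @color:{blue | black}.

    Hypothetical helper for illustration; it performs no escaping of
    special characters inside tag values.
    """
    if not tags:
        raise ValueError("at least one tag is required")
    return "@" + field + ":{" + " | ".join(tags) + "}"


print(tag_filter("color", ["blue", "black", "green"]))
# prints: @color:{blue | black | green}
```

Passing a single tag yields the single-tag form, for example @color:{blue}.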
Numeric indexes allow queries to be filtered so that they only return values between a given start and end value. -inf and +inf can be used to express an open-ended start or end of the range. As an example, the following query will return books published between 2021 and 2024 (both inclusive). The equivalent mathematical expression is 2021 <= year <= 2024:
"@year:[2021 2024]"
The following query will return books published between 2021 (exclusive) and 2024 (inclusive). The equivalent mathematical expression is 2021 < year <= 2024:
"@year:[(2021 2024]"
The following query will return books published before 2024
(inclusive). The equivalent mathematical expression is year
<= 2024:
@year:[(-inf 2024]
The following query will return books published in or after 2015 (inclusive). The equivalent mathematical expression is year >= 2015:
@year:[2015 +inf]
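The range syntax above can also be built from code. The sketch below assumes a hypothetical helper named numeric_filter; it is not part of Valkey-Search. It marks an exclusive bound with a leading ( as in the examples above. Support for an exclusive end bound is assumed by symmetry with the exclusive start bound shown in this section.

```python
import math


def _bound(value, exclusive):
    """Render one range bound, mapping infinities to +inf / -inf."""
    if value == math.inf:
        text = "+inf"
    elif value == -math.inf:
        text = "-inf"
    else:
        text = str(value)
    return "(" + text if exclusive else text


def numeric_filter(field, start=-math.inf, end=math.inf,
                   start_exclusive=False, end_exclusive=False):
    """Build a numeric range filter such as @year:[2021 2024].

    Hypothetical helper for illustration only.
    """
    return "@{}:[{} {}]".format(
        field, _bound(start, start_exclusive), _bound(end, end_exclusive)
    )


print(numeric_filter("year", 2021, 2024))
# prints: @year:[2021 2024]
print(numeric_filter("year", 2021, 2024, start_exclusive=True))
# prints: @year:[(2021 2024]
print(numeric_filter("year", start=2015))
# prints: @year:[2015 +inf]
```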
A query that utilizes a filter expression to filter results is called a hybrid query. Any combination of tag and numeric indexes can form a hybrid query.
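A hybrid query string can be assembled from tag and numeric filter expressions. The sketch below is illustrative only. The =>[KNN k @field $param] clause and the use of a space to combine filters with AND are assumptions based on the RediSearch-style query syntax and are not spelled out in this section, so check them against the Command Reference. The helper name hybrid_query, the field name embedding, and the parameter name query_vec are hypothetical.

```python
def hybrid_query(filters, k, vector_field, param_name):
    """Combine filter expressions with a KNN clause.

    Assumption: filters separated by a space are ANDed together and the
    vector clause has the form =>[KNN k @field $param].
    """
    knn_clause = "=>[KNN {} @{} ${}]".format(k, vector_field, param_name)
    if not filters:
        # No filter: match every document before the vector search.
        return "*" + knn_clause
    return "(" + " ".join(filters) + ")" + knn_clause


query = hybrid_query(
    ["@color:{blue | black}", "@year:[2021 2024]"],
    k=5,
    vector_field="embedding",   # hypothetical vector field name
    param_name="query_vec",     # hypothetical query parameter name
)
print(query)
# prints: (@color:{blue | black} @year:[2021 2024])=>[KNN 5 @embedding $query_vec]
```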
Pre-filtering: Pre-filtering relies on secondary indexes (e.g. tag, numeric) to first find the matches to the filter expression regardless of vector similarity. Once the filtered results are calculated, a brute-force search is performed to sort by vector similarity.
Inline-filtering: Inline-filtering performs the vector search algorithm (e.g. HNSW), ignoring found vectors which don’t match the filter.
Pre-filtering is faster when the filtered search space is much smaller than the original search space. When the filtered search space is large, inline-filtering becomes faster. The query planner for Valkey-Search automatically chooses between the two strategies based on the provided filter.
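The difference between the two strategies can be illustrated with a toy, in-memory model. This is a sketch under stated assumptions, not Valkey-Search's implementation: a full sort by distance stands in for the HNSW traversal, and the 10% threshold in choose_strategy is a made-up value, since the actual planner heuristic is not described here. All function names and the sample documents are hypothetical.

```python
import math


def pre_filter_search(docs, query, predicate, k):
    """Apply the filter first, then brute-force sort survivors by distance."""
    candidates = [d for d in docs if predicate(d)]
    candidates.sort(key=lambda d: math.dist(d["vec"], query))
    return candidates[:k]


def inline_filter_search(docs, query, predicate, k):
    """Visit documents in vector order, skipping those that fail the filter.

    A real engine would obtain this order from an ANN index such as HNSW;
    a full sort stands in for that traversal here.
    """
    results = []
    for d in sorted(docs, key=lambda d: math.dist(d["vec"], query)):
        if predicate(d):
            results.append(d)
            if len(results) == k:
                break
    return results


def choose_strategy(matching, total, threshold=0.1):
    """Toy planner: pre-filter when few documents match (threshold is hypothetical)."""
    return "pre-filter" if matching < total * threshold else "inline-filter"


docs = [
    {"id": "a", "vec": (0.0, 0.0), "color": "blue"},
    {"id": "b", "vec": (1.0, 0.0), "color": "red"},
    {"id": "c", "vec": (2.0, 0.0), "color": "blue"},
    {"id": "d", "vec": (3.0, 0.0), "color": "blue"},
]
is_blue = lambda d: d["color"] == "blue"

print([d["id"] for d in pre_filter_search(docs, (0.9, 0.0), is_blue, 2)])
# prints: ['a', 'c']
print([d["id"] for d in inline_filter_search(docs, (0.9, 0.0), is_blue, 2)])
# prints: ['a', 'c']
```

In this toy model both strategies return the same results; they differ only in how much work is done before the filter is applied.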
To check the server’s overall search metrics, you can use the
INFO SEARCH or INFO MODULES commands.
The following metrics are added to the INFO command’s
output:
search_used_memory_human: A human-friendly readable version of the search_used_memory_bytes metric
search_used_memory_bytes: The total bytes of memory that all indexes occupy
search_number_of_indexes: Index schema total count
search_number_of_attributes: Total count of attributes for all indexes
search_total_indexed_documents: Total count of all keys for all indexes
search_background_indexing_status: (String) The status of the indexing process. NO_ACTIVITY indicates idle indexing
search_failure_requests_count: A count of all failed requests, including syntax errors
search_successful_requests_count: A count of all successful requests
search_hnsw_create_exceptions_count: Count of HNSW creation unexpected errors
search_hnsw_search_exceptions_count: Count of HNSW search unexpected errors
search_hnsw_remove_exceptions_count: Count of HNSW removal unexpected errors
search_hnsw_add_exceptions_count: Count of HNSW addition unexpected errors
search_hnsw_modify_exceptions_count: Count of HNSW modification unexpected errors
search_modify_subscription_skipped_count: Count of skipped subscription modifications
search_remove_subscription_successful_count: Count of successful subscription removals
search_remove_subscription_skipped_count: Count of skipped subscription removals
search_remove_subscription_failure_count: Count of failed subscription removals
search_add_subscription_successful_count: Count of successfully added subscriptions
search_add_subscription_failure_count: Count of failures of adding subscriptions
search_add_subscription_skipped_count: Count of skipped subscription adding processes
search_modify_subscription_failure_count: Count of failed subscription modifications
search_modify_subscription_successful_count: Count of successful subscription modifications
The following list of configurations can be passed to the
loadmodule command:
--reader-threads: (Optional) Controls the number of threads executing queries. (Default: number of physical CPU cores on the host machine)
--writer-threads: (Optional) Controls the number of threads processing index mutations. (Default: number of physical CPU cores on the host machine)
--use-coordinator: (Optional) Cluster mode enabler. (Default: false)
--hnsw-block-size: (Optional) Specifies the allocation block size used by the HNSW graph for storing new vectors. Larger block sizes may improve performance by enhancing CPU cache efficiency, but come at the cost of increased memory usage due to pre-allocation for potential future growth. (Default: 10K)
--log-level: Controls the log verbosity level. Possible values are: debug, verbose, notice and warning. (Default: Valkey’s log level)
The following list of configurations can be modified at runtime using
the CONFIG SET command:
search.hnsw-block-size: Specifies the allocation block
size used by the HNSW graph for storing new vectors. Larger block sizes
may improve performance by enhancing CPU cache efficiency, but come at
the cost of increased memory usage due to pre-allocation for potential
future growth. (Default: 10K)
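For monitoring, the search_ metrics reported by INFO SEARCH (listed earlier) can be collected into a dictionary. The sketch below assumes the usual INFO text layout of key:value lines with # section headers. The helper name parse_search_metrics is hypothetical, and the sample text is hand-written for illustration rather than captured from a real server.

```python
def parse_search_metrics(info_text):
    """Extract search_* metrics from INFO-style text into a dict.

    Values made up only of digits are converted to int; everything else
    is kept as a string.
    """
    metrics = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep and key.startswith("search_"):
            metrics[key] = int(value) if value.isdigit() else value
    return metrics


# Hand-written sample, not real server output.
sample = """# search
search_used_memory_bytes:1048576
search_number_of_indexes:2
search_background_indexing_status:NO_ACTIVITY
"""

metrics = parse_search_metrics(sample)
print(metrics["search_number_of_indexes"])
# prints: 2
```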