| Name | Type | Description |
|---|---|---|
| index_name* | str | The name of the index or alias to use for the cache. If it doesn't exist, an index is created according to the default mapping. |
| store_input | bool | Whether to store the input text in the cache. Default: True. |
| metadata | dict | Additional metadata to store in the cache for filtering. Must be JSON serializable. Default: None. |
| namespace | str | A namespace to organize the cache. Default: None. |
| maximum_duplicates_allowed | int | Maximum duplicate keys permitted when using aliases across multiple indices. Default: 1. |
| client | AsyncElasticsearch | Pre-existing Elasticsearch connection. Provide either this or credentials. Default: None. |
| es_url | str | URL of the Elasticsearch instance. Default: None. |
| es_cloud_id | str | Cloud ID of the Elasticsearch instance. Default: None. |
| es_user | str | Username for Elasticsearch. Default: None. |
| es_api_key | str | API key for Elasticsearch. Default: None. |
| es_password | str | Password for Elasticsearch. Default: None. |
Elasticsearch embeddings cache.
Caches embeddings in Elasticsearch to avoid repeated embedding computations.
Setup:

Install langchain_elasticsearch and start Elasticsearch locally using the
start-local script.

```bash
pip install -qU langchain_elasticsearch
curl -fsSL https://elastic.co/start-local | sh
```

This creates an elastic-start-local folder. To start Elasticsearch and Kibana:

```bash
cd elastic-start-local
./start.sh
```

Elasticsearch will be available at http://localhost:9200. The password for
the elastic user and the API key are stored in the .env file in the
elastic-start-local folder.
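If you prefer to read those credentials in code rather than copying them by hand, one option is a small sketch like the one below. It assumes the python-dotenv package and the ES_LOCAL_API_KEY variable name written by the start-local script; check your own .env file for the exact names.

```python
# Hedged sketch: load credentials generated by start-local from its .env file.
# The variable name ES_LOCAL_API_KEY is an assumption -- verify it in your .env.
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv("elastic-start-local/.env")
es_api_key = os.environ["ES_LOCAL_API_KEY"]
```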
Initialize the Elasticsearch embeddings cache store.
Instantiate:

```python
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_url="http://localhost:9200",
)
```

Instantiate with API key:

```python
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_url="http://localhost:9200",
    es_api_key="your-api-key",
)
```

Instantiate from cloud:

```python
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_cloud_id="<cloud_id>",
    es_api_key="your-api-key",
)
```

Instantiate from existing connection:

```python
from elasticsearch import Elasticsearch
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

client = Elasticsearch("http://localhost:9200")

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    client=client,
)
```
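Instantiate with a namespace and filterable metadata (a sketch; the namespace and metadata values below are illustrative, not required):

```python
from langchain_elasticsearch import ElasticsearchEmbeddingsCache

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_url="http://localhost:9200",
    namespace="my-project",        # illustrative namespace to organize the cache
    metadata={"team": "search"},   # must be JSON serializable; stored for filtering
)
```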
Use with CacheBackedEmbeddings:

```python
from langchain.embeddings import CacheBackedEmbeddings
from langchain_elasticsearch import ElasticsearchEmbeddingsCache
from langchain_openai import OpenAIEmbeddings

underlying_embeddings = OpenAIEmbeddings()

cache = ElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_url="http://localhost:9200",
)

cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings,
    cache,
    namespace=underlying_embeddings.model,
)
```
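Once wrapped, the cached embedder is used like any other embeddings object; repeated inputs are served from Elasticsearch instead of being recomputed. A brief sketch (the input texts are illustrative):

```python
# First call computes embeddings and writes them to the Elasticsearch cache.
vectors = cached_embeddings.embed_documents(["hello world", "goodbye world"])

# A second call with the same inputs reads the cached vectors instead of
# calling the underlying embedding model again.
vectors_again = cached_embeddings.embed_documents(["hello world", "goodbye world"])
```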
For synchronous applications, use the ElasticsearchEmbeddingsCache class.
For asynchronous applications, use the AsyncElasticsearchEmbeddingsCache
class.
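A minimal sketch of the asynchronous variant, assuming AsyncElasticsearchEmbeddingsCache accepts the same constructor arguments as the synchronous class:

```python
from langchain_elasticsearch import AsyncElasticsearchEmbeddingsCache

cache = AsyncElasticsearchEmbeddingsCache(
    index_name="embeddings-cache",
    es_url="http://localhost:9200",
)

# As a byte store it exposes the async LangChain store methods, e.g.:
# await cache.amget(keys), await cache.amset(key_value_pairs), await cache.amdelete(keys)
```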