.. warning:: Beta Feature!
Cache provides an optional caching layer for LLMs.

Cache is useful for two reasons:

- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.
- It can speed up your application by reducing the number of API calls you make to the LLM provider.
Cache directly competes with Memory. See documentation for Pros and Cons.
Class hierarchy:
.. code-block::

    BaseCache --> <name>Cache  # Examples: InMemoryCache, RedisCache, GPTCache
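Concretely, every cache on this page implements the small ``BaseCache`` interface from ``langchain_core``. As a minimal sketch of the hierarchy (the class name ``MyCache`` and the dict backing are illustrative; functionally this is close to what ``InMemoryCache`` provides):

.. code-block:: python

    from typing import Any, Dict, Optional, Tuple

    from langchain_core.caches import RETURN_VAL_TYPE, BaseCache


    class MyCache(BaseCache):
        """Toy dict-backed cache, keyed on the (prompt, llm_string) pair."""

        def __init__(self) -> None:
            self._store: Dict[Tuple[str, str], RETURN_VAL_TYPE] = {}

        def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
            # Return the cached generations, or None on a miss.
            return self._store.get((prompt, llm_string))

        def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> None:
            # Store the generations produced for this prompt/model combination.
            self._store[(prompt, llm_string)] = return_val

        def clear(self, **kwargs: Any) -> None:
            self._store.clear()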
Cosmos DB Similarity Type, as an enumerator.
Cosmos DB Vector Search Type, as an enumerator.
Enumerator of the distance strategies for calculating distances between vectors.
Azure Cosmos DB for MongoDB vCore vector store.

To use, you should have both:

- the ``pymongo`` python package installed
- a connection string associated with a MongoDB vCore cluster

Cache that stores things in memory.
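Enabling the in-memory cache is a one-liner; this is the standard pattern for every cache listed below:

.. code-block:: python

    from langchain_community.cache import InMemoryCache
    from langchain_core.globals import set_llm_cache

    # Subsequent LLM calls in this process reuse cached generations
    # for repeated (prompt, model parameters) pairs.
    set_llm_cache(InMemoryCache())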
SQLite table for full LLM Cache (all generations).
Cache that uses SQLAlchemy as a backend.
Cache that uses SQLite as a backend.
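For the SQLite-backed cache, a typical setup (the database path is an arbitrary local file):

.. code-block:: python

    from langchain_community.cache import SQLiteCache
    from langchain_core.globals import set_llm_cache

    # Unlike InMemoryCache, cached generations survive process restarts.
    set_llm_cache(SQLiteCache(database_path=".langchain.db"))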
Cache that uses Upstash Redis as a backend.
Cache that uses Redis as a backend, using a synchronous ``redis.Redis`` client.
Cache that uses Redis as a backend, using an asynchronous ``redis.asyncio.Redis`` client.
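A sketch for the synchronous Redis cache, assuming a Redis server on localhost (the async variant is constructed the same way from a ``redis.asyncio.Redis`` client):

.. code-block:: python

    from langchain_community.cache import RedisCache
    from langchain_core.globals import set_llm_cache
    from redis import Redis

    set_llm_cache(RedisCache(redis_=Redis.from_url("redis://localhost:6379")))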
Cache that uses Redis as a vector-store backend.
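The semantic Redis cache matches on embedding similarity rather than exact prompt equality; a sketch, with the embedding left as a placeholder:

.. code-block:: python

    from langchain_community.cache import RedisSemanticCache
    from langchain_core.globals import set_llm_cache

    my_embedding = ...  # any Embeddings implementation

    set_llm_cache(RedisSemanticCache(
        redis_url="redis://localhost:6379",
        embedding=my_embedding,
    ))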
Cache that uses GPTCache as a backend.
Cache that uses Momento as a backend. See https://gomomento.com/
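A sketch for Momento, assuming your Momento credentials are available to the SDK (e.g. via its auth-token environment variable):

.. code-block:: python

    from datetime import timedelta

    from langchain_community.cache import MomentoCache
    from langchain_core.globals import set_llm_cache

    # Creates the cache if needed; entries expire after the TTL.
    set_llm_cache(MomentoCache.from_client_params("langchain", ttl=timedelta(days=1)))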
Cache that uses Cassandra / Astra DB as a backend.
Example:
.. code-block:: python

    import cassio
    from langchain_community.cache import CassandraCache
    from langchain_core.globals import set_llm_cache

    cassio.init(auto=True)  # Requires env. variables, see CassIO docs

    set_llm_cache(CassandraCache())
It uses a single Cassandra table. The lookup keys (which get to form the primary key) are:

- ``prompt``, a string
- ``llm_string``, a deterministic str representation of the model parameters (needed to prevent same-prompt-different-model collisions)
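To make the key pair concrete, here is a hedged illustration of the low-level ``lookup``/``update`` interface (normally invoked by the LLM machinery, which derives ``llm_string`` internally; the strings below are made up):

.. code-block:: python

    from langchain_core.outputs import Generation

    cache = CassandraCache()  # assumes cassio.init(...) ran as above
    llm_a = "model='m1', temperature=0.0"  # hypothetical serialized params
    llm_b = "model='m1', temperature=0.7"

    cache.update("Tell me a joke", llm_a, [Generation(text="Why did...")])
    cache.lookup("Tell me a joke", llm_a)  # hit: returns the cached generations
    cache.lookup("Tell me a joke", llm_b)  # miss: returns None (different params)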
Cache that uses Cassandra as a vector-store backend for semantic (i.e. similarity-based) lookup.
Example:
.. code-block:: python

    import cassio
    from langchain_community.cache import CassandraSemanticCache
    from langchain_core.globals import set_llm_cache

    cassio.init(auto=True)  # Requires env. variables, see CassIO docs

    my_embedding = ...

    set_llm_cache(CassandraSemanticCache(
        embedding=my_embedding,
        table_name="my_semantic_cache",
    ))
It uses a single (vector) Cassandra table and stores, in principle, cached values from several LLMs, so the LLM's llm_string is part of the rows' primary keys.
One can choose a similarity measure (default: "dot" for dot-product). Choosing another one ("cos", "l2") almost certainly requires threshold tuning, which may be in order anyway, even when sticking to "dot".
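A hedged tuning sketch; the ``similarity_measure`` and ``score_threshold`` parameter names reflect recent ``langchain_community`` releases and may differ in older ones:

.. code-block:: python

    set_llm_cache(CassandraSemanticCache(
        embedding=my_embedding,
        table_name="my_semantic_cache",
        similarity_measure="cos",  # instead of the default "dot"
        score_threshold=0.9,       # tune for the chosen measure and embedding
    ))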
Cache that uses an Azure Cosmos DB for MongoDB vCore vector store as a backend.
Cache that uses an Azure Cosmos DB for NoSQL backend.
Cache that uses an OpenSearch vector store as a backend.
Cache that uses a Memcached backend through the pymemcache client library.
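A sketch for the Memcached cache, assuming a memcached server on localhost:

.. code-block:: python

    from langchain_community.cache import MemcachedCache
    from langchain_core.globals import set_llm_cache
    from pymemcache.client.base import Client

    set_llm_cache(MemcachedCache(Client("localhost")))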
Azure Cosmos DB for NoSQL vector store.
To use, you should have the ``azure-cosmos`` python package installed.

You can read more about vector search, full text search and hybrid search using AzureCosmosDBNoSQL here:

- https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/vector-search
- https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/full-text-search
- https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/hybrid-search
SingleStore DB vector store.
The prerequisite for using this class is the installation of the ``singlestoredb`` Python package.

The SingleStoreDB vector store can be created by providing an embedding function and the relevant parameters for the database connection, the connection pool, and, optionally, the names of the table and the fields to use.
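A minimal construction sketch; the connection URL is a placeholder, and passing it through the ``SINGLESTOREDB_URL`` environment variable is one of the supported options:

.. code-block:: python

    import os

    from langchain_community.vectorstores import SingleStoreDB

    os.environ["SINGLESTOREDB_URL"] = "user:password@localhost:3306/db"  # placeholder

    my_embedding = ...  # any Embeddings implementation
    vectorstore = SingleStoreDB(my_embedding, table_name="my_documents")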
Cache that uses SingleStore DB as a backend.