InMemoryVectorStore

In-memory vector store implementation.

Uses a dictionary, and computes cosine similarity for search using numpy.

InMemoryVectorStore(
    embedding: Embeddings,
)

| Name | Type | Description |
|---|---|---|
| embedding | Embeddings | Embedding function to use. |

Methods:
- get_by_ids: Get documents by their ids.
- aget_by_ids: Async get documents by their ids.
- asearch: Async return docs most similar to query using a specified search type.
- similarity_search_by_vector: Search for the most similar documents to the given embedding.
- load: Load a vector store from a file.
- dump: Dump the vector store to a file.
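Since search is a cosine-similarity scan over a dictionary of records, the core idea can be sketched in a few lines of numpy. The record layout below mirrors the 'id'/'vector'/'text' keys described in the inspection example, but the code is an illustration, not the library's implementation:

```python
import numpy as np

# Toy store: id -> record, mirroring the 'vector'/'text' record layout.
store = {
    "1": {"vector": np.array([1.0, 0.0]), "text": "foo"},
    "2": {"vector": np.array([0.0, 1.0]), "text": "thud"},
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_search_sketch(query_vec: np.ndarray, k: int = 1) -> list[tuple[float, str]]:
    # Score every record against the query, then keep the top k.
    scored = [
        (cosine_similarity(query_vec, rec["vector"]), rec["text"])
        for rec in store.values()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# A query vector pointing mostly along the second axis ranks "thud" first.
print(similarity_search_sketch(np.array([0.1, 0.9]), k=1))
```

This brute-force scan is O(n) per query, which is exactly why this store is suited to small collections and prototyping rather than production-scale retrieval.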
Setup:
Install langchain-core.
pip install -U langchain-core
Key init args — indexing params:
    embedding: Embeddings
        Embedding function to use.
Instantiate:
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings
vector_store = InMemoryVectorStore(OpenAIEmbeddings())
Add Documents:
from langchain_core.documents import Document
document_1 = Document(id="1", page_content="foo", metadata={"baz": "bar"})
document_2 = Document(id="2", page_content="thud", metadata={"bar": "baz"})
document_3 = Document(id="3", page_content="i will be deleted :(")
documents = [document_1, document_2, document_3]
vector_store.add_documents(documents=documents)
Inspect documents:
top_n = 10
for index, (id, doc) in enumerate(vector_store.store.items()):
    if index < top_n:
        # docs have keys 'id', 'vector', 'text', 'metadata'
        print(f"{id}: {doc['text']}")
    else:
        break
Delete Documents:
vector_store.delete(ids=["3"])
Search:
results = vector_store.similarity_search(query="thud", k=1)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")
* thud [{'bar': 'baz'}]
Search with filter:
def _filter_function(doc: Document) -> bool:
    return doc.metadata.get("bar") == "baz"

results = vector_store.similarity_search(
    query="thud", k=1, filter=_filter_function
)
for doc in results:
    print(f"* {doc.page_content} [{doc.metadata}]")
* thud [{'bar': 'baz'}]
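The filter argument is simply a predicate over Document objects: candidates for which it returns False are dropped before ranking. A minimal pure-Python sketch of that behavior (the Doc dataclass here is a hypothetical stand-in, not the langchain-core Document class):

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    page_content: str
    metadata: dict = field(default_factory=dict)

docs = [
    Doc("foo", {"baz": "bar"}),
    Doc("thud", {"bar": "baz"}),
]

def keep(doc: Doc) -> bool:
    # Same predicate as _filter_function above.
    return doc.metadata.get("bar") == "baz"

# The store applies the predicate first, then ranks only the survivors.
candidates = [d for d in docs if keep(d)]
print([d.page_content for d in candidates])
```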
Search with score:
results = vector_store.similarity_search_with_score(query="qux", k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")
* [SIM=0.832268] foo [{'baz': 'bar'}]
Async:
# add documents
# await vector_store.aadd_documents(documents=documents)
# delete documents
# await vector_store.adelete(ids=["3"])
# search
# results = await vector_store.asimilarity_search(query="thud", k=1)
# search with score
results = await vector_store.asimilarity_search_with_score(query="qux", k=1)
for doc, score in results:
    print(f"* [SIM={score:3f}] {doc.page_content} [{doc.metadata}]")
* [SIM=0.832268] foo [{'baz': 'bar'}]
Use as Retriever:
retriever = vector_store.as_retriever(
    search_type="mmr",
    search_kwargs={"k": 1, "fetch_k": 2, "lambda_mult": 0.5},
)
retriever.invoke("thud")
[Document(id='2', metadata={'bar': 'baz'}, page_content='thud')]
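The "mmr" search type used above implements maximal marginal relevance: each pick trades query relevance against redundancy with the documents already selected, weighted by lambda_mult (1.0 is pure relevance, 0.0 is pure diversity). A rough numpy sketch of the selection loop, as an illustration of the algorithm rather than langchain-core's implementation:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mmr_select(query_vec, doc_vecs, k, lambda_mult=0.5):
    """Return indices of k documents chosen by maximal marginal relevance."""
    selected: list[int] = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def mmr_score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Penalty: similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

query = np.array([1.0, 0.2])
docs = [np.array([1.0, 0.0]), np.array([0.95, 0.05]), np.array([0.0, 1.0])]
# With lambda_mult=0.5 the second pick favors the dissimilar document
# over the near-duplicate of the first pick.
print(mmr_select(query, docs, k=2, lambda_mult=0.5))
```

This is why the retriever config pairs k with a larger fetch_k: the store first fetches fetch_k candidates by plain similarity, then the MMR loop whittles them down to k diverse results.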