Recursively cast the values in a dict to a form that json.dump can serialize.
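A minimal sketch of this kind of recursive cast, using only the standard library. The name to_json_safe and the conversion rules (ISO strings for dates, str() as the fallback) are assumptions for illustration, not the library's actual implementation.

```python
import json
from datetime import date, datetime
from typing import Any
from uuid import UUID


def to_json_safe(value: Any) -> Any:
    """Hypothetical helper: recursively convert a value so json.dumps accepts it."""
    if isinstance(value, dict):
        # JSON object keys must be strings, so keys are cast as well.
        return {str(k): to_json_safe(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(v) for v in value]
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    # Assumed fallback for everything else (UUIDs, bytes, driver-specific ids).
    return str(value)


doc = {
    "_id": UUID(int=1),
    "created": datetime(2024, 1, 2, 3, 4, 5),
    "tags": ("a", "b"),
}
safe_doc = to_json_safe(doc)
print(json.dumps(safe_doc))
```

Nested dicts and lists are handled by the same function, so the cast reaches values at any depth.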
Prepare a query for vector search based on the embedding type.
This function checks if the embedding is an AutoEmbeddings instance. If it is, the query is returned as-is (string) for server-side embedding. Otherwise, the query is embedded using the embedding model's embed_query method.
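The branching described above can be sketched as follows. Both classes here are hypothetical stand-ins defined only so the example runs on its own; the real AutoEmbeddings class and the function's actual signature may differ.

```python
from typing import Any, List, Union


class AutoEmbeddings:
    """Hypothetical marker class meaning 'the server embeds the text'."""


class ToyEmbeddings:
    """Hypothetical client-side model exposing embed_query."""

    def embed_query(self, text: str) -> List[float]:
        # Toy two-dimensional "embedding": length and number of spaces.
        return [float(len(text)), float(text.count(" "))]


def prepare_query(embedding: Any, query: str) -> Union[str, List[float]]:
    """Return the raw string for server-side embedding, else a vector."""
    if isinstance(embedding, AutoEmbeddings):
        return query
    return embedding.embed_query(query)


server_side = prepare_query(AutoEmbeddings(), "hello world")
client_side = prepare_query(ToyEmbeddings(), "hello world")
```

The caller can then branch on the result type: a string is sent as text, a list of floats is sent as a query vector.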
MongoDB Atlas vector store integration.
MongoDBAtlasVectorSearch performs data operations on text, embeddings and arbitrary data. In addition to CRUD operations, the VectorStore provides Vector Search based on similarity of embedding vectors following the Hierarchical Navigable Small Worlds (HNSW) algorithm.
It supports several search types for scoring and selecting results: "similarity" (default), "MMR", and "similarity_score_threshold". These are described under the search_type argument of as_retriever, which provides the Runnable.invoke(query) API, allowing MongoDBAtlasVectorSearch to be used within a chain.
MongoDB Collection providing BaseStore interface.
This is meant to be treated as a key-value store that maps str keys to Document values: [str, Document]
In a MongoDB Collection, the field name _id is reserved for use as a primary key. Its value must be unique in the collection, is immutable, and may be of any type other than an array or regex. As this field is always indexed, it is the natural choice to hold keys.
The value will be held simply in a field called "value". It can contain any valid BSON type.
Example key value pair: {"_id": "foo", "value": "bar"}.
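To illustrate the record layout, here is a small in-memory sketch in which every entry has the shape {"_id": key, "value": value}. The class name is hypothetical and a plain dict stands in for the MongoDB Collection. The method names follow LangChain's BaseStore interface (mget, mset, mdelete, yield_keys), but this is not the library's implementation.

```python
from typing import Any, Dict, Iterator, List, Optional, Sequence, Tuple


class InMemoryKeyValueStore:
    """Hypothetical stand-in: a dict plays the role of the collection."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def mset(self, pairs: Sequence[Tuple[str, Any]]) -> None:
        for key, value in pairs:
            # Writing an existing key replaces the record, like an upsert on _id.
            self._docs[key] = {"_id": key, "value": value}

    def mget(self, keys: Sequence[str]) -> List[Optional[Any]]:
        # Missing keys yield None, in the same position as the requested key.
        return [self._docs[k]["value"] if k in self._docs else None for k in keys]

    def mdelete(self, keys: Sequence[str]) -> None:
        for key in keys:
            self._docs.pop(key, None)

    def yield_keys(self, prefix: Optional[str] = None) -> Iterator[str]:
        for key in self._docs:
            if prefix is None or key.startswith(prefix):
                yield key


store = InMemoryKeyValueStore()
store.mset([("foo", "bar"), ("foo2", {"nested": 1})])
values = store.mget(["foo", "missing"])
```

Because _id is unique, setting the same key twice leaves a single record holding the latest value.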
MongoDB Atlas's ParentDocumentRetriever
“Parent Document Retrieval” is a common approach to enhance the performance of retrieval methods in RAG by providing the LLM with a broader context to consider. In essence, we divide the original documents into relatively small chunks, embed each one, and store them in a vector database. Using such small chunks (a sentence or a couple of sentences) helps the embedding models to better reflect their meaning. At query time, the search matches chunks, but the retriever returns the parent documents they came from. If two high-scoring chunks are contained in the same document, the query response will include the parent document just once. One can control the number of chunks found in the vector_search_stage by setting search_kwargs = {'top_k': n}. The number of query responses will be <= top_k.
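The deduplication step can be sketched in plain Python. The function name, the (parent_id, score) hit format, and the sample data are assumptions for illustration; the real retriever performs this work against the collection.

```python
from typing import Dict, List, Tuple


def parents_for_hits(
    hits: List[Tuple[str, float]], parents: Dict[str, str]
) -> List[str]:
    """Map ranked chunk hits to their parent documents, each parent once."""
    seen = set()
    result = []
    for parent_id, _score in hits:
        # Keep only the first (highest-ranked) hit for each parent.
        if parent_id not in seen:
            seen.add(parent_id)
            result.append(parents[parent_id])
    return result


# Three chunk hits (top_k = 3), two of which share the parent "doc-1".
hits = [("doc-1", 0.93), ("doc-2", 0.88), ("doc-1", 0.85)]
parents = {
    "doc-1": "Full text of document 1",
    "doc-2": "Full text of document 2",
}
responses = parents_for_hits(hits, parents)
```

With three chunk hits and two distinct parents, two documents come back, which is why the number of responses is at most top_k.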
In this implementation, we can store both parent and child documents in a single collection while only having to compute and index embedding vectors for the chunks!
This is achieved by backing both the vectorstore, :class:`~langchain_mongodb.vectorstores.MongoDBAtlasVectorSearch`, and the docstore, :class:`~langchain_mongodb.docstores.MongoDBDocStore`, by the same MongoDB Collection.
For more details, see superclasses :class:`~langchain.retrievers.parent_document_retriever.ParentDocumentRetriever` and :class:`~langchain.retrievers.MultiVectorRetriever`.