Class●Since v0.3

MongoDBGraphStore

GraphRAG DataStore

GraphRAG is a ChatModel that answers semantic queries using a Knowledge Graph created by an LLM. As in Vector RAG, we supplement what the Chat Model learned in training with relevant information collected from documents.

In Vector RAG, one uses an "Embedding" model that converts both the query and the potentially relevant documents into vectors. These vectors are compared, and the most similar documents are supplied to the Chat Model as context for the query.

In Graph RAG, one uses an "Entity-Extraction" model that converts text into Entities and their relationships, which together form a Knowledge Graph. Comparison is done by Graph traversal, finding entities connected to those named in the query prompt. These are then supplied to the Chat Model as context. The main difference is that GraphRAG's output is typically in a structured format.

GraphRAG excels at finding links and common entities, even if these come from different articles. It can combine information from distinct sources, providing richer context than Vector RAG in certain cases.

Here are a few examples of so-called multi-hop questions where GraphRAG excels:

What is the connection between ACME Corporation and GreenTech Ltd.?
Who is leading the SolarGrid Initiative, and what is their role?
Which organizations are participating in the SolarGrid Initiative?
What is John Doe’s role in ACME’s renewable energy projects?
Which company is headquartered in San Francisco and involved in the SolarGrid Initiative?

In Graph RAG, one uses an Entity-Extraction model that interprets the text it is given and converts both the query and the potentially relevant documents into graphs. These are composed of nodes that are entities (nouns) and edges that are relationships. The idea is that the graph can find connections between entities and hence answer questions that require more than one connection.

In MongoDB, Knowledge Graphs are stored in a single Collection. Each MongoDB Document represents a single entity (node), and its relationships (edges) are defined in a nested field named "relationships". The schema, and an example, are described in `entity_context` in the `langchain_mongodb.graphrag.prompts` module.
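
To make the storage layout concrete, here is a minimal sketch of what one entity document might look like as a Python dict. Only `_id`, `type`, and the nested `relationships` field are taken from this page. The inner layout of `relationships` (parallel `target_ids` and `types` lists) and the `edges_of` helper are assumptions made for illustration; the prompts module remains the authoritative description of the schema.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical entity document: one document per node, edges nested inside it.
entity: Dict[str, Any] = {
    "_id": "Jane Smith",  # the entity name serves as the document _id
    "type": "Person",
    "relationships": {  # assumed inner layout: two parallel lists
        "target_ids": ["John Doe", "MongoDB"],
        "types": ["works with", "works at"],
    },
}


def edges_of(doc: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Flatten the nested relationships into (source, type, target) triples."""
    rel = doc.get("relationships", {})
    targets = rel.get("target_ids", [])
    types = rel.get("types", [])
    return [(doc["_id"], rel_type, target) for target, rel_type in zip(targets, types)]


for edge in edges_of(entity):
    print(edge)
```

Keeping the edges inside the node document means a single read returns an entity together with everything it points to.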

When a query is made, the model extracts the entities in it, then traverses the graph to find connections. The closest entities and their relationships form the context that is included with the query to the Chat Model.
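
The traversal idea can be sketched in plain Python using the John Doe example discussed in this section. This is an in-memory illustration, not the library's implementation: the `facts` list, the undirected adjacency map, and the `find_path` helper are all assumptions introduced here.

```python
from collections import deque
from typing import Dict, List, Optional, Set

# Facts extracted from two separate sources, as (source, relationship, target).
facts = [
    ("Jane Smith", "works with", "John Doe"),  # from one source
    ("Jane Smith", "works at", "MongoDB"),     # from another source
]

# Build an undirected adjacency map so a hop can follow an edge either way.
graph: Dict[str, Set[str]] = {}
for source, _rel, target in facts:
    graph.setdefault(source, set()).add(target)
    graph.setdefault(target, set()).add(source)


def find_path(start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns the shortest chain of entities, if any."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph.get(path[-1], ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


print(find_path("John Doe", "MongoDB"))
```

The path found here is only context. The Chat Model still has to judge what the chain of relationships implies for the question that was asked.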

Consider this example Query: "Does John Doe work at MongoDB?" GraphRAG can answer this question even if the following two statements come from completely different sources.

"Jane Smith works with John Doe."
"Jane Smith works at MongoDB."

MongoDBGraphStore(
  self,
  *,
  connection_string: Optional[str] = None,
  database_name: Optional[str] = None,
  collection_name: Optional[str] = None,
  collection: Optional[Collection] = None,
  entity_extraction_model: BaseChatModel,
  entity_prompt: Optional[ChatPromptTemplate] = None,
  query_prompt: Optional[ChatPromptTemplate] = None,
  max_depth: int = 3,
  allowed_entity_types: Optional[List[str]] = None,
  allowed_relationship_types: Optional[List[str]] = None,
  entity_examples: Optional[str] = None,
  entity_name_examples: str = '',
  validate: bool = False,
  validation_action: str = 'warn'
)

Parameters

Name	Type	Description
`connection_string`	`Optional[str]`	Default:`None` A valid MongoDB connection URI.
`database_name`	`Optional[str]`	Default:`None` The name of the database to connect to.
`collection_name`	`Optional[str]`	Default:`None` The name of the collection to connect to.
`collection`	`Optional[Collection]`	Default:`None` A Collection that will represent a Knowledge Graph. One may pass a Collection in lieu of connection_string, database_name, and collection_name.
`entity_extraction_model`*	`BaseChatModel`	LLM for converting documents into a Graph of Entities and Relationships.
`entity_prompt`	`Optional[ChatPromptTemplate]`	Default:`None` Prompt used to fill the graph store with entities following the schema. Defaults to .prompts.ENTITY_EXTRACTION_INSTRUCTIONS
`query_prompt`	`Optional[ChatPromptTemplate]`	Default:`None` Prompt that extracts entities and relationships as search starting points. Defaults to .prompts.NAME_EXTRACTION_INSTRUCTIONS
`max_depth`	`int`	Default:`3` Maximum recursion depth in graph traversal.
`allowed_entity_types`	`Optional[List[str]]`	Default:`None` If provided, constrains search to these types.
`allowed_relationship_types`	`Optional[List[str]]`	Default:`None` If provided, constrains search to these types.
`entity_examples`	`Optional[str]`	Default:`None` A string containing any number of additional examples to provide as context for entity extraction.
`entity_name_examples`	`str`	Default:`''` A string appended to prompts.NAME_EXTRACTION_INSTRUCTIONS containing examples.
`validate`	`bool`	Default:`False` If True, entity schema will be validated on every insert or update.
`validation_action`	`str`	Default:`'warn'` One of {"warn", "error"}. If "warn", the default, documents will be inserted but errors logged. If "error", an exception will be raised if any document does not match the schema.
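
The difference between the two `validation_action` values can be illustrated with a small client-side imitation. This is only a sketch of the described behaviour, not how the store performs validation: `check_entities` and `REQUIRED_FIELDS` are hypothetical, and the required-field check is a stand-in for the real entity schema.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger("graphstore_sketch")

REQUIRED_FIELDS = ("_id", "type")  # assumed stand-in for the real entity schema


def check_entities(
    entities: List[Dict[str, Any]], validation_action: str = "warn"
) -> List[Dict[str, Any]]:
    """Return the entities that would be written under the given policy."""
    if validation_action not in {"warn", "error"}:
        raise ValueError("validation_action must be 'warn' or 'error'")
    accepted = []
    for entity in entities:
        missing = [field for field in REQUIRED_FIELDS if field not in entity]
        if missing and validation_action == "error":
            # "error": reject the whole operation on the first bad document
            raise ValueError(f"entity {entity!r} is missing {missing}")
        if missing:
            # "warn": log the problem but keep the document
            logger.warning("entity %r is missing %s", entity, missing)
        accepted.append(entity)
    return accepted


stored = check_entities([{"_id": "Jane Smith"}], validation_action="warn")
print(len(stored))
```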

Constructors

constructor

__init__

Name	Type
connection_string	Optional[str]
database_name	Optional[str]
collection_name	Optional[str]
collection	Optional[Collection]
entity_extraction_model	BaseChatModel
entity_prompt	Optional[ChatPromptTemplate]
query_prompt	Optional[ChatPromptTemplate]
max_depth	int
allowed_entity_types	Optional[List[str]]
allowed_relationship_types	Optional[List[str]]
entity_examples	Optional[str]
entity_name_examples	str
validate	bool
validation_action	str

Attributes

attribute

collection: collection

attribute

entity_extraction_model: entity_extraction_model

entity_name_examples: entity_name_examples

attribute

max_depth: max_depth

attribute

allowed_entity_types: allowed_entity_types

attribute

allowed_relationship_types: allowed_relationship_types

attribute

entity_schema: dict[str, Any]

JSON Schema Object of Entities. Will be applied if validate is True.

Methods

method

from_connection_string

Construct a MongoDB Knowledge Graph for RAG from a MongoDB connection URI.

method

Close the resources used by the MongoDBGraphStore.

method

add_documents

Extract entities and upsert into the collection.

Each entity is represented by a single MongoDB Document. Existing entities identified in documents will be updated.
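
The upsert behaviour can be pictured with an in-memory stand-in for the collection. This is a sketch under stated assumptions, not the library's code: the `collection` dict, the `upsert_entity` helper, the parallel-list layout of `relationships`, and the merge policy (append new edges, skip duplicates) are all hypothetical.

```python
from typing import Any, Dict

# In-memory stand-in for a MongoDB collection, keyed by entity _id.
collection: Dict[str, Dict[str, Any]] = {}


def upsert_entity(entity: Dict[str, Any]) -> None:
    """Insert a new entity, or merge its edges into the existing document."""
    incoming = entity.get("relationships", {})
    pairs = zip(incoming.get("target_ids", []), incoming.get("types", []))
    doc = collection.setdefault(
        entity["_id"],
        {
            "_id": entity["_id"],
            "type": entity["type"],
            "relationships": {"target_ids": [], "types": []},
        },
    )
    rel = doc["relationships"]
    known = set(zip(rel["target_ids"], rel["types"]))
    for target, rel_type in pairs:
        if (target, rel_type) not in known:  # assumed policy: no duplicate edges
            rel["target_ids"].append(target)
            rel["types"].append(rel_type)
            known.add((target, rel_type))


# The same entity is mentioned in two different documents.
upsert_entity({"_id": "Jane Smith", "type": "Person",
               "relationships": {"target_ids": ["John Doe"], "types": ["works with"]}})
upsert_entity({"_id": "Jane Smith", "type": "Person",
               "relationships": {"target_ids": ["MongoDB"], "types": ["works at"]}})
print(collection["Jane Smith"]["relationships"])
```

Because both mentions land in one document, facts from separate sources end up attached to the same node.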

method

extract_entities

Extract entities and their relations using chosen prompt and LLM.

method

extract_entity_names

Extract entity names from a document for similarity_search.

The second entity extraction has a different form and purpose than the first as we are looking for starting points of our search and paths to follow. We aim to find source nodes, but no target nodes or edges.

method

find_entity_by_name

Utility to get Entity dict from Knowledge Graph / Collection.

Args:
  name: _id string to look for.

Returns:
  List of Entity dicts if any match name.

method

related_entities

Traverse Graph along relationship edges to find connected entities.

method

similarity_search

Retrieve list of connected Entities found via traversal of KnowledgeGraph.

1. Use LLM & Prompt to find entities within the input_document itself.
2. Find Entity Nodes that match those found in the input_document.
3. Traverse the graph using these as starting points.
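
The three steps can be sketched end to end in plain Python. This is an illustration only: the `GRAPH` dict replaces the collection, `extract_names` replaces the LLM call with naive substring matching, and `traverse` treats `max_depth` as a number of hops, which may differ from the exact depth semantics of the library.

```python
from typing import Dict, List, Set

# In-memory stand-in for the entity collection: name -> names it links to.
GRAPH: Dict[str, List[str]] = {
    "John Doe": ["Jane Smith"],
    "Jane Smith": ["John Doe", "MongoDB"],
    "MongoDB": [],
}


def extract_names(text: str) -> List[str]:
    """Step 1 stand-in: naive substring match instead of an LLM call."""
    lowered = text.lower()
    return [name for name in GRAPH if name.lower() in lowered]


def traverse(starts: List[str], max_depth: int) -> Set[str]:
    """Steps 2 and 3: keep names that exist as nodes, then expand hop by hop."""
    frontier = {name for name in starts if name in GRAPH}
    found = set(frontier)
    for _ in range(max_depth):
        frontier = {n for node in frontier for n in GRAPH.get(node, [])} - found
        if not frontier:
            break
        found |= frontier
    return found


names = extract_names("Does John Doe work at MongoDB?")
print(sorted(traverse(names, max_depth=1)))
```

A larger `max_depth` pulls in more distant entities, which gives the Chat Model more context at the cost of more loosely related material.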

method

chat_response

Responds to a query given information found in Knowledge Graph.

method

to_networkx

Utility that converts the Entity Collection to a NetworkX DiGraph (https://networkx.org/documentation/stable/index.html).

NOTE: Requires optional-dependency "viz", i.e. pip install "langchain-mongodb[viz]".

method

view

Draws a Knowledge Graph as Holoviews/Bokeh interactive plot.

We first convert the entity collection to a NetworkX Graph, and then convert it to a Holoviews Graph via their API.

The default layout is spring_layout, which maximizes the distance between nodes. As our entities have a type field, however, another good choice might be layout=nx.multipartite_layout with nx_opts["subset_key"] = "type", since the multipartite layout positions nodes in straight lines by subset key.

NOTE: Requires optional-dependency "viz", i.e. pip install "langchain-mongodb[viz]".

As with any HoloViews object, you can save the view with .save; the output type is inferred from the filename's suffix (e.g., hv.save(graph, "graph.html")). Alternatively, click the download widget on the Bokeh plot in a Jupyter notebook.
