langchain-neo4j

Reference docs

This page contains reference documentation for Neo4j. See the docs for conceptual guides, tutorials, and examples on using Neo4j modules.

langchain_neo4j

GraphCypherQAChain

Bases: Chain

Chain for question-answering against a graph by generating Cypher statements.

Security note

Make sure that the database connection uses credentials that are narrowly scoped to include only the necessary permissions. Failure to do so may result in data corruption or loss, since the calling code may, if appropriately prompted, attempt commands that delete or mutate data, or that read sensitive data if such data is present in the database.

The best way to guard against such negative outcomes is to (as appropriate) limit the permissions granted to the credentials used with this tool.

See https://docs.langchain.com/oss/python/security-policy for more information.

METHOD DESCRIPTION
__init__

Initialize the chain.

from_llm

Initialize from LLM.

top_k class-attribute instance-attribute

top_k: int = 10

Number of results to return from the query

return_intermediate_steps class-attribute instance-attribute

return_intermediate_steps: bool = False

Whether or not to return the intermediate steps along with the final answer.

return_direct class-attribute instance-attribute

return_direct: bool = False

Whether or not to return the result of querying the graph directly.

cypher_query_corrector class-attribute instance-attribute

cypher_query_corrector: CypherQueryCorrector | None = None

Optional Cypher validation tool.

use_function_response class-attribute instance-attribute

use_function_response: bool = False

Whether to wrap the database context as a tool/function response.

allow_dangerous_requests class-attribute instance-attribute

allow_dangerous_requests: bool = False

Forced user opt-in to acknowledge that the chain can make dangerous requests.

Security note

Make sure that the database connection uses credentials that are narrowly scoped to include only the necessary permissions. Failure to do so may result in data corruption or loss, since the calling code may, if appropriately prompted, attempt commands that delete or mutate data, or that read sensitive data if such data is present in the database.

The best way to guard against such negative outcomes is to (as appropriate) limit the permissions granted to the credentials used with this tool.

See https://docs.langchain.com/oss/python/security-policy for more information.

input_keys property

input_keys: list[str]

Return the input keys.

output_keys property

output_keys: list[str]

Return the output keys.

__init__

__init__(**kwargs: Any) -> None

Initialize the chain.

from_llm classmethod

from_llm(
    llm: BaseLanguageModel | None = None,
    *,
    qa_prompt: BasePromptTemplate | None = None,
    cypher_prompt: BasePromptTemplate | None = None,
    cypher_llm: BaseLanguageModel | None = None,
    qa_llm: BaseLanguageModel | None = None,
    exclude_types: list[str] = [],
    include_types: list[str] = [],
    validate_cypher: bool = False,
    qa_llm_kwargs: dict[str, Any] | None = None,
    cypher_llm_kwargs: dict[str, Any] | None = None,
    use_function_response: bool = False,
    function_response_system: str = FUNCTION_RESPONSE_SYSTEM,
    **kwargs: Any,
) -> GraphCypherQAChain

Initialize from LLM.
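
A minimal usage sketch of from_llm, assuming a reachable Neo4j instance and an installed langchain_openai package. The build_chain helper name, the ChatOpenAI model choice, and the connection values are illustrative assumptions, not part of this reference. Imports are deferred into the function so the snippet loads even where those packages are absent.

```python
from typing import Any


def build_chain(url: str, username: str, password: str) -> Any:
    """Build a GraphCypherQAChain (hypothetical helper, not a library function)."""
    # Deferred imports: both packages are assumed to be installed.
    from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
    from langchain_openai import ChatOpenAI

    # Use credentials scoped to the minimum permissions the chain needs.
    graph = Neo4jGraph(url=url, username=username, password=password)
    return GraphCypherQAChain.from_llm(
        ChatOpenAI(temperature=0),
        graph=graph,
        validate_cypher=True,
        return_intermediate_steps=True,
        top_k=5,
        allow_dangerous_requests=True,  # explicit opt-in described above
    )


# Usage (needs a live database and an API key, so it is left commented out):
# chain = build_chain("bolt://localhost:7687", "neo4j", "password")
# result = chain.invoke({"query": "Which actors appear in more than one movie?"})
```

Pairing allow_dangerous_requests=True with read-only database credentials keeps the opt-in from granting more than the application needs.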

Neo4jChatMessageHistory

Bases: BaseChatMessageHistory

Chat message history stored in a Neo4j database.

METHOD DESCRIPTION
aget_messages

Async version of getting messages.

add_user_message

Convenience method for adding a human message string to the store.

add_ai_message

Convenience method for adding an AIMessage string to the store.

add_messages

Add a list of messages.

aadd_messages

Async add a list of messages.

aclear

Async remove all messages from the store.

__str__

Return a string representation of the chat history.

add_message

Append the message to the record in Neo4j

clear

Clear session memory from Neo4j

messages property writable

messages: list[BaseMessage]

Retrieve the messages from Neo4j

aget_messages async

aget_messages() -> list[BaseMessage]

Async version of getting messages.

Subclasses can override this method to provide an efficient async implementation.

In general, fetching messages may involve IO to the underlying persistence layer.

RETURNS DESCRIPTION
list[BaseMessage]

The messages.

add_user_message

add_user_message(message: HumanMessage | str) -> None

Convenience method for adding a human message string to the store.

Note

This is a convenience method. Code should favor the bulk add_messages interface instead to save on round-trips to the persistence layer.

This method may be deprecated in a future release.

PARAMETER DESCRIPTION
message

The HumanMessage to add to the store.

TYPE: HumanMessage | str

add_ai_message

add_ai_message(message: AIMessage | str) -> None

Convenience method for adding an AIMessage string to the store.

Note

This is a convenience method. Code should favor the bulk add_messages interface instead to save on round-trips to the persistence layer.

This method may be deprecated in a future release.

PARAMETER DESCRIPTION
message

The AIMessage to add.

TYPE: AIMessage | str

add_messages

add_messages(messages: Sequence[BaseMessage]) -> None

Add a list of messages.

Implementations should override this method to handle bulk addition of messages efficiently and avoid unnecessary round-trips to the underlying store.

PARAMETER DESCRIPTION
messages

A sequence of BaseMessage objects to store.

TYPE: Sequence[BaseMessage]

aadd_messages async

aadd_messages(messages: Sequence[BaseMessage]) -> None

Async add a list of messages.

PARAMETER DESCRIPTION
messages

A sequence of BaseMessage objects to store.

TYPE: Sequence[BaseMessage]

aclear async

aclear() -> None

Async remove all messages from the store.

__str__

__str__() -> str

Return a string representation of the chat history.

add_message

add_message(message: BaseMessage) -> None

Append the message to the record in Neo4j

clear

clear(delete_session_node: bool = False) -> None

Clear session memory from Neo4j

PARAMETER DESCRIPTION
delete_session_node

Whether to delete the session node.

TYPE: bool DEFAULT: False

Neo4jGraph

Bases: GraphStore

Neo4j database wrapper for various graph operations.

Security note

Make sure that the database connection uses credentials that are narrowly scoped to include only the necessary permissions. Failure to do so may result in data corruption or loss, since the calling code may, if appropriately prompted, attempt commands that delete or mutate data, or that read sensitive data if such data is present in the database.

The best way to guard against such negative outcomes is to (as appropriate) limit the permissions granted to the credentials used with this tool.

See https://docs.langchain.com/oss/python/security-policy for more information.

METHOD DESCRIPTION
__init__

Create a new Neo4j graph wrapper instance.

query

Query Neo4j database.

refresh_schema

Refreshes the Neo4j graph schema information.

add_graph_documents

This method constructs nodes and relationships in the graph based on the provided GraphDocument objects.

close

Explicitly close the Neo4j driver connection.

__enter__

Enter the runtime context for the Neo4j graph connection.

__exit__

Exit the runtime context for the Neo4j graph connection.

__del__

Destructor for the Neo4j graph connection.

get_schema property

get_schema: str

Returns the schema of the Graph

get_structured_schema property

get_structured_schema: dict[str, Any]

Returns the structured schema of the Graph

__init__

__init__(
    url: str | None = None,
    username: str | None = None,
    password: str | None = None,
    database: str | None = None,
    timeout: float | None = None,
    sanitize: bool = False,
    refresh_schema: bool = True,
    *,
    driver_config: dict | None = None,
    enhanced_schema: bool = False,
) -> None

Create a new Neo4j graph wrapper instance.

PARAMETER DESCRIPTION
url

The URL of the Neo4j database server.

TYPE: str | None DEFAULT: None

username

The username for database authentication.

TYPE: str | None DEFAULT: None

password

The password for database authentication.

TYPE: str | None DEFAULT: None

database

The name of the database to connect to. Default is 'neo4j'.

TYPE: str | None DEFAULT: None

timeout

The timeout for transactions in seconds. Useful for terminating long-running queries.

Note

By default, there is no timeout set.

TYPE: float | None DEFAULT: None

sanitize

A flag to indicate whether to remove lists with more than 128 elements from results. Useful for removing embedding-like properties from database responses.

TYPE: bool DEFAULT: False

refresh_schema

A flag indicating whether to refresh schema information at initialization.

TYPE: bool DEFAULT: True

driver_config

Configuration passed to Neo4j Driver.

TYPE: dict | None DEFAULT: None

enhanced_schema

A flag indicating whether to scan the database for example values and use them in the graph schema.

TYPE: bool DEFAULT: False
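
The effect of the sanitize flag can be sketched in plain Python. This is an illustration of the behavior described above (dropping lists with more than 128 elements), not the library's implementation; the names sanitize_row, _clean, and _DROP are hypothetical.

```python
from typing import Any

LIST_LIMIT = 128  # threshold quoted in the sanitize description
_DROP = object()  # sentinel marking a value that should be removed


def _clean(value: Any) -> Any:
    """Recursively drop oversized lists from dicts and lists."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            kept = _clean(item)
            if kept is not _DROP:
                cleaned[key] = kept
        return cleaned
    if isinstance(value, list):
        if len(value) > LIST_LIMIT:
            return _DROP
        return [kept for kept in (_clean(item) for item in value) if kept is not _DROP]
    return value


def sanitize_row(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of one result row without embedding-like properties."""
    return _clean(row)


row = {"name": "a", "embedding": [0.0] * 1536, "tags": ["x", "y"]}
print(sanitize_row(row))  # the "embedding" key is removed
```

Removing such properties keeps query results small when they are later passed to a language model as context.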

query

query(query: str, params: dict = {}, session_params: dict = {}) -> list[dict[str, Any]]

Query Neo4j database.

PARAMETER DESCRIPTION
query

The Cypher query to execute.

TYPE: str

params

The parameters to pass to the query.

TYPE: dict DEFAULT: {}

session_params

Parameters to pass to the session used for executing the query.

TYPE: dict DEFAULT: {}

RETURNS DESCRIPTION
list[dict[str, Any]]

The list of dictionaries containing the query results.

RAISES DESCRIPTION
RuntimeError

If the connection has been closed.

refresh_schema

refresh_schema() -> None

Refreshes the Neo4j graph schema information.

RAISES DESCRIPTION
RuntimeError

If the connection has been closed.

add_graph_documents

add_graph_documents(
    graph_documents: list[GraphDocument],
    include_source: bool = False,
    baseEntityLabel: bool = False,
) -> None

This method constructs nodes and relationships in the graph based on the provided GraphDocument objects.

PARAMETER DESCRIPTION
graph_documents

A list of GraphDocument objects that contain the nodes and relationships to be added to the graph. Each GraphDocument should encapsulate the structure of part of the graph, including nodes, relationships, and optionally the source document information.

TYPE: list[GraphDocument]

include_source

If True, stores the source document and links it to nodes in the graph using the MENTIONS relationship. This is useful for tracing back the origin of data. Source documents are merged on the id property from the source document metadata if available; otherwise, the MD5 hash of page_content is used as the merge key.

TYPE: bool DEFAULT: False

baseEntityLabel

If True, each newly created node gets a secondary __Entity__ label, which is indexed and improves import speed and performance.

TYPE: bool DEFAULT: False

RAISES DESCRIPTION
RuntimeError

If the connection has been closed.
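
The merge-key rule described for include_source can be sketched as follows. This is an assumption-laden illustration: source_merge_id is a hypothetical helper, and the exact string conversion and encoding used by the library are not stated in this reference.

```python
import hashlib
from typing import Any


def source_merge_id(page_content: str, metadata: dict[str, Any]) -> str:
    """Pick the key used to merge a source document (illustrative only)."""
    # Prefer an explicit id from the document metadata when present.
    if metadata.get("id") is not None:
        return str(metadata["id"])
    # Otherwise fall back to the MD5 hex digest of the content (UTF-8 assumed).
    return hashlib.md5(page_content.encode("utf-8")).hexdigest()


print(source_merge_id("hello", {}))
print(source_merge_id("hello", {"id": "doc-1"}))
```

One consequence of the fallback is that two source documents with identical page_content and no id would be merged into a single node.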

close

close() -> None

Explicitly close the Neo4j driver connection.

Delegates connection management to the Neo4j driver.

__enter__

__enter__() -> Neo4jGraph

Enter the runtime context for the Neo4j graph connection.

Enables use of the graph connection with the 'with' statement. This method allows for automatic resource management and ensures that the connection is properly handled.

RETURNS DESCRIPTION
Neo4jGraph

The current graph connection instance

TYPE: Neo4jGraph

Example
with Neo4jGraph(...) as graph:
    graph.query(...)  # Connection automatically managed

__exit__

__exit__(
    exc_type: Type[BaseException] | None,
    exc_val: BaseException | None,
    exc_tb: Any | None,
) -> None

Exit the runtime context for the Neo4j graph connection.

This method is automatically called when exiting a 'with' statement. It ensures that the database connection is closed, regardless of whether an exception occurred during the context's execution.

PARAMETER DESCRIPTION
exc_type

The type of exception that caused the context to exit

TYPE: Type[BaseException] | None

exc_val

The exception instance that caused the context to exit

TYPE: BaseException | None

exc_tb

The traceback for the exception

TYPE: Any | None

Info

Any exception is re-raised after the connection is closed.

__del__

__del__() -> None

Destructor for the Neo4j graph connection.

This method is called during garbage collection to ensure that database resources are released if not explicitly closed.

Danger

  • Do not rely on this method for deterministic resource cleanup
  • Always prefer explicit .close() or context manager

Best practices

  1. Use context manager:
    with Neo4jGraph(...) as graph:
        ...
    
  2. Explicitly close:
    graph = Neo4jGraph(...)
    try:
        ...
    finally:
        graph.close()
    

Neo4jVector

Bases: VectorStore

Neo4j vector index.

To use, you should have the neo4j python package installed.

PARAMETER DESCRIPTION
url

Neo4j connection url

TYPE: str | None DEFAULT: None

username

Neo4j username.

TYPE: str | None DEFAULT: None

password

Neo4j password

TYPE: str | None DEFAULT: None

database

Optional name of the Neo4j database. Defaults to 'neo4j'.

TYPE: str | None DEFAULT: None

embedding

Any embedding function implementing the langchain_core.embeddings.Embeddings interface.

TYPE: Embeddings

distance_strategy

The distance strategy to use. (default: COSINE)

TYPE: DistanceStrategy DEFAULT: DEFAULT_DISTANCE_STRATEGY

search_type

The type of search to be performed, either 'vector' or 'hybrid'

TYPE: SearchType DEFAULT: VECTOR

node_label

The label used for nodes in the Neo4j database.

TYPE: str DEFAULT: 'Chunk'

embedding_node_property

The property name in Neo4j to store embeddings.

TYPE: str DEFAULT: 'embedding'

text_node_property

The property name in Neo4j to store the text.

TYPE: str DEFAULT: 'text'

retrieval_query

The Cypher query to be used for customizing retrieval. If empty, a default query will be used.

TYPE: str DEFAULT: ''

index_type

The type of index to be used, either 'NODE' or 'RELATIONSHIP'

TYPE: EntityType DEFAULT: DEFAULT_INDEX_TYPE

pre_delete_collection

If True, will delete existing data if it exists. Useful for testing.

TYPE: bool DEFAULT: False

embedding_dimension

The dimension of the embeddings. If not provided, will query the embedding model to calculate the dimension.

TYPE: int | None DEFAULT: None

Example
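from langchain_core.documents import Document
from langchain_neo4j import Neo4jVector
from langchain_openai import OpenAIEmbeddings

# Connection details for a local Neo4j instance (placeholders).
url = "bolt://localhost:7687"
username = "neo4j"
password = "password"

# Sample input; replace with your own documents.
docs = [Document(page_content="Neo4j is a graph database.")]

embeddings = OpenAIEmbeddings()
# Embed the documents and store them in a Neo4j vector index.
vectorstore = Neo4jVector.from_documents(
    embedding=embeddings,
    documents=docs,
    url=url,
    username=username,
    password=password,
)
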
METHOD DESCRIPTION
delete

Delete by vector ID or other criteria.

get_by_ids

Get documents by their IDs.

aget_by_ids

Async get documents by their IDs.

adelete

Async delete by vector ID or other criteria.

aadd_texts

Async run more texts through the embeddings and add to the VectorStore.

add_documents

Add or update documents in the VectorStore.

aadd_documents

Async run more documents through the embeddings and add to the VectorStore.

search

Return docs most similar to query using a specified search type.

asearch

Async return docs most similar to query using a specified search type.

asimilarity_search_with_score

Async run similarity search with distance.

similarity_search_with_relevance_scores

Return docs and relevance scores in the range [0, 1].

asimilarity_search_with_relevance_scores

Async return docs and relevance scores in the range [0, 1].

asimilarity_search

Async return docs most similar to query.

asimilarity_search_by_vector

Async return docs most similar to embedding vector.

amax_marginal_relevance_search

Async return docs selected using the maximal marginal relevance.

max_marginal_relevance_search_by_vector

Return docs selected using the maximal marginal relevance.

amax_marginal_relevance_search_by_vector

Async return docs selected using the maximal marginal relevance.

afrom_documents

Async return VectorStore initialized from documents and embeddings.

afrom_texts

Async return VectorStore initialized from texts and embeddings.

as_retriever

Return VectorStoreRetriever initialized from this VectorStore.

query

Query Neo4j database with retries and exponential backoff.

verify_version

Check if the connected Neo4j database version supports vector indexing.

retrieve_existing_index

Check if the vector index exists in the Neo4j database and return its embedding dimension.

retrieve_existing_fts_index

Check if the fulltext index exists in the Neo4j database

create_new_index

This method constructs a Cypher query and executes it to create a new vector index in Neo4j.

create_new_keyword_index

This method constructs a Cypher query and executes it to create a new full text index in Neo4j.

add_embeddings

Add embeddings to the VectorStore.

add_texts

Run more texts through the embeddings and add to the VectorStore.

similarity_search

Run similarity search with Neo4jVector.

similarity_search_with_score

Return docs most similar to query.

similarity_search_with_score_by_vector

Perform a similarity search in the Neo4j database using a

similarity_search_by_vector

Return docs most similar to embedding vector.

from_texts

Return Neo4jVector initialized from texts and embeddings.

from_embeddings

Construct Neo4jVector wrapper from raw documents and pre-

from_existing_index

Get instance of an existing Neo4j vector index.

from_existing_relationship_index

Get instance of an existing Neo4j relationship vector index.

from_documents

Return Neo4jVector initialized from documents and embeddings.

from_existing_graph

Initialize and return a Neo4jVector instance from an existing graph.

max_marginal_relevance_search

Return docs selected using the maximal marginal relevance.

embeddings property

embeddings: Embeddings

Access the query embedding object if available.

delete

delete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Delete by vector ID or other criteria.

PARAMETER DESCRIPTION
ids

List of IDs to delete. If None, delete all.

TYPE: list[str] | None DEFAULT: None

**kwargs

Other keyword arguments that subclasses might use.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
bool | None

True if deletion is successful, False otherwise, None if not implemented.

get_by_ids

get_by_ids(ids: Sequence[str]) -> list[Document]

Get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER DESCRIPTION
ids

List of IDs to retrieve.

TYPE: Sequence[str]

RETURNS DESCRIPTION
list[Document]

List of Document objects.
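
Because the returned order is not guaranteed and missing IDs are silently skipped, callers may want to realign results themselves. A minimal sketch, using a stand-in Doc dataclass (hypothetical, so the snippet runs without the library) and a hypothetical align_to_request helper:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Doc:
    """Stand-in for a Document with an ID field (illustration only)."""

    id: Optional[str]
    page_content: str


def align_to_request(
    requested: Sequence[str], returned: Sequence[Doc]
) -> list[Optional[Doc]]:
    """Reorder results to match the requested IDs; None marks a missing ID."""
    by_id = {doc.id: doc for doc in returned}
    return [by_id.get(doc_id) for doc_id in requested]


# Results arrive in a different order and one ID was not found.
returned = [Doc("b", "second"), Doc("a", "first")]
print(align_to_request(["a", "missing", "b"], returned))
```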

aget_by_ids async

aget_by_ids(ids: Sequence[str]) -> list[Document]

Async get documents by their IDs.

The returned documents are expected to have the ID field set to the ID of the document in the vector store.

Fewer documents may be returned than requested if some IDs are not found or if there are duplicated IDs.

Users should not assume that the order of the returned documents matches the order of the input IDs. Instead, users should rely on the ID field of the returned documents.

This method should NOT raise exceptions if no documents are found for some IDs.

PARAMETER DESCRIPTION
ids

List of IDs to retrieve.

TYPE: Sequence[str]

RETURNS DESCRIPTION
list[Document]

List of Document objects.

adelete async

adelete(ids: list[str] | None = None, **kwargs: Any) -> bool | None

Async delete by vector ID or other criteria.

PARAMETER DESCRIPTION
ids

List of IDs to delete. If None, delete all.

TYPE: list[str] | None DEFAULT: None

**kwargs

Other keyword arguments that subclasses might use.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
bool | None

True if deletion is successful, False otherwise, None if not implemented.

aadd_texts async

aadd_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Async run more texts through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
texts

Iterable of strings to add to the VectorStore.

TYPE: Iterable[str]

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

VectorStore specific parameters.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs from adding the texts into the VectorStore.

RAISES DESCRIPTION
ValueError

If the number of metadatas does not match the number of texts.

ValueError

If the number of IDs does not match the number of texts.
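
The two ValueError conditions above amount to a length check performed before anything is written. A sketch of such a check, where check_lengths is a hypothetical helper and not a library function:

```python
from typing import Iterable, Optional


def check_lengths(
    texts: Iterable[str],
    metadatas: Optional[list[dict]] = None,
    ids: Optional[list[str]] = None,
) -> list[str]:
    """Validate that metadatas and ids, when given, match the number of texts."""
    materialized = list(texts)  # an iterable may be single-pass, so copy it once
    if metadatas is not None and len(metadatas) != len(materialized):
        raise ValueError(
            f"Got {len(metadatas)} metadatas for {len(materialized)} texts."
        )
    if ids is not None and len(ids) != len(materialized):
        raise ValueError(f"Got {len(ids)} ids for {len(materialized)} texts.")
    return materialized


print(check_lengths(["a", "b"], metadatas=[{}, {}], ids=["1", "2"]))
```

Running the check in calling code surfaces a mismatch before any embedding requests are made.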

add_documents

add_documents(documents: list[Document], **kwargs: Any) -> list[str]

Add or update documents in the VectorStore.

PARAMETER DESCRIPTION
documents

Documents to add to the VectorStore.

TYPE: list[Document]

**kwargs

Additional keyword arguments.

If kwargs contains IDs and the documents also contain IDs, the IDs in kwargs take precedence.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs of the added texts.

aadd_documents async

aadd_documents(documents: list[Document], **kwargs: Any) -> list[str]

Async run more documents through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
documents

Documents to add to the VectorStore.

TYPE: list[Document]

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs of the added texts.

search

search(query: str, search_type: str, **kwargs: Any) -> list[Document]

Return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform. Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.

asearch async

asearch(query: str, search_type: str, **kwargs: Any) -> list[Document]

Async return docs most similar to query using a specified search type.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

search_type

Type of search to perform. Can be 'similarity', 'mmr', or 'similarity_score_threshold'.

TYPE: str

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

RAISES DESCRIPTION
ValueError

If search_type is not one of 'similarity', 'mmr', or 'similarity_score_threshold'.

asimilarity_search_with_score async

asimilarity_search_with_score(
    *args: Any, **kwargs: Any
) -> list[tuple[Document, float]]

Async run similarity search with distance.

PARAMETER DESCRIPTION
*args

Arguments to pass to the search method.

TYPE: Any DEFAULT: ()

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

similarity_search_with_relevance_scores

similarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Keyword arguments to be passed to the similarity search. May include score_threshold, an optional floating point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).
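
The role of score_threshold can be sketched as a post-filter over (doc, score) pairs. The filter_by_threshold helper below is hypothetical and assumes that a score equal to the threshold is kept:

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def filter_by_threshold(
    scored: list[tuple[T, float]], score_threshold: Optional[float] = None
) -> list[tuple[T, float]]:
    """Keep only pairs whose relevance score reaches the threshold."""
    if score_threshold is None:
        return list(scored)
    return [(doc, score) for doc, score in scored if score >= score_threshold]


results = [("doc-a", 0.91), ("doc-b", 0.80), ("doc-c", 0.42)]
print(filter_by_threshold(results, score_threshold=0.8))
```

Since filtering happens after the top k results are chosen, fewer than k documents may come back when the threshold is strict.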

asimilarity_search_with_relevance_scores async

asimilarity_search_with_relevance_scores(
    query: str, k: int = 4, **kwargs: Any
) -> list[tuple[Document, float]]

Async return docs and relevance scores in the range [0, 1].

0 is dissimilar, 1 is most similar.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Keyword arguments to be passed to the similarity search. May include score_threshold, an optional floating point value between 0 and 1 used to filter the resulting set of retrieved docs.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of tuples of (doc, similarity_score).

asimilarity_search async

asimilarity_search(query: str, k: int = 4, **kwargs: Any) -> list[Document]

Async return docs most similar to query.

PARAMETER DESCRIPTION
query

Input text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.

asimilarity_search_by_vector async

asimilarity_search_by_vector(
    embedding: list[float], k: int = 4, **kwargs: Any
) -> list[Document]

Async return docs most similar to embedding vector.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query vector.

amax_marginal_relevance_search async

amax_marginal_relevance_search(
    query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, **kwargs: Any
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
query

Text to look up documents similar to.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.
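
How k and lambda_mult interact can be illustrated with a small pure-Python version of maximal marginal relevance. This is a sketch of the general technique using cosine similarity, not the library's implementation; the function names are hypothetical, and the candidates list plays the role of the fetch_k pool.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Return indices of up to k candidates chosen by maximal marginal relevance."""
    if k <= 0 or not candidates:
        return []
    query_sims = [cosine(query, cand) for cand in candidates]
    # Start with the candidate most similar to the query.
    selected = [max(range(len(candidates)), key=query_sims.__getitem__)]
    while len(selected) < min(k, len(candidates)):
        best_idx, best_score = -1, -math.inf
        for i, cand in enumerate(candidates):
            if i in selected:
                continue
            # Redundancy: similarity to the closest already-selected candidate.
            redundancy = max(cosine(cand, candidates[j]) for j in selected)
            score = lambda_mult * query_sims[i] - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return selected


pool = [[1.0, 0.0], [1.0, 0.1], [0.6, 0.8]]
print(mmr([1.0, 0.0], pool, k=2, lambda_mult=1.0))  # favors similarity to the query
print(mmr([1.0, 0.0], pool, k=2, lambda_mult=0.0))  # favors diversity
```

With lambda_mult=1.0 the second pick is the near-duplicate of the first; with lambda_mult=0.0 it is the candidate least similar to what was already selected.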

max_marginal_relevance_search_by_vector

max_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.

amax_marginal_relevance_search_by_vector async

amax_marginal_relevance_search_by_vector(
    embedding: list[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    **kwargs: Any,
) -> list[Document]

Async return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

**kwargs

Arguments to pass to the search method.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.

afrom_documents async classmethod

afrom_documents(
    documents: list[Document], embedding: Embeddings, **kwargs: Any
) -> Self

Async return VectorStore initialized from documents and embeddings.

PARAMETER DESCRIPTION
documents

List of Document objects to add to the VectorStore.

TYPE: list[Document]

embedding

Embedding function to use.

TYPE: Embeddings

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from documents and embeddings.

afrom_texts async classmethod

afrom_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    *,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Self

Async return VectorStore initialized from texts and embeddings.

PARAMETER DESCRIPTION
texts

Texts to add to the VectorStore.

TYPE: list[str]

embedding

Embedding function to use.

TYPE: Embeddings

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

ids

Optional list of IDs associated with the texts.

TYPE: list[str] | None DEFAULT: None

**kwargs

Additional keyword arguments.

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

VectorStore initialized from texts and embeddings.

as_retriever

as_retriever(**kwargs: Any) -> VectorStoreRetriever

Return VectorStoreRetriever initialized from this VectorStore.

PARAMETER DESCRIPTION
**kwargs

Keyword arguments to pass to the search function. Can include:

  • search_type: Defines the type of search that the Retriever should perform. Can be 'similarity' (default), 'mmr', or 'similarity_score_threshold'.
  • search_kwargs: Keyword arguments to pass to the search function. Can include things like:

    • k: Number of documents to return (Default: 4)
    • score_threshold: Minimum relevance threshold for similarity_score_threshold
    • fetch_k: Number of documents to pass to the MMR algorithm (Default: 20)
    • lambda_mult: Diversity of results returned by MMR; 1 for minimum diversity and 0 for maximum. (Default: 0.5)
    • filter: Filter by document metadata

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
VectorStoreRetriever

Retriever class for VectorStore.

Examples:

# Retrieve more documents with higher diversity
# Useful if your dataset has many similar documents
docsearch.as_retriever(
    search_type="mmr", search_kwargs={"k": 6, "lambda_mult": 0.25}
)

# Fetch more documents for the MMR algorithm to consider
# But only return the top 5
docsearch.as_retriever(search_type="mmr", search_kwargs={"k": 5, "fetch_k": 50})

# Only retrieve documents that have a relevance score
# Above a certain threshold
docsearch.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"score_threshold": 0.8},
)

# Only get the single most similar document from the dataset
docsearch.as_retriever(search_kwargs={"k": 1})

# Use a filter to only retrieve documents from a specific paper
docsearch.as_retriever(
    search_kwargs={"filter": {"paper_title": "GPT-4 Technical Report"}}
)

query

query(query: str, *, params: dict | None = None) -> list[dict[str, Any]]

Query Neo4j database with retries and exponential backoff.

PARAMETER DESCRIPTION
query

The Cypher query to execute.

TYPE: str

params

Dictionary of query parameters.

TYPE: dict | None DEFAULT: None

RETURNS DESCRIPTION
list[dict[str, Any]]

List of dictionaries containing the query results.
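
The retry-with-backoff behaviour mentioned above can be pictured with a minimal sketch. This is not the library's implementation: run_with_backoff, max_retries, base_delay, and the injectable sleep callable are hypothetical names chosen for illustration, and the number of retries and the delays the library actually uses are not stated here.

```python
import time
from typing import Any, Callable


def run_with_backoff(
    operation: Callable[[], Any],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `operation`, retrying on failure with exponentially growing delays.

    Hypothetical helper: names and defaults are assumptions, not library API.
    """
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception:
            # Re-raise once the final attempt has failed.
            if attempt == max_retries - 1:
                raise
            # The delay doubles on every retry: base, 2 * base, 4 * base, ...
            sleep(base_delay * (2 ** attempt))


# Usage: a stand-in query that fails twice and then succeeds.
calls = {"count": 0}
delays: list[float] = []


def flaky_query() -> list[dict[str, Any]]:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return [{"n": 1}]


# Record the delays instead of actually sleeping.
result = run_with_backoff(flaky_query, sleep=delays.append)
```

Injecting the sleep function keeps the sketch fast to run and makes the doubling delays easy to inspect.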

verify_version

verify_version() -> None

Check if the connected Neo4j database version supports vector indexing.

Queries the Neo4j database to retrieve its version and compares it against a target version (5.11.0) that is known to support vector indexing. Raises a ValueError if the connected Neo4j version is not supported.
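
A version check of this kind could look roughly like the sketch below. How the library parses the server's version string is an assumption here; is_version_supported and check_version are hypothetical helpers, and only the 5.11.0 target and the ValueError come from the description above.

```python
def is_version_supported(
    version: str, minimum: tuple[int, int, int] = (5, 11, 0)
) -> bool:
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper. Suffixes such as "-aura" are ignored (an assumption).
    """
    core = version.split("-")[0]
    parts = [int(p) for p in core.split(".")[:3]]
    # Pad short versions such as "5.11" to three components.
    parts += [0] * (3 - len(parts))
    # Compare as integer tuples so that 5.9 sorts below 5.11.
    return tuple(parts) >= minimum


def check_version(version: str) -> None:
    """Raise ValueError for versions below the vector-index target."""
    if not is_version_supported(version):
        raise ValueError(
            f"Neo4j {version} does not support vector indexing; "
            "version 5.11.0 or later is needed."
        )


# Usage: an older server version is rejected.
try:
    check_version("4.4.12")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Comparing integer tuples rather than raw strings matters because "5.9" sorts above "5.11" lexicographically.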

retrieve_existing_index

retrieve_existing_index() -> tuple[int | None, str] | None

Check if the vector index exists in the Neo4j database and return its embedding dimension.

This method queries the Neo4j database for existing indexes and attempts to retrieve the dimension of the vector index with the specified name. If the index exists, its dimension is returned. If the index doesn't exist, None is returned.

RETURNS DESCRIPTION
tuple[int | None, str] | None

If the index is found, a tuple whose first element is the embedding dimension of the existing index (int or None), paired with a string describing the index.

retrieve_existing_fts_index

retrieve_existing_fts_index(text_node_properties: list[str] = []) -> str | None

Check if the full-text index exists in the Neo4j database.

This method queries the Neo4j database for existing full-text (FTS) indexes with the specified name.

RETURNS DESCRIPTION
str | None

Keyword index information

create_new_index

create_new_index() -> None

This method constructs a Cypher query and executes it to create a new vector index in Neo4j.

create_new_keyword_index

create_new_keyword_index(text_node_properties: list[str] = []) -> None

This method constructs a Cypher query and executes it to create a new full text index in Neo4j.
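
As an illustration of what such a statement can look like, the sketch below builds a Cypher string using Neo4j 5's CREATE FULLTEXT INDEX syntax. The helper name build_fulltext_index_query and its index_name and node_label arguments are hypothetical, and the exact query the library issues may differ.

```python
def build_fulltext_index_query(
    index_name: str, node_label: str, text_node_properties: list[str]
) -> str:
    """Build a Cypher statement that creates a full-text index.

    Hypothetical helper; the statement the library generates is not shown
    in this reference and may differ.
    """
    if not text_node_properties:
        raise ValueError("at least one text property is required")
    # Backticks quote identifiers so unusual names remain valid Cypher.
    properties = ", ".join(f"n.`{prop}`" for prop in text_node_properties)
    return (
        f"CREATE FULLTEXT INDEX `{index_name}` IF NOT EXISTS "
        f"FOR (n:`{node_label}`) ON EACH [{properties}]"
    )


statement = build_fulltext_index_query("keyword", "Document", ["title", "content"])
```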

add_embeddings

add_embeddings(
    texts: Iterable[str],
    embeddings: list[list[float]],
    metadatas: list[dict] | None = None,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Add embeddings to the VectorStore.

PARAMETER DESCRIPTION
texts

Iterable of strings to add to the VectorStore.

TYPE: Iterable[str]

embeddings

List of embedding vectors, one per text.

TYPE: list[list[float]]

metadatas

List of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

kwargs

VectorStore specific parameters

TYPE: Any DEFAULT: {}
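
The parallel arguments above are expected to line up index by index. The sketch below shows one way to validate and combine them before writing. build_rows is a hypothetical helper, and the uuid4 fallback for missing ids is an assumption; how the library generates ids is not stated here.

```python
from typing import Any, Iterable, Optional
from uuid import uuid4


def build_rows(
    texts: Iterable[str],
    embeddings: list[list[float]],
    metadatas: Optional[list[dict]] = None,
    ids: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Pair each text with its embedding, metadata, and id.

    Hypothetical helper; the id fallback below is an assumption.
    """
    texts = list(texts)
    if metadatas is None:
        metadatas = [{} for _ in texts]
    if ids is None:
        ids = [str(uuid4()) for _ in texts]
    if not (len(texts) == len(embeddings) == len(metadatas) == len(ids)):
        raise ValueError("texts, embeddings, metadatas and ids must have equal length")
    return [
        {"id": i, "text": t, "embedding": e, "metadata": m}
        for i, t, e, m in zip(ids, texts, embeddings, metadatas)
    ]


rows = build_rows(["a", "b"], [[0.1, 0.2], [0.3, 0.4]], ids=["1", "2"])
```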

add_texts

add_texts(
    texts: Iterable[str],
    metadatas: list[dict] | None = None,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> list[str]

Run more texts through the embeddings and add to the VectorStore.

PARAMETER DESCRIPTION
texts

Iterable of strings to add to the VectorStore.

TYPE: Iterable[str]

metadatas

Optional list of metadatas associated with the texts.

TYPE: list[dict] | None DEFAULT: None

kwargs

VectorStore specific parameters

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
list[str]

List of IDs from adding the texts into the VectorStore.

similarity_search

similarity_search(
    query: str,
    k: int = 4,
    params: dict[str, Any] = {},
    filter: dict[str, Any] | None = None,
    effective_search_ratio: int = 1,
    **kwargs: Any,
) -> list[Document]

Run similarity search with Neo4jVector.

PARAMETER DESCRIPTION
query

Query text to search for.

TYPE: str

k

Number of results to return.

TYPE: int DEFAULT: 4

params

The search params for the index type.

TYPE: dict[str, Any] DEFAULT: {}

filter

Dictionary of argument(s) to filter on metadata.

TYPE: dict[str, Any] | None DEFAULT: None

effective_search_ratio

Multiplier applied to k to set the candidate pool size, balancing query accuracy against performance.

TYPE: int DEFAULT: 1

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query.
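
The effect of effective_search_ratio can be read as simple arithmetic. The sketch below assumes the candidate pool is k multiplied by the ratio, with k results returned at the end; candidate_pool_size is a hypothetical helper, not part of the library.

```python
def candidate_pool_size(k: int, effective_search_ratio: int = 1) -> int:
    """Number of candidates considered before trimming to k results.

    Assumption: the pool is simply k multiplied by the ratio.
    """
    if k < 1 or effective_search_ratio < 1:
        raise ValueError("k and effective_search_ratio must be positive integers")
    return k * effective_search_ratio


# With the default ratio the pool equals k.
pool_default = candidate_pool_size(k=4)
# A ratio of 5 considers 20 candidates while still returning 4 results.
pool_wider = candidate_pool_size(k=4, effective_search_ratio=5)
```

A larger ratio gives the search more candidates to choose from, at the cost of more work per query.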

similarity_search_with_score

similarity_search_with_score(
    query: str,
    k: int = 4,
    params: dict[str, Any] = {},
    filter: dict[str, Any] | None = None,
    effective_search_ratio: int = 1,
    **kwargs: Any,
) -> list[tuple[Document, float]]

Return docs most similar to query.

PARAMETER DESCRIPTION
query

Text to look up documents similar to.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

params

The search params for the index type.

TYPE: dict[str, Any] DEFAULT: {}

filter

Dictionary of argument(s) to filter on metadata.

TYPE: dict[str, Any] | None DEFAULT: None

effective_search_ratio

Multiplier applied to k to set the candidate pool size, balancing query accuracy against performance.

TYPE: int DEFAULT: 1

RETURNS DESCRIPTION
list[tuple[Document, float]]

List of Document objects most similar to the query, each paired with its score.
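
Because each result is a (document, score) pair, callers can post-process the list directly. The sketch below keeps only pairs at or above a threshold; filter_by_score is a hypothetical helper, and plain strings stand in for Document objects.

```python
from typing import Any


def filter_by_score(
    results: list[tuple[Any, float]], score_threshold: float
) -> list[tuple[Any, float]]:
    """Keep only (document, score) pairs whose score meets the threshold."""
    return [(doc, score) for doc, score in results if score >= score_threshold]


# Stand-in results: strings replace Document objects for illustration.
results = [("doc-a", 0.92), ("doc-b", 0.81), ("doc-c", 0.40)]
relevant = filter_by_score(results, score_threshold=0.8)
```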

similarity_search_with_score_by_vector

similarity_search_with_score_by_vector(
    embedding: list[float],
    k: int = 4,
    filter: dict[str, Any] | None = None,
    params: dict[str, Any] = {},
    effective_search_ratio: int = 1,
    **kwargs: Any,
) -> list[tuple[Document, float]]

Perform a similarity search in the Neo4j database using a given vector and return the top k similar documents with their scores.

This method uses a Cypher query to find the top k documents that are most similar to a given embedding. The similarity is measured using a vector index in the Neo4j database. The results are returned as a list of tuples, each containing a Document object and its similarity score.

PARAMETER DESCRIPTION
embedding

The embedding vector to compare against.

TYPE: list[float]

k

The number of top similar documents to retrieve.

TYPE: int DEFAULT: 4

filter

Dictionary of argument(s) to filter on metadata.

TYPE: dict[str, Any] | None DEFAULT: None

params

The search params for the index type.

TYPE: dict[str, Any] DEFAULT: {}

effective_search_ratio

Multiplier applied to k to set the candidate pool size, balancing query accuracy against performance.

TYPE: int DEFAULT: 1

RETURNS DESCRIPTION
list[tuple[Document, float]]

A list of tuples, each containing a Document object and its similarity score.
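
For intuition about what a top-k vector search computes, here is a brute-force sketch in plain Python using cosine similarity. It is only an illustration: Neo4j answers the query through its vector index rather than an exhaustive scan, the similarity function depends on how the index was configured, and scores reported by the database may be scaled differently. cosine_similarity and top_k_by_vector are hypothetical names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_vector(
    embedding: list[float], corpus: dict[str, list[float]], k: int = 4
) -> list[tuple[str, float]]:
    """Score every stored vector against `embedding` and return the best k."""
    scored = [
        (doc_id, cosine_similarity(embedding, vector))
        for doc_id, vector in corpus.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


corpus = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = top_k_by_vector([1.0, 0.0], corpus, k=2)
```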

similarity_search_by_vector

similarity_search_by_vector(
    embedding: list[float],
    k: int = 4,
    filter: dict[str, Any] | None = None,
    params: dict[str, Any] = {},
    effective_search_ratio: int = 1,
    **kwargs: Any,
) -> list[Document]

Return docs most similar to embedding vector.

PARAMETER DESCRIPTION
embedding

Embedding to look up documents similar to.

TYPE: list[float]

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

filter

Dictionary of argument(s) to filter on metadata.

TYPE: dict[str, Any] | None DEFAULT: None

params

The search params for the index type.

TYPE: dict[str, Any] DEFAULT: {}

RETURNS DESCRIPTION
list[Document]

List of Document objects most similar to the query vector.
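
The filter argument restricts results by metadata. The sketch below shows the simplest interpretation, exact matching on every key. matches_filter is a hypothetical helper; any richer filter syntax the library may accept is not covered here.

```python
from typing import Any, Optional


def matches_filter(
    metadata: dict[str, Any], metadata_filter: Optional[dict[str, Any]]
) -> bool:
    """Return True if every key in the filter is present with an equal value."""
    if not metadata_filter:
        # No filter means nothing is excluded.
        return True
    return all(
        key in metadata and metadata[key] == value
        for key, value in metadata_filter.items()
    )


documents = [
    {"paper_title": "GPT-4 Technical Report", "page": 1},
    {"paper_title": "Another Paper", "page": 1},
]
selected = [
    m for m in documents
    if matches_filter(m, {"paper_title": "GPT-4 Technical Report"})
]
```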

from_texts classmethod

from_texts(
    texts: list[str],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Neo4jVector

Return Neo4jVector initialized from texts and embeddings. Neo4j credentials are required in the form of url, username, and password, together with optional database parameters.

from_embeddings classmethod

from_embeddings(
    text_embeddings: list[tuple[str, list[float]]],
    embedding: Embeddings,
    metadatas: list[dict] | None = None,
    distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
    ids: list[str] | None = None,
    pre_delete_collection: bool = False,
    **kwargs: Any,
) -> Neo4jVector

Construct Neo4jVector wrapper from raw documents and pre-generated embeddings.

Return Neo4jVector initialized from documents and embeddings. Neo4j credentials are required in the form of url, username, and password, together with optional database parameters.

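Example
from langchain_neo4j import Neo4jVector
from langchain_openai import OpenAIEmbeddings

# Placeholder input texts
texts = ["first document", "second document"]

embeddings = OpenAIEmbeddings()
# Embed the texts up front, then pair each text with its vector
text_embeddings = embeddings.embed_documents(texts)
text_embedding_pairs = list(zip(texts, text_embeddings))
# Neo4j credentials (url, username, password) are also required; omitted here
vectorstore = Neo4jVector.from_embeddings(
    text_embedding_pairs, embeddings)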

from_existing_index classmethod

from_existing_index(
    embedding: Embeddings,
    index_name: str,
    search_type: SearchType = DEFAULT_SEARCH_TYPE,
    keyword_index_name: str | None = None,
    embedding_dimension: int | None = None,
    **kwargs: Any,
) -> Neo4jVector

Get an instance of an existing Neo4j vector index. This method returns the store instance without inserting any new embeddings. Neo4j credentials are required in the form of url, username, and password, together with optional database parameters and the index_name definition.

from_existing_relationship_index classmethod

from_existing_relationship_index(
    embedding: Embeddings,
    index_name: str,
    search_type: SearchType = DEFAULT_SEARCH_TYPE,
    embedding_dimension: int | None = None,
    **kwargs: Any,
) -> Neo4jVector

Get an instance of an existing Neo4j relationship vector index. This method returns the store instance without inserting any new embeddings. Neo4j credentials are required in the form of url, username, and password, together with optional database parameters and the index_name definition.

from_documents classmethod

from_documents(
    documents: list[Document],
    embedding: Embeddings,
    distance_strategy: DistanceStrategy = DEFAULT_DISTANCE_STRATEGY,
    ids: list[str] | None = None,
    **kwargs: Any,
) -> Neo4jVector

Return Neo4jVector initialized from documents and embeddings. Neo4j credentials are required in the form of url, username, and password, together with optional database parameters.

from_existing_graph classmethod

from_existing_graph(
    embedding: Embeddings,
    node_label: str,
    embedding_node_property: str,
    text_node_properties: list[str],
    *,
    keyword_index_name: str | None = "keyword",
    index_name: str = "vector",
    search_type: SearchType = DEFAULT_SEARCH_TYPE,
    retrieval_query: str = "",
    **kwargs: Any,
) -> Neo4jVector

Initialize and return a Neo4jVector instance from an existing graph.

This method initializes a Neo4jVector instance using the provided parameters and the existing graph. It validates the existence of the indices and creates new ones if they don't exist.

RETURNS DESCRIPTION
Neo4jVector

An instance of Neo4jVector initialized with the provided parameters and existing graph.

Example
neo4j_vector = Neo4jVector.from_existing_graph(
    embedding=my_embedding,
    node_label="Document",
    embedding_node_property="embedding",
    text_node_properties=["title", "content"]
)

Note

Neo4j credentials are required in the form of url, username, and password, and optional database parameters passed as additional keyword arguments.

max_marginal_relevance_search

max_marginal_relevance_search(
    query: str,
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    filter: dict | None = None,
    **kwargs: Any,
) -> list[Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

PARAMETER DESCRIPTION
query

Search query text.

TYPE: str

k

Number of Document objects to return.

TYPE: int DEFAULT: 4

fetch_k

Number of Document objects to fetch to pass to MMR algorithm.

TYPE: int DEFAULT: 20

lambda_mult

Number between 0 and 1 that determines the degree of diversity among the results with 0 corresponding to maximum diversity and 1 to minimum diversity.

TYPE: float DEFAULT: 0.5

filter

Filter on metadata properties, e.g.

{
    "str_property": "foo",
    "int_property": 123
}

TYPE: dict | None DEFAULT: None

RETURNS DESCRIPTION
list[Document]

List of Document objects selected by maximal marginal relevance.
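
To make the roles of k, fetch_k, and lambda_mult concrete, here is a minimal sketch of greedy maximal marginal relevance selection in plain Python. It illustrates the general technique under stated assumptions and is not the library's code: cosine and mmr_select are hypothetical names, and the candidates list stands in for the fetch_k documents retrieved before re-ranking.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity (0.0 if either vector is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: list[float],
    candidates: list[list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Greedily pick up to k candidate indices, trading relevance for diversity."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i: int) -> float:
            relevance = cosine(query_vec, candidates[i])
            # Similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
# Index 1 is a near-duplicate of index 0; index 2 is less relevant but distinct.
candidates = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]
relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
more_diverse = mmr_select(query, candidates, k=2, lambda_mult=0.25)
```

With lambda_mult set to 1.0 the near-duplicate is chosen second because only relevance counts; lowering it to 0.25 penalises redundancy enough that the distinct candidate is chosen instead.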