LangChain Reference
    Python · langchain-core · vectorstores · in_memory
    Module · Since v0.2

    in_memory

    In-memory vector store.

    Functions

    • dumpd
    • cosine_similarity
    • maximal_marginal_relevance

    Classes

    • Document
    • VectorStore
    • Embeddings
    • InMemoryVectorStore

    Modules

    • load

    dumpd

    Return a dict representation of an object.

    maximal_marginal_relevance

    Calculate maximal marginal relevance.
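As a rough illustration of what maximal marginal relevance (MMR) computes, here is a minimal pure-Python sketch. It is not the library's function: the names `mmr_select`, `lambda_mult`, and `k` are assumptions made for this example. Each step greedily picks the candidate that best trades off similarity to the query against similarity to the items already chosen.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query, candidates, k=2, lambda_mult=0.5):
    """Return indices of up to k candidates chosen greedily by MMR.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Redundancy is the highest similarity to anything already picked.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.14], [0.6, 0.8]]

diverse = mmr_select(query, candidates, k=2, lambda_mult=0.3)
relevant_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
```

With a low `lambda_mult`, the second pick skips the near-duplicate at index 1 in favor of the more distinct vector at index 2. With `lambda_mult=1.0`, the selection reduces to a plain relevance ranking.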

    Document

    Class for storing a piece of text and associated metadata.

    Note

    Document is for retrieval workflows, not chat I/O. For sending text to an LLM in a conversation, use message types from langchain.messages.

    VectorStore

    Interface for vector store.

    Embeddings

    Interface for embedding models.

    This is an interface meant for implementing text embedding models.

    Text embedding models are used to map text to a vector (a point in n-dimensional space).

    Texts that are similar will usually be mapped to points that are close to each other in this space. The exact details of what's considered "similar" and how "distance" is measured in this space are dependent on the specific embedding model.

    This abstraction contains a method for embedding a list of documents and a method for embedding a query text. The embedding of a query text is expected to be a single vector, while the embedding of a list of documents is expected to be a list of vectors.

    Usually the query is embedded the same way as a document, but the abstraction allows the two to be treated independently.

    In addition to the synchronous methods, this interface also provides asynchronous versions of the methods.

    By default, the asynchronous methods are implemented using the synchronous methods; however, implementations may choose to override the asynchronous methods with an async native implementation for performance reasons.
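The shape of the interface described above can be sketched with a standalone toy class. This is an illustration, not the library's base class: `EmbeddingsSketch` and `LetterCountEmbeddings` are hypothetical names, and the method names (`embed_documents`, `embed_query`, and their `a`-prefixed async counterparts) are assumptions. The default async methods delegate to the synchronous ones by running them in a worker thread.

```python
import asyncio
from abc import ABC, abstractmethod


class EmbeddingsSketch(ABC):
    """Hypothetical stand-in for the embedding interface described above."""

    @abstractmethod
    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        """Embed several texts; returns one vector per text."""

    @abstractmethod
    def embed_query(self, text: str) -> list[float]:
        """Embed one query text; returns a single vector."""

    async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
        # Default async path: run the synchronous method in a worker thread.
        return await asyncio.to_thread(self.embed_documents, texts)

    async def aembed_query(self, text: str) -> list[float]:
        return await asyncio.to_thread(self.embed_query, text)


class LetterCountEmbeddings(EmbeddingsSketch):
    """Toy model: a 3-dimensional vector counting the letters a, b, and c."""

    def embed_query(self, text: str) -> list[float]:
        lowered = text.lower()
        return [float(lowered.count(ch)) for ch in "abc"]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # Here documents and queries share one embedding path.
        return [self.embed_query(t) for t in texts]


model = LetterCountEmbeddings()
doc_vectors = model.embed_documents(["abc", "aab"])      # a list of vectors
query_vector = asyncio.run(model.aembed_query("cab"))    # a single vector
```

A subclass backed by a natively asynchronous client could override the `a`-prefixed methods instead of relying on the thread-based default.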

    InMemoryVectorStore

    In-memory vector store implementation.

    Uses a dictionary, and computes cosine similarity for search using numpy.

    load

    Load LangChain objects from JSON strings or objects.

    How it works

    Each Serializable LangChain object has a unique identifier (its "class path"), which is a list of strings representing the module path and class name. For example:

    • AIMessage -> ["langchain_core", "messages", "ai", "AIMessage"]
    • ChatPromptTemplate -> ["langchain_core", "prompts", "chat", "ChatPromptTemplate"]

    When deserializing, the class path from the JSON 'id' field is checked against an allowlist. If the class is not in the allowlist, deserialization raises a ValueError.
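The allowlist check described above can be sketched as follows. This is a simplified illustration, not the library's code: `check_allowed` and `ALLOWED_CLASS_PATHS` are hypothetical names, and the `"type"` and `"kwargs"` fields of the payload are assumptions about the serialized layout. Only the `'id'` class path comes from the text.

```python
# Hypothetical allowlist holding a single class path taken from the examples above.
ALLOWED_CLASS_PATHS = {
    ("langchain_core", "messages", "ai", "AIMessage"),
}


def check_allowed(serialized: dict) -> tuple:
    """Return the class path if it is allowlisted; otherwise raise ValueError."""
    class_path = tuple(serialized.get("id", ()))
    if class_path not in ALLOWED_CLASS_PATHS:
        raise ValueError(f"Class path {list(class_path)} is not in the allowlist")
    return class_path


allowed_payload = {
    "lc": 1,
    "type": "constructor",  # assumed field
    "id": ["langchain_core", "messages", "ai", "AIMessage"],
    "kwargs": {"content": "hi"},  # assumed field
}
blocked_payload = {
    "lc": 1,
    "type": "constructor",
    "id": ["langchain_core", "prompts", "chat", "ChatPromptTemplate"],
    "kwargs": {},
}

accepted = check_allowed(allowed_payload)

try:
    check_allowed(blocked_payload)
    rejected = False
except ValueError:
    rejected = True
```

The check runs before anything is instantiated, so a class outside the allowlist never has its constructor called.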

    Threat model

    A serialized LangChain payload crosses a trust boundary because the manifest may contain serialized objects and configuration that affect runtime behavior. For example, a payload can configure a chat model with a custom base_url, custom headers, a different model name, or other constructor arguments. These are supported features, but they also mean the payload contents should be treated as executable configuration rather than plain text.

    Concretely, deserialization instantiates Python objects, so any constructor (__init__) or validator on an allowed class can run during load(). A crafted payload that is allowed to reach an unintended class — or an intended class with attacker-controlled kwargs — could cause network calls, file operations, or environment-variable access while the object is being built.

    Do not use with untrusted input

    If the source is untrusted, avoid calling load() / loads() on it. If you must, restrict allowed_objects to types that do not execute logic during init — allowed_objects='messages' (or an explicit list of message classes) is the safe choice. Keep secrets_from_env=False.

    The allowed_objects parameter controls which classes can be deserialized:

    • Explicit list of classes (recommended for untrusted input): only those specific classes are allowed.
    • 'messages': chat-message classes only (e.g. AIMessage, HumanMessage). Safe for untrusted input.
    • 'core' (current default): unsafe with untrusted manifests. Classes defined in the serialization mappings under langchain_core (messages, documents, prompts, etc.).
    • 'all': unsafe with untrusted manifests. Every class in the serialization mappings, including partner chat models and LLMs and their constructor kwargs (endpoint URLs, headers, model names, etc.).

    Side effects in allowed classes

    Deserialization calls __init__ on allowed classes. If those classes perform side effects during initialization (network calls, file operations, etc.), those side effects will occur. The allowlist prevents instantiation of classes outside the allowlist, but does not sandbox the allowed classes themselves or constrain their constructor kwargs.
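The caveat above can be shown with a toy loader. Everything here is hypothetical (`FakeChatModel`, `toy_load`, the example URLs); the sketch only demonstrates that an allowlisted class still receives whatever kwargs the payload supplies, and that its constructor runs during loading.

```python
# Records which endpoints were "contacted"; stands in for a real network call.
contacted: list[str] = []


class FakeChatModel:
    """Hypothetical allowed class whose constructor has a side effect."""

    def __init__(self, base_url: str = "https://api.example.com") -> None:
        self.base_url = base_url
        contacted.append(base_url)  # side effect happens at construction time


ALLOWED = {"FakeChatModel": FakeChatModel}


def toy_load(payload: dict):
    """Instantiate an allowlisted class with kwargs taken from the payload."""
    cls = ALLOWED.get(payload["id"][-1])
    if cls is None:
        raise ValueError("class not in allowlist")
    # The allowlist gates the class, not the arguments passed to it.
    return cls(**payload["kwargs"])


model = toy_load(
    {
        "id": ["example", "FakeChatModel"],
        "kwargs": {"base_url": "https://attacker.example"},
    }
)
```

The class passed the allowlist, yet the endpoint it was built with came entirely from the payload. This is why payload contents should be treated as executable configuration.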

    Import paths are also validated against trusted namespaces before any module is imported.

    Best practices

    • Use the most restrictive allowed_objects possible. For untrusted input, pass an explicit list of classes or 'messages'. 'core' and 'all' are unsafe with untrusted manifests; use them only when you trust the source of the entire payload, including its configuration.
    • Keep secrets_from_env set to False (the default). If you must use it, ensure the serialized data comes from a fully trusted source, as a crafted payload can read arbitrary environment variables.
    • When using secrets_map, include only the specific secrets that the serialized object requires.
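The difference between a narrow secrets_map and secrets_from_env can be sketched with a toy resolver. This is not the library's resolution logic: `resolve_secret` is a hypothetical helper, and the variable names and values are made up for the example.

```python
import os


def resolve_secret(secret_id: str, secrets_map=None, secrets_from_env: bool = False):
    """Look up a secret by id: the explicit map first, then (optionally) the environment."""
    secrets_map = secrets_map or {}
    if secret_id in secrets_map:
        return secrets_map[secret_id]
    if secrets_from_env:
        # A payload that names any variable can read it through this branch.
        return os.environ.get(secret_id)
    return None


# Hypothetical environment variable that the serialized object has no need for.
os.environ["UNRELATED_TOKEN"] = "do-not-leak"

# Include only the secret the serialized object actually requires.
secrets_map = {"OPENAI_API_KEY": "sk-example"}

needed = resolve_secret("OPENAI_API_KEY", secrets_map)
blocked = resolve_secret("UNRELATED_TOKEN", secrets_map)
leaked = resolve_secret("UNRELATED_TOKEN", secrets_map, secrets_from_env=True)
```

With the environment fallback off, a payload can obtain only what the map explicitly provides. With it on, the payload chooses which variable gets read.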

    Injection protection (escape-based)

    During serialization, plain dicts that contain an 'lc' key are escaped by wrapping them: {"__lc_escaped__": {...}}. During deserialization, escaped dicts are unwrapped and returned as plain dicts, NOT instantiated as LC objects.

    This is an allowlist approach: only dicts explicitly produced by Serializable.to_json() (which are NOT escaped) are treated as LC objects; everything else is user data.

    Even if an attacker's payload includes __lc_escaped__ wrappers, they are unwrapped to plain dicts and NOT instantiated as malicious objects.
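The escape-based protection described above can be sketched as a pair of helpers. These are illustrative only: `escape` and `unescape` are hypothetical names, and the real serializer's traversal rules may differ. The sketch wraps any plain dict that contains an 'lc' key on the way out and returns the wrapped content as plain data on the way in.

```python
def escape(value):
    """Wrap plain dicts that contain an 'lc' key; recurse into other containers."""
    if isinstance(value, dict):
        if "lc" in value:
            return {"__lc_escaped__": value}
        return {key: escape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [escape(item) for item in value]
    return value


def unescape(value):
    """Unwrap escaped dicts and return their content as plain data."""
    if isinstance(value, dict):
        if set(value) == {"__lc_escaped__"}:
            # Returned as-is: never treated as an object to instantiate.
            return value["__lc_escaped__"]
        return {key: unescape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unescape(item) for item in value]
    return value


# User-supplied data that happens to look like a serialized object.
user_data = {"note": {"lc": 1, "type": "constructor", "id": ["some", "Class"]}}

wire = escape(user_data)
restored = unescape(wire)
```

After the round trip the user data comes back unchanged, and at no point is the look-alike dict treated as an object to instantiate.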

    Examples

    from langchain_core.load import dumpd, load
    from langchain_core.messages import AIMessage, HumanMessage
    from langchain_core.prompts import ChatPromptTemplate

    # Serialize a message to a dict so there is something to load.
    data = dumpd(AIMessage(content="hello"))

    # Default allowlist ('core'): use only when the source is fully trusted.
    obj = load(data)

    # Allow only specific classes (most restrictive; recommended for untrusted input).
    obj = load(
        data,
        allowed_objects=[
            ChatPromptTemplate,
            AIMessage,
            HumanMessage,
        ],
    )