PineconeSparseVectorStore

Name	Type
index	Optional[Any]
embedding	Optional[PineconeSparseEmbeddings]
text_key	Optional[str]
namespace	Optional[str]
distance_strategy	Optional[DistanceStrategy]
pinecone_api_key	Optional[str]
index_name	Optional[str]
host	Optional[str]

aadd_texts

Asynchronously run more texts through the embeddings and add to the vectorstore.

Upserts are optimized by embedding the texts in chunks and upserting each chunk in batches. This avoids memory issues and makes better use of HTTP-based embeddings. For OpenAI embeddings, use pool_threads>4 when constructing the pinecone.Index, embedding_chunk_size>1000, and batch_size~64 for best performance.

Args:
    texts: Iterable of strings to add to the vectorstore.
    metadatas: Optional list of metadatas associated with the texts.
    ids: Optional list of ids to associate with the texts.
    namespace: Optional pinecone namespace to add the texts to.
    batch_size: Batch size to use when adding the texts to the vectorstore.
    embedding_chunk_size: Chunk size to use when embedding the texts.
    id_prefix: Optional string to use as an ID prefix when upserting vectors.
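The chunk-then-batch pattern described above can be illustrated without a Pinecone connection. The following is a minimal sketch, not the library's implementation: `chunked` and `plan_upserts` are hypothetical helpers, the `#` separator between `id_prefix` and the ID is an assumption, and the embedding step is only marked by a comment. It shows how `embedding_chunk_size` bounds how many texts are embedded at once, while `batch_size` bounds how many vectors go into each upsert request.

```python
import uuid
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple


def chunked(items: Iterable, size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def plan_upserts(
    texts: Iterable[str],
    ids: Optional[List[str]] = None,
    id_prefix: Optional[str] = None,
    embedding_chunk_size: int = 1000,
    batch_size: int = 64,
) -> List[List[Tuple[str, str]]]:
    """Return the (id, text) batches that would be upserted, in order.

    Illustration only: a real store would embed each chunk and send
    each batch to the index instead of collecting them in a list.
    """
    texts = list(texts)
    # Generate random IDs when the caller supplies none.
    ids = list(ids) if ids is not None else [str(uuid.uuid4()) for _ in texts]
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    if id_prefix:
        # Assumed separator; check the library source for the actual format.
        ids = [f"{id_prefix}#{i}" for i in ids]

    batches: List[List[Tuple[str, str]]] = []
    for chunk in chunked(zip(ids, texts), embedding_chunk_size):
        # A real implementation would embed the texts of this chunk here,
        # so at most embedding_chunk_size embeddings are held in memory.
        for batch in chunked(chunk, batch_size):
            # Each batch corresponds to one upsert request.
            batches.append(batch)
    return batches


if __name__ == "__main__":
    demo = plan_upserts(
        [f"text {i}" for i in range(10)],
        ids=[str(i) for i in range(10)],
        id_prefix="doc",
        embedding_chunk_size=4,
        batch_size=3,
    )
    for batch in demo:
        print([item_id for item_id, _ in batch])
```

Because batches never span two embedding chunks, the last batch of each chunk can be smaller than `batch_size`. Choosing an `embedding_chunk_size` that is a multiple of `batch_size` avoids those short requests.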

    from pinecone import Pinecone
    from langchain_pinecone import PineconeSparseVectorStore
    from langchain_pinecone.embeddings import PineconeSparseEmbeddings

    # Initialize Pinecone client
    pc = Pinecone(api_key="your-api-key")

    # Get your sparse index
    index = pc.Index("your-sparse-index-name")

    # Initialize embedding function
    embeddings = PineconeSparseEmbeddings()

    # Create vector store
    vectorstore = PineconeSparseVectorStore(
        index=index,
        embedding=embeddings,
        text_key="content",
        namespace="my-namespace"
    )

    from langchain_core.documents import Document

    docs = [
        Document(page_content="This is a sparse vector example"),
        Document(page_content="Another document for testing")
    ]

    # Option 1: Add from Document objects
    vectorstore.add_documents(docs)

    # Option 2: Add from texts
    texts = ["Text 1", "Text 2"]
    metadatas = [{"source": "source1"}, {"source": "source2"}]
    vectorstore.add_texts(texts, metadatas=metadatas)

    ids = ["id1", "id2"]
    texts = ["Updated text 1", "Updated text 2"]
    metadatas = [{"source": "updated_source1"}, {"source": "updated_source2"}]
    vectorstore.add_texts(texts, metadatas=metadatas, ids=ids)

    # Delete by IDs
    vectorstore.delete(ids=["id1", "id2"])

    # Delete by filter
    vectorstore.delete(filter={"source": "source1"})

    # Delete all documents in a namespace
    vectorstore.delete(delete_all=True, namespace="my-namespace")

    # Search for similar documents
    docs = vectorstore.similarity_search("query text", k=5)

    # Search with filters
    docs = vectorstore.similarity_search(
        "query text",
        k=5,
        filter={"source": "source1"}
    )

    # Maximal marginal relevance search for diversity
    docs = vectorstore.max_marginal_relevance_search(
        "query text",
        k=5,
        fetch_k=20,
        lambda_mult=0.5
    )

    # Search with relevance scores
    docs_and_scores = vectorstore.similarity_search_with_score(
        "query text",
        k=5
    )

    for doc, score in docs_and_scores:
        print(f"Score: {score}, Document: {doc.page_content}")

    # Create a retriever
    retriever = vectorstore.as_retriever()

    # Customize retriever; the metadata filter is passed through search_kwargs
    retriever = vectorstore.as_retriever(
        search_type="mmr",
        search_kwargs={
            "k": 5,
            "fetch_k": 20,
            "lambda_mult": 0.5,
            "filter": {"source": "source1"},
        },
    )

    # Use the retriever (invoke replaces the deprecated get_relevant_documents)
    docs = retriever.invoke("query text")
