AzureOpenAIEmbeddings
¶
Reference docs
This page contains reference documentation for AzureOpenAIEmbeddings. See the docs for conceptual guides, tutorials, and examples on using AzureOpenAIEmbeddings.
langchain_openai.embeddings.AzureOpenAIEmbeddings
¶
Bases: OpenAIEmbeddings
AzureOpenAI embedding model integration.
Setup
To access AzureOpenAI embedding models you'll need to create an Azure account, get an API key, and install the langchain-openai integration package.
You'll need to have an Azure OpenAI instance deployed. You can deploy a version on Azure Portal following this guide.
Once you have your instance running, make sure you have the name of your instance and key. You can find the key in the Azure Portal, under the “Keys and Endpoint” section of your instance.
Key init args — completion params:
model: Name of the AzureOpenAI model to use.
dimensions: Number of dimensions for the embeddings. Can be specified only if the underlying model supports it.
See full list of supported init args and their descriptions in the params section.
Instantiate
from langchain_openai import AzureOpenAIEmbeddings

embeddings = AzureOpenAIEmbeddings(
    model="text-embedding-3-large",
    # dimensions=...,  # int | None; can be set with text-embedding-3 and later models
    # azure_endpoint="https://<your-endpoint>.openai.azure.com/",  # If not provided, read from env variable AZURE_OPENAI_ENDPOINT
    # api_key=...,  # If not provided, read from env variable AZURE_OPENAI_API_KEY
    # openai_api_version=...,  # If not provided, read from env variable OPENAI_API_VERSION
)
Async
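The async methods aembed_documents and aembed_query can be awaited from any coroutine. The following is a minimal sketch, assuming langchain-openai is installed and that AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and OPENAI_API_VERSION are set in the environment. The helper name embed_async is illustrative and not part of the library.

```python
import asyncio


async def embed_async(
    texts: list[str], query: str
) -> tuple[list[list[float]], list[float]]:
    """Embed a list of documents and one query concurrently."""
    # Imported lazily so this module loads even without langchain-openai installed.
    from langchain_openai import AzureOpenAIEmbeddings

    # Endpoint, key, and API version are assumed to come from the environment.
    embeddings = AzureOpenAIEmbeddings(model="text-embedding-3-large")

    # Issue both requests at once and wait for both results.
    doc_vectors, query_vector = await asyncio.gather(
        embeddings.aembed_documents(texts),
        embeddings.aembed_query(query),
    )
    return doc_vectors, query_vector


# Example call (requires valid Azure credentials and network access):
# doc_vectors, query_vector = asyncio.run(embed_async(["hello", "goodbye"], "hi"))
```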
METHOD | DESCRIPTION |
---|---|
embed_documents | Call out to OpenAI's embedding endpoint for embedding search docs. |
embed_query | Call out to OpenAI's embedding endpoint for embedding query text. |
aembed_documents | Call out to OpenAI's embedding endpoint async for embedding search docs. |
aembed_query | Call out to OpenAI's embedding endpoint async for embedding query text. |
build_extra | Build extra kwargs from additional params that were passed in. |
validate_environment | Validate that the API key and Python package exist in the environment. |
dimensions
class-attribute
instance-attribute
¶
dimensions: int | None = None
The number of dimensions the resulting output embeddings should have.
Only supported in text-embedding-3 and later models.
openai_api_base
class-attribute
instance-attribute
¶
openai_api_base: str | None = Field(
    alias="base_url", default_factory=from_env("OPENAI_API_BASE", default=None)
)
Base URL path for API requests. Leave blank if not using a proxy or service emulator.
embedding_ctx_length
class-attribute
instance-attribute
¶
embedding_ctx_length: int = 8191
The maximum number of tokens to embed at once.
openai_organization
class-attribute
instance-attribute
¶
openai_organization: str | None = Field(
    alias="organization",
    default_factory=from_env(["OPENAI_ORG_ID", "OPENAI_ORGANIZATION"], default=None),
)
Automatically inferred from env var OPENAI_ORG_ID if not provided.
max_retries
class-attribute
instance-attribute
¶
max_retries: int = 2
Maximum number of retries to make when generating.
request_timeout
class-attribute
instance-attribute
¶
Timeout for requests to OpenAI completion API. Can be a float, httpx.Timeout, or None.
tiktoken_enabled
class-attribute
instance-attribute
¶
tiktoken_enabled: bool = True
Set this to False for non-OpenAI implementations of the embeddings API, e.g. the --extensions openai extension for text-generation-webui.
tiktoken_model_name
class-attribute
instance-attribute
¶
tiktoken_model_name: str | None = None
The model name to pass to tiktoken when using this class. Tiktoken is used to count the number of tokens in documents to constrain them to be under a certain limit. By default, when set to None, this will be the same as the embedding model name. However, there are some cases where you may want to use this Embedding class with a model name not supported by tiktoken. This can include when using Azure embeddings or when using one of the many model providers that expose an OpenAI-like API but with different models. In those cases, in order to avoid erroring when tiktoken is called, you can specify a model name to use here.
show_progress_bar
class-attribute
instance-attribute
¶
show_progress_bar: bool = False
Whether to show a progress bar when embedding.
model_kwargs
class-attribute
instance-attribute
¶
Holds any model parameters valid for the create call that are not explicitly specified.
skip_empty
class-attribute
instance-attribute
¶
skip_empty: bool = False
Whether to skip empty strings when embedding or raise an error.
retry_min_seconds
class-attribute
instance-attribute
¶
retry_min_seconds: int = 4
Minimum number of seconds to wait between retries.
retry_max_seconds
class-attribute
instance-attribute
¶
retry_max_seconds: int = 20
Maximum number of seconds to wait between retries.
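One common way to apply a minimum and maximum wait is clamped exponential backoff. The sketch below is illustrative only: it is an assumption about how such bounds are typically used, not a description of the library's retry code, and backoff_seconds is a hypothetical helper.

```python
def backoff_seconds(
    attempt: int, retry_min_seconds: int = 4, retry_max_seconds: int = 20
) -> int:
    """Exponential wait (2**attempt seconds) clamped to the configured bounds."""
    return max(retry_min_seconds, min(retry_max_seconds, 2**attempt))


# Waits for the first six retry attempts using the default bounds.
print([backoff_seconds(n) for n in range(6)])  # [4, 4, 4, 8, 16, 20]
```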
http_client
class-attribute
instance-attribute
¶
http_client: Any | None = None
Optional httpx.Client. Only used for sync invocations. Must specify http_async_client as well if you'd like a custom client for async invocations.
http_async_client
class-attribute
instance-attribute
¶
http_async_client: Any | None = None
Optional httpx.AsyncClient. Only used for async invocations. Must specify http_client as well if you'd like a custom client for sync invocations.
check_embedding_ctx_length
class-attribute
instance-attribute
¶
check_embedding_ctx_length: bool = True
Whether to check the token length of inputs and automatically split inputs longer than embedding_ctx_length.
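To illustrate the kind of splitting this check enables, here is a simplified sketch of the windowing step alone. It assumes the input has already been tokenized and does not show how the per-window embeddings are combined afterwards. split_tokens is a hypothetical helper, not a library function.

```python
def split_tokens(
    tokens: list[int], embedding_ctx_length: int = 8191
) -> list[list[int]]:
    """Split a token sequence into consecutive windows of at most embedding_ctx_length."""
    if embedding_ctx_length <= 0:
        raise ValueError("embedding_ctx_length must be positive")
    return [
        tokens[start : start + embedding_ctx_length]
        for start in range(0, len(tokens), embedding_ctx_length)
    ]


# Stand-in token ids; a real tokenizer such as tiktoken would produce these.
tokens = list(range(20_000))
windows = split_tokens(tokens)
print([len(w) for w in windows])  # [8191, 8191, 3618]
```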
azure_endpoint
class-attribute
instance-attribute
¶
azure_endpoint: str | None = Field(
    default_factory=from_env("AZURE_OPENAI_ENDPOINT", default=None)
)
Your Azure endpoint, including the resource.
Automatically inferred from env var AZURE_OPENAI_ENDPOINT if not provided.
Example: https://example-resource.openai.azure.com/
deployment
class-attribute
instance-attribute
¶
A model deployment.
If given, sets the base client URL to include /deployments/{azure_deployment}.
Note
This means you won't be able to use non-deployment endpoints.
openai_api_key
class-attribute
instance-attribute
¶
openai_api_key: SecretStr | None = Field(
    alias="api_key",
    default_factory=secret_from_env(
        ["AZURE_OPENAI_API_KEY", "OPENAI_API_KEY"], default=None
    ),
)
Automatically inferred from env var AZURE_OPENAI_API_KEY (falling back to OPENAI_API_KEY) if not provided.
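The default factory above is given a list of environment variable names. A minimal sketch of that kind of ordered lookup, assuming the first variable that is set wins; first_env is a hypothetical stand-in written for illustration, not the library's secret_from_env:

```python
import os
from typing import Optional


def first_env(names: list[str], default: Optional[str] = None) -> Optional[str]:
    """Return the value of the first environment variable in names that is set."""
    for name in names:
        if name in os.environ:
            return os.environ[name]
    return default


# Demo with placeholder values: the Azure-specific variable wins when both are set.
os.environ["OPENAI_API_KEY"] = "generic-key"
os.environ["AZURE_OPENAI_API_KEY"] = "azure-key"
print(first_env(["AZURE_OPENAI_API_KEY", "OPENAI_API_KEY"]))  # azure-key
```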
openai_api_version
class-attribute
instance-attribute
¶
openai_api_version: str | None = Field(
    default_factory=from_env("OPENAI_API_VERSION", default="2023-05-15"),
    alias="api_version",
)
Automatically inferred from env var OPENAI_API_VERSION if not provided.
Set to '2023-05-15' by default if env variable OPENAI_API_VERSION is not set.
azure_ad_token
class-attribute
instance-attribute
¶
azure_ad_token: SecretStr | None = Field(
    default_factory=secret_from_env("AZURE_OPENAI_AD_TOKEN", default=None)
)
Your Azure Active Directory token.
Automatically inferred from env var AZURE_OPENAI_AD_TOKEN if not provided.
azure_ad_token_provider
class-attribute
instance-attribute
¶
A function that returns an Azure Active Directory token.
Will be invoked on every sync request. For async requests, will be invoked if azure_ad_async_token_provider is not provided.
azure_ad_async_token_provider
class-attribute
instance-attribute
¶
A function that returns an Azure Active Directory token.
Will be invoked on every async request.
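Because a token provider is invoked on every request, it can help to reuse a token until shortly before it expires. The following is a sketch under stated assumptions: fetch_token is a hypothetical callable that returns a token together with its expiry time, and cached_token_provider is an illustrative wrapper, not part of the library.

```python
import time
from typing import Callable


def cached_token_provider(
    fetch_token: Callable[[], tuple[str, float]],
    refresh_margin_seconds: float = 60.0,
) -> Callable[[], str]:
    """Wrap fetch_token so the token is reused until shortly before it expires.

    fetch_token is assumed to return (token, expiry_unix_time).
    """
    token = ""
    expires_at = 0.0

    def provider() -> str:
        nonlocal token, expires_at
        # Refresh only when the cached token is missing or about to expire.
        if time.time() >= expires_at - refresh_margin_seconds:
            token, expires_at = fetch_token()
        return token

    return provider


# Fake fetcher for demonstration; a real one would call an identity service.
fetch_log: list[float] = []


def fake_fetch() -> tuple[str, float]:
    fetch_log.append(time.time())
    return f"token-{len(fetch_log)}", time.time() + 3600


provider = cached_token_provider(fake_fetch)
print(provider(), provider(), len(fetch_log))  # token-1 token-1 1
```

A zero-argument callable like provider, returning a string, is the shape a sync token provider takes.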
chunk_size
class-attribute
instance-attribute
¶
chunk_size: int = 2048
Maximum number of texts to embed in each batch.
embed_documents
¶
embed_documents(
    texts: list[str], chunk_size: int | None = None, **kwargs: Any
) -> list[list[float]]
Call out to OpenAI's embedding endpoint for embedding search docs.
PARAMETER | DESCRIPTION |
---|---|
texts | The list of texts to embed. |
chunk_size | The chunk size of embeddings. If None, the chunk size specified by the class is used. |
kwargs | Additional keyword arguments to pass to the embedding API. |
RETURNS | DESCRIPTION |
---|---|
list[list[float]] | List of embeddings, one for each text. |
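The chunk_size parameter bounds how many inputs go into a single batch. A minimal sketch of that batching idea, assuming simple consecutive slices; batch_texts is a hypothetical helper and the library's internal batching may differ in detail:

```python
from typing import Iterator


def batch_texts(texts: list[str], chunk_size: int = 2048) -> Iterator[list[str]]:
    """Yield consecutive batches containing at most chunk_size texts."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(texts), chunk_size):
        yield texts[start : start + chunk_size]


# 5000 placeholder documents split with the default chunk size.
texts = [f"document {i}" for i in range(5000)]
print([len(batch) for batch in batch_texts(texts)])  # [2048, 2048, 904]
```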
embed_query
¶
aembed_documents
async
¶
aembed_documents(
    texts: list[str], chunk_size: int | None = None, **kwargs: Any
) -> list[list[float]]
Call out to OpenAI's embedding endpoint async for embedding search docs.
PARAMETER | DESCRIPTION |
---|---|
texts | The list of texts to embed. |
chunk_size | The chunk size of embeddings. If None, the chunk size specified by the class is used. |
kwargs | Additional keyword arguments to pass to the embedding API. |
RETURNS | DESCRIPTION |
---|---|
list[list[float]] | List of embeddings, one for each text. |
aembed_query
async
¶
build_extra
classmethod
¶
Build extra kwargs from additional params that were passed in.