Callback to handle rate limiting based on the number of requests or the number of tokens in the input. It uses Upstash Ratelimit to track the rate limit, which in turn uses Upstash Redis to track the state.

The handler should not be passed to the chain when the chain is initialised, because the handler carries state that must be fresh for every invocation. Instead, initialise a new handler and pass it in every time you invoke.

| Name | Type | Description |
|---|---|---|
| identifier | Union[int, str]* | The identifier the rate limits are applied to. |
| token_ratelimit | Optional[Ratelimit]* | Ratelimit to limit the number of tokens. Only works with OpenAI models, since only these models report the number of tokens in their output. |
| request_ratelimit | Optional[Ratelimit]* | Ratelimit to limit the number of requests. |
| include_output_tokens | bool* | Whether to also count output tokens when rate limiting based on the number of tokens. Only used when token_ratelimit is passed. False by default. |
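The real Ratelimit objects come from the Upstash Ratelimit library and store their counters in Upstash Redis. As a minimal sketch of the idea only, a fixed-window limiter counting requests per identifier could look like this (class and attribute names here are illustrative, not the real API):

```python
import time

class FixedWindowLimiter:
    """Minimal in-memory stand-in for an Upstash-style fixed-window rate
    limiter. The real Ratelimit keeps its counters in Upstash Redis, so the
    limit is shared across processes; this sketch only illustrates the
    per-identifier counting."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window = window_seconds
        self._counts: dict = {}  # (identifier, window index) -> request count

    def limit(self, identifier) -> bool:
        """Record one request; return True while the identifier is allowed."""
        key = (identifier, int(time.time() // self.window))
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self.max_requests

limiter = FixedWindowLimiter(max_requests=2, window_seconds=60)
results = [limiter.limit("user-1") for _ in range(3)]
# the third request by "user-1" inside the same window is denied
```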
Run when the chain starts running.

on_chain_start runs multiple times during a single chain execution. To make sure the check only happens once, we keep a bool state _checked. If not self._checked, we call limit with request_ratelimit and raise UpstashRatelimitError if the identifier is rate limited.
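The guard described above can be sketched with a stand-in class (only the UpstashRatelimitError name matches the real handler; everything else is illustrative):

```python
class UpstashRatelimitError(Exception):
    """Stand-in for the error raised when the identifier is rate limited."""

class ChainStartGuard:
    """Illustrative stand-in for the handler's on_chain_start bookkeeping."""

    def __init__(self, remaining: int):
        self.remaining = remaining  # requests left under request_ratelimit
        self._checked = False       # ensures the check runs once per invoke

    def on_chain_start(self) -> None:
        if self._checked:
            return  # on_chain_start fires for every sub-chain; skip repeats
        self._checked = True
        if self.remaining <= 0:
            raise UpstashRatelimitError("request limit reached")
        self.remaining -= 1

guard = ChainStartGuard(remaining=1)
guard.on_chain_start()  # consumes the last allowed request
guard.on_chain_start()  # no-op: already checked during this invocation
```

Because _checked is never reset on the same object, reusing a handler across invocations would silently skip the limit check — which is why a fresh handler is needed per invoke.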
Run when the LLM starts running.
Run when the LLM ends running.

If include_output_tokens is set to True, the number of tokens in the LLM completion is also counted for rate limiting.
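The counting rule can be illustrated with a hypothetical helper (the function name and dict shape are assumptions, modelled on OpenAI-style token usage, since only OpenAI models report token counts):

```python
def tokens_to_count(llm_output: dict, include_output_tokens: bool = False) -> int:
    """Hypothetical helper mirroring the rule above: prompt tokens always
    count toward the token rate limit; completion (output) tokens are added
    only when include_output_tokens is True."""
    usage = llm_output["token_usage"]
    total = usage["prompt_tokens"]
    if include_output_tokens:
        total += usage["completion_tokens"]
    return total

output = {"token_usage": {"prompt_tokens": 12, "completion_tokens": 30}}
# tokens_to_count(output)        -> 12 (input tokens only)
# tokens_to_count(output, True)  -> 42 (input + output tokens)
```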
Creates a new UpstashRatelimitHandler object with the same ratelimit configurations, but with a new identifier if one is provided.
Also resets the state of the handler.
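A sketch of the reset contract, using an illustrative stand-in class (not the real implementation): the returned handler keeps the ratelimit configuration, optionally swaps the identifier, and starts with clean per-invocation state.

```python
class RatelimitHandlerSketch:
    """Illustrative stand-in showing the reset() contract described above."""

    def __init__(self, identifier, token_ratelimit=None, request_ratelimit=None):
        self.identifier = identifier
        self.token_ratelimit = token_ratelimit
        self.request_ratelimit = request_ratelimit
        self._checked = False  # per-invocation state

    def reset(self, identifier=None):
        # Return a brand-new handler instead of mutating self, so every
        # invocation starts with _checked == False.
        return RatelimitHandlerSketch(
            identifier if identifier is not None else self.identifier,
            token_ratelimit=self.token_ratelimit,
            request_ratelimit=self.request_ratelimit,
        )

used = RatelimitHandlerSketch("user-1", request_ratelimit="10/minute")
used._checked = True            # pretend one invocation already ran
fresh = used.reset("user-2")    # same config, new identifier, clean state
```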