Simple interface for implementing a custom LLM.
You should subclass this class and implement the following:

- _call method: Run the LLM on the given prompt and input (used by invoke).
- _identifying_params property: Return a dictionary of the identifying parameters.
  This is critical for caching and tracing purposes. The identifying parameters
  are a dict that identifies the LLM and should usually include a model_name.

Optional: override the following methods to provide more optimizations:

- _acall: Provide a native async version of the _call method.
  If not provided, calls will delegate to the synchronous version via
  run_in_executor (used by ainvoke).
- _stream: Stream the LLM on the given prompt and input.
  stream will use _stream if provided; otherwise it will use _call,
  and the output will arrive in one chunk.
- _astream: Provide a native async version of the _stream method.
  astream will use _astream if provided; otherwise it falls back to
  _stream if _stream is implemented, and to _acall if it is not.

cache: Whether to cache the response.
verbose: Whether to print out response text.
callbacks: Callbacks to add to the run trace.
tags: Tags to add to the run trace.
metadata: Metadata to add to the run trace.
custom_get_token_ids: Optional encoder to use for counting tokens.
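The subclassing contract described above can be sketched without the real base class. `MiniLLM` and `EchoLLM` below are hypothetical stand-ins that mirror the _call / _identifying_params shape, not the actual library interface:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

class MiniLLM(ABC):
    # Hypothetical stand-in for the real base class, mirroring the
    # contract described above; not the actual library interface.

    @abstractmethod
    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        """Run the LLM on the given prompt (used by invoke)."""

    @property
    @abstractmethod
    def _identifying_params(self) -> Dict[str, Any]:
        """Dict identifying the LLM, used for caching and tracing."""

    def invoke(self, prompt: str) -> str:
        # invoke delegates to the subclass's _call
        return self._call(prompt)

class EchoLLM(MiniLLM):
    """Toy model that upper-cases the prompt."""

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        return prompt.upper()

    @property
    def _identifying_params(self) -> Dict[str, Any]:
        return {"model_name": "echo-upper"}

llm = EchoLLM()
print(llm.invoke("hello"))      # HELLO
print(llm._identifying_params)  # {'model_name': 'echo-upper'}
```

Because invoke is implemented on the base class in terms of _call, a subclass only needs the two abstract members to become usable.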
Get the input type for this Runnable.
If verbose is None, set it from the global verbosity setting.
Pass a sequence of prompts to the model and return model generations.
Asynchronously pass a sequence of prompts and return model generations.
Not implemented on this class.
Return the ordered IDs of the tokens in a text.
Get the number of tokens present in the text.
Get the number of tokens in the messages.
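The token-counting methods above can be illustrated with a deliberately naive whitespace "tokenizer"; a real encoder (e.g. one supplied via the optional token counter) would replace `get_token_ids` here:

```python
def get_token_ids(text: str) -> list:
    # Naive whitespace "tokenizer" standing in for a real encoder;
    # returns one id per whitespace-separated token.
    return [hash(tok) for tok in text.split()]

def get_num_tokens(text: str) -> int:
    # The token count is just the length of the token-id list.
    return len(get_token_ids(text))

print(get_num_tokens("how many tokens are here"))  # 5
```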
The name of the Runnable. Used for debugging and tracing.
Input type.
Output type.
The type of input this Runnable accepts specified as a Pydantic model.
The type of output this Runnable produces specified as a Pydantic model.
List configurable fields for this Runnable.
Get the name of the Runnable.
Get a Pydantic model that can be used to validate input to the Runnable.
Get a JSON schema that represents the input to the Runnable.
Get a Pydantic model that can be used to validate output to the Runnable.
Get a JSON schema that represents the output of the Runnable.
The type of config this Runnable accepts specified as a Pydantic model.
Get a JSON schema that represents the config of the Runnable.
Return a graph representation of this Runnable.
Return a list of prompts used by this Runnable.
Pipe Runnable objects.
Pick keys from the output dict of this Runnable.
Assign new fields to the dict output of this Runnable.
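The pick/assign operations on dict outputs can be sketched as plain functions; these helpers are illustrative stand-ins, not the library's API:

```python
def pick(output: dict, keys: list) -> dict:
    # Keep only the requested keys from a dict output.
    return {k: output[k] for k in keys}

def assign(output: dict, **new_fields) -> dict:
    # Merge computed fields into the dict output; each new field is
    # a function of the existing output.
    return {**output, **{k: fn(output) for k, fn in new_fields.items()}}

out = {"question": "2+2", "answer": 4}
print(pick(out, ["answer"]))                           # {'answer': 4}
print(assign(out, doubled=lambda o: o["answer"] * 2))  # adds 'doubled': 8
```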
Transform a single input into an output.
Asynchronously transform a single input into an output.
Default implementation runs invoke in parallel using a thread pool executor.
Run invoke in parallel on a list of inputs, yielding results as they complete.
Default implementation runs ainvoke in parallel using asyncio.gather.
Run ainvoke in parallel on a list of inputs, yielding results as they complete.
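The default parallel-batch behavior can be sketched with a thread pool; `invoke` here is a trivial stand-in for a Runnable's invoke:

```python
from concurrent.futures import ThreadPoolExecutor

def invoke(x: int) -> int:
    # Stand-in for a Runnable's invoke.
    return x * 2

def batch(inputs, max_concurrency: int = 4):
    # Default-style batch: run invoke in parallel on a thread pool,
    # preserving input order in the results.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(invoke, inputs))

print(batch([1, 2, 3]))  # [2, 4, 6]
```

`pool.map` returns results in input order, which matches batch semantics; the "as completed" variants instead yield each result as soon as it finishes.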
Default implementation of stream, which calls invoke.
Default implementation of astream, which calls ainvoke.
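The fallback streaming behavior reduces to a one-chunk generator; this is a sketch of the idea, not the library's implementation:

```python
def invoke(x: str) -> str:
    # Stand-in for a Runnable's invoke.
    return x + "!"

def stream(x: str):
    # Default-style stream: with no native streaming available,
    # yield the full invoke result as a single chunk.
    yield invoke(x)

print(list(stream("hi")))  # ['hi!']
```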
Stream all output from a Runnable, as reported to the callback system.
Generate a stream of events.
Transform inputs to outputs.
Asynchronously transform inputs to outputs.
Bind arguments to a Runnable, returning a new Runnable.
Bind config to a Runnable, returning a new Runnable.
Bind lifecycle listeners to a Runnable, returning a new Runnable.
Bind async lifecycle listeners to a Runnable.
Bind input and output types to a Runnable, returning a new Runnable.
Create a new Runnable that retries the original Runnable on exceptions.
Return a new Runnable that maps a list of inputs to a list of outputs.
Add fallbacks to a Runnable, returning a new Runnable.
Create a BaseTool from a Runnable.
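The retry behavior described above follows a standard pattern, sketched here generically; `with_retry` and `flaky` are hypothetical names, not the library's API:

```python
import time

def with_retry(fn, attempts: int = 3, wait: float = 0.0):
    # Generic retry pattern: re-run fn on exception, re-raising
    # once the attempt budget is exhausted.
    def wrapped(x):
        for i in range(attempts):
            try:
                return fn(x)
            except Exception:
                if i == attempts - 1:
                    raise
                time.sleep(wait)
    return wrapped

calls = {"n": 0}

def flaky(x):
    # Fails on the first call, succeeds afterwards.
    calls["n"] += 1
    if calls["n"] < 2:
        raise RuntimeError("transient failure")
    return x * 10

retrying = with_retry(flaky)
print(retrying(4))  # 40
```

Fallbacks are the complementary pattern: instead of re-running the same callable, control passes to an alternative callable after the primary one fails.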