Inference-priority decorator for LangChain chat models.
Provides a decorator / context manager that sets priority for every
:class:`~langchain_core.language_models.BaseChatModel` call in scope.
The mechanism is universal: any ``BaseChatModel`` subclass whose Pydantic
``model_fields`` include ``priority`` will automatically receive the value as
a keyword argument — no per-model integration required.
Lower number = higher priority (priority=1 is most urgent).
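The opt-in check above can be sketched as follows; ``FakeChatModel``, ``PlainChatModel``, and ``supports_priority`` are illustrative stand-ins, not names from the library:

```python
# Sketch of the universal opt-in: a chat model participates simply by
# declaring a ``priority`` field in its Pydantic schema, which the class
# exposes through the ``model_fields`` mapping (Pydantic v2 convention).
class FakeChatModel:
    # Stand-in for a Pydantic model class that declares ``priority``.
    model_fields = {"model": None, "priority": None}

class PlainChatModel:
    # No ``priority`` field declared, so no injection happens.
    model_fields = {"model": None}

def supports_priority(model_cls) -> bool:
    """True if the model's schema declares a ``priority`` field."""
    return "priority" in getattr(model_cls, "model_fields", {})
```

Models that do not declare the field are simply skipped, so the scope is safe to wrap around mixed model types.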
Example — decorator (deprioritize background work)::

    from langchain_nvidia_ai_endpoints import ChatNVIDIADynamo, inference_priority

    llm = ChatNVIDIADynamo(model="my-model", base_url="http://localhost:8099/v1")

    @inference_priority(priority=10)
    def background_research(query: str) -> str:
        return llm.invoke(query).content

Example — context manager::

    with inference_priority(priority=10):
        result = llm.invoke("background task")
Return the active inference priority, or None if unset.
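A minimal sketch of how the scoped value could be tracked, assuming a module-level ``contextvars.ContextVar`` backs both the getter and the ``inference_priority`` scope (the real implementation may differ; ``_priority_var`` is a hypothetical name):

```python
import contextvars
from contextlib import contextmanager

# Assumed backing store: a ContextVar keeps the value per-context, so
# concurrent threads and asyncio tasks each see their own active priority.
_priority_var: contextvars.ContextVar = contextvars.ContextVar(
    "inference_priority", default=None
)

def get_inference_priority():
    """Return the active inference priority, or None if unset."""
    return _priority_var.get()

@contextmanager
def inference_priority(priority: int):
    """Usable as a ``with`` block and, for sync functions, as a decorator."""
    token = _priority_var.set(priority)
    try:
        yield
    finally:
        _priority_var.reset(token)  # restore the enclosing scope's value
```

Because ``@contextmanager`` objects inherit ``ContextDecorator``, the same factory serves both usage styles shown in the examples.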
Set inference priority for all LLM calls within scope.
Lower number = higher priority (priority=1 is most urgent).
Works as both a decorator and a context manager::

    # decorator — deprioritize background work
    @inference_priority(priority=10)
    def background_research(query):
        return llm.invoke(query)

    # context manager
    with inference_priority(priority=1):
        result = llm.invoke(query)

    # async decorator
    @inference_priority(priority=10)
    async def background_async(query):
        return await llm.ainvoke(query)
Precedence (wins first → last):

1. ``inference_priority`` context
2. ``ChatNVIDIADynamo(priority=1)`` constructor value

Nesting: inner scopes fully replace outer scopes.
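The precedence and nesting rules can be illustrated with a small sketch; ``_priority_var`` and ``effective_priority`` are hypothetical names assuming a ``ContextVar`` backs the scope:

```python
import contextvars

# Hypothetical backing variable for the scoped priority.
_priority_var = contextvars.ContextVar("inference_priority", default=None)

def effective_priority(constructor_priority):
    """Scoped context value wins; the constructor value is only a fallback."""
    scoped = _priority_var.get()
    return scoped if scoped is not None else constructor_priority

# No active scope: the constructor value (e.g. priority=1) applies.
assert effective_priority(1) == 1

# Inside a scope the context value overrides the constructor value,
# and an inner scope fully replaces the outer one.
outer = _priority_var.set(10)
assert effective_priority(1) == 10  # context wins over constructor
inner = _priority_var.set(1)
assert effective_priority(1) == 1   # inner scope wins
_priority_var.reset(inner)
assert effective_priority(1) == 10  # outer scope restored
_priority_var.reset(outer)
```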