Set the inference priority for all LLM calls within the scope.

Lower number = higher priority (``priority=1`` is most urgent).
Works as both a decorator and a context manager::

    # decorator — deprioritize background work
    @inference_priority(priority=10)
    def background_research(query):
        return llm.invoke(query)

    # context manager
    with inference_priority(priority=1):
        result = llm.invoke(query)

    # async decorator
    @inference_priority(priority=10)
    async def background_async(query):
        return await llm.ainvoke(query)
Precedence (wins first → last):

1. ``inference_priority`` context
2. ``ChatNVIDIADynamo(priority=1)``

Nesting: inner scopes fully replace outer scopes.
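The replace-not-merge nesting rule can be sketched with a ``ContextVar``-backed scope. This is an illustrative implementation only, not the library's actual one; ``current_priority`` is a hypothetical helper introduced here to show what an LLM call would observe:

```python
import contextvars
from contextlib import contextmanager

# Hypothetical sketch of the scoping semantics: a ContextVar holds the
# active priority; entering a nested scope replaces the outer value
# outright, and exiting restores it.
_priority = contextvars.ContextVar("inference_priority", default=None)

@contextmanager
def inference_priority(*, priority: int):
    token = _priority.set(priority)  # inner scope fully replaces outer
    try:
        yield
    finally:
        _priority.reset(token)       # outer value restored on exit

def current_priority():
    """Hypothetical helper: the priority seen by calls in this scope."""
    return _priority.get()

with inference_priority(priority=10):
    outer = current_priority()       # 10
    with inference_priority(priority=1):
        inner = current_priority()   # 1, the inner scope wins outright
    restored = current_priority()    # 10 again after the inner scope exits
```

A ``ContextVar`` (rather than a plain global) keeps the scope correct under threads and ``asyncio`` tasks, which matters for the async-decorator form above.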
inference_priority(
    self,
    *,
    priority: int,
)

| Name | Type |
|---|---|
| priority | int |