Inference-priority decorator for LangChain chat models.
Provides a decorator / context manager that sets priority for every
:class:`~langchain_core.language_models.BaseChatModel` call in scope.
The mechanism is universal: any ``BaseChatModel`` subclass whose Pydantic
``model_fields`` include ``priority`` will automatically receive the value
as a keyword argument — no per-model integration required.
Lower number = higher priority (priority=1 is most urgent).
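The "universal" detection above can be illustrated with a minimal sketch. This is not the library's actual code; ``PriorityAwareModel`` and ``PlainModel`` are hypothetical stand-ins for ``BaseChatModel`` subclasses, and ``supports_priority`` is an illustrative helper showing the Pydantic v2 field check:

```python
from typing import Optional

from pydantic import BaseModel


# Hypothetical stand-ins for BaseChatModel subclasses.
class PriorityAwareModel(BaseModel):
    # Declaring a `priority` field is what opts a model in.
    priority: Optional[int] = None


class PlainModel(BaseModel):
    temperature: float = 0.7


def supports_priority(model: BaseModel) -> bool:
    # Pydantic v2 exposes declared fields on the class via `model_fields`;
    # only models that declare `priority` receive the scoped value.
    return "priority" in type(model).model_fields
```

A model without the field is simply left untouched, which is why no per-model integration is needed.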
Example — decorator (deprioritize background work)::

    from langchain_nvidia_ai_endpoints import ChatNVIDIADynamo, inference_priority

    llm = ChatNVIDIADynamo(model="my-model", base_url="http://localhost:8099/v1")

    @inference_priority(priority=10)
    def background_research(query: str) -> str:
        return llm.invoke(query).content
Example — context manager::

    with inference_priority(priority=10):
        result = llm.invoke("background task")