Class · Since v0.3

WeightOnlyQuantPipeline

WeightOnlyQuantPipeline()

Bases

LLM

Inherited from BaseLLM (langchain_core)

Attributes: OutputType

Methods: invoke, ainvoke, batch

Weight-only quantized model.

To use, you should have the intel-extension-for-transformers package and the transformers package installed.

intel-extension-for-transformers: https://github.com/intel/intel-extension-for-transformers
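For readers unfamiliar with the term, the following is a minimal pure-Python sketch of what "weight-only" quantization means in general: weights are stored as low-bit integers with a scale factor, while inputs stay in floating point. It is an illustration only and is not how intel-extension-for-transformers implements it; the helper names `quantize_row`, `dequantize_row`, and `linear` are hypothetical, and symmetric per-row int8 scaling is an assumed scheme.

```python
from typing import List, Tuple


def quantize_row(row: List[float], bits: int = 8) -> Tuple[List[int], float]:
    """Symmetric per-row quantization: signed integers plus one float scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]


def linear(weights_q: List[List[int]], scales: List[float], x: List[float]) -> List[float]:
    """Compute y = W x with integer-stored weights; the input x stays in float."""
    out = []
    for q_row, scale in zip(weights_q, scales):
        acc = sum(qv * xv for qv, xv in zip(q_row, x))
        out.append(acc * scale)  # apply the row scale once per output
    return out


# Hypothetical toy weights and input, chosen only for illustration.
weights = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
quantized = [quantize_row(r) for r in weights]
weights_q = [q for q, _ in quantized]
scales = [s for _, s in quantized]

x = [1.0, 2.0, 3.0]
y_quant = linear(weights_q, scales, x)
y_ref = [sum(w * xv for w, xv in zip(row, x)) for row in weights]
```

Here `y_quant` approximates `y_ref` with a small rounding error. Real libraries typically use lower bit widths, group-wise scales, and optimized kernels, none of which are modeled in this sketch.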

Example using from_model_id:

.. code-block:: python

    from langchain_community.llms import WeightOnlyQuantPipeline
    from intel_extension_for_transformers.transformers import (
        WeightOnlyQuantConfig
    )

    # Instantiate the config (default settings assumed here).
    config = WeightOnlyQuantConfig()
    hf = WeightOnlyQuantPipeline.from_model_id(
        model_id="google/flan-t5-large",
        task="text2text-generation",
        pipeline_kwargs={"max_new_tokens": 10},
        quantization_config=config,
    )

Example passing pipeline in directly:

.. code-block:: python

    from langchain_community.llms import WeightOnlyQuantPipeline
    from intel_extension_for_transformers.transformers import (
        AutoModelForSeq2SeqLM,
        WeightOnlyQuantConfig,
    )
    from transformers import AutoTokenizer, pipeline

    model_id = "google/flan-t5-large"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Instantiate the config (default settings assumed here).
    config = WeightOnlyQuantConfig()
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_id,
        quantization_config=config,
    )
    # flan-t5 is a seq2seq model, so use the text2text-generation task.
    pipe = pipeline(
        "text2text-generation",
        model=model,
        tokenizer=tokenizer,
        max_new_tokens=10,
    )
    hf = WeightOnlyQuantPipeline(pipeline=pipe)
