# GLiNERLinkExtractor

> **Class** in `langchain_community`

📖 [View in docs](https://reference.langchain.com/python/langchain-community/graph_vectorstores/extractors/gliner_link_extractor/GLiNERLinkExtractor)

Link documents with common named entities using `GLiNER`_.

`GLiNER`_ is a Named Entity Recognition (NER) model capable of identifying any
entity type using a bidirectional transformer encoder (BERT-like).

The ``GLiNERLinkExtractor`` uses GLiNER to create links between documents that
have named entities in common.

Example::

    from langchain_community.graph_vectorstores.extractors import GLiNERLinkExtractor

    extractor = GLiNERLinkExtractor(
        labels=["Person", "Award", "Date", "Competitions", "Teams"]
    )
    results = extractor.extract_one("some long text...")

.. _GLiNER: https://github.com/urchade/GLiNER

.. seealso::

        - :mod:`How to use a graph vector store <langchain_community.graph_vectorstores>`
        - :class:`How to create links between documents <langchain_community.graph_vectorstores.links.Link>`

How to link Documents on common named entities
==============================================

Preliminaries
-------------

Install the ``gliner`` package alongside ``langchain_community``:

.. code-block:: bash

    pip install -q langchain_community gliner

Usage
-----

The examples below load the ``state_of_the_union.txt`` file, split it into
chunks, then extract named entity links from each chunk and add them to that
chunk's metadata.

Using extract_one()
^^^^^^^^^^^^^^^^^^^

Call :meth:`extract_one` on a document to get its links, then attach them to the
document metadata with
:func:`~langchain_community.graph_vectorstores.links.add_links`::

    from langchain_community.document_loaders import TextLoader
    from langchain_community.graph_vectorstores.extractors import GLiNERLinkExtractor
    from langchain_community.graph_vectorstores.links import add_links
    from langchain_text_splitters import CharacterTextSplitter

    # Load the raw text, then split it into chunks
    # (target size: 1000 characters, no overlap).
    loader = TextLoader("state_of_the_union.txt")
    raw_documents = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    documents = text_splitter.split_documents(raw_documents)

    # Extract "Person" and "Topic" entities from each chunk and store them
    # as links in the chunk's metadata.
    ner_extractor = GLiNERLinkExtractor(["Person", "Topic"])
    for document in documents:
        links = ner_extractor.extract_one(document)
        add_links(document, links)

    print(documents[0].metadata)

.. code-block:: output

    {'source': 'state_of_the_union.txt', 'links': [Link(kind='entity:Person', direction='bidir', tag='President Zelenskyy'), Link(kind='entity:Person', direction='bidir', tag='Vladimir Putin')]}

Using LinkExtractorTransformer
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using the :class:`~langchain_community.graph_vectorstores.extractors.link_extractor_transformer.LinkExtractorTransformer`,
we can simplify the link extraction::

    from langchain_community.document_loaders import TextLoader
    from langchain_community.graph_vectorstores.extractors import (
        GLiNERLinkExtractor,
        LinkExtractorTransformer,
    )
    from langchain_text_splitters import CharacterTextSplitter

    loader = TextLoader("state_of_the_union.txt")
    raw_documents = loader.load()

    text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
    documents = text_splitter.split_documents(raw_documents)

    ner_extractor = GLiNERLinkExtractor(["Person", "Topic"])
    transformer = LinkExtractorTransformer([ner_extractor])
    documents = transformer.transform_documents(documents)

    print(documents[0].metadata)

.. code-block:: output

    {'source': 'state_of_the_union.txt', 'links': [Link(kind='entity:Person', direction='bidir', tag='President Zelenskyy'), Link(kind='entity:Person', direction='bidir', tag='Vladimir Putin')]}

The documents with named entity links can then be added to a :class:`~langchain_community.graph_vectorstores.base.GraphVectorStore`::

    from langchain_community.graph_vectorstores import CassandraGraphVectorStore

    store = CassandraGraphVectorStore.from_documents(documents=documents, embedding=...)
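
To build intuition for what the links do once the documents are stored, here is a minimal, library-free sketch of the idea that documents sharing a link with the same kind and tag are considered connected. It is not how any graph vector store is implemented. The `doc_links` data, the chunk ids, the `"inflation"` tag, and the `connected_pairs` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical stand-in data: document id -> set of (kind, tag) pairs,
# shaped like the link metadata printed in the examples above.
doc_links = {
    "chunk-0": {
        ("entity:Person", "President Zelenskyy"),
        ("entity:Person", "Vladimir Putin"),
    },
    "chunk-1": {("entity:Person", "Vladimir Putin")},
    "chunk-2": {("entity:Topic", "inflation")},
}


def connected_pairs(doc_links):
    """Return sorted pairs of document ids that share at least one (kind, tag)."""
    # Invert the mapping: (kind, tag) -> ids of the documents carrying it.
    by_link = defaultdict(set)
    for doc_id, links in doc_links.items():
        for link in links:
            by_link[link].add(doc_id)

    # Every pair of documents under the same (kind, tag) is connected.
    pairs = set()
    for doc_ids in by_link.values():
        pairs.update(combinations(sorted(doc_ids), 2))
    return sorted(pairs)


print(connected_pairs(doc_links))
# [('chunk-0', 'chunk-1')]
```

Only `chunk-0` and `chunk-1` are paired, because they are the only two documents that mention the same entity.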

## Signature

```python
GLiNERLinkExtractor(
    self,
    labels: List[str],
    *,
    kind: str = 'entity',
    model: str = 'urchade/gliner_mediumv2.1',
    extract_kwargs: Optional[Dict[str, Any]] = None,
)
```
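
The example output earlier on this page shows link kinds such as `entity:Person`, which suggests that the `kind` argument and each matched label are joined with a colon. The sketch below illustrates that naming scheme under stated assumptions: the `Link` dataclass is a simplified stand-in rather than the real `langchain_community` class, `entities_to_links` is a hypothetical helper, and the entity dictionaries are assumed to carry `label` and `text` keys in the style of GLiNER predictions.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Set


@dataclass(frozen=True)
class Link:
    """Simplified stand-in for the real Link class (illustration only)."""

    kind: str
    direction: str
    tag: str


def entities_to_links(
    entities: Iterable[Dict[str, Any]], kind: str = "entity"
) -> Set[Link]:
    """Hypothetical helper: turn entity dicts into bidirectional links.

    The link kind is assumed to be "<kind>:<label>" and the tag is the
    matched text. Using a set collapses repeated mentions of an entity.
    """
    return {
        Link(kind=f"{kind}:{entity['label']}", direction="bidir", tag=entity["text"])
        for entity in entities
    }


# Assumed entity format; the repeated mention produces a single link.
entities = [
    {"label": "Person", "text": "Vladimir Putin"},
    {"label": "Person", "text": "Vladimir Putin"},
    {"label": "Person", "text": "President Zelenskyy"},
]

links = entities_to_links(entities)
for link in sorted(links, key=lambda item: item.tag):
    print(link)
```

Passing a different `kind`, for example `entities_to_links(entities, kind="ner")`, would change only the prefix of each link kind in this sketch, which is one way to keep links from different extractors apart.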

## Parameters

| Name | Type | Required | Description |
|------|------|----------|-------------|
| `labels` | `List[str]` | Yes | List of kinds of entities to extract. |
| `kind` | `str` | No | Kind of links to produce with this extractor. (default: `'entity'`) |
| `model` | `str` | No | GLiNER model to use. (default: `'urchade/gliner_mediumv2.1'`) |
| `extract_kwargs` | `Optional[Dict[str, Any]]` | No | Keyword arguments to pass to GLiNER. (default: `None`) |

## Extends

- `LinkExtractor[GLiNERInput]`

## Constructors

```python
__init__(
    self,
    labels: List[str],
    *,
    kind: str = 'entity',
    model: str = 'urchade/gliner_mediumv2.1',
    extract_kwargs: Optional[Dict[str, Any]] = None,
)
```

| Name | Type |
|------|------|
| `labels` | `List[str]` |
| `kind` | `str` |
| `model` | `str` |
| `extract_kwargs` | `Optional[Dict[str, Any]]` |


## Methods

- [`extract_one()`](https://reference.langchain.com/python/langchain-community/graph_vectorstores/extractors/gliner_link_extractor/GLiNERLinkExtractor/extract_one)
- [`extract_many()`](https://reference.langchain.com/python/langchain-community/graph_vectorstores/extractors/gliner_link_extractor/GLiNERLinkExtractor/extract_many)

---

[View source on GitHub](https://github.com/langchain-ai/langchain-community/blob/a6a6079511ac8a5c1293337f88096b8641562e77/libs/community/langchain_community/graph_vectorstores/extractors/gliner_link_extractor.py#L15)