API Reference

Modules

MELD: A multilingual and multi-domain dataset for named entity recognition.

Supports fully reproducible downloading, processing, and format normalization of NER datasets.

local_dataset_names(data_directory)

Yields names of datasets located in the specified directory.

Parameters:

Name | Type | Description | Default
data_directory | AnyPath | Path to the directory containing dataset folders. | required

Yields: Name of each dataset.

Source code in meld/formats.py
def local_dataset_names(data_directory: AnyPath) -> Iterator[str]:
    """
    Yields names of datasets located in the specified directory.

    Args:
        data_directory: Path to the directory containing dataset
            folders.
    Yields:
        Name of each dataset.
    """

    for path in _local_processed_directory(data_directory).iterdir():
        if (path / METADATA_FILENAME).is_file():
            yield path.name
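
The filtering rule above (a folder counts as a dataset only if it contains a metadata file) can be tried out on a temporary directory. This is a minimal sketch, not the library's code: the values of `METADATA_FILENAME` and the `processed` subdirectory name are assumptions, since neither is shown in this reference, and `_local_processed_directory` is re-implemented here as a hypothetical stand-in.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List

# Assumed values for illustration only; the real ones live in meld/formats.py.
METADATA_FILENAME = "metadata.json"
PROCESSED_SUBDIRECTORY = "processed"


def _local_processed_directory(data_directory: Path) -> Path:
    # Hypothetical stand-in for the private helper used by the library.
    return Path(data_directory) / PROCESSED_SUBDIRECTORY


def local_dataset_names(data_directory: Path) -> Iterator[str]:
    # Same logic as the source above: keep folders that hold a metadata file.
    for path in _local_processed_directory(data_directory).iterdir():
        if (path / METADATA_FILENAME).is_file():
            yield path.name


def demo() -> List[str]:
    with tempfile.TemporaryDirectory() as tmp:
        processed = _local_processed_directory(Path(tmp))
        # One complete dataset folder and one folder without metadata.
        (processed / "dataset_a").mkdir(parents=True)
        (processed / "dataset_a" / METADATA_FILENAME).write_text("{}")
        (processed / "incomplete").mkdir()
        return sorted(local_dataset_names(Path(tmp)))


if __name__ == "__main__":
    print(demo())
```

Because the function is a generator, the directory is not read until the result is iterated, so a missing directory surfaces as an error on first iteration rather than at the call.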

local_datasets(data_directory)

Yields all locally downloaded datasets within a given directory.

Parameters:

Name | Type | Description | Default
data_directory | AnyPath | Path to the directory containing dataset metadata files. | required

Yields: A Dataset instance for each locally downloaded dataset.

Source code in meld/formats.py
def local_datasets(data_directory: AnyPath) -> Iterator[Dataset]:
    """
    Yields all locally downloaded datasets within a given directory.

    Args:
        data_directory: Path to the directory containing dataset
            metadata files.
    Yields:
        A Dataset instance for each locally downloaded dataset.
    """

    processed_directory = _local_processed_directory(data_directory)
    for metadata in _local_dataset_metadata(data_directory):
        yield Dataset(processed_directory / metadata.name, metadata)
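
The pairing of each metadata record with its folder under the processed directory can be sketched as follows. Everything except the body of `local_datasets` is an assumption made for illustration: the `Metadata` and `Dataset` classes, the JSON metadata format, the `metadata.json` filename, and the `processed` subdirectory are hypothetical stand-ins for definitions not shown in this reference.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator, List

METADATA_FILENAME = "metadata.json"  # assumed filename, for illustration only


@dataclass
class Metadata:
    # Hypothetical minimal metadata record; only `name` is used by the source.
    name: str


@dataclass
class Dataset:
    # Hypothetical stand-in: a dataset is a folder path plus its metadata.
    path: Path
    metadata: Metadata


def _local_processed_directory(data_directory: Path) -> Path:
    return Path(data_directory) / "processed"  # assumed layout


def _local_dataset_metadata(data_directory: Path) -> Iterator[Metadata]:
    # Assumed behavior: read one JSON metadata file per dataset folder.
    for path in sorted(_local_processed_directory(data_directory).iterdir()):
        metadata_file = path / METADATA_FILENAME
        if metadata_file.is_file():
            yield Metadata(**json.loads(metadata_file.read_text()))


def local_datasets(data_directory: Path) -> Iterator[Dataset]:
    # Same logic as the source above.
    processed_directory = _local_processed_directory(data_directory)
    for metadata in _local_dataset_metadata(data_directory):
        yield Dataset(processed_directory / metadata.name, metadata)


def demo() -> List[str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("dataset_a", "dataset_b"):
            folder = _local_processed_directory(root) / name
            folder.mkdir(parents=True)
            (folder / METADATA_FILENAME).write_text(json.dumps({"name": name}))
        return [f"{d.metadata.name} -> {d.path.name}" for d in local_datasets(root)]


if __name__ == "__main__":
    print(demo())
```

Note that the dataset path is derived from `metadata.name`, so in this sketch the folder name and the name stored in the metadata are expected to match.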

This documentation is organized into the following modules: