Download
Dataset download utilities.
download(url, target_directory=None, stream=True, skip_if_file_exists=False)
download(url: str, target_directory: Path | None = None, stream: Literal[True] = True, skip_if_file_exists: bool = False) -> AbstractContextManager[Iterable[bytes]]
Download a file from the given URL.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `url` | `str` | The URL of the file to download. | required |
| `target_directory` | `Path \| None` | Optional directory where the file should be saved. Ignored when downloading in streaming mode. If not provided and streaming is disabled, a … | `None` |
| `stream` | `bool` | Streams the file contents instead of saving them to the target directory. | `True` |
Returns:
| Type | Description |
|---|---|
| `Path \| AbstractContextManager[Iterable[bytes]]` | The path of the downloaded file if … specified; otherwise, a streaming response. |
Source code in meld/download.py
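The streaming return type, `AbstractContextManager[Iterable[bytes]]`, can be illustrated with a minimal standard-library sketch. This is not the code in meld/download.py; the name `stream_url` and the `chunk_size` parameter are hypothetical, and `urllib.request` is assumed here only because it needs no third-party dependency.

```python
from contextlib import contextmanager
from typing import Iterator
from urllib.request import urlopen


@contextmanager
def stream_url(url: str, chunk_size: int = 8192) -> Iterator[Iterator[bytes]]:
    """Hypothetical helper: yield the response body as an iterator of chunks."""
    response = urlopen(url)
    try:
        # iter(callable, sentinel) keeps calling read() until it returns b"".
        yield iter(lambda: response.read(chunk_size), b"")
    finally:
        # The connection is released when the `with` block exits,
        # even if the caller stops reading early.
        response.close()
```

A caller would consume the chunks inside the `with` block, for example `with stream_url(url) as chunks: data = b"".join(chunks)`. Returning a context manager rather than a bare iterator ties the lifetime of the connection to the block that reads from it.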
extract(path_or_stream, target_directory, members=None, archive_extension=None, member_globs=False)
Extract files from an archive.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path_or_stream` | `Iterable[bytes] \| Path` | The file path or bytestream of the archive. | required |
| `target_directory` | `Path` | The directory where the files should be extracted. | required |
| `members` | `Iterable[str \| Path] \| None` | Specific members to extract. If `None`, all members are extracted. | `None` |
| `archive_extension` | `str \| None` | The extension of the archive (optional if `path_or_stream` is a `Path`). | `None` |
| `member_globs` | `bool` | Whether to interpret the `members` parameter as globs. | `False` |
Returns:
| Type | Description |
|---|---|
| `list[Path]` | A list of the extracted file paths. |
Source code in meld/download.py
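The reason `archive_extension` is optional for a `Path` but needed for a bytestream is that a stream carries no file name to inspect. The sketch below shows one plausible way to choose between tar and zip handling. The function name `detect_archive_kind`, the suffix list, and the tolerance for a missing leading dot are all assumptions for illustration, not the library's actual dispatch logic.

```python
from __future__ import annotations

from pathlib import Path
from typing import Iterable

# Assumed set of tar suffixes; the real library may recognise others.
TAR_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".tar.bz2", ".tar.xz")


def detect_archive_kind(
    path_or_stream: Iterable[bytes] | Path,
    archive_extension: str | None = None,
) -> str:
    """Hypothetical helper: return "tar" or "zip" for the given archive."""
    if archive_extension is None:
        if not isinstance(path_or_stream, Path):
            # A bytestream has no name, so the caller must state the format.
            raise ValueError("archive_extension is required for a bytestream")
        candidate = path_or_stream.name
    else:
        # Accept both "tgz" and ".tgz" (an assumption of this sketch).
        candidate = "." + archive_extension.lstrip(".")
    candidate = candidate.lower()
    if candidate.endswith(".zip"):
        return "zip"
    if candidate.endswith(TAR_SUFFIXES):
        return "tar"
    raise ValueError(f"unsupported archive type: {candidate!r}")
```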
extract_tar(path_or_stream, target_directory, members=None, member_globs=False)
Extract a tar archive.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path_or_stream` | `Iterable[bytes] \| Path` | The file path or bytestream of the tar file. | required |
| `target_directory` | `Path` | The directory where the files should be extracted. | required |
| `members` | `Iterable[str \| Path] \| None` | Specific members to extract. If `None`, all members are extracted. | `None` |
| `member_globs` | `bool` | Whether to interpret the `members` parameter as globs. | `False` |
Returns:
| Type | Description |
|---|---|
| `list[Path]` | A list of the extracted file paths. |
Source code in meld/download.py
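A tar archive can be read sequentially, so a bytestream can be unpacked without buffering the whole file. The following sketch shows that idea with the standard `tarfile` module in stream mode, plus glob matching via `fnmatch`. The names `IterReader` and `extract_tar_stream` are hypothetical, and the glob semantics (here `*` also matches `/`) are an assumption that may differ from the library's.

```python
from __future__ import annotations

import fnmatch
import io
import tarfile
from pathlib import Path
from typing import Iterable


class IterReader(io.RawIOBase):
    """Hypothetical adapter exposing an iterable of byte chunks as a file."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._chunks = iter(chunks)
        self._buf = b""

    def readable(self) -> bool:
        return True

    def readinto(self, b) -> int:
        while not self._buf:  # skip empty chunks, stop at exhaustion
            try:
                self._buf = next(self._chunks)
            except StopIteration:
                return 0
        n = min(len(b), len(self._buf))
        b[:n] = self._buf[:n]
        self._buf = self._buf[n:]
        return n


def extract_tar_stream(chunks, target_directory, members=None, member_globs=False):
    """Sketch: extract regular files from a streamed tar, optionally filtered."""
    wanted = None if members is None else [str(m) for m in members]
    # Use the safer "data" extraction filter where this Python provides it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    extracted: list[Path] = []
    reader = io.BufferedReader(IterReader(chunks))
    # "r|*" reads strictly forward and detects compression transparently.
    with tarfile.open(fileobj=reader, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            if wanted is not None:
                if member_globs:
                    keep = any(fnmatch.fnmatchcase(member.name, p) for p in wanted)
                else:
                    keep = member.name in wanted
                if not keep:
                    continue
            # In stream mode a member must be extracted before advancing.
            tar.extract(member, target_directory, **kwargs)
            extracted.append(Path(target_directory) / member.name)
    return extracted
```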
extract_zip(path, target_directory, members=None, member_globs=False)
Extract a zip archive.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Iterable[bytes] \| Path` | The file path or bytestream of the zip file. | required |
| `target_directory` | `Path` | The directory where the files should be extracted. | required |
| `members` | `Iterable[str \| Path] \| None` | Specific members to extract. If `None`, all members are extracted. | `None` |
| `member_globs` | `bool` | Whether to interpret the `members` parameter as globs. | `False` |
Returns:
| Type | Description |
|---|---|
| `list[Path]` | A list of the extracted file paths. |
Source code in meld/download.py
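The difference between exact member names and `member_globs=True` can be sketched with the standard `zipfile` module. The name `extract_zip_sketch` is hypothetical and this is not the library's implementation. One assumption worth stating: because a zip's index sits at the end of the file, this sketch buffers a bytestream fully in memory before reading it.

```python
from __future__ import annotations

import fnmatch
import io
import zipfile
from pathlib import Path
from typing import Iterable


def extract_zip_sketch(
    source: Iterable[bytes] | Path,
    target_directory: Path,
    members: Iterable[str | Path] | None = None,
    member_globs: bool = False,
) -> list[Path]:
    """Sketch: extract all members, or only those named or matched by globs."""
    # zipfile needs random access, so a stream is joined into one buffer.
    archive = source if isinstance(source, Path) else io.BytesIO(b"".join(source))
    wanted = None if members is None else [Path(m).as_posix() for m in members]
    extracted: list[Path] = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to extract
            if wanted is not None:
                if member_globs:
                    keep = any(fnmatch.fnmatchcase(name, p) for p in wanted)
                else:
                    keep = name in wanted
                if not keep:
                    continue
            # ZipFile.extract returns the path of the file it created.
            extracted.append(Path(zf.extract(name, target_directory)))
    return extracted
```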
git_download(repo_url, revision, target_directory, files, keep_repo=False)
Clone a specific revision of a Git repository and extract specified files to a target directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `repo_url` | `str` | URL of the Git repository to clone. | required |
| `revision` | `str` | Revision of the repository to download. | required |
| `target_directory` | `Path` | Directory where the extracted files from the repository will be stored. | required |
| `files` | `Sequence[PathLike[str] \| str]` | List of file paths to extract, relative to the root of the Git repository. | required |
| `keep_repo` | `bool` | Keeps the cloned repository in the `target_directory` instead of a temporary directory, for future reuse. | `False` |
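The clone, check out, and copy sequence described above can be sketched with `subprocess` and the `git` command-line tool. This is an illustration under assumptions, not the code in meld/download.py: the name `git_fetch_files` is hypothetical, the `keep_repo` option is left out, and the returned list of copied paths is an invention of this sketch, since the documentation above does not state a return value.

```python
from __future__ import annotations

import shutil
import subprocess
import tempfile
from os import PathLike
from pathlib import Path
from typing import Sequence


def git_fetch_files(
    repo_url: str,
    revision: str,
    target_directory: Path,
    files: Sequence[PathLike[str] | str],
) -> list[Path]:
    """Sketch: clone into a temporary directory and copy out selected files."""
    target_directory.mkdir(parents=True, exist_ok=True)
    copied: list[Path] = []
    with tempfile.TemporaryDirectory() as tmp:
        # Full clone so that any commit, tag, or branch can be checked out.
        subprocess.run(["git", "clone", "--quiet", repo_url, tmp], check=True)
        subprocess.run(
            ["git", "-C", tmp, "checkout", "--quiet", revision], check=True
        )
        for rel in files:
            destination = target_directory / rel
            # Preserve the file's position relative to the repository root.
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(Path(tmp) / rel, destination)
            copied.append(destination)
    # The temporary clone is deleted here; only the copied files remain.
    return copied
```

Pinning `revision` to a commit hash rather than a branch name makes the download reproducible, because a branch can move after the dataset code was written.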