fairmd.lipids.databankio module

Input/Output auxiliary functions.

Input/Output module with some small useful functions. It includes:

  • Downloading files.

  • Resolving URLs.

  • Calculating file hashes for fingerprinting.

fairmd.lipids.databankio.calc_file_sha1_hash(fi: str, step: int = 67108864, *, one_block: bool = True) → str

Calculate the SHA1 hash of a file.

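By default, only the first step bytes of the file are hashed. If one_block is False, the entire file is hashed, read in chunks of step bytes so that large files are handled efficiently.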

Parameters:
  • fi (str) – The path to the file.

  • step (int) – The block size in bytes. Defaults to 67108864 (64 MiB). If one_block is True, this is the number of leading bytes that are hashed; if False, it is the size of each chunk read.

  • one_block (bool) – If True, reads the first step bytes of the file. If False, reads the entire file in chunks of step bytes. Defaults to True.

Returns:

The hexadecimal SHA1 hash of the file content.

Return type:

str
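
The two reading modes described above can be illustrated with a minimal sketch built on the standard hashlib module. This is not the library's implementation: the function name sha1_of_file_sketch is hypothetical, and the behaviour is only assumed from the parameter descriptions.

```python
import hashlib


def sha1_of_file_sketch(path: str, step: int = 64 * 1024 * 1024, *, one_block: bool = True) -> str:
    """Hypothetical stand-in that mirrors the documented parameters."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        if one_block:
            # Fast fingerprint: hash only the first `step` bytes.
            digest.update(fh.read(step))
        else:
            # Full hash: read `step` bytes at a time until the file is exhausted.
            for chunk in iter(lambda: fh.read(step), b""):
                digest.update(chunk)
    return digest.hexdigest()
```

With one_block left at True, two files that share their first step bytes produce the same digest, so under these assumptions the default is a fingerprint of the file's head rather than a checksum of its full content.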

fairmd.lipids.databankio.create_simulation_directories(software: str, sim_hashes: Mapping, out: str, *, dry_run_mode: bool = False) → str

Create a nested output directory structure to save simulation results.

The directory structure is generated based on the hashes of the simulation input files.

Parameters:
  • software (str) – The MD engine software name (taken from the simulation metadata).

  • sim_hashes (Mapping) – A dictionary mapping file types (e.g., “TPR”, “TRJ”) to their hash information. The structure is expected to be {‘TYPE’: [(‘filename’, ‘hash’)]}.

  • out (str) – The root output directory where the nested structure will be created.

  • dry_run_mode (bool) – If True, the directory path is resolved but not created. Defaults to False.

Returns:

The full path to the created output directory.

Return type:

str

Raises:
  • FileExistsError – If the target output directory already exists and is not empty.

  • NotImplementedError – If the simulation software is not supported.

  • RuntimeError – If the target output directory could not be created.
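
The following sketch shows one way a nested directory path could be derived from sim_hashes. The layout used here (two short prefixes of the TPR hash, then the full TPR and TRJ hashes) is an assumption made for illustration only, since the documentation above does not state the actual layout. The function name nested_dir_sketch is hypothetical, and the software-dependent branching is omitted.

```python
import os
from collections.abc import Mapping


def nested_dir_sketch(sim_hashes: Mapping, out: str, *, dry_run_mode: bool = False) -> str:
    """Hypothetical helper; the real directory layout may differ."""
    # Documented input shape: {'TYPE': [('filename', 'hash')]}
    tpr_hash = sim_hashes["TPR"][0][1]
    trj_hash = sim_hashes["TRJ"][0][1]
    # ASSUMPTION: <out>/<tpr[0:3]>/<tpr[3:6]>/<tpr>/<trj>
    path = os.path.join(out, tpr_hash[0:3], tpr_hash[3:6], tpr_hash, trj_hash)
    if os.path.isdir(path) and os.listdir(path):
        raise FileExistsError(f"Output directory is not empty: {path}")
    if not dry_run_mode:
        # In dry-run mode the path is only resolved, never created.
        os.makedirs(path, exist_ok=True)
    return path
```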

fairmd.lipids.databankio.download_resource_from_uri(uri: str, dest: str, *, override_if_exists: bool = False, max_bytes: bool = False) → int

Download file resource from a URI to a local destination.

Checks if the file already exists and has the same size before downloading. Can also perform a partial “dry-run” download.

Parameters:
  • uri (str) – The URL of the file resource.

  • dest (str) – The local destination path to save the file.

  • override_if_exists (bool) – If True, the file will be re-downloaded even if it already exists. Defaults to False.

  • max_bytes (bool) – If True, only a partial download is performed (up to MAX_DRYRUN_SIZE). Defaults to False.

Returns:

A status code indicating the result:

  • 0 – Download was successful.

  • 1 – Download was skipped because the file already exists.

  • 2 – File was re-downloaded due to a size mismatch.

Return type:

int

Raises:
  • ConnectionError – An error occurred after multiple download attempts.

  • OSError – The downloaded file size does not match the expected size.
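
The relation between the existence check, the size comparison and the three status codes can be sketched as a pure decision function. This is an illustration under stated assumptions, not the library's code: the helper name expected_status_sketch is hypothetical, remote_size stands in for the size reported by the server, and the sketch performs no download.

```python
import os


def expected_status_sketch(dest: str, remote_size: int, *, override_if_exists: bool = False) -> int:
    """Hypothetical helper returning the status code a download would end with."""
    if not override_if_exists and os.path.isfile(dest):
        if os.path.getsize(dest) == remote_size:
            return 1  # same size on disk: the download is skipped
        return 2  # size mismatch: the file is downloaded again
    # File is missing or an overwrite was requested: a normal download.
    return 0
```

A caller can use the code to tell a fresh download (0) apart from a cache hit (1) or a repaired file (2) without inspecting the file system again.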

fairmd.lipids.databankio.download_with_progress_with_retry(uri: str, dest: str, *, tqdm_title: str = 'Downloading', stop_after: int | None = None) → None

Download a file with a progress bar and retry logic.

Uses tqdm to display a progress bar during the download.

Parameters:
  • uri (str) – The URL of the file to download.

  • dest (str) – The local destination path to save the file.

  • tqdm_title (str) – The title used for the progress bar description.

  • stop_after (int | None) – The maximum number of bytes to download. Defaults to None (no limit).
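
The byte limit and the retry behaviour can be sketched independently of any network code. The two helpers below, copy_limited and with_retry, are hypothetical names that are not part of fairmd. They work on generic file-like objects and callables, leave out the tqdm progress bar, and assume a fixed number of attempts.

```python
from __future__ import annotations

import time
from typing import BinaryIO, Callable


def copy_limited(src: BinaryIO, dst: BinaryIO, *, stop_after: int | None = None, chunk: int = 8192) -> int:
    """Copy src to dst in chunks, stopping after `stop_after` bytes if given."""
    written = 0
    while stop_after is None or written < stop_after:
        # Never request more than the remaining byte budget.
        want = chunk if stop_after is None else min(chunk, stop_after - written)
        block = src.read(want)
        if not block:
            break  # end of stream
        dst.write(block)
        written += len(block)
    return written


def with_retry(action: Callable[[], int], *, attempts: int = 3, delay: float = 0.0) -> int:
    """Run `action`, retrying on OSError up to `attempts` times."""
    last_error: OSError | None = None
    for attempt in range(attempts):
        try:
            return action()
        except OSError as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)
    raise ConnectionError(f"Failed after {attempts} attempts") from last_error
```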

fairmd.lipids.databankio.resolve_file_url(doi: str, fi_name: str, *, validate_uri: bool = True) → str

Resolve the download URI of a file from a Zenodo record’s DOI and the filename.

Currently supports Zenodo DOIs.

Parameters:
  • doi (str) – The DOI identifier for the repository (e.g., “10.5281/zenodo.1234”).

  • fi_name (str) – The name of the file within the repository.

  • validate_uri (bool) – If True, checks if the resolved URL is a valid and reachable address. Defaults to True.

Returns:

The full, direct download URL for the file.

Return type:

str

Raises:
  • HTTPError or other connection errors – If the URL cannot be opened after multiple retries.

  • NotImplementedError – If the DOI provider is not supported.
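
As a rough illustration of the resolution step, the sketch below maps a Zenodo DOI and a filename to a URL string. The helper name zenodo_url_sketch is hypothetical, and the URL pattern is an assumption that should be checked against the Zenodo documentation. The reachability check controlled by validate_uri is left out.

```python
def zenodo_url_sketch(doi: str, fi_name: str) -> str:
    """Hypothetical helper; it does not contact Zenodo."""
    prefix = "10.5281/zenodo."
    if not doi.startswith(prefix):
        # Mirrors the documented error for unsupported DOI providers.
        raise NotImplementedError(f"Unsupported DOI provider: {doi}")
    record_id = doi[len(prefix):]
    # ASSUMPTION: download URLs follow this pattern.
    return f"https://zenodo.org/records/{record_id}/files/{fi_name}"
```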