MetaNetX SDK Documentation

Parse and process information from MetaNetX for MIRIAM compatibility using the Identifiers.org namespaces.

Install

It’s as simple as:

pip install metanetx-sdk

Usage

The authoritative source on how to use the various commands is always accessible via the commands’ help.

mnx-sdk -h

Normally you would start by loading the files from the MetaNetX FTP server

mnx-sdk pull ./data

and then transforming each data table.

mnx-sdk etl chem-prop ./data/chem_prop.tsv.gz ./data/transformed_chem_prop.tsv.gz

You can also directly use the functions from the metanetx_sdk.api module.

Contents

API Reference

metanetx_sdk

metanetx_sdk package
Subpackages
metanetx_sdk.cli package
Submodules
metanetx_sdk.cli.cli module

Provide a command line interface for working with MetaNetX data.

metanetx_sdk.cli.etl module

Provide MetaNetX table processing commands.

Module contents

Provide a command line interface.

metanetx_sdk.data package
Module contents

Provide data files.

metanetx_sdk.model package
Submodules
metanetx_sdk.model.ftp_configuration_model module

Provide an FTP configuration data model.

class metanetx_sdk.model.ftp_configuration_model.FTPConfigurationModel

Bases: pydantic.main.BaseModel

Define the FTP configuration data model.

property directory

Return the compound working directory for the FTP server.

classmethod load(version: Optional[str] = None) → metanetx_sdk.model.ftp_configuration_model.FTPConfigurationModel

Load the packaged FTP configuration.

class metanetx_sdk.model.ftp_configuration_model.FTPPath

Bases: pathlib.PurePosixPath

Define an FTP path data type.

classmethod __get_validators__()

Follow the pydantic guide for custom types.

See https://pydantic-docs.helpmanual.io/#custom-data-types

classmethod validate(value: str) → metanetx_sdk.model.ftp_configuration_model.FTPPath

Transform the given path string into an object.
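The pydantic custom-type pattern referenced above can be illustrated with a minimal sketch. It deliberately imports neither pydantic nor the SDK; the class name SketchFTPPath and the example path are hypothetical, and the real FTPPath may validate its input differently.

```python
from pathlib import PurePosixPath


class SketchFTPPath(PurePosixPath):
    """Hypothetical stand-in for an FTP path type (not the SDK class)."""

    @classmethod
    def __get_validators__(cls):
        # pydantic's v1-style custom types iterate over this generator and
        # call every yielded validator in order.
        yield cls.validate

    @classmethod
    def validate(cls, value: str) -> "SketchFTPPath":
        # Turn the plain string into a path object.
        return cls(value)


# Hypothetical server path, for illustration only.
path = SketchFTPPath.validate("/pub/example/data")
print(path.name)
```

Deriving from PurePosixPath keeps the path free of any local filesystem semantics, which suits paths that only exist on a remote server.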

class metanetx_sdk.model.ftp_configuration_model.Timezone

Bases: datetime.tzinfo

Define a timezone custom data type.

classmethod __get_validators__()

Follow the pydantic guide for custom types.

See https://pydantic-docs.helpmanual.io/#custom-data-types

classmethod validate(value: str) → metanetx_sdk.model.ftp_configuration_model.Timezone

Transform the given timezone string into an object.

metanetx_sdk.model.path_info_model module

Provide a data model for information about FTP file paths.

class metanetx_sdk.model.path_info_model.PathInfoModel

Bases: pydantic.main.BaseModel

Describe information found about FTP files.

localize(local_tz: pytz.timezone) → None

Convert the modify timestamp into a timezone aware one.

classmethod transform_modify(value: str) → datetime.datetime

Transform the modify string to a datetime object.
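The two steps described for this model, parsing the modify string and making it timezone aware, might look roughly like the sketch below. It assumes the modify string uses the YYYYMMDDHHMMSS layout common in FTP machine-readable listings, and it swaps in a fixed-offset timezone from the standard library in place of pytz; the function names and the UTC+1 offset are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_modify(value: str) -> datetime:
    """Parse a 'YYYYMMDDHHMMSS' timestamp into a naive datetime."""
    return datetime.strptime(value, "%Y%m%d%H%M%S")


def localize(naive: datetime, tz: timezone) -> datetime:
    """Attach a timezone without shifting the wall-clock time."""
    return naive.replace(tzinfo=tz)


# Assumed fixed offset of UTC+1, for illustration only.
server_tz = timezone(timedelta(hours=1))
modified = localize(parse_modify("20200131120000"), server_tz)
print(modified.isoformat())  # 2020-01-31T12:00:00+01:00
```

A fixed offset ignores daylight saving time; a named timezone such as the one the SDK takes via pytz handles those transitions.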

metanetx_sdk.model.table_configuration_model module

Provide table configuration data models.

class metanetx_sdk.model.table_configuration_model.SingleTableConfigurationModel

Bases: pydantic.main.BaseModel

Describe the configuration needed for a single table.

class metanetx_sdk.model.table_configuration_model.TableConfigurationModel

Bases: pydantic.main.BaseModel

Describe all table configuration models.

classmethod load(version: Optional[str] = None) → metanetx_sdk.model.table_configuration_model.TableConfigurationModel

Load the configuration from the packaged file.

Module contents

Provide data models.

metanetx_sdk.transform package
Submodules
metanetx_sdk.transform.chemical module

Provide chemical data transformation functions.

metanetx_sdk.transform.chemical.transform_chebi_prefix(table: pandas.core.frame.DataFrame)

Transform all ChEBI identifiers.

metanetx_sdk.transform.chemical.transform_chemical_cross_references(references: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX chemical cross-references.

metanetx_sdk.transform.chemical.transform_chemical_properties(chemicals: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX chemical properties.

metanetx_sdk.transform.chemical.transform_kegg_prefix(table: pandas.core.frame.DataFrame)

Transform all KEGG identifiers.

metanetx_sdk.transform.chemical.transform_metanetx_prefix(table: pandas.core.frame.DataFrame)

Transform all MetaNetX identifiers.

metanetx_sdk.transform.chemical.transform_swisslipid_prefix(table: pandas.core.frame.DataFrame)

Transform all SwissLipids identifiers.

metanetx_sdk.transform.compartment module

Provide compartment data transformation functions.

metanetx_sdk.transform.compartment.transform_cell_cycle_ontology_prefix(table: pandas.core.frame.DataFrame)

Transform all CCO terms.

metanetx_sdk.transform.compartment.transform_compartment_cross_references(references: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX compartment cross-references.

metanetx_sdk.transform.compartment.transform_compartment_properties(compartments: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX compartment properties.

metanetx_sdk.transform.compartment.transform_go_prefix(table: pandas.core.frame.DataFrame)

Transform all GO terms.

metanetx_sdk.transform.compartment.transform_metanetx_prefix(table: pandas.core.frame.DataFrame)

Transform all MetaNetX identifiers.

metanetx_sdk.transform.reaction module

Provide reaction data transformation functions.

metanetx_sdk.transform.reaction.transform_metanetx_prefix(table: pandas.core.frame.DataFrame)

Transform all MetaNetX identifiers.

metanetx_sdk.transform.reaction.transform_reaction_cross_references(references: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX reaction cross-references.

metanetx_sdk.transform.reaction.transform_reaction_properties(reactions: pandas.core.frame.DataFrame, prefix_mapping: Mapping) → pandas.core.frame.DataFrame

Transform the MetaNetX reaction properties.

Module contents

Provide data transformation functions.

Submodules
metanetx_sdk.api module

Expose the application programmer interface.

metanetx_sdk.api.etl_table(filename: pathlib.Path, output: pathlib.Path, configuration: metanetx_sdk.model.table_configuration_model.SingleTableConfigurationModel, mapping: Mapping, transform: Callable) → None

Extract, transform, and load a MetaNetX table.

Parameters
  • filename (pathlib.Path) – The table to extract and transform.

  • output (pathlib.Path) – Where to store the processed output.

  • configuration (metanetx_sdk.model.SingleTableConfigurationModel) – The configuration to use for extracting the specific file.

  • mapping (typing.Mapping) – A mapping between MetaNetX resources and Identifiers.org registries.

  • transform (typing.Callable) – The table-specific transformation function to apply.
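To make the roles of the mapping and transform parameters more concrete, here is a minimal sketch of the same idea on plain dictionaries instead of pandas tables. Everything in it is hypothetical: the function names, the xref column, and the mapping entry are illustrative assumptions and are not taken from the SDK or its packaged mappings.

```python
from typing import Callable, Dict, List, Mapping

Row = Dict[str, str]


def map_prefix(row: Row, mapping: Mapping[str, str]) -> Row:
    """Split a 'prefix:accession' reference and translate the prefix."""
    prefix, _, accession = row["xref"].partition(":")
    # Prefixes without a mapping entry are passed through unchanged.
    return {**row, "prefix": mapping.get(prefix, prefix), "accession": accession}


def etl_rows(
    rows: List[Row],
    mapping: Mapping[str, str],
    transform: Callable[[Row, Mapping[str, str]], Row],
) -> List[Row]:
    """Apply the table-specific transform to every extracted row."""
    return [transform(row, mapping) for row in rows]


# Hypothetical mapping entry and rows, for illustration only.
mapping = {"kegg": "kegg.compound"}
rows = [{"xref": "kegg:C00001"}, {"xref": "other:42"}]
result = etl_rows(rows, mapping, map_prefix)
print(result[0]["prefix"], result[0]["accession"])
```

Passing the transform as a callable keeps the extract and load steps generic, so each table only needs its own transformation function.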

metanetx_sdk.api.pull(directory: pathlib.Path, files: Optional[List[pathlib.Path]] = None, configuration: Optional[metanetx_sdk.model.ftp_configuration_model.FTPConfigurationModel] = None, last_checked: Optional[datetime.datetime] = None, compress: bool = True) → datetime.datetime

Pull in changes to one or more files from the MetaNetX FTP server.

Parameters
  • directory (pathlib.Path) – The working directory where files are updated.

  • files (list of pathlib.Path, optional) – A list of one or more filenames as they are found on the FTP server (basename only). By default all known files are checked.

  • configuration (metanetx_sdk.model.FTPConfigurationModel, optional) – Configuration values encoded in an object. A default configuration is provided.

  • last_checked (datetime, optional) – The time when the files were last checked for updates. By default it is assumed that the files have never been checked before.

  • compress (bool, optional) – Whether or not to compress the downloaded files with gzip (default True).

Returns

The current time (timezone of the FTP server) when files were checked for updates.

Return type

datetime

metanetx_sdk.extract module

Provide extraction functions.

metanetx_sdk.extract.extract_chemical_prefix_mapping()

Return the packaged chemical prefix mapping.

metanetx_sdk.extract.extract_compartment_prefix_mapping()

Return the packaged compartment prefix mapping.

metanetx_sdk.extract.extract_reaction_prefix_mapping()

Return the packaged reaction prefix mapping.

metanetx_sdk.extract.extract_table(filename: pathlib.Path, columns: List[str], skip: int) → pandas.core.frame.DataFrame

Extract tabular MetaNetX data.

The tables dumped by MetaNetX have their column names in comments and are not always appropriate for the given table.

Parameters
  • filename (pathlib.Path) – The filesystem location of the table.

  • columns (list of str) – The column headers to use for this table.

  • skip (int) – The number of initial lines in the file to skip.

Returns

Return type

pandas.DataFrame
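The idea of skipping the commented header lines and supplying your own column names can be sketched with the standard library alone. This is not the SDK implementation, which returns a pandas DataFrame; the function name read_table, the sample content, and the column names are hypothetical.

```python
import csv
import gzip
import io
from typing import Dict, List


def read_table(raw: bytes, columns: List[str], skip: int) -> List[Dict[str, str]]:
    """Read a gzip-compressed TSV, skip leading lines, and name the columns."""
    with gzip.open(io.BytesIO(raw), mode="rt", encoding="utf-8", newline="") as handle:
        for _ in range(skip):
            # Discard comment or header lines; tolerate very short files.
            next(handle, None)
        reader = csv.DictReader(handle, fieldnames=columns, delimiter="\t")
        return [dict(row) for row in reader]


# Hypothetical file content: two comment lines followed by one data row.
content = "#comment\n#ID\tname\nID_1\tfirst\n".encode("utf-8")
table = read_table(gzip.compress(content), ["mnx_id", "name"], skip=2)
print(table)  # [{'mnx_id': 'ID_1', 'name': 'first'}]
```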

metanetx_sdk.ftp module

Provide functions to interact with the MetaNetX FTP server.

async metanetx_sdk.ftp.update_file(host: str, ftp_directory: pathlib.PurePosixPath, path: pathlib.Path, filename: pathlib.Path, last_checked: datetime.datetime, local_timezone: pytz.timezone, compress: bool = True, timeout: Union[float, int, None] = 5) → None

Retrieve a file from an FTP server if it is newer than a local version.

Parameters
  • host (str) – The FTP host, for example, ftp.vital-it.ch.

  • ftp_directory (pathlib.PurePosixPath) – The working directory on the host.

  • path (pathlib.Path) – Working directory where files are searched and stored.

  • filename (pathlib.Path) – The file to retrieve relative to the working directory on the server.

  • last_checked (datetime) – The date and time when this script was last run.

  • local_timezone (pytz.timezone) – The timezone that the FTP server is in, for example, Europe/Zurich.

  • compress (bool, optional) – Whether or not to gzip the files.

  • timeout (float, int, or None, optional) – The timeout in seconds for FTP operations (default 5 s). Can be disabled by setting None.
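The "only if newer" decision can be sketched as a comparison of timezone-aware timestamps. This is an assumption about the general approach, not the SDK's actual logic, which may also consult the local file; the function name needs_update and the example times are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_update(remote_modified: datetime, last_checked: Optional[datetime]) -> bool:
    """Return True when the remote file changed after the last check."""
    if last_checked is None:
        # Never checked before: always download.
        return True
    # Both values must be timezone aware for this comparison to be valid.
    return remote_modified > last_checked


# The server reports 12:00 at UTC+1, which is 11:00 UTC.
remote = datetime(2020, 1, 31, 12, 0, tzinfo=timezone(timedelta(hours=1)))
checked = datetime(2020, 1, 31, 11, 30, tzinfo=timezone.utc)
print(needs_update(remote, checked))  # False
```

Comparing the raw clock readings (12:00 against 11:30) would wrongly trigger a download, which is why the server's timezone is part of the function signature.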

async metanetx_sdk.ftp.update_tables(host: str, ftp_directory: pathlib.PurePosixPath, output: pathlib.Path, files: List[pathlib.Path], last_checked: datetime.datetime, local_tz: pytz.timezone, compress: bool) → None

Load all given files if newer versions exist.

Parameters
  • host (str) – The FTP host, for example, ftp.vital-it.ch.

  • ftp_directory (pathlib.PurePosixPath) – The working directory on the host.

  • output (pathlib.Path) – The output directory for the files. If a filename of any of the files exists in that directory, it is only overwritten if the one on the host is more recent.

  • files (list of pathlib.Path) – Pure filenames of files of interest to be loaded from the server.

  • last_checked (datetime.datetime) – When the local files were last checked.

  • local_tz (pytz.timezone) – A timezone that the FTP server is in, for example, Europe/Zurich.

  • compress (bool) – Whether or not to gzip compress downloaded files.

metanetx_sdk.helpers module

Define general helper functions.

metanetx_sdk.helpers.show_versions()

Print dependency information.

Module contents

Create top level imports.
