OpenOmics is currently under active development and we may break API compatibility in the future.

This Python package provides a series of tools to integrate and explore genomics, transcriptomics, proteomics, and clinical data (aka multi-omics data). With interfaces to popular annotation databases and scalable data-frame manipulation tools, OpenOmics facilitates the common data wrangling tasks involved in preparing data for RNA-seq bioinformatics analysis.

Documentation (Latest | Stable) | OpenOmics at a glance

Features

OpenOmics assists in the integration of heterogeneous multi-omics bioinformatics data. The library provides a Python API as well as an interactive Dash web interface. It features support for:

  • Genomics, Transcriptomics, Proteomics, and Clinical data.
  • Harmonization with 20+ popular annotation, interaction, and disease-association databases.

OpenOmics also has an efficient data pipeline that bridges the popular Pandas data manipulation library and Dask distributed processing to address the following use cases:

  • Providing a standard pipeline for dataset indexing, table joining and querying, which are transparent and customizable for end-users.
  • Providing efficient disk storage for large multi-omics datasets with Parquet data structures.
  • Integrating various data types including interactions and sequence data, then exporting to NetworkX graphs or data generators for down-stream machine learning.
  • Remaining accessible to both developers and scientists through a Python API that works seamlessly with an external Galaxy tool interface or the built-in Dash web interface (WIP).
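
To make the table-joining use case above concrete, here is a minimal plain-Python sketch of annotating an expression table by gene ID. It is not the OpenOmics API: the `annotate` helper, the gene IDs, the field names, and the values are all hypothetical and exist only for illustration. In Pandas, the same idea corresponds to a left join on the index.

```python
# Plain-Python illustration only; this is NOT the OpenOmics API.
# Every name and value below is hypothetical.

def annotate(expression, annotations, fields):
    """Left-join annotation fields onto an expression table keyed by gene ID.

    Genes with no annotation record keep their expression values and get
    None for every requested field.
    """
    joined = {}
    for gene_id, values in expression.items():
        record = annotations.get(gene_id, {})
        joined[gene_id] = {
            "expression": values,
            **{field: record.get(field) for field in fields},
        }
    return joined


# Toy expression table: gene ID -> expression values across two samples.
expression = {
    "GENE_A": [5.1, 4.8],
    "GENE_B": [0.0, 2.3],
    "GENE_C": [7.7, 7.9],
}

# Toy annotation table, standing in for records from an external database.
annotations = {
    "GENE_A": {"gene_biotype": "protein_coding", "chromosome": "1"},
    "GENE_B": {"gene_biotype": "lncRNA", "chromosome": "X"},
}

table = annotate(expression, annotations, fields=["gene_biotype", "chromosome"])

for gene_id, row in table.items():
    print(gene_id, row)
```

A left join is used here so that genes missing from the annotation source are kept rather than silently dropped, which is usually what one wants when the expression table is the primary dataset.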

Installation

PyPI

pip install openomics

Conda

conda install openomics -c jonnytran # Work in progress

From source

git clone https://github.com/JonnyTran/OpenOmics/
cd OpenOmics
pip install -e .
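
After any of the installation routes above, one way to confirm that the package is visible to the current Python environment is to query its installed metadata. This is a small sketch using only the standard library; the `installed_version` helper is a hypothetical name, and the distribution name `openomics` is assumed to match the name used in the `pip install` command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumes the distribution is named "openomics", as in `pip install openomics`.
found = installed_version("openomics")
if found is None:
    print("openomics is not installed in this environment")
else:
    print(f"openomics {found} is installed")
```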

Citations

The paper describing this package was reviewed by and published in the Journal of Open Source Software (JOSS) at https://joss.theoj.org/papers/10.21105/joss.03249#, and can be cited with:

# BibTeX
@article{Tran2021,
  doi = {10.21105/joss.03249},
  url = {https://doi.org/10.21105/joss.03249},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {61},
  pages = {3249},
  author = {Nhat C. Tran and Jean X. Gao},
  title = {OpenOmics: A bioinformatics API to integrate multi-omics datasets and interface with public databases.},
  journal = {Journal of Open Source Software}
}

Credits

Thank you to the pyOpenSci reviewers for their extremely helpful feedback and guidance. This package was created with the pyOpenSci/cookiecutter-pyopensci project template, which is based on audreyr/cookiecutter-pypackage.