Informatics on Mac (macでインフォマティクス)

A collection of notes on informatics for HTS (NGS).

keggcharter

 

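From the GitHub repository:

KEGGCharter is a user-friendly implementation of the KEGG API and Pathway functionalities. Its features are:

  • Conversion of KEGG IDs to KEGG Orthologs (KO), and of KOs to EC numbers.
  • Representation of the metabolic potential of the main taxa on KEGG metabolic maps (up to the top 10 taxa, each distinguished by its own color).
  • Representation of differential expression between samples on KEGG metabolic maps (the sum of each function is shown).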
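The first feature (KEGG ID to KO, then KO to EC number) can be approximated by hand with the public KEGG REST `link` operation. The sketch below is not KEGGCharter's own code; the function names are hypothetical, and the sample response line is illustrative rather than captured output. Only the parser runs here; `genes_to_ec` needs network access and is left uncalled.

```python
from urllib.request import urlopen

KEGG_REST = "https://rest.kegg.jp"  # public KEGG REST endpoint


def parse_link_response(text):
    """Parse the tab-separated, two-column body of a KEGG `link` reply.

    Returns a dict mapping each source ID to the list of linked target IDs.
    """
    mapping = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split("\t")[:2]
        mapping.setdefault(source, []).append(target)
    return mapping


def strip_prefix(kegg_id):
    """Drop the database prefix, e.g. 'ko:K12524' -> 'K12524'."""
    return kegg_id.split(":", 1)[1]


def kegg_link(target_db, source_ids):
    """Ask KEGG which `target_db` entries are linked to `source_ids` (network call)."""
    if not source_ids:
        return {}
    url = f"{KEGG_REST}/link/{target_db}/{'+'.join(source_ids)}"
    with urlopen(url) as response:
        return parse_link_response(response.read().decode())


def genes_to_ec(gene_ids):
    """Hypothetical helper: KEGG gene IDs -> KOs -> EC numbers (network call)."""
    gene_to_ko = kegg_link("ko", gene_ids)
    kos = sorted({ko for targets in gene_to_ko.values() for ko in targets})
    ko_to_ec = kegg_link("enzyme", kos)
    return gene_to_ko, ko_to_ec


# Offline demonstration of the parser on an illustrative reply line.
sample_reply = "eco:b0002\tko:K12524\n"
parsed = parse_link_response(sample_reply)
print({gene: [strip_prefix(ko) for ko in kos] for gene, kos in parsed.items()})
```

Calling `genes_to_ec(["eco:b0002"])` on a machine with internet access would issue two requests, one per conversion step. For whole tables, KEGGCharter itself is the more practical route.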

 

Installation

I created a conda environment and tested it there.

GitHub: https://github.com/iquasere/KEGGCharter

mamba create -n keggcharter -y
conda activate keggcharter
mamba install -c conda-forge -c bioconda keggcharter -y

> keggcharter.py -h

usage: keggcharter.py [-h] [-o OUTPUT] [-rd RESOURCES_DIRECTORY] [-mm METABOLIC_MAPS] [-gcol GENOMIC_COLUMNS] [-tcol TRANSCRIPTOMIC_COLUMNS] [-tls TAXA_LIST] [-not NUMBER_OF_TAXA] [-keggc KEGG_COLUMN] [-koc KO_COLUMN] [-ecc EC_COLUMN] [-iq] [-it] [-tc TAXA_COLUMN]
                      [--resume] [-v] -f FILE [--show-available-maps]

KEGGCharter - A tool for representing genomic potential and transcriptomic expression into KEGG pathways

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output directory
  -rd RESOURCES_DIRECTORY, --resources-directory RESOURCES_DIRECTORY
                        Directory for storing KGML and CSV files.
  -mm METABOLIC_MAPS, --metabolic-maps METABOLIC_MAPS
                        IDs of metabolic maps to output
  -gcol GENOMIC_COLUMNS, --genomic-columns GENOMIC_COLUMNS
                        Names of columns with genomic identification
  -tcol TRANSCRIPTOMIC_COLUMNS, --transcriptomic-columns TRANSCRIPTOMIC_COLUMNS
                        Names of columns with transcriptomics quantification
  -tls TAXA_LIST, --taxa-list TAXA_LIST
                        List of taxa to represent in genomic potential charts (comma separated)
  -not NUMBER_OF_TAXA, --number-of-taxa NUMBER_OF_TAXA
                        Number of taxa to represent in genomic potential charts (comma separated)
  -keggc KEGG_COLUMN, --kegg-column KEGG_COLUMN
                        Column with KEGG IDs.
  -koc KO_COLUMN, --ko-column KO_COLUMN
                        Column with KOs.
  -ecc EC_COLUMN, --ec-column EC_COLUMN
                        Column with EC numbers.
  -iq, --input-quantification
                        If input table has no quantification, will create a mock quantification column
  -it, --input-taxonomy
                        If no taxonomy column exists and there is only one taxon in question.
  -tc TAXA_COLUMN, --taxa-column TAXA_COLUMN
                        Column with the taxa designations to represent with KEGGCharter
  --resume              If data inputed has already been analyzed by KEGGCharter.
  -v, --version         show program's version number and exit

required named arguments:
  -f FILE, --file FILE  TSV or EXCEL table with information to chart

Special functions:
  --show-available-maps
                        Outputs KEGG maps IDs and descriptions to the console (so you may pick the ones you want!)

Input file must be specified.
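The help text says `-f` takes a TSV or Excel table, with `-keggc`, `-tc`, and `-gcol` naming the relevant columns. As a rough illustration of what a minimal TSV could look like, the sketch below writes a three-column table using only the standard library. The column names are borrowed from the test-run command in this post; the rows (gene IDs, genera, counts) are made-up placeholders, and whether KEGGCharter accepts a table this small was not tested.

```python
import csv
import io

# Column names borrowed from the test-run command in this post.
COLUMNS = ["Cross-reference (KEGG)", "Taxonomic lineage (GENUS)", "mg"]

# Placeholder rows: values are illustrative only, not real measurements.
rows = [
    {"Cross-reference (KEGG)": "eco:b0002", "Taxonomic lineage (GENUS)": "Escherichia", "mg": 12},
    {"Cross-reference (KEGG)": "eco:b0003", "Taxonomic lineage (GENUS)": "Escherichia", "mg": 7},
]


def table_to_tsv(rows, columns=COLUMNS):
    """Serialize a list of row dicts into tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


tsv_text = table_to_tsv(rows)
print(tsv_text)

# To save it for use with `-f`, write the text to a file, e.g.:
# with open("my_table.tsv", "w", encoding="utf-8") as handle:
#     handle.write(tsv_text)
```

With such a file, the column options from the help would point at the headers, for example `-keggc "Cross-reference (KEGG)" -tc "Taxonomic lineage (GENUS)" -gcol mg`.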


 

 

 

Test run

git clone https://github.com/iquasere/KEGGCharter.git

The example input is the table bundled in the cloned repository: KEGGCharter/MOSCA_Entry_Report.xlsx

 

keggcharter.py -f KEGGCharter/MOSCA_Entry_Report.xlsx -gcol mg -tcol mt_0.01a_normalized,mt_1a_normalized,mt_100a_normalized,mt_0.01b_normalized,mt_1b_normalized,mt_100b_normalized,mt_0.01c_normalized,mt_1c_normalized,mt_100c_normalized -keggc "Cross-reference (KEGG)" -o test_keggcharter -tc "Taxonomic lineage (GENUS)"

The first run of KEGGCharter takes a long time. This command renders all 252 of KEGGCharter's default maps.
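The `-tcol` argument in the test-run command is long and easy to mistype. One way to keep it manageable is to assemble the command from Python, as sketched below. The variable names are hypothetical, and reading the column names as three levels (0.01, 1, 100) times three suffixes (a, b, c) is my interpretation of the pattern, not something the repository states. The block only builds and prints the command; launching it is left as a comment.

```python
import shlex

# Assumed pattern behind the column names: mt_<level><suffix>_normalized
levels = ["0.01", "1", "100"]
suffixes = "abc"
tcols = [f"mt_{level}{suffix}_normalized" for suffix in suffixes for level in levels]

cmd = [
    "keggcharter.py",
    "-f", "KEGGCharter/MOSCA_Entry_Report.xlsx",
    "-gcol", "mg",
    "-tcol", ",".join(tcols),            # comma-separated list, as in the original command
    "-keggc", "Cross-reference (KEGG)",
    "-o", "test_keggcharter",
    "-tc", "Taxonomic lineage (GENUS)",
]

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(cmd))

# To launch it from Python instead (inside the keggcharter environment):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids manual quoting of the column names that contain spaces and parentheses.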

 

 

Citation

UPIMAPI, reCOGnizer and KEGGCharter: Bioinformatics tools for functional annotation and visualization of (meta)-omics datasets
João C Sequeira, Miguel Rocha, M Madalena Alves, Andreia F Salvador

Comput Struct Biotechnol J. 2022 Apr 9;20:1798-1810