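From GitHub
KEGGCharter is a user-friendly implementation of the KEGG API and Pathway functionalities. Its features are:
- Conversion of KEGG IDs to KEGG Orthologs (KO), and of KOs to EC numbers.
- Representation of the metabolic potential of the main taxa on KEGG metabolic maps (up to the top 10, each distinguished by its own color).
- Representation of differential expression between samples on KEGG metabolic maps (shown as the sum for each function).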
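The first conversion in the list (KEGG ID to KO) can be illustrated with KEGG's public REST interface, whose link operation returns tab-separated pairs. The following is a minimal sketch, not KEGGCharter's own implementation: the function names parse_link_response and fetch_kos are hypothetical, and the sample text only illustrates the response format rather than being output captured from a run.

```python
from collections import defaultdict
from urllib.request import urlopen


def parse_link_response(text):
    """Parse tab-separated 'source<TAB>target' lines into {source: [targets]}."""
    mapping = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        source, target = line.split("\t")
        # Drop the database prefix (e.g. "ko:") so only the accession remains.
        mapping[source].append(target.split(":", 1)[1])
    return dict(mapping)


def fetch_kos(kegg_ids):
    """Query the KEGG REST 'link' operation for gene IDs (needs network access)."""
    url = "https://rest.kegg.jp/link/ko/" + "+".join(kegg_ids)
    with urlopen(url) as response:
        return parse_link_response(response.read().decode())


# Illustrative text in the format the link operation returns (not a captured response).
sample = "eco:b0002\tko:K12524\neco:b0003\tko:K00872\n"
print(parse_link_response(sample))
```

Because the parser only assumes two tab-separated columns with a database prefix on the target, it should also handle the KO to EC direction if the corresponding link query is used.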
Installation
I created a conda environment and tested it there.
mamba create -n keggcharter -y
conda activate keggcharter
mamba install -c conda-forge -c bioconda keggcharter -y
> keggcharter.py -h
usage: keggcharter.py [-h] [-o OUTPUT] [-rd RESOURCES_DIRECTORY] [-mm METABOLIC_MAPS] [-gcol GENOMIC_COLUMNS] [-tcol TRANSCRIPTOMIC_COLUMNS] [-tls TAXA_LIST] [-not NUMBER_OF_TAXA] [-keggc KEGG_COLUMN] [-koc KO_COLUMN] [-ecc EC_COLUMN] [-iq] [-it] [-tc TAXA_COLUMN]
                      [--resume] [-v] -f FILE [--show-available-maps]

KEGGCharter - A tool for representing genomic potential and transcriptomic expression into KEGG pathways

options:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Output directory
  -rd RESOURCES_DIRECTORY, --resources-directory RESOURCES_DIRECTORY
                        Directory for storing KGML and CSV files.
  -mm METABOLIC_MAPS, --metabolic-maps METABOLIC_MAPS
                        IDs of metabolic maps to output
  -gcol GENOMIC_COLUMNS, --genomic-columns GENOMIC_COLUMNS
                        Names of columns with genomic identification
  -tcol TRANSCRIPTOMIC_COLUMNS, --transcriptomic-columns TRANSCRIPTOMIC_COLUMNS
                        Names of columns with transcriptomics quantification
  -tls TAXA_LIST, --taxa-list TAXA_LIST
                        List of taxa to represent in genomic potential charts (comma separated)
  -not NUMBER_OF_TAXA, --number-of-taxa NUMBER_OF_TAXA
                        Number of taxa to represent in genomic potential charts (comma separated)
  -keggc KEGG_COLUMN, --kegg-column KEGG_COLUMN
                        Column with KEGG IDs.
  -koc KO_COLUMN, --ko-column KO_COLUMN
                        Column with KOs.
  -ecc EC_COLUMN, --ec-column EC_COLUMN
                        Column with EC numbers.
  -iq, --input-quantification
                        If input table has no quantification, will create a mock quantification column
  -it, --input-taxonomy
                        If no taxonomy column exists and there is only one taxon in question.
  -tc TAXA_COLUMN, --taxa-column TAXA_COLUMN
                        Column with the taxa designations to represent with KEGGCharter
  --resume              If data inputed has already been analyzed by KEGGCharter.
  -v, --version         show program's version number and exit

required named arguments:
  -f FILE, --file FILE  TSV or EXCEL table with information to chart

Special functions:
  --show-available-maps
                        Outputs KEGG maps IDs and descriptions to the console (so you may pick the ones you want!)

Input file must be specified.
Test run
git clone https://github.com/iquasere/KEGGCharter.git
The cloned repository contains the input table used below: KEGGCharter/MOSCA_Entry_Report.xlsx
keggcharter.py -f KEGGCharter/MOSCA_Entry_Report.xlsx -gcol mg -tcol mt_0.01a_normalized,mt_1a_normalized,mt_100a_normalized,mt_0.01b_normalized,mt_1b_normalized,mt_100b_normalized,mt_0.01c_normalized,mt_1c_normalized,mt_100c_normalized -keggc "Cross-reference (KEGG)" -o test_keggcharter -tc "Taxonomic lineage (GENUS)"
The first run of KEGGCharter takes a while. This command renders all 252 of KEGGCharter's default maps.
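The long -tcol value in the command above follows a regular pattern (three concentrations, three replicates), so it can be generated rather than typed by hand. Below is a minimal sketch, assuming the column names in MOSCA_Entry_Report.xlsx are exactly those used in that command. The helper names transcriptomic_columns and build_command are hypothetical, and passing several map IDs to -mm as one comma-separated string is an assumption that the help text does not state explicitly.

```python
import shlex


def transcriptomic_columns(concentrations=("0.01", "1", "100"), replicates="abc"):
    """Return column names like 'mt_0.01a_normalized', replicate by replicate."""
    return [
        f"mt_{conc}{rep}_normalized"
        for rep in replicates
        for conc in concentrations
    ]


def build_command(maps=None):
    """Assemble the keggcharter.py test-run command as an argument list."""
    cmd = [
        "keggcharter.py",
        "-f", "KEGGCharter/MOSCA_Entry_Report.xlsx",
        "-gcol", "mg",
        "-tcol", ",".join(transcriptomic_columns()),
        "-keggc", "Cross-reference (KEGG)",
        "-o", "test_keggcharter",
        "-tc", "Taxonomic lineage (GENUS)",
    ]
    if maps:
        # Assumption: -mm accepts comma-separated map IDs.
        cmd += ["-mm", ",".join(maps)]
    return cmd


# Print a shell-quoted command line; to execute it instead, pass the list
# to subprocess.run(cmd, check=True).
print(shlex.join(build_command()))
```

If -mm behaves as assumed, calling build_command(maps=["00010"]) (00010 is the KEGG glycolysis map) would limit the run to a single map, which may be a quicker way to check an installation than rendering every default map.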
Citation
UPIMAPI, reCOGnizer and KEGGCharter: Bioinformatics tools for functional annotation and visualization of (meta)-omics datasets
João C Sequeira, Miguel Rocha, M Madalena Alves, Andreia F Salvador
Comput Struct Biotechnol J. 2022 Apr 9;20:1798-1810