8/8: Fixed typos.
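The QIIME1 script filter_tree.py (qiime phylogeny filter-tree in QIIME2) takes a phylogenetic tree file and a list of names (OTU names, genome names, etc.) and outputs a subtree that retains only the tips found in that list. With the -n/--negate option, it instead retains only the tips that are not in the list.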
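Because the filter works on tip names, the names in the list have to match the tip labels in the tree. The following is a rough sketch for listing the tip labels of a Newick file with standard shell tools. The tree, the file names example.tre and all_tips.txt, and the labels A to D are made up for illustration, and the pattern assumes unquoted labels.

```shell
set -eu

# Hypothetical example tree in Newick format (tips A-D, one internal support value).
printf '%s\n' '((A:0.1,B:0.2)0.9:0.3,(C:0.1,D:0.4):0.2);' > example.tre

# A tip label directly follows "(" or ","; internal node labels follow ")".
# Assumes unquoted labels that contain none of "(", ")", ",", ":" or ";".
grep -oE '[(,][^(),:;]+' example.tre | tr -d '(,' > all_tips.txt

# Prints A, B, C and D, one per line.
cat all_tips.txt
```

A subset of all_tips.txt can then serve as the list of tips to keep or remove.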
QIIME1
filter_tree.py – This script prunes a tree based on a set of tip names — Homepage
QIIME2
https://docs.qiime2.org/2022.2/plugins/available/phylogeny/filter-tree/?highlight=filter_tree%20py
Installation
Because there are many dependencies, I used a publicly available (unofficial) QIIME1 Docker image.
QIIME2
QIIME1
#dockerhub, github
docker pull mbari/qiime1:latest
> filter_tree.py -h
Usage: filter_tree.py [options] {-i/--input_tree_filepath
INPUT_TREE_FP -o/--output_tree_filepath OUTPUT_TREE_FP}
[] indicates optional input (order unimportant)
{} indicates required input (order unimportant)
This script takes a tree and a list of OTU IDs (in one of several
supported formats) and outputs a subtree retaining only the tips on
the tree which are found in the inputted list of OTUs (or not found,
if the --negate option is provided).
Example usage:
Print help message and exit
filter_tree.py -h
Prune a tree to include only the tips in tips_to_keep.txt:
filter_tree.py -i rep_seqs.tre -t tips_to_keep.txt -o pruned.tre
Prune a tree to remove the tips in tips_to_remove.txt. Note that the
-n/--negate option must be passed for this functionality:
filter_tree.py -i rep_seqs.tre -t tips_to_keep.txt -o negated.tre -n
Prune a tree to include only the tips found in the fasta file provided:
filter_tree.py -i rep_seqs.tre -f fast_f.fna -o pruned_fast.tre
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-v, --verbose Print information during execution -- useful for
debugging [default: False]
-n, --negate if negate is True will remove input tips/seqs, if
negate is False, will retain input tips/seqs [default:
False]
-t TIPS_FP, --tips_fp=TIPS_FP
A list of tips (one tip per line) or sequence
identifiers (tab-delimited lines with a seq
identifier in the first field) which should be
retained [default: none]
-f FASTA_FP, --fasta_fp=FASTA_FP
A fasta file where the seq ids should be retained
[default: none]
REQUIRED options:
The following options must be provided under all circumstances.
-i INPUT_TREE_FP, --input_tree_filepath=INPUT_TREE_FP
input tree filepath [REQUIRED]
-o OUTPUT_TREE_FP, --output_tree_filepath=OUTPUT_TREE_FP
output tree filepath [REQUIRED]
Usage
1. Here we start the Docker image and work inside the container.
cd <path>/<to>/<tree_dir>/
docker run -itv $PWD:/data -w /data --rm mbari/qiime1:latest
source activate qiime1
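If an interactive session is not needed, the same steps might be run as a single non-interactive command. This is only a sketch: it assumes the image provides bash, that "source activate qiime1" also works in a non-interactive shell, and that the image has no entrypoint that interferes. None of this is confirmed by the image's documentation here. input.tre and tips_keep.txt are placeholder file names in the current directory.

```shell
set -eu

# Command to run inside the container: activate the environment, then filter.
# input.tre and tips_keep.txt are placeholder names, not files from the post.
inner='source activate qiime1 && filter_tree.py -i input.tre -t tips_keep.txt -o output.tre'

if command -v docker >/dev/null 2>&1 && [ -f input.tre ] && [ -f tips_keep.txt ]; then
    # Same mount and working directory as the interactive run, but without -it.
    docker run --rm -v "$PWD":/data -w /data mbari/qiime1:latest bash -c "$inner"
else
    # Nothing is started; just show what would have been run in the container.
    echo "docker or input files missing; would run inside the container: $inner"
fi
```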
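2. Specify a list of the OTU names or genome names to keep (one per line), the tree file to filter, and the output tree file name. Adding "-n" inverts the filter: the listed tips are removed and the remaining tips are kept.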
filter_tree.py -i input.tre -t tips_keep.txt -o output.tre
- -t A list of tips (one tip per line) or sequence identifiers (tab-delimited lines with a seq identifier in the first field) which should be retained [default: none]
- -i input tree filepath [REQUIRED]
- -o output tree filepath [REQUIRED]
- -n if negate is True will remove input tips/seqs, if negate is False, will retain input tips/seqs
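The command and options above can be put together as in the following sketch, which writes a small tips list and then filters the same tree in both modes. The tip names OTU_1 to OTU_3 and the output names kept.tre and removed.tre are hypothetical; the filter_tree.py calls are skipped when the script or input.tre is not available.

```shell
set -eu

# Hypothetical tip names; replace them with names that occur in your tree.
printf '%s\n' OTU_1 OTU_2 OTU_3 > tips_keep.txt

if command -v filter_tree.py >/dev/null 2>&1 && [ -f input.tre ]; then
    # Keep only the listed tips.
    filter_tree.py -i input.tre -t tips_keep.txt -o kept.tre
    # Same list with -n: remove the listed tips and keep everything else.
    filter_tree.py -i input.tre -t tips_keep.txt -o removed.tre -n
else
    echo "filter_tree.py or input.tre not found; only tips_keep.txt was written" >&2
fi
```

Running both modes on the same list splits the tips of the tree into two complementary outputs, which is a quick way to check that the names in the list were actually recognized.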
Citation
Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2
Evan Bolyen, Jai Ram Rideout, Matthew R. Dillon, Nicholas A. Bokulich, Christian C. Abnet, Gabriel A. Al-Ghalith, Harriet Alexander, Eric J. Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E. Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J. Brislawn, C. Titus Brown, Benjamin J. Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily K. Cope, Ricardo Da Silva, Christian Diener, Pieter C. Dorrestein, Gavin M. Douglas, Daniel M. Durall, Claire Duvallet, Christian F. Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M. Gauglitz, Sean M. Gibbons, Deanna L. Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin A. Huttley, Stefan Janssen, Alan K. Jarmusch, Lingjing Jiang, Benjamin D. Kaehler, Kyo Bin Kang, Christopher R. Keefe, Paul Keim, Scott T. Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan G. I. Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D. Martin, Daniel McDonald, Lauren J. McIver, Alexey V. Melnik, Jessica L. Metcalf, Sydney C. Morgan, Jamie T. Morton, Ahmad Turan Naimey, Jose A. Navas-Molina, Louis Felix Nothias, Stephanie B. Orchanian, Talima Pearson, Samuel L. Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S. Robeson II, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R. Spear, Austin D. Swafford, Luke R. Thompson, Pedro J. Torres, Pauline Trinh, Anupriya Tripathi, Peter J. Turnbaugh, Sabah Ul-Hasan, Justin J. J. van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C. Weber, Charles H. D. Williamson, Amy D. Willis, Zhenjiang Zech Xu, Jesse R. Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight & J. Gregory Caporaso
Nature Biotechnology volume 37, pages 852–857 (2019)