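TransComb is a genome-guided transcriptome assembler built on junction graphs. It assembles RNA-seq data using paired-end short reads and a reference genome. According to the authors, it was tested on both simulated and real datasets from multiple species and showed improved recall and precision over leading assemblers such as StringTie, Cufflinks, Bayesembler, and Traph. It is also reported to run fast and use little memory.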
Installation
Prebuilt binaries are distributed.
SourceForge download
https://sourceforge.net/projects/transcriptomeassembly/files/
$ ./TransComb
** Error: BAM input file is not provided! **
===========================================================================
TransComb v.1.0 usage:
** Required **
-b <string>: BAM file produced by Tophat or Tophat2.
-s <string>: Strand-specific RNA-Seq reads orientation.
1) Use <unstranded> to indicate RNA-seq reads are non-strand-specific.
2) Use <first> to indicate fr-first-stranded RNA-seq reads.
3) Use <second> to indicate fr-second-stranded RNA-seq reads.
---------------------------------------------------------------------------
** Options **
-h: Output TransComb Help Information
-o <string>: Output path/file name of the assembled transcripts GTF, default: ./TransComb.gtf
-f <string>: Minimum expression level estimated by abundance analysis for output, default: 0.
-l <string>: Minimum assembled transcript length, default: 500.
-d <string>: Minimum junction coverage fraction by maximum junction coverage, default: 0.02.
-D <string>: Minimum farction of unbalanced junction, default: 0.1.
-g <string>: Minimum gene length, default: 200.
-t: Disable trimming of predicted transcripts based on coverage, default: coverage trimming is enabled.
-e <string>: Minimum gap length between two exons, default: 200.
-F <string>: Minimum seed coverage used for generate a new transcript, default: 0.
-a <string>: Minimum anchor length for junctions, default: 1.
-m <string>: Fraction of exon allowed to be covered by multi-hit reads, default: 1.
-v: Report the current version of TransComb and exit.
** Note **
A typical command of TransComb might be:
TransComb -b file.bam -s unstranded
===========================================================================
Move the binary to a directory that is on your PATH.
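One way to do this is sketched below. The helper name `install_to_path` and the destination `$HOME/bin` are assumptions made for this example, not part of TransComb; any directory already on your PATH works just as well.

```shell
# Copy an executable into a directory and make sure that directory is on PATH.
# The function name and both arguments are placeholders for this example.
install_to_path() {
    src="$1"
    dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod +x "$dest_dir/$(basename "$src")" || return 1
    case ":$PATH:" in
        *":$dest_dir:"*) ;;                          # already on PATH
        *) PATH="$dest_dir:$PATH"; export PATH ;;    # prepend for this session
    esac
}

# Example call, assuming the downloaded binary sits in the current directory:
# install_to_path ./TransComb "$HOME/bin"
```

The PATH change only lasts for the current shell session. To make it permanent, add the corresponding `export PATH=...` line to your shell startup file.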
Run
Run a test on the bundled sample data (unstranded).
cd sample_test/
TransComb -b test.bam -s unstranded -l 200
- -b <string> BAM file produced by Tophat or Tophat2.
- -s <string> Strand-specific RNA-Seq reads orientation.
  1) Use <unstranded> to indicate RNA-seq reads are non-strand-specific.
  2) Use <first> to indicate fr-first-stranded RNA-seq reads.
  3) Use <second> to indicate fr-second-stranded RNA-seq reads.
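For strand-specific libraries, the same command takes `-s first` or `-s second`, and `-o` sets the output GTF path. The sketch below is a small hypothetical helper (`transcomb_cmd` is not part of TransComb) that checks the strand value against the three settings listed in the help text and prints the resulting command line. The file names `accepted_hits.bam` and `sample1.gtf` are placeholders.

```shell
# Print a TransComb command line for a BAM file and a strand setting.
# Uses only flags shown in the help output (-b, -s, -o).
# Arguments: <bam> <unstranded|first|second> [output.gtf]
transcomb_cmd() {
    bam="$1"
    strand="$2"
    out="${3:-./TransComb.gtf}"    # same default as in the help text
    case "$strand" in
        unstranded|first|second) ;;
        *) echo "unknown strand setting: $strand" >&2; return 1 ;;
    esac
    printf 'TransComb -b %s -s %s -o %s\n' "$bam" "$strand" "$out"
}

# Example: an fr-first-stranded library
transcomb_cmd accepted_hits.bam first sample1.gtf
# prints: TransComb -b accepted_hits.bam -s first -o sample1.gtf
```

Printing the command instead of running it makes it easy to review before execution. Which of `first` or `second` is correct depends on how the library was prepared, so confirm it against your library preparation protocol.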
Citation
TransComb: genome-guided transcriptome assembly via combing junctions in splicing graphs.
Genome Biol. 2016 Oct 19;17(1):213.
Liu J, Yu T, Jiang T, Li G.