QTrim is a quality trimming tool for Roche/454 reads. Its performance is reported to be comparable to that of PRINSEQ.
Official website
http://hiv.sanbi.ac.za/software/qtrim#Installation
Web server
http://hiv.sanbi.ac.za/tools/#/qtrim
Installation
An executable binary and a 454 test dataset (QTrimTestData.tar) can be downloaded from the official website.
> ./QTrim_v1_1 -h
user$ QTrim
*****************************************
LICENSING:
QTrim
Copyright (c) 2013, QTrim Development Team (QDT)
QTrim is freely available for use for non-commercial users and there is no restriction for academic use of QTrim. Commercial use may be restricted and such users should contact Prof Simon Travers for further details (simon@sanbi.ac.za).
All rights reserved.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
The software listed below is called by QTrim and is bundled in the executable file to facilitate easy usage and installation of QTrim. The QTrim development team have in no way modified any aspect of the softwares listed below.
Matplotlib (Copyright (c) 2012-2013 Matplotlib Development Team; All Rights Reserved) is distributed under the BSD license with licensing information available at at http://matplotlib.org/users/license.html. Matplotlib is available at www.matplotlib.org
Numpy (Copyright (c) 2005, NumPy Developers) is distributed under BSD license (http://docs.scipy.org/doc/numpy/license.html). Numpy is available at: www.numpy.org
Biopython is available under GNU free license 1.2 (http://www.biopython.org/DIST/LICENSE). Biopython is available at: www.biopython.org
*****************************************
QTrim: Highly sensitive 454 pyrosequence Quality Trimming tool
QTrim Version: 1.1
Required options:
Input File: -fastq fastqfile OR both -fasta fastafile -qual qualityfile
Other options:
Output file: -o [Default filename: Outputfile]
Mean quality: -m [INT] Range: 0-40 Default: 20
Minimum read length: -l [INT] Default: 50
Mode: -mode [1,2,3,4] Default: 2
remove keys: -rk [INT]
Verbose: -verbose
WindowSize: -w [INT] [Default is mininum length]
Output file format: -out_format [Output file format: 1) Fastq file with INT quality score 2) Fastq file with ASCII quality score 3) Sequence and Quality in different Fasta file with Base name provided in output filename]
Sequence statistics in id: -seq_id_stat
Analytical plotting: -plot plot_format (supports these formats:eps, pdf, svg, svgz )
Example command:
QTrim_v1_1 -fastq myfastqfile #Runs with all default values
QTrim_v1_1 -verbose -fasta fastafile -qual qualityfile -l 10 -m 30 -o outputfilename -mode 2 -out_format 2
OR
/Fullpath/to/QTrim_v1_1 -fastq fastqfile -l 50 -m 30 -o outputfilename -mode 3 -out_format 3 -seq_id_stat -plot pdf
Move the binary to a directory that is on your PATH, or create a symbolic link to it from such a directory.
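As a rough sketch of the symlink route, the helper below links the downloaded binary into a bin directory under the name QTrim, which matches the command name used in this post. The function name `link_into_bin` and the choice of `~/bin` are illustrative assumptions; the target directory only helps if it is actually on your PATH.

```python
from pathlib import Path


def link_into_bin(binary: Path, bin_dir: Path, name: str = "QTrim") -> Path:
    """Create bin_dir/name as a symlink pointing at the binary; return the link."""
    binary = binary.resolve()
    bin_dir.mkdir(parents=True, exist_ok=True)
    link = bin_dir / name
    # Replace a stale link or file of the same name, if one exists.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(binary)
    return link


# Hypothetical usage, assuming ~/bin is on PATH and the binary is in the
# current directory:
# link_into_bin(Path("QTrim_v1_1"), Path.home() / "bin")
```

The binary must also be executable (for example via `chmod +x`) for the link to be usable as a command.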
Usage
Run QTrim on the test data.
QTrim -fastq Poor_quality_dataset.fastq -o output -plot pdf -out_format 2
- -fastq FASTQ file containing both the sequence data and the quality scores. Quality scores should be in PHRED format.
- -o Output filename.
- -out_format Output file format. Options: 1: FASTQ with quality scores as integer values. 2: FASTQ with quality scores as ASCII characters. 3: separate sequence (FASTA) and quality (.qual) files with quality scores as integer values (default: 2).
- -plot If this option is given, QTrim produces a number of plots of the statistics associated with the trimming. Available output formats are eps, pdf, svg, and svgz. Without this option, trimming proceeds without producing graphs.
- -verbose Prints verbose output to the screen while processing and trimming sequence reads.
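When running QTrim over many files, it can help to assemble the command from a script. The sketch below only uses flags and defaults that appear in the help output above; the function name `build_qtrim_command` and the assumption that the executable is reachable as `QTrim` are illustrative and not part of QTrim itself.

```python
def build_qtrim_command(fastq, output="Outputfile", mean_quality=20,
                        min_length=50, mode=2, out_format=2,
                        plot=None, verbose=False, executable="QTrim"):
    """Return the argument list for one QTrim run (nothing is executed here)."""
    # Ranges and choices below are taken from the help text.
    if not 0 <= mean_quality <= 40:
        raise ValueError("mean quality (-m) must be in the range 0-40")
    if mode not in (1, 2, 3, 4):
        raise ValueError("mode (-mode) must be 1, 2, 3 or 4")
    if out_format not in (1, 2, 3):
        raise ValueError("output format (-out_format) must be 1, 2 or 3")
    if plot is not None and plot not in ("eps", "pdf", "svg", "svgz"):
        raise ValueError("plot format (-plot) must be eps, pdf, svg or svgz")

    cmd = [executable,
           "-fastq", str(fastq),
           "-o", str(output),
           "-m", str(mean_quality),
           "-l", str(min_length),
           "-mode", str(mode),
           "-out_format", str(out_format)]
    if plot is not None:
        cmd += ["-plot", plot]
    if verbose:
        cmd.append("-verbose")
    return cmd


cmd = build_qtrim_command("Poor_quality_dataset.fastq", output="output", plot="pdf")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to start the run. Validating the values before launching catches typos without waiting for QTrim to reject them.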
The trimmed FASTQ file, read counts, and summary statistics are written out.
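Output formats 1 and 2 differ only in how the quality scores are written: as integers or as ASCII characters. The sketch below shows the usual conversion between the two, assuming the common Sanger-style Phred+33 offset. That offset is an assumption here, since this post does not state which encoding QTrim writes, and the helper names are illustrative.

```python
def ascii_to_phred(quality_string, offset=33):
    """Decode an ASCII quality string into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def phred_to_ascii(scores, offset=33):
    """Encode integer Phred scores as an ASCII quality string."""
    return "".join(chr(q + offset) for q in scores)


def mean_quality(scores):
    """Arithmetic mean of a read's Phred scores (the quantity the -m threshold refers to)."""
    return sum(scores) / len(scores)


scores = ascii_to_phred("I5!")
print(scores)                  # [40, 20, 0]
print(phred_to_ascii(scores))  # I5!
print(mean_quality(scores))    # 20.0
```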
statistics
user$ head Outputfile_stat.txt
Total reads input: 33022
Total reads output: 32835
Maximum read length in output: 479
Minimum read length in output: 50
Mean read length in output: 274
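To compare runs with different settings, the summary lines can be read into a dictionary. This is a minimal sketch that assumes every relevant line has the `label: integer` shape shown above; the rest of the file may look different, and the function name `parse_qtrim_stats` is illustrative.

```python
def parse_qtrim_stats(text):
    """Parse 'label: integer' lines into a dict; other lines are skipped."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[key.strip()] = int(value)
    return stats


# The lines printed by `head` above, used as sample input.
example = """Total reads input: 33022
Total reads output: 32835
Maximum read length in output: 479
Minimum read length in output: 50
Mean read length in output: 274"""

stats = parse_qtrim_stats(example)
removed = stats["Total reads input"] - stats["Total reads output"]
retained = stats["Total reads output"] / stats["Total reads input"]
print(f"{removed} reads removed, {retained:.2%} retained")
# 187 reads removed, 99.43% retained
```

For the test data, only 187 of the 33022 input reads are discarded entirely under the default settings; the remaining reads are kept, possibly in trimmed form.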
With -plot, several figures are produced in addition to the FASTQ output.
[Figures: three pairs of plots, each showing the data before and after trimming]
Reference
QTrim: a novel tool for the quality trimming of sequence reads generated using the Roche/454 sequencing platform
Shrestha RK, Lubinsky B, Bansode VB, Moinz MB, McCormack GP, Travers SA
BMC Bioinformatics. 2014 Jan 30;15:33.