selectION: fast selection of long reads originating from specific regions

London Calling 2017

Installation

Tested on Ubuntu 18.04 LTS.

Build dependencies

requires gcc > 5 and the following libraries:

Boost is available through standard package sources. Libhdf5 is downloaded and built by the install script.

Source: GitHub

git clone https://github.com/PayGiesselmann/selectION
cd selectION
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
make install

> selection index-h

Program: SelectION

Usage: selection <command> [options]

Commands:

index : Build FM-Index for reference sequence

scan : Scan input for reads matching specified positions

> selection scan -h

Usage: selection scan [options] <db.prefix> <input.fq> <outputDir>

Read selection options:

-f [ --filter ] arg Input selection filter

-s [ --sam ] arg Write pseudo alignment in sam format to file

-t [ --threads ] arg (=1) Number of threads

-q [ --quality ] arg (=20) Quality threshold for filtered reads

--scanPrefix arg (=1000) Prefix of read to use for alignment
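To get a feel for what "prefix of read" means in the --scanPrefix option, the sketch below truncates every record of a FASTQ file to its first N bases. It is only an illustration of the idea: selectION handles the prefix internally, so reads do not need to be trimmed beforehand. The function name fastq_prefix is hypothetical, and the code assumes the common four-lines-per-record FASTQ layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of selectION.
# Reads FASTQ on stdin and keeps only the first N characters of each
# sequence line and each quality line (default N = 1000, mirroring the
# default shown for --scanPrefix).
# Assumption: every record occupies exactly four lines.
fastq_prefix() {
    local n="${1:-1000}"
    awk -v n="$n" '
        # Line 2 of each record is the sequence, line 4 the quality string.
        NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, n); next }
        # Header ("@...") and separator ("+") lines pass through unchanged.
        { print }
    '
}
```

For example, `fastq_prefix 1000 < input.fq > input.prefix.fq` would write a copy of input.fq in which no read is longer than 1000 bases.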

Usage

1. index

Build an index of the reference genome.

selection index -t 8 ref.fa

2. scan

Estimate the position of every read in input.fq and write the results to out.sam.

selection scan -t 8 ref.fa input.fq ./ --sam ./out.sam

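Because no full alignment is performed, it runs very fast. For a file of about 1 GB, the SAM output is written in roughly 10 seconds.

As the video shows, the tool is still under development. I will add to this post when there are updates. Direct selection from ONT fast5 files, among other features, appears to be planned.

Reference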
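Once out.sam exists, reads from a region of interest can be pulled out with standard tools. The following is a minimal sketch, assuming the pseudo alignment written by --sam uses the standard SAM columns (1 = read name, 2 = FLAG, 3 = reference name, 4 = 1-based start position). The function name reads_in_region is hypothetical, and the check looks only at the reported start position, not at the full extent of the read.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of selectION.
# Usage: reads_in_region <file.sam> <reference name> <start> <end>
# Prints the names of reads whose reported start position lies in
# [start, end] on the given reference.
# Assumption: the file follows the standard tab-separated SAM layout.
reads_in_region() {
    local sam="$1" chrom="$2" start="$3" end="$4"
    awk -F '\t' -v chrom="$chrom" -v start="$start" -v end="$end" '
        /^@/ { next }                      # skip SAM header lines
        int($2 / 4) % 2 == 1 { next }      # FLAG bit 0x4 set: read is unmapped
        $3 == chrom && $4 + 0 >= start + 0 && $4 + 0 <= end + 0 { print $1 }
    ' "$sam"
}
```

For example, `reads_in_region out.sam chr1 100000 200000 > names.txt` would collect the names of candidate reads from that window (chr1 and the coordinates are placeholders), which can then be used to extract the corresponding records from input.fq.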

macでインフォマティクス