ABySS 2.0 assembler - macでインフォマティクス

Updated 2022/12/27

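ABySS 1.0 could assemble a human genome, but like SOAPdenovo and similar assemblers it needed more than 600 GB of memory, which made it computationally demanding. ABySS 2.0 cuts the memory requirement by more than an order of magnitude and was designed to assemble more efficiently. The paper reports that a human genome assembly needed only 35 GB of memory. It is also reported that, combined with BioNano optical mapping and long mate-pair data, chromosome-sized DNA can be assembled into as few as 2-4 scaffolds. The maximum k-mer size that can be specified has also been raised to 127.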

Installation

Dependencies

Main program: GitHub

It can be installed with brew or conda.

If it is already installed via brew, run "brew upgrade" to move to 2.0.

#bioconda (link)
mamba install -c bioconda -y abyss

Check the version

> abyss-pe --version

abyss-pe version

abyss-pe (ABySS) 2.0.1

Written by Shaun Jackman and Anthony Raymond.
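The Bloom filter mode covered later in this post was introduced in ABySS 2, so a script may want to confirm the installed version first. The following is a minimal sketch that assumes the version line has the "abyss-pe (ABySS) X.Y.Z" form shown above; abyss_version is a hypothetical helper name, not part of ABySS.

```shell
# abyss_version: read `abyss-pe --version` output on stdin and print the
# version number taken from the "abyss-pe (ABySS) X.Y.Z" line.
# Hypothetical helper; assumes the output format shown in this post.
abyss_version() {
    sed -n 's/^abyss-pe (ABySS) \([0-9][0-9.]*\).*$/\1/p' | head -n 1
}

# Usage (only meaningful where abyss-pe is installed):
# major=$(abyss-pe --version | abyss_version | cut -d. -f1)
# [ "$major" -ge 2 ] && echo "Bloom filter mode should be available"
```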

Help

> abyss-pe help

user$ abyss-pe help

Usage: abyss-pe [OPTION]... [PARAMETER=VALUE]... [COMMAND]...
Assemble reads into contigs and scaffolds. ABySS is a de novo
sequence assembler intended for short paired-end reads and large
genomes. See the abyss-pe man page for documentation of assembly
parameters and commands. abyss-pe is a Makefile script, and so
options of `make` may also be used with abyss-pe. See the `make`
man page for documentation.

See GitHub for details on the parameters (link).
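The help text says abyss-pe is a Makefile script that accepts make options. One practical consequence is that make's standard -n (dry-run) option can be used to list the pipeline commands without running them. The wrapper below is a sketch only: preview_assembly is a hypothetical name and the file names in the usage line are placeholders.

```shell
# preview_assembly PARAMETER=VALUE...: print the commands abyss-pe would run
# without executing them, using make's standard -n (dry-run) option.
# Hypothetical wrapper; requires abyss-pe on PATH.
preview_assembly() {
    if ! command -v abyss-pe >/dev/null 2>&1; then
        echo "abyss-pe not found on PATH" >&2
        return 127
    fi
    abyss-pe -n "$@"
}

# Usage (file names are placeholders):
# preview_assembly name=ecoli k=61 in='reads1.fa reads2.fa'
```

Checking the dry-run output before a long assembly is a cheap way to catch a mistyped parameter.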

Usage

Assembling paired-end reads.

abyss-pe name=ecoli k=61 in='reads1.fa reads2.fa'

k　the length of a k-mer (when -K is not set) or the span of a k-mer pair (when -K is set)
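The best k usually has to be found empirically, so it is common to run the same assembly at several k values and compare the results. The sketch below only prints the commands for such a sweep and runs nothing. print_k_sweep is a hypothetical helper, the k range is arbitrary, and the file names are placeholders. It relies on make's -C (change directory) option so that each k gets its own output directory.

```shell
# print_k_sweep NAME START STEP END FILE...: print one abyss-pe command per
# k value from START to END (inclusive) in increments of STEP.
# Hypothetical helper; it only prints commands and runs nothing.
print_k_sweep() {
    name=$1; start=$2; step=$3; end=$4
    shift 4
    k=$start
    while [ "$k" -le "$end" ]; do
        # -C makes abyss-pe work inside k$k, so input paths must be
        # absolute or relative to that directory (hence ../ below).
        printf "mkdir -p k%s && abyss-pe -C k%s name=%s k=%s in='%s'\n" \
            "$k" "$k" "$name" "$k" "$*"
        k=$((k + step))
    done
}

print_k_sweep ecoli 41 10 61 ../reads1.fa ../reads2.fa
```

Each printed line can be reviewed first and then piped to sh to run it.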

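The two reads of a pair must be named with the suffixes /1 and /2 (or be named identically). The paired reads may be in separate files, as here, or interleaved in a single file.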
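As a quick sanity check before assembling, the read names in each input file can be counted by suffix. This is a sketch under two assumptions: the inputs are uncompressed FASTA, and the reads use the /1 and /2 read-name suffix convention. Reads named identically in both files would need a different check. count_suffix is a hypothetical helper.

```shell
# count_suffix FILE SUFFIX: count FASTA headers whose read name ends in SUFFIX.
# Hypothetical helper; assumes uncompressed FASTA with ">" header lines.
count_suffix() {
    awk -v suf="$2" '
        /^>/ {
            name = substr($1, 2)          # first word of the header, minus ">"
            n = length(name); m = length(suf)
            if (n >= m && substr(name, n - m + 1) == suf) c++
        }
        END { print c + 0 }
    ' "$1"
}

# Usage (file names are placeholders); the two counts should match:
# count_suffix reads1.fa /1
# count_suffix reads2.fa /2
```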

Assembling in the Bloom filter de Bruijn graph mode introduced in ABySS 2 (memory usage drops by more than an order of magnitude).

 abyss-pe name=ecoli k=61 in='reads1.fa reads2.fa' B=100M H=3 kc=3 v=-v

B: Bloom filter size (e.g. "100M")
H: number of Bloom filter hash functions [1]
v: use v=-v for verbose logging, v=-vv for extra verbose

The Bloom filter size of 100M is an example for a bacterial-sized genome.
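To get a feel for how B and H interact, the textbook false positive rate of a single standard Bloom filter can be estimated as (1 - e^(-H*N/m))^H, where m is the filter size in bits and N is the number of distinct k-mers inserted. The sketch below is only a rough guide. It does not model how ABySS actually divides B among its kc counting levels, it assumes "100M" means 100,000,000 bytes, and the k-mer count in the example is purely illustrative. bloom_fpr is a hypothetical helper.

```shell
# bloom_fpr BYTES HASHES N: approximate false positive rate of one standard
# Bloom filter of BYTES bytes with HASHES hash functions after N distinct
# k-mers are inserted: (1 - e^(-H*N/m))^H, with m = 8 * BYTES bits.
# Hypothetical helper; it does not model ABySS's kc-level filters.
bloom_fpr() {
    awk -v b="$1" -v h="$2" -v n="$3" 'BEGIN {
        m = 8 * b                          # filter size in bits
        printf "%.6f\n", (1 - exp(-h * n / m)) ^ h
    }'
}

# Example: 100 MB filter, 3 hash functions, 20 million distinct k-mers
# (the k-mer count is an illustrative number, not a measurement).
bloom_fpr 100000000 3 20000000
```

For a larger genome the same function shows how quickly the estimate degrades if B is left at a bacterial-scale value.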

Assembling strand-specific RNA.

abyss-pe name=SS-RNA k=61 in='reads1.fa reads2.fa' SS=--SS

--SS: assemble in strand-specific mode

Citation

ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter.

Jackman SD, Vandervalk BP, Mohamadi H, Chu J, Yeo S, Hammond SA, Jahesh G, Khan H, Coombe L, Warren RL, Birol I.

Genome Res. 2017 May;27(5):768-777. doi: 10.1101/gr.214346.116. Epub 2017 Feb 23.

A brief explanation by the first author

http://sjackman.ca/2016-08-08-abyss-2.0/

ABySS is also used in the QUAST-LG benchmark, where it performs well even on large diploid genomes.