2021 9/6 Code corrected
BamDeal is a full-featured toolkit for comprehensive analysis of bam files. It is implemented in C/C++ and runs on Linux and Mac OS X.
Installation
Dependencies
Pre-installations of 4 libraries or softs are required before installing BamDeal
- htslib: samtools-1.9/htslib-1.9 1.5 <= htslib <= 1.9
- g++ : g++ with --std=c++11 > 4.8+ is recommended
- zlib : zlib > 1.2.3 is recommended
- R : R with ggplot is recommended
A stable release can be downloaded from the Releases page.
# Make it executable and rename it
chmod 755 ./bin/BamDeal_Linux
mv ./bin/BamDeal_Linux ./bin/BamDeal
> ./BamDeal_Linux
# ./BamDeal_Linux
Program: BamDeal
Version: 0.24 hewm2008@gmail.com Sep 16 2020
Usage:
convert convert tools
modify modify tools
statistics statistics analysis tools
visualize visualize tools for bam
Help Show help in detail
Copy it to a directory on your PATH.
Detailed help
> BamDeal convert
# BamDeal convert
soap2bam soap --> bam/sam Format
bam2soap bam/sam --> soap Format
bam2fq bam/sam --> Fastq Format
bam2fa bam/sam --> Fasta Format
Help Show this help
> BamDeal modify
# BamDeal modify
bamFilter filter low quality read in bam
bamSplit split single/muti-Bam by chr
bamAssign split single/muti-Bam by assign chr
bamCat Merge/Cat muti (diff header) bam to one bam
bamRand random out partly of bam read
bamSubChr extract or remove chr(s) from SAM/BAM
bamShiftQ modify seq Phred quality in bam
bamLimit Limit big bam to muti subbam by fix line
Help Show this help
> BamDeal statistics
# BamDeal statistics
Coverage Calculate Genome Coverage/Depth/GC Dis based Bam
BasesCount Calculate Genome every Site's four base Depth
DeteCNV Detect CNV/Deletion Region by merge Depth info based Bam
DeteSV Detect SV by Pair End Read insert size in Bam
LowDepth GiveOut bed file of low Depth Region(may BigDeletion)
Help Show this help
> BamDeal visualize
# BamDeal visualize
StatQC generate plots for quality control
DepthCov Show Fig of Depth Dis & Depth~Coverage
DepthGC Show Fig of Depth~RefGC
DepthSlide Show Manhattan Fig of Depth sliding Windows along genome
Help Show this help
Usage
Most commands take a single bam file with -i and a list of multiple bams with -l. To pass bams as a list, give the path of the list file.
./A.bam
./B.bam
./C.bam
...
When the bams are in the current directory, the list looks like the above: a file with one bam file path per line.
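One way to produce such a list is sketched below. The file name bam.list is only an illustrative choice, not something BamDeal requires, and the demo creates empty placeholder files in a scratch directory so that it runs anywhere.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo setup (assumption): a scratch directory with three empty placeholder
# files standing in for real BAMs.
workdir=$(mktemp -d)
cd "$workdir"
touch A.bam B.bam C.bam

# Write one BAM path per line, in the ./name.bam form shown above.
printf './%s\n' *.bam > bam.list

cat bam.list
```

In a real project, drop the demo setup and run only the printf line in the directory that holds the bams.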
BamDeal convert
bam2soap - bam/sam => SOAP bam
BamDeal convert bam2soap -InFile in.bam -OutPut out_SOAP.bam
soap2bam - SOAP bam => bam/sam
BamDeal convert soap2bam -InSoap in_SOAP.bam -OutBam out.bam -Dict Ref.fa
bam2fq - bam/sam => fastq
BamDeal convert bam2fq -i in.bam -o out.fq
=> out.fq.gz is written
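A quick sanity check on the gzipped FASTQ is to count its reads. The sketch below assumes the usual four-lines-per-record FASTQ layout (no wrapped sequence lines) and builds a tiny two-read file as a stand-in for out.fq.gz.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo input (assumption): a two-read FASTQ, gzipped, standing in for the
# out.fq.gz written by bam2fq.
workdir=$(mktemp -d)
fq="$workdir/out.fq.gz"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "$fq"

# A FASTQ record is four lines, so reads = lines / 4.
lines=$(gzip -dc "$fq" | wc -l)
reads=$(( lines / 4 ))
echo "reads: $reads"
```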
bam2fa - bam/sam => fasta
BamDeal convert bam2fa -i in.bam -o out.fa
=> out.fa.gz is written
BamDeal modify
bamFilter - filter low-quality reads
# Filter out reads at or below Q15, reads of 30 bp or shorter, and duplicate reads
BamDeal modify bamFilter -i in.bam -o out.bam -q 15 -l 30 -d
- -q the quality to filter reads, default [15]
- -l the length to filter reads, default [30]
- -s the beginning of interval containing the 1-based leftmost mapping position of first matching base, default [0]
- -e the end of interval containing the 1-based leftmost mapping position of first matching base, default [1e9]
- -c specify the chromosome to output, default [all chromosomes]
- -d remove the duplicate read
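To apply the same filter to several bams, the command can be generated per file. This is a dry-run sketch: the <name>.filtered.bam output naming and the demo list are hypothetical, and only flags shown above are used.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the bamFilter command for one input BAM.
# The <name>.filtered.bam output name is a hypothetical convention.
filter_cmd() {
    local in_bam=$1
    local out_bam="${in_bam%.bam}.filtered.bam"
    echo "BamDeal modify bamFilter -i $in_bam -o $out_bam -q 15 -l 30 -d"
}

# Demo input (assumption): a list file with one BAM path per line.
list=$(mktemp)
printf '%s\n' ./A.bam ./B.bam > "$list"

# Dry run: only print the commands.
while IFS= read -r bam; do
    filter_cmd "$bam"
done < "$list"
```

Once the printed commands look right, they can be piped to sh, provided the paths contain no spaces.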
bamSplit - split a bam by chromosome
mkdir output_dir
BamDeal modify bamSplit -i in.bam -o output_dir
- -i input SAM/BAM files, delimited by space
- -l input list of SAM/BAM files
- -o output directory, default [PWD]
- -s to set the output files in SAM format, default output is in BAM format.
- -q reads with quality lower than this would be classified to unmap.bam, default [10]
- -r reset output files headers by remove the chromosomes not in the output files
bamAssign - split bams into user-specified chromosome groupings (see BamDeal modify bamAssign -h for details)
mkdir output_dir
BamDeal modify bamAssign -l list -o output_dir
- -i input SAM/BAM files, delimited by space
- -l input list of SAM/BAM/CRAM files
- -a list indicating how to assign chromosomes to outputs
- -o output directory, default [PWD]
- -q reads with quality lower than this would be classified to unmap.bam, default [10]
- -r reset output files headers by remove the chromosomes not in the output files
bamCat - merge bams
BamDeal modify bamCat -i A.bam B.bam -o merged.bam -s
- -s output sort bam file when all inputs were sorted
bamRand - randomly subsample a bam
# Sample 10 percent of reads
BamDeal modify bamRand -i in.bam -p 0.1 -o out.bam
- -p probability with which each read would be outputed, default [0.1]
- -s random seed, default [time]
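Because the seed defaults to the current time, repeated runs presumably pick different reads. A fixed -s should make the subsample repeatable; the sketch below assumes that an integer seed is accepted (the help does not state the type) and uses a made-up output name. It prints the command and runs it only if BamDeal and in.bam actually exist.

```shell
#!/usr/bin/env bash
set -euo pipefail

in_bam=in.bam   # placeholder input
seed=42         # arbitrary fixed seed (assumption: an integer is accepted)
cmd=(BamDeal modify bamRand -i "$in_bam" -p 0.1 -s "$seed" -o "out.seed${seed}.bam")

# Show the command that would be run.
printf '%s ' "${cmd[@]}"; echo

# Execute only when both the tool and the input are present.
if command -v BamDeal >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    "${cmd[@]}"
fi
```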
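bamSubChr - delete, or keep only, the chromosomes listed in a file
# Remove the chromosomes in the list
BamDeal modify bamSubChr -i in.bam -d delete.list -o out.bam -r
# Keep only the chromosomes in the list
BamDeal modify bamSubChr -i in.bam -k keep.list -o out.bam -r
# Remove unmapped reads
BamDeal modify bamSubChr -i in.bam -o out.bam -u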
- -k list of chromosomes to be kept
- -d list of chromosomes to be deleted
- -u remove unmapped reads
- -r reset output headers by remove the chr(s) not in the out files
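A keep list can be derived from the bam header rather than typed by hand. The sketch below assumes the list format is one chromosome name per line (the help does not spell this out) and works on a small made-up header of the kind `samtools view -H in.bam` prints; here everything except chrM is kept.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Demo input (assumption): SAM header lines with made-up lengths, as would
# come from `samtools view -H in.bam > header.sam`.
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:800\n@SQ\tSN:chrM\tLN:100\n' > header.sam

# Pull the SN: value out of every @SQ line, then drop chrM.
awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' header.sam \
    | grep -v '^chrM$' > keep.list

cat keep.list
```

The resulting file would then be passed with -k keep.list.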
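bamShiftQ - convert the Phred quality encoding
# Convert the read Phred qualities to ASCII+33 and write them out
BamDeal modify bamShiftQ -i in.bam -o out.bam -p 1
# Convert the read Phred qualities to ASCII+64 and write them out
BamDeal modify bamShiftQ -i in.bam -o out.bam -p 2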
- -p phred quality in output BAM, [1]: ASCII+33 or [2]: ASCII+64, default [1]
- -q the quality to filter reads, default [10]
- -l the length to filter reads, default [30]
bamLimit - split a large bam
# Split into chunks of at most 1000000 reads each
mkdir out_dir
BamDeal modify bamLimit -i in.bam -o out_dir/ -n 1000000
- -n max read number for each bam[1000000000]
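The number of output bams to expect follows from a ceiling division, assuming -n is applied as a plain per-file record cap. The total below is a hypothetical figure; for a real file it could be taken from `samtools view -c in.bam`, which counts alignment records.

```shell
#!/usr/bin/env bash
set -euo pipefail

total=2500000      # hypothetical record count, e.g. from `samtools view -c in.bam`
per_file=1000000   # the value passed to -n

# Ceiling division: a final partial chunk still needs its own file.
chunks=$(( (total + per_file - 1) / per_file ))
echo "expected output bams: $chunks"
```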
BamDeal statistics
Coverage - examine the coverage/depth/GC distribution
BamDeal statistics Coverage -i in.bam -r ref.fasta -o outprefix
- -i input SAM/BAM files, delimited by space
- -l input list of SAM/BAM files
- -o prefix of output file
- -b list of the regions of which the coverage and mean of depth would be given
- -q the quality to filter reads, default [10]
- -d Filter the duplicated read
BasesCount - depth counts for each of the four bases (A, T, G, C) at every position
# Examine three bams, keeping reads with Q10 or higher
BamDeal statistics BasesCount -i A.bam B.bam C.bam -o outprefix -q 10
DeteCNV - detect CNVs/deletions
BamDeal statistics DeteCNV -i A.bam B.bam -r ref.fasta -m 1000 -o outprefix
- -f <float> depthRatio to judge breakpoint of merge adjacent[0.45]
- -c for each chromosome, use its own mean of depth into calculation default would use the mean of depth of the whole genome
- -m <int> set the minimum length of CNV, default [1800]
- -p <float> p-value of CNV depth bias, default [0.02]
DeteSV - detect SVs using paired-end insert size information
BamDeal statistics DeteSV -i in.bam -r ref.fasta -o outprefix -m 1800
LowDepth - detect low-depth regions
BamDeal statistics LowDepth -i in.bam -o out.bed -q 10 -s 1000
- -o output bed region file
- -x set the minimum value of low depth,default[2]
- -s the length to filter short region, default [1000]
- -q ignore too low mapQ read, default [10]
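The bed output is easy to summarize afterwards, for example as the total length of low-depth sequence. The sketch below assumes out.bed uses the standard tab-separated chrom, start, end columns with half-open coordinates (so length = end - start), which is not confirmed by the help text. The three regions are made-up demo data.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo input (assumption): standard 3-column BED standing in for out.bed.
workdir=$(mktemp -d)
bed="$workdir/out.bed"
printf 'chr1\t1000\t3500\nchr1\t9000\t10200\nchr2\t500\t2500\n' > "$bed"

# Sum end - start over all regions.
total=$(awk -F'\t' '{ s += $3 - $2 } END { print s }' "$bed")
echo "total low-depth bases: $total"
```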
BamDeal visualize
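StatQC - output a quality report (requires the R packages ggplot2 and reshape *1)
BamDeal visualize StatQC -i in.bam -o outdir
# Specify multiple bams with a list
BamDeal visualize StatQC -l list -o outdir
Text files and PDFs of the analysis results are output.
DepthCov - output read depth figures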
BamDeal visualize DepthCov -i in.bam -o out
- -d depth along site in reference FASTA
- -m x-axis of the plot, default [4*meanDepth]
- -q the quality to filter reads, default [10]
- -k output the Rscript used to generate plots
DepthGC - output a read depth vs. GC plot
BamDeal visualize DepthGC -i in.bam -r ref.fasta -o outprefix -k -q 10
- -r input reference FASTA
- -f file containing depth and GC content in each window. This file is one of the output files of Bamdeal statistics Coverage.
- -w window size to calculate base frequency, default [10000]
- -q reads with quality lower than this will be filtered, default [10]
- -y maximum of y axis of the plot, default [3*mean of depth]
- -k output the Rscript used to generate plots
DepthSlide - output a Manhattan plot ordered by chromosome and position along the genome
BamDeal visualize DepthSlide -i in.bam -r ref.fasta -o outprefix
- -w <int> window size to calculate base frequency, default [10000]
- -s <float> windows sliding ratio (0,1], default [1]
- -q <int> reads with quality lower than this will be filtered, default [10]
- -c <str> chromosome(s) to draw, delimited by comma. default [all chromosomes]
- -y <int> maximum of y axis of the plot, default [4*mean of depth]
- -k output the Rscript used to generate plots
Reference
https://github.com/BGI-shenzhen/BamDeal
*1
In the R console:
> install.packages("ggplot2")
> install.packages("reshape")
> install.packages("ggExtra")
I learned about this tool from a tweet by Naito-sensei. Thank you.