Installation
I installed it on CentOS.
https://github.com/mourisl/Lighter
git clone https://github.com/mourisl/Lighter.git
cd Lighter/
make
./lighter # check that it runs
$ lighter
Usage: ./lighter [OPTIONS]
OPTIONS:
Required parameters:
-r seq_file: seq_file is the path to the sequence file. Can use multiple -r to specifiy multiple sequence files
The file can be fasta and fastq, and can be gzip'ed with extension *.gz.
When the input file is *.gz, the corresponding output file will also be gzip'ed.
-k kmer_length genome_size alpha: (see README for information on setting alpha)
or
-K kmer_length genom_size: in this case, the genome size should be relative accurate.
Other parameters:
-od output_file_directory: (default: ./)
-t num_of_threads: number of threads to use (default: 1)
-maxcor INT: the maximum number of corrections within a 20bp window (default: 4)
-trim: allow trimming (default: false)
-discard: discard unfixable reads. Will LOSE paired-end matching when discarding (default: false)
-noQual: ignore the quality socre (default: false)
-newQual ascii_quality_score: set the quality for the bases corrected to the specified score (default: not used)
-saveTrustedKmers file: save the trusted kmers to specified file then stop (default: not used)
-loadTrustedKmers file: directly get solid kmers from specified file (default: not used)
-zlib compress_level: set the compression level(0-9) of gzip (default: 1)
-h: print the help message and quit
-v: print the version information and quit
Move lighter into a directory that is on your PATH. If you are installing on a Mac, use the conda package the author provides. It can apparently also be installed with brew.
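As a minimal sketch of the "move it onto your PATH" step: the snippet below copies the freshly built binary into a personal bin directory and prepends that directory to PATH. The location `$HOME/bin` and the variable name `install_dir` are my own assumptions for illustration, not something Lighter requires; any directory already on your PATH works just as well.

```shell
# Hypothetical install location; override by setting INSTALL_DIR beforehand.
install_dir="${INSTALL_DIR:-$HOME/bin}"
mkdir -p "$install_dir"

# Copy the binary only if we are inside the Lighter build directory.
if [ -x ./lighter ]; then
    cp ./lighter "$install_dir"/
fi

# Make the directory visible to the current shell session.
export PATH="$install_dir:$PATH"

# Report where lighter resolves from, or say that it is not found.
command -v lighter || echo "lighter not found on PATH"
```

To make the PATH change permanent, the `export` line would go into your shell startup file (for example `~/.bashrc`).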
Usage
Run error correction with the genome size specified. Single-end:
lighter -r single.fq -k 17 5000000 0.1 -t 12
- -t num_of_threads: number of threads to use (default: 1)
- -od output_file_directory: (default: ./)
- -k kmer_length genome_size alpha: (see README for information on setting alpha)
- -r seq_file: seq_file is the path to the sequence file. Can use multiple -r to specify multiple sequence files. The file can be fasta and fastq, and can be gzip'ed with extension *.gz. When the input file is *.gz, the corresponding output file will also be gzip'ed.
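The help text defers to the README for how to choose alpha. As far as I recall, the README gives the rule of thumb alpha = 7/C, where C is the average coverage of the data set. The sketch below works that out in the shell, estimating C as total sequenced bases divided by genome size. The function name `estimate_alpha` and the input numbers are hypothetical; check the README before relying on the formula.

```shell
# Rule of thumb (as I recall it from the README): alpha = 7 / C,
# where C = total sequenced bases / genome size.
estimate_alpha() {
    # $1 = total sequenced bases, $2 = genome size in bases
    awk -v bases="$1" -v genome="$2" \
        'BEGIN { printf "%.3f\n", 7 / (bases / genome) }'
}

# Example: 350 Mb of reads on a 5 Mb genome is 70x coverage.
estimate_alpha 350000000 5000000
```

With these made-up numbers the coverage is 70x, so the result matches the alpha of 0.1 used in the commands in this post.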
Paired-end:
lighter -r left.fq -r right.fq -k 17 5000000 0.1 -t 12
k=17 is apparently not always the best choice; there are reports that k=13, 15, or 19 produced better assemblies (ref. 1).
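To compare several k values, one option is to run the same paired-end command in a loop and send each run to its own output directory with -od. This is only a sketch: the directory names `cor_k13` and so on are my own convention, and the genome size, alpha, and thread count are copied from the example above. When lighter is not on PATH, the loop prints the commands instead of running them.

```shell
genome_size=5000000
alpha=0.1
threads=12

# Fall back to printing the commands (dry run) when lighter is not installed.
if command -v lighter >/dev/null 2>&1; then
    runner=""
else
    runner="echo"
fi

for k in 13 15 17 19; do
    outdir="cor_k${k}"          # hypothetical per-k output directory
    mkdir -p "$outdir"
    $runner lighter -r left.fq -r right.fq \
        -k "$k" "$genome_size" "$alpha" -t "$threads" -od "$outdir"
done
```

Each output directory can then be assembled separately and the results compared.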
References
Lighter: fast and memory-efficient sequencing error correction without counting
Li Song, Liliana Florea and Ben Langmead
Genome Biol. 2014;15(11):509.
ref.1