2021/01/12: Added help output and an analysis example
2022/04/19: Added tweet
We have improved RATTLE to be able to process >10M Nanopore reads https://t.co/xm90pLT8eB
— Eduardo Eyras (@EduEyras) April 19, 2022
git clone --recurse-submodules https://github.com/comprna/RATTLE
$ ./rattle
Run with mode: ./rattle <cluster|cluster_summary|extract_clusters|correct|polish>
# rattle cluster -h
-h, --help
shows this help message
-i, --input
input fasta/fastq file (required)
--fastq
whether input and output should be in fastq format (instead of fasta)
-o, --output
output folder (default: .)
-t, --threads
number of threads to use (default: 1)
-k, --kmer-size
k-mer size for gene clustering (default: 10)
-s, --score-threshold
minimum score for two reads to be in the same gene cluster (default: 0.2)
-v, --max-variance
max allowed variance for two reads to be in the same gene cluster (default: 1000000)
--iso
perform clustering at the isoform level
--iso-kmer-size
k-mer size for isoform clustering (default: 11)
--iso-score-threshold
minimum score for two reads to be in the same isoform cluster (default: 0.3)
--iso-max-variance
max allowed variance for two reads to be in the same isoform cluster (default: 25)
-B, --bv-start-threshold
starting threshold for the bitvector k-mer comparison (default: 0.4)
-b, --bv-end-threshold
ending threshold for the bitvector k-mer comparison (default: 0.2)
-f, --bv-falloff
falloff value for the bitvector threshold for each iteration (default: 0.05)
-r, --min-reads-cluster
minimum number of reads per cluster (default: 0)
-p, --repr-percentile
cluster representative percentile (default: 0.15)
--rna
use this mode if data is direct RNA (disables checking both strands)
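The three `--bv-*` options are easiest to read as a schedule. As a rough sketch, assuming the threshold starts at `--bv-start-threshold` and is lowered by `--bv-falloff` once per iteration until it reaches `--bv-end-threshold` (this is my reading of the help text, not something taken from the RATTLE source), the defaults would give the sequence printed below. `bv_schedule` is a hypothetical helper name, not a RATTLE command.

```shell
# Hypothetical helper, not part of RATTLE: print the bitvector thresholds
# implied by a start value, an end value and a per-iteration falloff.
# Assumption: the threshold drops by exactly <falloff> each iteration.
bv_schedule() {
    awk -v start="$1" -v end="$2" -v falloff="$3" 'BEGIN {
        # Round the step count so floating-point noise cannot drop the last step.
        steps = int((start - end) / falloff + 0.5)
        for (i = 0; i <= steps; i++)
            printf "%.2f\n", start - i * falloff
    }'
}

# Defaults from the help text above: -B 0.4, -b 0.2, -f 0.05
bv_schedule 0.4 0.2 0.05
```

The values this prints for the defaults can be compared with the `Iteration ... complete` lines in the run log later in this post. The final `Iteration 0` step in that log is not modelled by this sketch.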
# rattle cluster_summary -h
-h, --help
shows this help message
-i, --input
input fasta/fastq file (required)
-c, --clusters
clusters file (required)
--fastq
whether input and output should be in fastq format (instead of fasta)
# rattle extract_clusters -h
-h, --help
shows this help message
-i, --input
input fasta/fastq file (required)
-c, --clusters
clusters file (required)
-o, --output-folder
output folder for fastx files (default: .)
-m, --min-reads
min reads per cluster to save it into a file
--fastq
whether input and output should be in fastq format (instead of fasta)
# rattle correct -h
-h, --help
shows this help message
-i, --input
input fasta/fastq file (required)
-c, --clusters
clusters file (required)
-o, --output
output folder (default: .)
-g, --gap-occ
gap-occ (default: 0.3)
-m, --min-occ
min-occ (default: 0.3)
-s, --split
split clusters into sub-clusters of size s for msa (default: 200)
-r, --min-reads
min reads to correct/output consensus for a cluster (default: 5)
-t, --threads
number of threads to use (default: 1)
# rattle polish -h
-h, --help
shows this help message
-i, --input
input RATTLE consensi fasta/fastq file (required)
-o, --output-folder
output folder for fastx files (default: .)
-t, --threads
number of threads to use (default: 1)
--rna
use this mode if data is direct RNA (disables checking both strands)
rattle cluster -i reads.fq -t 24 --fastq -o clusters
The run completed when I used a different set of ONT RNA-seq reads.
Reading fasta file... Done
[================================================================================] 98056/98056 (100%)
[================================================================================] 26082/26082 (100%)
Iteration 0.35 complete
[================================================================================] 14039/14039 (100%)
Iteration 0.3 complete
[================================================================================] 7434/7434 (100%)
Iteration 0.25 complete
[================================================================================] 4173/4173 (100%)
Iteration 0.2 complete
[================================================================================] 2651/2651 (100%)
Iteration 0 complete
Gene clustering done
1684 gene clusters found
[================================================================================] 1684/1684 (100%)
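The total in the first progress bar (98056 here) presumably corresponds to the number of reads in the input. A quick way to check that against the file itself, assuming an uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines), is sketched below. `count_fastq_reads` is a hypothetical helper name.

```shell
# Hypothetical helper: count records in an uncompressed FASTQ file.
# Assumption: every record is exactly four lines
# (header, sequence, "+", qualities).
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Example: count_fastq_reads reads.fq
```

If the number differs from the progress-bar total, the input may contain wrapped records or be truncated, so it is worth checking before clustering.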
Create a summary (a CSV of read_id,cluster_id pairs)
rattle cluster_summary -c clusters.out -i reads.fq --fastq > summary
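Since the summary is just read_id,cluster_id pairs, cluster sizes can be tallied with standard tools. The sketch below assumes the file has no header line and that the cluster ID is the last comma-separated field on each line. `cluster_sizes` is a hypothetical helper name, not a RATTLE subcommand.

```shell
# Hypothetical helper: tally reads per cluster from the cluster_summary CSV.
# Assumptions: no header line, and the cluster ID is the last
# comma-separated field on every line.
cluster_sizes() {
    awk -F, '{ n[$NF]++ } END { for (c in n) print c "," n[c] }' "$1" |
        # Largest clusters first; ties broken by ascending cluster ID.
        sort -t, -k2,2nr -k1,1n
}

# Example: cluster_sizes summary | head
```

Each output line is `cluster_id,read_count`. This is handy for judging how many clusters would pass the `-r, --min-reads` cutoff of `rattle correct` (default: 5 according to the help text above).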
Reference-free reconstruction and quantification of transcriptomes from Nanopore long-read sequencing
Ivan de la Rubia, Joel A. Indi, Silvia Carbonell-Sala, Julien Lagarde, M Mar Albà, Eduardo Eyras
bioRxiv, Posted July 30, 2020
RATTLE: reference-free reconstruction and quantification of transcriptomes from Nanopore sequencing
Ivan de la Rubia, Akanksha Srivastava, Wenjing Xue, Joel A. Indi, Silvia Carbonell-Sala, Julien Lagarde, M. Mar Albà & Eduardo Eyras
Genome Biology volume 23, Article number: 153 (2022)