macでインフォマティクス

A collection of notes on informatics for HTS (NGS).

nf-core/mag: a best-practice pipeline for metagenome hybrid assembly and binning

2023/03/02: paper citation added

 

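 The analysis of shotgun metagenomic data provides valuable insights into microbial communities while allowing resolution at the level of individual genomes. When complete reference genomes are not available, however, metagenome-assembled genomes (MAGs) have to be reconstructed from the sequencing reads. This study presents the nf-core/mag pipeline for metagenome assembly, binning, and taxonomic classification. nf-core/mag can combine short and long reads to increase assembly contiguity, and it can use sample-wise group information for co-assembly and genome binning. The pipeline is easy to install, provides all dependencies within containers, and is portable and reproducible. It is written in Nextflow and developed as part of the nf-core initiative for best-practice pipeline development. All code is hosted on GitHub under the nf-core organization (https://github.com/nf-core/mag) and released under the MIT license.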

 

usage

https://nf-co.re/mag/usage

 

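From GitHub

By default, the pipeline runs the following analyses. Both short and long reads are supported.

1. Trim adapters and low-quality bases from the reads with fastp and Porechop, and run basic QC with FastQC.

2. Assign taxonomy to the reads with Centrifuge and/or Kraken2.

3. Assemble with MEGAHIT and SPAdes, and check assembly quality with Quast.

4. Bin with MetaBAT2, and check the quality of the genome bins with Busco.

5. Assign taxonomy to the bins with GTDB-Tk and/or CAT.

6. Write reports to the specified results directory, including a MultiQC report that summarizes some of the results and the software versions.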

 

2023/03/02

 

Installation

Dependencies

  • Nextflow (>=21.04.0)
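
Whether an installed Nextflow meets this minimum can be checked with a short shell sketch. The function names `version_ge` and `extract_version` are made up for illustration, and the example parses the banner line from the help output in this post rather than calling Nextflow itself. It assumes a `sort` that supports `-V` (version sort).

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Pull "X.Y.Z" out of text that contains "version X.Y.Z", such as the banner
# line " N E X T F L O W   ~  version 25.04.7" shown in the help output.
extract_version() {
  sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

required="21.04.0"   # minimum listed under Dependencies
installed="$(printf '%s\n' ' N E X T F L O W   ~  version 25.04.7' | extract_version)"

if version_ge "$installed" "$required"; then
  echo "Nextflow $installed satisfies >= $required"
else
  echo "Nextflow $installed is older than $required" >&2
fi
```

On a real system the banner string would be replaced by the output of `nextflow -version` piped into `extract_version`, assuming that output contains the text "version X.Y.Z".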

GitHub

help

> nextflow run nf-core/mag --help --show_hidden

 

 N E X T F L O W   ~  version 25.04.7

 

Launching `https://github.com/nf-core/mag` [elegant_tuckerman] DSL2 - revision: 7ffd8b8c65 [main]

 

 

------------------------------------------------------

                                        ,--./,-.

        ___     __   __   __   ___     /,-._.--~'

  |\ | |__  __ /  ` /  \ |__) |__         }  {

  | \| |       \__, \__/ |  \ |___     \`-._,-`-,

                                        `._,._,'

  nf-core/mag 4.0.0

------------------------------------------------------

Typical pipeline command:

 

  nextflow run nf-core/mag -profile <docker/singularity/.../institute> --input samplesheet.csv --outdir <OUTDIR>

 

--help                                   [boolean, string] Show the help message for all top level parameters. When a parameter is given to `--help`, the full help message of that parameter will be printed. 

--help_full                              [boolean]         Show the help message for all non-hidden parameters. 

--show_hidden                            [boolean]         Show all hidden parameters in the help message. This needs to be used in combination with `--help` or `--help_full`. 

 

Input/output options

  --input                                [string]  CSV samplesheet file containing information about the samples in the experiment. 

  --single_end                           [boolean] Specifies that the input is single-end reads. 

  --assembly_input                       [string]  Additional input CSV samplesheet containing information about pre-computed assemblies. When set, both read pre-processing and assembly are skipped and the pipeline begins at the binning stage. 

  --outdir                               [string]  The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. 

  --email                                [string]  Email address for completion summary. 

  --multiqc_title                        [string]  MultiQC report title. Printed as page header, used for filename if not otherwise specified. 

 

Reference genome options

  --igenomes_ignore                      [boolean] Do not load the iGenomes reference config. 

  --igenomes_base                        [string]  The base path to the igenomes reference files [default: s3://ngi-igenomes/igenomes/] 

 

Institutional config options

  --custom_config_version                [string] Git commit id for Institutional configs. [default: master] 

  --custom_config_base                   [string] Base directory for Institutional configs. [default: https://raw.githubusercontent.com/nf-core/configs/master] 

  --config_profile_name                  [string] Institutional config name. 

  --config_profile_description           [string] Institutional config description. 

  --config_profile_contact               [string] Institutional config contact information. 

  --config_profile_url                   [string] Institutional config URL link. 

 

Generic options

  --version                              [boolean] Display version and exit. 

  --publish_dir_mode                     [string]  Method used to save pipeline results to output directory.  (accepted: symlink, rellink, link, copy, copyNoFollow, move) [default: copy] 

  --monochrome_logs                      [boolean] Use monochrome_logs 

  --email_on_fail                        [string]  Email address for completion summary, only when pipeline fails. 

  --plaintext_email                      [boolean] Send plain-text email instead of HTML. 

  --max_multiqc_email_size               [string]  File size limit when attaching MultiQC reports to summary emails. [default: 25.MB] 

  --hook_url                             [string]  Incoming hook URL for messaging service 

  --multiqc_config                       [string]  Custom config file to supply to MultiQC. 

  --multiqc_logo                         [string]  Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file 

  --multiqc_methods_description          [string]  Custom MultiQC yaml file containing HTML including a methods description. 

  --validate_params                      [boolean] Boolean whether to validate parameters against the schema at runtime [default: true] 

  --pipelines_testdata_base_path         [string]  Base URL or local path to location of pipeline test dataset files [default: https://raw.githubusercontent.com/nf-core/test-datasets/] 

  --trace_report_suffix                  [string]  Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss. 

 

Reproducibility options

  --megahit_fix_cpu_1                    [boolean] Fix number of CPUs for MEGAHIT to 1. Not increased with retries. 

  --spades_fix_cpus                      [integer] Fix number of CPUs used by SPAdes. Not increased with retries. [default: -1] 

  --spadeshybrid_fix_cpus                [integer] Fix number of CPUs used by SPAdes hybrid. Not increased with retries. [default: -1] 

  --metabat_rng_seed                     [integer] RNG seed for MetaBAT2. [default: 1] 

 

Quality control for short reads options

  --clip_tool                            [string]  Specify which adapter clipping tool to use.  (accepted: fastp, adapterremoval, trimmomatic) [default: fastp] 

  --save_clipped_reads                   [boolean] Specify to save the resulting clipped FASTQ files to --outdir. 

  --reads_minlength                      [integer] The minimum length of reads must have to be retained for downstream analysis. [default: 15] 

  --fastp_qualified_quality              [integer] Minimum phred quality value of a base to be qualified in fastp. [default: 15] 

  --fastp_cut_mean_quality               [integer] The mean quality requirement used for per read sliding window cutting by fastp. [default: 15] 

  --fastp_save_trimmed_fail              [boolean] Save reads that fail fastp filtering in a separate file. Not used downstream. 

  --fastp_trim_polyg                     [boolean] Turn on detecting and trimming of poly-G tails 

  --adapterremoval_minquality            [integer] The minimum base quality for low-quality base trimming by AdapterRemoval. [default: 2] 

  --adapterremoval_trim_quality_stretch  [boolean] Turn on quality trimming by consecutive stretch of low quality bases, rather than by window. 

  --adapterremoval_adapter1              [string]  Forward read adapter to be trimmed by AdapterRemoval. [default: AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG] 

  --adapterremoval_adapter2              [string]  Reverse read adapter to be trimmed by AdapterRemoval for paired end data. [default: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT] 

  --host_genome                          [string]  Name of iGenomes reference for host contamination removal. 

  --host_fasta                           [string]  Fasta reference file for host contamination removal. 

  --host_fasta_bowtie2index              [string]  Bowtie2 index directory corresponding to `--host_fasta` reference file for host contamination removal. 

  --host_removal_verysensitive           [boolean] Use the `--very-sensitive` instead of the`--sensitive`setting for Bowtie 2 to map reads against the host genome. 

  --host_removal_save_ids                [boolean] Save the read IDs of removed host reads. 

  --save_hostremoved_reads               [boolean] Specify to save input FASTQ files with host reads removed to --outdir. 

  --keep_phix                            [boolean] Keep reads similar to the Illumina internal standard PhiX genome. 

  --phix_reference                       [string]  Genome reference used to remove Illumina PhiX contaminant reads. [default: ${baseDir}/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz] 

  --skip_clipping                        [boolean] Skip read preprocessing using fastp or adapterremoval. 

  --save_phixremoved_reads               [boolean] Specify to save input FASTQ files with phiX reads removed to --outdir. 

  --bbnorm                               [boolean] Run BBnorm to normalize sequence depth. 

  --bbnorm_target                        [integer] Set BBnorm target maximum depth to this number. [default: 100] 

  --bbnorm_min                           [integer] Set BBnorm minimum depth to this number. [default: 5] 

  --save_bbnorm_reads                    [boolean] Save normalized read files to output directory. 

 

Quality control for long reads options

  --skip_adapter_trimming                [boolean] Skip removing adapter sequences from long reads. 

  --longreads_min_length                 [integer] Discard any read which is shorter than this value. [default: 1000] 

  --longreads_min_quality                [integer] Discard any read which has a mean quality score lower than this value. 

  --longreads_keep_percent               [integer] Keep this percent of bases. [default: 90] 

  --longreads_length_weight              [integer] The higher the more important is read length when choosing the best reads. [default: 10] 

  --keep_lambda                          [boolean] Keep reads similar to the ONT internal standard Escherichia virus Lambda genome. 

  --lambda_reference                     [string]  Genome reference used to remove ONT Lambda contaminant reads. [default: ${baseDir}/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz] 

  --save_lambdaremoved_reads             [boolean] Specify to save input FASTQ files with lamba reads removed  to --outdir. 

  --save_porechop_reads                  [boolean] Specify to save the resulting clipped FASTQ files to --outdir. 

  --save_filtered_longreads              [boolean] Specify to save the resulting length filtered long read FASTQ files to --outdir. 

  --longread_adaptertrimming_tool        [string]  Specify which long read adapter trimming tool to use.  (accepted: porechop, porechop_abi) [default: porechop_abi] 

  --longread_filtering_tool              [string]  Specify which long read filtering tool to use.  (accepted: filtlong, nanoq, chopper) [default: filtlong] 

 

Taxonomic profiling options

  --centrifuge_db                        [string]  Database for taxonomic binning with centrifuge. 

  --kraken2_db                           [string]  Database for taxonomic binning with kraken2. 

  --krona_db                             [string]  Database for taxonomic binning with krona 

  --skip_krona                           [boolean] Skip creating a krona plot for taxonomic binning. 

  --cat_db                               [string]  Database for taxonomic classification of metagenome assembled genomes. Can be either a zipped file or a directory containing the extracted output of such. 

  --cat_db_generate                      [boolean] Generate CAT database. 

  --save_cat_db                          [boolean] Save the CAT database generated when specified by `--cat_db_generate`. 

  --cat_official_taxonomy                [boolean] Only return official taxonomic ranks (Kingdom, Phylum, etc.) when running CAT. 

  --skip_gtdbtk                          [boolean] Skip the running of GTDB, as well as the automatic download of the database 

  --gtdb_db                              [string]  Specify the location of a GTDBTK database. Can be either an uncompressed directory or a `.tar.gz` archive. If not specified will be downloaded for you when GTDBTK or binning QC is not skipped. [default: https://data.gtdb.ecogenomic.org/releases/release220/220.0/auxillary_files/gtdbtk_package/full_package/gtdbtk_r220_data.tar.gz]

  --gtdb_mash                            [string]  Specify the location of a GTDBTK mash database. If missing, GTDB-Tk will skip the ani_screening step 

  --gtdbtk_min_completeness              [number]  Min. bin completeness (in %) required to apply GTDB-tk classification. [default: 50] 

  --gtdbtk_max_contamination             [number]  Max. bin contamination (in %) allowed to apply GTDB-tk classification. [default: 10] 

  --gtdbtk_min_perc_aa                   [number]  Min. fraction of AA (in %) in the MSA for bins to be kept. [default: 10] 

  --gtdbtk_min_af                        [number]  Min. alignment fraction to consider closest genome. [default: 0.65] 

  --gtdbtk_pplacer_cpus                  [integer] Number of CPUs used for the by GTDB-Tk run tool pplacer. [default: 1] 

  --gtdbtk_pplacer_useram                [boolean] Speed up pplacer step of GTDB-Tk by loading to memory. 

 

Assembly options

  --coassemble_group                     [boolean] Co-assemble samples within one group, instead of assembling each sample separately. 

  --spades_options                       [string]  Additional custom options for SPAdes and SPAdesHybrid. Do not specify `--meta` as this will be added for you! 

  --spades_downstreaminput               [string]  Specify whether to use contigs or scaffolds assembled by SPAdes  (accepted: scaffolds, contigs) [default: scaffolds] 

  --megahit_options                      [string]  Additional custom options for MEGAHIT. 

  --skip_spades                          [boolean] Skip Illumina-only SPAdes assembly. 

  --skip_spadeshybrid                    [boolean] Skip SPAdes hybrid assembly. 

  --skip_megahit                         [boolean] Skip MEGAHIT assembly. 

  --skip_quast                           [boolean] Skip metaQUAST. 

 

Gene prediction and annotation options

  --skip_prodigal                        [boolean] Skip Prodigal gene prediction 

  --prokka_with_compliance               [boolean] Turn on Prokka complicance mode for truncating contig names for NCBI/ENA compatibility. 

  --prokka_compliance_centre             [string]  Specify sequencing centre name required for Prokka's compliance mode. 

  --skip_prokka                          [boolean] Skip Prokka genome annotation. 

  --skip_metaeuk                         [boolean] Skip MetaEuk gene prediction and annotation 

  --metaeuk_mmseqs_db                    [string]  A string containing the name of one of the databases listed in the [mmseqs2 documentation](https://github.com/soedinglab/MMseqs2/wiki#downloading-databases). This database will be downloaded and formatted for eukaryotic genome annotation. Incompatible with --metaeuk_db.  (accepted: UniRef100, UniRef90, UniRef50, UniProtKB, UniProtKB/TrEMBL, UniProtKB/Swiss-Prot, NR, NT, GTDB, PDB, PDB70, Pfam-A.full, Pfam-A.seed, Pfam-B, CDD, eggNOG, VOGDB, dbCAN2, SILVA, Resfinder, Kalamari)

  --metaeuk_db                           [string]  Path to either a local fasta file of protein sequences, or to a directory containing an MMseqs2-formatted database, for annotation of eukaryotic genomes. 

  --save_mmseqs_db                       [boolean] Save the downloaded mmseqs2 database specified in `--metaeuk_mmseqs_db`. 

 

Virus identification options

  --run_virus_identification             [boolean] Run virus identification. 

  --genomad_db                           [string]  Database for virus classification with geNomad 

  --genomad_min_score                    [number]  Minimum geNomad score for a sequence to be considered viral [default: 0.7] 

  --genomad_splits                       [integer] Number of groups that geNomad's MMSeqs2 databse should be split into (reduced memory requirements) [default: 1] 

 

Binning options

  --binning_map_mode                     [string]  Defines mapping strategy to compute co-abundances for binning, i.e. which samples will be mapped against the assembly.  (accepted: all, group, own) [default: group] 

  --skip_binning                         [boolean] Skip metagenome binning entirely 

  --skip_metabat2                        [boolean] Skip MetaBAT2 Binning 

  --skip_maxbin2                         [boolean] Skip MaxBin2 Binning 

  --skip_concoct                         [boolean] Skip CONCOCT Binning 

  --min_contig_size                      [integer] Minimum contig size to be considered for binning and for bin quality check. [default: 1500] 

  --min_length_unbinned_contigs          [integer] Minimal length of contigs that are not part of any bin but treated as individual genome. [default: 1000000] 

  --max_unbinned_contigs                 [integer] Maximal number of contigs that are not part of any bin but treated as individual genome. [default: 100] 

  --bin_min_size                         [integer] Specify the shortest length a bin should be to retain for downstream processing (in base pairs) [default: 0] 

  --bin_max_size                         [integer] Specify the longest length a bin should be to retain for downstream processing (in base pairs). By default no limit. 

  --bin_concoct_chunksize                [integer] Specify length of sub-contigs cut up prior CONCOCT binning [default: 10000] 

  --bin_concoct_overlap                  [integer] Specify the overlap between each sub-contig prior CONCOCT binning [default: 0] 

  --bin_concoct_donotconcatlast          [boolean] Specify to not append the last contig less than sub-contig length to the last correct length contig 

  --bowtie2_mode                         [string]  Specify alternative Bowtie2 settings for aligning reads back against the assembly. 

  --save_assembly_mapped_reads           [boolean] Save the output of mapping raw reads back to assembled contigs 

  --bin_domain_classification            [boolean] Enable domain-level (prokaryote or eukaryote) classification of bins using Tiara. Processes which are domain-specific will then only receive bins matching the domain requirement. 

  --bin_domain_classification_tool       [string]  Specify which tool to use for domain classification of bins. Currently only 'tiara' is implemented. [default: tiara] 

  --tiara_min_length                     [integer] Minimum contig length for Tiara to use for domain classification. For accurate classification, should be longer than 3000 bp. [default: 3000] 

  --exclude_unbins_from_postbinning      [boolean] Exclude unbinned contigs in the post-binning steps (bin QC, taxonomic classification, and annotation steps). 

 

Bin quality check options

  --skip_binqc                           [boolean] Disable bin QC with BUSCO, CheckM or CheckM2. 

  --binqc_tool                           [string]  Specify which tool for bin quality-control validation to use.  (accepted: busco, checkm, checkm2) [default: busco] 

  --busco_db                             [string]  Download URL, local tar.gz archive, or local uncompressed directory for an *_odb10 or *_odb12 BUSCO lineage dataset. 

  --busco_db_lineage                     [string]  Name of the BUSCO *_odb10 or *_odb12 lineage to check against. Additionally supports 'auto', 'auto_prok' and 'auto_euk' for automatic lineage selection mode. [default: auto] 

  --save_busco_db                        [boolean] Save the used BUSCO lineage datasets provided via `--busco_db`. 

  --busco_clean                          [boolean] Enable clean-up of temporary files created during BUSCO runs. 

  --checkm_download_url                  [string]  URL pointing to checkM database for auto download, if local path not supplied. [default: https://zenodo.org/records/7401545/files/checkm_data_2015_01_16.tar.gz] 

  --checkm_db                            [string]  Path to local folder containing already downloaded and uncompressed CheckM database. 

  --save_checkm_data                     [boolean] Save the used CheckM reference files downloaded when not using --checkm_db parameter. 

  --checkm2_db                           [string]  Path to local file of an already downloaded and uncompressed CheckM2 database (.dmnd file). 

  --checkm2_db_version                   [integer] CheckM2 database version number to download (Zenodo record ID, for reference check the canonical reference https://zenodo.org/records/5571251, and pick the Zenodo ID of the database version of your choice). [default: 14897628] 

  --save_checkm2_data                    [boolean] Save the used CheckM2 reference files downloaded when not using --checkm2_db parameter. 

  --refine_bins_dastool                  [boolean] Turn on bin refinement using DAS Tool. 

  --refine_bins_dastool_threshold        [number]  Specify single-copy gene score threshold for bin refinement. [default: 0.5] 

  --postbinning_input                    [string]  Specify which binning output is sent for downstream annotation, taxonomic classification, bin quality control etc.  (accepted: raw_bins_only, refined_bins_only, both) [default: raw_bins_only] 

  --run_gunc                             [boolean] Turn on GUNC genome chimerism checks 

  --gunc_db                              [string]  Specify a path to a pre-downloaded GUNC dmnd database file 

  --gunc_database_type                   [string]  Specify which database to auto-download if not supplying own  (accepted: progenomes, gtdb) [default: progenomes] 

  --gunc_save_db                         [boolean] Save the used GUNC reference files downloaded when not using --gunc_db parameter. 

 

Ancient DNA assembly

  --ancient_dna                          [boolean] Turn on/off the ancient DNA subworfklow 

  --pydamage_accuracy                    [number]  PyDamage accuracy threshold [default: 0.5] 

  --skip_ancient_damagecorrection        [boolean] deactivate damage correction of ancient contigs using variant and consensus calling 

  --freebayes_ploidy                     [integer] Ploidy for variant calling [default: 1] 

  --freebayes_min_basequality            [integer] minimum base quality required for variant calling [default: 20] 

  --freebayes_minallelefreq              [number]  minimum minor allele frequency for considering variants [default: 0.33] 

  --bcftools_view_high_variant_quality   [integer] minimum genotype quality for considering a variant high quality [default: 30] 

  --bcftools_view_medium_variant_quality [integer] minimum genotype quality for considering a variant medium quality [default: 20] 

  --bcftools_view_minimal_allelesupport  [integer] minimum number of bases supporting the alternative allele [default: 3] 

 

------------------------------------------------------

 

* The pipeline

    https://doi.org/10.1093/nargab/lqac007

 

* The nf-core framework

    https://doi.org/10.1038/s41587-020-0439-x

 

* Software dependencies

    https://github.com/nf-core/mag/blob/main/CITATIONS.md

 

Test run

Supported execution profiles include conda, Docker, Singularity, Shifter, Podman (a Docker-compatible container engine), and Charliecloud.

#docker
nextflow run nf-core/mag -profile test,docker --outdir <OUTDIR>

#conda
nextflow run nf-core/mag -profile test,conda --outdir <OUTDIR>

The processes run one after another. Even the test run takes a fair amount of time.

 

Output

(screenshot)

Taxonomy

(screenshot)

Assembly

(screenshot)

Genome Binning

(screenshot)

MEGAHIT-test_minigut-binDepths.heatmap.png

(screenshot)

SPAdes-test_minigut-binDepths.heatmap.png

(screenshot)

Genome Binning/QC

(screenshot)

multiqc

(screenshot)

 

 

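For a real run, specify a profile together with either a glob pattern for the FASTQ files or a CSV file that lists the sample names and FASTQ paths. The 4.0.0 help output shown above describes --input only as a CSV samplesheet and includes --outdir in the typical command, so the glob form may work only with older releases.

#docker (FASTQ glob)
nextflow run nf-core/mag -profile docker --input '*_R{1,2}.fastq.gz'

#samplesheet.csv
nextflow run nf-core/mag -profile docker --input samplesheet.csv --outdir <OUTDIR>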

The samplesheet holds up to five comma-separated columns. The header must be sample,group,short_reads_1,short_reads_2,long_reads.

sample,group,short_reads_1,short_reads_2,long_reads
sample1,0,data/sample1_R1.fastq.gz,data/sample1_R2.fastq.gz,data/sample1.fastq.gz
sample2,0,data/sample2_R1.fastq.gz,data/sample2_R2.fastq.gz,data/sample2.fastq.gz
sample3,1,data/sample3_R1.fastq.gz,data/sample3_R2.fastq.gz,
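
A samplesheet in this layout can be generated from a directory of paired-end files instead of being typed by hand. The following is only a sketch: the function name `make_samplesheet` and the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming scheme are assumptions, every sample is placed in group 0, and the long_reads column is left empty.

```shell
# Print a samplesheet for the paired-end short reads found in directory $1.
# Assumptions (hypothetical layout): files are named <sample>_R1.fastq.gz and
# <sample>_R2.fastq.gz, all samples belong to group 0, and there are no long reads.
make_samplesheet() (
  dir="$1"
  echo "sample,group,short_reads_1,short_reads_2,long_reads"
  for r1 in "$dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                 # the glob matched nothing
    r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $r1: no matching R2 file" >&2
      continue
    fi
    sample="$(basename "$r1" _R1.fastq.gz)"
    echo "${sample},0,${r1},${r2},"          # trailing comma = empty long_reads
  done
)
```

For example, `make_samplesheet data > samplesheet.csv` writes one row per sample that has both mates in `data/`. The group column then needs manual editing if the samples belong to different groups.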

 

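Sample IDs must be unique. The group information in the second column is used only to compute co-abundances in the binning step, not for co-assembly. To co-assemble, use the --coassemble_group option. The FASTQ files given from the third column onward must be compressed (.fastq.gz, .fq.gz). Long reads can only be supplied together with paired-end short reads. Single-end and paired-end rows cannot be mixed within one samplesheet. When supplying single-end reads, also pass the command-line parameter --single_end.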
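
The rules above can be checked before launching a run. This is a rough sketch using awk, not the pipeline's own validation: the function name `check_samplesheet` is made up, and it assumes the five-column layout described in this post with no quoted fields or commas inside paths.

```shell
# Check a five-column samplesheet ($1) against the rules described above.
# Prints one message per problem and returns non-zero if any were found.
check_samplesheet() {
  awk -F',' '
    NR == 1 { next }                       # header row
    $0 ~ /^[ \t\r]*$/ { next }             # blank lines
    {
      if (seen[$1]++) { print "duplicate sample ID: " $1; bad = 1 }
      for (i = 3; i <= 5; i++) {
        if ($i != "" && $i !~ /\.(fastq|fq)\.gz$/) {
          print "not a .fastq.gz/.fq.gz file: " $i; bad = 1
        }
      }
      if ($5 != "" && $4 == "") {
        print "long reads without paired-end short reads: " $1; bad = 1
      }
      if ($4 == "") single++; else paired++
    }
    END {
      if (single && paired) { print "single-end and paired-end rows are mixed"; bad = 1 }
      exit (bad ? 1 : 0)
    }
  ' "$1"
}
```

Running `check_samplesheet samplesheet.csv && echo OK` on the example sheet above should report no problems, since all three rows are paired-end, gzipped, and uniquely named.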

 

Single-end reads, assembly with MEGAHIT only, and unneeded steps skipped:

nextflow run nf-core/mag \
  -profile docker \
  --input samplesheet.csv \
  --outdir results_dir \
  --single_end \
  --skip_gtdbtk \
  --skip_quast \
  --skip_prodigal --skip_prokka --skip_metaeuk \
  --skip_spades \
  --skip_spadeshybrid \
  --skip_krona

With these options, only the steps highlighted in yellow in the pipeline diagram are run.

Citation
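
A long list of flags like this can also be kept in a parameter file and passed with Nextflow's `-params-file` option. The sketch below writes the same settings to a YAML file. The file name `params.yaml` is arbitrary, and the keys are assumed to be the pipeline's long option names without the leading `--`.

```shell
# Write the options from the command above into a YAML parameter file.
# Each key is the pipeline option name without the leading "--".
cat > params.yaml <<'EOF'
input: samplesheet.csv
outdir: results_dir
single_end: true
skip_gtdbtk: true
skip_quast: true
skip_prodigal: true
skip_prokka: true
skip_metaeuk: true
skip_spades: true
skip_spadeshybrid: true
skip_krona: true
EOF

# Launch with the file. -profile and -params-file take a single dash because
# they are Nextflow options rather than pipeline parameters:
#   nextflow run nf-core/mag -profile docker -params-file params.yaml
```

Keeping the parameters in a file makes a run easier to repeat and to record alongside the results.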

nf-core/mag: a best-practice pipeline for metagenome hybrid assembly and binning

Sabrina Krakau,  Daniel Straub,  Hadrien Gourlé,  Gisela Gabernet,  Sven Nahnsen

bioRxiv, Posted August 31, 2021

 

2023/01

nf-core/mag: a best-practice pipeline for metagenome hybrid assembly and binning
Sabrina Krakau, Daniel Straub, Hadrien Gourlé, Gisela Gabernet, and Sven Nahnsen

NAR Genom Bioinform. 2022 Mar; 4(1)

 

 

References

file:///Users/kazu/Downloads/IPSJ-BIO18054047.pdf

 

DockerユーザーのためのPodmanとBuildahの紹介 (An introduction to Podman and Buildah for Docker users) - 赤帽エンジニアブログ