hifiasm
Haplotype-resolved de novo assembler for PacBio HiFi reads
```sh
# Build hifiasm from source
git clone https://github.com/chhylp123/hifiasm
cd hifiasm && make

# Run on test data (-f0 disables the initial bloom filter for small datasets)
wget https://github.com/chhylp123/hifiasm/releases/download/v0.7/chr11-2M.fa.gz
./hifiasm -o test -t4 -f0 chr11-2M.fa.gz 2> test.log
awk '/^S/{print ">"$2;print $3}' test.bp.p_ctg.gfa > test.p_ctg.fa # get primary contigs in FASTA

# Assemble an inbred or homozygous genome (-l0 disables duplication purging)
hifiasm -o CHM13.asm -t32 -l0 CHM13-HiFi.fa.gz 2> CHM13.asm.log
# Assemble a heterozygous genome with built-in duplication purging
hifiasm -o HG002.asm -t32 HG002-file1.fq.gz HG002-file2.fq.gz
# Assemble ONT simplex R10 reads
hifiasm -o HG002.asm --ont -t32 HG002-ont.fq.gz

# Hi-C phasing with paired-end short reads in two FASTQ files
hifiasm -o HG002.asm --h1 read1.fq.gz --h2 read2.fq.gz HG002-HiFi.fq.gz
# Trio binning with parental k-mer databases built by yak
yak count -b37 -t16 -o pat.yak <(cat pat_1.fq.gz pat_2.fq.gz) <(cat pat_1.fq.gz pat_2.fq.gz)
yak count -b37 -t16 -o mat.yak <(cat mat_1.fq.gz mat_2.fq.gz) <(cat mat_1.fq.gz mat_2.fq.gz)
hifiasm -o HG002.asm -t32 -1 pat.yak -2 mat.yak HG002-HiFi.fa.gz

# Hi-C phasing with --dual-scaf enabled
hifiasm -o HG002.asm --dual-scaf --h1 read1.fq.gz --h2 read2.fq.gz HG002-HiFi.fq.gz
# Hi-C phasing with the telomere motif given by --telo-m CCCTAA
hifiasm -o HG002.asm --telo-m CCCTAA --h1 read1.fq.gz --h2 read2.fq.gz HG002-HiFi.fq.gz
# Hi-C phasing with ultralong reads added via --ul
hifiasm -o HG002.asm --h1 read1.fq.gz --h2 read2.fq.gz --ul ul.fq.gz HG002-HiFi.fq.gz
# All of the options above combined
hifiasm -o HG002.asm --dual-scaf --telo-m CCCTAA --h1 read1.fq.gz --h2 read2.fq.gz --ul ul.fq.gz HG002-HiFi.fq.gz
```
See [tutorial][tutorial] for more details.
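
To see what the `awk` one-liner in the commands above does, the following sketch runs it on a tiny hand-written GFA file. The segment names and sequences are made up for illustration and are not hifiasm output; only the `awk` program itself is taken from the commands above.

```sh
# Write a toy GFA with two segment (S) lines and one link (L) line.
# Names and sequences are hypothetical.
gfa_dir=$(mktemp -d)
printf 'S\tctgA\tACGTACGT\n'           >  "$gfa_dir/toy.gfa"
printf 'L\tctgA\t+\tctgB\t+\t0M\n'     >> "$gfa_dir/toy.gfa"
printf 'S\tctgB\tGGCCTTAA\n'           >> "$gfa_dir/toy.gfa"

# Keep only lines starting with S; print the segment name (field 2)
# as a FASTA header and the sequence (field 3) on the next line.
awk '/^S/{print ">"$2;print $3}' "$gfa_dir/toy.gfa" > "$gfa_dir/toy.fa"

cat "$gfa_dir/toy.fa"
```

The link line is skipped because it does not start with `S`, so the FASTA file holds one record per segment.
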
- [Getting Started](#started)
- [Introduction](#intro)
- [Why Hifiasm?](#why)
- [Usage](#use)
- [Assembling HiFi reads without additional data types](#hifionly)
- [Assembling ONT reads](#ontonly)
- [Hi-C integration](#hic)
- [Trio binning](#trio)
- [Ultra-long ONT integration](#ul)
- [Output files](#output)
- [Results](#results)
- [Getting Help](#help)
- [Limitations](#limit)
- [Citing Hifiasm](#cite)
Hifiasm is a fast haplotype-resolved de novo assembler initially designed for PacBio HiFi reads.
Its latest release supports telomere-to-telomere assembly by utilizing ultralong Oxford Nanopore reads. Hifiasm produces arguably the best single-sample telomere-to-telomere assemblies combining HiFi, ultralong and Hi-C reads, and it is one of the best haplotype-resolved assemblers for trio-binning assembly given parental short reads. For a human genome, hifiasm can produce a telomere-to-telomere assembly in one day.
- Hifiasm delivers high-quality telomere-to-telomere assemblies. It tends to generate longer contigs
and resolve more segmental duplications than other assemblers.
- Given Hi-C reads or short reads from the parents, hifiasm can produce overall the best
haplotype-resolved assembly so far. It is the assembler chosen by the
[Human Pangenome Project][hpp] for the first batch of samples.
- Hifiasm can purge duplications between haplotigs without relying on
third-party tools such as purge\_dups. Hifiasm does not need polishing tools
like pilon or racon, either. This simplifies the assembly pipeline and saves
running time.
- Hifiasm is fast. It can assemble a human genome in half a day and assemble a
~30Gb redwood genome in three days. No genome is too large for hifiasm.
- Hifiasm is trivial to install and easy to use. It does not require Python,
R or C++11 compilers, and can be compiled into a single executable. The
default setting works well with a variety of genomes.
[hpp]: https://humanpangenome.org
A typical hifiasm command line looks like:
```sh
hifiasm -o NA12878.asm -t 32 NA12878.fq.gz
```
where `NA12878.fq.gz` provides the input reads, `-t` sets the number of CPUs in
use and `-o` specifies the prefix of output files. For this example, the
primary contigs are written to `NA12878.asm.bp.p_ctg.gfa`.
Since v0.15, hifiasm also produces two sets of
partially phased contigs at `NA12878.asm.bp.hap?.p_ctg.gfa`. This pair of files
can be thought of as representing the two haplotypes in a diploid genome, though
with occasional switch errors. The frequency of switches is determined by the
heterozygosity of the input sample.
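
As a small illustration of this naming scheme, the sketch below prints the contig files described above for a given `-o` prefix. `hifiasm_contig_files` is a hypothetical helper, not part of hifiasm, and it assumes that the `?` in `hap?` stands for `1` and `2`.

```sh
# Hypothetical helper: list the contig GFA files expected for a prefix.
# Assumes the two partially phased sets are named hap1 and hap2.
hifiasm_contig_files() {
    prefix=$1
    printf '%s\n' \
        "$prefix.bp.p_ctg.gfa" \
        "$prefix.bp.hap1.p_ctg.gfa" \
        "$prefix.bp.hap2.p_ctg.gfa"
}

hifiasm_contig_files NA12878.asm
```

A helper like this can be handy in pipeline scripts that need to check for the expected outputs after a run finishes.
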
On the first run, hifiasm saves corrected reads and
overlaps to disk as `NA12878.asm.*.bin`. It reuses the saved results to avoid
the time-consuming all-vs-all overlap calculation on later runs. You may specify
`-i` to ignore precomputed overlaps and redo overlapping from raw reads.
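
The reuse behaviour can be wrapped in a script. The sketch below only prints the command it would run, so it works without hifiasm installed. `build_hifiasm_cmd` and its `redo` argument are hypothetical names introduced for illustration, and the `.ec.bin` file in the demo is a stand-in for a saved result.

```sh
# Hypothetical helper: print a hifiasm command line for a prefix.
# -i is added only when saved *.bin results exist and a redo is requested.
build_hifiasm_cmd() {
    prefix=$1; reads=$2; redo=$3   # redo: "yes" to ignore saved results
    cmd="hifiasm -o $prefix -t 32"
    # ls succeeds only if at least one saved .bin file matches the prefix
    if ls "$prefix".*.bin >/dev/null 2>&1; then
        if [ "$redo" = "yes" ]; then
            cmd="$cmd -i"
        fi
    fi
    printf '%s %s\n' "$cmd" "$reads"
}

# Demo in a scratch directory with a placeholder saved result.
workdir=$(mktemp -d)
touch "$workdir/NA12878.asm.ec.bin"
build_hifiasm_cmd "$workdir/NA12878.asm" NA12878.fq.gz yes
build_hifiasm_cmd "$workdir/NA12878.asm" NA12878.fq.gz no
```

Printing the command instead of executing it keeps the decision logic easy to inspect before committing to a long assembly run.
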
You can also dump error-corrected reads in FASTA and read overlaps in PAF with:
```sh
hifiasm -o NA12878.asm -t 32 --write-paf --write-ec /dev/null
```
Hifiasm purges haplotig duplications by default. For inbred or homozygous
genomes, you may disable purging with option `-l0`. Old HiFi reads may contain
short adapter sequences at the ends of reads. You can specify `-z20` to trim
both ends of reads by 20 bp. For small genomes, use `-f0` to disable the initial
bloom filter, which takes 16 GB of memory at the beginning. For genomes much
larger than human, applying `-f38` or even `-f39` is preferred to save memory on
k-mer counting.
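
The bloom-filter advice can be expressed as a small lookup. The sketch below is only an illustration: `suggest_bloom_flag` is a hypothetical helper, and the 100 Mb and 10000 Mb cut-offs are assumptions chosen for the example, since the text above gives no exact thresholds.

```sh
# Hypothetical helper: suggest a bloom-filter option from genome size in Mb.
# The cut-offs below are illustrative assumptions, not hifiasm rules.
suggest_bloom_flag() {
    size_mb=$1
    if [ "$size_mb" -lt 100 ]; then
        echo "-f0"    # small genome: disable the initial bloom filter
    elif [ "$size_mb" -gt 10000 ]; then
        echo "-f38"   # much larger than human: -f38 (or even -f39) is suggested
    else
        echo ""       # otherwise keep the default setting
    fi
}

suggest_bloom_flag 2        # e.g. a 2 Mb test region
suggest_bloom_flag 30000    # e.g. a ~30 Gb genome
```

An empty result means no `-f` option is added, which leaves hifiasm on its default.
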
Since version 0.21.0 (r686), hifiasm supports assembly of ONT simplex R10 reads.
To enable this feature, add the `--ont` option as shown below:
```sh
hifiasm -t64 --ont -o ONT.asm ONT.read.fastq.gz
```
Please note that this module requires input reads in FASTQ format.
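
Because the ONT module expects FASTQ input, a quick pre-flight check can catch a FASTA file before a long run starts. The sketch below is an assumption-laden illustration, not part of hifiasm: `is_fastq` is a hypothetical helper that only inspects the first character of the first line (`@` for FASTQ, `>` for FASTA), and the demo files are generated on the spot.

```sh
# Hypothetical pre-flight check: succeed if the first record starts with '@'.
is_fastq() {
    # gzip -cdf decompresses gzipped input and passes plain text through unchanged
    first_char=$(gzip -cdf < "$1" | head -n 1 | cut -c 1)
    [ "$first_char" = "@" ]
}

# Demo with one gzipped FASTQ file and one plain FASTA file (made-up reads).
reads_dir=$(mktemp -d)
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "$reads_dir/ont.fq.gz"
printf '>read1\nACGT\n' > "$reads_dir/ont.fa"

is_fastq "$reads_dir/ont.fq.gz" && echo "ont.fq.gz looks like FASTQ"
is_fastq "$reads_dir/ont.fa" || echo "ont.fa does not look like FASTQ"
```

This check is deliberately shallow; it does not validate the whole file, only the leading record marker.
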
Installation
conda install -c bioconda hifiasm
Container Images
docker pull quay.io/hifiasm:0.19.9--h43eeafb_0
Version History
- v0.19.9