fastp
Ultra-fast all-in-one FASTQ preprocessor for quality control and adapter trimming

A tool designed to provide ultrafast all-in-one preprocessing and quality control for FastQ data.
This tool is designed for processing short reads (e.g. Illumina NovaSeq, MGI). If you are looking for tools to process long reads (e.g. Nanopore, PacBio, Cyclone), please use fastplong.
fastp supports batch processing of multiple FASTQ files in a folder; see [batch processing](#batch-processing).
If you use fastp in your work, you can cite fastp as: Shifu Chen. fastp 1.0: An ultra-fast all-round tool for FASTQ data quality control and preprocessing. iMeta 4.5 (2025): e70078
- [features](#features)
- [simple usage](#simple-usage)
- [examples of report](#examples-of-report)
- [get fastp](#get-fastp)
- [install with Bioconda](#install-with-bioconda)
- [or download the latest prebuilt binary for Linux users](#or-download-the-latest-prebuilt-binary-for-linux-users)
- [or compile from source](#or-compile-from-source)
- [Step 1: install isa-l](#step-1-install-isa-l)
- [step 2: install libdeflate](#step-2-install-libdeflate)
- [Step 3: download and build fastp](#step-3-download-and-build-fastp)
- [input and output](#input-and-output)
- [output to STDOUT](#output-to-stdout)
- [input from STDIN](#input-from-stdin)
- [store the unpaired reads for PE data](#store-the-unpaired-reads-for-pe-data)
- [store the reads that fail the filters](#store-the-reads-that-fail-the-filters)
- [process only part of the data](#process-only-part-of-the-data)
- [do not overwrite existing files](#do-not-overwrite-exiting-files)
- [split the output to multiple files for parallel processing](#split-the-output-to-multiple-files-for-parallel-processing)
- [merge PE reads](#merge-pe-reads)
- [filtering](#filtering)
- [quality filter](#quality-filter)
- [length filter](#length-filter)
- [low complexity filter](#low-complexity-filter)
- [Other filter](#other-filter)
- [adapters](#adapters)
- [per read cutting by quality score](#per-read-cutting-by-quality-score)
- [base correction for PE data](#base-correction-for-pe-data)
- [global trimming](#global-trimming)
- [polyG tail trimming](#polyg-tail-trimming)
- [polyX tail trimming](#polyx-tail-trimming)
- [unique molecular identifier (UMI) processing](#unique-molecular-identifier-umi-processing)
- [UMI example](#umi-example)
- [output splitting](#output-splitting)
- [splitting by limiting file number](#splitting-by-limiting-file-number)
- [splitting by limiting the lines of each file](#splitting-by-limiting-the-lines-of-each-file)
- [overrepresented sequence analysis](#overrepresented-sequence-analysis)
- [merge paired-end reads](#merge-paired-end-reads)
- [duplication rate and deduplication](#duplication-rate-and-deduplication)
- [duplication rate evaluation](#duplication-rate-evaluation)
- [deduplication](#deduplication)
- [batch processing](#batch-processing)
- [all options](#all-options)
- [citations](#citations)
- comprehensive quality profiling of data both before and after filtering (quality curves, base contents, KMER, Q20/Q30, GC ratio, duplication, adapter contents...)
- filter out bad reads (too low quality, too short, or too many N...)
- cut low-quality bases per read at its 5' and 3' ends by evaluating the mean quality in a sliding window (like Trimmomatic but faster).
- trim all reads in front and tail
- cut adapters. Adapter sequences can be automatically detected, which means you don't have to input the adapter sequences to trim them.
- correct mismatched base pairs in overlapped regions of paired-end reads, if one base has high quality while the other has ultra-low quality
- trim polyG in 3' ends, which is commonly seen in NovaSeq/NextSeq data. Trim polyX in 3' ends to remove unwanted polyX tailing (e.g. polyA tailing for mRNA-Seq data)
- preprocess unique molecular identifier (UMI) enabled data, shift UMI to sequence name.
- report results in JSON format for further interpretation.
- visualize quality control and filtering results on a single HTML page (like FASTQC but faster and more informative).
- split the output to multiple files (0001.R1.gz, 0002.R1.gz...) to support parallel processing. Two modes can be used: limiting the total number of split files, or limiting the lines of each split file.
- support long reads (data from PacBio / Nanopore devices).
- support reading from STDIN and writing to STDOUT
- support interleaved input
- support ultra-fast FASTQ-level deduplication
- ...
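Several of the features above map onto command-line options. The following is a minimal sketch of a paired-end run that switches on a few of them. It assumes the long option names `--detect_adapter_for_pe`, `--correction`, `--trim_poly_g`, `--qualified_quality_phred`, `--length_required`, and `--dedup` as printed by `fastp --help` in recent releases; check the help output of your installed version. The file names, the helper function `run_fastp_features`, and the threshold values are illustrative assumptions, not recommendations.

```shell
# Illustrative values only; tune them for your own data.
min_phred=20      # a base counts as qualified at or above this Phred score
min_length=36     # reads shorter than this after trimming are discarded

# Hypothetical helper wrapping one paired-end fastp call.
run_fastp_features() {
    fastp \
        -i in.R1.fq.gz -I in.R2.fq.gz \
        -o out.R1.fq.gz -O out.R2.fq.gz \
        --detect_adapter_for_pe \
        --correction \
        --trim_poly_g \
        --qualified_quality_phred "$min_phred" \
        --length_required "$min_length" \
        --dedup
}

# Run only when fastp is installed and both input files exist.
if command -v fastp >/dev/null 2>&1 && [ -f in.R1.fq.gz ] && [ -f in.R2.fq.gz ]; then
    run_fastp_features
else
    echo "skipping: fastp or input files not found" >&2
fi
```

Wrapping the call in a function keeps the option list in one place, so a threshold can be changed without editing the command itself.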
If you find a bug or have additional requirements for fastp, please file an issue: https://github.com/OpenGene/fastp/issues/new
- for single end data (not compressed)
```
fastp -i in.fq -o out.fq
```
- for paired end data (gzip compressed)
```
fastp -i in.R1.fq.gz -I in.R2.fq.gz -o out.R1.fq.gz -O out.R2.fq.gz
```
By default, the HTML report is saved to fastp.html (can be specified with -h option), and the JSON report is saved to fastp.json (can be specified with -j option).
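Because every run writes to fastp.html and fastp.json unless told otherwise, processing several samples in the same directory would overwrite the earlier reports. The sketch below gives each sample its own report files through the `-h` and `-j` options. It is a plain shell loop, not fastp's own batch-processing mechanism, and it assumes a hypothetical layout where paired files are named `<sample>.R1.fq.gz` and `<sample>.R2.fq.gz` inside a `raw` directory; the directory names and the helpers `sample_name` and `process_sample` are made up for illustration.

```shell
# Hypothetical layout: raw/<sample>.R1.fq.gz and raw/<sample>.R2.fq.gz;
# cleaned reads and per-sample reports are written to clean/.
in_dir="raw"
out_dir="clean"

# Strip the directory and the .R1.fq.gz suffix to get the sample name.
sample_name() {
    basename "$1" .R1.fq.gz
}

# Run fastp on one read pair, naming the reports after the sample.
process_sample() {
    sample=$(sample_name "$1")
    mkdir -p "$out_dir"
    fastp -i "$1" -I "${in_dir}/${sample}.R2.fq.gz" \
          -o "${out_dir}/${sample}.R1.fq.gz" -O "${out_dir}/${sample}.R2.fq.gz" \
          -h "${out_dir}/${sample}.html" -j "${out_dir}/${sample}.json"
}

# Loop over all R1 files; do nothing if fastp is not installed.
if command -v fastp >/dev/null 2>&1; then
    for r1 in "$in_dir"/*.R1.fq.gz; do
        [ -e "$r1" ] || continue   # the glob matched nothing
        process_sample "$r1"
    done
fi
```

With this layout, a pair named `raw/sampleA.R1.fq.gz` and `raw/sampleA.R2.fq.gz` would produce `clean/sampleA.html` and `clean/sampleA.json` next to the cleaned reads.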
fastp creates reports in both HTML and JSON format.
- HTML report: http://opengene.org/fastp/fastp.html
- JSON report: http://opengene.org/fastp/fastp.json
```shell
# install with Bioconda
conda install -c bioconda fastp
# or pull a container image (v0.23.4)
docker pull quay.io/fastp:0.23.4--hadf994f_2
```