FMLRC2 — FM-index Long Read Corrector version 2, a Rust-based hybrid error correction tool for long reads (PacBio CLR, Oxford Nanopore) using Illumina short reads as a correction guide. Builds a compr

9

FragGeneScan

Use when working with fraggenescan — fragGeneScan -- gene prediction

9

FreeBayes -- Bayesian Haplotype-Based Variant Caller

FreeBayes -- Bayesian haplotype-based variant caller for short-read sequencing data. Detects SNPs, indels, MNPs, and complex variants from BAM/CRAM alignments against a reference genome. Uses literal

9

FSL MRS

Use when processing or analyzing Magnetic Resonance Spectroscopy (MRS) data with FSL MRS. Covers spectral fitting, basis set generation, preprocessing, quantification, and quality control of 1H-MRS an

9

FusionCatcher

FusionCatcher — tool for detecting somatic fusion genes, translocations, and chimeric transcripts from RNA-seq data. Identifies known and novel gene fusions in tumor and normal samples using multiple

11

GATK HaplotypeCaller

GATK HaplotypeCaller — germline short variant caller using localized de novo assembly of haplotypes. Calls SNPs and indels from BAM/CRAM alignment files, producing VCF or gVCF output. Supports single-

12

GATK Methylation

Use this skill for GATK-based DNA methylation and bisulfite sequencing workflows including WGBS preprocessing, base quality score recalibration for bisulfite data, duplicate marking of bisulfite reads

10

GATK VQSR

GATK VQSR (Variant Quality Score Recalibration) — machine learning-based variant filtering for GATK Best Practices germline pipelines. Trains a Gaussian mixture model on truth/training resource datase

10

GATK4 (Genome Analysis Toolkit)

GATK4 — Genome Analysis Toolkit for germline and somatic short variant discovery (SNPs and indels). Industry-standard caller providing HaplotypeCaller for germline, Mutect2 for somatic, plus Base Qual

9

GEMMA

GEMMA (Genome-wide Efficient Mixed Model Association) — fast C++ tool for genome-wide association study (GWAS) analysis using linear mixed models (LMM). Computes genetic relatedness matrices (GRM/kins

10

GeMoMa

GeMoMa (Gene Model Mapper) is a Java-based homology-driven gene prediction tool for eukaryotic genome annotation. Uses protein and CDS sequences from annotated reference species to predict gene models

10

geNomad

geNomad — identify viruses and plasmids in metagenomic assemblies and isolate genomes using neural-network sequence embeddings and marker gene profiles. Classifies sequences as chromosome, plasmid, or

10

gfatools

Use when working with gfatools — a lightweight C toolkit for GFA format manipulation — for viewing, converting, sorting, and validating Graphical Fragment Assembly (GFA) files. Provides subcommands fo

10

gff3sort

gff3sort is a fast command-line tool for sorting GFF3 annotation files so that parent features always appear before their child features, and features on the same sequence are ordered by start coordin

10

gffcompare

gffcompare -- Tool for comparing, merging, and annotating RNA-seq transcript assemblies against a reference annotation. Classifies each assembled transcript with a class code (=, c, j, u, x, i, p, r,

11

gffread

gffread -- GFF/GTF utility for filtering, converting, and extracting sequences from genome annotation files. Converts between GFF3 and GTF formats, extracts transcript (FASTA) and protein sequences fr

11

GLnexus

GLnexus — scalable gVCF merging and joint variant calling for cohort genomics. Merges per-sample gVCFs from DeepVariant or GATK HaplotypeCaller into a joint-called project-level BCF/VCF using an embed

10

goleft

Use when working with goleft — a collection of Go-based tools for fast

10

Gubbins

Gubbins — rapid phylogenetic analysis of recombinant bacterial whole genome sequences. Iteratively detects recombination hotspots with elevated SNP density while constructing clonal frame phylogenies.

10

GWAS Catalog Tools

GWAS Catalog tools (gwas-sumstats-tools) are EMBL-EBI utilities for reading, writing, validating, and formatting genome-wide association study (GWAS) summary statistics in the GWAS-SSF standard format

11

Harvest Tools

Harvest Tools — core genome phylogenomics suite for rapid whole-genome alignment, SNP extraction, and phylogenetic tree inference from closely related microbial genomes. Parsnp performs MUMmer-based c

10

Herro

Herro — haplotype-aware error correction for Oxford Nanopore long reads using deep learning. Corrects ONT simplex reads to Q30+ accuracy while preserving haplotype information for diploid assembly. Us

10

HiCanu

HiCanu — HiFi-optimized mode of the Canu genome assembler for accurate

9

HiCExplorer

HiCExplorer — comprehensive toolkit for Hi-C chromosome conformation capture data analysis, quality control, normalization, and visualization. Provides hicBuildMatrix (contact matrix construction), hi

10

HTSeq

HTSeq — Python framework for high-throughput sequencing data analysis, primarily used for counting aligned reads overlapping genomic features (genes, exons) from SAM/BAM/CRAM files using GTF/GFF annot

10

ICGC/ARGO Tools

Verified

ICGC/ARGO tools — suite of command-line clients and APIs for accessing the International Cancer Genome Consortium (ICGC) and ARGO (Accelerating Research in Genomic Oncology) platform. Includes score-c

10

ipyrad

ipyrad is a Python toolkit for de novo and reference-guided assembly and genotyping of RADseq, GBS, ddRAD, 2bRAD, and other restriction enzyme-based reduced representation genomic datasets. Implements

9

Juicebox

Juicebox — interactive visualization and analysis tool for Hi-C and other proximity-ligation chromatin conformation data. Loads .hic files produced by Juicer, Dovetail, or Arima workflows and enables

9

kb-python

kb-python — Python wrapper for the kallisto|bustools (kb) workflow for single-cell RNA-seq pre-processing. Runs kb ref to build or download reference indices and kb count to generate count matrices fr

9

KING

KING — robust kinship estimation and relationship inference for genome-wide SNP data. Computes pairwise kinship coefficients, infers family relationships (MZ twins, parent-offspring, full siblings, 2n

11

KMCP

Use when working with kmcp — KMCP — coverage-based metagenomic sequence

10

KneadData

KneadData — quality control and host decontamination tool for metagenomic

9

Kraken 2

Kraken 2 — fast k-mer-based taxonomic sequence classifier for metagenomics. Assigns NCBI taxonomic labels to DNA/protein sequences using exact k-mer matching against a minimizer-indexed hash table. Su

11

Krona

Krona — interactive HTML pie chart visualization of hierarchical taxonomic data from metagenomics and sequence classification. Part of KronaTools, which provides importers for Kraken2, BLAST, Diamond,

10

LAST

LAST — adaptive-seed sequence aligner for genomes, long reads, and proteins. Uses lastdb to build databases, last-train to learn substitution/gap rates, lastal for alignment, last-split for rearrangem

10

Liftoff

Liftoff — accurate genome annotation liftover tool that maps GFF/GTF annotations between assemblies of the same or closely-related species using Minimap2 alignment. Supports gene copy detection, ORF v

10

LINX

LINX — structural variant annotation and visualization tool from the Hartwig Medical Foundation hmftools suite. Interprets structural variants and copy number data to classify driver events including

10

locfdr

locfdr — Efron's empirical Bayes local false discovery rate estimation from z-score vectors. Computes per-test posterior probability of being null using a mixture model approach that fits the empirica

10

LocusZoom

LocusZoom — regional association plot generator for GWAS and fine-mapping results. Creates publication-quality locus zoom plots showing -log10(p-value) vs genomic position with LD coloring, recombinat

11

LSHTM Pipelines

LSHTM PathogenSeq bioinformatics pipelines for pathogen whole-genome sequencing analysis, including TB-Profiler for Mycobacterium tuberculosis lineage and drug-resistance prediction from WGS data, Mal

9

LUMPY

LUMPY — probabilistic structural variant discovery from paired-end short-read sequencing. Detects deletions, duplications, inversions, and translocations from BWA-MEM aligned BAM files using discordan

10

Maaslin2

MaAsLin 2 (Microbiome Multivariable Association with Linear Models 2) is the bioBakery tool for finding associations between microbial features (taxa, functional pathways, gene families) and sample me

10

MACS2/MACS3

MACS2/MACS3 — Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites and histone modification enrichment from ChIP-seq, ATAC-seq, and CUT&Tag data. Provides peak calling (

11

MAGeCK -- Model-based Analysis of Genome-wide CRISPR-Knockout

MAGeCK (Model-based Analysis of Genome-wide CRISPR-Knockout) -- computational pipeline for CRISPR genetic screen analysis. Performs sgRNA counting from FASTQ files (mageck count), gene-level essential

9

MAGeCK-VISPR

MAGeCK-VISPR — comprehensive CRISPR screen analysis framework combining MAGeCK statistical testing (RRA and MLE algorithms) with VISPR interactive visualization. Supports sgRNA count generation from F

10

MAJIQ

MAJIQ — Modeling Alternative Junction Inclusion Quantification for detecting, quantifying, and visualizing local splicing variations (LSVs) from RNA-Seq data. Builds splice graphs from BAM files and G

10

MAKER

MAKER — portable genome annotation pipeline that integrates ab initio gene predictors (SNAP, Augustus, GeneMark), protein and EST evidence alignment, and repeat masking to produce GFF3 gene models wit

10

Mash

Mash — fast genome and metagenome distance estimation using MinHash sketching. Estimates pairwise distances between genomic sequences (FASTA/FASTQ) without full alignment using k-mer sketches. Support

10

Mashtree

Mashtree — rapid distance-based phylogenetic tree construction from genome assemblies using MinHash (Mash) distances. Creates neighbor-joining trees from FASTA assemblies, FASTQ reads, or GenBank file

10

MaSuRCA

MaSuRCA (Maryland Super-Read Celera Assembler) — hybrid genome assembler combining Illumina short reads with PacBio or Oxford Nanopore long reads using super-read and mega-read technology. Performs au

10

Tool	Registry	Domain	Docs
FMLRC2 FMLRC2 — FM-index Long Read Corrector version 2, a Rust-based hybrid error correction tool for long reads (PacBio CLR, Oxford Nanopore) using Illumina short reads as a correction guide. Builds a compr	HudsonAlpha/fmlrc2	Genomics	9
FragGeneScan Use when working with fraggenescan — fragGeneScan -- gene prediction	COL-IU/FragGeneScan	Metagenomics	9
FreeBayes -- Bayesian Haplotype-Based Variant Caller FreeBayes -- Bayesian haplotype-based variant caller for short-read sequencing data. Detects SNPs, indels, MNPs, and complex variants from BAM/CRAM alignments against a reference genome. Uses literal	freebayes/freebayes	Genomics	9
FSL MRS Use when processing or analyzing Magnetic Resonance Spectroscopy (MRS) data with FSL MRS. Covers spectral fitting, basis set generation, preprocessing, quantification, and quality control of 1H-MRS an	fsl-mrs/fsl_mrs	Imaging	9
FusionCatcher FusionCatcher — tool for detecting somatic fusion genes, translocations, and chimeric transcripts from RNA-seq data. Identifies known and novel gene fusions in tumor and normal samples using multiple	ndaniel/fusioncatcher	Transcriptomics	11
GATK HaplotypeCaller GATK HaplotypeCaller — germline short variant caller using localized de novo assembly of haplotypes. Calls SNPs and indels from BAM/CRAM alignment files, producing VCF or gVCF output. Supports single-	broadinstitute/gatk	Genomics	12
GATK Methylation Use this skill for GATK-based DNA methylation and bisulfite sequencing workflows including WGBS preprocessing, base quality score recalibration for bisulfite data, duplicate marking of bisulfite reads	broadinstitute/gatk	Epigenomics	10
GATK VQSR GATK VQSR (Variant Quality Score Recalibration) — machine learning-based variant filtering for GATK Best Practices germline pipelines. Trains a Gaussian mixture model on truth/training resource datase	broadinstitute/gatk	Genomics	10
GATK4 (Genome Analysis Toolkit) GATK4 — Genome Analysis Toolkit for germline and somatic short variant discovery (SNPs and indels). Industry-standard caller providing HaplotypeCaller for germline, Mutect2 for somatic, plus Base Qual	broadinstitute/gatk	Transcriptomics	9
GEMMA GEMMA (Genome-wide Efficient Mixed Model Association) — fast C++ tool for genome-wide association study (GWAS) analysis using linear mixed models (LMM). Computes genetic relatedness matrices (GRM/kins	genetics-statistics/GEMMA	Population Genetics	10
GeMoMa GeMoMa (Gene Model Mapper) is a Java-based homology-driven gene prediction tool for eukaryotic genome annotation. Uses protein and CDS sequences from annotated reference species to predict gene models	Jstacs/Jstacs	Other	10
geNomad geNomad — identify viruses and plasmids in metagenomic assemblies and isolate genomes using neural-network sequence embeddings and marker gene profiles. Classifies sequences as chromosome, plasmid, or	apcamargo/genomad	Metagenomics	10
gfatools Use when working with gfatools — a lightweight C toolkit for GFA format manipulation — for viewing, converting, sorting, and validating Graphical Fragment Assembly (GFA) files. Provides subcommands fo	lh3/gfatools	Genomics	10
gff3sort gff3sort is a fast command-line tool for sorting GFF3 annotation files so that parent features always appear before their child features, and features on the same sequence are ordered by start coordin	billzt/gff3sort	Utilities & Infrastructure	10
gffcompare gffcompare -- Tool for comparing, merging, and annotating RNA-seq transcript assemblies against a reference annotation. Classifies each assembled transcript with a class code (=, c, j, u, x, i, p, r,	gpertea/gffcompare	Transcriptomics	11
gffread gffread -- GFF/GTF utility for filtering, converting, and extracting sequences from genome annotation files. Converts between GFF3 and GTF formats, extracts transcript (FASTA) and protein sequences fr	gpertea/gffread	Transcriptomics	11
GLnexus GLnexus — scalable gVCF merging and joint variant calling for cohort genomics. Merges per-sample gVCFs from DeepVariant or GATK HaplotypeCaller into a joint-called project-level BCF/VCF using an embed	dnanexus-rnd/GLnexus	Genomics	10
goleft Use when working with goleft — a collection of Go-based tools for fast	brentp/goleft	QC & Preprocessing	10
Gubbins Gubbins — rapid phylogenetic analysis of recombinant bacterial whole genome sequences. Iteratively detects recombination hotspots with elevated SNP density while constructing clonal frame phylogenies.	nickjcroucher/gubbins	Phylogenetics	10
GWAS Catalog Tools GWAS Catalog tools (gwas-sumstats-tools) are EMBL-EBI utilities for reading, writing, validating, and formatting genome-wide association study (GWAS) summary statistics in the GWAS-SSF standard format	EBISPOT/gwas-summary-statistics	Utilities & Infrastructure	11
Harvest Tools Harvest Tools — core genome phylogenomics suite for rapid whole-genome alignment, SNP extraction, and phylogenetic tree inference from closely related microbial genomes. Parsnp performs MUMmer-based c	marbl/harvest	Phylogenetics	10
Herro Herro — haplotype-aware error correction for Oxford Nanopore long reads using deep learning. Corrects ONT simplex reads to Q30+ accuracy while preserving haplotype information for diploid assembly. Us	lbcb-sci/herro	Genomics	10
HiCanu HiCanu — HiFi-optimized mode of the Canu genome assembler for accurate	marbl/canu	Genomics	9
HiCExplorer HiCExplorer — comprehensive toolkit for Hi-C chromosome conformation capture data analysis, quality control, normalization, and visualization. Provides hicBuildMatrix (contact matrix construction), hi	deeptools/HiCExplorer	Genomics	10
HTSeq HTSeq — Python framework for high-throughput sequencing data analysis, primarily used for counting aligned reads overlapping genomic features (genes, exons) from SAM/BAM/CRAM files using GTF/GFF annot	htseq/htseq	Genomics	10
ICGC/ARGO Tools Verified ICGC/ARGO tools — suite of command-line clients and APIs for accessing the International Cancer Genome Consortium (ICGC) and ARGO (Accelerating Research in Genomic Oncology) platform. Includes score-c	overture-stack/score	Utilities & Infrastructure	10
ipyrad ipyrad is a Python toolkit for de novo and reference-guided assembly and genotyping of RADseq, GBS, ddRAD, 2bRAD, and other restriction enzyme-based reduced representation genomic datasets. Implements	dereneaton/ipyrad	Other	9
Juicebox Juicebox — interactive visualization and analysis tool for Hi-C and other proximity-ligation chromatin conformation data. Loads .hic files produced by Juicer, Dovetail, or Arima workflows and enables	aidenlab/Juicebox	Genomics	9
kb-python kb-python — Python wrapper for the kallisto|bustools (kb) workflow for single-cell RNA-seq pre-processing. Runs kb ref to build or download reference indices and kb count to generate count matrices fr	pachterlab/kb_python	Single-Cell	9
KING KING — robust kinship estimation and relationship inference for genome-wide SNP data. Computes pairwise kinship coefficients, infers family relationships (MZ twins, parent-offspring, full siblings, 2n	Shicheng-Guo/KING	Population Genetics	11
KMCP Use when working with kmcp — KMCP — coverage-based metagenomic sequence	shenwei356/kmcp	Metagenomics	10
KneadData KneadData — quality control and host decontamination tool for metagenomic	biobakery/kneaddata	Metagenomics	9
Kraken 2 Kraken 2 — fast k-mer-based taxonomic sequence classifier for metagenomics. Assigns NCBI taxonomic labels to DNA/protein sequences using exact k-mer matching against a minimizer-indexed hash table. Su	DerrickWood/kraken2	Metagenomics	11
Krona Krona — interactive HTML pie chart visualization of hierarchical taxonomic data from metagenomics and sequence classification. Part of KronaTools, which provides importers for Kraken2, BLAST, Diamond,	marbl/Krona	Metagenomics	10
LAST LAST — adaptive-seed sequence aligner for genomes, long reads, and proteins. Uses lastdb to build databases, last-train to learn substitution/gap rates, lastal for alignment, last-split for rearrangem	manual	Genomics	10
Liftoff Liftoff — accurate genome annotation liftover tool that maps GFF/GTF annotations between assemblies of the same or closely-related species using Minimap2 alignment. Supports gene copy detection, ORF v	agshumate/Liftoff	Genomics	10
LINX LINX — structural variant annotation and visualization tool from the Hartwig Medical Foundation hmftools suite. Interprets structural variants and copy number data to classify driver events including	hartwigmedical/hmftools	Clinical Genomics	10
locfdr locfdr — Efron's empirical Bayes local false discovery rate estimation from z-score vectors. Computes per-test posterior probability of being null using a mixture model approach that fits the empirica	manual	Statistics	10
LocusZoom LocusZoom — regional association plot generator for GWAS and fine-mapping results. Creates publication-quality locus zoom plots showing -log10(p-value) vs genomic position with LD coloring, recombinat	statgen/locuszoom	Population Genetics	11
LSHTM Pipelines LSHTM PathogenSeq bioinformatics pipelines for pathogen whole-genome sequencing analysis, including TB-Profiler for Mycobacterium tuberculosis lineage and drug-resistance prediction from WGS data, Mal	jodyphelan/TBProfiler	Metagenomics	9
LUMPY LUMPY — probabilistic structural variant discovery from paired-end short-read sequencing. Detects deletions, duplications, inversions, and translocations from BWA-MEM aligned BAM files using discordan	arq5x/lumpy-sv	Genomics	10
Maaslin2 MaAsLin 2 (Microbiome Multivariable Association with Linear Models 2) is the bioBakery tool for finding associations between microbial features (taxa, functional pathways, gene families) and sample me	biobakery/Maaslin2	Metagenomics	10
MACS2/MACS3 MACS2/MACS3 — Model-based Analysis of ChIP-Seq for identifying transcription factor binding sites and histone modification enrichment from ChIP-seq, ATAC-seq, and CUT&Tag data. Provides peak calling (	macs3-project/MACS	Epigenomics	11
MAGeCK -- Model-based Analysis of Genome-wide CRISPR-Knockout MAGeCK (Model-based Analysis of Genome-wide CRISPR-Knockout) -- computational pipeline for CRISPR genetic screen analysis. Performs sgRNA counting from FASTQ files (mageck count), gene-level essential	davidwwei/MAGeCK	Genomics	9
MAGeCK-VISPR MAGeCK-VISPR — comprehensive CRISPR screen analysis framework combining MAGeCK statistical testing (RRA and MLE algorithms) with VISPR interactive visualization. Supports sgRNA count generation from F	liulab-dfci/MAGeCK-VISPR	Systems Biology	10
MAJIQ MAJIQ — Modeling Alternative Junction Inclusion Quantification for detecting, quantifying, and visualizing local splicing variations (LSVs) from RNA-Seq data. Builds splice graphs from BAM files and G	biociphers/MAJIQ	Transcriptomics	10
MAKER MAKER — portable genome annotation pipeline that integrates ab initio gene predictors (SNAP, Augustus, GeneMark), protein and EST evidence alignment, and repeat masking to produce GFF3 gene models wit	Yandell-Lab/maker	Genomics	10
Mash Mash — fast genome and metagenome distance estimation using MinHash sketching. Estimates pairwise distances between genomic sequences (FASTA/FASTQ) without full alignment using k-mer sketches. Support	marbl/Mash	Metagenomics	10
Mashtree Mashtree — rapid distance-based phylogenetic tree construction from genome assemblies using MinHash (Mash) distances. Creates neighbor-joining trees from FASTA assemblies, FASTQ reads, or GenBank file	lskatz/mash	Phylogenetics	10
MaSuRCA MaSuRCA (Maryland Super-Read Celera Assembler) — hybrid genome assembler combining Illumina short reads with PacBio or Oxford Nanopore long reads using super-read and mega-read technology. Performs au	alekseyzimin/masurca	Genomics	10

