Browse the BioContext7 deep skill library. Tool pages surface documentation, registry links, and install details for agent-facing workflows.
2,064 tools — page 2 of 42
| Tool | Registry | Domain | Docs |
|---|---|---|---|
Bismark — bisulfite-seq alignment and methylation calling toolkit. Maps bisulfite-treated reads to a reference genome using Bowtie 2, HISAT2, or minimap2, performs cytosine methylation calls in CpG/CH | FelixKrueger/Bismark | Genomics | 11 |
Chopper — fast Rust-based quality, length, and GC content filtering and trimming tool for long-read sequencing data (Oxford Nanopore, PacBio) in FASTQ format. Successor to NanoFilt with multithreaded | wdecoster/chopper | QC & Preprocessing | 11 |
cuteSV -- sensitive and scalable long-read-based structural variation (SV) detection from PacBio CLR, PacBio CCS/HiFi, and Oxford Nanopore Technology (ONT) sequencing data. Detects deletions, insertio | tjiangHIT/cuteSV | Genomics | 11 |
deepTools — suite of Python tools for efficient analysis and visualization of high-throughput sequencing data including ChIP-seq, ATAC-seq, MNase-seq, and RNA-seq. BAM-to-bigWig conversion with normal | deeptools/deepTools | Genomics | 11 |
Use when working with falco — a high-speed C++ reimplementation of | smithlabcode/falco | QC & Preprocessing | 10 |
fastp — ultra-fast all-in-one FASTQ preprocessor for quality control, adapter trimming, quality filtering, per-read trimming, polyG/polyX removal, UMI processing, base correction, deduplication, and c | OpenGene/fastp | QC & Preprocessing | 11 |
MAFFT — multiple sequence alignment for nucleotide and protein sequences. Implements progressive (FFT-NS-1, FFT-NS-2), iterative refinement (FFT-NS-i), and consistency-based iterative methods (L-INS-i | manual | Phylogenetics | 11 |
NanoFilt — Python tool for filtering and trimming Oxford Nanopore long-read sequencing data in FASTQ format. Filters reads by minimum average quality score, minimum/maximum read length, and GC content | wdecoster/nanofilt | QC & Preprocessing | 9 |
NanoPlot — visualization and quality-control tool for Oxford Nanopore long-read sequencing data. Generates read length histograms, quality distribution plots, cumulative yield curves, and alignment id | wdecoster/NanoPlot | QC & Preprocessing | 11 |
pbmm2 — SMRT C++ wrapper for minimap2's C API providing native PacBio long-read alignment. Supports CCS/HiFi, SUBREAD, ISOSEQ, and UNROLLED presets. Reads native PacBio BAM and dataset XML inputs, pro | PacificBiosciences/pbmm2 | Genomics | 11 |
Pharokka — fast, scalable bacteriophage genome annotation. Annotates phage genomes using pyhmmer (PHROGs), MMseqs2 (CARD, VFDB, COG), and tRNAscan-SE. Produces GenBank, GFF3, and tabular outputs in mi | gbouras13/pharokka | Metagenomics | 10 |
Prodigal/Pyrodigal — prokaryotic gene prediction tools that identify protein-coding sequences (CDS) in bacterial and archaeal genomes and metagenomes using dynamic programming. Prodigal is the canonic | hyattpd/Prodigal | Genomics | 10 |
RSEM (RNA-Seq by Expectation-Maximization) — accurate gene and isoform quantification from RNA-seq data using EM algorithm with probabilistic multi-mapped read assignment. Supports STAR, Bowtie2, and | deweylab/RSEM | Transcriptomics | 11 |
somalier -- fast sample-identity verification and relatedness checking for BAM/CRAM/VCF/GVCF files. Extracts genotypes at informative polymorphic sites, computes pairwise relatedness using bit-vector | brentp/somalier | Genomics | 10 |
trimAl — automated alignment trimming tool for removing spurious sequences and poorly aligned regions from multiple sequence alignments. Supports automated heuristic methods (gappyout, strict, strictp | inab/trimal | Phylogenetics | 10 |
VCFtools — C++ toolkit for filtering, comparing, summarizing, converting, and manipulating VCF (Variant Call Format) and BCF files. Provides site and individual-level filtering, allele frequency calcu | vcftools/vcftools | Genomics | 9 |
scikit-allel — Python package for exploratory analysis of large-scale genetic variation data. Provides data structures for genotypes, haplotypes, and allele counts (GenotypeArray, HaplotypeArray, Alle | cggh/scikit-allel | Metagenomics | 12 |
Zarr — chunked, compressed N-dimensional arrays for Python with cloud-native storage. Provides hierarchical groups, pluggable compression codecs (Blosc, Zstd, Gzip), sharding for large-scale datasets, | zarr-developers/zarr-python | Utilities & Infrastructure | 18 |
Use when working with aurora — a machine learning GWAS R package for identifying microbial habitat adaptation genes and autochthonous strain provenance. Implements a Random Forest + random-walk (AUtoc | DalimilBujdos/aurora | Machine Learning | 14 |
Bambi (BAyesian Model-Building Interface) is a high-level Python package for fitting Bayesian generalized linear and generalized linear mixed models using a concise, R-style Wilkinson formula syntax. | bambinos/bambi | Statistics | 12 |
CoralNet is a web platform for benthic image analysis, serving as a repository and resource. It uses deep learning for automated annotation of benthic images, reducing the manual inspection bottleneck | coralnet/coralnet | Metagenomics | 8 |
Dandelion (sc-dandelion) is a Python package for single-cell BCR and TCR immune repertoire analysis integrated with scanpy. Processes 10x Genomics VDJ output (filtered_contig_annotations.csv, airr_rea | zktuong/dandelion | Clinical Genomics | 9 |
DoE.base is a foundational R package by Ulrike Groemping for full factorial experimental designs and designs based on orthogonal arrays. It provides utility functions for the shared `design` class use | manual | Statistics | 12 |
EpiEstim — CRAN R package from the MRC Centre for Global Infectious Disease Analysis for estimating the time-varying (instantaneous) reproduction number R_t from epidemic incidence time series. Implem | mrc-ide/EpiEstim | Clinical Genomics | 12 |
Use when working with Escher for metabolic pathway visualization, map editing, data overlay, or embedding interactive maps in notebooks and web pages. Covers the browser Builder and Viewer, the Python | zakandrewking/escher | Systems Biology | 9 |
fdrtool — estimation of tail area-based false discovery rates (Fdr/q-values) and density-based local false discovery rates (fdr) from observed test statistics. Supports four null models: normal (z-sco | cran/fdrtool | Statistics | 15 |
Finlay is a fast Triangle-Free Solver for undirected graphs, part of the Aegypti project. It detects, counts, and lists triangles in graphs provided in DIMACS or Boolean Adjacency Matrix formats. It p | frankvegadelgado/finlay | Clinical Genomics | 10 |
HDMT — High-Dimensional Mediation Testing for joint significance of exposure-mediator and mediator-outcome associations. Controls FWER and FDR for mediation hypotheses in high-dimensional settings (ep | jchen1981/HDMT | Statistics | 15 |
Hmsc (Hierarchical Modelling of Species Communities) is an R package for fitting joint species distribution models (JSDMs) in a Bayesian hierarchical framework. It links species occurrences or abundan | hmsc-r/HMSC | Other | 11 |
qvalue — Q-value estimation for false discovery rate control in multiple hypothesis testing. Estimates q-values, the proportion of true null hypotheses (pi0), and local false discovery rates from vect | StoreyLab/qvalue | Statistics | 15 |
Reactome — curated pathway knowledgebase and analysis platform for pathway enrichment, expression overlay, and species comparison. Use when user needs pathway over-representation analysis (ORA) with R | reactome/reactome2py | Visualization | 18 |
REDCap (Research Electronic Data Capture) is a secure, web-based application for building and managing online surveys and databases, widely used in clinical and translational research. This skill cove | redcap-tools/PyCap | Clinical Genomics | 12 |
rootwater is a Python toolbox for estimating root water uptake (RWU) from diurnal soil moisture dynamics in the rhizosphere, with complementary tools for converting sap velocity to sap flow in active | cojacoo/rootwater | Other | 12 |
t-SNE via Rtsne — Barnes-Hut t-distributed Stochastic Neighbor Embedding for nonlinear dimensionality reduction and visualization. Wraps Van der Maaten's C++ Barnes-Hut implementation for O(n log n) c | jkrijthe/Rtsne | Statistics | 13 |
Scanpy — scalable Python toolkit for analyzing single-cell gene expression data built on AnnData. Provides preprocessing (QC, normalization, feature selection, dimensionality reduction), clustering (L | scverse/scanpy | Single-Cell | 12 |
Query STRING API for protein-protein interaction networks, functional enrichment, and interaction partner discovery. Covers 59M proteins across 5000+ species with 20B+ scored interactions from 7 evide | manual | Proteomics | 17 |
Cenote-Taker 3 is a comprehensive bioinformatics pipeline for the discovery and annotation of diverse viral genomes (the virome) from metagenomic assemblies or individual sequences. It identifies vira | mtisza1/Cenote-Taker2 | Metagenomics | 9 |
TCGAbiolinks for searching, downloading, and analyzing cancer genomics data from the NCI Genomic Data Commons (GDC). Routes tasks for GDCquery/GDCdownload, data preparation with GDCprepare, differenti | BioinformaticsFMRP/TCGAbiolinks | Clinical Genomics | 17 |
Uni-Mol universal 3D molecular representation learning framework for molecular property prediction, protein-ligand docking, drug-target interaction, and binding affinity scoring. Covers unimol_tools P | dptech-corp/Uni-Mol | Drug Discovery | 17 |
windspharm — high-level Python library for global wind field computations on the sphere using spherical harmonic transforms (SPHEREPACK). Compute vorticity, divergence, stream function, velocity poten | ajdawson/windspharm | Other | 11 |
xclim is an xarray-based climate analytics library for computing climate indicators and indices, running data quality checks, managing ensembles, and batch-processing netCDF datasets from the command | Ouranosinc/xclim | Other | 12 |
Use when working with ABRicate: mass screening of contigs | tseemann/abricate | Metagenomics | 10 |
slow5lib and blue-crab — C library and Python tools for reading, writing, and converting Oxford Nanopore raw signal data in SLOW5/BLOW5 format. blue-crab converts between FAST5, POD5, and SLOW5/BLOW5 | hasindu2008/slow5lib | Genomics | 9 |
Use when working with xMIP (formerly cmip6_preprocessing) — a Python package designed to make CMIP6 (Climate Model Intercomparison Project Phase 6) data "analysis-ready" by homogenizing inconsistent n | jbusecke/xMIP | Other | 12 |
Use when working with ACE (Accurate CRISPR Essentiality) — a Python package from the Pe'er Lab (Memorial Sloan Kettering) for estimating gene essentiality from CRISPR pooled knockout screens. Models g | dpeerlab/ACE | Systems Biology | 10 |
Cell Ranger — 10x Genomics official analysis pipeline for Chromium single-cell and spatial data. Performs FASTQ generation, alignment (STAR-based), barcode processing, UMI counting, V(D)J assembly, Fe | 10XGenomics/cellranger | Single-Cell | 12 |
ClonalFrameML (CFML) — maximum-likelihood inference of recombination in bacterial genomes. Detects horizontally transferred genomic segments, estimates recombination parameters (R/theta, nu, delta), a | xavierdidelot/ClonalFrameML | Phylogenetics | 8 |
EDTA (Extensive de-novo TE Annotator) — automated transposable element annotation pipeline for plant genomes. Integrates LTRharvest, LTR_FINDER, TIR-Learner, Helitron-Scanner, and RepeatMasker into a | oushujun/EDTA | Other | 10 |
MEGAHIT — ultra-fast and memory-efficient NGS assembler for large and complex metagenomes, single genomes, and single-cell data. Uses succinct de Bruijn graph (SdBG) with an iterative multi k-mer stra | voutcn/megahit | Metagenomics | 8 |
SPAdes — de Bruijn graph genome assembler for Illumina, IonTorrent, and hybrid short+long read data. Modes: isolate (--isolate), metagenomics (--meta), single-cell MDA (--sc), plasmid recovery (--plas | ablab/spades | Metagenomics | 11 |