Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
17 of 5,893 resources
Ultra-fast, sensitive search and clustering suite for protein and nucleotide sequence sets.
This repository contains GGUF files for gemma4-12b-bioinfo, a fine-tuned Gemma 4 12B model for bioinformatics and computational biology.
A compressor of common genomic file formats (BAM, CRAM, FASTQ, VCF etc).
samtools/bcftools are a suite of tools for manipulating NGS data and can be used to call variants.
Fast and accurate protein structure search using a learned 3Di structural alphabet (VQ-VAE) that discretizes tertiary interactions into structural tokens. This enables protein-universe-scale structural alignment at sequence-search speeds (4-5 orders of magnitude faster than DALI/TM-align) and underpins many AI4S tools such as SaProt, ESMAtlas search, and AFDB clustering pipelines (Steinegger Lab, Nature Biotechnology 2023).
Genome mapping and spliced alignment of cDNA or amino acid sequences
BWA-MEM drop-in replacement: 2-3x faster, 2-5x cheaper, 100% identical output on standard CPUs.
lumpy: a general probabilistic framework for structural variant discovery.
A python extension, written in C, for quick access to bigBed files and access to and creation of bigWig files.
SIMD C library for global, semi-global, and local pairwise sequence alignments
Minigraph is a sequence-to-graph mapper and graph constructor. For graph generation, it aligns a query sequence against a sequence graph and incrementally augments an existing graph with long query subsequences diverged from the graph.
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.
A database system designed to store, organize, and manage large-scale nucleotide sequencing read data (such as PacBio reads) for the Dazzler genome assembler.
A fuzzy Bruijn graph approach to long noisy reads assembly
VerityMap is a tool for mapping long reads to assemblies of extra-long tandem repeats, producing SAM files and identifying potential heterozygous sites and assembly errors through analysis of rare k-mers. It supports PacBio HiFi and ONT reads and generates interactive HTML plots for variant analysis.
FASTQ/A short-reads pre-processing tools: Demultiplexing, trimming, clipping, quality filtering, and masking utilities.
Comprehensive set of programs for phylogenetic analyses; available for PC and Mac; source code available for easy compiling in UNIX.