Find open-source science resources
A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.
Ultra-fast, sensitive search and clustering suite for protein and nucleotide sequence sets.
Freely available tools for biological computing in Python, with included cookbook, packaging and thorough documentation. Part of the [Open Bioinformatics Foundation](http://open-bio.org/). Contains the very useful [Entrez](https://biopython.org/DIST/docs/api/Bio.Entrez-module.html) package for API access to the NCBI databases.
A compressor for common genomic file formats (BAM, CRAM, FASTQ, VCF, etc.).
samtools/bcftools are a suite of tools for manipulating NGS data and can be used to call variants.
A Workflow Management System geared towards scientific workflows.
A haplotype-resolved assembler for accurate HiFi reads.
A small language for defining pipeline stages and linking them together to make pipelines.
The modern C++ library for sequence analysis.
Structural variant discovery by integrated paired-end and split-read analysis.
Sequence manipulation toolkit for FASTA/FASTQ files written in Nim.
A quality control tool for high throughput sequence data.
Utilities for working with CSV/Tab-delimited files.
Suite of tools to handle gene annotations in any GTF/GFF format.
Cython + HTSlib == fast VCF parsing; even faster parsing than pyVCF.
SPAdes (St. Petersburg genome assembler) is an assembly toolkit containing various assembly pipelines and the de facto standard for prokaryotic genome assemblies.
Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.
Another cross-platform, efficient, practical and pretty CSV/TSV toolkit.
Access to Biological Web Services from Python.
GFF and GTF file manipulation and interconversion.
Deep learning-based variant caller.
Genetic variant annotation and effect prediction toolbox.
A single molecule sequence assembler for genomes large and small.
FASTQ and SAM quality control using Python.
BWA-MEM drop-in replacement: 2-3x faster, 2-5x cheaper, 100% identical output on standard CPUs.
lumpy: a general probabilistic framework for structural variant discovery.
A polymorphic Bayesian genotyping model with wide applicability.
A specification for describing analysis workflows and tools that are portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments.
Prokka: rapid prokaryotic genome annotation. Prokka is one of the most-cited command-line tools for microbial genome annotation.
Biocaml aims to be a high-performance user-friendly library for Bioinformatics.
Sort genomic files according to a specified order.
SIMD C library for global, semi-global, and local pairwise sequence alignments.
GRIDSS: the Genomic Rearrangement IDentification Software Suite.
Collection of tools for working with BAM files.
Burrows-Wheeler Aligner for pairwise alignment between DNA sequences.
A system for rapidly aligning entire genomes, whether in complete or draft form.
A Go library and command line utility for engineering organisms.
Batteries included genomic analysis pipeline for variant and RNA-Seq analysis, structural variant calling, annotation, and prediction.
Workflow library embedded in the Go programming language, focusing on supporting complex workflow constructs, compiling to a single binary, and providing powerful file naming and comprehensive audit reports for every output.
Resources on ChIP-seq data which include papers, methods, links to software, and analysis.
Structural variant calling and genotyping with existing tools, but smoothly.
A collection of research papers for AI-based protein design.
A solid path for those who want to complete a bioinformatics course on their own time, for free, with courses from the best universities in the world.
Convenient file format conversion using Biopython.
Predicts whether an amino acid substitution affects protein function.