Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

126 of 5,893 resources

Showing 101–126

Fast sample-swap and relatedness checks on BAMs/CRAMs/VCFs/GVCFs.

Telseq is a tool for estimating telomere length from whole genome sequence data.

Annotate a VCF with other VCFs/BEDs/tabixed files.

A C++ library for parsing and manipulating VCF files.

VCF manipulation and statistics (e.g. linkage disequilibrium, allele frequency, Fst).

Tools for adding mutations to existing `.bam` files, used for testing mutation callers.

**Comes with samtools!** - Read simulator.

Pythonic access to the UCSC Genome database.

A port of [pyVCF](https://github.com/jamescasbon/PyVCF) using Cython for speed.

Python library for blazing-fast genomic interval operations and genomic file format I/O on Polars DataFrames.

Python wrapper for [bedtools](https://github.com/arq5x/bedtools).

Pythonic access to FASTA files.

Python wrapper for [samtools](https://github.com/samtools/samtools).

A VCF Parser for Python.

Scalable toolkit for analyzing single-cell gene expression data, including preprocessing, visualization, clustering, and trajectory inference.

SKESA is a de novo sequence read assembler for microbial genomes. It uses conservative heuristics and is designed to create breaks at repeat regions in the genome. This leads to excellent sequence quality without significantly compromising contiguity.

Minimap2 is a pairwise aligner for genomic and spliced nucleotide sequences. It can perform assembly-to-assembly alignment and works with gzip-compressed FASTQ and FASTA files. It also finds overlaps between long reads.

Bakta is a tool for the rapid & standardized annotation of bacterial genomes & plasmids. It provides dbxref-rich and sORF-including annotations in machine-readable JSON & bioinformatics standard file formats for automatic downstream analysis.

De novo assembler for single molecule sequencing reads using repeat graphs.

Embeddable genome viewer. Integrates data from a wide variety of sources and can load data directly from popular genomics file formats including bigWig, BAM, and VCF.

Flexible circular visualization of genome-associated data with BioPerl and SVG.

D3-based JavaScript horizon chart library for DNA data.

Java-based browser. Fast, efficient, scalable visualization tool for genomics data and annotations. Handles a large variety of formats.

JavaScript genome browser that is highly customizable via plugins and track customizations.

List of resources on alternative splicing including software, databases, and other tools.

[@crazyhottommy](https://github.com/crazyhottommy)'s notes on various steps and considerations when doing RNA-seq analysis.