Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

126 of 5,893 resources

A compressor for common genomic file formats (BAM, CRAM, FASTQ, VCF, etc.).

Active · 183 · 6 days ago
C
NOASSERTION

samtools/bcftools are a suite of tools for manipulating NGS data and can be used to call variants.

Active · 871 · 1 week ago
C
NOASSERTION

A Workflow Management System geared towards scientific workflows.

Active · 1.1K · 1 week ago
Scala
BSD-3-Clause

A haplotype-resolved assembler for accurate HiFi reads.

Active · 779 · 1 week ago
C++
MIT

A small language for defining pipeline stages and linking them together to make pipelines.

Active · 242 · 1 week ago
Groovy
NOASSERTION

The modern C++ library for sequence analysis.

Active · 454 · 2 weeks ago
C++
NOASSERTION

Structural variant discovery by integrated paired-end and split-read analysis.

Active · 521 · 2 weeks ago
C++
BSD-3-Clause

Sequence manipulation toolkit for FASTA/FASTQ files written in Nim.

Active · 127 · 2 weeks ago
Nim
GPL-3.0

A quality control tool for high throughput sequence data.

Active · 601 · 2 weeks ago
Java
GPL-3.0

Utilities for working with CSV/Tab-delimited files.

Active · 6.4K · 3 weeks ago
Python
MIT

Suite of tools to handle gene annotations in any GTF/GFF format.

Active · 571 · 3 weeks ago
HTML
GPL-3.0

Cython + HTSlib == fast VCF parsing; even faster parsing than pyVCF.

Active · 443 · 3 weeks ago
Cython
MIT

Pythonic Access to the Ensembl database.

Active · 400 · 3 weeks ago
Python
Apache-2.0

SPAdes (St. Petersburg genome assembler) is an assembly toolkit containing various assembly pipelines and is the de facto standard for prokaryotic genome assemblies.

Active · 935 · 1 month ago
C++
NOASSERTION

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

Active · 856 · 1 month ago
Nim
MIT

Access to Biological Web Services from Python.

Active · 337 · 1 month ago
Python
NOASSERTION

A list of pipeline resources.

Active · 6.6K · 1 month ago

A Python-based workflow manager.

Active · 590 · 1 month ago
Python
Apache-2.0

GFF and GTF file manipulation and interconversion.

Active · 319 · 2 months ago
Python
MIT

Deep learning-based variant caller.

Active · 3.7K · 2 months ago
Python
BSD-3-Clause

Genetic variant annotation and effect prediction toolbox.

Active · 308 · 3 months ago
Java
NOASSERTION

A single molecule sequence assembler for genomes large and small.

Active · 700 · 3 months ago
C++

FASTQ and SAM quality control using Python.

Active · 109 · 3 months ago
Python
MIT

BWA-MEM drop-in replacement: 2-3x faster, 2-5x cheaper, 100% identical output on standard CPUs.

Active · 22 · 3 months ago
C
MIT

lumpy: a general probabilistic framework for structural variant discovery.

Active · 342 · 3 months ago
C
MIT

A polymorphic Bayesian genotyping model with wide applicability.

Active · 323 · 3 months ago
C++
MIT

A specification for describing analysis workflows and tools that are portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high-performance computing (HPC) environments.

Active · 1.5K · 5 months ago
Common Workflow Language
Apache-2.0

Prokka: rapid prokaryotic genome annotation. Prokka is one of the most cited annotation command line tools for microbial genome annotations.

Active · 982 · 5 months ago
Perl
GPL-3.0

Biocaml aims to be a high-performance user-friendly library for Bioinformatics.

Idle · 125 · 6 months ago
OCaml
NOASSERTION

Sort genomic files according to a specified order.

Idle · 36 · 7 months ago
Go
MIT

SIMD C library for global, semi-global, and local pairwise sequence alignments.

Idle · 284 · 9 months ago
C
NOASSERTION

A circos representation of multiple GWAS results.

Idle · 97 · 1 year ago
R
GPL-3.0

GRIDSS: the Genomic Rearrangement IDentification Software Suite.

Idle · 283 · 1 year ago
Java
NOASSERTION

Collection of tools for working with BAM files.

Idle · 430 · 1 year ago
C++
MIT

A Swiss Army knife for genome arithmetic.

Idle · 1K · 1 year ago
C
MIT

A Go library and command line utility for engineering organisms.

Idle · 729 · 1 year ago
Go
MIT

Batteries included genomic analysis pipeline for variant and RNA-Seq analysis, structural variant calling, annotation, and prediction.

Idle · 1K · 1 year ago
Python
MIT

Workflow library embedded in the Go programming language, focusing on supporting complex workflow constructs, compiling to a single binary, and providing powerful file naming and comprehensive audit reports for every output.

Idle · 1.1K · 1 year ago
Go
MIT

Resources on ChIP-seq data which include papers, methods, links to software, and analysis.

Idle · 850 · 1 year ago
Python
MIT

UNIX-style FASTA manipulation tools.

Idle · 17 · 1 year ago
Python
MIT

Structural variant calling and genotyping with existing tools, but smoothly.

Idle · 264 · 1 year ago
Go
Apache-2.0

A collection of research papers for AI-based protein design.

Stale · 306 · 2 years ago
Apache-2.0

A solid path for those who want to complete a bioinformatics course on their own time, for free, with courses from the best universities in the world.

Archived · 6.9K · 2 years ago

Convenient file format conversion using Biopython.

Stale · 118 · 2 years ago
Python
GPL-3.0

Predicts whether an amino acid substitution affects protein function.

Stale · 548 · 2 years ago
MIT

A fuzzy Bruijn graph approach to long noisy reads assembly.

Stale · 530 · 2 years ago
C
GPL-3.0

Educational resource on performing RNA-seq analysis in the cloud using Amazon Web Services (AWS). Topics include preparing the data, preprocessing, differential expression, isoform discovery, data visualization, and interpretation.

Stale · 1.4K · 3 years ago
R
NOASSERTION

Easily submit PBS jobs with a script template. Multiple input files are supported.

Stale · 29 · 3 years ago
Python
MIT

Syntax highlighting for computational biology file formats (SAM, VCF, GTF, FASTA, PDB, etc.) in vim/less/gedit/sublime.

Stale · 272 · 3 years ago
Shell
GPL-3.0

Point and click, cross platform suite for analysing and visualizing next-generation sequencing datasets.

Stale · 17 · 3 years ago
TypeScript
GPL-3.0