Find open-source science resources

A compressor of common genomic file formats (BAM, CRAM, FASTQ, VCF etc).

Active · 183 · 6 days ago

bcftools

Variant Calling

samtools/bcftools are a suite of tools for manipulating NGS data and can be used to call variants.

Active · 871 · 1 week ago

Cromwell

A Workflow Management System geared towards scientific workflows.

Active · 1.1K · 1 week ago

Scala

BSD-3-Clause

hifiasm

Long-read Assembly

A haplotype-resolved assembler for accurate HiFi reads.

Active · 779 · 1 week ago

Bpipe

A small language for defining pipeline stages and linking them together to make pipelines.

Active · 242 · 1 week ago

Groovy

SeqAn

Package suites

The modern C++ library for sequence analysis.

Active · 454 · 2 weeks ago

Delly

Structural variant discovery by integrated paired-end and split-read analysis.

Active · 521 · 2 weeks ago

BSD-3-Clause

SeqFu

Sequence manipulation toolkit for FASTA/FASTQ files written in Nim.

Active · 127 · 2 weeks ago

Nim

FastQC

A quality control tool for high throughput sequence data.

Active · 601 · 2 weeks ago

Java

CSVKit

Command Line Utilities

Utilities for working with CSV/Tab-delimited files.

Active · 6.4K · 3 weeks ago

AGAT

GFF BED File Utilities

Suite of tools to handle gene annotations in any GTF/GFF format.

Active · 571 · 3 weeks ago

HTML

cyvcf2

Tools

Cython + HTSlib == fast VCF parsing; even faster parsing than pyVCF.

Active · 443 · 3 weeks ago

Cython

pyensembl

Data

Pythonic Access to the Ensembl database.

Active · 400 · 3 weeks ago

SPAdes

Assembly

SPAdes (St. Petersburg genome assembler) is an assembly toolkit containing various assembly pipelines and is the de facto standard for prokaryotic genome assemblies.

Active · 935 · 1 month ago

mosdepth

BAM File Utilities

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing.

Active · 856 · 1 month ago

Nim

bioservices

Data

Access to Biological Web Services from Python.

Active · 337 · 1 month ago

Awesome-Pipeline

Pipelines

A list of pipeline resources.

Active · 6.6K · 1 month ago

redun

A Python-based workflow manager.

Active · 590 · 1 month ago

gffutils

GFF BED File Utilities

GFF and GTF file manipulation and interconversion.

Active · 319 · 2 months ago

DeepVariant

Variant Calling

Deep learning-based variant caller.

Active · 3.7K · 2 months ago

Variant Prediction/Annotation

BSD-3-Clause

SnpEff

Genetic variant annotation and effect prediction toolbox.

Active · 308 · 3 months ago

Java

canu

Long-read Assembly

A single molecule sequence assembler for genomes large and small.

Active · 700 · 3 months ago

Fastqp

FASTQ and SAM quality control using Python.

Active · 109 · 3 months ago

BWA-FastAlign

Pairwise

BWA-MEM drop-in replacement: 2-3x faster, 2-5x cheaper, 100% identical output on standard CPUs.

Active · 22 · 3 months ago

lumpy

lumpy: a general probabilistic framework for structural variant discovery.

Active · 342 · 3 months ago

Octopus

Variant Calling

A polymorphic Bayesian genotyping model with wide applicability.

Active · 323 · 3 months ago

Common Workflow Language

A specification for describing analysis workflows and tools that are portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high-performance computing (HPC) environments.

Active · 1.5K · 5 months ago

Common Workflow Language

Prokka

Annotation

Prokka: rapid prokaryotic genome annotation. Prokka is one of the most cited annotation command line tools for microbial genome annotations.

Active · 982 · 5 months ago

Perl

Biocaml

Package suites

Biocaml aims to be a high-performance, user-friendly library for bioinformatics.

Idle · 125 · 6 months ago

OCaml

gsort

Command Line Utilities

Sort genomic files according to a specified order.

Idle · 36 · 7 months ago

Parasail

Pairwise

SIMD C library for global, semi-global, and local pairwise sequence alignments.

Idle · 284 · 9 months ago

fujiplot

Circos Related

A circos representation of multiple GWAS results.

Idle · 97 · 1 year ago

gridss

GRIDSS: the Genomic Rearrangement IDentification Software Suite.

Idle · 283 · 1 year ago

Java

Bamtools

BAM File Utilities

Collection of tools for working with BAM files.

Idle · 430 · 1 year ago

Bedtools2

GFF BED File Utilities

A Swiss Army knife for genome arithmetic.

Idle · 1K · 1 year ago

(Poly)merase

Package suites

A Go library and command line utility for engineering organisms.

Idle · 729 · 1 year ago

bcbio-nextgen

Pipelines

Batteries-included genomic analysis pipeline for variant and RNA-Seq analysis, structural variant calling, annotation, and prediction.

Idle · 1K · 1 year ago

SciPipe

Workflow library embedded in the Go programming language, focusing on supporting complex workflow constructs, compiling to a single binary, and providing powerful file naming and comprehensive audit reports for every output.

Idle · 1.1K · 1 year ago

ChIP-seq analysis notes from Tommy Tang

ChIP-Seq

Resources on ChIP-seq data, including papers, methods, links to software, and analysis.

Idle · 850 · 1 year ago

smof

UNIX-style FASTA manipulation tools.

Idle · 17 · 1 year ago

smoove

Structural variant calling and genotyping with existing tools, but, smoothly.

Idle · 264 · 1 year ago

Awesome AI-based Protein Design

Bioinformatics on GitHub

A collection of research papers for AI-based protein design.

Stale · 306 · 2 years ago

Becoming a Bioinformatician

Open Source Society University on Bioinformatics

A solid path for those who want to complete a bioinformatics course on their own time, for free, with courses from the best universities in the world.

Archived · 6.9K · 2 years ago

seqmagick

Convenient file format conversion using Biopython.

Stale · 118 · 2 years ago

Variant Prediction/Annotation

SIFT

Predicts whether an amino acid substitution affects protein function.

Stale · 548 · 2 years ago

wtdbg2

Long-read Assembly

A fuzzy Bruijn graph approach to long noisy reads assembly.

Stale · 530 · 2 years ago

Informatics for RNA-seq: A web resource for analysis on the cloud

RNA-Seq

Educational resource on performing RNA-seq analysis in the cloud using Amazon AWS services. Topics include preparing the data, preprocessing, differential expression, isoform discovery, data visualization, and interpretation.

Stale · 1.4K · 3 years ago