Find open-source science resources

Hierarchical Generation of Molecular Graphs using Structural Motifs.

Stale4383 years ago

Neural Operators & Model Discovery

DeepONet

Learning nonlinear operators

Stale8193 years ago

GNET2

GeneExpression

Cluster genes to functional groups with E-M process. Iteratively perform TF assigning and Gene assigning, until the assignment of genes did not change, or max number of iterations is reached.

Stale23 years ago

trident-chemwidgets

Molecular Visualization

Jupyter Widgets to interact with molecular datasets.

Stale333 years ago

CSS

BSD-3-Clause

Algorithm Metadata Vocabulary

Algorithm Metadata Vocabulary is a vocabulary for capturing and storing the metadata about the algorithms (a procedure or a set of rules that is followed step-by-step to solve a problem, especially by a computer). There are uncountable algorithms present in every area (e.g., Computer Science, Mathematics), which makes it hard for specialists, academicians, application engineers, and so forth to discover, distinguish, select, and reuse them. [from repository]

Stale03 years ago

CC0-1.0

eds

Sequencing

This packages provides a single function, readEDS. This is a low-level utility for reading in Alevin EDS format into R. This function is not designed for end-users but instead the package is predominantly for simplifying package dependency graph for other Bioconductor packages.

Stale04 years ago

scDDboost

SingleCell

scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.

Stale04 years ago

GPL-2.0+

Food-Biomarker Ontology

Stale174 years ago

CC-BY-4.0

snapcount

Coverage

snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).

Stale34 years ago

multiHiCcompare

multiHiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. This extension of the original HiCcompare package now allows for Hi-C experiments with more than 2 groups and multiple samples per group. multiHiCcompare operates on processed Hi-C data in the form of sparse upper triangular matrices. It accepts four column (chromosome, region1, region2, IF) tab-separated text files storing chromatin interaction matrices. multiHiCcompare provides cyclic loess and fast loess (fastlo) methods adapted to jointly normalizing Hi-C data. Additionally, it provides a general linear model (GLM) framework adapting the edgeR package to detect differences in Hi-C data in a distance dependent manner.

Stale104 years ago

cellmigRation

CellBiology

Import TIFF images of fluorescently labeled cells, and track cell movements over time. Parallelization is supported for image processing and for fast computation of cell trajectories. In-depth analysis of cell trajectories is enabled by 15 trajectory analysis functions.

Stale14 years ago

Molecular Transformers

Chemical Synthesis

AI for chemical reaction prediction and synthesis planning

Stale4244 years ago

Big Open Network of Sustainability Assessment Information Ontology

Stale24 years ago

InterCellar

InterCellar is implemented as an R/Bioconductor Package containing a Shiny app that allows users to interactively analyze cell-cell communication from scRNA-seq data. Starting from precomputed ligand-receptor interactions, InterCellar provides filtering options, annotations and multiple visualizations to explore clusters, genes and functions. Finally, based on functional annotation from Gene Ontology and pathway databases, InterCellar implements data-driven analyses to investigate cell-cell communication in one or multiple conditions.

Stale124 years ago

CluMSID

Metabolomics

CluMSID is a tool that aids the identification of features in untargeted LC-MS/MS analysis by the use of MS2 spectra similarity and unsupervised statistical methods. It offers functions for a complete and customisable workflow from raw data to visualisations and is interfaceable with the xmcs family of preprocessing packages.

Stale104 years ago

scDD

ImmunoOncology

This package implements a method to analyze single-cell RNA- seq Data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.

Stale354 years ago

tomoda

GeneExpression

This package provides many easy-to-use methods to analyze and visualize tomo-seq data. The tomo-seq technique is based on cryosectioning of tissue and performing RNA-seq on consecutive sections. (Reference: Kruse F, Junker JP, van Oudenaarden A, Bakkers J. Tomo-seq: A method to obtain genome-wide expression data with spatial resolution. Methods Cell Biol. 2016;135:299-307. doi:10.1016/bs.mcb.2016.01.006) The main purpose of the package is to find zones with similar transcriptional profiles and spatially expressed genes in a tomo-seq sample. Several visulization functions are available to create easy-to-modify plots.

Stale04 years ago

CIMICE

CIMICE is a tool in the field of tumor phylogenetics and its goal is to build a Markov Chain (called Cancer Progression Markov Chain, CPMC) in order to model tumor subtypes evolution. The input of CIMICE is a Mutational Matrix, so a boolean matrix representing altered genes in a collection of samples. These samples are assumed to be obtained with single-cell DNA analysis techniques and the tool is specifically written to use the peculiarities of this data for the CMPC construction.

Stale04 years ago

Artistic-2.0

MBQN

Normalization

Modified quantile normalization for omics or other matrix-like data distorted in location and scale.

Stale24 years ago

Fastx Tookit

Sequence Processing

FASTQ/A short-reads pre-processing tools: Demultiplexing, trimming, clipping, quality filtering, and masking utilities.

Stale2024 years ago

sarks

MotifDiscovery

Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.

Stale34 years ago

BSD-3-Clause

dsw

Stale54 years ago

Shell

CC0-1.0

Functional Mock-up Interface Ontology

An ontology transcription of definitions in the Functional Mock-up Interface (FMI) standard document from https://fmi-standard.org/ that enables representing Functional Mock-up Units (FMUs) in RDF

Stale24 years ago

maser

AlternativeSplicing

This package provides functionalities for downstream analysis, annotation and visualizaton of alternative splicing events generated by rMATS.

Stale214 years ago

sevenbridges

R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.

Stale374 years ago

VaSP

RNASeq

Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: The visualizing function is removed due to the dependent package Sushi deprecated. If you want to use it, please change back to an older version.)

Stale34 years ago

GPL-2.0+

Arctic Data Center Academic Disciplines Ontology

Genome Browsers / Gene Diagrams

An ontology to support disciplinary annotation of Arctic Data Center datasets.

Stale14 years ago

CC-BY-4.0

pileup.js

JavaScript library that can be used to generate interactive and highly customizable web-based genome browsers.

Stale2814 years ago

JavaScript

HiLDA

A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.

Stale34 years ago

cmapR

DataImport

Genome Browsers / Gene Diagrams

The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.

Stale934 years ago

Other

BioJS

BioJS is a library of over hundred JavaScript components enabling you to visualize and process data using current web technologies.

Stale5064 years ago

IntramiRExploreR

Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data, from Affymetrix 1 and Affymetrix2 microarray array platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.

Stale04 years ago

Common Information Model Ontology

An ontology encoding the Common Information Model (CIM) schema

Stale84 years ago

Code Ontology

Stale294 years ago

Java

scReClassify

A post hoc cell type classification tool to fine-tune cell type annotations generated by any cell type classification procedure with semi-supervised learning algorithm AdaSampling technique. The current version of scReClassify supports Support Vector Machine and Random Forest as a base classifier.

Stale114 years ago

scDataviz

SingleCell

In the single cell World, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additonally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexibile and compatibile with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.

Stale674 years ago

tib.re

Stale04 years ago

MEAT

Epigenetics

This package estimates epigenetic age in skeletal muscle, using DNA methylation data generated with the Illumina Infinium technology (HM27, HM450 and HMEPIC).

Stale14 years ago

bot

Stale624 years ago

HTML

tabix

Command Line Utilities

Table file index.

Archived924 years ago

HDTD

DifferentialExpression

Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.

Stale14 years ago

cyanoFilter

FlowCytometry

An approach to filter out and/or identify phytoplankton cells from all particles measured via flow cytometry pigment and cell complexity information. It does this using a sequence of one-dimensional gates on pre-defined channels measuring certain pigmentation and complexity. The package is especially tuned for cyanobacteria, but will work fine for phytoplankton communities where there is at least one cell characteristic that differentiates every phytoplankton in the community.

Stale04 years ago

Rqc

Sequencing

Rqc is an optimised tool designed for quality control and assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces a report which contains a set of high-resolution graphics.

Stale04 years ago

GPL-2.0+

granulator

RNASeq

granulator is an R package for the cell type deconvolution of heterogeneous tissues based on bulk RNA-seq data or single cell RNA-seq expression profiles. The package provides a unified testing interface to rapidly run and benchmark multiple state-of-the-art deconvolution methods. Data for the deconvolution of peripheral blood mononuclear cells (PBMCs) into individual immune cell types is provided as well.

Stale34 years ago

fedup

GeneSetEnrichment

An R package that tests for enrichment and depletion of user-defined pathways using a Fisher's exact test. The method is designed for versatile pathway annotation formats (eg. gmt, txt, xlsx) to allow the user to run pathway analysis on custom annotations. This package is also integrated with Cytoscape to provide network-based pathway visualization that enhances the interpretability of the results.

Stale74 years ago

Ruffus

Workflow Managers

Computation Pipeline library for python widely used in science and bioinformatics.

Stale1754 years ago

Genome Browsers / Gene Diagrams

Squiggle

Easy-to-use DNA sequence visualization tool that turns FASTA files into browser-based visualizations.

Archived424 years ago

Python for chemoinformatics

Stale2344 years ago

wpm

GUI

The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for a command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps controlling the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.

Stale64 years ago

Artistic-2.0

epidecodeR

DifferentialExpression

epidecodeR is a package capable of analysing impact of degree of DNA/RNA epigenetic chemical modifications on dysregulation of genes or proteins. This package integrates chemical modification data generated from a host of epigenomic or epitranscriptomic techniques such as ChIP-seq, ATAC-seq, m6A-seq, etc. and dysregulated gene lists in the form of differential gene expression, ribosome occupancy or differential protein translation and identify impact of dysregulation of genes caused due to varying degrees of chemical modifications associated with the genes. epidecodeR generates cumulative distribution function (CDF) plots showing shifts in trend of overall log2FC between genes divided into groups based on the degree of modification associated with the genes. The tool also tests for significance of difference in log2FC between groups of genes.

Stale55 years ago