Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

5,923 resources indexed

Showing 1,451–1,500

The Audience Types Controlled Vocabulary was created for the Resource Registry of NSF's EarthCube program. The vocabulary defines the types of audience that each resource in the program is targeted at. At this point the vocabulary is very bare, without even term definitions; however, the intention is to extend it over time. If you would like to assist with this or with extending any of the other controlled vocabularies/ontologies developed as part of the Resource Registry project, please see https://github.com/earthcubearchitecture-ecresourcereg.

Stale · 0 · 6 years ago

This mini-ontology contains classes and instances for each version of the licenses that are commonly used in software projects, particularly open source software projects. The URIs for each are the canonical URIs for that license (where they exist).

Stale · 0 · 6 years ago
HTML

The Extensible Observation Ontology (OBOE) is a formal ontology for capturing the semantics of scientific observation and measurement. The ontology helps researchers add detailed semantic annotations to scientific data, thereby clarifying the inherent meaning of scientific observations.

Stale · 33 · 6 years ago
XSLT

loci2path performs statistically rigorous enrichment analysis of eQTLs in genomic regions of interest, using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB.

Stale · 1 · 6 years ago
R
Artistic-2.0

An ontology that allows the description of numerical and categorical bibliometric data (e.g., journal impact factor, author h-index, categories describing research careers) in RDF.

Stale · 0 · 6 years ago

An ontology for describing the administrative information of research projects, e.g., grant applications, funding bodies, project partners, etc.

Stale · 2 · 6 years ago

An ontology based on PRO for describing the contributions that may be made, and the roles that may be held by a person with respect to a journal article or other publication (e.g. the role of article guarantor or illustrator).

Stale · 1 · 6 years ago

An ontology that permits the number of in-text citations of a cited source to be recorded, together with their textual citation contexts, along with the number of citations a cited entity has received globally on a particular date.

Stale · 0 · 6 years ago

An ontology meant to define bibliographic records, bibliographic references, and their compilation into bibliographic collections and bibliographic lists, respectively.

Stale · 0 · 6 years ago

An ontology that enables characterization of the nature or type of citations, both factually and rhetorically.

Stale · 15 · 6 years ago
CC-BY-4.0

An ontology for the characterisation of the roles of agents (people, corporate bodies and computational agents) in the publication process. These agents can be, e.g., authors, editors, reviewers, publishers or librarians.

Stale · 1 · 6 years ago

Flexible circular visualization of genome-associated data with BioPerl and SVG.

Stale · 46 · 6 years ago
Perl
NOASSERTION

geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.

Stale · 10 · 7 years ago
R
GPL-3.0+

JavaScript library for drawing canvas-based gene diagrams.

Stale · 76 · 7 years ago
JavaScript

TogoID is an ID conversion service implementing unique features with an intuitive web interface and an API for programmatic access. TogoID supports datasets from various biological categories such as gene, protein, chemical compound, pathway, disease, etc. TogoID users can perform exploratory multistep conversions to find a path among IDs. To guide the interpretation of biological meanings in the conversions, we crafted an ontology that defines the semantics of the dataset relations. (from https://togoid.dbcls.jp/)

Stale · 2 · 7 years ago
HTML

Contains functions and classes that are needed by arrayCGH packages.

Stale · 0 · 7 years ago
R
GPL

Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.

Stale · 23 · 7 years ago
R
Artistic-2.0
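
As a rough illustration of one codon usage (CU) bias measure of the kind this entry mentions, the sketch below counts codons and computes relative synonymous codon usage (RSCU) for a single codon family. It is written in Python and is not the listed tool's code or API; the function names and the example sequence are hypothetical, and the caller is assumed to supply the list of synonymous codons.

```python
from collections import Counter


def count_codons(seq):
    """Count in-frame codons, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def rscu(counts, synonymous_codons):
    """Relative synonymous codon usage for one amino acid's codon family.

    RSCU = observed count / (family total / number of synonymous codons),
    so a value of 1.0 means the codon is used exactly as often as expected
    under uniform usage within the family.
    """
    total = sum(counts[c] for c in synonymous_codons)
    if total == 0:
        # The amino acid does not occur, so no bias can be measured.
        return {c: 0.0 for c in synonymous_codons}
    expected = total / len(synonymous_codons)
    return {c: counts[c] / expected for c in synonymous_codons}
```

For the hypothetical sequence `GCTGCTGCCGCA` and the alanine family `["GCT", "GCC", "GCA", "GCG"]`, `rscu(count_codons(seq), family)` gives 2.0 for GCT, 1.0 for GCC and GCA, and 0.0 for GCG.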

VCFArray extends DelayedArray to represent VCF data entries as array-like objects with an on-disk or remote VCF file as the backend. Data entries from VCF files, including INFO fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER), can be converted into VCFArray instances with different dimensions.

Stale · 1 · 7 years ago
R
GPL-3.0

The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.

Stale · 1 · 7 years ago
HTML

Expertly curated genomics papers to get up to speed on genomics, RNA-seq, statistics (used in genomics), software development, and more.

Stale · 502 · 7 years ago

Perl package for circular plots, which are well suited for genomic rearrangements.

Stale · 88 · 7 years ago
Perl

Telseq is a tool for estimating telomere length from whole genome sequence data.

Stale · 76 · 7 years ago
C++
GPL-3.0

AbSeq is a comprehensive bioinformatic pipeline for the analysis of sequencing datasets generated from antibody libraries and abseqR is one of its packages. abseqR empowers the users of abseqPy (https://github.com/malhamdoosh/abseqPy) with plotting and reporting capabilities and allows them to generate interactive HTML reports for the convenience of viewing and sharing with other researchers. Additionally, abseqR extends abseqPy to compare multiple repertoire analyses and perform further downstream analysis on its output.

Stale · 0 · 7 years ago
R
GPL-3.0

All alleles from the IPD IMGT/HLA <https://www.ebi.ac.uk/ipd/imgt/hla/> and IPD KIR <https://www.ebi.ac.uk/ipd/kir/> databases for Homo sapiens. Reference: Robinson J, Maccari G, Marsh SGE, Walter L, Blokhuis J, Bimber B, Parham P, De Groot NG, Bontrop RE, Guethlein LA, and Hammond JA. KIR Nomenclature in non-human species. Immunogenetics (2018), in preparation.

Stale · 0 · 7 years ago
R
Artistic-2.0

An ontology that enables the description of reviews of scientific articles and other scholarly resources.

Stale · 0 · 7 years ago

This package implements the UniBic algorithm in R. This biclustering algorithm for the analysis of gene expression data was introduced by Zhenjia Wang et al. in 2016. It is currently considered the most promising biclustering method for identification of meaningful structures in complex and noisy data.

Stale · 4 · 7 years ago
R
MIT

Virtual machine with all the software and sample data needed to run 3D-e-Chem KNIME workflows.

Stale · 17 · 7 years ago
Shell
Apache-2.0

This package uses an innovative network-based approach that will enhance our ability to determine the identities of significant ions detected by LC-MS.

Stale · 1 · 7 years ago
R
Artistic-2.0

Hadoop Oozie-based workflow system focused on genomics data analysis in cloud environments.

Stale · 30 · 7 years ago
Java
GPL-3.0

A comprehensive pipeline for analyzing and interactively visualizing genomic profiles generated through commercial or custom aCGH arrays. As inputs, rCGH supports Agilent dual-color Feature Extraction files (.txt), from 44 to 400K, Affymetrix SNP6.0 and cytoScanHD probeset.txt, cychp.txt, and cnchp.txt files exported from ChAS or Affymetrix Power Tools. rCGH also supports custom arrays, provided data complies with the expected format. This package takes over all the steps required for individual genomic profiles analysis, from reading files to profiles segmentation and gene annotations. This package also provides several visualization functions (static or interactive) which facilitate individual profiles interpretation. Input files can be in compressed format, e.g. .bz2 or .gz.

Stale · 5 · 8 years ago
R
Artistic-2.0

Upper-Level ontology for Biology and Medicine. Compatible with BFO, DOLCE, and the UMLS Semantic Network

Stale · 4 · 8 years ago
Perl
CC-BY-3.0

An ontology of histopathological morphologies used by pathologists to classify/categorise animal lesions observed histologically during regulatory toxicology studies. The ontology was developed using real data from over 6000 regulatory toxicology studies donated by 13 companies spanning nine species. The original structure of the histopathology ontology was designed ab initio when the [INHAND](http://www.goreni.org/) manuscripts were not available. However, the ontology has been repeatedly reviewed and updated to align with the subsequently published INHAND manuscripts. During this process, cross-references to INHAND lesion identifiers were added to the ontology. [from GitHub]

Stale · 9 · 8 years ago
Apache-2.0

List of resources on alternative splicing including software, databases, and other tools.

Stale · 58 · 8 years ago

DTO integrates and harmonizes knowledge of the most important druggable protein families: kinases, GPCRs, ion channels and nuclear hormone receptors.

Stale · 8 · 8 years ago
CC-BY-SA-4.0

Stale · 6 · 8 years ago
Python

Natural Product-likeness calculator v-2.1: calculates the natural product-likeness of small molecules based on open data on natural products.

Stale · 4 · 8 years ago
Java

GA4GHclient provides an easy way to access public data servers through the Global Alliance for Genomics and Health (GA4GH) genomics API. It provides low-level access to the GA4GH API and translates response data into Bioconductor-based class objects.

Stale · 1 · 8 years ago
R
GPL-2.0+

Selventa legacy chemical namespace used with the Biological Expression Language

Archived · 0 · 8 years ago
Python
Apache-2.0

A calculator incorporating various empirical pair and many-body potentials.

Stale · 23 · 8 years ago
Fortran
LGPL-3.0

This package performs nucleosome positioning using an informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling.

Stale · 0 · 8 years ago
R
Artistic-2.0

GSALightning provides a fast implementation of permutation-based gene set analysis for the two-sample problem. This package is particularly useful when testing a large number of gene sets simultaneously, or when a large number of permutations is necessary for more accurate p-value estimation.

Stale · 5 · 8 years ago
R
GPL-2.0+
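
To make the idea of permutation-based gene set analysis for a two-sample problem concrete, here is a minimal Python sketch. It is not GSALightning's implementation or API: the set statistic (sum of per-gene differences in group means), the function names, and the input layout are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def gene_set_score(expr, labels, gene_set):
    """Sum, over genes in the set, of mean(group 1) minus mean(group 0).

    expr maps a gene name to a list of expression values, one per sample;
    labels holds a 0 or 1 for each sample, in the same order.
    """
    score = 0.0
    for gene in gene_set:
        values = expr[gene]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        score += mean(group1) - mean(group0)
    return score


def permutation_p_value(expr, labels, gene_set, n_perm=1000, seed=0):
    """Two-sided permutation p-value obtained by shuffling sample labels."""
    rng = random.Random(seed)
    observed = abs(gene_set_score(expr, labels, gene_set))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(gene_set_score(expr, shuffled, gene_set)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The smallest p-value this estimator can report is 1 / (n_perm + 1), which is why resolving very small p-values requires many permutations and why a fast implementation matters.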

This package builds on the Epimods framework, which facilitates finding weighted subnetworks ("modules") on Illumina Infinium 27k arrays using the SpinGlass algorithm, as implemented in the igraph package. We have created a class of gene-centric annotations associated with p-values and effect sizes and scores from any researcher's prior statistical results to find functional modules.

Stale · 1 · 9 years ago
R
GPL-2.0+

isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. It features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information is available at http://www.ms-isobar.org.

Stale · 10 · 9 years ago
R
LGPL-2.0

Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.

Stale · 1 · 9 years ago
R
GPL-2.0

Find the most characteristic gene ontology terms for groups of human genes. This package was created as a part of the thesis which was developed under the auspices of MI^2 Group (http://mi2.mini.pw.edu.pl/, https://github.com/geneticsMiNIng).

Stale · 2 · 9 years ago
R
GPL-3.0

DermO is an ontology with broad coverage of the domain of dermatologic disease, and we demonstrate here its utility for text mining and investigation of phenotypic relationships between dermatologic disorders.

Stale · 4 · 10 years ago
Web Ontology Language

It is an ontology model used to describe associations between biomedical entities in triple format based on W3C specification. OBAN is a generic association representation model that loosely couples a subject and object (e.g. disease and its associated phenotypes supported by the source of evidence for that association) via a construction of class OBAN:association. [from GitHub]

Stale · 6 · 10 years ago
Web Ontology Language

An ontology that represents the basic knowledge of physical, chemical and functional characteristics of nanotechnology as used in cancer diagnosis and therapy.

The fmcsR package introduces an efficient maximum common substructure (MCS) algorithm combined with a novel matching strategy that allows for atom and/or bond mismatches in the substructures shared among two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in finding compounds with weak structural similarities. The fmcsR package provides several utilities to use the FMCS algorithm for pairwise compound comparisons, structure similarity searching and clustering.

Stale · 6 · 10 years ago
R
Artistic-2.0