Open Science Index

Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

Filters

Health

Active 748
Idle 370
Stale 316
Archived 13
(None) 4476

Domain

Software 422
ImmunoOncology 251
Microarray 138
Infrastructure 123
GeneExpression 117
Sequencing 85
SingleCell 72
Protein & Drug Discovery 66
text-generation 63
Visualization 61
Annotation 51
Genetics 51
(None) 2332

Language

R 2426
Python 448
Jupyter Notebook 52
HTML 30
C 21
Makefile 19
JavaScript 16
C++ 15
Java 10
Shell 9
Web Ontology Language 7
Perl 6
(None) 2815

License

GPL-3.0 620
Artistic-2.0 550
MIT 549
CC-BY-4.0 268
GPL-2.0 252
GPL-2.0+ 243
CC0-1.0 120
Apache-2.0 107
GPL-3.0+ 101
CC-BY-3.0 83
NOASSERTION 82
Other 61
(None) 2441

Source

bioconductor 2418
bioregistry 2418
github 1150
awesome-ai-for-science 418
huggingface 303
awesome-bioinformatics 126
bio.tools 116
awesome-python-chemistry 87
awesome-cheminformatics 45
awesome-scientific-python 18

Type

Software tool 3202
Database 2418
AI model 303

5,923 resources indexed

Showing 1,451–1,500

Audience Types Controlled Vocabulary

The Audience Types Controlled Vocabulary was created for NSF's EarthCube program's Resource Registry. The vocabulary defines the types of audience each resource in the program is targeted to. At this point the vocabulary is very bare - no term definitions even; however, the intention is to extend the vocabulary over time. If you would like to assist with this or in extending any of the other controlled vocabularies/ontologies developed as part of the Resource Registry project, please see https://github.com/earthcubearchitecture-ecresourcereg.

Stale · ★0 · 6 years ago

Software Licenses Ontology

This mini-ontology contains classes and instances for each version of the licenses that are commonly used in software projects, particularly open source software projects. The URIs for each are the canonical URIs for that license (where they exist).

Stale · ★0 · 6 years ago

The Extensible Observation Ontology

The Extensible Observation Ontology (OBOE) is a formal ontology for capturing the semantics of scientific observation and measurement. The ontology helps researchers add detailed semantic annotations to scientific data, thereby clarifying the inherent meaning of scientific observations.

Stale · ★33 · 6 years ago

loci2path

FunctionalGenomics

loci2path performs statistically rigorous enrichment analysis of eQTLs in genomic regions of interest, using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB.

Stale · ★1 · 6 years ago

Bibliometric Data Ontology

An ontology that allows the description of numerical and categorical bibliometric data (e.g., journal impact factor, author h-index, categories describing research careers) in RDF.

Stale · ★0 · 6 years ago

Funding, Research Administration and Projects Ontology

An ontology for describing the administrative information of research projects, e.g., grant applications, funding bodies, project partners, etc.

Stale · ★2 · 6 years ago

Scholarly Contributions and Roles Ontology

An ontology based on PRO for describing the contributions that may be made, and the roles that may be held by a person with respect to a journal article or other publication (e.g. the role of article guarantor or illustrator).

Stale · ★1 · 6 years ago

Citation Counting and Context Characterisation Ontology

An ontology that permits the number of in-text citations of a cited source to be recorded, together with their textual citation contexts, along with the number of citations a cited entity has received globally on a particular date.

Stale · ★0 · 6 years ago

Bibliographic Reference Ontology

An ontology meant to define bibliographic records, bibliographic references, and their compilation into bibliographic collections and bibliographic lists, respectively.

Stale · ★0 · 6 years ago

Citation Typing Ontology

An ontology that enables characterization of the nature or type of citations, both factually and rhetorically.

Stale · ★15 · 6 years ago

Publishing Roles Ontology

An ontology for the characterisation of the roles of agents (people, corporate bodies and computational agents) in the publication process. These agents can be, e.g., authors, editors, reviewers, publishers or librarians.

Stale · ★1 · 6 years ago

Circleator

Genome Browsers / Gene Diagrams

Flexible circular visualization of genome-associated data with BioPerl and SVG.

Stale · ★46 · 6 years ago

geneXtendeR

geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.

Stale · ★10 · 7 years ago

scribl

Genome Browsers / Gene Diagrams

JavaScript library for drawing canvas-based gene diagrams.

Stale · ★76 · 7 years ago

TogoID Ontology

TogoID is an ID conversion service implementing unique features with an intuitive web interface and an API for programmatic access. TogoID supports datasets from various biological categories such as gene, protein, chemical compound, pathway, disease, etc. TogoID users can perform exploratory multistep conversions to find a path among IDs. To guide the interpretation of biological meanings in the conversions, we crafted an ontology that defines the semantics of the dataset relations. (from https://togoid.dbcls.jp/)

Stale · ★2 · 7 years ago

CGHbase

Contains functions and classes that are needed by arrayCGH packages.

Stale · ★0 · 7 years ago

coRdon

Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.

Stale · ★23 · 7 years ago

VCFArray

VCFArray extends the DelayedArray to represent VCF data entries as array-like objects with on-disk / remote VCF file as backend. Data entries from VCF files, including info fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER) could be converted into VCFArray instances with different dimensions.

Stale · ★1 · 7 years ago

Shape Expression Vocabulary

The Shape Expressions (ShEx) language describes RDF nodes and graph structures. A node constraint describes an RDF node (IRI, blank node or literal) and a shape describes the triples involving nodes in an RDF graph. These descriptions identify predicates and their associated cardinalities and datatypes. ShEx shapes can be used to communicate data structures associated with some process or interface, generate or validate data, or drive user interfaces.

Stale · ★1 · 7 years ago

The Leek group guide to genomics papers

Expertly curated genomics papers to get up to speed on genomics, RNA-seq, statistics (used in genomics), software development, and more.

Stale · ★502 · 7 years ago

Circos

Perl package for circular plots, which are well suited for genomic rearrangements.

Stale · ★88 · 7 years ago

Telseq

BAM File Utilities

Telseq is a tool for estimating telomere length from whole genome sequence data.

Stale · ★76 · 7 years ago

abseqR

AbSeq is a comprehensive bioinformatic pipeline for the analysis of sequencing datasets generated from antibody libraries and abseqR is one of its packages. abseqR empowers the users of abseqPy (https://github.com/malhamdoosh/abseqPy) with plotting and reporting capabilities and allows them to generate interactive HTML reports for the convenience of viewing and sharing with other researchers. Additionally, abseqR extends abseqPy to compare multiple repertoire analyses and perform further downstream analysis on its output.

Stale · ★0 · 7 years ago

ipdDb

GenomicVariation

All alleles from the IPD IMGT/HLA <https://www.ebi.ac.uk/ipd/imgt/hla/> and IPD KIR <https://www.ebi.ac.uk/ipd/kir/> databases for Homo sapiens. Reference: Robinson J, Maccari G, Marsh SGE, Walter L, Blokhuis J, Bimber B, Parham P, De Groot NG, Bontrop RE, Guethlein LA, and Hammond JA. KIR Nomenclature in non-human species. Immunogenetics (2018), in preparation.

Stale · ★0 · 7 years ago

FAIR* Reviews Ontology

An ontology that enables the description of reviews of scientific articles and other scholarly resources.

Stale · ★0 · 7 years ago

runibic

This package implements the UniBic algorithm in R. This biclustering algorithm for analysis of gene expression data was introduced by Zhenjia Wang et al. in 2016. It is currently considered the most promising biclustering method for identification of meaningful structures in complex and noisy data.

Stale · ★4 · 7 years ago

3D e-Chem Virtual Machine

Virtual Machine

Virtual machine with all software and sample data to run 3D-e-Chem Knime workflows

Stale · ★17 · 7 years ago

MetID

This package uses an innovative network-based approach that will enhance our ability to determine the identities of significant ions detected by LC-MS.

Stale · ★1 · 7 years ago

SeqWare

Workflow Managers

Hadoop Oozie-based workflow system focused on genomics data analysis in cloud environments.

Stale · ★30 · 7 years ago

rCGH

A comprehensive pipeline for analyzing and interactively visualizing genomic profiles generated through commercial or custom aCGH arrays. As inputs, rCGH supports Agilent dual-color Feature Extraction files (.txt), from 44 to 400K, Affymetrix SNP6.0 and cytoScanHD probeset.txt, cychp.txt, and cnchp.txt files exported from ChAS or Affymetrix Power Tools. rCGH also supports custom arrays, provided data complies with the expected format. This package takes over all the steps required for individual genomic profiles analysis, from reading files to profiles segmentation and gene annotations. This package also provides several visualization functions (static or interactive) which facilitate individual profiles interpretation. Input files can be in compressed format, e.g. .bz2 or .gz.

Stale · ★5 · 8 years ago

BioTop

Upper-Level ontology for Biology and Medicine. Compatible with BFO, DOLCE, and the UMLS Semantic Network

Stale · ★4 · 8 years ago

Histopathology Ontology

An ontology of histopathological morphologies used by pathologists to classify/categorise animal lesions observed histologically during regulatory toxicology studies. The ontology was developed using real data from over 6000 regulatory toxicology studies donated by 13 companies spanning nine species. The original structure of the histopathology ontology was designed ab initio when the [INHAND](http://www.goreni.org/) manuscripts were not available. However, the ontology has been repeatedly reviewed and updated to align with the subsequently published INHAND manuscripts. During this process, cross references to INHAND lesion identifiers were added to the ontology. [from GitHub]

Stale · ★9 · 8 years ago

Awesome-alternative-splicing

Bioinformatics on GitHub

List of resources on alternative splicing including software, databases, and other tools.

Stale · ★58 · 8 years ago

Drug Target Ontology

DTO integrates and harmonizes knowledge of the most important druggable protein families: kinases, GPCRs, ion channels and nuclear hormone receptors.

Stale · ★8 · 8 years ago

cmpo

Stale · ★6 · 8 years ago

NP-Likeness

Small molecules

Natural Product-likeness calculator v-2.1: calculates the natural product-likeness of small molecules based on open data of natural products.

Stale · ★4 · 8 years ago

GA4GHclient

DataRepresentation

GA4GHclient provides an easy way to access public data servers through the Global Alliance for Genomics and Health (GA4GH) genomics API. It provides low-level access to the GA4GH API and translates response data into Bioconductor-based class objects.

Stale · ★1 · 8 years ago

Selventa Chemicals

Selventa legacy chemical namespace used with the Biological Expression Language

Archived · ★0 · 8 years ago

pysic

A calculator incorporating various empirical pair and many-body potentials.

Stale · ★23 · 8 years ago

RJMCMCNucleosomes

BiologicalQuestion

This package performs nucleosome positioning using an informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling.

Stale · ★0 · 8 years ago

GSALightning

GSALightning provides a fast implementation of permutation-based gene set analysis for the two-sample problem. This package is particularly useful when testing a large number of gene sets simultaneously, or when a large number of permutations is necessary for more accurate p-value estimation.

Stale · ★5 · 8 years ago

SMITE

This package builds on the Epimods framework, which facilitates finding weighted subnetworks ("modules") on Illumina Infinium 27k arrays using the SpinGlass algorithm, as implemented in the iGraph package. We have created a class of gene-centric annotations associated with p-values, effect sizes, and scores from any researcher's prior statistical results to find functional modules.

Stale · ★1 · 9 years ago

isobar

isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. Features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information on http://www.ms-isobar.org.

Stale · ★10 · 9 years ago

ctsGE

Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.

Stale · ★1 · 9 years ago

tib.ofm

Stale · ★3 · 9 years ago

Web Ontology Language

GOpro

Find the most characteristic gene ontology terms for groups of human genes. This package was created as a part of the thesis which was developed under the auspices of MI^2 Group (http://mi2.mini.pw.edu.pl/, https://github.com/geneticsMiNIng).

Stale · ★2 · 9 years ago

Human Dermatological Disease Ontology

DermO is an ontology with broad coverage of the domain of dermatologic disease, and we demonstrate here its utility for text mining and investigation of phenotypic relationships between dermatologic disorders.

Stale · ★4 · 10 years ago

Web Ontology Language

Open Biomedical Annotations

It is an ontology model used to describe associations between biomedical entities in triple format based on W3C specification. OBAN is a generic association representation model that loosely couples a subject and object (e.g. disease and its associated phenotypes supported by the source of evidence for that association) via a construction of class OBAN:association. [from GitHub]

Stale · ★6 · 10 years ago

Web Ontology Language

NanoParticle Ontology

An ontology that represents the basic knowledge of physical, chemical and functional characteristics of nanotechnology as used in cancer diagnosis and therapy.

Stale · ★1 · 10 years ago

Web Ontology Language

fmcsR

Cheminformatics

The fmcsR package introduces an efficient maximum common substructure (MCS) algorithm combined with a novel matching strategy that allows for atom and/or bond mismatches in the substructures shared among two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in finding compounds with weak structural similarities. The fmcsR package provides several utilities to use the FMCS algorithm for pairwise compound comparisons, structure similarity searching and clustering.

Stale · ★6 · 10 years ago
