Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

2,419 of 5,933 resources

Showing 1,401-1,450

MarkerDB is a freely available electronic database that attempts to consolidate information on all known clinical and pre-clinical biomarkers into a single resource. It provides identifiers for known clinical and pre-clinical biomarkers. Each entry provides detailed annotations, including biomarker descriptions, associated conditions, specificity, sensitivity, molecular structures, chromosomal locations, and clinical approval status.

This vocabulary and grammar define which types of objects are admissible to MathAlgoDB, the algorithm knowledge graph, and by which properties they can relate. In all, five classes ("problem", "algorithm", "benchmark", "software", "publication") are defined, along with a minimal but intuitive set of properties. In contrast to the more liberal WikiData, MathAlgoDB relies on strict adherence to the ontology to provide a reliable machine-readable database of (numerical) algorithm knowledge. [from homepage]

MatrixDB is a freely available database focused on interactions established by extracellular matrix proteins, proteoglycans, and polysaccharides.

Some IDs represent experiment sets (e.g. https://www.mavedb.org/#/experiment-sets/urn:mavedb:00000011 ), while others represent genomic regions, specifically deep mutational scans thereof (e.g. https://www.mavedb.org/#/experiment-sets/urn:mavedb:00000011-a ).
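The two kinds of identifier in this entry differ only by a trailing letter suffix, so a small classifier can tell them apart. The following is a minimal sketch, not MaveDB's own logic: the pattern (eight digits, optional lowercase suffix) is inferred from the two example URNs only, and the function name `classify_mavedb_urn` is hypothetical.

```python
import re

# Assumption: URNs look like "urn:mavedb:" + 8 digits + optional "-<lowercase letters>".
# This shape is inferred from the two examples in the entry, not from a specification.
MAVEDB_URN_RE = re.compile(r"urn:mavedb:(?P<number>\d{8})(?:-(?P<suffix>[a-z]+))?")


def classify_mavedb_urn(urn):
    """Label a URN as 'experiment_set', 'genomic_region', or 'unknown'."""
    match = MAVEDB_URN_RE.fullmatch(urn)
    if match is None:
        return "unknown"
    # No suffix: the entry describes these as experiment sets.
    if match.group("suffix") is None:
        return "experiment_set"
    # With a suffix: the entry describes these as genomic regions (deep mutational scans).
    return "genomic_region"
```

Identifiers that do not fit the assumed shape are reported as "unknown" rather than guessed at.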

The MDL number contains a unique identification number for each reaction and variation. The format is RXXXnnnnnnnn. R indicates a reaction, XXX indicates which database contains the reaction record. The numeric portion, nnnnnnnn, is an 8-digit number. [wikipedia]
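The RXXXnnnnnnnn format described above can be checked with a regular expression. This is a minimal sketch under one stated assumption: the source does not say which characters the database code XXX may contain, so uppercase letters and digits are assumed here. The function name `parse_mdl_number` and the sample value are hypothetical.

```python
import re

# "R" marks a reaction, followed by a 3-character database code and an 8-digit number.
# Assumption: the database code uses uppercase letters or digits only.
MDL_NUMBER_RE = re.compile(r"R(?P<database>[A-Z0-9]{3})(?P<number>\d{8})")


def parse_mdl_number(value):
    """Return (database_code, numeric_id) or None if the value does not match."""
    match = MDL_NUMBER_RE.fullmatch(value)
    if match is None:
        return None
    return match.group("database"), int(match.group("number"))
```

For instance, the made-up value "RXYZ00001234" would split into the database code "XYZ" and the number 1234, while a value with fewer than eight digits would be rejected.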

This vocabulary is used internally inside MedGen to assign temporary identifiers to terms that will later be added to UMLS. Mappings between MedGen, MedGen CIDs, and UMLS can be found [here](https://ftp.ncbi.nlm.nih.gov/pub/medgen/MedGenIDMappings.txt.gz).

The mission of MediaDive is to transform poorly structured media recipes into a standardized database. The contents of the database were mined from thousands of PDF and HTML documents. To ensure the quality of the media and continuous improvement of the database, we developed an internal editor interface. Experts at the DSMZ are creating new media and curating the existing media using this interface. [adapted from https://mediadive.dsmz.de/about]

A semantic space of small molecules

Chemical reactions in the Merck Index. This website no longer exists.

The MEROPS database is an information resource for peptidases (also termed proteases, proteinases and proteolytic enzymes) and the proteins that inhibit them.

The Medical Subject Headings (MeSH) vocabulary is the set of predicates used in the MeSH RDF dump.

MetaCyc is a database of non-redundant, experimentally elucidated metabolic pathways and enzymes. It also contains reactions, chemical compounds, and genes. It stores predominantly qualitative information rather than quantitative data, although it does contain some quantitative data such as enzyme kinetics data. MetaCyc is [curated](http://www.biocyc.org/glossary.shtml?sid=biocyc14-3908554027#Curation) from the scientific experimental literature according to an [extensive process](https://metacyc.org/MetaCycUserGuide.shtml#TAG:__tex2page_sec_4). The majority of pathways occurring in it are from microorganisms and plants. MetaCyc also stores thousands of additional enzyme-catalyzed reactions that have not yet been assigned an EC number.

A resource for exploring metabolism, starting with a set of community-curated genome-scale metabolic models of human and model organisms, enriched with pathway maps and other tools for easy browsing and analysis.

A subspace of Metabolic Atlas for compartment-specific records for metabolites.

A subspace of Metabolic Atlas for reactions.