Find open-source science resources

A directory of tools, AI models, datasets, and research resources for biotech, bioinformatics, and other scientific fields. Aggregated from curated GitHub awesome-lists, HuggingFace, bio.tools, Bioconductor, and more.

2,418 of 5,923 resources

Showing 451-500

A vocabulary for describing the background and methodology for the design of the DataCite profile of DCAT-AP (CiteDCAT-AP), as well as the defined mappings.

One of the precursors to the EuropePMC project. Now EuropePMC is able to resolve CiteXplore codes.

Representation of languages

Concepts for encoding bibliometric information

ClinGen is a National Institutes of Health (NIH)-funded resource that defines the clinical relevance of genes and variants for use in precision medicine and research. This prefix provides identifiers for a panel of experts performing variant pathogenicity evaluation.

ClinGen is a National Institutes of Health (NIH)-funded resource that defines the clinical relevance of genes and variants for use in precision medicine and research. This prefix provides and maintains identifiers for alleles.

ClinGen is a National Institutes of Health (NIH)-funded resource that defines the clinical relevance of genes and variants for use in precision medicine and research. This prefix provides identifiers for curations representing evidence aggregation and expert panel assertions based on standardized evaluation procedures.

This ontology represents the clinical findings and procedures used in the oral and maxillo-facial surgical domain.

The China National Center for Bioinformation's (CNCB) Genome Warehouse (GWH) is a public repository holding genetic information for a wide range of species including humans, plants, animals, and microorganisms. Identifiers in this resource correspond to genomes of various species. The goal of the resource is to make genomic data accessible to researchers in areas like precision medicine and biotechnology.

Identifier for an academic research group issued by the CNRS

This vocabulary is intended to provide a flexible framework within different usage scenarios to semantically represent any type of content, be it on the Web or in local storage media. For example, it can be used by web quality assurance tools such as web accessibility evaluation tools to record a representation of the assessed web content, including text, images, or other types of formats. In many cases, it can be used together with HTTP Vocabulary in RDF 1.0, which allows quality assurance tools to record the HTTP headers that have been exchanged between a client and a server. This is particularly useful for quality assurance testing, conformance claims, and reporting languages like the W3C Evaluation And Report Language (EARL). [from homepage]

COCONUT (COlleCtion of Open Natural ProdUcTs) Online is an open source project for Natural Products (NPs) storage, search and analysis. It gathers data from over 50 open NP resources and is available free of charge and without any restriction. Each entry corresponds to a "flat" NP structure and is associated, when available, with its known stereochemical forms, literature, organisms that produce it, natural geographical presence and diverse pre-computed molecular properties.

The goal of the CODATA Research Data Management Terminology is to gather the key terms needed for a common understanding of the research data management domain. The RDMT was revised by the CODATA RDM Terminology Working Group, shared for public review, and then confirmed and finalised in 2023. The RDMT grew out of the CASRAI Research Data Management Glossary, which was intended as a practical reference for individuals and groups concerned with the improvement of research data management (RDM). In 2020, CASRAI requested that CODATA assume responsibility for the curation of this valued resource. To that end, the RDM Terminology Working Group uses a lightweight and pragmatic biennial process to review the resource now restructured as the CODATA RDM Terminology and suggest any edits, additions and removals that are required in order to develop and improve this important reference resource.

GE Healthcare/Amersham Biosciences CodeLink Human Whole Genome Bioarray targets most of the known and predicted genes of the human genome as it is described today in the public domain. It comprises approximately 55,000 30-mer probes designed to conserved exons across the transcripts of targeted genes. These 55,000 probes represent well-annotated, full-length, and partial human gene sequences from major public databases. The probe sequences were selected from the NCBI UniGene build #165, RefSeq database (January 5, 2004 release) and dbEST database (January 8, 2004 release).

COEXISTENCE - Thesaurus of intersectionality and decolonial issues: black studies, gender, sexuality and feminist studies

Higher-level classifications of COG Pathways

COGs stands for Clusters of Orthologous Genes. The database was initially created in 1997 (Tatusov et al., PMID: 9381173) followed by several updates, most recently in 2014 (Galperin et al., PMID: 25428365). The current update includes complete genomes of 1,187 bacteria and 122 archaea that map into 1,234 genera. The new features include ~250 updated COG annotations with corresponding references and PDB links, where available; new COGs for proteins involved in CRISPR-Cas immunity, sporulation, and photosynthesis, and the lists of COGs grouped by pathways and functional systems.

Database of Clusters of Orthologous Genes grouped by pathways and functional systems. It includes the complete genomes of 1,187 bacteria and 122 archaea that map into 1,234 genera.

Contains identifiers of cohesin binding sites in human cells from CohesinDB. CohesinDB includes 2043 epigenomics, transcriptomics and 3D genomics datasets from 530 studies involving 176 cell types. Each cohesin object is annotated with locus, cell type, classification, function, 3D genomics and cis-regulatory information.

Contains identifiers for genes that are part of cohesin-regulated CRMs (cis-regulatory modules) from CohesinDB. CohesinDB includes 2043 epigenomics, transcriptomics and 3D genomics datasets from 530 studies involving 176 cell types. Each identifier represents a single cohesin-related CRM. Each cohesin object is annotated with locus, cell type, classification, function, 3D genomics and cis-regulatory information.

Contains identifiers of cohesin-related chromatin loops from CohesinDB. CohesinDB includes 2043 epigenomics, transcriptomics and 3D genomics datasets from 530 studies involving 176 cell types. Each identifier represents a single cohesin-related chromatin loop. Each cohesin object is annotated with locus, cell type, classification, function, 3D genomics and cis-regulatory information.

COI Catalogue is a herbarium with c. 800,000 specimens, organised into separate collections that reflect research priorities over the years.

Identifier (name code) for a taxon in the Catalogue of Life in Taiwan