COMET

Version 2.0 (use this one)

| Attribute Name | Type | Description | Allowable Values | Required |
| --- | --- | --- | --- | --- |
| lab_id | Textfield | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | False |
| source_storage_duration_value | Numeric | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | True |
| time_since_acquisition_instrument_calibration_value | Numeric | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | | False |
| contributors_path | Textfield | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | True |
| data_path | Textfield | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as “.”, whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named “TEST001-RK”, use the syntax “./TEST001-RK” for this field. If there are multiple directory levels, use the format “./TEST001-RK/Run1/Pass2”, where “Pass2” is the subdirectory where the single dataset’s data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | True |
| number_of_antibodies | Numeric | The number of antibodies used in the assay. If no antibodies were utilized, enter 0. Example: 5 | | True |
| number_of_biomarker_imaging_rounds | Numeric | The number of imaging rounds required to capture the tagged biomarkers. For CODEX, a biomarker imaging round includes steps such as (1) oligo application, (2) fluor application, and (3) washes. For Cell DIVE, it involves (1) the staining of a biomarker via secondary detection or direct conjugate, followed by (2) dye inactivation. Example: 3 | | True |
| number_of_total_imaging_rounds | Numeric | The total number of imaging rounds performed using a microscope to collect either autofluorescence/background or stained signals, such as those used in histological analysis. Example: 5 | | True |
| slide_id | Textfield | The unique identifier assigned to each slide, enabling users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent overlapping values across different centers. Example: VAN0071-PA-1-1_AF | | True |
| dataset_type | Assigned Value | The specific type of dataset being produced. Example: RNAseq | Visium HD, 4i, LC-MS, Thick section Multiphoton MxIF, Light Sheet, ATACseq, Resolve, HiFi-Slide, COMET, MPLEx, 10X Multiome, MALDI, Raman Imaging, Histology, Cell DIVE, FACS, MS Lipidomics, Visium (no probes), MUSIC, RNAseq, GeoMx (NGS), GeoMx (nCounter), RNAseq (with probes), Singular Genomics G4X, Molecular Cartography, CosMx Transcriptomics, MERFISH, Pixel-seqV2, 2D Imaging Mass Cytometry, Confocal, seqFISH, DART-FISH, MIBI, Olink, Enhanced Stimulated Raman Spectroscopy (SRS), DESI, Xenium, iCLAP, CyCIF, SNARE-seq2, nanoSPLITS, STARmap, Stereo-seq, Visium (with probes), SIMS, Auto-fluorescence, CyTOF, CosMx Proteomics, Virtual Histology, DBiT-seq | True |
| analyte_class | Assigned Value | The analyte class which is the target molecule that the assay is measuring. Example: DNA | Nucleic acid + protein, Lipid + metabolite, Collagen, RNA, Fluorochrome, DNA, Metabolite, DNA + RNA, Saturated lipid, Lipid, Lipid + metabolite + protein, RNA + protein, Peptide, Protein, Unsaturated lipid, Endogenous fluorophore, Chromatin, Polysaccharide | True |
| acquisition_instrument_vendor | Assigned Value | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter “In-House”. Example: Illumina | Complete Genomics, Cytek Biosciences, Thermo Fisher Scientific, Sciex, Vizgen, Leica Microsystems, Akoya Biosciences, Keyence, Andor, Standard BioTools (Fluidigm), Leica Biosystems, Zeiss Microscopy, Ionpath, Motic, In-House, Revvity, Evident Scientific (Olympus), GE Healthcare, Element Biosciences, Hamamatsu, Waters, Bruker, Illumina, 3DHISTECH, Singular Genomics, Huron Digital Pathology, Resolve Biosciences, NanoString, Cytiva, 10x Genomics, Microscopes International, BGI Genomics | True |
| acquisition_instrument_model | Assigned Value | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter “In-House”. If the model is unknown, enter “Unknown”. Example: HiSeq 4000 | NovaSeq X, NovaSeq X Plus, Cytek Northern Lights, Lightsheet 7, Resolve Biosciences Molecular Cartography, timsTOF HT, timsTOF Pro 2, timsTOF Pro, timsTOF Ultra, timsTOF Ultra 2, timsTOF SCP, Axio Scan.Z1, MALDI timsTOF Flex Prototype, CosMx Spatial Molecular Imager, Unknown, MERSCOPE Ultra, Juno System, timsTOF FleX, Custom: Multiphoton, CyTOF XT, Helios, EVOS M7000, Aperio AT2, Phenocycler-Fusion 2.0, Axio Observer 5, Axio Observer 7, Axio Observer 3, NanoZoomer-SQ, NanoZoomer S210, NanoZoomer S60, NanoZoomer S360, DM6 B, MoticEasyScan One, In-House, NextSeq 500, BZ-X710, QTRAP 5500, DMi8, NextSeq 550, HiSeq 2500, HiSeq 4000, NovaSeq 6000, Opera Phenix Plus HCS, SYNAPT G2-Si, Q Exactive HF, Orbitrap Fusion Tribrid, Orbitrap Fusion Lumos Tribrid, Q Exactive, VS200 Slide Scanner, Not applicable | True |
| source_storage_duration_unit | Assigned Value | The unit of measurement used to specify the source storage duration value. Example: hour | hour, month, day, minute, year | True |
| time_since_acquisition_instrument_calibration_unit | Assigned Value | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | month, day, year | False |
| metadata_schema_id | Textfield | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
| preparation_protocol_doi | Link | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
| is_targeted | Radio | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | Yes, No | True |
| antibodies_path | Textfield | The path to the antibodies.tsv file relative to the root directory of the upload structure. This path should start with “.” and is typically formatted as “./extras/antibodies.tsv”. Example: ./extras/antibodies.tsv | | True |
| parent_sample_id | Textfield | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | True |
| non_global_files | Textfield | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the “TOP/non-global/” directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be “./lab_processed/images/1-tissue-boundary.geojson”. Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | | False |
| cell_boundary_marker_or_stain | Textfield | The name of the marker or stain used to identify all cell boundaries in the tissue. This name must exactly match the antibody-targeted molecule marker or non-antibody targeted molecule stain as found in the imaging data. For example, in the case of using the PhenoCycler, ensure the name corresponds to the value in the XPD output file. If multiple markers or stains are employed, list them in a comma-separated format. Example: Pan-Cytokeratin, E-Cadherin | | False |
| nuclear_marker_or_stain | Textfield | The nuclear marker or stain used, which can be an antibody-targeted molecule present in or around the cell nucleus. For protein targets, use the protein or gene symbol that identifies the antibody target, ensuring it matches the antibody target from the panel used or custom panels. Preferably, if using a custom antibody marker, this symbol should be the HGNC symbol (https://www.genenames.org/). For non-protein targets, provide the stain name (e.g., DAPI) and, when applicable, include the associated staining kit and vendor. For the PhenoCycler, ensure the symbol matches the value found in the XPD output file. Example: DAPI | | False |
| number_of_channels | Numeric | The number of fluorescent channels that are imaged during each cycle. Example: 3 | | True |
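
The rules in the table above (required attributes, numeric types, fixed value lists, and the "." prefix on antibodies_path) lend themselves to a pre-submission check. The following is a minimal sketch in Python, assuming one metadata row has already been read into a dictionary of strings. The helper name `validate_row` and the constants are hypothetical and are not part of any HuBMAP tool; the rule sets cover only a subset of the attributes and were transcribed by hand from the table.

```python
# Illustrative sketch only: validate_row and the constants below are
# hypothetical names, and the rules cover just a subset of the table.

REQUIRED_FIELDS = {
    "source_storage_duration_value",
    "source_storage_duration_unit",
    "contributors_path",
    "data_path",
    "number_of_antibodies",
    "number_of_channels",
    "is_targeted",
    "antibodies_path",
}

NUMERIC_FIELDS = {
    "source_storage_duration_value",
    "time_since_acquisition_instrument_calibration_value",
    "number_of_antibodies",
    "number_of_biomarker_imaging_rounds",
    "number_of_total_imaging_rounds",
    "number_of_channels",
}

ALLOWED_VALUES = {
    "source_storage_duration_unit": {"hour", "month", "day", "minute", "year"},
    "time_since_acquisition_instrument_calibration_unit": {"month", "day", "year"},
    "is_targeted": {"Yes", "No"},
}


def validate_row(row):
    """Return a list of human-readable problems found in one metadata row."""
    errors = []

    # Required attributes must be present and non-empty.
    for name in sorted(REQUIRED_FIELDS):
        if not str(row.get(name, "")).strip():
            errors.append(f"{name}: required value is missing")

    # Numeric attributes, when supplied, must parse as numbers.
    for name in sorted(NUMERIC_FIELDS):
        value = str(row.get(name, "")).strip()
        if value:
            try:
                float(value)
            except ValueError:
                errors.append(f"{name}: expected a number, got {value!r}")

    # Assigned-value and radio attributes must use one of the listed values.
    for name in sorted(ALLOWED_VALUES):
        value = str(row.get(name, "")).strip()
        if value and value not in ALLOWED_VALUES[name]:
            errors.append(f"{name}: {value!r} is not an allowable value")

    # The table states that antibodies_path should start with ".".
    path = str(row.get("antibodies_path", "")).strip()
    if path and not path.startswith("."):
        errors.append("antibodies_path: expected a path starting with '.'")

    return errors


# Example row built from the example values given in the table.
example_row = {
    "source_storage_duration_value": "12",
    "source_storage_duration_unit": "hour",
    "time_since_acquisition_instrument_calibration_value": "10",
    "time_since_acquisition_instrument_calibration_unit": "month",
    "contributors_path": "./contributors.csv",
    "data_path": "./TEST001-RK",
    "number_of_antibodies": "5",
    "number_of_channels": "3",
    "is_targeted": "Yes",
    "antibodies_path": "./extras/antibodies.tsv",
}
```

Calling `validate_row(example_row)` returns an empty list. Changing the unit to an unlisted value such as "week" adds one message naming the offending attribute. Returning messages instead of raising on the first failure lets a submitter see every problem in a row at once.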