| lab_id |
Textfield |
An internal field that labs can use to add whatever ID(s) they want or need for dataset validation and tracking. This could be a single ID (e.g., “Visium_9OLC_A4_S1”) or a delimited list of IDs (e.g., “9OL; 9OLC.A2; Visium_9OLC_A4_S1”). This field will not be accessible to anyone outside of the consortium, and no effort will be made to check whether IDs provided by one data provider are also used by another. |
|
False |
| dataset_type |
Assigned Value |
The specific type of dataset being produced. |
Visium HD, 4i, LC-MS, Thick section Multiphoton MxIF, Light Sheet, ATACseq, Resolve, HiFi-Slide, MPLEx, 10X Multiome, MALDI, Histology, Cell DIVE, FACS, MS Lipidomics, Visium (no probes), MUSIC, RNAseq, GeoMx (NGS), GeoMx (nCounter), RNAseq (with probes), Singular Genomics G4X, Molecular Cartography, CosMx Transcriptomics, MERFISH, Pixel-seqV2, 2D Imaging Mass Cytometry, Confocal, seqFISH, DART-FISH, MIBI, Olink, Enhanced Stimulated Raman Spectroscopy (SRS), DESI, Xenium, CyCIF, SNARE-seq2, nanoSPLITS, Stereo-seq, Visium (with probes), SIMS, Auto-fluorescence, CyTOF, CosMx Proteomics, DBiT-seq, PhenoCycler, CODEX, Second Harmonic Generation (SHG), Seq-Scope |
True |
| analyte_class |
Assigned Value |
Analytes are the target molecules being measured with the assay. |
Nucleic acid + protein, Lipid + metabolite, Collagen, RNA, Fluorochrome, DNA, Metabolite, DNA + RNA, Saturated lipid, Lipid, Peptide, Protein, Unsaturated lipid, Endogenous fluorophore, Chromatin, Polysaccharide |
True |
| acquisition_instrument_vendor |
Assigned Value |
An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or colors, or signals representing molecular mass. |
Complete Genomics, Cytek Biosciences, Thermo Fisher Scientific, Sciex, Vizgen, Leica Microsystems, Akoya Biosciences, Keyence, Andor, Standard BioTools (Fluidigm), Leica Biosystems, Zeiss Microscopy, Ionpath, Motic, In-House, Evident Scientific (Olympus), GE Healthcare, Element Biosciences, Hamamatsu, Bruker, Illumina, 3DHISTECH, Singular Genomics, Huron Digital Pathology, Resolve Biosciences, NanoString, Cytiva, 10x Genomics, Microscopes International, BGI Genomics |
True |
| acquisition_instrument_model |
Assigned Value |
Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data. |
NovaSeq X, NovaSeq X Plus, Cytek Northern Lights, Lightsheet 7, Resolve Biosciences Molecular Cartography, timsTOF HT, timsTOF Pro 2, timsTOF Pro, timsTOF Ultra, timsTOF Ultra 2, timsTOF SCP, Axio Scan.Z1, MALDI timsTOF Flex Prototype, CosMx Spatial Molecular Imager, Unknown, MERSCOPE Ultra, Juno System, timsTOF FleX, Custom: Multiphoton, CyTOF XT, Helios, EVOS M7000, Aperio AT2, Phenocycler-Fusion 2.0, Axio Observer 5, Axio Observer 7, Axio Observer 3, NanoZoomer-SQ, NanoZoomer S210, NanoZoomer S60, NanoZoomer S360, DM6 B, MoticEasyScan One, In-House, NextSeq 500, BZ-X710, QTRAP 5500, NextSeq 550, HiSeq 2500, HiSeq 4000, NovaSeq 6000, Q Exactive HF, Orbitrap Fusion Lumos Tribrid, Q Exactive, VS200 Slide Scanner, Not applicable, Orbitrap Eclipse Tribrid, MIBIscope, IN Cell Analyzer 2200, timsTOF FleX MALDI-2 |
True |
| source_storage_duration_value |
Numeric |
How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began. |
|
True |
| source_storage_duration_unit |
Assigned Value |
The unit of measurement for the source storage duration. |
hour, month, day, minute, year |
True |
| time_since_acquisition_instrument_calibration_value |
Numeric |
The amount of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. |
|
False |
| time_since_acquisition_instrument_calibration_unit |
Assigned Value |
The unit of measurement for the time since the acquisition instrument was last calibrated. |
month, day, year |
False |
| preparation_protocol_doi |
Link |
DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might begin with staining of a section and end with the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1. |
|
True |
| is_targeted |
Radio |
Specifies whether one or more specific molecules are targeted for detection or measurement by the assay (“Yes” or “No”). The CODEX analyte is protein. |
Yes, No |
True |
| contributors_path |
Textfield |
The path to the file with the ORCID IDs for all contributors of this dataset (e.g., “./extras/contributors.tsv” or “./contributors.tsv”). This is an internal metadata field that is just used for ingest. |
|
True |
| data_path |
Textfield |
The top-level directory containing the raw and/or processed data. For a single dataset upload this might be “.”, whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called “TEST001-RK”, use the syntax “./TEST001-RK” for this field. If there are multiple directory levels, use the format “./TEST001-RK/Run1/Pass2”, in which “Pass2” is the subdirectory where the single dataset’s data is stored. This is an internal metadata field that is just used for ingest. |
|
True |
| parent_sample_id |
Textfield |
Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples, then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102 |
|
True |
| mapped_area_value |
Numeric |
For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, this is the area of the FOV (aka ROI) region being captured. For Xenium this is the total area of the FOV regions (aka ROI) being captured. For Stereo-Seq this is the number of beads. |
|
True |
| mapped_area_unit |
Assigned Value |
The unit of measurement for the mapped area. For Visium and GeoMx this is typically um^2. |
um^2, mm^2 |
True |
| spot_size_value |
Numeric |
For assays where spots are used to define discrete capture areas, this is the area of a spot. |
|
True |
| spot_size_unit |
Assigned Value |
The unit for spot size value. |
um^2, mm^2 |
True |
| number_of_spots |
Numeric |
Number of capture spots within the mapped area. For Visium this is the number of spots covered by tissue; for HiFi it is the number of spots within ROIs. |
|
True |
| permeabilization_time_value |
Numeric |
Permeabilization time used for this tissue section. |
|
True |
| permeabilization_time_unit |
Assigned Value |
The unit for the permeabilization time. |
minute |
True |
| slide_id |
Textfield |
A unique ID denoting the slide used. This allows users to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name to prevent values from overlapping across centers. |
|
True |
| metadata_schema_id |
Textfield |
The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 |
|
True |
| number_of_additional_stains |
Numeric |
The minimum value is 2 (DAPI and polyT are always included), and up to 6 more stains can be included. |
|
True |
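
Several of the fields above have constraints that are easy to check before submission. The following is a minimal sketch, not the consortium's validator: the function names `split_parent_sample_ids` and `check_row` are hypothetical, the ID pattern is only inferred from the examples given for `parent_sample_id`, and the unit sets are copied from the allowable values listed in the table.

```python
import re

# ASSUMPTION: pattern inferred from the examples in the parent_sample_id row
# (HBM386.ZGKG.235, SNT232.UBHJ.322); it is not an official specification.
SAMPLE_ID_RE = re.compile(r"^(HBM|SNT)\d{3}\.[A-Z]{4}\.\d{3}$")

# Allowable values copied from the table above.
ALLOWED_UNITS = {
    "mapped_area_unit": {"um^2", "mm^2"},
    "spot_size_unit": {"um^2", "mm^2"},
    "source_storage_duration_unit": {"hour", "month", "day", "minute", "year"},
    "permeabilization_time_unit": {"minute"},
}


def split_parent_sample_ids(value):
    """Split a comma-separated parent_sample_id value into trimmed IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def check_row(row):
    """Return a list of problems found in one metadata row (a dict of strings)."""
    problems = []

    ids = split_parent_sample_ids(row.get("parent_sample_id", ""))
    if not ids:
        problems.append("parent_sample_id is required")
    for sample_id in ids:
        if not SAMPLE_ID_RE.match(sample_id):
            problems.append(f"unexpected parent_sample_id format: {sample_id}")

    for field, allowed in ALLOWED_UNITS.items():
        if row.get(field) not in allowed:
            problems.append(f"{field} must be one of {sorted(allowed)}")

    # data_path is "." for a single dataset upload, or "./<dir>/..." otherwise.
    data_path = row.get("data_path", "")
    if data_path != "." and not data_path.startswith("./"):
        problems.append("data_path should be '.' or start with './'")

    return problems


example_row = {
    "parent_sample_id": "HBM386.ZGKG.235, HBM672.MKPK.442",
    "mapped_area_unit": "um^2",
    "spot_size_unit": "um^2",
    "source_storage_duration_unit": "day",
    "permeabilization_time_unit": "minute",
    "data_path": "./TEST001-RK",
}
print(check_row(example_row))
```

An empty list means none of these checks failed. It does not mean the row passes full schema validation, which covers many more fields than this sketch.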