STARmap

Version 2.0 (use this one)

| Attribute Name | Type | Description | Allowable Values | Required |
| --- | --- | --- | --- | --- |
| lab_id | Textfield | A locally assigned identifier provided by the data provider for the dataset. It is used to reference an external metadata record that may be maintained independently, enabling traceability and supporting provenance tracking. Example: Visium_9OLC_A4_S1 | | False |
| dataset_type | Assigned Value | The specific type of dataset being produced. Example: RNAseq | Visium HD, 4i, LC-MS, Thick section Multiphoton MxIF, Light Sheet, ATACseq, Resolve, HiFi-Slide, COMET, MPLEx, 10X Multiome, MALDI, Raman Imaging, Histology, Cell DIVE, FACS, MS Lipidomics, Visium (no probes), MUSIC, RNAseq, GeoMx (NGS), GeoMx (nCounter), RNAseq (with probes), Singular Genomics G4X, Molecular Cartography, CosMx Transcriptomics, MERFISH, Pixel-seqV2, 2D Imaging Mass Cytometry, Confocal, seqFISH, DART-FISH, MIBI, Olink, Enhanced Stimulated Raman Spectroscopy (SRS), DESI, Xenium, CyCIF, SNARE-seq2, nanoSPLITS, STARmap, Stereo-seq, Visium (with probes), SIMS, Auto-fluorescence, CyTOF, CosMx Proteomics, Virtual Histology, DBiT-seq, PhenoCycler | True |
| analyte_class | Assigned Value | The analyte class, that is, the target molecule that the assay measures. Example: DNA | Nucleic acid + protein, Lipid + metabolite, Collagen, RNA, Fluorochrome, DNA, Metabolite, DNA + RNA, Saturated lipid, Lipid, RNA + protein, Peptide, Protein, Unsaturated lipid, Endogenous fluorophore, Chromatin, Polysaccharide | True |
| acquisition_instrument_vendor | Assigned Value | The company that manufactures or supplies the acquisition instrument. An acquisition instrument is a device equipped with signal detection hardware and signal processing software. It captures signals produced by assays, such as variations in light intensity or color, or signals corresponding to molecular mass. If the instrument was custom-built or developed internally, enter “In-House”. Example: Illumina | Complete Genomics, Cytek Biosciences, Thermo Fisher Scientific, Sciex, Vizgen, Leica Microsystems, Akoya Biosciences, Keyence, Andor, Standard BioTools (Fluidigm), Leica Biosystems, Zeiss Microscopy, Ionpath, Motic, In-House, Revvity, Evident Scientific (Olympus), GE Healthcare, Element Biosciences, Hamamatsu, Waters, Bruker, Illumina, 3DHISTECH, Singular Genomics, Huron Digital Pathology, Resolve Biosciences, NanoString, Cytiva, 10x Genomics, Microscopes International, BGI Genomics | True |
| acquisition_instrument_model | Assigned Value | The specific model of the acquisition instrument, as manufacturers often offer various versions with differing features or sensitivities. These differences may be relevant to the processing or interpretation of the data. If the instrument was custom-built or developed internally, enter “In-House”. If the model is unknown, enter “Unknown”. Example: HiSeq 4000 | NovaSeq X, NovaSeq X Plus, Cytek Northern Lights, Lightsheet 7, Resolve Biosciences Molecular Cartography, timsTOF HT, timsTOF Pro 2, timsTOF Pro, timsTOF Ultra, timsTOF Ultra 2, timsTOF SCP, Axio Scan.Z1, MALDI timsTOF Flex Prototype, CosMx Spatial Molecular Imager, Unknown, MERSCOPE Ultra, Juno System, timsTOF FleX, Custom: Multiphoton, CyTOF XT, Helios, EVOS M7000, Aperio AT2, Phenocycler-Fusion 2.0, Axio Observer 5, Axio Observer 7, Axio Observer 3, NanoZoomer-SQ, NanoZoomer S210, NanoZoomer S60, NanoZoomer S360, DM6 B, MoticEasyScan One, In-House, NextSeq 500, BZ-X710, QTRAP 5500, DMi8, NextSeq 550, HiSeq 2500, HiSeq 4000, NovaSeq 6000, Opera Phenix Plus HCS, SYNAPT G2-Si, Q Exactive HF, Orbitrap Fusion Tribrid, Orbitrap Fusion Lumos Tribrid, Q Exactive, VS200 Slide Scanner, Not applicable | True |
| source_storage_duration_value | Numeric | The length of time the sample was stored prior to processing it. For assays performed on tissue sections, this refers to how long the tissue section (e.g., slide) was stored before the assay began (e.g., imaging). For assays performed on suspensions, such as sequencing, it refers to how long the suspension was stored before library construction started. Example: 12 | | True |
| source_storage_duration_unit | Assigned Value | The unit of measurement used to specify the source storage duration value. Example: hour | hour, month, day, minute, year | True |
| time_since_acquisition_instrument_calibration_value | Numeric | The length of time since the acquisition instrument was last serviced or calibrated. This provides a metric for assessing drift in data capture. Example: 10 | | False |
| time_since_acquisition_instrument_calibration_unit | Assigned Value | The unit of measurement used to specify the time since acquisition instrument calibration value. Example: month | month, day, year | False |
| preparation_protocol_doi | Link | The DOI for the protocols.io page that details the assay or the procedures used for sample procurement and preparation. For example, in the case of an imaging assay, the protocol may start with tissue section staining and end with the generation of an OME-TIFF file. The documented protocol should also include any image processing steps involved in producing the final OME-TIFF. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1 | | True |
| is_targeted | Radio | Indicates whether a specific molecule or set of molecules is targeted for detection or measurement by the assay. Example: Yes | Yes, No | True |
| contributors_path | Textfield | The name of the file containing the ORCID IDs for all contributors to this dataset. Example: ./contributors.csv | | True |
| data_path | Textfield | The top-level directory containing the raw and/or processed data. For a single dataset upload, this might be represented as “.”, whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For example, if the data is within a directory named “TEST001-RK”, use the syntax “./TEST001-RK” for this field. If there are multiple directory levels, use the format “./TEST001-RK/Run1/Pass2”, where “Pass2” is the subdirectory where the single dataset’s data is stored. This is an internal metadata field used solely for data ingestion. Example: ./TEST001-RK | | True |
| parent_sample_id | Textfield | The unique identifier from HuBMAP or SenNet for the sample (such as a block, section, or suspension) used to perform the assay. For instance, in an RNAseq assay, the parent sample would be the suspension, while in imaging assays, it would be the tissue section. If the assay is derived from multiple parent samples, this field should contain a comma-separated list of identifiers. Example: HBM386.ZGKG.235, HBM672.MKPK.442 | | True |
| mapped_area_value | Numeric | The area covered or captured by the assay. For Visium, it is the area of the spots covered by tissue within the captured area, rather than the total possible captured area. For GeoMx, it refers to the area of the AOI being captured. For HiFi, it is the summed area of the ROIs in a single flowcell lane. For CosMx and Resolve, it indicates the area of the FOV (also known as ROI) region being captured. For Xenium, it is the total area of the FOV regions (also known as ROI) being captured. For Stereo-seq, this value represents the number of beads. Example: 42.25 | | True |
| mapped_area_unit | Assigned Value | The unit of measurement for the mapped area value. If the mapped area is not specified, this field may be left blank. Example: um^2 | um^2, mm^2 | True |
| target_retrieval_incubation_temperature | Numeric | The incubation temperature required for target retrieval, which is typically 100 degrees Celsius for RNA assays and 80 degrees Celsius for protein assays. Example: 100 | | False |
| target_retrieval_incubation_time_value | Numeric | The duration for which a sample is exposed to a target retrieval solution. Example: 15 | | False |
| target_retrieval_incubation_time_unit | Assigned Value | The unit of measurement for the target retrieval incubation time value. If no incubation time is specified, this field may be left blank. Example: minute | minute | False |
| proteinasek_concentration | Numeric | The concentration of the enzyme Proteinase K within a sample, measured in micrograms per milliliter (ug/ml). Example: 10 | | False |
| proteinasek_incubation_time_value | Numeric | The duration for which a sample is incubated with Proteinase K. Example: 15 | | False |
| proteinasek_incubation_time_unit | Assigned Value | The unit of measurement for the Proteinase K incubation time value. If no incubation time is specified, this field may be left blank. Example: minute | minute | False |
| probe_hybridization_time_value | Numeric | The duration for which the oligo-conjugated RNA or oligo-conjugated antibody probes were hybridized with the sample. Example: 30 | | False |
| probe_hybridization_time_unit | Assigned Value | The unit of measurement for the probe hybridization time value. If the hybridization time is not specified, this field may be left blank. Example: minute | hour, minute | False |
| is_custom_probes_used | Radio | Indicates whether custom RNA or antibody probes were utilized in the assay. If custom probes were employed, they should be documented in the “custom_probe_set.csv” file. Example: No | Yes, No | True |
| number_of_panel_targets | Numeric | The number of panel targets, which refers to the total count of genes, RNA isoforms, or RNA regions that are targeted by probes. Example: 1000 | | True |
| anatomical_structure_label | Textfield | The label for the overarching anatomical structure. If the anatomical structure is not applicable or not specified, this field may be left blank. Example: Kidney | | False |
| anatomical_structure_id | Textfield | The ontology ID associated with the anatomical structure, typically represented by a UBERON ID. Example: UBERON:0002113 | | False |
| non_global_files | Textfield | Specifies a semicolon-separated list of non-global files that are to be included in the dataset. The file paths assume that the files are located in the “TOP/non-global/” directory. For instance, if the file is located at TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value for this field would be “./lab_processed/images/1-tissue-boundary.geojson”. Once ingested, these files will be copied to their appropriate locations within the respective dataset directory tree. This field is intended for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee Example: ./lab_processed/images/1-tissue-boundary.geojson | | False |
| metadata_schema_id | Textfield | The unique string identifier for the metadata specification version, which is easily interpretable by computers for purposes of data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9 | | True |
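
As an illustration of how the rules in this table might be applied before submission, the following Python sketch checks one metadata row, held as a dictionary of strings keyed by attribute name. It is not the official validator: the names `REQUIRED`, `ALLOWED`, `NUMERIC`, `validate_row`, `split_parent_sample_ids`, and `split_non_global_files` are hypothetical, and only the short enumerations (units and Yes/No fields) are checked, not the long lists such as dataset_type.

```python
# Hypothetical pre-submission check based on the attribute table above.
# This is a sketch, not the official validation tooling.

# Attributes marked Required = True in the table.
REQUIRED = [
    "dataset_type", "analyte_class", "acquisition_instrument_vendor",
    "acquisition_instrument_model", "source_storage_duration_value",
    "source_storage_duration_unit", "preparation_protocol_doi",
    "is_targeted", "contributors_path", "data_path", "parent_sample_id",
    "mapped_area_value", "mapped_area_unit", "is_custom_probes_used",
    "number_of_panel_targets", "metadata_schema_id",
]

# Allowable values, limited here to the short enumerations.
ALLOWED = {
    "source_storage_duration_unit": {"hour", "month", "day", "minute", "year"},
    "time_since_acquisition_instrument_calibration_unit": {"month", "day", "year"},
    "mapped_area_unit": {"um^2", "mm^2"},
    "target_retrieval_incubation_time_unit": {"minute"},
    "proteinasek_incubation_time_unit": {"minute"},
    "probe_hybridization_time_unit": {"hour", "minute"},
    "is_targeted": {"Yes", "No"},
    "is_custom_probes_used": {"Yes", "No"},
}

# Attributes whose Type is Numeric.
NUMERIC = [
    "source_storage_duration_value",
    "time_since_acquisition_instrument_calibration_value",
    "mapped_area_value", "target_retrieval_incubation_temperature",
    "target_retrieval_incubation_time_value", "proteinasek_concentration",
    "proteinasek_incubation_time_value", "probe_hybridization_time_value",
    "number_of_panel_targets",
]


def validate_row(row: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty if none were found."""
    errors = []
    for field in REQUIRED:
        if not row.get(field, "").strip():
            errors.append(f"{field}: required value is missing")
    for field, allowed in ALLOWED.items():
        value = row.get(field, "").strip()
        if value and value not in allowed:
            errors.append(f"{field}: {value!r} is not an allowable value")
    for field in NUMERIC:
        value = row.get(field, "").strip()
        if value:
            try:
                float(value)
            except ValueError:
                errors.append(f"{field}: {value!r} is not numeric")
    return errors


def split_parent_sample_ids(value: str) -> list[str]:
    """parent_sample_id holds a comma-separated list of identifiers."""
    return [part.strip() for part in value.split(",") if part.strip()]


def split_non_global_files(value: str) -> list[str]:
    """non_global_files holds a semicolon-separated list of paths."""
    return [part.strip() for part in value.split(";") if part.strip()]


# Example row built from the Example values in the table.
example_row = {
    "dataset_type": "STARmap",
    "analyte_class": "RNA",
    "acquisition_instrument_vendor": "In-House",
    "acquisition_instrument_model": "In-House",
    "source_storage_duration_value": "12",
    "source_storage_duration_unit": "hour",
    "preparation_protocol_doi": "https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1",
    "is_targeted": "Yes",
    "contributors_path": "./contributors.csv",
    "data_path": "./TEST001-RK",
    "parent_sample_id": "HBM386.ZGKG.235, HBM672.MKPK.442",
    "mapped_area_value": "42.25",
    "mapped_area_unit": "um^2",
    "is_custom_probes_used": "No",
    "number_of_panel_targets": "1000",
    "metadata_schema_id": "22bc762a-5020-419d-b170-24253ed9e8d9",
}
problems = validate_row(example_row)
```

Optional attributes are checked only when they are present, so a row that omits, for example, the Proteinase K fields still passes. A fuller check would also cover the long enumerations and the format of identifiers such as the UBERON ID.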