GeoMx

NOTE: Several versions of this metadata schema have been created over time. The (Latest) version contains most attributes, but there may be some deprecated attributes in the older versions for which data has been collected. HuBMAP is in the process of creating a reference which combines all of these versions into a single view. That reference will be available here once completed.

NGS Version 2 (current)


attribute type description value required
dataset_type Textfield The specific type of dataset being produced.   True
analyte_class Textfield Analytes are the target molecules being measured with the assay.   True
acquisition_instrument_vendor Textfield An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.   True
acquisition_instrument_model Textfield Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.   True
source_storage_duration_value Numeric How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began.   True
source_storage_duration_unit Textfield The time duration unit of measurement   True
time_since_acquisition_instrument_calibration_value Numeric The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture.   False
time_since_acquisition_instrument_calibration_unit Textfield The time unit of measurement   False
preparation_protocol_doi Textfield DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1   True
is_targeted Allowable Value Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay (“Yes” or “No”). The CODEX analyte is protein. [‘Yes’, ‘No’] True
contributors_path Textfield The path to the file with the ORCID IDs for all contributors of this dataset (e.g., “./extras/contributors.tsv” or “./contributors.tsv”). This is an internal metadata field that is just used for ingest.   True
data_path Textfield The top level directory containing the raw and/or processed data. For a single dataset upload this might be “.”, whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called “TEST001-RK”, use the syntax “./TEST001-RK” for this field. If there are multiple directory levels, use the format “./TEST001-RK/Run1/Pass2” in which “Pass2” is the subdirectory where the single dataset’s data is stored. This is an internal metadata field that is just used for ingest.   True
parent_sample_id Textfield Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102   True
mapped_area_value Numeric For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured.   True
mapped_area_unit Textfield The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2.   True
slide_id Textfield A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers.   True
number_of_channels Numeric The number of distinct color channels in the image.   True
target_retrieval_incubation_temperature Numeric Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays.   True
target_retrieval_incubation_time_value Numeric The duration for which a sample is exposed to a target retrieval solution.   True
target_retrieval_incubation_time_unit Textfield The units for target retrieval incubation time value.   True
proteinasek_concentration Numeric The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml).   False
proteinasek_incubation_time_value Numeric The duration for which a sample is exposed to Proteinase K.   False
proteinasek_incubation_time_unit Textfield The units for proteinaseK incubation time value.   False
roi_label Textfield A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. For GeoMx this can be found in the “Initial Dataset” spreadsheet (download from within Data Analysis Suite).   True
is_roi_segmentation_performed Allowable Value Was the image segmented? For GeoMx this refers to whether segmentation was used to split ROIs (regions of interest) into AOIs (areas of interest). [‘Yes’, ‘No’] True
roi_segmentation_strategy Textfield The method of segmentation that was applied in a GeoMx assay. If an overlay was used the overlay image needs to be included in the dataset upload.   False
anatomical_structure_label Textfield The overarching anatomical structure.   False
anatomical_structure_id Textfield The ontology ID for the parent structure. Typically this would be an UBERON ID.   False
targeted_entity_label Textfield State what cell type(s) or functional tissue unit was targeted in this ROI/AOI.   True
targeted_entity_id Textfield The ontology ID for the targeted entity.   False
segment_id Textfield This is the ID for the area of interest (AOI) in a GeoMx dataset. From “Initial Dataset” spreadsheet (download from within Data Analysis Suite), e.g. 9a828e39-43d8-4051-9bcc-581a520a85d4.   True
is_technical_replicate Allowable Value Is the sequencing reaction run in replicate, “Yes” or “No”. If “Yes”, FASTQ files in dataset need to be merged. [‘Yes’, ‘No’] True
metadata_schema_id Textfield The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9   True
non_global_files Textfield A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the “TOP/non-global/” directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be “./lab_processed/images/1-tissue-boundary.geojson”. After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee   True
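
The delimited fields in the table above (parent_sample_id is a comma-separated list, non_global_files is a semicolon-separated list of paths relative to “TOP/non-global/”) can be illustrated with a minimal Python sketch. The helper names split_field and non_global_source_path are hypothetical and exist only for this illustration; this is not HuBMAP's ingest code, and the default “TOP” is simply the placeholder used in the non_global_files description.

```python
from typing import List


def split_field(value: str, sep: str) -> List[str]:
    """Split a delimited metadata value and trim whitespace around each item."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def non_global_source_path(entry: str, top: str = "TOP") -> str:
    """Map one non_global_files entry to its assumed location under TOP/non-global/.

    `top` is a placeholder for the upload's top-level directory (an assumption).
    """
    relative = entry[2:] if entry.startswith("./") else entry
    return f"{top}/non-global/{relative}"


# Example values taken from the attribute descriptions above.
parents = split_field("HBM386.ZGKG.235, HBM672.MKPK.442", ",")
files = split_field("./lab_processed/images/1-tissue-boundary.geojson", ";")
sources = [non_global_source_path(entry) for entry in files]
```

With the example values from the table, parents holds the two HuBMAP IDs and sources holds the single path TOP/non-global/lab_processed/images/1-tissue-boundary.geojson.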
nCounter Version 2 (current)


attribute type description value required
dataset_type Textfield The specific type of dataset being produced.   True
analyte_class Textfield Analytes are the target molecules being measured with the assay.   True
acquisition_instrument_vendor Textfield An acquisition instrument is the device that contains the signal detection hardware and signal processing software. Assays generate signals such as light of various intensities or color or signals representing the molecular mass.   True
acquisition_instrument_model Textfield Manufacturers of an acquisition instrument may offer various versions (models) of that instrument with different features or sensitivities. Differences in features or sensitivities may be relevant to processing or interpretation of the data.   True
source_storage_duration_value Numeric How long was the source material stored, prior to this sample being processed? For assays applied to tissue sections, this would be how long the tissue section (e.g., slide) was stored, prior to the assay beginning (e.g., imaging). For assays applied to suspensions such as sequencing, this would be how long the suspension was stored before library construction began.   True
source_storage_duration_unit Textfield The time duration unit of measurement   True
time_since_acquisition_instrument_calibration_value Numeric The amount of time since the acquisition instrument was last serviced by the vendor. This provides a metric for assessing drift in data capture.   False
time_since_acquisition_instrument_calibration_unit Textfield The time unit of measurement   False
preparation_protocol_doi Textfield DOI for the protocols.io page that describes the assay or sample procurement and preparation. For example, for an imaging assay, the protocol might include staining of a section through the creation of an OME-TIFF file. In this case the protocol would include any image processing steps required to create the OME-TIFF file. Example: https://dx.doi.org/10.17504/protocols.io.eq2lyno9qvx9/v1   True
is_targeted Allowable Value Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement by the assay (“Yes” or “No”). The CODEX analyte is protein. [‘Yes’, ‘No’] True
contributors_path Textfield The path to the file with the ORCID IDs for all contributors of this dataset (e.g., “./extras/contributors.tsv” or “./contributors.tsv”). This is an internal metadata field that is just used for ingest.   True
data_path Textfield The top level directory containing the raw and/or processed data. For a single dataset upload this might be “.”, whereas for a data upload containing multiple datasets, this would be the directory name for the respective dataset. For instance, if the data is within a directory called “TEST001-RK”, use the syntax “./TEST001-RK” for this field. If there are multiple directory levels, use the format “./TEST001-RK/Run1/Pass2” in which “Pass2” is the subdirectory where the single dataset’s data is stored. This is an internal metadata field that is just used for ingest.   True
parent_sample_id Textfield Unique HuBMAP or SenNet identifier of the sample (i.e., block, section or suspension) used to perform this assay. For example, for an RNAseq assay, the parent would be the suspension, whereas for one of the imaging assays, the parent would be the tissue section. If an assay comes from multiple parent samples then this should be a comma-separated list. Example: HBM386.ZGKG.235, HBM672.MKPK.442 or SNT232.UBHJ.322, SNT329.ALSK.102   True
mapped_area_value Numeric For Visium, this is the area of spots that was covered by tissue within the captured area, not the total possible captured area which is fixed. For GeoMx this would be the area of the AOI being captured. For HiFi this is the summed area of the ROIs in a single flowcell lane. For CosMx, Xenium and Resolve, this is the area of the FOV (aka ROI) region being captured.   True
mapped_area_unit Textfield The unit of measurement for the mapping area. For Visium and GeoMx this is typically um^2.   True
slide_id Textfield A unique ID denoting the slide used. This allows users the ability to determine which tissue sections were processed together on the same slide. It is recommended that data providers prefix the ID with the center name, to prevent values overlapping across centers.   True
number_of_channels Numeric The number of distinct color channels in the image.   True
target_retrieval_incubation_temperature Numeric Will normally be 100 degrees Celsius for RNA assays, and 80 degrees Celsius for protein assays.   True
target_retrieval_incubation_time_value Numeric The duration for which a sample is exposed to a target retrieval solution.   True
target_retrieval_incubation_time_unit Textfield The units for target retrieval incubation time value.   True
proteinasek_concentration Numeric The amount or concentration of the enzyme Proteinase K within a sample (in ug/ml).   False
proteinasek_incubation_time_value Numeric The duration for which a sample is exposed to Proteinase K.   False
proteinasek_incubation_time_unit Textfield The units for proteinaseK incubation time value.   False
roi_label Textfield A label for the region of interest (ROI). For Xenium, Resolve and CosMx, this is the field of view (FOV) label. For GeoMx this can be found in the “Initial Dataset” spreadsheet (download from within Data Analysis Suite).   True
is_roi_segmentation_performed Allowable Value Was the image segmented? For GeoMx this refers to whether segmentation was used to split ROIs (regions of interest) into AOIs (areas of interest). [‘Yes’, ‘No’] True
roi_segmentation_strategy Textfield The method of segmentation that was applied in a GeoMx assay. If an overlay was used the overlay image needs to be included in the dataset upload.   False
anatomical_structure_label Textfield The overarching anatomical structure.   False
anatomical_structure_id Textfield The ontology ID for the parent structure. Typically this would be an UBERON ID.   False
targeted_entity_label Textfield State what cell type(s) or functional tissue unit was targeted in this ROI/AOI.   True
targeted_entity_id Textfield The ontology ID for the targeted entity.   False
segment_id Textfield This is the ID for the area of interest (AOI) in a GeoMx dataset. From “Initial Dataset” spreadsheet (download from within Data Analysis Suite), e.g. 9a828e39-43d8-4051-9bcc-581a520a85d4.   True
is_technical_replicate Allowable Value Is the sequencing reaction run in replicate, “Yes” or “No”. If “Yes”, FASTQ files in dataset need to be merged. [‘Yes’, ‘No’] True
metadata_schema_id Textfield The string that serves as the definitive identifier for the metadata schema version and is readily interpretable by computers for data validation and processing. Example: 22bc762a-5020-419d-b170-24253ed9e8d9   True
hybcode_pack_lot_number Textfield Enter the lot number noted within the LabWorksheet.txt file (and used in downstream nCounter processing).   True
probe_hybridization_time_value Numeric How many hours were the oligo-conjugated RNA or oligo-conjugated antibody probes hybridized with the sample?   True
probe_hybridization_time_unit Textfield The units for probe hybridization time value.   True
oligo_probe_panel Textfield This is the probe panel used to target genes and/or proteins. In cases where there is a core panel and add-on modules, the core panel should be selected here. If additional panels are used, then they must be included in the “additional_panels_used.csv” file that’s uploaded with the dataset.   True
is_custom_probes_used Allowable Value State (“Yes” or “No”) whether custom RNA or antibody probes were used. If custom probes were used, they must be listed in the “custom_probe_set.csv” file. [‘Yes’, ‘No’] True
non_global_files Textfield A semicolon-separated list of non-shared files to be included in the dataset. The path assumes the files are located in the “TOP/non-global/” directory. For example, for the file TOP/non-global/lab_processed/images/1-tissue-boundary.geojson, the value of this field would be “./lab_processed/images/1-tissue-boundary.geojson”. After ingest, these files will be copied to the appropriate locations within the respective dataset directory tree. This field is used for internal HuBMAP processing. Examples for GeoMx and PhenoCycler are provided in the File Locations documentation: https://docs.google.com/document/d/1n2McSs9geA9Eli4QWQaB3c9R3wo5d5U1Xd57DWQfN5Q/edit#heading=h.1u82i4axggee   True
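
The nCounter table lists five attributes that do not appear in the NGS table (hybcode_pack_lot_number, probe_hybridization_time_value, probe_hybridization_time_unit, oligo_probe_panel and is_custom_probes_used), all marked as required. The following is a minimal sketch of how a data provider might pre-check those five values in one metadata row before upload. The function name check_ncounter_extras and the example row values are hypothetical, and the checks are assumptions drawn only from the type, value and required columns above; this is not HuBMAP's validator.

```python
YES_NO = {"Yes", "No"}

# Attributes listed in the nCounter table but not in the NGS table; all required.
NCOUNTER_EXTRA_REQUIRED = [
    "hybcode_pack_lot_number",
    "probe_hybridization_time_value",
    "probe_hybridization_time_unit",
    "oligo_probe_panel",
    "is_custom_probes_used",
]


def check_ncounter_extras(row):
    """Return a list of problems found in the nCounter-specific attributes of one row."""
    problems = []
    for name in NCOUNTER_EXTRA_REQUIRED:
        if not str(row.get(name) or "").strip():
            problems.append(f"missing required attribute: {name}")

    # Allowable Value check: the table lists only 'Yes' and 'No'.
    flag = str(row.get("is_custom_probes_used") or "").strip()
    if flag and flag not in YES_NO:
        problems.append("is_custom_probes_used must be 'Yes' or 'No'")

    # Numeric check for the hybridization time.
    hours = str(row.get("probe_hybridization_time_value") or "").strip()
    if hours:
        try:
            float(hours)
        except ValueError:
            problems.append("probe_hybridization_time_value must be numeric")
    return problems


# Hypothetical example row; the values are placeholders, not real lot numbers or panels.
example_row = {
    "hybcode_pack_lot_number": "LOT-0001",
    "probe_hybridization_time_value": "18",
    "probe_hybridization_time_unit": "hour",
    "oligo_probe_panel": "Example core panel",
    "is_custom_probes_used": "No",
}
problems = check_ncounter_extras(example_row)
```

An empty list means none of these checks failed. It does not mean the row would pass HuBMAP ingest validation, which covers the remaining attributes and may apply stricter rules.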