Section 12 - dbGaP Submission
After the registration of a dbGaP study by a data provider, the submission of data to that dbGaP study is a joint process undertaken by both the data provider and the HIVE or CODCC. We encourage data providers to create a dbGaP study and notify the HIVE or CODCC as early as possible. NOTE: The dbGaP submission process can take up to six months after the publication of your sequence data in the HuBMAP Data portal or SenNet Data Sharing portal.
- Identify an NIH GPA: The Data Provider PI works with their Program Officer to identify an NIH Genomic Program Administrator (GPA) who will help them prepare a study for dbGaP following the process outlined here.
- Register dbGaP study / create BioProject: The Data Provider PI and NIH GPA register a study in the dbGaP Submission System and create a BioProject for the study. This step should be done by the data provider as early as possible, ideally before any data are generated. When establishing a dbGaP study, please ensure the title of the study starts with “HuBMAP” or “SenNet”, as appropriate. For example: “HuBMAP: A Spatially Resolved Molecular Atlas of Human Endothelium”. See this example of a HuBMAP dbGaP study.
- Designate Data Submitters: The data provider PI or PM identifies a team member to serve in this capacity.
- This person contacts the HIVE (or CODCC) via the HuBMAP Helpdesk (or SenNet Helpdesk).
- The HIVE (or CODCC) identifies a HIVE (or CODCC) data submitter to work with the team member on the dbGaP data submission.
- The data provider PI adds the data submitters to the dbGaP study on the NCBI dbGaP submission portal.
NOTE: This is distinct from the dbGaP Submission System identified in step 2, above.
- Submission Portal Questionnaire: The Data Provider PI (or their designated data submitter) completes this on the NCBI dbGaP submission portal for the registered study (step 2 above).
- Required submission forms (see list below): Following the dbGaP instructions, the data submitters work together to complete the dbGaP Submission Guide Templates. Gather and complete ALL files BEFORE submitting.
These forms will include at least the following for HuBMAP projects:
- Study Config: To be completed by designated data submitter (online web form).
- Subject Consent DS & DD: To be completed by the HIVE or CODCC data submitter. (DS = Dataset, DD = Data Dictionary)
- Subject Sample Mapping DS & DD: To be completed by the HIVE data submitter.
- Sample Attributes DS & DD: To be completed by the HIVE data submitter.
NOTE: DS = Dataset, DD = Data Dictionary
Depending on the answers provided to the Submission Portal Questionnaire, the designated data submitter may also need to complete the following files for upload:
- Mapping Study Samples (e.g. Sample GEO Mapping DS and DD)
- Subject Phenotypes DS & DD
- Pedigree DS & DD
- Study Documents (consent forms, protocols, etc.)
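Because all files should be gathered and completed before submitting, it can help to track the forms as a simple checklist. The following is a minimal sketch only, not part of any dbGaP or HuBMAP tooling: the form names come from the lists above, while the `REQUIRED_FORMS` tuple and the `missing_forms` helper are hypothetical names introduced for illustration.

```python
# Hypothetical checklist helper; form names are taken from the lists above.
REQUIRED_FORMS = (
    "Study Config",
    "Subject Consent DS",
    "Subject Consent DD",
    "Subject Sample Mapping DS",
    "Subject Sample Mapping DD",
    "Sample Attributes DS",
    "Sample Attributes DD",
)


def missing_forms(completed, conditional=()):
    """Return the expected forms that are not yet completed, in checklist order.

    completed   -- names of forms that are already filled in
    conditional -- extra forms required by the questionnaire answers
                   (for example "Pedigree DS" and "Pedigree DD")
    """
    done = set(completed)
    expected = list(REQUIRED_FORMS) + list(conditional)
    return [form for form in expected if form not in done]


if __name__ == "__main__":
    completed = ["Study Config", "Subject Consent DS", "Subject Consent DD"]
    for form in missing_forms(completed, conditional=["Pedigree DS", "Pedigree DD"]):
        print("Still to complete:", form)
```

An empty result from `missing_forms` would indicate that every expected form has been marked complete; it does not check the contents of the files or replace the dbGaP quality control tests.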
- Review checklist, submit forms: The HIVE (or CODCC) data submitter reviews the checklist to ensure that the Phenotype Datasets and Data Dictionary files pass the dbGaP quality control tests. Then the HIVE (or CODCC) data submitter submits all required forms.
- dbGaP contacts submitters: The dbGaP Phenotype curator contacts the submitters when the above information has been loaded into dbGaP and entities have received NCBI BioSample and SRA IDs.
- Upload sequence metadata: The HIVE (or CODCC) data submitter follows instructions to upload sequencing metadata.
- Upload raw sequence reads: After validation of the sequence metadata, the HIVE (or CODCC) data submitter works with an SRA curator to upload raw sequence reads to a protected area of SRA.
- SRA processes the data: SRA processes the sequence data and metadata and notifies dbGaP and the submitters.
- Human sequence data: Distributed through Authorized Access upon dbGaP release of the study.
IMPORTANT: This process can take an additional 6-8 weeks.
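The study title convention from the registration step (titles start with “HuBMAP” or “SenNet”) is easy to check before a study is registered. The snippet below is a small sketch under that single assumption; `CONSORTIUM_PREFIXES` and `has_consortium_prefix` are hypothetical names, and the check is not performed by dbGaP itself.

```python
# Hypothetical pre-registration check for the study title convention.
CONSORTIUM_PREFIXES = ("HuBMAP", "SenNet")


def has_consortium_prefix(title: str) -> bool:
    """Return True if the title starts with "HuBMAP" or "SenNet".

    Leading whitespace is ignored; the comparison is case-sensitive.
    """
    return title.strip().startswith(CONSORTIUM_PREFIXES)


if __name__ == "__main__":
    example = "HuBMAP: A Spatially Resolved Molecular Atlas of Human Endothelium"
    print(has_consortium_prefix(example))
```

The comparison is deliberately case-sensitive so that a title such as "Hubmap: ..." is flagged for correction rather than silently accepted.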
Updating an existing dbGaP study
Making any additions or deletions to the data in a published dbGaP study requires creating a new version of the study. Note that the previous version of the study will no longer be available for download after the new version has been released.
- For edits to the Study Config page text only, contact the assigned dbGaP curator directly.
All datasets from the same data provider and consortium should be submitted to the same study. For any questions regarding this, please contact the HuBMAP Helpdesk or SenNet Helpdesk.
Verify that all of the datasets for the new version have been published on the portal, then complete the Study Data Outline in the dbGaP Submission Portal.
After the version is created, your GPA will be notified and should complete the registration in the dbGaP Submission System. Send any consent changes and/or Acknowledgment Statement changes to the GPA.
Update the Study Config to include information about all versions of the study (not only the new version), and notify the HuBMAP Helpdesk or SenNet Helpdesk when it is complete. From this point, the HIVE dbGaP team will work with the data provider to identify changes in the data, and to create and upload required files and data.
For additional information, including how version and participant set numbers are determined, see the dbGaP Data Submission Guide > dbGaP Versions > #30 > steps 1-3.2.