Schema

What does dbGaP metadata contain?

It contains five files:

  1. study_dataset_info.txt
  2. study_info.txt
  3. study_variable_code_value.txt
  4. study_variable_info.txt
  5. study_dataset_info.txt

Here is a random sample of rows from each file:

  1. study_dataset_info.txt

|      |row_names | study_id| study_version|study_accession | dataset_id| dataset_version|dataset_accession |name |dataset_type_category |description |short_description |
|:-----|:---------|--------:|-------------:|:---------------|----------:|---------------:|:-----------------|:----|:---------------------|:-----------|:-----------------|
|15726 |15726 | 470| 4|phs000470.v4 | 2636| 2|pht002636.v2 |TARGET_RT_Subject_Phenotypes |subject |The data tables contains Rhabdoid Tumor (RT) subject phenotypes including gender (n=1 variable). The complete dataset has not been submitted, so more data is forthcoming. |The data tables contains Rhabdoid Tumor (RT) subject phenotypes including gender (n=1 variable). The complete dataset has not been submitted, so more data is forthcoming. |
|4724 |4724 | 7| 27|phs000007.v27 | 303| 5|pht000303.v5 |icam3_1s |subject |Intercellular Adhesion Molecule 1, Generation 3 Cohort Exam 1 |Intercellular Adhesion Molecule 1, Generation 3 Cohort Exam 1 |
|13483 |13483 | 285| 3|phs000285.v3 | 1564| 2|pht001564.v2 |A4F06BH |subject |1985-1986, Year 0. Nutrient Data, Dietary History. Cardiovascular disease and its risk factors among young adult participants. |1985-1986, Year 0. Nutrient Data, Dietary History. Cardiovascular disease and its risk factors among young adult participants. |
|7582 |7582 | 80| 12|phs000080.v12 | 171| 2|pht000171.v2 |psel1_7s |subject |P-Selectin, Offspring Cohort Exam 7 |P-Selectin, Offspring Cohort Exam 7 |
|4325 |4325 | 7| 26|phs000007.v26 | 194| 5|pht000194.v5 |brach3_1s |subject |Non-invasive Vascular Brachial Test, Generation 3 Cohort Exam 1 |Non-invasive Vascular Brachial Test, Generation 3 Cohort Exam 1 |

  2. study_info.txt

|     |row_names | root_study_id| root_study_version|root_study_accession_p |root_study_accession | this_study_id| this_study_version|this_study_accession_p |this_study_accession |name |disease |
|:----|:---------|-------------:|------------------:|:----------------------|:--------------------|-------------:|------------------:|:----------------------|:--------------------|:----|:-------|
|2461 |2461 | 707| 1|phs000707.v1.p1 |phs000707.v1 | 707| 1|phs000707.v1.p1 |phs000707.v1 |Next Generation Mendelian Genetics: Hereditary Neurological Disorders |Nervous System Diseases |
|1684 |1684 | 7| 17|phs000007.v17.p6 |phs000007.v17 | 282| 6|phs000282.v6.p6 |phs000007.v17 |NHLBI Framingham Candidate Gene Association Resource (CARe) |Cardiovascular Diseases |
|1418 |1418 | 7| 9|phs000007.v9.p4 |phs000007.v9 | 194| 4|phs000194.v4.p4 |phs000007.v9 |Framingham SHARe VEGF Study |Cardiovascular System |
|2031 |2031 | 218| 12|phs000218.v12.p2 |phs000218.v12 | 463| 9|phs000463.v9.p2 |phs000218.v12 |TARGET: Acute Lymphoblastic Leukemia (ALL) Pilot Phase 1 |Acute Lymphoblastic Leukemia |
|2042 |2042 | 218| 9|phs000218.v9.p1 |phs000218.v9 | 464| 6|phs000464.v6.p1 |phs000218.v9 |TARGET: Acute Lymphoblastic Leukemia (ALL) Expansion Phase 2 |Acute Lymphoblastic Leukemia |

  3. study_variable_code_value.txt

|        |row_names |study_id | study_version|study_accession | variable_id| variable_version|variable_accession |code |value |rank_order |
|:-------|:---------|:--------|-------------:|:---------------|-----------:|----------------:|:------------------|:----|:-----|:----------|
|2865133 |2865133 |280 | 1|phs000280.v1 | 92915| 1|phv00092915.v1 |2 |Plaque only |2 |
|2345598 |2345598 |134 | 6|phs000134.v6 | 55151| 1|phv00055151.v1 |NA |UNKNOWN |3 |
|55302 |55302 |7 | 13|phs000007.v13 | 55197| 1|phv00055197.v1 |NA |UNKNOWN OR ARTIFACT OR TEST STOPPED |2 |
|2162768 |2162768 |21 | 2|phs000021.v2 | 35016| 1|phv00035016.v1 |0 |Absent (or insufficient evidence to rate as present) |1 |
|669001 |669001 |7 | 21|phs000007.v21 | 1004| 1|phv00001004.v1 |0 |NO |2 |
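
The sample rows above suggest that each row pairs one `code` of a variable with its human-readable `value`. As a minimal sketch under that assumption, the following Python groups such pairs by `variable_accession`. The function name `build_code_lookup` is hypothetical and not part of dbGaPdb, and the rows are copied by hand from the sample rather than read from the file.

```python
from collections import defaultdict

# Rows copied from the sample above: (variable_accession, code, value).
# Rows whose code is NA in the sample are left out of this sketch.
rows = [
    ("phv00092915.v1", "2", "Plaque only"),
    ("phv00035016.v1", "0", "Absent (or insufficient evidence to rate as present)"),
    ("phv00001004.v1", "0", "NO"),
]


def build_code_lookup(rows):
    """Group code/value pairs by variable accession (hypothetical helper)."""
    lookup = defaultdict(dict)
    for accession, code, value in rows:
        lookup[accession][code] = value
    return dict(lookup)


lookup = build_code_lookup(rows)
print(lookup["phv00001004.v1"]["0"])
```

Codes are kept as strings here because the sample mixes numeric codes with `NA`.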

  4. study_variable_info.txt

|       |row_names |study_id | study_version|study_accession | dataset_id| dataset_version|dataset_accession |dataset_name | variable_id| variable_version|variable_accession |name |units |reported_type |calculated_type |dataset_type_category |male_count |female_count |max |min |sd |non_null_count |null_count | valid_value_count|description |
|:------|:---------|:--------|-------------:|:---------------|----------:|---------------:|:-----------------|:------------|-----------:|----------------:|:------------------|:----|:-----|:-------------|:---------------|:---------------------|:----------|:------------|:---|:---|:---|:--------------|:----------|-----------------:|:-----------|
|471787 |471787 |7 | 29|phs000007.v29 | 4812| 1|pht004812.v1 |t_physactf_ex09_1b_0833s | 247282| 1|phv00247282.v1 |bgt_modminthu12 |NA |NA |integer |subject |836 |997 |54 |NA |3.487 |3666 |NA | 3666|Bouts >=10 minutes Moderate Minutes Thursday Hour: 12 (KV1324912509) ah: bgt_modminthu12 |

  5. study_dataset_info.txt

|      |row_names | study_id| study_version|study_accession | dataset_id| dataset_version|dataset_accession |name |dataset_type_category |description |short_description |
|:-----|:---------|--------:|-------------:|:---------------|----------:|---------------:|:-----------------|:----|:---------------------|:-----------|:-----------------|
|17030 |17030 | 708| 1|phs000708.v1 | 3727| 1|pht003727.v1 |POAT_GEDWR_Subject_Phenotypes |subject |This subject phenotype table includes subject gender, age, weight, height, congestive heart failure status, stable warfarin dose, glomerular filtration rate (n=2 variables), amiodarone co-therapy use, and aspirin use. |This subject phenotype table includes subject gender, age, weight, height, congestive heart failure status, stable warfarin dose, glomerular filtration rate (n=2 variables), amiodarone co-therapy use, and aspirin use. |
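
Across the samples, the accession columns appear to be derived from the id and version columns: a `phs`, `pht`, or `phv` prefix, a zero-padded number, a `.vN` version, and in `study_info.txt` an optional `.pN` suffix (for example, `study_id` 470 with `study_version` 4 gives `phs000470.v4`). The Python below is a minimal sketch that splits an accession back into those parts. The pattern is inferred only from the rows shown here, and `parse_accession` and its field names are hypothetical, not part of dbGaPdb.

```python
import re

# Pattern inferred from the sample rows: prefix, zero-padded number,
# ".v" + version, and an optional ".p" + number suffix.
ACCESSION_RE = re.compile(r"^(phs|pht|phv)(\d+)\.v(\d+)(?:\.p(\d+))?$")


def parse_accession(accession):
    """Split an accession string into its parts (hypothetical helper)."""
    match = ACCESSION_RE.match(accession)
    if match is None:
        raise ValueError(f"unrecognized accession: {accession!r}")
    prefix, number, version, p_suffix = match.groups()
    return {
        "prefix": prefix,
        "id": int(number),
        "version": int(version),
        # None when the accession carries no ".pN" suffix.
        "p_suffix": int(p_suffix) if p_suffix is not None else None,
    }


print(parse_accession("phs000470.v4"))
print(parse_accession("phs000007.v17.p6"))
```

Parsing the accession this way gives a quick consistency check of a row: the parsed `id` and `version` should equal the corresponding id and version columns.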



dbGaPdb/dbGaPdb documentation built on May 24, 2019, 2:04 a.m.