View source: R/check_functions.R
Check contents of a sample subject mapping file for dbGaP posting.
dsfile | Path to the data file on disk
ddfile | Path to the data dictionary file on disk
na_vals | Vector of strings that should be read in as NA/missing in data file (see details of
ssm_exp | Dataframe of expected SAMPLE_ID and SUBJECT_ID
sampleID_col | Column name for sample-level ID
subjectID_col | Column name for subject-level ID
The sample subject mapping file should be a tab-delimited .txt file.
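As a rough illustration of the file format and of how a vector such as na_vals might be applied on read, here is a minimal base R sketch. The temporary file, its contents, and the particular missing-value strings are made up for the example, and this is not necessarily how the function itself reads the file.

```r
# Write a small, hypothetical tab-delimited sample subject mapping file
tmp <- tempfile(fileext = ".txt")
writeLines(c("SUBJECT_ID\tSAMPLE_ID",
             "P1\tS1",
             "P2\tNA",
             "P3\t."), tmp)

# Strings to treat as missing (assumed values, playing the role of na_vals)
na_vals <- c("NA", ".", "")

# Read every column as character so IDs are not coerced to numbers
ssm <- read.delim(tmp, sep = "\t", na.strings = na_vals,
                  colClasses = "character", stringsAsFactors = FALSE)

# Rows 2 and 3 now have a missing SAMPLE_ID
which(is.na(ssm$SAMPLE_ID))
```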
When ssm_exp != NULL, checks for expected correspondence between
SAMPLE_ID and SUBJECT_ID. Any differences in mapping between the two,
or a difference in the list of expected SAMPLE_IDs or SUBJECT_IDs,
will be returned in the output.
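The comparison against an expected mapping can be pictured with a short base R sketch. Both data frames below are hypothetical, and the variable names only mirror the report elements for readability; the package's own implementation may differ.

```r
# Hypothetical observed mapping, as it might be read from the data file
ssm_obs <- data.frame(
  SAMPLE_ID  = c("S1", "S2", "S3", "S5"),
  SUBJECT_ID = c("P1", "P2", "P9", "P5"),
  stringsAsFactors = FALSE
)

# Hypothetical expected mapping (the role played by ssm_exp)
ssm_exp <- data.frame(
  SAMPLE_ID  = c("S1", "S2", "S3", "S4"),
  SUBJECT_ID = c("P1", "P2", "P3", "P4"),
  stringsAsFactors = FALSE
)

# IDs that appear on one side only
extra_samples    <- setdiff(ssm_obs$SAMPLE_ID,  ssm_exp$SAMPLE_ID)
missing_samples  <- setdiff(ssm_exp$SAMPLE_ID,  ssm_obs$SAMPLE_ID)
extra_subjects   <- setdiff(ssm_obs$SUBJECT_ID, ssm_exp$SUBJECT_ID)
missing_subjects <- setdiff(ssm_exp$SUBJECT_ID, ssm_obs$SUBJECT_ID)

# Samples present on both sides but mapped to different subjects
both <- merge(ssm_obs, ssm_exp, by = "SAMPLE_ID", suffixes = c(".obs", ".exp"))
ssm_diffs <- both[both$SUBJECT_ID.obs != both$SUBJECT_ID.exp, ]
```

In this toy data, sample S3 is mapped to P9 in the file but to P3 in the expected mapping, so it is the only row left in ssm_diffs.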
If a data dictionary is provided (ddfile != NULL), additionally checks
correspondence between column names in data file and entries in data dictionary.
Data dictionary files can be Excel (.xls, .xlsx) or tab-delimited .txt.
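A data file versus data dictionary comparison of this kind can be sketched in base R as a pair of set differences. The sketch assumes the dictionary lists variable names in a column called VARNAME; that column name, the VARDESC column, and the toy data are assumptions for illustration, not something this page specifies.

```r
# Hypothetical data file contents
ds <- data.frame(
  SUBJECT_ID = c("P1", "P2"),
  SAMPLE_ID  = c("S1", "S2"),
  SAMPLE_USE = c("Array_SNP", "Array_SNP"),
  stringsAsFactors = FALSE
)

# Hypothetical data dictionary; VARNAME is assumed to hold the variable names
dd <- data.frame(
  VARNAME = c("SUBJECT_ID", "SAMPLE_ID", "CONSENT"),
  VARDESC = c("Subject ID", "Sample ID", "Consent group"),
  stringsAsFactors = FALSE
)

# Columns in the data file with no dictionary entry, and the reverse
in_ds_only <- setdiff(names(ds), dd$VARNAME)
in_dd_only <- setdiff(dd$VARNAME, names(ds))
```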
ssm_report, a list of the following issues (when present):
dup_samples | List of duplicated sample IDs
blank_idx | Row index of blank/missing subject or sample IDs
dd_errors | Differences in fields between data file and data dictionary
extra_subjects | Subjects in data file missing from
missing_subjects | Subjects in
extra_samples | Samples in data file missing from
missing_samples | Samples in
ssm_diffs | Discrepancies in mapping between SAMPLE_ID and SUBJECT_ID. Lists entries in
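The dup_samples and blank_idx elements describe simple per-row checks. One way such checks could be written in base R is sketched below; the data and the is_blank helper are hypothetical, and the real function may define "blank" differently.

```r
# Hypothetical mapping with one duplicated sample ID and two blank entries
ssm <- data.frame(
  SAMPLE_ID  = c("S1", "S2", "S2", NA, "S4"),
  SUBJECT_ID = c("P1", "P2", "P2", "P3", ""),
  stringsAsFactors = FALSE
)

# Sample IDs that occur more than once (ignoring missing values)
ids <- ssm$SAMPLE_ID
dup_samples <- unique(ids[duplicated(ids) & !is.na(ids)])

# Hypothetical helper: treat NA and empty/whitespace-only strings as blank
is_blank <- function(x) is.na(x) | trimws(x) == ""

# Row indices where either ID is blank
blank_idx <- which(is_blank(ssm$SAMPLE_ID) | is_blank(ssm$SUBJECT_ID))
```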