
xptcleaner

Installation

Using pip

The easiest way is to run the following from your conda environment, virtualenv, or base Python installation:

pip install xptcleaner

If you do not have admin rights on the machine and want to install into your base installation, add the --user option:

pip install xptcleaner --user

Using a source archive or a wheel file

You can also install xptcleaner from a source archive or from a wheel file.

$ py -m pip install ./dist/xptcleaner-{version}.tar.gz

$ py -m pip install ./dist/xptcleaner-{version}-py3-none-any.whl

The following required Python packages are installed automatically along with xptcleaner:

* pandas

* pyreadstat
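
To confirm that the installation succeeded, a quick check along the following lines may help. This is only a sketch: installed_version is a hypothetical helper written for this example and is not part of xptcleaner. It relies on the standard library's importlib.metadata (Python 3.8 or later).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not part of xptcleaner.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report xptcleaner and the two dependencies listed above.
for name in ("xptcleaner", "pandas", "pyreadstat"):
    print(name, installed_version(name) or "not installed")
```

If any of the three lines reports "not installed", rerun the pip command in the same environment that runs this script.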

Functions

gen_vocab(in_file, out_path)

    Create a JSON file of vocabulary mappings.
    Keys are synonyms and values are the CDISC Controlled Terminology submission values.
    Vocabularies are defined by column values in the tab-delimited input files.

    Parameters
    ----------
    in_file : list of str
        List of tab-delimited files with synonyms and preferred terms.
    out_path : str
        Output JSON filename.
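
The idea of a synonym-to-submission-value mapping can be illustrated with a small standalone sketch. It is not the gen_vocab implementation: the column names synonym and preferred_term, the upper-casing of keys, and the flat dictionary layout are all assumptions made for this example, and the real CT files and generated JSON may be organised differently.

```python
import csv
import io
import json


def build_synonym_map(tsv_text, synonym_col="synonym", preferred_col="preferred_term"):
    """Map each synonym to its preferred term from tab-delimited text.

    Column names are hypothetical; keys are upper-cased (an assumption)
    so that later lookups can ignore case.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        synonym = row[synonym_col].strip().upper()
        mapping[synonym] = row[preferred_col].strip()
    return mapping


# Tiny made-up sample in place of a real terminology file.
sample = "synonym\tpreferred_term\nmale\tM\nfemale\tF\n"
vocab = build_synonym_map(sample)
print(json.dumps(vocab, indent=2))
```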

standardize_file(input_xpt_dir, output_xpt_dir, json_file)

    Standardizes SEND xpt files using CDISC controlled terminologies.
    The following CDISC codelists are supported:
    - Sex
    - Strain/Substrain
    - Species
    - SEND Severity
    - Route of Administration Response
    - Standardized Disposition Term
    - Specimen
    - Non-Neoplastic Finding Type
    - SEND Control Type

    Parameters
    ----------
    input_xpt_dir : str
        Input folder containing the xpt files to clean.
    output_xpt_dir : str
        Output folder where the cleaned xpt files are written.
    json_file : str
        JSON file used for the mapping.
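
Conceptually, standardization replaces each recognised synonym with its controlled-terminology value and leaves everything else untouched. The sketch below shows that idea on a plain list of values. It is an assumption about the general approach, not the package's actual code: standardize_values is a hypothetical helper, and the case-insensitive matching and pass-through of unknown values are choices made for this example.

```python
def standardize_values(values, vocab):
    """Replace known synonyms with their mapped term; keep unknown values as-is.

    Hypothetical helper: lookups are case-insensitive and ignore
    surrounding whitespace, which is an assumption of this sketch.
    """
    cleaned = []
    for value in values:
        key = value.strip().upper()
        cleaned.append(vocab.get(key, value))
    return cleaned


# Made-up mapping for the Sex codelist.
sex_vocab = {"MALE": "M", "FEMALE": "F"}
print(standardize_values(["male", "Female ", "U"], sex_vocab))
```

Passing unmapped values through unchanged keeps the sketch from discarding data that the vocabulary does not cover.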

How to use

xptcleaner can be used from a Python script or from an R script.

Use xptcleaner from python script

# Import the xptcleaner package and its xptclean module
import xptcleaner
from xptcleaner import xptclean

# Input files: the extensible CT file and the CDISC CT file
infile1 = "{path to CT file}/SEND_Terminology_EXTENSIBLE.txt"
infile2 = "{path to CT file}/SEND Terminology_2021_12_17.txt"
# Output JSON file
jsonfile = "{path to CT file to be created}/SENDct.json"

# Generate the JSON vocabulary from the list of input files
xptclean.gen_vocab([infile1, infile2], jsonfile)

# Clean the xpt files in the raw folder and write them to the clean folder
rawXptFolder = "{path to xpt files}/96298/"
cleanXptFolder = "{path to cleaned xpt files}/96298/"
xptclean.standardize_file(rawXptFolder, cleanXptFolder, jsonfile)
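
Before calling standardize_file, it can be useful to check that the input folder really contains xpt files and that the output folder exists. The helpers below are a hypothetical pre-flight sketch using only the standard library. Whether standardize_file creates the output folder itself is not stated here, so creating it up front is a precaution, not a documented requirement.

```python
from pathlib import Path


def list_xpt_files(folder):
    """Return the .xpt files directly under folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".xpt"
    )


def ensure_output_folder(folder):
    """Create the output folder (and any parents) if it does not exist yet."""
    out = Path(folder)
    out.mkdir(parents=True, exist_ok=True)
    return out
```

For example, list_xpt_files(rawXptFolder) returning an empty list signals a wrong path before any cleaning is attempted, and ensure_output_folder(cleanXptFolder) can be called just before standardize_file.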

Use xptcleaner from R script

xptcleaner is integrated with the sendigR package. Refer to the sendigR documentation for installation and usage.



phuse-org/sendigR documentation built on April 5, 2025, 1:29 a.m.