createYeast: Create Yeast Protein Localization dataset

Description

Task: multiclass classification. Formula: Class ~ . - Name (predict Class from all attributes except the sequence name).
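As a rough illustration of what the task formula selects, the sketch below expands `Class ~ . - Name` against a toy data frame. The data frame `toy` and its values are made up for this example; only the column names follow the attribute names used on this page.

```r
# Toy stand-in for the yeast data: the values are invented, only the
# column names follow the attribute list in this help page.
toy <- data.frame(
  Name  = c("P1", "P2", "P3", "P4"),
  mcg   = c(0.58, 0.43, 0.64, 0.51),
  gvh   = c(0.61, 0.67, 0.62, 0.40),
  Class = factor(c("MIT", "CYT", "MIT", "NUC"))
)

task_formula <- Class ~ . - Name

# Expand the dot against the data to see which predictors remain
# once the response (Class) and the excluded column (Name) are dropped.
predictors <- attr(terms(task_formula, data = toy), "term.labels")
print(predictors)
```

The identifier column is excluded because it labels each protein rather than describing it.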

Usage

createYeast(file = getfilepath("yeast.rds"), write = TRUE, read = TRUE)

Arguments

file

character; path/filename to write data file to

write

logical; should the dataset be written to disk for later use? (default: TRUE)

read

logical; should we try to read the dataset from the specified location first? (default: TRUE)
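The `read` and `write` arguments suggest a read-if-cached, otherwise build-and-save pattern. The sketch below is a guess at that behaviour, not the package's actual code: `load_or_build` and its `build` argument are hypothetical names introduced here, and the use of `readRDS`/`saveRDS` is assumed from the `.rds` extension in the default `file`.

```r
# Hypothetical helper, NOT the package's implementation: one plausible
# reading of how `file`, `write` and `read` could interact.
load_or_build <- function(file, build, write = TRUE, read = TRUE) {
  # Reuse a previously saved copy when allowed and available.
  if (read && file.exists(file)) {
    return(readRDS(file))
  }
  data <- build()
  # Save the freshly built copy for later calls.
  if (write) {
    saveRDS(data, file)
  }
  data
}

cache_file <- tempfile(fileext = ".rds")

# First call builds the data and writes it to disk.
first <- load_or_build(cache_file, function() data.frame(x = 1:3))

# Second call finds the file, so the builder is never invoked.
second <- load_or_build(cache_file, function() stop("not called: cache hit"))
```

Under this reading, `read = FALSE` would force the dataset to be rebuilt, and `write = FALSE` would leave the disk untouched.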

Details

1. Sequence Name: Accession number for the SWISS-PROT database.
2. mcg: McGeoch's method for signal sequence recognition.
3. gvh: von Heijne's method for signal sequence recognition.
4. alm: Score of the ALOM membrane spanning region prediction program.
5. mit: Score of discriminant analysis of the amino acid content of the N-terminal region (20 residues long) of mitochondrial and non-mitochondrial proteins.
6. erl: Presence of "HDEL" substring (thought to act as a signal for retention in the endoplasmic reticulum lumen). Binary attribute.
7. pox: Peroxisomal targeting signal in the C-terminus.
8. vac: Score of discriminant analysis of the amino acid content of vacuolar and extracellular proteins.
9. nuc: Score of discriminant analysis of nuclear localization signals of nuclear and non-nuclear proteins.

CYT (cytosolic or cytoskeletal) 463
NUC (nuclear) 429
MIT (mitochondrial) 244
ME3 (membrane protein, no N-terminal signal) 163
ME2 (membrane protein, uncleaved signal) 51
ME1 (membrane protein, cleaved signal) 44
EXC (extracellular) 37
VAC (vacuolar) 30
POX (peroxisomal) 20
ERL (endoplasmic reticulum lumen)
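The class counts above are strongly imbalanced, which matters when judging classifier accuracy. The sketch below computes class shares and the share of the most frequent class from the counts as listed; ERL is left out because no count is given for it here, so the shares are relative to the nine listed classes only.

```r
# Class counts as listed on this page; ERL is omitted because its count
# is not given, so the totals cover the nine listed classes only.
counts <- c(CYT = 463, NUC = 429, MIT = 244, ME3 = 163, ME2 = 51,
            ME1 = 44, EXC = 37, VAC = 30, POX = 20)

# Share of each class among the listed observations.
shares <- counts / sum(counts)
print(round(shares, 3))

# Accuracy reached by always predicting the most frequent class.
majority_class <- names(which.max(counts))
majority_share <- max(shares)
```

A classifier should be compared against `majority_share` rather than against chance over ten equally likely classes.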

Value

The dataset as a data.table

References

"A Knowledge Base for Predicting Protein Localization Sites in Eukaryotic Cells", Kenta Nakai & Minoru Kanehisa, Genomics 14:897-911, 1992.

See Also

https://archive.ics.uci.edu/ml/machine-learning-databases/yeast/


jkrijthe/createdatasets documentation built on May 19, 2019, 12:44 p.m.