create_expt | R Documentation |
Note: You should just be using create_se(). It does everything the expt does, but better.
create_expt(
metadata = NULL,
gene_info = NULL,
count_dataframe = NULL,
sanitize_rownames = TRUE,
sample_colors = NULL,
title = NULL,
notes = NULL,
include_type = "all",
countdir = NULL,
include_gff = NULL,
file_column = "file",
id_column = NULL,
savefile = NULL,
low_files = FALSE,
handle_na = "drop",
researcher = "elsayed",
study_name = NULL,
file_type = NULL,
annotation_name = "org.Hs.eg.db",
tx_gene_map = NULL,
feature_type = "gene",
ignore_tx_version = TRUE,
...
)
metadata |
Comma-separated file (or Excel) describing the samples with information like condition, batch, count_filename, etc. |
gene_info |
Annotation information describing the rows of the data set; this often comes from a call to import.gff(), biomaRt, or OrganismDbi. |
count_dataframe |
If one does not wish to read the count tables from the filesystem, they may instead be fed as a data frame here. |
sanitize_rownames |
Clean up weirdly written gene IDs? |
sample_colors |
List of colors by condition; if not provided, colors are generated using RColorBrewer. |
title |
Provide a title for the expt? |
notes |
Additional notes? |
include_type |
I have usually assumed that all gff annotations should be used, but that is not always true; this allows one to limit to a specific annotation type. |
countdir |
Directory containing count tables. |
include_gff |
Gff file to help in sorting which features to keep. |
file_column |
Column in the metadata containing the count table filenames. |
id_column |
Column which contains the sample IDs. |
savefile |
Rdata filename prefix for saving the data of the resulting expt. |
low_files |
Explicitly lowercase the filenames when searching the filesystem? |
handle_na |
How does one wish to deal with NA values in the data? |
researcher |
Used to make the creation of gene sets easier, set the researcher tag. |
study_name |
As with researcher, but set the study tag. |
file_type |
Explicitly state the type of files containing the count data. I have code which autodetects the method used to import count data, this short-circuits it. |
annotation_name |
As with researcher, but set the orgdb (or other annotation) instance. |
tx_gene_map |
Dataframe of transcripts to genes, primarily for tools like salmon. |
feature_type |
Make explicit the type of feature used so it may be printed later. |
... |
More parameters are fun! |
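For tx_gene_map, here is a rough sketch of what such a dataframe might look like. It assumes the two-column layout (transcript ID first, gene ID second) that tximport uses for its tx2gene argument; tximport appears in the see-also entries, but this page does not state the expected columns. The IDs and column names are invented for illustration, and the version stripping is only a guess at what ignore_tx_version refers to, by analogy with tximport's ignoreTxVersion option.

```r
## Hypothetical transcript-to-gene map; the IDs are invented for illustration.
tx_gene_map <- data.frame(
  transcript = c("TX0001.1", "TX0002.3", "TX0003.1"),
  gene = c("GENE_A", "GENE_A", "GENE_B"),
  stringsAsFactors = FALSE)

## Strip a trailing ".N" version suffix so the IDs match unversioned count tables.
tx_gene_map[["transcript"]] <- sub(pattern = "\\.[0-9]+$", replacement = "",
                                   x = tx_gene_map[["transcript"]])

## Count how many transcripts would collapse into each gene.
tx_per_gene <- table(tx_gene_map[["gene"]])
```

A map like this could then be passed as tx_gene_map when the count tables come from a transcript-level quantifier such as salmon.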
The primary innovation of this function is that it will check the metadata for columns containing filenames for the count tables, thus hopefully making the collation and care of metadata/counts easier. For example, I have some data which has been mapped against multiple species. I can use this function and just change the file_column argument to pick up each species' tables.
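As a minimal sketch of that multi-species layout, the metadata can carry one filename column per species, and file_column selects between them. The sample IDs, the column names (hs_file, lm_file), the paths, and the annotation objects (hs_annot, lm_annot) are all hypothetical. The create_expt() calls are left commented out because they need hpgltools, real count tables, and an annotation data frame.

```r
## Hypothetical metadata: one count-table column per mapped species.
sample_ids <- c("s01", "s02", "s03", "s04")
meta <- data.frame(
  sampleid = sample_ids,
  condition = c("control", "control", "infected", "infected"),
  batch = c("a", "b", "a", "b"),
  hs_file = file.path("preprocessing", sample_ids, "hs_counts.csv.xz"),
  lm_file = file.path("preprocessing", sample_ids, "lm_counts.csv.xz"),
  stringsAsFactors = FALSE)
rownames(meta) <- meta[["sampleid"]]

## Collect every column which looks like a filename column.
file_columns <- grep(pattern = "_file$", x = colnames(meta), value = TRUE)

## With hpgltools loaded, each species would get its own expt, e.g.:
## hs_expt <- create_expt(metadata = meta, gene_info = hs_annot, file_column = "hs_file")
## lm_expt <- create_expt(metadata = meta, gene_info = lm_annot, file_column = "lm_file")
```

Keeping both columns in a single metadata sheet means the sample descriptions (condition, batch) are written once and shared by every species' expt.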
The experiment, containing an ExpressionSet.
[Biobase] [cdm_expt_rda] [example_gff] [sb_annot] [sb_data] [extract_metadata()] [set_expt_conditions()] [set_expt_batches()] [set_expt_samplenames()] [subset_expt()] [set_expt_colors()] [set_expt_genenames()] [tximport] [load_annotations()]
cdm_expt_rda <- system.file("share", "cdm_expt.rda", package = "hpgldata")
load(file = cdm_expt_rda)
head(cdm_counts)
head(cdm_metadata)
## The gff file has differently labeled locus tags than the count tables, also
## the naming standard changed since this experiment was performed, therefore I
## downloaded a new gff file.
example_gff <- system.file("share", "gas.gff", package = "hpgldata")
gas_gff_annot <- load_gff_annotations(example_gff)
rownames(gas_gff_annot) <- make.names(gsub(pattern = "(Spy)_", replacement = "\\1",
x = gas_gff_annot[["locus_tag"]]), unique = TRUE)
mgas_expt <- create_expt(metadata = cdm_metadata, gene_info = gas_gff_annot,
count_dataframe = cdm_counts)
head(pData(mgas_expt))
## An example using count tables referenced in the metadata.
sb_annot <- system.file("share", "sb", "trinotate_head.csv.xz", package = "hpgldata")
sb_annot <- load_trinotate_annotations(trinotate = sb_annot)
sb_annot <- as.data.frame(sb_annot)
rownames(sb_annot) <- make.names(sb_annot[["transcript_id"]], unique = TRUE)
sb_annot[["rownames"]] <- NULL
sb_data <- system.file("share", "sb", "preprocessing.tar.xz", package = "hpgldata")
untarred <- utils::untar(tarfile = sb_data)
sb_expt <- create_expt(metadata = "preprocessing/kept_samples.xlsx",
gene_info = sb_annot)
dim(exprs(sb_expt))
dim(fData(sb_expt))
pData(sb_expt)
## There are lots of other ways to use this, for example:
## Not run:
new_experiment <- create_expt(metadata = "some_csv_file.csv", gene_info = gene_df)
## Remember that this depends on an existing data structure of gene annotations.
meta <- extract_metadata("some_supplementary_materials_xls_file_I_downloaded.xls")
another_expt <- create_expt(metadata = meta, gene_info = annotations, count_dataframe = df_I_downloaded)
## End(Not run)