region_info: Print relevant info about candidate probe sequence.

View source: R/region_info.R

region_info    R Documentation

Print relevant info about candidate probe sequence.

Description

region_info returns the annotation of a single potential probe sequence or a list of sequences and, if specified, writes the results to a .csv file.

Usage

region_info(
  REGION,
  CSV = TRUE,
  SEQ = TRUE,
  OUTDIR = tempdir(),
  CODING_ONLY = FALSE
)

Arguments

REGION

Either a single hg19 genomic sequence including the chromosome, start, end, and optionally strand separated by colons (e.g., 'chr20:10199446-10288068:+'), or a character vector of such sequences to be annotated. Must be character. Chromosome must be preceded by 'chr'.
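To make the expected REGION format concrete, here is a minimal base R sketch that splits one such string into its parts. The helper parse_region() is hypothetical and is not part of brainflowprobes. Using '*' for a missing strand and computing width as end - start + 1 are assumptions borrowed from the GenomicRanges convention, not something this page states.

```r
# Hypothetical helper (not part of brainflowprobes): split a REGION string
# such as "chr20:10199446-10288068:+" into its components using base R only.
parse_region <- function(region) {
    stopifnot(is.character(region), length(region) == 1)

    # Expect "chr:start-end" with an optional ":strand" suffix.
    parts <- strsplit(region, ":", fixed = TRUE)[[1]]
    if (length(parts) < 2 || length(parts) > 3) {
        stop("Expected 'chr:start-end' or 'chr:start-end:strand'.")
    }
    if (!startsWith(parts[1], "chr")) {
        stop("Chromosome must be preceded by 'chr'.")
    }

    # Split the "start-end" range and make sure both ends are integers.
    range <- suppressWarnings(
        as.integer(strsplit(parts[2], "-", fixed = TRUE)[[1]])
    )
    if (length(range) != 2 || anyNA(range)) {
        stop("Range must look like 'start-end' with integer coordinates.")
    }

    # Assumption: "*" marks an unspecified strand, as in GenomicRanges.
    strand <- if (length(parts) == 3) parts[3] else "*"

    data.frame(
        seqnames = parts[1],
        start = range[1],
        end = range[2],
        # Assumption: 1-based inclusive coordinates, so width = end - start + 1.
        width = range[2] - range[1] + 1L,
        strand = strand,
        stringsAsFactors = FALSE
    )
}

parsed <- parse_region("chr20:10199446-10288068:+")
unstranded <- parse_region("chr20:10199446-10288068")
```

Inspecting parsed shows the same first five columns that the Value section lists for the returned data frame.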

CSV

A logical(1) value indicating if the results should be exported in a .csv file.

SEQ

A logical(1) value indicating if the base sequence should be returned.

OUTDIR

If a .csv file is to be exported, this parameter indicates the path where the file should be saved. By default the file will be saved in a temporary directory.

CODING_ONLY

A logical vector of length 1 specifying whether to subset the Annotated Genes to only the coding genes. That is, whether to subset the genes by whether they have a non-NA CSS value. The Annotated Genes are downloaded with GenomicState::GenomicStateHub().
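The subsetting that CODING_ONLY describes can be illustrated on a toy table. This is only a sketch: the data frame below is invented for illustration and stands in for the annotated genes object, whose real structure is not reproduced here. The only detail taken from the description above is that coding genes are those with a non-NA CSS value.

```r
# Toy stand-in for the annotated genes (hypothetical values); the real
# object is downloaded with GenomicState::GenomicStateHub().
genes <- data.frame(
    Gene = c("geneA", "geneB", "geneC"),
    CSS = c(1050L, NA, 20300L),
    stringsAsFactors = FALSE
)

# Keep only rows with a non-NA CSS value, i.e. the "coding" genes.
coding_genes <- genes[!is.na(genes$CSS), , drop = FALSE]
```

Because the annotation is then matched against fewer genes, values such as 'name', 'distance', and 'Geneid' can differ between CODING_ONLY = FALSE and CODING_ONLY = TRUE for the same region.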

Value

This function annotates all input sequences using bumphunter::matchGenes(). It returns a data frame where each row is a genomic sequence specified in REGION. The columns c('seqnames', 'start', 'end', 'width', 'strand') list the chromosome, range, sequence length, and strand of the REGION. The columns c('name', 'annotation', 'description', 'region', 'distance', 'subregion', 'insideDistance', 'exonnumber', 'nexons', 'UTR', 'geneL', 'codingL', 'Geneid', 'subjectHits') are described in the bumphunter::matchGenes() documentation.

If SEQ=TRUE, a column 'Sequence' will be included. This is recommended for sending the probe sequence to be synthesized.

If CSV=TRUE, a .csv file called region_info.csv will be saved to a temporary directory unless otherwise specified in OUTDIR.
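As a rough illustration of the CSV behaviour, the sketch below writes a small data frame to region_info.csv inside a chosen directory and reads it back. It uses only base R and is an assumption about how the export could work, not the package's actual implementation. The data frame annotated is a hand-made stand-in for the function's return value.

```r
# Stand-in for a region_info() result (values typed by hand for illustration).
annotated <- data.frame(
    seqnames = "chr20",
    start = 10286777L,
    end = 10288069L,
    strand = "+",
    stringsAsFactors = FALSE
)

# Same default as OUTDIR: a temporary directory for this R session.
out_dir <- tempdir()
csv_path <- file.path(out_dir, "region_info.csv")

# Write without row names so the file contains only the annotation columns.
write.csv(annotated, csv_path, row.names = FALSE)

# Read the file back to confirm what was saved.
round_trip <- read.csv(csv_path, stringsAsFactors = FALSE)
```

Files in tempdir() are removed when the R session ends, so pass a permanent path in OUTDIR to keep the export.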

Author(s)

Amanda J Price

Examples

x <- region_info("chr20:10286777-10288069:+", CSV = FALSE)
head(x)

## You can easily transform this data.frame to a GRanges object
GenomicRanges::GRanges(x)

y <- region_info(
    c(
        "chr20:10286777-10288069:+",
        "chr18:74690788-74692427:-",
        "chr19:49932861-49933829:-"
    ),
    CSV = FALSE, SEQ = FALSE
)
head(y)

candidates <- c(
    "chr20:10286777-10288069:+",
    "chr18:74690788-74692427:-",
    "chr19:49932861-49933829:-"
)
region_info(candidates, CSV = FALSE)

## Explore the effect of changing CODING_ONLY
## Check how the "distance", "name", "Geneid" among other values change
region_info("chr10:135379301-135379311:+", CSV = FALSE)
region_info("chr10:135379301-135379311:+", CSV = FALSE, CODING_ONLY = TRUE)
## Not run: 
region_info(candidates, OUTDIR = "/path/to/directory/")

region_info("chr20:10286777-10288069:+", OUTDIR = "/path/to/directory")

## End(Not run)

LieberInstitute/brainflowprobes documentation built on May 6, 2024, 5:55 a.m.