gather_utrs_padding | R Documentation |
For some species, we do not have a fully realized set of UTR boundaries, so it can be useful to query a consistent, arbitrary amount of sequence before and after every CDS. This function provides that information. Note: the result is a tibble, so accidentally printing a large result will not flood the console.
gather_utrs_padding(
bsgenome,
annot_df,
gid = NULL,
name_column = "gid",
chr_column = "chromosome",
start_column = "start",
end_column = "end",
strand_column = "strand",
type_column = "annot_gene_type",
gene_type = "protein coding",
padding = 120,
...
)
bsgenome | BSgenome object containing the genome of interest. |
annot_df | Annotation data frame containing all the entries of interest; this is generally extracted using a function in the load_something_annotations() family (load_orgdb_annotations() being the most likely). |
gid | Specific GID(s) to query. |
name_column | Column used to give each gene a name. |
chr_column | Column name of the chromosome names. |
start_column | Column name of the start information. |
end_column | Column name of the end information. |
strand_column | Column name of the strand information. |
type_column | If not NULL, subset the annotation data using this column. |
gene_type | Keep only the entries whose type_column value matches this type. |
padding | Number of nucleotides to return before and after each gene's CDS. |
... | Arguments passed to child functions (I think none currently). |
Data frame (tibble) of UTR, CDS, and UTR+CDS sequences.
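The padding idea described above can be sketched in base R on a toy annotation table. This is only an illustration under stated assumptions, not the package's implementation: the helper pad_cds_coordinates() and its output column names are hypothetical, the strand handling (the 5' flank lies after the end coordinate on the minus strand) is an assumption about how such padding is usually interpreted, and no sequence is extracted from a BSgenome object here.

```r
# Toy annotation table using the default column names from the usage above.
annot_df <- data.frame(
  gid = c("geneA", "geneB", "geneC"),
  chromosome = c("chr1", "chr1", "chr2"),
  start = c(500, 50, 1000),
  end = c(800, 300, 1600),
  strand = c("+", "-", "+"),
  annot_gene_type = c("protein coding", "protein coding", "rRNA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of the package): compute the coordinates of a
# fixed-width window before and after each CDS.
pad_cds_coordinates <- function(annot_df, padding = 120,
                                type_column = "annot_gene_type",
                                gene_type = "protein coding") {
  # Optionally keep only the entries of the requested gene type.
  if (!is.null(type_column)) {
    keep <- annot_df[[type_column]] == gene_type
    annot_df <- annot_df[keep, , drop = FALSE]
  }
  # Window on the low-coordinate side, clamped so it never starts before 1.
  before_start <- pmax(annot_df[["start"]] - padding, 1)
  before_end <- annot_df[["start"]] - 1
  # Window on the high-coordinate side. It is not clamped to the chromosome
  # length here; a real implementation would need the sequence lengths.
  after_start <- annot_df[["end"]] + 1
  after_end <- annot_df[["end"]] + padding
  # Assumption: on the minus strand the 5' flank is the high-coordinate window.
  plus <- annot_df[["strand"]] == "+"
  data.frame(
    gid = annot_df[["gid"]],
    chromosome = annot_df[["chromosome"]],
    strand = annot_df[["strand"]],
    fiveprime_start = ifelse(plus, before_start, after_start),
    fiveprime_end = ifelse(plus, before_end, after_end),
    threeprime_start = ifelse(plus, after_start, before_start),
    threeprime_end = ifelse(plus, after_end, before_end),
    stringsAsFactors = FALSE
  )
}

padded <- pad_cds_coordinates(annot_df, padding = 120)
print(padded)
```

In this sketch geneC is dropped by the type filter, and the 3' window of geneB is shorter than the requested padding because it runs into the start of the chromosome. How the actual function treats such edge cases is not stated in this page.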