buildSsTypePwms: Building Position Weight Matrices for Splice Sites of U12 and...

View source: R/buildSsTypePwms.R

Description

Builds Position Weight Matrices (PWMs) for the donor and acceptor sites of U12- and U2-type introns, and for the branch point of U12-type introns. If pdfFileSeqLogos is defined, a PDF containing the sequence logos of the results is also produced. The result is a list that contains the PWMs of the splice sites of U12- and U2-dependent introns.

Usage

buildSsTypePwms( cexSeqLogo=1, pdfWidth=35, pdfHeight=10, tmpDir="./",
	u12dbSpecies="Homo_sapiens", 
	pwmSource="U12DB", 
	u12DonorBegin, u12BranchpointBegin, u12AcceptorBegin, 
	u2DonorBegin, u2AcceptorBegin, u12DonorEnd, 
	u12BranchpointEnd, u12AcceptorEnd, u2DonorEnd, 
	u2AcceptorEnd, pasteSites=FALSE,
	splicerackSsLinks=list(
		U12_AT_AC_donor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.25", 
		U12_AT_AC_branchpoint=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.26",
		U12_AT_AC_acceptor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.29",
		U12_GT_AG_donor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.22", 
		U12_GT_AG_branchpoint=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.27",
		U12_GT_AG_acceptor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.21",
		U2_GC_AG_donor="http://katahdin.mssm.edu/splice/out/9606_logo_file.24",
		U2_GC_AG_acceptor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.30", 
		U2_GT_AG_donor="http://katahdin.mssm.edu/splice/out/9606_logo_file.23",
		U2_GT_AG_acceptor=
			"http://katahdin.mssm.edu/splice/out/9606_logo_file.28"),
	u12dbLink="ftp://genome.imim.es/pub/software/u12/u12db_v1_0.sql.gz",
	u12dbDbName="u12db", u12dbDropDb=TRUE,  pdfFileSeqLogos="", 
	removeTempFiles=TRUE, ...)

Arguments

cexSeqLogo

Font size of sequence logo plots; used only if pdfFileSeqLogos is defined.

pdfWidth, pdfHeight

The width and height of the graphics region of the pdf in inches. The default values are 35 and 10.

tmpDir

Path to directory used for storing temporary files.

u12dbSpecies

The species whose data is used when retrieving data from the U12DB database (pwmSource="U12DB").

pwmSource

The source used to build the PWMs for the splice sites of U12- and U2-dependent introns. The default is "U12DB"; "SpliceRack" is also accepted.

u12DonorBegin, u12DonorEnd

Integer values. They correspond to the begin and end point of the donor sequences of U12-type introns to consider (optional).

u12BranchpointBegin, u12BranchpointEnd

Integer values. Begin and end points of the branch point sequences of U12-type introns (optional).

u12AcceptorBegin, u12AcceptorEnd

Integer values. Begin and end points of the acceptor sequences of U12-type introns (optional).

u2DonorBegin, u2DonorEnd

Integer values. Begin and end points of the donor sequences of U2-type introns (optional).

u2AcceptorBegin, u2AcceptorEnd

Integer values. Begin and end points of the acceptor sequences of U2-type introns (optional).

pasteSites

Logical. If TRUE, the donor, branch point and acceptor sequences are pasted together before a PWM is built; the PWMs of each site (donor, branch point and acceptor) are then assigned from it. If FALSE (default), the PWM for each site is built separately.

splicerackSsLinks

A list (or vector) that contains the SpliceRack URL links to the text files that contain the Position Weight Matrices of the splice sites of U12 and U2 introns. This parameter is used only when pwmSource="SpliceRack". You can get the links to the PWM files from this URL (choose logo files with "File" links): http://katahdin.mssm.edu/splice/splice_matrix.cgi?database=spliceNew. The links should be defined in the following order: U12_AT_AC_donor, U12_AT_AC_branchpoint, U12_AT_AC_acceptor, U12_GT_AG_donor, U12_GT_AG_branchpoint, U12_GT_AG_acceptor, U2_GC_AG_donor, U2_GC_AG_acceptor, U2_GT_AG_donor, and U2_GT_AG_acceptor.

u12dbLink

A character string containing the URL for downloading the zipped MySQL dump file of the U12DB. Used when pwmSource="U12DB".

u12dbDbName

Name of the local copy of the U12DB database that is built during the run. Used when pwmSource="U12DB".

u12dbDropDb

Drop (or remove) the local copy of the U12DB database at the end of the run. Used when pwmSource="U12DB".

pdfFileSeqLogos

Path to PDF file containing the sequence logos of the results. By default it does not produce a file.

removeTempFiles

Whether to remove temporary files at the end of the run; accepts TRUE or FALSE (default is TRUE).

...

Authorization arguments needed by the DBMS instance. See the manual for dbConnect of the DBI package for more info.
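
Because the SpliceRack links are expected in a fixed order, it may help to check a custom list before passing it as splicerackSsLinks. The following is a minimal base R sketch and not part of IntEREst: the helper name checkSsLinksOrder is hypothetical, and it assumes the list is named with the labels given in the description of splicerackSsLinks.

```r
# Expected order of the SpliceRack PWM links, as listed in the
# description of the splicerackSsLinks argument.
expectedSsNames <- c(
	"U12_AT_AC_donor", "U12_AT_AC_branchpoint", "U12_AT_AC_acceptor",
	"U12_GT_AG_donor", "U12_GT_AG_branchpoint", "U12_GT_AG_acceptor",
	"U2_GC_AG_donor", "U2_GC_AG_acceptor",
	"U2_GT_AG_donor", "U2_GT_AG_acceptor")

# Hypothetical helper (not an IntEREst function): returns TRUE if the
# names of 'links' match the expected names in the expected order.
checkSsLinksOrder <- function(links, expected=expectedSsNames) {
	identical(names(links), expected)
}

# Toy list that reuses the default URLs shown in the Usage section.
base <- "http://katahdin.mssm.edu/splice/out/9606_logo_file."
myLinks <- as.list(paste0(base, c(25, 26, 29, 22, 27, 21, 24, 30, 23, 28)))
names(myLinks) <- expectedSsNames

checkSsLinksOrder(myLinks)        # names are in the documented order
checkSsLinksOrder(rev(myLinks))   # a reversed list fails the check
```

A list that passes this check could then be supplied as splicerackSsLinks together with pwmSource="SpliceRack".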

Value

pwmDonorU12

Matrix (with 4 rows representing A, C, G, T and n columns representing the genomic coordinates) representing the Position Weight Matrix of the donor site of U12-type introns.

pwmBpU12

Position Weight Matrix of branchpoint of U12-type introns.

pwmAccU12

Position Weight Matrix of acceptor site of U12-type introns.

pwmDonU2

Position Weight Matrix of donor site of U2-type introns.

pwmAccU2

Position Weight Matrix of acceptor site of U2-type introns.
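
To make the shape of these matrices concrete, here is a minimal base R sketch that builds a 4 x n matrix of per-position base frequencies from a few made-up, equal-length sequences. The sequences and the helper name toyPwm are illustrative assumptions; the weights that buildSsTypePwms actually returns come from U12DB or SpliceRack and may be computed differently.

```r
# Hypothetical helper (not an IntEREst function): per-position relative
# base frequencies for equal-length DNA sequences.
toyPwm <- function(seqs) {
	stopifnot(length(unique(nchar(seqs))) == 1)
	bases <- c("A", "C", "G", "T")
	# One row per sequence, one column per position
	chars <- do.call(rbind, strsplit(seqs, ""))
	# For each position, the fraction of sequences carrying each base
	pwm <- apply(chars, 2, function(col)
		table(factor(col, levels=bases)) / length(col))
	rownames(pwm) <- bases
	pwm
}

# Made-up donor-like sequences, for illustration only
toySeqs <- c("ATATCC", "ATATCC", "GTATCC", "ATATCT")
toyDonorPwm <- toyPwm(toySeqs)
dim(toyDonorPwm)       # 4 rows (A, C, G, T) by 6 positions
colSums(toyDonorPwm)   # each column sums to 1
```

Each element of the returned list can be inspected in the same way, for example with dim() or rownames().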

Author(s)

Ali Oghabian

See Also

annotateU12.

Examples

# Time demanding function
## Not run: 
#Build temp directory  
tmpDir<- tempdir()


# Creating subdirectory for storing u12db temp files
dir.create(paste(tmpDir, "u12dbTmp", sep="/"))

# Extracting PWMs of Splice Sites of U12 and U2 type introns -
# based on u12db
u12dbPwm<-buildSsTypePwms(
	tmpDir=paste(tmpDir, "u12dbTmp", sep="/"),
	u12dbSpecies="Homo_sapiens",
	pwmSource="U12DB",
	u12dbDbName="u12db",
	u12dbDropDb=TRUE,
	removeTempFiles=TRUE)


# Creating subdirectory for storing SpliceRack temp files
dir.create(paste(tmpDir, "splicerackTmp", sep="/"))

# Extracting PWMs of Splice Sites of U12 and U2 type introns - 
# based on SpliceRack
spliceRackPwm<- buildSsTypePwms(
	tmpDir= paste(tmpDir, "splicerackTmp", sep="/"),
	pwmSource="SpliceRack",
	removeTempFiles=TRUE)

## End(Not run)

IntEREst documentation built on Nov. 8, 2020, 8:05 p.m.