findGenecM: Find the cM location of genes.


Description

findGenecM uses the physical (base-pair) positions of genetic markers to infer the mapping (cM) position of every gene.

Usage

findGenecM(cross, marker.info, gff, gffCols = NULL,
  attributeParse = c("ID="), seqnameParse = c("Chr", "scaffold_"),
  dropNonColinearMarkers = TRUE, verbose = TRUE, ...)

Arguments

cross

The qtl cross object.

marker.info

The qtlTools standard data.frame containing the map and physical positions of markers. See Details. The base-pair positions of the markers must be known.

gff

The .gff file containing information about each gene. This object must be in the standard format, containing the fields described in Details.

gffCols

If the gff file does not follow the standard format, this vector specifies the chr, feature, start, end, strand and attribute columns of the supplied gff-like file.

attributeParse

Character vector of strings to drop from the first element of the attribute column. Defaults to "ID=".

seqnameParse

Character vector of strings to remove from the gff seqname column so that the gff seqnames match the cross chromosome names. Defaults to c("Chr", "scaffold_").

dropNonColinearMarkers

Logical, should markers whose base-pair order does not agree with their map order be dropped?

verbose

Logical, should updates be reported?

...

Not currently in use.
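To make the idea behind dropNonColinearMarkers concrete, here is a minimal base-R sketch of one way to flag markers whose bp order disagrees with their map order. The helper name flag_noncolinear and the rule it applies (a marker is flagged when its bp falls below the largest bp already seen on that chromosome) are assumptions for illustration only; the rule findGenecM actually applies may differ.

```r
# Hypothetical marker table using the four marker.info fields.
marker.info <- data.frame(
  marker.name = paste0("m", 1:6),
  chr = rep("1", 6),
  pos = c(0, 5, 10, 15, 20, 25),            # mapping position (cM)
  bp  = c(1e5, 4e5, 9e5, 7e5, 1.5e6, 2e6),  # m4 is out of bp order
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of qtlTools): within each chromosome,
# walk the markers in map order and flag any marker whose bp is lower
# than the running maximum bp of the markers before it.
flag_noncolinear <- function(info) {
  info <- info[order(info$chr, info$pos), ]
  bad <- ave(info$bp, info$chr, FUN = function(bp) {
    c(0, bp[-1] < cummax(bp)[-length(bp)])
  })
  info$colinear <- bad == 0
  info
}

checked <- flag_noncolinear(marker.info)
checked$marker.name[!checked$colinear]
```

With this toy table only m4 is flagged. Dropping the flagged rows before inferring gene positions keeps the bp-to-cM relationship monotonic within each chromosome.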

Details

Standard gff fields are as follows:

  1. seqname: name of the chromosome or scaffold

  2. source: name of the program that generated this feature, or the data source (database or project name)

  3. feature: feature type name, e.g. Gene, Variation, Similarity. **Note**: the term "Gene" must be present in this column.

  4. start: Start position of the feature, with sequence numbering starting at 1.

  5. end: End position of the feature, with sequence numbering starting at 1.

  6. score: A floating point value.

  7. strand: defined as + (forward) or - (reverse)

  8. frame: One of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.

  9. attribute: A semicolon-separated list of tag-value pairs, providing additional information about each feature.
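The attributeParse and seqnameParse arguments both describe simple string clean-up on these fields. The sketch below shows that clean-up in base R on a two-row gff-like data.frame. The helper strip_strings, the toy data, and the use of literal (fixed) matching are assumptions made for illustration; they are not taken from the qtlTools source.

```r
# Toy gff-like table with the nine standard fields.
gff <- data.frame(
  seqname = c("Chr01", "scaffold_12"),
  source = "example",
  feature = "gene",
  start = c(1000, 5000),
  end = c(2000, 6500),
  score = ".",
  strand = c("+", "-"),
  frame = ".",
  attribute = c("ID=geneA;Name=geneA.1", "ID=geneB;Name=geneB.1"),
  stringsAsFactors = FALSE
)

# Illustrative helper: remove each literal string in `patterns` from x.
strip_strings <- function(x, patterns) {
  for (p in patterns) x <- gsub(p, "", x, fixed = TRUE)
  x
}

# Take the first semicolon-separated element of the attribute column,
# then drop the "ID=" prefix (mirrors the attributeParse default).
first_attr <- vapply(strsplit(gff$attribute, ";", fixed = TRUE),
                     `[`, character(1), 1)
gff$geneID <- strip_strings(first_attr, c("ID="))

# Drop "Chr" and "scaffold_" from the seqname (mirrors the seqnameParse default).
gff$chr <- strip_strings(gff$seqname, c("Chr", "scaffold_"))

gff[, c("seqname", "chr", "geneID")]
```

Here the gene IDs become "geneA" and "geneB" and the chromosome names become "01" and "12". After this kind of clean-up the names still have to match the chromosome names in the cross exactly, so it is worth checking them against names(cross$geno) on real data.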

marker.info fields (names must match exactly). The first three fields can be generated using qtlTools::pullMap(cross).

  1. marker.name: Marker names (rownames from pull.map with as.table = TRUE)

  2. chr: the chromosome of the marker

  3. pos: the mapping position of the marker

  4. bp: the base-pair position of the marker
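Once each marker has both a pos and a bp value, a gene's mapping position can be estimated from its physical position. The sketch below uses per-chromosome linear interpolation with stats::approx. The choice of linear interpolation, the use of the gene midpoint, and the helper name interpolate_cM are all assumptions for illustration; the method findGenecM uses internally may differ.

```r
# Hypothetical markers with known map (cM) and physical (bp) positions.
markers <- data.frame(
  chr = c("1", "1", "1", "2", "2"),
  pos = c(0, 10, 30, 0, 20),
  bp  = c(1e5, 1e6, 2e6, 2e5, 3e6),
  stringsAsFactors = FALSE
)

# Hypothetical genes with physical coordinates only.
genes <- data.frame(
  id    = c("g1", "g2", "g3"),
  chr   = c("1", "1", "2"),
  start = c(5e5, 1.4e6, 1.5e6),
  end   = c(6e5, 1.6e6, 1.7e6),
  stringsAsFactors = FALSE
)

# Illustrative helper: interpolate cM at each gene midpoint, one
# chromosome at a time. rule = 2 clamps genes that lie beyond the
# outermost markers to the cM value of the nearest marker.
interpolate_cM <- function(genes, markers) {
  genes$mid <- (genes$start + genes$end) / 2
  genes$pos <- NA_real_
  for (ch in unique(genes$chr)) {
    m <- markers[markers$chr == ch, ]
    if (nrow(m) < 2) next  # approx() needs at least two markers
    g <- genes$chr == ch
    genes$pos[g] <- approx(x = m$bp, y = m$pos,
                           xout = genes$mid[g], rule = 2)$y
  }
  genes
}

res <- interpolate_cM(genes, markers)
res[, c("id", "chr", "mid", "pos")]
```

For these toy values the interpolated positions are 5, 20 and 10 cM. Interpolation of this kind only behaves well when bp increases with pos within each chromosome, so markers that break that order should be removed first.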

Value

The gff file, with three added columns:

Examples

## Not run: 
library(qtl)
library(qtlTools)
data(multitrait)
map<-pullMap(multitrait)
#simulate the bp positions of the markers with
#low recombination at the center of the chromosome
map$bp<-0
for(i in unique(map$chr)){
  n<-sum(map$chr==i)
  p<-sin((1:n/n)*pi)
  map$bp[map$chr==i]<-cumsum(p*1000000)
}


#simulate a gff w/ 1000 genes
gff<-data.frame(chr = rep(paste0("scaffold_",1:5),each = 200),
   feature = rep("gene",1000),
   start = rep(seq(from = 0, to = max(map$bp), length = 200), 5),
   end = rep(seq(from = 0, to = max(map$bp), length = 200), 5)+1000,
   strand = rep("+",1000),
   attribute = paste0("gene",1:1000,";","gene",1:1000,".1"), stringsAsFactors=FALSE)
test<-findGenecM(cross = multitrait, marker.info = map, gff = gff,
   gffCols = c("chr","feature","start","end","strand","attribute"))

par(mfrow=c(3,2))
for(i in unique(map$chr)){
  with(test[test$chr==i,], plot(pos,bp, col="grey"))
  with(map[map$chr==i,], points(pos,bp, col=i, pch = 19, cex=.8))
}

## End(Not run)

jtlovell/qtlTools documentation built on May 20, 2019, 3:14 a.m.