index_snps: Create index of equivalent SNPs


index_snps    R Documentation

Create index of equivalent SNPs

Description

For a set of SNPs and a map of markers/pseudomarkers, partition the SNPs into groups that lie within a common interval and have the same strain distribution pattern, and then create an index to a set of distinct SNPs, one per partition.

Usage

index_snps(map, snpinfo, tol = 0.00000001)

Arguments

map

Physical map of markers and pseudomarkers; generally created from insert_pseudomarkers() and used for a set of genotype probabilities (calculated with calc_genoprob()) that are to be used to interpolate SNP genotype probabilities (with genoprob_to_snpprob()).

snpinfo

Data frame with SNP information with the following columns:

  • chr - Character string or factor with chromosome.

  • pos - Position (in the same units as in the "map").

  • sdp - Strain distribution pattern: an integer between 1 and 2^n - 2, where n is the number of strains, whose binary encoding indicates the founder genotypes.

  • snp - Character string with SNP identifier (if missing, the rownames are used).
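To make the sdp column concrete, the following base R sketch encodes a matrix of founder genotypes as integers. It is an illustration only, not the package's implementation: sketch_sdp is a hypothetical helper, and it assumes genotypes are coded 1 or 3 (as in the Examples) and that the first strain corresponds to the least significant bit. In practice, use calc_sdp() as shown in the Examples.

```r
# Hypothetical helper (not part of qtl2): encode founder genotypes as an SDP.
# Assumptions: genotypes are coded 1 or 3, and the first strain (column 1)
# corresponds to the least significant bit of the binary encoding.
sketch_sdp <- function(geno) {
    weights <- 2^(seq_len(ncol(geno)) - 1)    # 1, 2, 4, ..., 2^(n-1)
    as.integer(((geno == 3) * 1) %*% weights) # sum the weights where genotype is 3
}

# same founder genotypes as in the Examples section (8 strains)
snpgeno <- rbind(m1=c(3,1,1,3,1,1,1,1),
                 m2=c(1,3,1,3,1,3,1,3),
                 m3=c(1,1,1,1,3,3,3,3),
                 m4=c(1,3,1,3,1,3,1,3))
sdp_sketch <- sketch_sdp(snpgeno)
sdp_sketch   # 9 170 240 170 under the assumptions above
```

Note that m2 and m4 share the same pattern, so they receive the same value; all values fall between 1 and 2^8 - 2 = 254.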

tol

Tolerance for determining whether a SNP is exactly at a position at which genotype probabilities were already calculated.

Details

We split the SNPs by chromosome and identify the intervals in the map that contain each. For SNPs within tol of a position at which the genotype probabilities were calculated, we take the SNP to be at that position. For each marker position or interval, we then partition the SNPs into groups that have distinct strain distribution patterns, and choose a single index SNP for each partition.
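The steps above can be sketched in base R for a single chromosome. This is a simplified illustration under stated assumptions, not the code used by the package: sketch_index_one_chr is a hypothetical helper, the SNPs are assumed to be sorted by position and to lie within the range of the map, and the first SNP in each group is (arbitrarily) taken as the index SNP.

```r
# Hypothetical helper (not part of qtl2): index the SNPs on one chromosome.
#   map_pos - sorted positions of markers/pseudomarkers
#   snp_pos - SNP positions, sorted and assumed to lie within range(map_pos)
#   snp_sdp - strain distribution patterns of the SNPs
sketch_index_one_chr <- function(map_pos, snp_pos, snp_sdp, tol = 1e-8) {
    # 0-based interval containing each SNP
    interval <- findInterval(snp_pos, map_pos) - 1L

    # SNPs within tol of a map position are taken to be at that position
    nearest <- vapply(snp_pos, function(p) which.min(abs(map_pos - p)), integer(1))
    on_map <- abs(map_pos[nearest] - snp_pos) <= tol
    interval[on_map] <- nearest[on_map] - 1L

    # group by (interval, on_map, sdp); the index is the row of the
    # first SNP in each group (an assumption made for this sketch)
    key <- paste(interval, on_map, snp_sdp, sep = ":")
    data.frame(interval = interval, on_map = on_map, index = match(key, key))
}

res <- sketch_index_one_chr(map_pos = c(0, 10, 20, 30),
                            snp_pos = c(2, 5, 10, 12, 15, 25),
                            snp_sdp = c(9, 9, 9, 170, 170, 9))
res
```

In this toy input, the third SNP sits exactly on a map position, so it forms its own group even though it shares a strain distribution pattern with the first two SNPs.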

Value

A data frame containing the input snpinfo with three added columns: "index" (which indicates the groups of equivalent SNPs), "interval" (which indicates the map interval containing the SNP, with values starting at 0), and "on_map" (which indicates whether the SNP is within tol of a position on the map). The rows are reordered by chromosome and position, and the values in the "index" column are defined separately within each chromosome.
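As a rough illustration of how the "index" column might be used, the sketch below pulls out one representative SNP per group from a hand-built data frame that mimics the output. The values are made up for illustration, the "interval" and "on_map" columns are omitted for brevity, and the sketch assumes that "index" holds the within-chromosome row number of each group's index SNP; check this against actual index_snps() output before relying on it.

```r
# Hand-built mock of indexed SNP info (illustrative values, not real output)
indexed <- data.frame(chr = c("19", "19", "19", "X", "X"),
                      pos = c(40.36, 40.40, 40.53, 110.91, 111.21),
                      sdp = c(9, 9, 170, 240, 170),
                      snp = c("m1", "m1b", "m2", "m3", "m4"),
                      index = c(1L, 1L, 3L, 1L, 2L),
                      stringsAsFactors = FALSE)

# row number of each SNP within its chromosome
row_in_chr <- ave(seq_len(nrow(indexed)), indexed$chr, FUN = seq_along)

# assumption: a SNP is an index SNP when "index" points at its own row
distinct_snps <- indexed[indexed$index == row_in_chr, ]
distinct_snps$snp
```

Here "m1b" is dropped because it is equivalent to "m1", leaving one SNP per group.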

See Also

genoprob_to_snpprob(), scan1snps(), find_index_snp()

Examples

## Not run: 
# load the qtl2 package and the example data
library(qtl2)
file <- paste0("https://raw.githubusercontent.com/rqtl/",
               "qtl2data/main/DO_Recla/recla.zip")
recla <- read_cross2(file)

# founder genotypes for a set of SNPs
snpgeno <- rbind(m1=c(3,1,1,3,1,1,1,1),
                 m2=c(1,3,1,3,1,3,1,3),
                 m3=c(1,1,1,1,3,3,3,3),
                 m4=c(1,3,1,3,1,3,1,3))
sdp <- calc_sdp(snpgeno)
snpinfo <- data.frame(chr=c("19", "19", "X", "X"),
                      pos=c(40.36, 40.53, 110.91, 111.21),
                      sdp=sdp,
                      snp=c("m1", "m2", "m3", "m4"), stringsAsFactors=FALSE)

# update SNP info by adding the index, interval, and on_map columns
snpinfo <- index_snps(recla$pmap, snpinfo)

## End(Not run)


qtl2 documentation built on May 29, 2024, 11:46 a.m.