readQ: Convert run files (q-matrices) to qlist.


View source: R/read.R

Description

Takes one or more STRUCTURE, TESS, BAPS, BASIC (numeric delimited runs) or CLUMPP format files and converts them to a qlist (list of dataframes).

Usage

readQ(files = NULL, filetype = "auto", indlabfromfile = FALSE, readci = FALSE)

Arguments

files

A character vector of one or more file paths.

filetype

A character indicating the input filetype. Options are 'auto', 'structure', 'tess2', 'baps', 'basic' or 'clumpp'. See Details.

indlabfromfile

A logical indicating if individual labels must be read from the input file and used as row names for the resulting dataframe. Spaces in labels may be replaced with _. Currently only applicable to STRUCTURE runs.

readci

A logical indicating if confidence intervals from the STRUCTURE run file (if available) should be read. Set to FALSE by default as it takes up extra space. This argument is only applicable to STRUCTURE run files.

Details

STRUCTURE, TESS2 and BAPS run files each have a unique layout and format (see the vignette). BASIC files can be Admixture run files, fastStructure meanQ files or any tab-delimited, space-delimited or comma-delimited tabular data without a header. CLUMPP files can be COMBINED, ALIGNED or MERGED files. COMBINED files are generated by clumppExport. ALIGNED and MERGED files are generated by CLUMPP.
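As a rough illustration of what a BASIC file looks like, the sketch below writes a small headerless, tab-delimited q-matrix to a temporary file and reads it back with base R. The values and the temporary file are made up for illustration, and the readQ call only runs if pophelper is installed.

```r
# Illustrative q-matrix: 4 individuals, 2 clusters (made-up values)
q <- matrix(c(0.2, 0.8,
              0.4, 0.6,
              0.6, 0.4,
              0.2, 0.8), ncol = 2, byrow = TRUE)

# Write as tab-delimited text without a header or row names
basic_file <- tempfile(fileext = ".txt")
write.table(q, basic_file, sep = "\t", row.names = FALSE, col.names = FALSE)

# Base R view of the file contents
basic_df <- read.table(basic_file, header = FALSE)
print(basic_df)

# With pophelper available, the same file can be passed to readQ
if (requireNamespace("pophelper", quietly = TRUE)) {
  blist <- pophelper::readQ(basic_file, filetype = "basic")
  str(blist)
}
```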

To convert TESS3 R objects to pophelper qlist, see readQTess3.

See the vignette for more details.

Value

A qlist (a list of dataframes) is returned. List items are named by input filenames. File extensions such as '.txt', '.csv', '.tsv' and '.meanQ' are removed from the filename. If filenames are missing or not available, list items are named sample1, sample2 etc. For STRUCTURE runs, if individual labels are present in the run file and indlabfromfile=TRUE, they are added to the dataframe as row names. STRUCTURE metadata including loci, burnin, reps, elpd, mvll and vll are added as attributes to each dataframe. When readci=TRUE and CI data is available in the STRUCTURE run file, it is read in and attached as an attribute named ci. For CLUMPP files, multiple runs within one file are suffixed by -1, -2 etc.
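The naming rule described above can be approximated in base R. The helper qlist_names below is hypothetical and is not pophelper's internal code. It strips only the four extensions listed here and falls back to sample1, sample2 etc. when no filenames are given.

```r
# Hypothetical helper: approximates the naming rule described above.
# This is not pophelper's internal code.
qlist_names <- function(files = NULL, n = length(files)) {
  if (is.null(files) || length(files) == 0) {
    # Fallback names when filenames are unavailable
    return(paste0("sample", seq_len(n)))
  }
  # Drop the directory part and a trailing known extension
  sub("\\.(txt|csv|tsv|meanQ)$", "", basename(files))
}

# File paths here are made up for illustration
qlist_names(c("runs/structure_01.txt", "admix.2.meanQ"))
qlist_names(n = 2)
```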

See Also

readQTess3

Examples

# STRUCTURE files
sfiles <- list.files(path=system.file("files/structure",package="pophelper"),
full.names=TRUE)
# create a qlist of all runs
slist <- readQ(sfiles)
slist <- readQ(sfiles,filetype="structure")

# use ind names from file
slist1 <- readQ(sfiles[1],indlabfromfile=TRUE)

# access the first run
run1 <- slist[[1]]

# access names of runs
names(slist)

# get attributes of a run
attributes(slist[[1]])

# get attributes of all runs
lapply(slist,attributes)

# TESS files
tfiles <- list.files(path=system.file("files/tess",package="pophelper"),
full.names=TRUE)
# create a qlist
tlist <- readQ(tfiles)

# BASIC files
afiles <- list.files(path=system.file("files/admixture",package="pophelper"),
full.names=TRUE)
# create a qlist
alist <- readQ(afiles)

# CLUMPP files
cfiles1 <- system.file("files/STRUCTUREpop_K4-combined.txt",
package="pophelper")
cfiles2 <- system.file("files/STRUCTUREpop_K4-combined-aligned.txt",
package="pophelper")
cfiles3 <- system.file("files/STRUCTUREpop_K4-combined-merged.txt",
package="pophelper")

# create a qlist
clist1 <- readQ(cfiles1)
clist2 <- readQ(cfiles2)
clist3 <- readQ(cfiles3)

# manually create qlist
df1 <- data.frame(Cluster1=c(0.2,0.4,0.6,0.2),Cluster2=c(0.8,0.6,0.4,0.8))
df2 <- data.frame(Cluster1=c(0.3,0.1,0.5,0.6),Cluster2=c(0.7,0.9,0.5,0.4))

# one-element qlist
q1 <- list("sample1"=df1)
str(q1)

# two-element qlist
q2 <- list("sample1"=df1,"sample2"=df2)
str(q2)
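
When building a qlist by hand, a quick sanity check can catch malformed input. The helper is_qlist_like below is hypothetical and not part of pophelper. It assumes a qlist is a named list of all-numeric dataframes whose rows each sum to about 1, which holds for typical q-matrices but is an assumption here rather than a rule stated by the package.

```r
# Hypothetical helper, not part of pophelper
is_qlist_like <- function(x, tol = 1e-6) {
  # Must be a plain, non-empty list (a bare dataframe does not count)
  if (!is.list(x) || is.data.frame(x) || length(x) == 0) return(FALSE)
  # Every element must be named
  if (is.null(names(x)) || any(names(x) == "")) return(FALSE)
  # Every element must be a numeric dataframe with rows summing to ~1
  all(vapply(x, function(df) {
    is.data.frame(df) &&
      all(vapply(df, is.numeric, logical(1))) &&
      all(abs(rowSums(df) - 1) < tol)
  }, logical(1)))
}

df1 <- data.frame(Cluster1=c(0.2,0.4,0.6,0.2),Cluster2=c(0.8,0.6,0.4,0.8))
df2 <- data.frame(Cluster1=c(0.3,0.1,0.5,0.6),Cluster2=c(0.7,0.9,0.5,0.4))

is_qlist_like(list("sample1"=df1,"sample2"=df2))
is_qlist_like(df1)
```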

royfrancis/pophelper documentation built on Jan. 1, 2021, 4:58 p.m.