interp_genoprob() for linear interpolation of genotype probabilities, useful when comparing two sets of genotype probabilities derived by different methods and calculated at slightly different positions.
Have zip_datafiles() give a warning if any of the paths include "..".
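The linear interpolation mentioned for interp_genoprob() can be pictured with a small base R sketch. This is not the qtl2 implementation: the helper name interp_prob_sketch, the positions-by-genotypes matrix layout, and the handling of positions outside the original range (rule = 2) are all assumptions made for illustration.

```r
# Hypothetical helper (not part of qtl2): linearly interpolate one
# individual's genotype probabilities from positions `pos_from` to `pos_to`.
# `prob` is assumed to be a positions x genotypes matrix.
interp_prob_sketch <- function(prob, pos_from, pos_to) {
    stopifnot(nrow(prob) == length(pos_from))
    # interpolate each genotype column separately; rule = 2 holds the end
    # values constant for target positions outside the original range
    result <- vapply(seq_len(ncol(prob)),
                     function(g) approx(pos_from, prob[, g], xout = pos_to, rule = 2)$y,
                     numeric(length(pos_to)))
    matrix(result, nrow = length(pos_to),
           dimnames = list(NULL, colnames(prob)))
}

prob <- cbind(AA = c(1.0, 0.5, 0.0), AB = c(0.0, 0.5, 1.0))
interp_prob_sketch(prob, pos_from = c(0, 10, 20), pos_to = c(5, 15))
```

Because each column is interpolated linearly between the same pair of flanking positions, rows that sum to 1 in the input still sum to 1 in the output.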
insert_pseudomarkers() now has a cores argument for doing calculations in parallel on multiple CPUs.
compare_genoprob() for getting tabular comparison of two sets of genotype probabilities, for a single individual on a single chromosome.
deterministic argument to guess_phase().
For crosstype "ail3", forgot to include the functions for requiring and checking the founder genotypes.
check_cross2 wasn't checking for missing names in gmap, pmap, and founder_geno.
guess_phase(), for guessing the phase of imputed genotypes (such as from maxmarg()), mostly for visualization of genotypes of one individual from a multi-parent population.
mpp_decode_geno() and mpp_encode_alleles().
"dh6" for 6-way doubled haploids (for a set of maize MAGIC populations developed at the Wisconsin Crop Innovation Center) and "ail3" for 3-way advanced intercross lines.
check_founder_geno_size() for checking the dimensions of the founder genotype data from R, and added this check to the check_cross2() function.
read_csv_numer() that is like read_csv() but assumes that all columns except the first are strictly numeric. Now used by read_pheno() to vastly speed things up (because otherwise I was assuming everything was character and then converting, and that conversion to numeric was deathly slow).
summary.cross2() within RStudio.
Added function scale_kinship() which converts a kinship matrix (or a list of such, in the case of the "leave one chromosome out" method) to be like a correlation matrix.
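One way to read "to be like a correlation matrix" is to divide each entry by the square roots of the two corresponding diagonal entries, so that the diagonal becomes 1. The base R sketch below assumes that reading. The helper name scale_kinship_sketch is hypothetical and the code is not taken from qtl2.

```r
# Hypothetical helper (not the qtl2 implementation): rescale a kinship
# matrix so its diagonal is 1, as for a correlation matrix.
scale_kinship_sketch <- function(K) {
    # a list of matrices (e.g. one per chromosome) is handled element-wise
    if (is.list(K)) return(lapply(K, scale_kinship_sketch))
    d <- sqrt(diag(K))
    K / outer(d, d)
}

K <- matrix(c(1.2, 0.3,
              0.3, 0.8), nrow = 2, byrow = TRUE,
            dimnames = list(c("a", "b"), c("a", "b")))
scale_kinship_sketch(K)
```

For a single matrix this is the same rescaling that base R's cov2cor() applies to a covariance matrix.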
Removed the "normalize" argument from calc_kinship(), though left the internal function normalize_kinship() in place, for now.
insert_pseudomarkers now gives an error if the input map is NULL.
count_xo now works with the output of sim_geno. The result is a 3d-array of counts of crossovers for each individual on each chromosome in each imputation.
chisq_colpairs which performs chi-square tests for independence on all pairs of columns of a matrix. It just calculates the statistics.
save_rf to est_map(); if TRUE, the estimated recombination fractions are saved as an attribute ("rf") of the result. This can be useful for diagnostic purposes, for example when the estimated recombination fraction between markers is > 1/2. (After converting to genetic distance, rf>1/2 is indistinguishable from rf=1/2.)
New function reduce_map_gaps that reduces the length of any gaps in a map. (Gaps greater than min_gap are reduced to min_gap.)
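The gap-reduction rule described for reduce_map_gaps can be sketched in base R for one chromosome. The helper name reduce_gaps_sketch and the choice to keep the first marker's position fixed are assumptions for illustration. The real function works on a whole map and may differ in detail.

```r
# Hypothetical helper (not the qtl2 implementation): shrink any
# inter-marker gap larger than `min_gap` down to `min_gap`.
# `pos` is a sorted numeric vector of marker positions on one chromosome.
reduce_gaps_sketch <- function(pos, min_gap) {
    if (length(pos) < 2) return(pos)
    gaps <- diff(pos)
    gaps[gaps > min_gap] <- min_gap
    # rebuild positions from the (possibly shortened) gaps,
    # keeping the first marker where it was
    result <- c(pos[1], pos[1] + cumsum(gaps))
    names(result) <- names(pos)
    result
}

pos <- c(m1 = 0, m2 = 10, m3 = 100, m4 = 105)
reduce_gaps_sketch(pos, min_gap = 50)  # positions 0, 10, 60, 65
```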
maxmarg now picks at random among genotypes that jointly share the maximum probability. Previously, it picked the first among these. Added an argument tol; if two genotypes have probabilities that differ by no more than tol, they are treated as having the same probability.
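The tie-breaking rule described for maxmarg can be illustrated for a single probability vector. This base R sketch is an assumption about the behavior, not the qtl2 code. The helper name pick_max_sketch and the default value of tol are hypothetical.

```r
# Hypothetical helper (not the qtl2 implementation): index of the most
# probable genotype, choosing at random among genotypes whose probability
# is within `tol` of the maximum.
pick_max_sketch <- function(p, tol = 1e-13) {
    candidates <- which(p >= max(p) - tol)
    # guard the single-candidate case: sample(5, 1) would draw from 1:5
    if (length(candidates) == 1) return(unname(candidates))
    unname(sample(candidates, 1))
}

set.seed(20)
pick_max_sketch(c(0.10, 0.45, 0.45))  # returns 2 or 3, chosen at random
pick_max_sketch(c(0.20, 0.80))        # always 2
```

Picking the first of the tied genotypes, as was done previously, corresponds to replacing the sample() call with candidates[1].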
New function calc_entropy takes the results of calc_genoprob and calculates, for each individual at each genomic position, the entropy of the genotype probability distribution, as a measure of missing information.
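The entropy of a genotype probability distribution can be computed as minus the sum of p log p, taking 0 log 0 as 0. The base R sketch below assumes log base 2 and an individuals x genotypes x positions array layout. Both are assumptions here, and entropy_sketch is a hypothetical name, not a qtl2 function.

```r
# Hypothetical helper (not the qtl2 implementation): entropy of one
# probability vector, in bits (log base 2 is an assumption).
entropy_sketch <- function(p) {
    p <- p[p > 0]          # treat 0 * log(0) as 0
    -sum(p * log2(p))
}

entropy_sketch(c(1, 0, 0))    # 0: genotype known with certainty
entropy_sketch(c(0.5, 0.5))   # 1: two equally likely genotypes

# Applied over an array assumed to be individuals x genotypes x positions
pr <- array(c(1, 0.5, 0, 0.5), dim = c(2, 2, 1),
            dimnames = list(c("ind1", "ind2"), c("AA", "AB"), "pos1"))
apply(pr, c(1, 3), entropy_sketch)
```

An entropy of 0 means the genotype is fully determined. Larger values indicate more missing information.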
Fix bug in find_map_gaps regarding the case that the output is empty.
Fix bug in attempting to subset calc_genoprob output by individual using individuals that aren't present in the data.
Fix bug in est_map where it was producing NaNs in some cases.
read_cross2 now unzips a .zip file to a separate directory, to avoid possibility of clashing of multiple sets of files.
read_cross2 will now ignore any JSON or YAML files in the .zip file that have the pattern __MACOSX/._*.
read_cross2 will stop with an error if a .zip file contains multiple JSON or multiple YAML files. If there's both a YAML and a JSON file, the YAML file is used and a warning is issued.
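The rules above for choosing a control file from a .zip file's contents can be written as a small base R function. This is a sketch of the stated rules, not the read_cross2 source. The helper name choose_control_file_sketch, the accepted file extensions, and the error when no control file is found are assumptions.

```r
# Hypothetical helper (not part of qtl2): pick the control file from a
# vector of file names found inside a zip archive.
choose_control_file_sketch <- function(files) {
    # ignore macOS resource-fork entries such as "__MACOSX/._cross.yaml"
    files <- files[!grepl("__MACOSX/\\._", files)]
    json <- files[grepl("\\.json$", files, ignore.case = TRUE)]
    yaml <- files[grepl("\\.ya?ml$", files, ignore.case = TRUE)]  # ".yml" is an assumption

    if (length(json) > 1 || length(yaml) > 1)
        stop("zip file contains multiple JSON or multiple YAML files")
    if (length(json) + length(yaml) == 0)
        stop("no JSON or YAML file found")  # assumed behavior, not stated above

    if (length(json) == 1 && length(yaml) == 1) {
        warning("both YAML and JSON files found; using the YAML file")
        return(yaml)
    }
    c(yaml, json)  # exactly one of the two is non-empty here
}

choose_control_file_sketch(c("cross.yaml", "geno.csv", "__MACOSX/._cross.yaml"))
```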
est_map now gives a warning if it reaches the maximum number of iterations without converging.
"risib4", "risib8", and "magic19". The "risib8" cross type corresponds to the Collaborative Cross. The "magic19" cross type corresponds to the 19-way Arabidopsis MAGIC lines of Kover et al (2009) PLOS Genetics 5:e1000551.
Added argument lowmem to est_map; default is FALSE, which corresponds to a new implementation that uses more memory but is considerably faster.
Added function find_map_gaps for identifying larger inter-marker gaps in a genetic map.
Added function calc_geno_freq for calculating genotype frequencies, by individual or by marker (from the multipoint genotype probabilities returned by calc_genoprob).
"riself4", "riself8", and "riself16", for multi-way MAGIC populations (multi-way RILs by selfing).
read_cross2 in the case that data has a physical map but not a genetic map.
write_control_file; if TRUE, overwrite the file, if it's present. (Previously, you were always forced to first remove it.)
Added function ind_ids_covar to grab individual IDs from the covariate data.
ind_ids() now returns individuals that are in any of geno, pheno, covar.
subset_cross2() now deals properly with the case that chromosome or individual IDs are not found in cross object, and deals with the case that geno and pheno (and covar) have different individuals.
Added functions count_xo and locate_xo for getting estimates of the number of crossovers on each chromosome in each individual, and of their locations.
Added compare_geno for comparing raw genotypes between pairs of individuals (to look for possible sample duplicates).
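The idea of comparing raw genotypes between pairs of individuals can be sketched in base R as the proportion of matching genotypes at markers where both individuals were typed. The helper name compare_geno_sketch and the individuals x markers matrix with NA for missing calls are assumptions. The output of the real compare_geno may be organized differently.

```r
# Hypothetical helper (not the qtl2 implementation): for each pair of
# individuals, the proportion of markers with matching genotypes, among
# markers where both are non-missing. `geno` is individuals x markers.
compare_geno_sketch <- function(geno) {
    n <- nrow(geno)
    result <- matrix(NA_real_, n, n,
                     dimnames = list(rownames(geno), rownames(geno)))
    for (i in seq_len(n)) {
        for (j in seq_len(n)) {
            both <- !is.na(geno[i, ]) & !is.na(geno[j, ])
            if (any(both))
                result[i, j] <- mean(geno[i, both] == geno[j, both])
        }
    }
    result
}

geno <- rbind(ind1 = c(1, 2, 2, NA, 1),
              ind2 = c(1, 2, 2, 3, 1),
              ind3 = c(3, 1, 2, 3, NA))
compare_geno_sketch(geno)
```

Here ind1 and ind2 agree at every marker typed in both, which is the pattern one would look for in a possible sample duplicate.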
Added calc_errorlod to help identify potential genotyping errors (and problem markers or individuals).
Made various small improvements to the handling of problems in the input files.
Small changes to better handle genotype probabilities that are in the qtl2feather format.
Added internal functions dim.calc_genoprob and dimnames.calc_genoprob, from Brian Yandell, for use with qtl2feather, which uses feather to store genotype probabilities in a file (to save memory).
In the process of revising various functions to use qtl2feather, particularly in grabbing dimnames (with the above functions), but also to avoid seq(along=genoprobs) and instead use seq_len(length(genoprobs)).
Removed the distinction between "lines" and "individuals", and the linemap component in the input that connected them. (While for RILs like the Collaborative Cross, we may want to work with individual-level phenotypes, it seems best to deal with that outside of the cross object.)
Removed the functions n_lines() and line_ids(). Added some functions:
n_ind_geno() for number of genotyped individuals, and ind_ids_geno() to get their IDs.
n_ind_pheno() for number of phenotyped individuals, and ind_ids_pheno() to get their IDs.
n_ind_gnp() for number of individuals with both genotypes and phenotypes, and ind_ids_gnp() to get their IDs.
Also, n_ind() and ind_ids() now return the total number of individuals, across both genotypes and phenotypes.
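The relationship between these ID sets can be shown with plain set operations in base R. The vectors below are made-up example IDs, and the code only mirrors the descriptions above. It does not call the qtl2 functions.

```r
# Made-up example IDs (illustration only)
ids_geno  <- c("1", "2", "3")   # genotyped individuals
ids_pheno <- c("2", "3", "4")   # phenotyped individuals

# individuals with both genotypes and phenotypes (cf. ind_ids_gnp)
ids_gnp <- intersect(ids_geno, ids_pheno)

# all individuals, across genotypes and phenotypes (cf. ind_ids)
ids_all <- union(ids_geno, ids_pheno)

length(ids_gnp)  # 2
length(ids_all)  # 4
```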
find_ibd_segments that takes genotypes for a set of inbred strains and searches for segments where strain pairs look to be IBD.
calc_genoprob now needs you to provide a pseudomarker map (created with insert_pseudomarkers).