dfmockdata: Generate mock data


View source: R/dfmockdata.R

Description

This function produces a mock survey with observed log-masses x.obs with Gaussian uncertainties and distances r, using a custom mass function (MF) and selection function.

Usage

dfmockdata(
  n = NULL,
  seed = 1,
  veff = NULL,
  f = NULL,
  dVdr = NULL,
  gdf = function(x) dfmodel(x, c(-2, 11, -1.3), type = "Schechter"),
  g = NULL,
  sigma = 0,
  rmin = 0,
  rmax = 20,
  xmin = 2,
  xmax = 13,
  shot.noise = FALSE,
  verbose = FALSE
)

Arguments

n

Number of objects (galaxies) to be generated. If n=NULL, the number is determined from the mass function (gdf) and the selection criteria (specified by f and dVdr). Otherwise, the survey volume (specified by the derivative dVdr) is automatically multiplied by the scaling factor required to obtain the requested number of objects n.
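The rescaling described for n can be illustrated in a few lines of base R. This is only a sketch, not the package's implementation: the expected count n.expected is a made-up value, and n.requested and dVdr.rescaled are hypothetical local names. It relies on the expected number of detections scaling linearly with the survey volume.

```r
# Made-up expected number of detections for the unscaled survey volume
n.expected <- 250

# Number of objects requested through the argument n
n.requested <- 1000

# Expected counts scale linearly with volume, so the volume derivative
# is multiplied by the ratio of requested to expected counts
rescaling.factor <- n.requested / n.expected

dVdr <- function(r) 2.13966 * r^2                       # unscaled volume derivative
dVdr.rescaled <- function(r) rescaling.factor * dVdr(r) # rescaled volume derivative

print(rescaling.factor)
```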

seed

An integer used as the seed for the random number generator. To generate different realizations with the same survey specifications, it suffices to vary this number.

veff

is the effective volume function veff(x), defined as the cosmic volume in which sources of log-mass x can be detected by the survey. If this function is specified, f, dVdr and g cannot be specified.

f

is the selection function f(x,r), giving the ratio between the expected number of detected galaxies and true galaxies of log-mass x and comoving distance r. Normally this function is bounded between 0 and 1. It takes the value 1 at distances where objects of log-mass x are easily detected, and 0 at distances where such objects are impossible to detect. A rapid, continuous drop from 1 to 0 normally occurs at the limiting distance out to which a galaxy of log-mass x can be picked up. f(x,r) can never be smaller than 0, but values larger than 1 are conceivable if the survey contains a large number of false positive detections. The default is f = function(x,r) erf((1-1e3*r/sqrt(10^x))*20)*0.5+0.5, which mimics a sensitivity-limited survey with a fuzzy limit.
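To see how the default selection function behaves, it can be evaluated around its limiting distance. The sketch below uses base R only. Since base R has no erf, a helper is defined here through the normal distribution function; this helper and the name r.lim are assumptions made for illustration, not part of the package interface.

```r
# Error function expressed through the standard normal CDF
erf <- function(z) 2 * pnorm(z * sqrt(2)) - 1

# Default selection function as quoted in the description of f
f <- function(x, r) erf((1 - 1e3 * r / sqrt(10^x)) * 20) * 0.5 + 0.5

x <- 9                       # log-mass of the test object
r.lim <- 1e-3 * sqrt(10^x)   # distance at which the argument of erf vanishes

# Evaluate f well inside, at, and well beyond the limiting distance
print(f(x, c(0.5, 1, 1.5) * r.lim))
```

The three values are close to 1, 0.5 and 0, which shows the fuzzy transition around r = 1e-3*sqrt(10^x).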

dVdr

is the function dVdr(r), specifying the derivative of the survey volume V(r) as a function of comoving distance r. This survey volume is simply the total observed volume, irrespective of the detection probability, which is already specified by the function f. Normally, the survey volume is given by V(r)=Omega*r^3/3, where Omega is the solid angle of the survey. Hence, the derivative is dVdr(r)=Omega*r^2. The default is Omega=2.13966 [steradians], chosen such that the expected number of galaxies is exactly 1000 when combined with the default selection function f(x,r).
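The relation between dVdr, f and the effective volume can be sketched in base R. The block below assumes that the effective volume is the integral of f(x,r)*dVdr(r) over the distance range [rmin, rmax], which is the usual definition but is not stated explicitly on this page. The erf helper and the local function veff are written for this illustration only.

```r
# Error function via the normal CDF (base R has no erf)
erf <- function(z) 2 * pnorm(z * sqrt(2)) - 1

# Default selection function and volume derivative, as quoted in the argument descriptions
f <- function(x, r) erf((1 - 1e3 * r / sqrt(10^x)) * 20) * 0.5 + 0.5
Omega <- 2.13966
dVdr <- function(r) Omega * r^2
rmin <- 0
rmax <- 20

# Total survey volume between rmin and rmax, from V(r) = Omega * r^3 / 3
V.total <- Omega * (rmax^3 - rmin^3) / 3

# Assumed definition: effective volume = integral of f(x, r) * dVdr(r) dr
veff <- function(x) sapply(x, function(xi) {
  integrate(function(r) f(xi, r) * dVdr(r), rmin, rmax)$value
})

print(V.total)
print(veff(c(7, 8, 12)))
```

Under this assumption, very massive objects (here x = 12) are detectable throughout the survey, so their effective volume approaches the total volume, while low-mass objects are confined to a much smaller volume.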

gdf

is the 'generative distribution function', i.e. the underlying mass function, from which the galaxies are drawn. This function is a function of log-mass x. It returns the expected number of galaxies per unit of cosmic volume V and log-mass x. The default is a Schechter function.
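For orientation, a Schechter function in log-mass can be written in a few lines of base R. This sketch assumes that the three default parameters c(-2, 11, -1.3) are, in order, the log10 normalisation, the log10 break mass and the faint-end slope. The function name schechter is hypothetical, and the exact convention used by dfmodel should be checked against its own documentation.

```r
# Schechter function per unit volume and per unit log-mass (dex).
# Assumed parameter order: p[1] = log10(phi*), p[2] = log10(M*), p[3] = alpha
schechter <- function(x, p = c(-2, 11, -1.3)) {
  mu <- 10^(x - p[2])                           # mass in units of the break mass
  log(10) * 10^p[1] * mu^(p[3] + 1) * exp(-mu)  # power law with exponential cut-off
}

x <- c(8, 9, 10, 11, 12)
print(schechter(x))
```

A custom gdf can be any function of x with the same units, for example function(x) schechter(x, c(-2.5, 10.5, -1.1)).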

g

function of distance r describing the number-density variation of galaxies due to cosmic large-scale structure (LSS). Explicitly, g(r)>0 is the number-density at r, relative to the number-density without LSS. Values between 0 and 1 are underdense regions, values larger than 1 are overdense regions. In the absence of LSS, g(r)=1. Note that g is automatically rescaled, such that its average value in the survey volume is 1.
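The automatic rescaling of g can be illustrated as follows. This sketch assumes that the "average value in the survey volume" is the volume-weighted mean, i.e. the integral of g(r)*dVdr(r) divided by the integral of dVdr(r). The modulation g.raw is a made-up example.

```r
# Default survey geometry
Omega <- 2.13966
dVdr <- function(r) Omega * r^2
rmin <- 0
rmax <- 20

# Made-up LSS modulation: density oscillating with distance, always positive
g.raw <- function(r) 1 + 0.5 * sin(r)

# Volume-weighted mean of g.raw over the survey (assumed meaning of "average")
g.mean <- integrate(function(r) g.raw(r) * dVdr(r), rmin, rmax)$value /
  integrate(dVdr, rmin, rmax)$value

# Rescaled function with a volume-weighted mean of 1
g <- function(r) g.raw(r) / g.mean

print(g.mean)
```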

sigma

Gaussian observing errors in log-mass x, which are automatically added to the survey. sigma can either be (1) a scalar, (2) a vector of n elements, or (3) a function of the true log-mass x.
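The three accepted forms of sigma can be reduced to one standard deviation per object before the Gaussian scatter is applied. The helper resolve.sigma below is hypothetical and only illustrates the idea. The true log-masses are made-up values.

```r
# Turn a scalar, a vector, or a function into one sigma per object
resolve.sigma <- function(sigma, x.true) {
  if (is.function(sigma)) return(sigma(x.true))                    # (3) function of true log-mass
  if (length(sigma) == 1) return(rep(sigma, length(x.true)))       # (1) scalar
  stopifnot(length(sigma) == length(x.true))                       # (2) vector, one value per object
  sigma
}

set.seed(1)
x.true <- c(8.2, 9.5, 10.1, 10.8)   # made-up true log-masses

# Example of a mass-dependent uncertainty: larger errors for lower masses
x.err <- resolve.sigma(function(x) 0.1 + 0.05 * (11 - x), x.true)

# Observed log-masses: true values perturbed by Gaussian errors
x.obs <- x.true + rnorm(length(x.true), mean = 0, sd = x.err)

print(cbind(x.true, x.err, x.obs))
```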

rmin, rmax

Minimum and maximum distance of the survey. Outside these limits the function f(x,r) will automatically be assumed to be 0.

xmin, xmax

Minimum and maximum log-mass in the survey. For optimal performance, specify these boundaries in such a way that they certainly contain all sources generated by the survey, but do not span a much larger range.

shot.noise

Logical flag. If set to TRUE, the number of galaxies in the survey can differ from the expected number, following a Poisson distribution.
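The effect of shot.noise can be mimicked with base R. In this sketch the expected count of 1000 matches the default survey, while the rounding used for the case without shot noise is an assumption about how a non-integer expectation would be handled.

```r
set.seed(1)

n.expected <- 1000   # expected number of detected galaxies

# Without shot noise: use the expected number itself (rounding is an assumption)
n.fixed <- round(n.expected)

# With shot noise: the realised number follows a Poisson distribution;
# five independent realisations are drawn here
n.poisson <- rpois(5, lambda = n.expected)

print(n.fixed)
print(n.poisson)
```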

verbose

Logical flag. If set to TRUE, some information will be displayed in the console while generating the mock survey.

Value

dfmockdata returns a list of arrays and scalars:

x

Array of observed log-masses.

x.err

Gaussian uncertainties on x.

x.true

Array of true log-masses, i.e. the values of x before they were perturbed by random uncertainties x.err.

r

Array of comoving distances, only available if a function f is given.

f

Selection function provided as input argument.

g

Cosmic LSS function provided as input argument.

dVdr

Derivative of survey volume provided as input argument, but rescaled to the requested number of galaxies n.

veff

Function returning the effective volume as a function of log-mass x.

veff.values

Array of effective volumes for each galaxy.

scd

Function returning the expected source count density as a function of log-mass x.

rmin,rmax

Range of comoving distances r, spanned by the survey. Same as input arguments.

xmin,xmax

Range of log-masses x provided as input argument. This range is generally larger than the range spanned by the values of x and is meant to span the maximally conceivable range of x given the survey specifications.

rescaling.factor

Value of rescaling factor applied to the cosmic volume to match the requested number of galaxies n.

Author(s)

Danail Obreschkow

See Also

dffit

Examples

# draw 1000 galaxies with mass errors of 0.3 dex from a Schechter function
# with parameters (-2,11,-1.3) and a preset selection function
mock = dfmockdata(sigma = 0.3)

# plot the distance-log(mass) relation of observed data, true data, and approximate survey limit
plot(mock$r,mock$x,col='blue')
points(mock$r,mock$x.true,pch=20)
x = seq(5,11,0.01)
lines(1e-3*sqrt(10^x),x,col='red') # default f(x,r) equals 0.5 where r = 1e-3*sqrt(10^x)

# These data can then be used to fit a MF in several ways. For instance,
# assuming that the effective volume function Veff(x) is known:
selection = mock$veff
survey = dffit(mock$x, selection, mock$x.err)

# or assuming that Veff is known only on a galaxy-by-galaxy basis
selection = mock$veff.values
dffit(mock$x, selection, mock$x.err)

# or assuming that Veff is known on a galaxy-by-galaxy basis, but approximated analytically
# outside the range of observed galaxy masses
selection = list(mock$veff.values, mock$veff)
dffit(mock$x, selection, mock$x.err)

# or assuming that the full selection function f(x,r) and the observing volume
# derivative dVdr(r) are known
selection = list(mock$f, mock$dVdr, mock$rmin,mock$rmax)
dffit(mock$x, selection, mock$x.err)

obreschkow/dftools documentation built on June 25, 2021, 10:45 p.m.