population: Generate a Population with Controlled Sparsity and Unique...
In predomics/predomicspkg: Interpretable Prediction in Omics Data

population

R Documentation

Generate a Population with Controlled Sparsity and Unique Individuals

Description

Generates a population of individuals (feature subsets) with a specified sparsity. A portion of the population can optionally be seeded from a provided best ancestor, and individuals can be required to be unique.

Usage

population(
  clf,
  size_ind,
  size_world,
  best_ancestor = NULL,
  size_pop = NULL,
  seed = NULL
)

Arguments

`clf`	A classifier object whose 'params' element holds the parameters for population generation, including:
  - 'size_pop': Size of the population to generate.
  - 'perc_best_ancestor': Percentage of the population to generate based on the best ancestor.
  - 'unique_vars': Boolean indicating whether each individual in the population should be unique.
  - 'popSourceFile': Path to a file containing saved populations to import.
  - 'current_sparsity': Target sparsity for each individual in the population.
`size_ind`	Integer specifying the number of features (sparsity) for each individual.
`size_world`	Integer representing the size of the "world" (total number of possible features).
`best_ancestor`	Optional vector or list representing the best ancestor to use as a template for generating a portion of the population.
`size_pop`	Optional integer specifying the number of individuals in the population. Defaults to 'clf$params$size_pop'.
`seed`	Optional integer seed for random number generation, ensuring reproducibility.
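
The effect of 'seed' can be illustrated with base R alone. The sketch below assumes that the seed is passed to `set.seed()` before any random draw; `draw_subset` is a hypothetical helper written for this illustration, not a predomics function.

```r
# Hypothetical helper: draws one sorted feature subset.
# Assumption: a non-NULL seed is applied with set.seed() before sampling.
draw_subset <- function(size_ind, size_world, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sort(sample.int(size_world, size_ind))
}

a <- draw_subset(5, 20, seed = 123)
b <- draw_subset(5, 20, seed = 123)
identical(a, b)  # TRUE: the same seed gives the same subset
```

Note that `set.seed()` changes the global random number generator state, so later random draws in the same session are affected as well.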

Details

The function generates a population by:

1. **Best Ancestor**: If 'best_ancestor' is provided, a portion of the population is generated by slightly modifying the ancestor.
2. **Population Size**: Determines the final population size based on 'size_pop' and combinatorial limits.
3. **Uniqueness**: If 'unique_vars' is 'TRUE', each individual is checked for uniqueness before it is added to the population.
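
The three steps can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the predomics implementation: `sketch_population` and its flat argument list are hypothetical, "slightly modifying the ancestor" is assumed to mean keeping at most `size_ind - 1` ancestor features and filling the rest at random, and the combinatorial limit is assumed to be `choose(size_world, size_ind)`.

```r
# Hypothetical sketch of the steps described above (not the package code).
sketch_population <- function(size_ind, size_world, size_pop,
                              best_ancestor = NULL, perc_best_ancestor = 10,
                              unique_vars = TRUE, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  stopifnot(size_ind >= 1, size_ind <= size_world)
  world <- seq_len(size_world)

  # Population size: only choose(size_world, size_ind) distinct subsets exist,
  # so a population of unique individuals cannot be larger than that.
  if (unique_vars) size_pop <- min(size_pop, choose(size_world, size_ind))

  # Best ancestor: how many individuals are derived from it.
  n_seeded <- 0
  if (!is.null(best_ancestor)) {
    n_seeded <- min(size_pop, ceiling(size_pop * perc_best_ancestor / 100))
  }

  # Assumed variation rule: keep at most size_ind - 1 ancestor features,
  # then fill up with randomly chosen features from outside the kept set.
  make_seeded <- function() {
    keep <- intersect(best_ancestor, world)
    n_keep <- min(length(keep), size_ind - 1)
    keep <- keep[sample.int(length(keep), n_keep)]
    rest <- setdiff(world, keep)
    c(keep, rest[sample.int(length(rest), size_ind - n_keep)])
  }
  make_random <- function() sample.int(size_world, size_ind)

  pop <- list()
  seen <- character(0)
  tries <- 0
  max_tries <- 100 * size_pop  # guard against endless rejection of duplicates
  while (length(pop) < size_pop && tries < max_tries) {
    tries <- tries + 1
    from_ancestor <- length(pop) < n_seeded && tries <= 10 * n_seeded
    ind <- sort(as.integer(if (from_ancestor) make_seeded() else make_random()))

    # Uniqueness: skip a candidate that is already in the population.
    key <- paste(ind, collapse = "_")
    if (unique_vars && key %in% seen) next
    seen <- c(seen, key)
    pop[[length(pop) + 1]] <- ind
  }
  pop
}

pop_sketch <- sketch_population(size_ind = 5, size_world = 20, size_pop = 10,
                                best_ancestor = c(2, 7, 11, 15),
                                perc_best_ancestor = 20, seed = 42)
length(pop_sketch)  # 10
```

Identifying each individual by a string key of its sorted indices keeps the uniqueness test simple; the try limit prevents the loop from running forever when few distinct subsets remain.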

**Additional Options**:

- The function can import previously saved populations from 'popSourceFile' and add them to the current population.
- It ensures that population size and sparsity constraints are respected.

Value

A list of individuals (feature subsets), each represented as a sorted vector of selected feature indices.
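
The shape of the return value can be checked with a small helper. The sketch below assumes that every individual has exactly 'size_ind' distinct indices in the range 1 to 'size_world'; `is_valid_population` and `toy_pop` are hypothetical names introduced for this illustration only.

```r
# Hypothetical checker for the documented return shape (not part of predomics).
is_valid_population <- function(pop, size_ind, size_world) {
  is.list(pop) &&
    all(vapply(pop, function(ind) {
      length(ind) == size_ind &&
        !is.unsorted(ind, strictly = TRUE) &&  # sorted, no repeated feature
        all(ind >= 1 & ind <= size_world)      # indices stay inside the world
    }, logical(1)))
}

# Hand-written toy population with two individuals of three features each
toy_pop <- list(c(1L, 4L, 9L), c(2L, 3L, 20L))
is_valid_population(toy_pop, size_ind = 3, size_world = 20)  # TRUE
```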

Examples

## Not run: 
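clf <- list(
  params = list(
    size_pop = 10,            # number of individuals to generate
    perc_best_ancestor = 20,  # percentage derived from the best ancestor
    unique_vars = TRUE,       # require unique individuals
    popSourceFile = "NULL",   # "NULL" as a string; assumed to mean no file to import
    current_sparsity = 5      # target sparsity
  )
)
size_ind <- 5     # features per individual
size_world <- 20  # total number of candidate features
# Ancestor with one feature fewer than size_ind
best_ancestor <- sample(seq_len(size_world), size_ind - 1, replace = FALSE)
# Store the result under a name that does not mask the population() function
pop <- population(clf, size_ind, size_world, best_ancestor)
print(pop)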

## End(Not run)

predomics/predomicspkg documentation built on Dec. 11, 2024, 11:06 a.m.