format_species: Format species names

format_species R Documentation

Format species names

Description

Format scientific species names in a standardised manner.

Usage

format_species(
  species,
  remove_parentheses = TRUE,
  abbrev = FALSE,
  remove_subspecies = FALSE,
  remove_subspecies_exceptions = c("Canis lupus familiaris"),
  split_char = " ",
  collapse = " ",
  remove_chars = c(" ", ".", "(", ")", "[", "]"),
  replace_char = "",
  lowercase = FALSE,
  trim = "'",
  standardise_scientific = FALSE
)

Arguments

species

Species query (e.g. "human", "homo sapiens", "hsapiens", or 9606). If given a list, each item is queried in turn. Set to NULL to return all species.

remove_parentheses

Remove any substring within parentheses: e.g. "Xenopus (Silurana) tropicalis" -> "Xenopus tropicalis"

abbrev

Abbreviate every taxonomic level except the last one to its first letter: e.g. "Canis lupus familiaris" -> "C l familiaris"

remove_subspecies

Keep only the first two taxonomic levels: e.g. "Canis lupus familiaris" -> "Canis lupus"

remove_subspecies_exceptions

Species to exempt from subspecies removal when remove_subspecies=TRUE: e.g. "Canis lupus familiaris" is kept as "Canis lupus familiaris"

split_char

Character to split species names by.

collapse

Character used to rejoin the parts of each species name after splitting on split_char.

remove_chars

Characters to remove.

replace_char

Character to replace remove_chars with.

lowercase

Make species names all lowercase.

trim

Characters to trim from the beginning/end of each species name.

standardise_scientific

Sets several arguments at once to produce standardised scientific names for each species. Assumes that species is already given as some form of scientific name: e.g. "Xenopus (Silurana) tropicalis" -> "Xenopus tropicalis"
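
To make three of the options above concrete, here is a minimal base R sketch of the transformations described for remove_parentheses, remove_subspecies (with remove_subspecies_exceptions) and abbrev. The helper names drop_parentheses, drop_subspecies and abbreviate_levels are hypothetical and are not part of orthogene; this only approximates the documented behaviour and is not the package's implementation.

```r
# Hypothetical helpers: base R approximations of three documented options.
# They are NOT orthogene's implementation.

# remove_parentheses: drop any "(...)" substring, then squeeze whitespace.
drop_parentheses <- function(x) {
  x <- gsub("\\([^)]*\\)", "", x)
  trimws(gsub("\\s+", " ", x))
}

# remove_subspecies: keep the first two words unless the name is an exception.
drop_subspecies <- function(x, exceptions = c("Canis lupus familiaris")) {
  vapply(x, function(name) {
    if (name %in% exceptions) return(name)
    words <- strsplit(name, " ", fixed = TRUE)[[1]]
    paste(head(words, 2), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# abbrev: shorten every word except the last to its first letter.
abbreviate_levels <- function(x) {
  vapply(x, function(name) {
    words <- strsplit(name, " ", fixed = TRUE)[[1]]
    n <- length(words)
    if (n > 1) words[-n] <- substr(words[-n], 1, 1)
    paste(words, collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

drop_parentheses("Xenopus (Silurana) tropicalis")
drop_subspecies("Canis lupus familiaris", exceptions = NULL)
abbreviate_levels("Canis lupus familiaris")
```

The three calls at the end reproduce the examples given in the argument descriptions. The real function may combine or order these steps differently.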

Value

A named vector where the values are the standardised species names and the names are the original input species names.
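
As a rough illustration of how such a named vector can be used, the sketch below builds one by hand and reads it back with base R. The vector formatted and its values are illustrative stand-ins, not actual output captured from format_species().

```r
# Hand-built stand-in for a return value; the values are illustrative,
# not actual format_species() output.
formatted <- c(
  "Xenopus (Silurana) tropicalis" = "Xenopus tropicalis",
  "Canis lupus familiaris" = "Canis lupus familiaris"
)

names(formatted)   # original input names
unname(formatted)  # standardised names only

# Look up the standardised form of one input by its original name.
formatted[["Xenopus (Silurana) tropicalis"]]

# Convert to a two-column lookup table.
lookup <- data.frame(
  input = names(formatted),
  standardised = unname(formatted),
  stringsAsFactors = FALSE
)
```

Keeping the original inputs as names means the mapping from input to standardised name survives without a separate index.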

Examples

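library(orthogene)

species <- c("Xenopus (Silurana) tropicalis", "Canis lupus familiaris")
# Abbreviate every taxonomic level except the last
species2 <- format_species(species = species, abbrev = TRUE)
# Standardise scientific names, with no subspecies exceptions
species3 <- format_species(species = species,
                           standardise_scientific = TRUE,
                           remove_subspecies_exceptions = NULL)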

neurogenomics/orthogene documentation built on Jan. 30, 2024, 4:44 a.m.