annotate_foods: Text Mining Pipeline to Annotate Free Nutritional Text with...
In pcastellanoescuder/fobibsa: Tools for Manipulating the FOBI Ontology


annotate_foods

R Documentation

Text Mining Pipeline to Annotate Free Nutritional Text with FOBI

Description

This function provides a text mining pipeline that maps free-text nutritional descriptions to the Food-Biomarker Ontology (FOBI). The pipeline consists of five sequential layers designed to map food items to FOBI as accurately as possible.

Usage

annotate_foods(foods, similarity = 0.85, reference = fobitools::foods)

Arguments

`foods`	A two-column data frame. The first column must contain the ID (which should be unique) and the second column must contain the food items (each can be a single word or a longer string).
`similarity`	Numeric between 0 (low) and 1 (high). This value is the semantic similarity cutoff used in the last layer of the text mining pipeline. 1 = exact match; 0 = very poor match. Values below 0.85 are not recommended.
`reference`	FOBI foods table obtained with 'parse_fobi(terms = "FOBI:0001", get = "des")'. If this value is set to NULL, the latest version of FOBI will be downloaded from GitHub.
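
The shape required of `foods` can be checked before calling the function. The following is a minimal sketch in base R; `check_foods_input` is a hypothetical helper written for illustration and is not part of the package. It treats duplicated IDs as a warning rather than an error, because the documentation only says IDs "should" be unique.

```r
# Hypothetical helper (not part of the package): checks the shape that the
# `foods` argument is documented to require.
check_foods_input <- function(foods) {
  # Must be a data frame with exactly two columns.
  stopifnot(is.data.frame(foods), ncol(foods) == 2)

  ids <- foods[[1]]
  items <- foods[[2]]

  # IDs "should be unique", so duplicates only trigger a warning.
  if (anyDuplicated(ids) > 0) {
    warning("IDs in the first column are not unique")
  }

  # Food items are expected to be words or strings.
  if (!is.character(items)) {
    stop("the second column must contain food items as character strings")
  }

  invisible(foods)
}

free_text <- data.frame(
  id = c(101, 102, 103),
  text = c("eggs and bacon", "rice crackers", "one apple"),
  stringsAsFactors = FALSE
)

# Returns its input invisibly when all checks pass.
checked <- check_foods_input(free_text)
```

Setting `stringsAsFactors = FALSE` keeps the text column as character on R versions older than 4.0, where strings were converted to factors by default.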

Value

A list containing two tibble objects: annotated and unannotated food items.
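
The returned list can be inspected as sketched below. This assumes that the function is available from the fobitools package (as the default `reference = fobitools::foods` suggests) and that the annotated tibble comes first in the list; both are assumptions, so the sketch prints the top-level structure before relying on either.

```r
free_text <- data.frame(
  id = c(101, 102),
  text = c("Beef and veal, one apple", "pizza without meat"),
  stringsAsFactors = FALSE
)

# Assumption: annotate_foods() is provided by the fobitools package.
# The call is skipped when the package is not installed.
if (requireNamespace("fobitools", quietly = TRUE)) {
  result <- fobitools::annotate_foods(free_text, similarity = 0.85)

  # Print the top-level structure to confirm the element names and order.
  str(result, max.level = 1)

  # Assumed order: annotated items first, unannotated items second.
  annotated <- result[[1]]
  unannotated <- result[[2]]
}
```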

Author(s)

Pol Castellano-Escuder

References

Pol Castellano-Escuder, Raúl González-Domínguez, David S Wishart, Cristina Andrés-Lacueva, Alex Sánchez-Pla, FOBI: an ontology to represent food intake data and associate it with metabolomic data, Database, Volume 2020, 2020, baaa033, https://doi.org/10.1093/databa/baaa033.

Examples


# Free text annotation in FOBI
free_text <- data.frame(id = c(101, 102, 103, 104),
                        text = c("Yesterday I ate eggs and bacon with a butter toast and black tea", 
                                 "Crisp bread and rice crackers with wholegrain", 
                                 "Beef and veal, one apple", "pizza without meat"))
annotate_foods(free_text)

pcastellanoescuder/fobibsa documentation built on Dec. 1, 2024, 5:25 p.m.