suppressPackageStartupMessages({
  library(magrittr)
  library(dplyr)
  library(tidyr)
  library(glue)
  library(rgl)

  library(minilexer)
})

rgl::setupKnitr()

knitr::opts_chunk$set(echo = TRUE)

Example parser: obj format for 3d objects

A simple text file to store 3d objects is the Wavefront obj format. The filetype is well documented on the internet (e.g. 1, 2, 3), and an example octahedron object is show below which has 6 vertices and 8 faces.

octahedron_obj <- '
# OBJ file created by ply_to_obj.c
#
g Object001

v  1  0  0
v  0  -1  0
v  -1  0  0
v  0  1  0
v  0  0  1
v  0  0  -1

f  2  1  5
f  3  2  5
f  4  3  5
f  1  4  5
f  1  2  6
f  2  3  6
f  3  4  6
f  4  1  6
'

The basic structure of a .obj file is:

Use lex() to turn the text into tokens

  1. Start by defining the regular expression patterns for each element in the obj file.
  2. Use minilexer::lex() to turn the obj text into tokens
  3. Throw away whitespace, newlines and comments, since I'm not interested in them.
obj_patterns <- c(
  comment    = '(#.*?)\n',  # assume comments take up the whole line
  number     = pattern_number,  # This regex is defined in `minilex` and matches most numeric values
  symbol     = '\\w+',
  newline    = '\n',
  whitespace = '\\s+'
)

Tokenising the obj

Split the obj text data into tokens, but then remove anything that we don't need to create the actual data structure representing the 3d object.

tokens <- lex(octahedron_obj, obj_patterns)
tokens <- tokens[!(names(tokens) %in% c('whitespace', 'newline', 'comment'))]
tokens


coolbutuseless/minilexer documentation built on May 14, 2019, 6:09 a.m.