pcAxisParser | R Documentation |
Reads a PC-AXIS format file or text and builds its syntax tree.
pcAxisParser(streamParser)
streamParser |
stream parser associated with the file/text to be recognised |
Grammar definition, wider than the strict PC-AXIS definition:

pcaxis        = { rule } , eof ;

rule          = keyword ,
                [ '[' , language , ']' ] ,
                [ '(' , parameterList , ')' ] ,
                '=' ,
                ruleRight ;

parameterList = parameter , { ',' , parameterList } ;

ruleRight     = string , string , { string } , ';'
              | string , { ',' , string } , ';'
              | number , separator , { , number } , ( ';' | eof )
              | symbolic
              | 'TLIST' , '(' , symbolic ,
                  ( ( ')' , { ',' , string } )
                  | ( ',' , string , '-' , string , ')' )
                  ) , ';' ;

keyword       = symbolic ;
language      = symbolic ;
parameter     = string ;
separator     = ' ' | ',' | ';' ;
eof           = ? eof ? ;
string        = ? string ? ;
symbolic      = ? symbolic ? ;
number        = ? number ? ;
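To make the left-hand side of a rule concrete, the following sketch splits a few sample lines into keyword, optional language, optional parameter list, and right-hand side. It uses a base R regular expression purely as an illustration and is not how pcAxisParser works internally. The sample lines, `lhsPattern`, and `splitLhs` are hypothetical names introduced here.

```r
# Illustration only: a regular expression mirroring
#   keyword , [ '[' , language , ']' ] , [ '(' , parameterList , ')' ] , '='
# It does not validate the right-hand side and is not the package's parser.
lhsPattern <- "^([A-Za-z][A-Za-z0-9-]*)(\\[([A-Za-z]+)\\])?(\\((.*)\\))?=(.*)$"

# Hypothetical sample lines, one per combination of optional parts.
samples <- c(
  'TITLE="Population by region";',
  'TITLE[es]="Poblacion por region";',
  'VALUES("region")="North","South";',
  'VALUES[es]("region")="Norte","Sur";'
)

# Split one line into the four pieces named in the grammar.
# Optional parts that are absent come back as empty strings.
splitLhs <- function(line) {
  m <- regmatches(line, regexec(lhsPattern, line))[[1]]
  if (length(m) == 0) return(NULL)   # line does not look like a rule
  list(keyword    = m[2],
       language   = m[4],
       parameters = m[6],
       ruleRight  = m[7])
}

parts <- lapply(samples, splitLhs)
str(parts[[4]])
```

The real parser tokenises strings, numbers and symbols, so it accepts input this pattern cannot handle (for example rules spanning several lines).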
Normally, this function is a preliminary step before calling pcAxisCubeMake:
cstream <- pcAxisParser(stream)
if ( cstream$status == 'ok' ) cube <- pcAxisCubeMake(cstream)
Returns a list with the elements "status", "node" and "stream":
status |
"ok" or "fail" |
stream |
State of the stream after recognition |
node |
List, one node element for each "keyword" in PC-AXIS file. Each node element is a list with: "keyword" "language" "parameters" "ruleRight":
|
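As a sketch of how the returned list might be queried, the helper below looks up the right-hand side value of a keyword and returns NULL when the keyword is absent, instead of failing on `[[1]]`. The helper name `ruleValue` and the mock data are hypothetical. The mock only reproduces the fields named above plus the `value` element of `ruleRight` that the package examples access. How the real parser represents a missing language or parameter list is an assumption here.

```r
# Hypothetical helper: value of the first rule with the given keyword,
# or NULL when the keyword does not occur in the parsed file.
ruleValue <- function(cstream, keyword) {
  if (!identical(cstream$status, "ok")) stop("parsing did not succeed")
  hits <- Filter(function(e) identical(e$keyword, keyword), cstream$node)
  if (length(hits) == 0) return(NULL)
  hits[[1]]$ruleRight$value
}

# Mock object shaped like the documented return value (made-up data;
# NULL for absent language/parameters is an assumption).
mock <- list(
  status = "ok",
  stream = NULL,
  node = list(
    list(keyword = "STUB", language = NULL, parameters = NULL,
         ruleRight = list(value = "region")),
    list(keyword = "HEADING", language = NULL, parameters = NULL,
         ruleRight = list(value = "year")),
    list(keyword = "DATA", language = NULL, parameters = NULL,
         ruleRight = list(value = c(1, 2, 3, 4)))
  )
)

ruleValue(mock, "HEADING")
ruleValue(mock, "UNITS")    # keyword not present
```

With a real result, the same call would be `ruleValue(cstream, "HEADING")` after `cstream <- pcAxisParser(stream)`.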
PC-Axis file format.
https://www.scb.se/en/services/statistical-programs-for-px-files/px-file-format/
PC-Axis file format manual. Statistics Finland.
https://tilastokeskus.fi/tup/pcaxis/tiedostomuoto2006_laaja_en.pdf
## Not run:
  ## significant time reductions may be achieved by doing:
  library("compiler")
  enableJIT(level=3)

## End(Not run)

name    <- system.file("extdata","datInSFexample6_1.px", package = "qmrparser")
stream  <- streamParserFromFileName(name,encoding="UTF-8")
cstream <- pcAxisParser(stream)
if ( cstream$status == 'ok' ) {

  ## HEADING
  print(Filter(function(e) e$keyword=="HEADING",cstream$node)[[1]]$ruleRight$value)

  ## STUB
  print(Filter(function(e) e$keyword=="STUB",cstream$node)[[1]]$ruleRight$value)

  ## DATA
  print(Filter(function(e) e$keyword=="DATA",cstream$node)[[1]]$ruleRight$value)

}

## Not run:
#
# Error messages like
#     " ... invalid multibyte string ... "
# or warnings
#     " input string ... is invalid in this locale"
#
# For example, in Linux the error generated by this code:

name    <- "https://www.ine.es/pcaxisdl//t20/e245/p04/a2009/l0/00000008.px"
stream  <- streamParserFromString( readLines( name ) )
cstream <- pcAxisParser(stream)
if ( cstream$status == 'ok' ) cube <- pcAxisCubeMake(cstream)

#
# is caused by files with a non-readable 'encoding'.
# Even when the file can be read, there may also be problems
# with string-handling functions, due to multibyte characters.
# In Windows, depending on the locale reported by Sys.getlocale(),
# the file may be read but accents, ñ, ... may not be correctly recognised.
#
#
# There are, at least, the following options:
#  - File conversion to UTF-8, from the OS, with
#    "iconv - Convert encoding of given files from one encoding to another"
#
#  - File conversion in R:

name    <- "https://www.ine.es/pcaxisdl//t20/e245/p04/a2009/l0/00000008.px"
stream  <- streamParserFromString( iconv( readLines( name ), "IBM850", "UTF-8") )
cstream <- pcAxisParser(stream)
if ( cstream$status == 'ok' ) cube <- pcAxisCubeMake(cstream)

#
#    In the latter case, latin1 would also work, but accents, ñ, ... would not be
#    correctly read.
#
#  - Making the assumption that the file does not contain multibyte characters:
#
localeOld <- Sys.getlocale("LC_CTYPE")
Sys.setlocale(category = "LC_CTYPE", locale = "C")
#
name    <- "https://www.ine.es/pcaxisdl//t20/e245/p04/a2009/l0/00000008.px"
stream  <- streamParserFromString( readLines( name ) )
cstream <- pcAxisParser(stream)
if ( cstream$status == 'ok' ) cube <- pcAxisCubeMake(cstream)
#
Sys.setlocale(category = "LC_CTYPE", locale = localeOld)
#
#    However, some characters will not be correctly read (accents, ñ, ...)

## End(Not run)