read.sas7bdat.parso: Read sas7bdat files with Parso

Description

Read sas7bdat files with the Java Parso library by GGASoftware.

Usage


Arguments

file

string; sas7bdat filename

READ_FUNC

function; function used to read temporary CSV file

...

other arguments passed to READ_FUNC, which defaults to READ_FUNC = read.csv for read.sas7bdat.parso and READ_FUNC = data.table::fread for fread.sas7bdat.parso
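
How extra arguments reach the reader function can be illustrated with a small stand-in that does not call the package or Parso. Here read_via_csv is a hypothetical helper, not part of sas7bdat.parso; it only mimics the final step in which READ_FUNC reads a CSV file and receives whatever was passed in '...'.

```r
# Hypothetical stand-in for the last step of read.sas7bdat.parso:
# read a CSV file with a user-supplied reader, forwarding '...' to it.
read_via_csv <- function(csvfile, READ_FUNC = read.csv, ...) {
  READ_FUNC(csvfile, ...)
}

# Small CSV file standing in for the temporary file
csvfile <- tempfile(fileext = ".csv")
writeLines(c("id,drug", "1,aspirin", "2,ibuprofen"), csvfile)

# Default reader, no extra arguments
d1 <- read_via_csv(csvfile)

# Extra arguments are passed through to read.csv
d2 <- read_via_csv(csvfile, colClasses = c("character", "character"))

unlink(csvfile)
```

With the defaults, the id column is parsed as integer; with colClasses supplied through '...', it is read as character instead.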

Details

The read.sas7bdat.parso function uses the rJava package to interface with the GGASoftware Parso library. The Parso library builds on the sas7bdat file format documentation and code provided by the sas7bdat package.

The fread.sas7bdat.parso function uses the fread function in the data.table package and defaults to returning a data.table.

The Parso library is licensed under the GPLv3. A copy of the GPLv3 is provided in the inst/doc directory. The Parso library depends on the SLF4J library by the software company QOS.ch. The SLF4J library is subject to the terms of the MIT license. A copy of the MIT license is provided in the inst/doc directory.

The function implementation is rather inefficient, since each row of the sas7bdat file is read separately and written to a temporary CSV file. Once all rows are read, the temporary CSV file is read using the READ_FUNC function, which defaults to READ_FUNC = read.csv. The code could be made more efficient by reading all of the rows at once, which the Parso library supports, but this requires sufficient memory to store the full dataset.
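
The row-by-row pattern described above can be sketched in plain R, without Java. This is an illustration only: an in-memory data frame (src, a made-up example) stands in for the rows that Parso would return one at a time, and the variable names are assumptions rather than the package's internals.

```r
# Illustration only: 'src' stands in for the rows of a sas7bdat file.
src <- data.frame(id = 1:3, dose = c(0.5, 1.0, 1.5))

csvfile <- tempfile(fileext = ".csv")
con <- file(csvfile, open = "w")

# Write the header, then append one row per iteration
writeLines(paste(names(src), collapse = ","), con)
for (i in seq_len(nrow(src))) {
  writeLines(paste(unlist(src[i, ]), collapse = ","), con)
}
close(con)

# Read the completed temporary file with the chosen reader function
READ_FUNC <- read.csv
out <- READ_FUNC(csvfile)
unlink(csvfile)
```

Only one row is held in memory while writing, at the cost of one write per row; the full dataset is materialized only when READ_FUNC reads the finished file.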

Value

A data frame representation of the data stored in the sas7bdat file if the default is kept.

A data.table representation of the data stored in the sas7bdat file if fread.sas7bdat.parso is used.

Author(s)

Matt Shotwell

Examples

  #read.sas7bdat.parso("sdrug.sas7bdat")   
  #fread.sas7bdat.parso("sdrug.sas7bdat")
  

  ## The code below illustrates the mechanism used to
  ## read a sas7bdat file row by row and write it to a
  ## CSV file in R, where 's7bfile' and 'csvfile' are
  ## the input and output filenames
  
  #   sin <- .jnew("java/io/FileInputStream", s7bfile)
  #   s7b <- .jnew("com/ggasoftware/parso/SasFileReader",
  #                .jcast(sin, "java/io/InputStream"))
  #
  #   cst <- "Ljava/util/List;"
  #   ost <- "[Ljava/lang/Object;"
  #   pst <- "Lcom/ggasoftware/parso/SasFileProperties;"   
  #
  #   col <- .jcall(s7b, cst, "getColumns")
  #   sfp <- .jcall(s7b, pst, "getSasFileProperties")
  #
  #   rct <- .jcall(sfp, "J", "getRowCount") 
  #   
  #
  #   flw <- .jnew("java/io/FileWriter", csvfile)
  #   cdw <- .jnew("com/ggasoftware/parso/CSVDataWriter",
  #                .jcast(flw, "java/io/Writer"))
  #
  #   .jcall(cdw, "V", "writeColumnNames", col)
  #   for(i in seq_len(rct))
  #     .jcall(cdw, "V", "writeRow", col,
  #       .jcall(s7b, ost, "readNext", evalArray=FALSE))
  #
  #   .jcall(flw, "V", "close")

BioStatMatt/sas7bdat.parso documentation built on May 5, 2019, 4:46 p.m.