reader-package: Suite of Functions to Flexibly Read Data from Files

Description

A set of functions to simplify reading data from files. The main function, reader(), should read most common R datafile types without needing any parameters except the filename. Other functions provide simple ways of handling file paths and extensions, and automatically detecting file format and structure.

Details

Package: reader
Type: Package
Version: 1.0.6
Date: 2016-12-29
License: GPL (>= 2)

The reader() function, for which the package is named, should be able to read most of the common types of datafiles used in R without needing any arguments other than the filename. The structure, header, file format and delimiter are determined automatically. Other functions provide similar flexibility, running contingent on data type and file format, or can look for an input file in multiple directory locations. The function cat.path() provides a simple interface for constructing file paths from directories, suffixes, prefixes and file extensions. Functions in this package can be nested inside new functions, providing a flexible parameter format without having to use multiple if-statements to cope with contingencies. Supported types include delimited text files, R binary files, big.matrix files, text list files, and unstructured text. Note that the file type to be read is initially determined by the file extension, using the function classify.ext().
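As a rough illustration of extension-based dispatch, the sketch below picks a base-R reading function from the file extension alone. It is not the package's implementation: read_by_ext() is a hypothetical helper written for this example, it uses only base R and tools::file_ext(), and it does none of the automatic header or delimiter detection described above.

```r
# Hypothetical helper (not part of the reader package): choose a base-R
# reading function purely from the file extension.
read_by_ext <- function(path) {
  ext <- tolower(tools::file_ext(path))
  switch(ext,
    csv = read.csv(path),
    txt = ,                       # falls through to the 'tab' branch
    tab = read.delim(path),
    rda = ,                       # falls through to the 'rdata' branch
    rdata = {
      env <- new.env()
      obj.names <- load(path, envir = env)
      get(obj.names[1], envir = env)   # return the first saved object
    },
    readLines(path)               # fallback: treat as unstructured text
  )
}

# Write the same small data frame in three formats, then read each back
df <- data.frame(ID = paste0("ID", 101:103), score = c(42L, 77L, 93L))
tmp <- tempfile(fileext = c(".csv", ".txt", ".rda"))
write.csv(df, tmp[1], row.names = FALSE)
write.table(df, tmp[2], sep = "\t", quote = FALSE, row.names = FALSE)
save(df, file = tmp[3])
results <- lapply(tmp, read_by_ext)
unlink(tmp)
```

Each element of results is a data frame with 3 rows and 2 columns. A single reader() call aims to cover the same cases without the caller writing any such switch.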

List of key functions:

Author(s)

Nicholas Cooper

Maintainer: Nicholas Cooper <njcooper@gmx.co.uk>

See Also

NCmisc

Examples

mydir <- "/Documents"
cat.path(mydir,"temp.doc","NEW",suf=5)
## example for the reader() function ##
df <- data.frame(ID=paste("ID",101:110,sep=""),
                 scores=sample(70,10,TRUE)+30,age=sample(7,10,TRUE)+11)
test.files <- c("temp.txt","temp2.csv","temp3.rda")
write.table(df,file=test.files[1],col.names=TRUE,row.names=TRUE,sep="\t",quote=FALSE)
# file.nrow and file.ncol examples
file.nrow(test.files[1])
file.ncol(test.files[1])
write.csv(df,file=test.files[2])
save(df,file=test.files[3])
# use the same simple reader() function call to read in each file type
for(cc in seq_along(test.files)) {
    cat(test.files[cc],"\n")
    myobj <- reader(test.files[cc])  # add 'quiet=FALSE' to see some of the working
    print(myobj); cat("\n\n")
}
# inspect files before deleting if desired:
#  unlink(test.files)
#
# find id column in data frame
new.frame <- data.frame(day=c("M","T","W"),time=c(9,12,3),staff=c("Mary","Jane","John"))
staff.ids <- c("Mark","Jane","John","Andrew","Sally","Mary")
new.frame; find.id.col(new.frame,staff.ids)

Example output

Loading required package: NCmisc

Attaching package: 'reader'

The following objects are masked from 'package:NCmisc':

    cat.path, get.ext, rmv.ext

[1] "/Documents/NEWtemp.doc5"
temp.txt 
      11 
[1] 4
temp.txt 
      ID scores age
1  ID101     42  13
2  ID102     77  17
3  ID103     93  14
4  ID104     99  14
5  ID105     59  18
6  ID106     67  15
7  ID107     69  15
8  ID108     65  17
9  ID109     66  15
10 ID110     65  15


temp2.csv 
      ID scores age
1  ID101     42  13
2  ID102     77  17
3  ID103     93  14
4  ID104     99  14
5  ID105     59  18
6  ID106     67  15
7  ID107     69  15
8  ID108     65  17
9  ID109     66  15
10 ID110     65  15


temp3.rda 
      ID scores age
1  ID101     42  13
2  ID102     77  17
3  ID103     93  14
4  ID104     99  14
5  ID105     59  18
6  ID106     67  15
7  ID107     69  15
8  ID108     65  17
9  ID109     66  15
10 ID110     65  15


  day time staff
1   M    9  Mary
2   T   12  Jane
3   W    3  John
$col
[1] 3

$maxpc
[1] 0.5

$index
[1] NA  2  3 NA NA  1

$result
[1] <NA> Jane John <NA> <NA> Mary
Levels: Jane John Mary
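
To help read the find.id.col() output above, here is a base-R sketch that reproduces the same values for this example. It assumes the match is scored as the fraction of the supplied IDs found in each column, which is consistent with the 0.5 shown (3 of 6 names matched). match_id_col() is a hypothetical helper, not the package's code.

```r
# Hypothetical helper (not the package's find.id.col()): find the column
# of 'frame' that contains the largest fraction of the values in 'ids'.
match_id_col <- function(frame, ids) {
  # fraction of ids found in each column, compared as character strings
  pc <- vapply(frame, function(col) mean(ids %in% as.character(col)),
               numeric(1))
  best <- which.max(pc)
  # row position of each id within the best-matching column (NA if absent)
  index <- match(ids, as.character(frame[[best]]))
  list(col = unname(best), maxpc = unname(pc[best]),
       index = index, result = frame[[best]][index])
}

new.frame <- data.frame(day = c("M", "T", "W"), time = c(9, 12, 3),
                        staff = c("Mary", "Jane", "John"))
staff.ids <- c("Mark", "Jane", "John", "Andrew", "Sally", "Mary")
out <- match_id_col(new.frame, staff.ids)
```

Here out$col is 3 (the staff column), out$maxpc is 0.5, and out$index is NA 2 3 NA NA 1, giving the row of new.frame in which each name in staff.ids appears.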

reader documentation built on May 2, 2019, 9:27 a.m.