- `pak::pkg_install()` to `remotes::install_github()`
- `is_globs()` and `update_globs()`
- `norgeo::cast_geo()`.
- `geo_merge()` gains a `localtable` argument. It can now be used on mapping tables generated with `geo_map_multi`.
- `geo_map_multi()`, to generate a multiyear mapping table. 9b3e1a1
- `geo_merge()` d163c9a
- `read_file` ba980a7
- `debug.rows` options.
- `options` with either `orgdata.encoding.access` or `orgdata.encoding.csv`. Check `config` file.
- `geo_recode()` using `fix=TRUE` as default argument.
- `debug.row` to `debug.rows`.
- `dummy_grk` (#318)
- `year` to `year.geo` in `make_file()` to be more explicit (#317)
- `pak` package for upgrade.
- `norgeo` package version 2.3.1
- `norgeo` package version 2.3.0 from GitHub ie. dev, instead of CRAN.
- `KOBLID` (#315)
- `orgEnv`.
- `select` arg to ease choosing file(s) in `make_file()` instead of `KOBLID`. Read the documentation on how to use the argument.
- `dbDisconnect` error (#312)
- `IBRUKTIL` and `IBRUKFRA` for selecting original files. `IBRUKTIL` uses `>=` and `IBRUKFRA` uses `<` of the specified date. (#309)
- `delete` or `slett` when recoding with RE options to `""`. This replaces `empty` and `tom` as in #285.
- `xx99` to `geo_map()`. Unidentified municipalities do not exist from API. (#304)
- `AgeCat` can use mixed categories easily with `[x]`. See example in `find_age_category()`. (#305)
- `KONTROLLERT`
column in Access (#298)
- `see_data()` from data warehouse (#300)
- `xx99` in `geo_recode()` function ie. geo record table in Access. (#302)
- `xx99` for known county. (#303)
- `read_file()` accepts SPSS files too (#297)
- `read_file()` now accepts the txt file extension.
- `find_age_category()` (#294)
- `AgeCat` function in EXTRA (#295)
- `match.arg()` (#263)
- `see_data()`. Use `"all"` in `koblid` argument to select all data on the chosen filegroup (#291)
- `empty` or `tom` to represent the value a regular expression replaces to, since Access makes the symbol `""` invisible (#285)
- `see_org()` to `see_data()` for viewing data in the data warehouse.
- `|` to separate multiple arguments in column `EXTRA` (#288)
- `AgeCat()`. This function can be used in the table for filegroup under the `EXTRA` column (#287 #289)
- `rf()` to `rdf()` since `rf()` is already in use in the stats package.
- `BRUKTIL` date other than `01-01-9999` will be excluded. Now filtering with date will be compared against current date. (#272)
- `KONTROLLERT` instead of updating the dataset, since updating it is time consuming. The users have to mark the column to save or read the dataset in the database. (#278)
- `raw` in `make_file()` to `FALSE` as in config file.
- `is_colour_txt()` can specify symbol directly without needing to rely on the global options or to use the `withr` package. Just for cosmetic purposes :smiley:
- `see_org()` function.
- `NAs with coercion` instead of just the `GEO` number where the coercion took place (#274)
- `data.table::fwrite()` in `save_file()` function.
- `future.apply` package conditionally to reduce package vulnerability.
- `VAL` is handled properly. When reshaping multiple columns to be just one `VAL`, leaving `RESHAPE_VAL` empty ie. use all columns not defined in `RESHAPE_ID`, should work as before (#269)
- `RESHAPE_KOL` and `RESHAPE_VAL` (#268)
- `geo_merge()`
to add geo granularity that isn't from API. This can be a csv, xls or any other format that is accepted by `read_file` (#262)
- `raw = TRUE` when using function `make_file()` (#264)
- `geo_levels()` to `geo_map()` for mapping geo codes granularity.
- `read_file()`. (#250)
- `read_file()` accepts Stata files with `dta` extension (#252)
- `see_org()` to read the raw data in the database. Argument `action = "delete"` can be used to delete the data from the raw database.
- `geo_merge()` for merging geo codes that aren't available from API to the mapping table ie. tblGeo, in the geo database. The data could be in any file format accepted by the `read_file()` function. The data to be merged must have a column to be merged ie. `id.file`, that is equivalent to the column id in the database ie. `id.table`. The `id.file` must be unique.
- `raw` is used, else give error message. #246
- `read_file()` accepts filegroup name as argument in `file` to read the completed file after running the `make_file()` function. #247
- `debug_opt("deep")` or `options(orgdata.debug = "deep")` #243
- `PS` in codebook. The function is used when there is a need to recode the value of a column after the dataset has been cleaned and aggregated. Specification to select the row to be recoded uses either standard expression or R syntax of `data.table` style. When using R syntax the value must have `raw` prefix eg. `raw(AAR %in% c(2000, 2005))`. #244 #245
- `"-"` minus symbol in `TIL` column in the codebook is accepted for `do_recode_post()`.
- `update_orgdata()`. Basically it's just a wrapper for `remotes::install_github()`.
- `orgdata.num`. Ensure these columns are numeric, and give a warning as well as a log entry when coercion introduces NA. (#235)
- `NA` (#229)
- `code99` also include koblid. The files will be named as `code99_koblidxxx` (#222)
- `parallel = 0.75` in the arguments of `make_file()` or in the global options `orgdata.parallel` (#225)
- `orgdata.year` to specify production year if not using current year. (#216)
- `make_file()`
with argument `parallel = TRUE`. (#217)
- `make_filegroups()`. (#199)
- `EXTRA` column on filegroup level with argument `DeleteOldBydel`. (#204 #206)
- `KOBLID` to be more specific. (#208)
- `tot_elev` is the product of both `mestringsnivå` and `klassetrinn`. (#188)
- `debug_opt()`. (#196)
- `tot_elev` represents the total number of students with `mestringsnivå` and not the grand total of students. The number of students with `mestringsnivå` is represented in column `ant_elev`. Therefore the long format for `mestringsnivå` needs to be restructured to wide with value from `ant_elev`, to ensure that summing up `tot_elev` when creating the denominator will not create a grand total of students instead of the total number of students with `mestringsnivå`. (#184)
- `write = TRUE` and the table doesn't exist in the geo-code database.
- `orgdata.debug.geo` or `orgdata.debug.aggregate` are active and make the default to `kommune`. (#166)
- `unknown` codes with either `xxxx9999` or `xxxx99`. (#177)
- `C:/Users/YourUserName/orgdata_logs` when `path` argument is not specified in `save_file()`. (#179)
- `xxxx99` or `xxxx9999`. As in #177 but recode is done on municipality codes before merging back to the original dataset. (#182)
- `codeDelete` in `log` for geographical codes that can't be merged. The codes will be excluded from the dataset. To access all the deleted codes use `log$codeDelete`. (#149)
- `path` is missing in `save_file()`. (#152)
- `geo_map()` or `geo_recode()`. (#156)
- `AAR` in the dataset. Use argument `base` or global option `orgdata.recode.base` with logical input. `TRUE` will select the base year for recoding geographical code from the year of the original file to the current year. Default is `FALSE` ie. include all available geographical codes available in the codebook. (#157)
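
Several of the entries above refer to global options. As a minimal sketch, they could be set and restored with base R's `options()` as shown here. The option names and values are the ones quoted in the entries; that `orgdata.parallel` accepts the same `0.75` as the `parallel` argument is an assumption, and the options only take effect when orgdata is loaded and reads them.

```r
# Option names are taken from the changelog entries above.
# Assumption: orgdata.parallel takes the same value as the `parallel` argument.
old <- options(
  orgdata.debug = "deep",
  orgdata.parallel = 0.75,
  orgdata.recode.base = TRUE
)

# Read one of the values back while it is active
current <- getOption("orgdata.recode.base")

# Restore whatever was set before
options(old)
```

Keeping the return value of `options()` makes it easy to undo a temporary debugging setup.
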
- Fixed #139 for `orgdata.debug.geo` keep original geo codes for enumeration
areas before adding 9999. (#140)
- Fixed #142 show codes that have problem to recode directly instead of row
numbers (#144)
- Save all codes that have problem in `log` environment for easy access. To list
the codes is either with `log$code00` or `log$codeShort`
- Recode geo even when argument `aggregate = FALSE` in `make_file()` function.
- Rename `make_filegroup` and `lag_filgruppe` to plural ie. `make_filegroups` and `lag_filgrupper`.
- Use options `orgdata.debug.rows` to select only specific row(s) for further
processing. It can be activated via global options with
`options(orgdata.debug.rows = 20:50)` or via argument `row = 20:50` in
`make_file()` to select row 20 to 50.
- Fixed #135 with incorrect geo recode. (#131)
- Make multiple filegroups via `make_filegroups`. (#137)
- Fixed #132 LANDSSB must be string
- Convert whitespace to NA to be able to delete all rows with NA
- Fixed #119 able to mutate for TABS and VALS as well (#126)
- Fixed #122 delete rows with NA via EXTRA column (#127)
- Fixed #118 warning text when column(s) aren't defined in FILGRUPPE and will be
deleted (#128)
- Edit error message for columns with existing NA value before aggregating.
Total value will be NA and this will conflict with the already existing NA
category in the aggregated column(s). Therefore existing NA value in the
selected column(s) must be recoded to a valid value.
- Use `options(orgdata.debug.geo = TRUE)` to keep old geo codes for debugging (#120)
- Use `reset_options()` to reset to default options.
- Warn when process discontinued due to debugging.
- Add vignettes for Standardize git and Debugging
- Fixed #121 recode geographical code for split codes (#120)
- Change database filename to raw-database_BE.accdb
- Fixed #106 split long messages (#107)
- Fixed #108 #112 grunnkrets codes that changed before 2002 are not available
via API from SSB, while code changes for municipalities include changes
from 1977. Check the SSB website. We use the municipality codes to create
unspecified grunnkrets codes for data before 2002 (#109 #113)
- Fixed #110 updating SQL code for new table name for codebook (#111)
- Check columns to aggregate for any possible `NA` (#98). Columns that have `NA`
should be recoded to `uoppgitt` or something equivalent since leaving the
category as `NA` will conflict with `NA` representing total value when
aggregating.
- Fixed #100 for grunnkrets that end with `00` and have no corresponding codes
from SSB API. Need to add them manually (#101)
- Fixed #99 when geo codes fail to be recoded then the row index will be shown (#103)
- Geo codes ending with 4 zeros `xxxx0000` neither have equivalent codes from SSB
nor represent a correct coding structure such as the so-called `Delområde`, which
ends with 2 zeros `xxxxxx00`. To avoid losing the information, these geo codes are
recoded to `xxxx9999` with function `is_grunnkrets_0000()` as in PR (#103).
- `see_file()` accepts just a single numeric as well.
- Fixed #85 `see_file()` lists all the columns when column names or column indexes
are not specified. The variables are sorted whenever possible. (#87)
- Add more function tests (#88)
- Exclude `LANDSSB` in aggregate when split to `LANDBAK` and `INNVKAT`. This is
because code `0` will be recoded to `20` when split and causes unnecessary extra
rows (#84)
- Delete deprecated functions.
- Fixed #93 when source level can't be identified due to `NA`.
- Fixed #95 for grunnkrets codes that aren't missing but have fewer than 7
digits. Assuming these are codes for municipality then `9999` is added at the
end of these codes (#96)
- Gives row number for GEO codes that get coerced to `NA` when converted to
integer. This will make it easy to check in the original raw data (#96)
- Aggregate now gives total for all dimensions including those specified in
`AGGKOL` (#82)
- Function `see_file()` accepts column index as well (#83)
- Recode variables using regular expression when defined in codebook with type
`RE`. Finding pattern can either be written in ordinary regular expression ie.
`\\d{4}.*` or with `rex()` package. (#78)
- New feature for checking categories for variables with `see_file()` (#75)
- Fixed #65 make TABS and VALS dynamic for easy extension for these columns (#66)
- Fixed #64 recode of variable that has different class (#68)
- Fixed #63 implicit null includes all possible VAL columns when exist (#69)
- Fixed #70 recode GEO of different object class (#71)
- Fixed #67 aggregate with total values for standard variables ie. `UTDANN`,
`LANDSSB`, `LANDBAK` and `INNVKAT` (#72)
- Fixed #61 use AGGKOL in Access registration database to specify other columns to
aggregate other than the standard eg. `KJONN`, `TAB1`, `TAB2` etc. (#73)
- Fixed #55 to recode standard variables via codebook instead of hard coded (#58)
- Fixed #52 skip split if not specified (#59)
- Fixed #57 split column with duplicated values will keep the original column (#60)
- Fixed #56 aggregate all VAL columns whenever specified and not only specific to
`VAL1` (#62)
- Edit verbose messages
- Reshape dataset from wide to long. Reshape can have more than one measure
variable. Please read how this is specified in Access registration database.
- Split columns must have equal number of values to the defined `SPLITTIL`.
Duplicate the value if it is less than the maximum `SPLITTIL`. For example for
value `0` in column `LANDSSB` which will be split into `LANDBAK` and
`INNVKAT`, the value will be duplicated into `00` to avoid split with value `NA`.
- Recode for `LANDBAK` and `INNVKAT` after aggregating are done internally ie.
hard coded, in `do_aggregate_recode_standard()`. Total is coded with `20`. Any
future change should also consider related functions such as
`is_aggregate_standard_cols()` and `is_col_num()`.
- Change argument parameter for `find_spec()` function.
- Update text document in several places.
- Add colour type warn2 for warning message without `Warning:` prefix.
- Request (#43) messages with specific colour
- Fixed (#46) recode to string even though the column is type integer or numeric.
- Unknown bydel ie. (uoppgitt) is added when enumeration areas codes ie.
(grunnkrets) for bydel is `XXXX9999` in function `geo_level()`.
- Add unknown grunnkrets for kommune when not available since some of the
datasets have unknown grunnkrets that aren't listed in API downloaded data (#39).
- Exclude `TAB1`, `TAB2` and `TAB3` from being aggregated. (#44)
- Recode for aggregated variables uses `AG` in TYPE column in the codebook
instead of FILGRUPPE with `AGGREGATE` as it was implemented in ver 0.2.0. This
will make it possible to specify FILGRUPPE and LESID to implement the
principle for GENERAL, COMMON and SPECIFIC variables.
- Change function name `do_aggregate_recode` to `do_aggregate_recode_standard`
for standard variables.
- Recode for aggregated categories can be defined in Recode form ie. codebook,
and use `AGGREGATE` in the specification under FILGRUPPE
- Delete rows when defined in codebook using minus symbol under TIL column.
Similar principles are implemented for GENERAL, COMMON and SPECIFIC
feature as in recode. Read detail in ver 0.0.5 - alpha.
- Display both column names to be recoded that are found in the dataset and those
that aren't found when defined as `ALLE` in the codebook so the user will be aware
of their existence.
- Standardize some most used arguments to `read_file()` such as `nrows`,
`header`, `skip`, `trimws` and `na`. Read details in `read_file()` function
description.
- Output of `read_file()` as data.table class.
- Use standard column names with `V1`, `V2` etc when argument `header = FALSE`
is specified.
- Error message with list of unmatched columns in `do_column_standard()`.
- Give clearer message and debug message eg. `Execute: read_file()`.
- Change `MAPPE` to `UTMAPPE` to make it more explicit for path specification to save file.
- Defunct `orgdata.active` global option to use column names from original dataset.
- Use global options `options(orgdata.debug.nrow = TRUE)` to read only first 20 rows. Suitable for debug purposes.
- Fix (#28) GEO derived from two columns with empty INNLESARG.
- Add column `LEVEL` for granularity level ie. grunnkrets, fylke, kommune, bydel etc
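
The two grunnkrets clean-up rules described in the entries above (codes ending in `0000` become `xxxx9999`, and codes with fewer than 7 digits get `9999` appended) can be sketched in base R. This is only an illustration under stated assumptions: `fix_grunnkrets_sketch()` is a hypothetical helper, not an orgdata function, and it does not reproduce `is_grunnkrets_0000()` or any other internal.

```r
# Hypothetical helper, not part of orgdata: illustrates the two rules only.
fix_grunnkrets_sketch <- function(codes) {
  codes <- as.character(codes)
  # Rule 1: fewer than 7 digits, assumed to be a municipality code
  short <- !is.na(codes) & nchar(codes) < 7
  # Rule 2: full-length code ending with four zeros
  zeros <- !is.na(codes) & !short & grepl("0000$", codes)
  codes[short] <- paste0(codes[short], "9999")
  codes[zeros] <- sub("0000$", "9999", codes[zeros])
  codes
}

fixed <- fix_grunnkrets_sketch(c("03010000", "301", "03011234", NA))
```

Both conditions are computed on the original values before anything is changed, so a code is never rewritten twice.
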
- `geo` and `val` in `make_file`. Output data must use standard column names instead of keeping the column names from original dataset.
- `read_raw` or `lesraw` to `make_file` or `lag_fil` (#27)
- `<` and more than `>`. For instance when column `KJONN` doesn't exist in the original data, we can specify with `<2>` under column `KJONN` in the Access registration database. The output will add a new column `KJONN` with value `2`. (#15)
- `orgdata.verbose` is `TRUE`.
- `orgdata.implicit.null` with default as `TRUE`. Use `options(orgdata.implicit.null = FALSE)` to deactivate (#19)
- `LANDBAK` to `LANDSSB` for column in original data received from SOB containing information about country of origin.
- `MAPPE` in Access registration database or specify in `path` argument for function `save_file`. (#12)
- `GEO` with comma separated eg. `nameGeoCol1, nameGeoCol2`.
- `GEO`, `AAR`, `ALDER`, `KJONN`
- `save_file` from `lagfil` to `lagrefil`.
- `KOLNAVN` instead of `ADDKOL`.
- `do_addcols` and `get_addcols` to `do_colname` and `get_colname` to be consistent with the changes in Access registration database.
- `tbl_KodeBok` uses:
- `ALLE` and are used to recode variables in all groups.
- SPECIFIC variables are when FILGRUPPE and LESID are specified. This will recode variables in that specified FILGRUPPE of the specified FILID.
- When all these three specifications exist in `tbl_KodeBok`: COMMON variables will overrule GENERAL variables
- Write as `<NA>` in codebook under column `FRA` when specifying missing variables indicating that a missing column is to be recoded to the value in column `TIL`. This will differentiate between real missing and a real column value of `NA`. (#5)
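
The `<2>` convention in the `KJONN` entry above (#15) can be illustrated with a small base R sketch. `add_constant_col_sketch()` is a hypothetical helper written for this example, and the column names `AAR` and `VAL1` are only sample data; how orgdata actually parses the specification is not shown in this changelog.

```r
# Hypothetical helper, not part of orgdata: if `spec` has the form <value>,
# add `col` to `df` filled with that value; otherwise return `df` unchanged.
add_constant_col_sketch <- function(df, col, spec) {
  if (grepl("^<.*>$", spec)) {
    value <- sub("^<(.*)>$", "\\1", spec)
    # Convert "2" to a number where possible, keep text otherwise
    df[[col]] <- type.convert(value, as.is = TRUE)
  }
  df
}

raw <- data.frame(AAR = c(2020, 2021), VAL1 = c(10, 12))
out <- add_constant_col_sketch(raw, "KJONN", "<2>")
```

A specification without the angle brackets is treated as an ordinary column name and adds nothing.
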
- Error message will be given if LESID is specified without FILGRUPPE since LESID is not unique ID.
- `is_col_separate()` old convert to integer and use and index for columns

Changes is in PR #2

- `VAL1=TOTAL, TAB1=ICD` is valid input

Changes is in PR #1

Things that are implemented