read.tsv: Reads the .swd-file


View source: R/read_tsv.r

Description

The .swd file contains the time shifted viewing for a given day.

Usage


Details

The time shifted viewing is stored in ".swd" files. For each day there is a separate file. The files are of fixed-width type, with the following structure:

field  name                  label      start  end  widths  type
1      household             hh             1    7       7  integer
2      individuum            ind            8    9       2  integer
3      station               sta           10   13       4  integer
4      recording start time  start         14   19       6  integer
5      recording end time    end           20   25       6  integer
6      set                   set           26   26       1  character
7      activity              act           27   27       1  character
8      platform              plt           28   28       1  integer
9      recording date        date.live     29   36       8  integer
10     viewing start time    start.tsv     37   42       6  integer
11     viewing end time      ""            43   48       6  integer
12     speed                 ""            49   50       2  integer

From 2013-01-01 to 2015-11-23 the ".swd" files have 12 fields (50 characters). After 2015-11-23 the files only contain fields 1 to 10 (42 characters). Fields 11 and 12 apparently hold redundant information and have never actually been used by Kantar. Simply read only fields 1 to 10 in all cases.
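As an illustration of reading only fields 1 to 10, here is a minimal sketch in base R. It is not the package's own read.tsv() implementation. The function name read_swd_sketch is hypothetical; the positions, labels and types are taken from the field table above. Every line is cut by position, so 42-character and 50-character files are handled the same way.

```r
# Hypothetical helper, not the package's read.tsv().
# Field positions, labels and types follow the field table above.
read_swd_sketch <- function(file) {
  spec <- data.frame(
    label = c("hh", "ind", "sta", "start", "end",
              "set", "act", "plt", "date.live", "start.tsv"),
    start = c(1, 8, 10, 14, 20, 26, 27, 28, 29, 37),
    end   = c(7, 9, 13, 19, 25, 26, 27, 28, 36, 42),
    type  = c(rep("integer", 5), "character", "character",
              rep("integer", 3)),
    stringsAsFactors = FALSE
  )
  lines <- readLines(file)
  # Cut each field out by position; anything after column 42
  # (fields 11 and 12 in older files) is simply ignored.
  cols <- lapply(seq_len(nrow(spec)), function(i) {
    x <- trimws(substr(lines, spec$start[i], spec$end[i]))
    if (spec$type[i] == "integer") as.integer(x) else x
  })
  names(cols) <- spec$label
  as.data.frame(cols, stringsAsFactors = FALSE)
}
```

Cutting by position with substr() rather than relying on a fixed line length is a design choice of this sketch: it avoids having to know whether a given file dates from before or after 2015-11-23.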

The ".swd" file structure is very similar to the ".swo" (live) file structure, except for the additional fields 9 and 10. The first two fields can be read as one single field representing the "pin" ("personal identification number"). The hh number is rarely used but can easily be restored, e.g. by as.integer(pin / 100L). Fields with hexadecimal coding have to be read as class character (see labels).

Each tsv file contains all tsv viewing on one particular day. To add tsv +7 to the live viewing of a particular day we have to load 8 tsv files: the one with the same day as live plus those of the following 7 days. Field 9 (date.live) gives the date on which the content was first broadcast. This allows matching tsv viewing to the corresponding live day. So, from all tsv files loaded, we are only interested in those rows whose date.live matches the date of the live file.

Perhaps surprisingly, start.tsv, not start, is the start time of watching; start represents the time at which the content was broadcast. Taking the dates into account, the start of watching (start.tsv) has to be later than the corresponding start of broadcasting (start).

There are two ways to look at tsv viewing. "overnight 0-7" flags the viewing by the days passed since the live broadcast, while "tsv 0-7" flags the viewing by 24-hour shifts since the live broadcast. The "overnight 0-7" label is given by the difference between date.tsv (the date of the file) and date.live. The "tsv 0-7" labels have to be calculated from the time difference in seconds between the start of watching and the start of the live broadcast.
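The matching and labelling described above can be sketched in base R as follows. This is a sketch under stated assumptions, not the package's implementation. It assumes that dates are stored as yyyymmdd integers and times as hhmmss integers, and that the "tsv 0-7" label is the number of completed 24-hour shifts. The helper names hms_to_sec, ymd_to_date and label_tsv_sketch are hypothetical.

```r
# Assumption: times are hhmmss integers. Hours of 24 and above are
# handled arithmetically instead of being parsed as clock times.
hms_to_sec <- function(t) {
  (t %/% 10000L) * 3600L + (t %/% 100L %% 100L) * 60L + t %% 100L
}

# Assumption: dates are yyyymmdd integers.
ymd_to_date <- function(d) as.Date(sprintf("%08d", d), format = "%Y%m%d")

# Hypothetical helper: tsv is the data of ONE tsv file, date.tsv the
# date of that file, date.live the date of the live file to match.
label_tsv_sketch <- function(tsv, date.tsv, date.live) {
  # keep only rows whose content was first broadcast on the live day
  x <- tsv[tsv$date.live == date.live, , drop = FALSE]
  day.tsv  <- as.numeric(ymd_to_date(date.tsv))
  day.live <- as.numeric(ymd_to_date(x$date.live))
  # "overnight 0-7": days between the file date and the live date
  x$overnight <- as.integer(day.tsv - day.live)
  # "tsv 0-7": completed 24-hour shifts between the start of the
  # broadcast (start) and the start of watching (start.tsv)
  watch <- day.tsv  * 86400 + hms_to_sec(x$start.tsv)
  live  <- day.live * 86400 + hms_to_sec(x$start)
  x$tsv <- as.integer((watch - live) %/% 86400)
  # pin combines hh and ind; hh is recovered by as.integer(pin / 100L)
  x$pin <- x$hh * 100L + x$ind
  x
}

# Made-up example rows for the tsv file of 2015-01-06
tsv <- data.frame(
  hh        = c(1234567L, 1234567L, 1234567L),
  ind       = c(1L, 2L, 1L),
  start     = c(203000L, 203000L, 203000L),
  date.live = c(20150105L, 20150104L, 20150105L),
  start.tsv = c(221000L, 100000L, 100000L)
)
out <- label_tsv_sketch(tsv, date.tsv = 20150106L, date.live = 20150105L)
```

In this made-up example both kept rows are "overnight 1", but only the first is "tsv 1": it was watched more than 24 hours after the broadcast, while the other was watched 13.5 hours after it and is therefore "tsv 0". To cover tsv +7, the same function would be applied to each of the 8 tsv files and the results combined.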


rluech/tv-clone documentation built on Jan. 7, 2022, 12:27 a.m.