sql.entity.relation.generic: Relationship matching SQL query
In cezden/rpostgrexplorer: Simplification of analysis in PostgreSQL and Redshift DBs

The query is constructed on 3 different levels (nested queries):

Level 1: construct simple counts for co-occurence of values in both columns
Level 2: aggregate coocurences, drop the values
Level 3: calculate the statistics of co-occurences

1 2	sql.entity.relation.generic(query1, query1.colname, query1.countname = "cnt", query2, query2.colname, query2.countname = "cnt")

The purpose of Level 1 query is to perform simple counting of distinct values in both columns resulting in the following table:

eone the value from combined list of non-NULL values from column1 and column2
aone.cnt number of occurences of the eone in column1 or NULL in case of no occurences
atwo.cnt number of occurences of the eone in column2 or NULL in case of no occurences

The purpose of Level 2 query is to perform simple processing of Level 1 query and dropping actual dictionary values, which results in the following table (described only in part concerning column1 - the part concerning column2 is symmetrical) consisting of 8 columns:

aone_cnt number of occurences of the value in column1, 0 or more
aone_cnt_ind occurence indicator of the value in column1, 0 or 1
aone_cnt_nullable number of occurences of the value in column1 as provided by Level 1 query (exact copy of aone.cnt)
aone_cnt_both_present serving the purpose of a correlator of the counts from both columns: it is NULL when atwo.cnt is NULL, otherwise it is aone.cnt

The purpose of Level 3 query is to aggregate results of Level 2 query into single row of the following structure:

distvals_cnt: number of distinct non-NULL values in column1 and column2 (combined)
f_distentities_frac: fraction of (combined) distinct values in column1 (first)
s_distentities_frac: fraction of (combined) distinct values in column2 (second)
distentities_match_frac: fraction of matching values (1+ – 1+ relationships)
f_mincnt, f_avgcnt, f_maxcnt, f_sdevcnt: min, mean, max and sd of distinct value counts in column1 (first)
s_mincnt, s_avgcnt, s_maxcnt, f_sdevcnt: min, mean, max and sd of distinct value counts in column2 (second)
f_sumcnt: number of non-NULL rows in column1
s_sumcnt: number of non-NULL rows in column2
f_avgcnt_when_present, f_sdevcnt_when_present: mean and sd of distinct value counts in column1 (first) when they are present
s_avgcnt_when_present, s_sdevcnt_when_present: mean and sd of distinct value counts in column2 (second) when they are present
f_mincnt_both_present, f_avgcnt_both_present, f_maxcnt_both_present, f_sdevcnt_both_present: min, mean, max and sd of of distinct value counts in column1 (first) when there are matching rows in column2
s_mincnt_both_present, s_avgcnt_both_present, s_maxcnt_both_present, s_sdevcnt_both_present: min, mean, max and sd of of distinct value counts in column2 (second) when there are matching rows in column1

cezden/rpostgrexplorer documentation built on May 13, 2019, 3:08 p.m.

cezden/rpostgrexplorer index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

cezden/rpostgrexplorer
Simplification of analysis in PostgreSQL and Redshift DBs

sql.entity.relation.generic: Relationship matching SQL query
In cezden/rpostgrexplorer: Simplification of analysis in PostgreSQL and Redshift DBs

Description

Usage

Details

Related to sql.entity.relation.generic in cezden/rpostgrexplorer...

R Package Documentation

Browse R Packages

We want your feedback!

cezden/rpostgrexplorer Simplification of analysis in PostgreSQL and Redshift DBs

sql.entity.relation.generic: Relationship matching SQL query In cezden/rpostgrexplorer: Simplification of analysis in PostgreSQL and Redshift DBs

Description

Usage

Details

Related to sql.entity.relation.generic in cezden/rpostgrexplorer...

R Package Documentation

Browse R Packages

We want your feedback!

cezden/rpostgrexplorer
Simplification of analysis in PostgreSQL and Redshift DBs

sql.entity.relation.generic: Relationship matching SQL query
In cezden/rpostgrexplorer: Simplification of analysis in PostgreSQL and Redshift DBs