README.md
In DanielBonnery/BigSyn: X

BigSyn

The 'BigSyn' package synthesizes a hierarchical database. Each dataset is first transposed to a wide format with one row per unique identifier. The identifier is common to all the datasets to synthesize, but it need not be a primary key of each one. The wide-format tables are merged on this identifier, and the merged table is then synthesized. The last step is the back-transposition of the synthesized merged transposed table.

Installation

To install with devtools, run in R:

devtools::install_github("DanielBonnery/BigSyn")

Package dependencies

installed.packages()["BigSyn","Depends"]

## [1] "R (>= 3.5.0), devtools, ggplot2, sqldf, lattice, printr,\nknitr, reshape2, data.table, rlist, haven, sas7bdat, partykit"

Demo

To run the demo, run:

demo(Synthesize_database)

This demo synthesizes simulated data. The details are given below.

Shiny app to compare Synthetic and original data.

A Shiny app was developed to produce graphics that compare the distributions of variables in the synthetic and gold (original) datasets.

To run the Shiny app, just run:

BigSyn::InteractiveCompare(BigSyn::tableA,BigSyn::TSTtableA)

Figures below are screenshots of the App.

Step by step demo.

This first code chunk creates two tables, tableA and tableB, to mimic a simple database with two datasets. These datasets have a common identifier: the combination of the variables id1a and id1b.

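library(BigSyn)
data(tableA)
data(tableB)
# Distinct identifier pairs, taken from columns 1 and 2 of tableA (assumed to be id1a and id1b)
uniqueid<-unique(tableA[,1:2])
# Attach one identifier pair to each of the first nrow(uniqueid) rows of tableB
tableB<-cbind(uniqueid,tableB[1:nrow(uniqueid),])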

We then transpose the two tables. The transposed tables contain one row per unique value of id1a, id1b:

TKtableA<-BigSyn::Generaltransposefunction(tableA,c("id1a","id1b"),c("id2a","id2b"))
TKtableB<-BigSyn::Generaltransposefunction(tableB,c("id1a","id1b"),character(0))
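To illustrate what "one row per unique identifier" means, here is a minimal sketch using base R's reshape() on a toy table. It is not how Generaltransposefunction is implemented, and the column names (id, member, age) are hypothetical; it only shows the long-to-wide idea and its reversal.

```r
# Toy long-format table: two rows per id, one per member (hypothetical names).
long <- data.frame(
  id     = c(1, 1, 2, 2),
  member = c(1, 2, 1, 2),
  age    = c(34, 36, 51, 17)
)

# Long -> wide: one row per id, one age column per member.
wide <- reshape(long, idvar = "id", timevar = "member",
                direction = "wide", sep = "_")
print(wide)

# Wide -> long again; reshape() stores how `wide` was built and reverses it.
back <- reshape(wide, direction = "long")
back <- back[order(back$id, back$member), ]
rownames(back) <- NULL
print(back)
```

The wide table has as many rows as there are distinct identifiers, which is the shape the merge step relies on.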

We merge the transposed tables by id1a and id1b:

Ttable<-merge(TKtableA$TtableA,TKtableB$TtableA, by =c("id1a","id1b"))

We synthesize the merged transposed datasets:

STtable<-BigSyn::SDPSYN2(Ttable,asis = c("id1a","id1b"),nrep = 1)

We separate the synthetic merged transposed datasets by table of origin:

STtableA<-STtable[[1]][c("id1a","id1b",grep("tableA",names(STtable[[1]]),value = TRUE))]
STtableB<-STtable[[1]][c("id1a","id1b",grep("tableB",names(STtable[[1]]),value = TRUE))]
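The column selection above relies on grep() with value = TRUE, which returns the matching column names rather than their positions. A small sketch on a toy data frame follows; the "tableA_" and "tableB_" prefixes are an assumption made for this example and may differ from the names BigSyn actually produces.

```r
# Toy merged table; the column names are hypothetical.
merged <- data.frame(
  id1a = 1:2,
  id1b = c("x", "y"),
  tableA_cont1 = c(0.1, 0.2),
  tableB_cont1 = c(1.5, 2.5)
)
ids <- c("id1a", "id1b")

# value = TRUE makes grep() return the matching names, not their indices.
colsA <- grep("tableA", names(merged), value = TRUE)

# Keep the identifier columns plus the columns that came from table A.
partA <- merged[c(ids, colsA)]
print(names(partA))
```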

To finish, we back transpose each unmerged synthetic transposed dataset:

TSTtableA<-BigSyn::GeneralReversetransposefunction(TtableA = STtableA,
                                                   key = TKtableA$key)
TSTtableB<-BigSyn::GeneralReversetransposefunction(TtableA = STtableB,
                                                   key = TKtableB$key)

The synthetic data is now ready. We can run comparative analyses on the synthetic and original (gold) data. We first carry out a univariate analysis and compare the results obtained on the gold and synthetic datasets:

TSTtableA$Origin="Synthetic"
tableA$Origin="Gold"
X=rbind(tableA,TSTtableA[names(tableA)])
ggplot2::ggplot(X,aes(factor1,fill=Origin)) + geom_bar(position = "dodge")
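The bar chart can be complemented with a numeric summary. The sketch below tabulates category shares by origin on a toy stand-in for X; the data values are made up for illustration, and only the column names factor1 and Origin are taken from the code above.

```r
# Toy stand-in for X: a categorical column and an Origin label.
toy <- data.frame(
  factor1 = c("a", "a", "b", "b", "a", "b", "b", "b"),
  Origin  = rep(c("Gold", "Synthetic"), each = 4)
)

# Row-wise proportions: each row (one Origin) sums to 1.
props <- prop.table(table(toy$Origin, toy$factor1), margin = 1)
print(round(props, 2))

# Largest absolute gap between gold and synthetic category shares.
max_gap <- max(abs(props["Gold", ] - props["Synthetic", ]))
print(max_gap)
```

A small gap suggests the synthetic marginal distribution is close to the gold one for that variable; what counts as small depends on the use case.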

We then carry out a bivariate analysis and compare the results obtained on the gold and synthetic datasets:

library(gridExtra)
plot1<-ggplot2::ggplot(X,aes(x = cont1,y=cont3,color=Origin,group=Origin))+
  stat_density_2d(geom = "polygon", aes(alpha = ..level.., fill = Origin))
plot2<-plot1+facet_grid(.~Origin)+theme(legend.position="none")
grid.arrange(plot2,plot1)
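As a rough numeric counterpart to the density plots, one could compare the correlation between the two variables within each origin. The following is a sketch on made-up values, deliberately exaggerated so that the two groups differ; only the column names cont1, cont3 and Origin come from the code above.

```r
# Toy stand-in for X with two continuous columns (values are made up).
toy <- data.frame(
  cont1  = c(1, 2, 3, 4, 1, 2, 3, 4),
  cont3  = c(2, 4, 6, 8, 8, 6, 4, 2),
  Origin = rep(c("Gold", "Synthetic"), each = 4)
)

# Pearson correlation of cont1 and cont3, computed separately per Origin.
cors <- sapply(split(toy, toy$Origin),
               function(d) cor(d$cont1, d$cont3))
print(cors)
```

Correlation only captures linear association, so it complements the density plots rather than replacing them.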

To compare the gold and synthetic versions of each table interactively, run:

BigSyn::InteractiveCompare(tableA,TSTtableA)
BigSyn::InteractiveCompare(tableB,TSTtableB)

DanielBonnery/BigSyn documentation built on June 28, 2020, 7:18 p.m.
