MDL Multiresolution Linear Regression Framework

In this work, we provide a framework for analyzing a multiresolution partition (e.g. country, provinces, subdistricts) in which each individual data point belongs to exactly one partition in each layer (e.g. individual $i$ belongs to subdistrict $A$, province $P$, and country $Q$).

We assume that a partition in a higher layer subsumes its lower-layer partitions (e.g. a nation at the 1st layer subsumes all provinces at the 2nd layer).

We are given $N$ individuals, each with a pair of real values $(x,y)$ generated from an independent variable $X$ and a dependent variable $Y$. Each individual $i$ belongs to one partition per layer.
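As a rough illustration of the setting described above, the following base R sketch builds a small table with one row per individual and one label column per layer, then checks that the layers are nested. The column names (`country`, `province`, `subdistrict`) and the helper `is_nested` are hypothetical and chosen only for this example; they are not the input format that MRReg expects.

```r
# Hypothetical layout: one row per individual, one label column per layer.
set.seed(1)
n <- 12
dat <- data.frame(
  x = rnorm(n),
  country = "Q",
  province = rep(c("P1", "P2"), each = 6),
  subdistrict = rep(c("A", "B", "C", "D"), each = 3)
)
dat$y <- 2 * dat$x + rnorm(n, sd = 0.1)

# Nesting check (illustrative helper): every lower-layer partition must
# fall inside exactly one partition of the layer above it.
is_nested <- function(child, parent) {
  all(tapply(parent, child, function(p) length(unique(p))) == 1)
}

is_nested(dat$subdistrict, dat$province)  # subdistricts inside provinces
is_nested(dat$province, dat$country)      # provinces inside the country
```

If either check fails, some lower-layer partition straddles two parents, which violates the subsumption assumption stated above.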

Our goal is to find the highest-level partitions in which all individuals share the same linear model $Y=f(X)$, where $f$ is a linear function.
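To make the idea of "sharing the same linear model" concrete, here is a minimal base R sketch that compares one pooled regression line against one line per child partition. It uses a classical nested-model F-test as a stand-in; this is not the criterion MRReg applies, and the helper name `pooled_vs_split` and all simulated values are assumptions made for illustration.

```r
set.seed(42)
n <- 100
x <- rnorm(2 * n)
g <- factor(rep(c("P1", "P2"), each = n))  # two child partitions

# Case 1: both partitions follow y = 1 + 2x (homogeneous parent).
y_same <- 1 + 2 * x + rnorm(2 * n, sd = 0.5)
# Case 2: P2 follows y = 1 - 2x instead (heterogeneous parent).
y_diff <- ifelse(g == "P1", 1 + 2 * x, 1 - 2 * x) + rnorm(2 * n, sd = 0.5)

# p-value of the F-test comparing a pooled line with one line per partition.
# A small p-value suggests the children do not share a single linear model.
pooled_vs_split <- function(y, x, g) {
  pooled <- lm(y ~ x)       # one model for the whole parent partition
  split  <- lm(y ~ x * g)   # separate intercept and slope per child
  anova(pooled, split)[["Pr(>F)"]][2]
}

p_same <- pooled_vs_split(y_same, x, g)
p_diff <- pooled_vs_split(y_diff, x, g)
c(same = p_same, diff = p_diff)
```

In the heterogeneous case the p-value should be very small, which would argue for descending to the child partitions rather than treating the parent as homogeneous.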

Explanation: FindMaxHomoOptimalPartitions(DataT,gamma)

library(MRReg)

# Generate simulation data of type 4, with 100 individuals per homogeneous partition.
DataT <- SimpleSimulation(100, type = 4)

gamma <- 0.05 # Gamma parameter

out <- FindMaxHomoOptimalPartitions(DataT, gamma)

Plotting optimal homogeneous tree

The red nodes are homogeneous partitions. All children of a homogeneous partition node share the same linear model.

plotOptimalClustersTree(out)

Printing optimal homogeneous partitions

Selected features: 1 is reserved for the intercept, and d is a selected feature if Y[i] ~ X[i,d-1] in the linear model. Note that the clustInfoRecRatio values are always NA for last-layer partitions.

PrintOptimalClustersResult(out, selFeature = TRUE)
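The index convention for selected features can be sketched in base R as follows. The matrix `X`, the response `Y`, and the index vector `sel` are made-up values for illustration; they are not read from `out`, whose internal structure is not assumed here.

```r
# Hypothetical predictor matrix with three columns and a response
# that depends only on the second column.
set.seed(7)
X <- matrix(rnorm(60), ncol = 3)
Y <- 0.5 + 3 * X[, 2] + rnorm(20, sd = 0.1)

# Suppose the reported selected-feature indices are 1 and 3 (assumed values).
sel <- c(1, 3)

# Index 1 is the intercept; any other index d refers to column d - 1 of X.
has_intercept <- 1 %in% sel
x_cols <- sel[sel != 1] - 1

# Refit the corresponding linear model with base R.
design <- X[, x_cols, drop = FALSE]
fit <- if (has_intercept) lm(Y ~ design) else lm(Y ~ design - 1)
coef(fit)
```

Under this reading, a reported index of 3 points to the second column of `X`, so the refit recovers an intercept and a single slope for that column.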


MRReg documentation built on June 8, 2025, 11:19 a.m.