pnbd: Pareto/NBD models

pnbd    R Documentation

Pareto/NBD models

Description

Fits Pareto/NBD models on transactional data with and without covariates.

Usage

## S4 method for signature 'clv.data'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  ...
)

## S4 method for signature 'clv.data.static.covariates'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  names.cov.life = c(),
  names.cov.trans = c(),
  start.params.life = c(),
  start.params.trans = c(),
  names.cov.constr = c(),
  start.params.constr = c(),
  reg.lambdas = c(),
  ...
)

## S4 method for signature 'clv.data.dynamic.covariates'
pnbd(
  clv.data,
  start.params.model = c(),
  use.cor = FALSE,
  start.param.cor = c(),
  optimx.args = list(),
  verbose = TRUE,
  names.cov.life = c(),
  names.cov.trans = c(),
  start.params.life = c(),
  start.params.trans = c(),
  names.cov.constr = c(),
  start.params.constr = c(),
  reg.lambdas = c(),
  ...
)

Arguments

clv.data

The data object on which the model is fitted.

start.params.model

Named start parameters containing the optimization start parameters for the model without covariates.

use.cor

Whether the correlation between the transaction and lifetime process should be estimated.

start.param.cor

Start parameter for the optimization of the correlation.

optimx.args

Additional arguments to control the optimization which are forwarded to optimx::optimx. If multiple optimization methods are specified, only the result of the last method is further processed.

verbose

Show details about the running of the function.

...

Ignored

names.cov.life

Which of the lifetime covariates set in the data object should be used. If this parameter is missing, all lifetime covariates are used.

names.cov.trans

Which of the transaction covariates set in the data object should be used. If this parameter is missing, all transaction covariates are used.

start.params.life

Named start parameters containing the optimization start parameters for all lifetime covariates.

start.params.trans

Named start parameters containing the optimization start parameters for all transaction covariates.

names.cov.constr

Which covariates should be forced to use the same parameters for the lifetime and transaction process. The covariates need to be present as both lifetime and transaction covariates.

start.params.constr

Named start parameters containing the optimization start parameters for the constrained covariates.

reg.lambdas

Named lambda parameters used for the L2 regularization of the lifetime and the transaction covariate parameters. Lambdas have to be >= 0.
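To illustrate what an L2 penalty with separate lambdas for the two processes means, here is a minimal sketch in base R. The helper penalized.nll and the unscaled form of the penalty are assumptions made for illustration; how pnbd scales or combines the penalty internally is not stated in this documentation.

```r
# Hypothetical helper (not part of CLVTools): add an L2 penalty on the
# covariate parameters to a given negative log-likelihood value.
penalized.nll <- function(nll, params.life, params.trans, reg.lambdas) {
  # Lambdas have to be >= 0
  stopifnot(all(reg.lambdas >= 0))
  nll +
    reg.lambdas[["life"]]  * sum(params.life^2) +
    reg.lambdas[["trans"]] * sum(params.trans^2)
}

# Made-up values: larger covariate parameters are penalized more strongly
res <- penalized.nll(nll = 100,
                     params.life  = c(Gender = 0.5, Channel = 0.5),
                     params.trans = c(Gender = 1, Channel = 0),
                     reg.lambdas  = c(trans = 5, life = 5))
print(res)
```

With a lambda of 0 for a process, the covariate parameters of that process are not penalized at all.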

Details

Model parameters for the Pareto/NBD model are r, alpha, s, and beta.
s: shape parameter of the Gamma distribution for the lifetime process. The smaller s, the stronger the heterogeneity of customer lifetimes.
beta: rate parameter for the Gamma distribution for the lifetime process.
r: shape parameter of the Gamma distribution of the purchase process. The smaller r, the stronger the heterogeneity of the purchase process.
alpha: rate parameter of the Gamma distribution of the purchase process.

Based on these parameters, the average purchase rate while customers are active is r/alpha and the average dropout rate is s/beta.
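As a small worked example of these two ratios, the following base R snippet plugs in illustrative parameter values (the default start parameters of pnbd). The variable names are made up for this sketch, and the rates are assumed to be expressed per time unit of the data (for example per week for time.unit = "w").

```r
# Illustrative parameter values (the default start parameters of pnbd)
params <- c(r = 0.5, alpha = 15, s = 0.5, beta = 10)

# The mean of a Gamma(shape, rate) distribution is shape / rate
avg.purchase.rate <- unname(params["r"] / params["alpha"])
avg.dropout.rate  <- unname(params["s"] / params["beta"])

print(avg.purchase.rate)  # 0.5 / 15, roughly 0.033 purchases per time unit
print(avg.dropout.rate)   # 0.5 / 10 = 0.05
```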

Ideally, the starting parameters for r and s represent your best guess concerning the heterogeneity of customers in their purchase and dropout rates. If covariates are included in the model, additional parameters for the covariates affecting the attrition and the purchase process are part of the model.

If no start parameters are given, r=0.5, alpha=15, s=0.5, beta=10 are used for the model parameters and 0.1 for all covariate parameters. The model start parameters are required to be > 0.

The Pareto/NBD model

The Pareto/NBD is the first model addressing the issue of modeling customer purchases and attrition simultaneously for non-contractual settings. The model uses a Pareto distribution, a combination of an Exponential and a Gamma distribution, to explicitly model customers' (unobserved) attrition behavior in addition to customers' purchase process.
In general, the Pareto/NBD model consists of two parts. A first process models the purchase behavior of customers as long as the customers are active. A second process models customers' attrition. Customers live (and buy) for a certain unknown time until they become inactive and "die". Customer attrition is unobserved. Inactive customers may not be reactivated. For technical details we refer to the original paper by Schmittlein, Morrison and Colombo (1987) and the detailed technical note of Fader and Hardie (2005).
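The two processes described above can be sketched as a small simulation in base R. This is only an illustration of the model assumptions (Gamma-distributed individual rates, exponential lifetimes, Poisson purchases while alive). The helper simulate.pnbd and its arguments are hypothetical and not part of CLVTools.

```r
set.seed(1)

# Hypothetical helper: simulate the number of purchases of n customers
# that are observed for T.obs time units.
simulate.pnbd <- function(n, r, alpha, s, beta, T.obs) {
  lambda   <- rgamma(n, shape = r, rate = alpha)  # individual purchase rates
  mu       <- rgamma(n, shape = s, rate = beta)   # individual dropout rates
  lifetime <- rexp(n, rate = mu)                  # unobserved time until "death"
  active   <- pmin(lifetime, T.obs)               # time alive within the window
  x        <- rpois(n, lambda = lambda * active)  # purchases made while alive
  data.frame(x = x, lifetime = lifetime, alive.at.end = lifetime > T.obs)
}

sim <- simulate.pnbd(n = 1000, r = 0.5, alpha = 15, s = 0.5, beta = 10,
                     T.obs = 52)
head(sim)

# Share of simulated customers still alive at the end of the window
mean(sim$alive.at.end)
```

In real data only the purchases are observed. The columns lifetime and alive.at.end are available here only because the data is simulated.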

Pareto/NBD model with static covariates

The standard Pareto/NBD model captures heterogeneity solely by using Gamma distributions. However, exogenous knowledge, such as customer demographics, is often available. This supplementary knowledge may explain part of the heterogeneity among the customers and therefore increase the predictive accuracy of the model. In addition, we can rely on the covariate parameter estimates for inference, i.e. identify and quantify effects of contextual factors on the two underlying purchase and attrition processes. For technical details we refer to the technical note by Fader and Hardie (2007).
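One way to picture how time-invariant covariates can explain heterogeneity is to let them rescale the rate parameters alpha and beta per customer, in the spirit of Fader and Hardie (2007). The exponential form below and all values are assumptions made for this sketch; it is not taken from the pnbd implementation.

```r
# Hypothetical baseline rate parameters and covariate coefficients
alpha0 <- 15
beta0  <- 10
gamma.trans <- c(Gender = 0.75, Channel = 0.7)
gamma.life  <- c(Gender = 0.5,  Channel = 0.5)

# Two made-up customers with dummy-coded covariates
covs <- rbind(cust1 = c(Gender = 0, Channel = 0),
              cust2 = c(Gender = 1, Channel = 0))

# Assumed form: customer-specific rate parameter = baseline * exp(-gamma' x)
alpha.i <- alpha0 * exp(-drop(covs %*% gamma.trans))
beta.i  <- beta0  * exp(-drop(covs %*% gamma.life))

# Implied average purchase rate per customer for a shape r = 0.5
r <- 0.5
print(r / alpha.i)
```

Under this assumed form, a positive transaction coefficient lowers alpha for customers with that covariate and therefore raises their expected purchase rate r/alpha.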

Pareto/NBD model with dynamic covariates

In many real-world applications customer purchase and attrition behavior may be influenced by covariates that vary over time. In consequence, the timing of a purchase and the corresponding value of a covariate at that time become relevant. Time-varying covariates can affect customers on an aggregated level as well as on an individual level: In the first case, all customers are affected simultaneously, in the latter case a covariate is only relevant for a particular customer. For technical details we refer to the paper by Bachmann, Meierer and Näf (2021).

Value

Depending on the data object on which the model was fit, pnbd returns either an object of class clv.pnbd, clv.pnbd.static.cov, or clv.pnbd.dynamic.cov.

The function summary can be used to obtain and print a summary of the results. The generic accessor functions coefficients, vcov, fitted, logLik, AIC, BIC, and nobs are available.

Note

The Pareto/NBD model with dynamic covariates cannot currently be fit with data that has a temporal resolution of less than one day (i.e. data that was built with the time unit hours).

References

Schmittlein DC, Morrison DG, Colombo R (1987). “Counting Your Customers: Who Are They and What Will They Do Next?” Management Science, 33(1), 1-24.

Bachmann P, Meierer M, Naef J (2021). “The Role of Time-Varying Contextual Factors in Latent Attrition Models for Customer Base Analysis.” Marketing Science, 40(4), 783-809.

Fader PS, Hardie BGS (2005). “A Note on Deriving the Pareto/NBD Model and Related Expressions.” URL http://www.brucehardie.com/notes/009/pareto_nbd_derivations_2005-11-05.pdf.

Fader PS, Hardie BGS (2007). “Incorporating time-invariant covariates into the Pareto/NBD and BG/NBD models.” URL http://www.brucehardie.com/notes/019/time_invariant_covariates.pdf.

Fader PS, Hardie BGS (2020). “Deriving an Expression for P(X(t)=x) Under the Pareto/NBD Model.” URL https://www.brucehardie.com/notes/012/pareto_NBD_pmf_derivation_rev.pdf.

See Also

clvdata to create a clv data object, SetStaticCovariates to add static covariates to an existing clv data object.

gg to fit customers' average spending per transaction with the Gamma-Gamma model

predict to predict expected transactions, probability of being alive, and customer lifetime value for every customer

plot to plot the unconditional expectation as predicted by the fitted model

pmf for the probability to make exactly x transactions in the estimation period, given by the probability mass function (PMF).

newcustomer to predict the expected number of transactions for an average new customer.

The generic functions vcov, summary, fitted.

SetDynamicCovariates to add dynamic covariates on which the pnbd model can be fit.

Examples


data("apparelTrans")
clv.data.apparel <- clvdata(apparelTrans, date.format = "ymd",
                            time.unit = "w", estimation.split = 52)

# Fit standard pnbd model
pnbd(clv.data.apparel)

# Give initial guesses for the model parameters
pnbd(clv.data.apparel,
     start.params.model = c(r=0.5, alpha=15, s=0.5, beta=10))


# pass additional parameters to the optimizer (optimx)
#    Use Nelder-Mead as optimization method and print
#    detailed information about the optimization process
apparel.pnbd <- pnbd(clv.data.apparel,
                     optimx.args = list(method="Nelder-Mead",
                                        control=list(trace=6)))

# estimated coefs
coef(apparel.pnbd)

# summary of the fitted model
summary(apparel.pnbd)

# predict CLV etc for holdout period
predict(apparel.pnbd)

# predict CLV etc for the next 15 periods
predict(apparel.pnbd, prediction.end = 15)


# Estimate correlation as well
pnbd(clv.data.apparel, use.cor = TRUE)


# To estimate the pnbd model with static covariates,
#   add static covariates to the data
data("apparelStaticCov")
clv.data.static.cov <-
 SetStaticCovariates(clv.data.apparel,
                     data.cov.life = apparelStaticCov,
                     names.cov.life = c("Gender", "Channel"),
                     data.cov.trans = apparelStaticCov,
                     names.cov.trans = c("Gender", "Channel"))

# Fit pnbd with static covariates
pnbd(clv.data.static.cov)

# Give initial guesses for both covariate parameters
pnbd(clv.data.static.cov, start.params.trans = c(Gender=0.75, Channel=0.7),
                   start.params.life  = c(Gender=0.5, Channel=0.5))

# Use regularization
pnbd(clv.data.static.cov, reg.lambdas = c(trans = 5, life=5))

# Force the same coefficient to be used for the Gender covariate
#   in both the lifetime and the transaction process
pnbd(clv.data.static.cov, names.cov.constr = "Gender",
                   start.params.constr = c(Gender=0.5))

# Fit model only with the Channel covariate for life but
# keep all trans covariates as is
pnbd(clv.data.static.cov, names.cov.life = c("Channel"))

# To estimate the pnbd model with dynamic covariates,
#   add dynamic covariates to the data

## Not run: 
data("apparelDynCov")
clv.data.dyn.cov <-
  SetDynamicCovariates(clv.data = clv.data.apparel,
                       data.cov.life = apparelDynCov,
                       data.cov.trans = apparelDynCov,
                       names.cov.life = c("High.Season", "Gender", "Channel"),
                       names.cov.trans = c("High.Season", "Gender", "Channel"),
                       name.date = "Cov.Date")


# Fit PNBD with dynamic covariates
pnbd(clv.data.dyn.cov)

# The same fitting options as for the
#  static covariate are available
pnbd(clv.data.dyn.cov, reg.lambdas = c(trans=10, life=2))

## End(Not run)



CLVTools documentation built on Oct. 13, 2024, 9:07 a.m.