RFM - Transaction Level Data

Introduction

library(rfm)
library(knitr)
library(kableExtra)
library(magrittr)
library(dplyr)
library(ggplot2)
library(DT)
library(grDevices)
library(RColorBrewer)
options(knitr.table.format = "html")
options(tibble.width = Inf)

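RFM (recency, frequency, monetary) analysis is a behavior-based technique used to segment customers by examining their transaction history: how recently they purchased (recency), how often they purchase (frequency), and how much they spend (monetary).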

It is based on the marketing axiom that 80% of your business comes from 20% of your customers. RFM helps to identify customers who are more likely to respond to promotions by segmenting them into various categories.

Data

To calculate the RFM score for each customer we need transaction data that includes a unique customer id, the date of each transaction or order, and the amount of each transaction or order.

rfm includes a sample data set rfm_data_orders which includes the above details:

rfm_data_orders

RFM Score

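So how is the RFM score computed for each customer? A recency score is assigned from the date of the most recent purchase, a frequency score from the number of transactions, and a monetary score from the total revenue generated. Each score is obtained by binning the underlying values into a fixed number of categories (5 by default), and the three scores are then combined into a single RFM score.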
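
As a rough illustration of the idea, the sketch below scores a handful of made-up transactions in base R. The toy orders data frame, the score_bins() helper, the choice of two bins and the rule for combining the three scores (recency * 100 + frequency * 10 + monetary) are all assumptions made for this example; rfm_table_order() may bin and combine the scores differently.

```r
# Hypothetical toy transactions (not rfm_data_orders)
orders <- data.frame(
  customer_id = c("a", "a", "b", "c", "c", "c", "d"),
  order_date  = as.Date(c("2006-01-10", "2006-12-01", "2006-03-15",
                          "2006-10-05", "2006-11-20", "2006-12-28",
                          "2005-06-30")),
  revenue     = c(50, 70, 20, 100, 150, 200, 30)
)
analysis_date <- as.Date("2006-12-31")

# Per-customer summaries: days since last order, order count, total spend
last_order   <- tapply(as.numeric(orders$order_date), orders$customer_id, max)
recency_days <- as.numeric(analysis_date) - as.numeric(last_order)
frequency    <- as.numeric(tapply(orders$revenue, orders$customer_id, length))
monetary     <- as.numeric(tapply(orders$revenue, orders$customer_id, sum))

# Hypothetical helper: rank the values and split the ranks into equal-sized bins
score_bins <- function(x, bins) {
  ceiling(rank(x, ties.method = "first") * bins / length(x))
}

bins <- 2
scores <- data.frame(
  customer_id     = names(last_order),
  recency_days    = recency_days,
  recency_score   = score_bins(-recency_days, bins),  # fewer days => higher score
  frequency_score = score_bins(frequency, bins),
  monetary_score  = score_bins(monetary, bins)
)

# Assumed combination rule: concatenate the three digits into one number
scores$rfm_score <- scores$recency_score * 100 +
  scores$frequency_score * 10 + scores$monetary_score
scores
```

Recency is scored on the negated day count so that a customer who bought recently receives the higher score, in line with frequency and monetary value where larger is better.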

The customers with the highest RFM scores are most likely to respond to an offer. Now that we have understood how the RFM score is computed, it is time to put it into practice. Use rfm_table_order() to generate the score for each customer from the sample data set rfm_data_orders.

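rfm_table_order() takes 8 inputs: data (the transaction data set), customer_id, order_date and revenue (the names of the corresponding columns), analysis_date (the date of analysis), and recency_bins, frequency_bins and monetary_bins (the number of bins for each score, 5 by default).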

RFM Table

analysis_date <- lubridate::as_date("2006-12-31")
rfm_result <- rfm_table_order(rfm_data_orders, customer_id, order_date, revenue, analysis_date)
rfm_result
rfm_result %>%
  use_series(rfm) %>%
  slice(1:10) %>%
  kable() %>%
  kable_styling()

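rfm_table_order() will return the following columns as seen in the above table: customer_id, date_most_recent, recency_days, transaction_count, amount, recency_score, frequency_score, monetary_score and rfm_score.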

Heat Map

The heat map shows the average monetary value for each combination of recency and frequency scores. Higher frequency and recency scores are associated with higher average monetary value, as indicated by the darker areas in the heat map.

rfm_heatmap(rfm_result)
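
The quantity behind each cell can be sketched in base R as the mean amount per combination of recency and frequency score. The scored data frame below is made up for illustration, its column names are assumptions, and this is not rfm_heatmap()'s own code.

```r
# Hypothetical scored customers (toy values)
scored <- data.frame(
  recency_score   = c(1, 1, 2, 2, 2),
  frequency_score = c(1, 1, 1, 2, 2),
  amount          = c(10, 30, 50, 100, 200)
)

# Mean amount for every recency x frequency cell; empty cells come out as NA
avg_amount <- tapply(
  scored$amount,
  list(recency = scored$recency_score, frequency = scored$frequency_score),
  mean
)
avg_amount
```

Cells with no customers stay NA, which is one reason a heat map of real data can show gaps.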

Bar Chart

Use rfm_bar_chart() to generate the distribution of monetary scores for the different combinations of frequency and recency scores.

rfm_bar_chart(rfm_result)

Histogram

Use rfm_histograms() to examine the relative distribution of recency, frequency and monetary value.

rfm_histograms(rfm_result)

Customers by Orders

Visualize the distribution of customers across orders.

rfm_order_dist(rfm_result)

Scatter Plots

The best customers are those who bought most recently, buy most often and spend the most.

Now let us examine the relationship between the above.

Recency vs Monetary Value

Customers who visited more recently generated more revenue than those who visited in the distant past. Customers who visited in the recent past are more likely to return than those who visited a long time ago, as most of the latter would be lost customers. As such, higher revenue is associated with the most recent visits.

rfm_rm_plot(rfm_result)

Frequency vs Monetary Value

As the frequency of visits increases, the revenue generated also increases. Customers who visit more frequently are your champions, loyal customers or potential loyalists, and they drive higher revenue.

rfm_fm_plot(rfm_result)

Recency vs Frequency

Customers with low frequency visited in the distant past, while those with high frequency have visited in the recent past. Again, customers who visited in the recent past are more likely to return than those who visited a long time ago. As such, higher frequency is associated with the most recent visits.

rfm_rf_plot(rfm_result)

Segments

Let us classify our customers based on the individual recency, frequency and monetary scores.

segment <- c(
  "Champions", "Loyal Customers", "Potential Loyalist",
  "New Customers", "Promising", "Need Attention",
  "About To Sleep", "At Risk", "Can't Lose Them", "Hibernating",
  "Lost"
)
description <- c(
  "Bought recently, buy often and spend the most",
  "Spend good money. Responsive to promotions",
  "Recent customers, spent good amount, bought more than once",
  "Bought more recently, but not often",
  "Recent shoppers, but haven't spent much",
  "Above average recency, frequency & monetary values",
  "Below average recency, frequency & monetary values",
  "Spent big money, purchased often but long time ago",
  "Made big purchases and often, but long time ago",
  "Low spenders, low frequency, purchased long time ago",
  "Lowest recency, frequency & monetary scores"
)
recency <- c("4 - 5", "2 - 5", "3 - 5", "4 - 5", "3 - 4", "2 - 3", "2 - 3", "<= 2", "<= 1", "1 - 2", "<= 2")
frequency <- c("4 - 5", "3 - 5", "1 - 3", "<= 1", "<= 1", "2 - 3", "<= 2", "2 - 5", "4 - 5", "1 - 2", "<= 2")
monetary <- c("4 - 5", "3 - 5", "1 - 3", "<= 1", "<= 1", "2 - 3", "<= 2", "2 - 5", "4 - 5", "1 - 2", "<= 2")
segments <- tibble(
  Segment = segment, Description = description,
  R = recency, `F` = frequency, M = monetary
)
segments %>%
  kable() %>%
  kable_styling(full_width = TRUE, font_size = 12)

Segmented Customer Data

We can use the segmented data to identify, for example, the best customers, loyal customers, customers at risk and lost customers.

Once we have classified a customer into a particular segment, we can take appropriate action to increase their lifetime value.

segment_names <- c("Champions", "Loyal Customers", "Potential Loyalist",
  "New Customers", "Promising", "Need Attention", "About To Sleep",
  "At Risk", "Can't Lose Them", "Lost")

recency_lower <- c(4, 2, 3, 4, 3, 2, 2, 1, 1, 1)
recency_upper <- c(5, 5, 5, 5, 4, 3, 3, 2, 1, 2)
frequency_lower <- c(4, 3, 1, 1, 1, 2, 1, 2, 4, 1)
frequency_upper <- c(5, 5, 3, 1, 1, 3, 2, 5, 5, 2)
monetary_lower <- c(4, 3, 1, 1, 1, 2, 1, 2, 4, 1)
monetary_upper <- c(5, 5, 3, 1, 1, 3, 2, 5, 5, 2)

segments <- rfm_segment(rfm_result, segment_names, recency_lower, recency_upper,
  frequency_lower, frequency_upper, monetary_lower, monetary_upper)

# use datatable
segments %>%
  datatable(
    filter = "top",
    options = list(pageLength = 5, autoWidth = TRUE),
    colnames = c(
      "Customer", "Segment", "RFM",
      "Orders", "Recency", "Total Spend"
    )
  )
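
To make the role of the lower and upper bounds concrete, here is a minimal base R sketch of range-based segment assignment. The assign_segment() helper, the rule that the first matching segment wins, the "Others" label for unmatched customers and the toy scores are all assumptions for illustration; rfm_segment() is not guaranteed to work this way.

```r
# Hypothetical helper: label each customer with the first segment whose
# recency, frequency and monetary ranges all contain the customer's scores
assign_segment <- function(r, f, m, labels, r_lo, r_hi, f_lo, f_hi, m_lo, m_hi) {
  segment <- rep(NA_character_, length(r))
  for (i in seq_along(labels)) {
    hit <- is.na(segment) &
      r >= r_lo[i] & r <= r_hi[i] &
      f >= f_lo[i] & f <= f_hi[i] &
      m >= m_lo[i] & m <= m_hi[i]
    segment[hit] <- labels[i]
  }
  # Assumed fallback label for customers that match no segment
  segment[is.na(segment)] <- "Others"
  segment
}

# Four toy customers scored on a 1 to 5 scale
toy_result <- assign_segment(
  r = c(5, 3, 1, 5), f = c(5, 4, 1, 1), m = c(4, 3, 2, 5),
  labels = c("Champions", "Loyal Customers", "Lost"),
  r_lo = c(4, 2, 1), r_hi = c(5, 5, 2),
  f_lo = c(4, 3, 1), f_hi = c(5, 5, 2),
  m_lo = c(4, 3, 1), m_hi = c(5, 5, 2)
)
toy_result
```

Because several ranges overlap (a Champion also satisfies the Loyal Customers bounds), the order in which segments are listed matters under a first-match rule.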

Segment Size

Now that we have defined and segmented our customers, let us examine the distribution of customers across the segments. Ideally, we should have very few or no customers in segments such as At Risk or Need Attention.

segments %>%
  count(segment) %>%
  arrange(desc(n)) %>%
  rename(Segment = segment, Count = n)

We can also examine the median recency, frequency and monetary value across segments to ensure that the logic used for customer classification is sound and practical.
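
The same check can be done numerically. The sketch below computes per-segment medians in base R on a made-up table; the toy_segments data frame and its column names (recency_days, transaction_count, amount) are assumptions for illustration.

```r
# Hypothetical segmented customers (toy values)
toy_segments <- data.frame(
  segment           = c("Champions", "Champions", "Lost", "Lost", "Lost"),
  recency_days      = c(5, 15, 300, 400, 350),
  transaction_count = c(12, 8, 1, 2, 1),
  amount            = c(900, 1100, 40, 60, 20)
)

# Median of each measure within each segment
medians <- aggregate(
  cbind(recency_days, transaction_count, amount) ~ segment,
  data = toy_segments,
  FUN = median
)
medians
```

If the classification is sound, a segment such as Champions should show few days since the last order together with high order counts and spend, and a segment such as Lost should show the opposite.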

Median Recency

rfm_plot_median_recency(segments)

Median Frequency

rfm_plot_median_frequency(segments)

Median Monetary Value

rfm_plot_median_monetary(segments)

rfm documentation built on July 21, 2020, 5:06 p.m.