CK Cafe: Using Association Rules to Find Basket of Goods
In this lab session, I show how to use the Apriori algorithm for association rule mining. The goal is to find useful association rules that can help in designing promotions for the company. Plus, you get to see what's served at an Indian cafe.
November 21, 2022
Business Context
CK Cafe is an Indian food and beverages chain with about 19 outlets in 5 cities. Their outlets are popular “hangout” places for young and old alike. People often go there to meet friends and family, or just to get their Chai-tea or coffee. Imagine a cafe, basically.
Their prices are not low by Indian standards, but they aren’t a luxury chain either. They offer about 100 items, though only about 20 generate most of the revenue.
Their two most popular items are Chai (tea) and Coffee (which they like to call Kaapi). Chai comes in several types, depending on the spice in it. With ginger (Adrak), for example, it is called Adrak Chai. The table below lists some popular food items with their pictures and details.
Item | Description | Picture |
---|---|---|
Adrak Chai / Kadak Chai / Elaichi Chai / Other types of Chai | Chai-tea with Ginger / Chai-tea with strong spices / Chai-tea with Cardamom / etc. | |
Kulhad Chai | Chai-tea served in earthen pot. Popular in Northern India, especially New Delhi | |
Indian Filter Kaapi | Filter Coffee, popular in Southern India | |
Paneer Puff | A croissant-like bread filled with Paneer (Indian cottage cheese) | |
Veg Club Sandwich | Vegetarian sandwich with grated vegetables, cheese, etc. | |
Maska Bun | Bread and butter; commonly eaten with Chai | |
Biryani | A slow-cooked rice dish made with Basmati rice, spices and choice of meat or vegetables | |
Data Analysis
You can find the dataset and code on my GitHub.
Loading Packages and Setting Working Directory
Tidyverse for manipulation and visualisation; arules and arulesViz for association rule mining and visualisation. I like the theme theme_clean() from the ggthemes package.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6.9000 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ dplyr 1.0.9
## ✔ tidyr 1.2.0 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(arules)
## Loading required package: Matrix
##
## Attaching package: 'Matrix'
## The following objects are masked from 'package:tidyr':
##
## expand, pack, unpack
##
## Attaching package: 'arules'
## The following object is masked from 'package:dplyr':
##
## recode
## The following objects are masked from 'package:base':
##
## abbreviate, write
library(arulesViz)
library(DT)
theme_set(ggthemes::theme_clean())
Loading Data
You can load the CSV data and then convert it to the list format required by the arules package. It takes about 3 minutes to process.
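# NOT RUN
# Read the raw invoice lines and standardise the column names
df = read_csv("CK_data_anon.csv") %>%
  janitor::clean_names()
# Keep only the invoice identifier and the item sold
df1 = df %>%
  select(invoice_name, item_name)
invoices = unique(df1$invoice_name)
all_items = list()
# Build one character vector of items per invoice
for (i in invoices) {
  l = df1 %>%
    filter(invoice_name == i) %>%
    pull(item_name) %>%
    as.character()
  all_items = append(all_items, list(l))
}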
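The loop above filters the whole table once per invoice, which is why it is slow. One possibly faster route is base R's split(), which groups the item names by invoice in a single pass. The sketch below runs on a made-up toy data frame (the invoice and item values are illustrative, not taken from the CK data). Note that split() orders the result by sorted invoice name rather than by first appearance.

```r
# Toy stand-in for df1 (hypothetical values, same two columns as in the post)
toy <- data.frame(
  invoice_name = c("INV-2", "INV-1", "INV-2", "INV-3", "INV-1"),
  item_name = c("Kadak Chai", "Adrak Chai", "Maska Bun",
                "Paneer Puff", "Water Bottle 500 ML"),
  stringsAsFactors = FALSE
)

# split() returns one character vector of items per invoice
all_items_fast <- split(as.character(toy$item_name), toy$invoice_name)

# Drop the names to mirror the unnamed list built by the loop
all_items_fast <- unname(all_items_fast)

print(all_items_fast)
```

On the real data, the same call with df1 in place of toy should give a list with one element per invoice, ready to pass to transactions().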
Or, you can directly import the list file I created for you after processing it. Download it here.
df = readRDS("CK_data_anon.RDS")
Getting Ready for Analysis
arules does all of its analysis on a transactions object, which can be built from a list holding one character vector of items per invoice. See ?transactions for more details.
Converting df to a transactions object.
trans = transactions(df)
## Warning in asMethod(object): removing duplicated items in transactions
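The warning means that an item listed more than once on the same invoice is counted only once: a transaction records whether an item is present, not how many were bought. As a minimal base R sketch of that idea, the block below takes three made-up invoices (hypothetical names and contents) and builds the kind of binary invoice-by-item matrix that a transactions object stores in sparse form.

```r
# Three made-up invoices; inv_1 lists Kadak Chai twice
baskets <- list(
  inv_1 = c("Kadak Chai", "Kadak Chai", "Maska Bun"),
  inv_2 = c("Adrak Chai"),
  inv_3 = c("Kadak Chai", "Water Bottle 500 ML")
)

# Keep each item at most once per invoice (what the warning refers to)
baskets <- lapply(baskets, unique)

# One row per invoice, one column per item, TRUE where the item was bought
items <- sort(unique(unlist(baskets)))
incidence <- t(vapply(baskets, function(b) items %in% b, logical(length(items))))
colnames(incidence) <- items

print(incidence)
```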
Let’s see a summary of what we have.
summary(trans)
## transactions as itemMatrix in sparse format with
## 56737 rows (elements/itemsets/transactions) and
## 211 columns (items) and a density of 0.00928914
##
## most frequent items:
## Kadak Chai Water Bottle 500 ML Adrak Chai Indian Filter Kaapi
## 13910 10986 9748 8935
## Elaichi Chai (Other)
## 3301 64325
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2 3 4 5 6 7 8 9 10 11 12 13
## 24890 18374 7980 3315 1361 508 153 87 28 16 11 6 3
## 17 19 20 30
## 2 1 1 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.00 1.00 2.00 1.96 2.00 30.00
##
## includes extended item information - examples:
## labels
## 1 Aam Panna
## 2 Adrak Chai
## 3 Adrak Chai Full
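The density in the summary is the share of filled cells in the invoice-by-item matrix. It can be checked by hand from the numbers printed above: the counts under "most frequent items", including the (Other) bucket, add up to the total number of filled cells.

```r
# Figures copied from the summary output above
n_transactions <- 56737
n_items <- 211

# Counts from "most frequent items", including the (Other) bucket
filled_cells <- sum(13910, 10986, 9748, 8935, 3301, 64325)

# Share of filled cells in the invoice-by-item matrix
density <- filled_cells / (n_transactions * n_items)

# Average number of distinct items per invoice
mean_basket_size <- filled_cells / n_transactions

print(density)
print(mean_basket_size)
```

Both values agree with the summary: a density of about 0.0093 and a mean basket size of about 1.96 items.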
Let’s look at the most frequent items. Note that on the y-axis, we have the Support.
itemFrequencyPlot(trans, topN = 20)
Another way to visualise the data.
# Items ranked from most to least frequent; type = "absolute" gives counts
ggplot(
  tibble(
    Support = sort(itemFrequency(trans, type = "absolute"), decreasing = TRUE),
    Item = seq_len(ncol(trans))
  ),
  aes(x = Item, y = Support)
) +
  geom_line()
Note that a handful of items account for most purchases. Support falls off sharply after them, leaving a long tail of rarely ordered items.
Number of Possible Associations
For this dataset, the number of possible itemsets, and therefore of candidate associations, is huge. But how many exactly?
2^ncol(trans)
## [1] 3.291009e+63
Woah.
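To put that number in context: with d items there are 2^d possible itemsets (counting the empty set), and the usual textbook count of possible rules is 3^d - 2^(d+1) + 1. The sketch below uses a six-item menu, an illustrative assumption, so the numbers stay readable, and then repeats the itemset count for the 211 items in this dataset.

```r
# Number of possible itemsets for d items, including the empty set
n_itemsets <- function(d) 2^d

# Number of possible rules (non-empty lhs and rhs, no shared items)
n_rules <- function(d) 3^d - 2^(d + 1) + 1

print(n_itemsets(6))    # 64
print(n_rules(6))       # 602
print(n_itemsets(211))  # same value as 2^ncol(trans) above
```

This is why Apriori prunes by minimum support instead of enumerating every candidate.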
Frequent Itemsets
Let’s try to find the frequent itemsets.
its = apriori(trans, parameter=list(target = "frequent"))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## NA 0.1 1 none FALSE TRUE 5 0.1 1
## maxlen target ext
## 10 frequent itemsets TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 5673
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[211 item(s), 56737 transaction(s)] done [0.00s].
## sorting and recoding items ... [4 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 done [0.00s].
## sorting transactions ... done [0.00s].
## writing ... [4 set(s)] done [0.00s].
## creating S4 object ... done [0.00s].
its
## set of 4 itemsets
Support is a parameter that needs to be tuned. To see all the parameters that can be tuned, see ?ASparameter.
The lower the support parameter, the higher the number of itemsets you can generate. For large datasets, you should start from higher support values and make your way down. In this case, I tried several values and found 0.1 gave me 4 itemsets, 0.01 gave me 52 itemsets, 0.005 gave me 104 itemsets, and 0.001 gave me 440 itemsets.
It will be your call to choose the right value of support.
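To see why the number of itemsets grows as support drops, here is a small brute-force sketch on five made-up baskets. It is not the Apriori algorithm: it simply enumerates every possible itemset, which is only feasible for a tiny menu, and counts how many clear each threshold. The baskets and the three thresholds are illustrative assumptions.

```r
# Five made-up baskets
baskets <- list(
  c("Chai", "Water"),
  c("Chai", "Bun"),
  c("Chai", "Water", "Bun"),
  c("Kaapi"),
  c("Chai")
)
items <- sort(unique(unlist(baskets)))

# Support of an itemset: share of baskets containing every item in it
support_of <- function(itemset) {
  mean(vapply(baskets, function(b) all(itemset %in% b), logical(1)))
}

# Enumerate every non-empty itemset (brute force, tiny menus only)
all_itemsets <- unlist(
  lapply(seq_along(items), function(k) combn(items, k, simplify = FALSE)),
  recursive = FALSE
)
supports <- vapply(all_itemsets, support_of, numeric(1))

# Number of frequent itemsets at each minimum support
thresholds <- c(0.7, 0.3, 0.15)
n_frequent <- vapply(thresholds, function(s) sum(supports >= s), integer(1))

print(data.frame(min_support = thresholds, n_itemsets = n_frequent))
```

Lowering the threshold from 0.7 to 0.15 takes the count from 1 to 8 here, the same pattern as the jump from 4 to 440 itemsets on the cafe data.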
its = apriori(trans, parameter=list(target = "frequent", support = 0.001))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## NA 0.1 1 none FALSE TRUE 5 0.001 1
## maxlen target ext
## 10 frequent itemsets TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 56
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[211 item(s), 56737 transaction(s)] done [0.00s].
## sorting and recoding items ... [123 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 4 done [0.00s].
## sorting transactions ... done [0.01s].
## writing ... [440 set(s)] done [0.00s].
## creating S4 object ... done [0.00s].
its
## set of 440 itemsets
Let’s see what we find.
its = sort(its, by = "support")
inspect(head(its, n = 10))
## items support count
## [1] {Kadak Chai} 0.24516629 13910
## [2] {Water Bottle 500 ML} 0.19363026 10986
## [3] {Adrak Chai} 0.17181028 9748
## [4] {Indian Filter Kaapi} 0.15748101 8935
## [5] {Elaichi Chai} 0.05818073 3301
## [6] {Lemon Ice Tea} 0.04642473 2634
## [7] {Kadak Chai, Water Bottle 500 ML} 0.04379858 2485
## [8] {Masala Chai} 0.03985054 2261
## [9] {Paneer Puff} 0.03831715 2174
## [10] {Extra Cheese Grated} 0.03646650 2069
Let’s see how many items are bought together.
ggplot(tibble(`Itemset Size` = factor(size(its))), aes(`Itemset Size`)) + geom_bar()
Most itemsets are of size two, followed by single items.
Let’s see the most popular “couples”.
inspect(its[size(its) == 2])
## items support count
## [1] {Kadak Chai,
## Water Bottle 500 ML} 0.043798579 2485
## [2] {Adrak Chai,
## Water Bottle 500 ML} 0.031372825 1780
## [3] {Indian Filter Kaapi,
## Water Bottle 500 ML} 0.027407159 1555
## [4] {Indian Filter Kaapi,
## Kadak Chai} 0.025239262 1432
## [5] {Employee Meal,
## Kadak Chai} 0.018700319 1061
## [6] {Adrak Chai,
## Extra Elaichi Flavor} 0.017819060 1011
## [7] {Adrak Chai,
## Kadak Chai} 0.015774539 895
## [8] {Adrak Chai,
## Indian Filter Kaapi} 0.015439660 876
## [9] {Kadak Chai,
## Maska Bun} 0.011667871 662
## [10] {Indian Filter Kaapi Large,
## Water Bottle 500 ML} 0.010451733 593
## [11] {Elaichi Chai,
## Water Bottle 500 ML} 0.010345982 587
## [12] {Extra Cheese Grated,
## Water Bottle 500 ML} 0.010099230 573
## [13] {Adrak Chai,
## Elaichi Chai} 0.009552849 542
## [14] {Paneer Puff,
## Water Bottle 500 ML} 0.009535224 541
## [15] {Lemon Ice Tea,
## Water Bottle 500 ML} 0.008336711 473
## [16] {Elaichi Chai,
## Kadak Chai} 0.007614079 432
## [17] {Kadak Chai,
## Paneer Puff} 0.007120574 404
## [18] {Masala Chai,
## Water Bottle 500 ML} 0.007050073 400
## [19] {Adrak Chai,
## Maska Bun} 0.007014823 398
## [20] {CK Sandwich,
## Water Bottle 500 ML} 0.006820946 387
## [21] {Adrak Chai,
## Extra Cheese Grated} 0.006627069 376
## [22] {Extra Cheese Grated,
## Veg Club} 0.006538943 371
## [23] {CK Sandwich,
## Extra Cheese Grated} 0.006433192 365
## [24] {Bana Ke,
## Paneer Puff} 0.005745810 326
## [25] {Italian Noodles,
## Water Bottle 500 ML} 0.005728184 325
## [26] {French Fries – Piri Piri,
## Water Bottle 500 ML} 0.005534307 314
## [27] {Exotic Corn Mayo,
## Extra Cheese Grated} 0.005305180 301
## [28] {Elaichi Chai,
## Indian Filter Kaapi} 0.005199429 295
## [29] {Adrak Chai,
## Masala Chai} 0.005146553 292
## [30] {Adrak Chai,
## CK Sandwich} 0.005111303 290
## [31] {Indian Filter Kaapi,
## Paneer Puff} 0.005111303 290
## [32] {Exotic Corn Mayo,
## Water Bottle 500 ML} 0.005093678 289
## [33] {Adrak Chai,
## French Fries – Piri Piri} 0.005093678 289
## [34] {Kadak Chai,
## Lemon Ice Tea} 0.004987927 283
## [35] {Adrak Chai,
## Lemon Ice Tea} 0.004846925 275
## [36] {Veg Biryani,
## Water Bottle 500 ML} 0.004794050 272
## [37] {Cheese Chutney,
## Water Bottle 500 ML} 0.004758799 270
## [38] {Extra Cheese Grated,
## Nachos with Dip} 0.004741174 269
## [39] {Indian Filter Kaapi,
## Indian Filter Kaapi Large} 0.004582548 260
## [40] {Veg Club,
## Water Bottle 500 ML} 0.004582548 260
## [41] {Adrak Chai,
## Veg Club} 0.004547297 258
## [42] {Indori Upma,
## Water Bottle 500 ML} 0.004423921 251
## [43] {Adrak Chai,
## Exotic Corn Mayo} 0.004423921 251
## [44] {Adrak Chai,
## Paneer Puff} 0.004423921 251
## [45] {CK Sandwich,
## Kadak Chai} 0.004318170 245
## [46] {Extra Cheese Grated,
## White Sauce Pasta} 0.004300545 244
## [47] {Adrak Chai,
## Italian Noodles} 0.004177168 237
## [48] {Extra Elaichi Flavor,
## Water Bottle 500 ML} 0.004124293 234
## [49] {Adrak Chai,
## Chilli Garlic Cheese Toast} 0.004089042 232
## [50] {Maska Bun,
## Water Bottle 500 ML} 0.004089042 232
## [51] {Adrak Chai,
## Cheese Chutney} 0.004071417 231
## [52] {Water Bottle 500 ML,
## White Sauce Pasta} 0.004018542 228
## [53] {Desi Noodle,
## Water Bottle 500 ML} 0.004000917 227
## [54] {Extra Cheese Grated,
## Kadak Chai} 0.003965666 225
## [55] {Adrak Chai,
## Small Kulladh} 0.003824665 217
## [56] {Extra Cheese Grated,
## Indian Filter Kaapi} 0.003807039 216
## [57] {Chocolate Kaapi,
## Water Bottle 500 ML} 0.003754164 213
## [58] {Kadak Chai,
## Veg Club} 0.003754164 213
## [59] {Indori Upma,
## Kadak Chai} 0.003736539 212
## [60] {Indian Filter Kaapi,
## Masala Chai} 0.003666038 208
## [61] {French Fries – Piri Piri,
## Kadak Chai} 0.003630788 206
## [62] {French Fries – Piri Piri,
## Indian Filter Kaapi} 0.003542662 201
## [63] {Chilli Garlic Cheese Toast,
## Water Bottle 500 ML} 0.003525037 200
## [64] {Masala Omlette,
## Water Bottle 500 ML} 0.003454536 196
## [65] {Extra IFC Decoaction,
## Indian Filter Kaapi} 0.003419285 194
## [66] {Exotic Corn Mayo,
## Kadak Chai} 0.003278284 186
## [67] {Kadak Chai,
## Masala Omlette} 0.003225408 183
## [68] {Indian Filter Kaapi,
## Lemon Ice Tea} 0.003190158 181
## [69] {Italian Noodles,
## Kadak Chai} 0.003137283 178
## [70] {Kadak Chai,
## Masala Chai} 0.003137283 178
## [71] {Adrak Chai,
## Indori Upma} 0.003066782 174
## [72] {Indian Filter Kaapi,
## Italian Noodles} 0.003066782 174
## [73] {Cheese Chutney,
## Kadak Chai} 0.003013906 171
## [74] {Frappe,
## Water Bottle 500 ML} 0.002961031 168
## [75] {Elaichi Chai,
## Paneer Puff} 0.002961031 168
## [76] {Egg Sandwich,
## Water Bottle 500 ML} 0.002802404 159
## [77] {CK Brownie Blast,
## Water Bottle 500 ML} 0.002802404 159
## [78] {Adrak Chai,
## Masala Omlette} 0.002802404 159
## [79] {Indian Filter Kaapi,
## Veg Club} 0.002784779 158
## [80] {Adrak Chai,
## Chocolate Chai} 0.002731903 155
## [81] {Green Tea,
## Water Bottle 500 ML} 0.002696653 153
## [82] {Sprouts Sauted,
## Water Bottle 500 ML} 0.002696653 153
## [83] {Adrak Chai,
## Desi Noodle} 0.002643777 150
## [84] {Indian Filter Kaapi,
## Maska Bun} 0.002643777 150
## [85] {Adrak Chai,
## Veg Biryani} 0.002626152 149
## [86] {Chilli Garlic Cheese Toast,
## Kadak Chai} 0.002590902 147
## [87] {CK Sandwich,
## Indian Filter Kaapi} 0.002573277 146
## [88] {Adrak Chai,
## Extra Adrak Flavor} 0.002538026 144
## [89] {Thandi Kaapi,
## Water Bottle 500 ML} 0.002538026 144
## [90] {Desi Noodle,
## Extra Cheese Grated} 0.002538026 144
## [91] {Extra Adrak Flavor,
## Kadak Chai} 0.002485151 141
## [92] {Adrak Chai,
## White Sauce Pasta} 0.002467526 140
## [93] {Egg Sandwich,
## Kadak Chai} 0.002449900 139
## [94] {Adrak Chai,
## French fries - Salted} 0.002432275 138
## [95] {Egg Sandwich,
## Extra Cheese Grated} 0.002397025 136
## [96] {Extra Cheese Grated,
## Lemon Ice Tea} 0.002397025 136
## [97] {Desi Noodle,
## Kadak Chai} 0.002379400 135
## [98] {Indian Filter Kaapi Large,
## Kadak Chai} 0.002379400 135
## [99] {Cheese Chutney,
## Extra Cheese Grated} 0.002361775 134
## [100] {Exotic Corn Mayo,
## Indian Filter Kaapi} 0.002361775 134
## [101] {French Fries – Piri Piri,
## Lemon Ice Tea} 0.002291274 130
## [102] {Frappe,
## Indian Filter Kaapi} 0.002273649 129
## [103] {Adrak Chai,
## Paneer Sandwich} 0.002256023 128
## [104] {Nachos with Dip,
## Water Bottle 500 ML} 0.002256023 128
## [105] {Kadak Chai,
## Veg Biryani} 0.002238398 127
## [106] {Paneer Sandwich,
## Water Bottle 500 ML} 0.002220773 126
## [107] {Masala Lemonade,
## Water Bottle 500 ML} 0.002220773 126
## [108] {Irani Chai,
## Kadak Chai} 0.002203148 125
## [109] {Extra Cheese Grated,
## Paneer Sandwich} 0.002185523 124
## [110] {Adrak Chai,
## Irani Chai} 0.002185523 124
## [111] {Adrak Chai,
## Frappe} 0.002185523 124
## [112] {Extra Adrak Flavor,
## Kulladh Chai} 0.002132647 121
## [113] {Chilli Garlic Cheese Toast,
## Indian Filter Kaapi} 0.002132647 121
## [114] {Extra Cheese Grated,
## Italian Noodles} 0.002115022 120
## [115] {Bana Ke,
## Water Bottle 500 ML} 0.002097397 119
## [116] {Extra Adrak Flavor,
## Maska Bun} 0.002097397 119
## [117] {Extra Cheese Grated,
## French Fries – Piri Piri} 0.002097397 119
## [118] {Indian Filter Kaapi,
## Indori Upma} 0.002079772 118
## [119] {Egg Sandwich,
## Indian Filter Kaapi} 0.002062146 117
## [120] {Baked Samosa,
## Kadak Chai} 0.002044521 116
## [121] {Adrak Chai,
## Sprouts Sauted} 0.002044521 116
## [122] {Adrak Chai,
## Egg Sandwich} 0.002026896 115
## [123] {Indian Filter Kaapi,
## Masala Omlette} 0.002026896 115
## [124] {Americano,
## Indian Filter Kaapi} 0.001991646 113
## [125] {Baked Samosa,
## Water Bottle 500 ML} 0.001991646 113
## [126] {Frappe,
## Kadak Chai} 0.001991646 113
## [127] {Elaichi Chai,
## Masala Chai} 0.001974020 112
## [128] {Extra Cheese Grated,
## Extra Elaichi Flavor} 0.001956395 111
## [129] {Anda Biryani,
## Water Bottle 500 ML} 0.001921145 109
## [130] {French fries - Salted,
## Kadak Chai} 0.001921145 109
## [131] {Chocolate Kaapi,
## Extra Cheese Grated} 0.001921145 109
## [132] {Indian Filter Kaapi,
## Thandi Kaapi} 0.001903520 108
## [133] {Kadak Chai,
## Thandi Kaapi} 0.001903520 108
## [134] {Chocolate Kaapi,
## Indian Filter Kaapi} 0.001903520 108
## [135] {Cheese Chutney,
## Indian Filter Kaapi} 0.001868269 106
## [136] {Chocolate Chai,
## Water Bottle 500 ML} 0.001850644 105
## [137] {Italian Noodles,
## Lemon Ice Tea} 0.001850644 105
## [138] {Adrak Chai,
## Garlic Butter Bread Spread} 0.001833019 104
## [139] {Extra Cheese Grated,
## Masala Chai} 0.001833019 104
## [140] {Indian Filter Kaapi,
## Veg Biryani} 0.001815394 103
## [141] {Adrak Chai,
## Chocolate Kaapi} 0.001815394 103
## [142] {Elaichi Chai,
## Maska Bun} 0.001815394 103
## [143] {Burnt Garlic Maggi,
## Water Bottle 500 ML} 0.001780143 101
## [144] {CK Cheesy Blast Omelette,
## Water Bottle 500 ML} 0.001762518 100
## [145] {Adrak Chai,
## Thandi Kaapi} 0.001762518 100
## [146] {Aam Panna,
## Water Bottle 500 ML} 0.001744893 99
## [147] {Indian Filter Kaapi,
## White Sauce Pasta} 0.001744893 99
## [148] {Adrak Chai,
## Green Tea} 0.001727268 98
## [149] {Green Tea,
## Kadak Chai} 0.001727268 98
## [150] {Oreo Shake,
## Water Bottle 500 ML} 0.001727268 98
## [151] {Irani Chai,
## Maska Bun} 0.001727268 98
## [152] {French Fries – Piri Piri,
## Masala Chai} 0.001727268 98
## [153] {Mexican Maggi,
## Water Bottle 500 ML} 0.001709643 97
## [154] {Chocolate Kaapi,
## Kadak Chai} 0.001709643 97
## [155] {Kadak Chai,
## White Sauce Pasta} 0.001709643 97
## [156] {Exotic Corn Mayo,
## Lemon Ice Tea} 0.001709643 97
## [157] {Irani Chai,
## Water Bottle 500 ML} 0.001692018 96
## [158] {Chocolate Chai,
## Kadak Chai} 0.001674392 95
## [159] {Adrak Chai,
## Nachos with Dip} 0.001674392 95
## [160] {CK Brownie Blast,
## Extra Cheese Grated} 0.001674392 95
## [161] {Bana Ke,
## Kadak Chai} 0.001639142 93
## [162] {Extra Elaichi Flavor,
## Indian Filter Kaapi} 0.001639142 93
## [163] {Masala Chai,
## Maska Bun} 0.001639142 93
## [164] {French fries - Salted,
## Water Bottle 500 ML} 0.001621517 92
## [165] {Extra Cheese Grated,
## Frappe} 0.001621517 92
## [166] {Adrak Chai,
## Indian Filter Kaapi Large} 0.001621517 92
## [167] {Americano,
## Kadak Chai} 0.001603892 91
## [168] {CK Sandwich,
## Lemon Ice Tea} 0.001586266 90
## [169] {anda Ghotala,
## Water Bottle 500 ML} 0.001551016 88
## [170] {Baked Samosa,
## Bana Ke} 0.001551016 88
## [171] {Elaichi Chai,
## Extra Cheese Grated} 0.001551016 88
## [172] {Elaichi Chai,
## French Fries – Piri Piri} 0.001533391 87
## [173] {Elaichi Chai,
## Lemon Ice Tea} 0.001533391 87
## [174] {Adrak Chai,
## Burnt Garlic Maggi} 0.001515766 86
## [175] {CK Sandwich,
## Elaichi Chai} 0.001515766 86
## [176] {Berry Blast,
## Water Bottle 500 ML} 0.001480515 84
## [177] {Lemon Ice Tea,
## White Sauce Pasta} 0.001480515 84
## [178] {Indian Filter Kaapi,
## Sprouts Sauted} 0.001462890 83
## [179] {Indian Filter Kaapi,
## Nachos with Dip} 0.001445265 82
## [180] {Exotic Corn Mayo,
## French Fries – Piri Piri} 0.001445265 82
## [181] {Vanilla Kaapi,
## Water Bottle 500 ML} 0.001427640 81
## [182] {Adrak Chai,
## Baked Samosa} 0.001427640 81
## [183] {Cheese Chutney,
## French Fries – Piri Piri} 0.001427640 81
## [184] {Kadak Chai,
## Small Kulladh} 0.001410015 80
## [185] {Adrak Chai,
## Mexican Maggi} 0.001410015 80
## [186] {Desi Noodle,
## Indian Filter Kaapi} 0.001410015 80
## [187] {CK Sandwich,
## French Fries – Piri Piri} 0.001410015 80
## [188] {Baked Samosa,
## Paneer Puff} 0.001392389 79
## [189] {Kadak Chai,
## Sprouts Sauted} 0.001392389 79
## [190] {Frappe,
## Lemon Ice Tea} 0.001392389 79
## [191] {Chilli Garlic Cheese Toast,
## Elaichi Chai} 0.001392389 79
## [192] {Chocolate Kaapi,
## Lemon Ice Tea} 0.001374764 78
## [193] {Kiwi Mint Banana,
## Water Bottle 500 ML} 0.001357139 77
## [194] {Extra Elaichi Flavor,
## Kadak Chai} 0.001357139 77
## [195] {Kadak Chai,
## Mexican Bhel Poori} 0.001339514 76
## [196] {CK Pasta,
## Water Bottle 500 ML} 0.001321889 75
## [197] {Chilli Garlic Cheese,
## Water Bottle 500 ML} 0.001321889 75
## [198] {Mexican Bhel Poori,
## Water Bottle 500 ML} 0.001321889 75
## [199] {Apple Mojito,
## Water Bottle 500 ML} 0.001304264 74
## [200] {Extra Elaichi Flavor,
## Maska Bun} 0.001304264 74
## [201] {Lemon Ice Tea,
## Orange Ice Tea} 0.001286638 73
## [202] {Adrak Chai,
## Mexican Bhel Poori} 0.001286638 73
## [203] {Chocolate Shake,
## Water Bottle 500 ML} 0.001286638 73
## [204] {Indori Upma,
## Sprouts Sauted} 0.001286638 73
## [205] {French Fries – Piri Piri,
## Veg Club} 0.001286638 73
## [206] {Extra IFC Decoaction,
## Water Bottle 500 ML} 0.001269013 72
## [207] {Americano,
## Water Bottle 500 ML} 0.001269013 72
## [208] {French fries - Salted,
## Indian Filter Kaapi} 0.001269013 72
## [209] {Exotic Corn Mayo,
## Italian Noodles} 0.001269013 72
## [210] {Lemon Ice Tea,
## Veg Club} 0.001269013 72
## [211] {Garlic Butter Bread Spread,
## Kadak Chai} 0.001251388 71
## [212] {Lemon Ice Tea,
## Peach Ice Tea} 0.001251388 71
## [213] {Peach Ice Tea,
## Water Bottle 500 ML} 0.001251388 71
## [214] {Baked Samosa,
## Indian Filter Kaapi} 0.001251388 71
## [215] {Cheese Chutney,
## Lemon Ice Tea} 0.001251388 71
## [216] {CK Cheesy Blast Fries,
## Water Bottle 500 ML} 0.001233763 70
## [217] {Kadak Chai,
## Masala Lemonade} 0.001233763 70
## [218] {Frappe,
## French Fries – Piri Piri} 0.001233763 70
## [219] {Cheese Chutney,
## Elaichi Chai} 0.001233763 70
## [220] {Lemon Ice Tea,
## Masala Chai} 0.001233763 70
## [221] {Chocolate Kaapi,
## Vanilla Kaapi} 0.001216138 69
## [222] {Adrak Chai,
## CK Cheesy Blast Omelette} 0.001216138 69
## [223] {Indian Filter Kaapi,
## Small Kulladh} 0.001216138 69
## [224] {Extra Cheese Grated,
## Mexican Maggi} 0.001216138 69
## [225] {Adrak Chai,
## Chilli Garlic Cheese} 0.001216138 69
## [226] {Kadak Chai,
## Nachos with Dip} 0.001216138 69
## [227] {Chilli Garlic Cheese Toast,
## Lemon Ice Tea} 0.001216138 69
## [228] {CK Sandwich,
## Masala Chai} 0.001216138 69
## [229] {Chocolate Kaapi,
## French Fries – Piri Piri} 0.001198512 68
## [230] {French Fries – Piri Piri,
## Italian Noodles} 0.001198512 68
## [231] {Adrak Chai,
## CK Brownie Blast} 0.001180887 67
## [232] {Italian Noodles,
## White Sauce Pasta} 0.001180887 67
## [233] {CK Nimbu Pani,
## Water Bottle 500 ML} 0.001163262 66
## [234] {Chilli Garlic Cheese,
## Kadak Chai} 0.001163262 66
## [235] {Lemon Ice Tea,
## Thandi Kaapi} 0.001163262 66
## [236] {Indori Upma,
## Masala Omlette} 0.001163262 66
## [237] {Chilli Garlic Cheese Toast,
## Masala Chai} 0.001163262 66
## [238] {Extra IFC Decoaction,
## Indian Filter Kaapi Large} 0.001145637 65
## [239] {Masala Chai,
## Small Kulladh} 0.001145637 65
## [240] {Chocolate Wallnut Brownie,
## Water Bottle 500 ML} 0.001128012 64
## [241] {Kadak Chai,
## Paneer Sandwich} 0.001128012 64
## [242] {Lemon Ice Tea,
## Veg Biryani} 0.001128012 64
## [243] {Lemon Ice Tea,
## Paneer Puff} 0.001128012 64
## [244] {CK Pasta,
## Extra Cheese Grated} 0.001110387 63
## [245] {French Fries – Piri Piri,
## White Sauce Pasta} 0.001110387 63
## [246] {Chilli Garlic Cheese Toast,
## Italian Noodles} 0.001110387 63
## [247] {Chilli Garlic Cheese,
## Extra Cheese Grated} 0.001092761 62
## [248] {Green Tea,
## Indian Filter Kaapi} 0.001092761 62
## [249] {Extra Adrak Flavor,
## Masala Chai} 0.001092761 62
## [250] {Elaichi Chai,
## Extra Adrak Flavor} 0.001092761 62
## [251] {Lemon Ice Tea,
## Nachos with Dip} 0.001092761 62
## [252] {Elaichi Chai,
## Masala Omlette} 0.001092761 62
## [253] {Chilli Garlic Cheese Toast,
## Extra Cheese Grated} 0.001092761 62
## [254] {Masala Chai,
## Veg Club} 0.001092761 62
## [255] {Small Kulladh,
## Water Bottle 500 ML} 0.001075136 61
## [256] {Chocolate Chai,
## Indian Filter Kaapi} 0.001075136 61
## [257] {Desi Noodle,
## French Fries – Piri Piri} 0.001075136 61
## [258] {Cheese Chutney,
## Veg Club} 0.001075136 61
## [259] {Elaichi Chai,
## Veg Club} 0.001075136 61
## [260] {Adrak Chai,
## Green Tea Lemon} 0.001057511 60
## [261] {Indian Filter Kaapi,
## Paneer Sandwich} 0.001057511 60
## [262] {Exotic Corn Mayo,
## Masala Chai} 0.001057511 60
## [263] {Adrak Chai,
## Adrak Chai Full} 0.001039886 59
## [264] {Green Tea Lemon,
## Water Bottle 500 ML} 0.001039886 59
## [265] {Chocolate Wallnut Brownie,
## Kadak Chai} 0.001039886 59
## [266] {Adrak Chai,
## Peach Ice Tea} 0.001039886 59
## [267] {Adrak Chai,
## Masala Lemonade} 0.001039886 59
## [268] {Extra Cheese Grated,
## Masala Omlette} 0.001039886 59
## [269] {Indian Filter Kaapi Large,
## Lemon Ice Tea} 0.001039886 59
## [270] {Elaichi Chai,
## Exotic Corn Mayo} 0.001039886 59
## [271] {Masala Chai,
## Paneer Puff} 0.001039886 59
## [272] {CK Tadka Burger,
## Extra Cheese Slice} 0.001004635 57
## [273] {Elaichi Chai,
## Indori Upma} 0.001004635 57
## [274] {Chilli Garlic Cheese Toast,
## French Fries – Piri Piri} 0.001004635 57
What items are consumed in groups of three?
inspect(its[size(its) == 3])
## items support count
## [1] {Indian Filter Kaapi,
## Kadak Chai,
## Water Bottle 500 ML} 0.005587183 317
## [2] {Adrak Chai,
## Kadak Chai,
## Water Bottle 500 ML} 0.004617798 262
## [3] {Adrak Chai,
## Extra Elaichi Flavor,
## Water Bottle 500 ML} 0.003771789 214
## [4] {Adrak Chai,
## Indian Filter Kaapi,
## Water Bottle 500 ML} 0.003525037 200
## [5] {Kadak Chai,
## Paneer Puff,
## Water Bottle 500 ML} 0.002485151 141
## [6] {Adrak Chai,
## Kadak Chai,
## Maska Bun} 0.002220773 126
## [7] {Elaichi Chai,
## Kadak Chai,
## Water Bottle 500 ML} 0.002185523 124
## [8] {Kadak Chai,
## Maska Bun,
## Water Bottle 500 ML} 0.002150272 122
## [9] {Adrak Chai,
## Extra Cheese Grated,
## Water Bottle 500 ML} 0.002150272 122
## [10] {CK Sandwich,
## Extra Cheese Grated,
## Water Bottle 500 ML} 0.002115022 120
## [11] {Extra Adrak Flavor,
## Kadak Chai,
## Maska Bun} 0.001956395 111
## [12] {Adrak Chai,
## Elaichi Chai,
## Water Bottle 500 ML} 0.001885895 107
## [13] {Extra Cheese Grated,
## Veg Club,
## Water Bottle 500 ML} 0.001850644 105
## [14] {Adrak Chai,
## Extra Cheese Grated,
## Extra Elaichi Flavor} 0.001833019 104
## [15] {Bana Ke,
## Paneer Puff,
## Water Bottle 500 ML} 0.001797769 102
## [16] {Exotic Corn Mayo,
## Extra Cheese Grated,
## Water Bottle 500 ML} 0.001709643 97
## [17] {Indian Filter Kaapi,
## Indian Filter Kaapi Large,
## Water Bottle 500 ML} 0.001639142 93
## [18] {Adrak Chai,
## Maska Bun,
## Water Bottle 500 ML} 0.001603892 91
## [19] {Adrak Chai,
## CK Sandwich,
## Water Bottle 500 ML} 0.001533391 87
## [20] {CK Sandwich,
## Kadak Chai,
## Water Bottle 500 ML} 0.001533391 87
## [21] {Adrak Chai,
## CK Sandwich,
## Extra Cheese Grated} 0.001498141 85
## [22] {Indian Filter Kaapi Large,
## Kadak Chai,
## Water Bottle 500 ML} 0.001462890 83
## [23] {Adrak Chai,
## Extra Elaichi Flavor,
## Indian Filter Kaapi} 0.001462890 83
## [24] {Bana Ke,
## Kadak Chai,
## Paneer Puff} 0.001445265 82
## [25] {Adrak Chai,
## Indian Filter Kaapi,
## Kadak Chai} 0.001392389 79
## [26] {Indian Filter Kaapi,
## Paneer Puff,
## Water Bottle 500 ML} 0.001304264 74
## [27] {Indian Filter Kaapi,
## Kadak Chai,
## Maska Bun} 0.001198512 68
## [28] {Adrak Chai,
## Italian Noodles,
## Water Bottle 500 ML} 0.001198512 68
## [29] {Indori Upma,
## Kadak Chai,
## Water Bottle 500 ML} 0.001163262 66
## [30] {Extra Cheese Grated,
## Water Bottle 500 ML,
## White Sauce Pasta} 0.001163262 66
## [31] {Adrak Chai,
## Extra Adrak Flavor,
## Kadak Chai} 0.001145637 65
## [32] {Adrak Chai,
## Cheese Chutney,
## Water Bottle 500 ML} 0.001145637 65
## [33] {Adrak Chai,
## Extra Cheese Grated,
## Veg Club} 0.001128012 64
## [34] {Adrak Chai,
## Extra Adrak Flavor,
## Maska Bun} 0.001110387 63
## [35] {Adrak Chai,
## French Fries – Piri Piri,
## Water Bottle 500 ML} 0.001110387 63
## [36] {Adrak Chai,
## Masala Chai,
## Water Bottle 500 ML} 0.001110387 63
## [37] {Adrak Chai,
## Elaichi Chai,
## Kadak Chai} 0.001092761 62
## [38] {Extra Cheese Grated,
## Kadak Chai,
## Water Bottle 500 ML} 0.001092761 62
## [39] {Adrak Chai,
## Paneer Puff,
## Water Bottle 500 ML} 0.001075136 61
## [40] {Extra Cheese Grated,
## Indian Filter Kaapi,
## Water Bottle 500 ML} 0.001075136 61
## [41] {Adrak Chai,
## Veg Club,
## Water Bottle 500 ML} 0.001057511 60
## [42] {Adrak Chai,
## Chilli Garlic Cheese Toast,
## Water Bottle 500 ML} 0.001004635 57
What items are consumed in groups of four?
inspect(its[size(its) > 3])
## items support count
## [1] {Adrak Chai,
## Extra Adrak Flavor,
## Kadak Chai,
## Maska Bun} 0.001022261 58
What are the business implications of these?
- Water Bottle 500 ML looks like it is sold with a lot of items. As a business, consider offering it as a discounted pair. For example, if a bottle of water costs $5 on its own, it could cost $3 when bought with a Chai.
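Before committing to such a bundle, it may be worth checking whether water is bought with Chai more often than its overall popularity would predict. One quick check, using only the supports printed earlier, is the lift of the pair: the observed joint support divided by the support expected if the two items were bought independently. A value near 1 suggests the pairing mostly reflects how popular each item is on its own.

```r
# Supports copied from the frequent itemset output above
supp_kadak_chai <- 0.24516629
supp_water <- 0.19363026
supp_both <- 0.04379858

# Joint support expected if the two purchases were independent
expected_both <- supp_kadak_chai * supp_water

# Lift: observed joint support relative to the independent baseline
lift_pair <- supp_both / expected_both

print(round(lift_pair, 3))
```

For this pair the lift comes out a little below 1, so the high joint support seems to be driven by the popularity of the two items rather than by a special affinity between them. That does not rule out the promotion, but it is a useful caveat when choosing which pairs to discount.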
Representing Itemsets
Maximal Itemsets
The itemsets found so far include both larger itemsets and their subsets. Reporting all of them does not make much business sense.
For example, consider the itemset {Adrak Chai, Maska Bun, Water Bottle 500 ML}. If we include it, should we also include {Adrak Chai, Water Bottle 500 ML}? Probably not.
The function is.maximal (see ?is.maximal) flags the itemsets that have no proper superset among the frequent itemsets, so subsetting with it keeps only those.
its_max = its[is.maximal(its)]
its_max
## set of 309 itemsets
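The idea behind the maximal filter can be sketched in base R on a handful of itemsets. This is an illustration of the definition, not how arules implements it, and the short list of frequent itemsets below is made up for the example.

```r
# A made-up collection of frequent itemsets
frequent <- list(
  c("Adrak Chai"),
  c("Water Bottle 500 ML"),
  c("Maska Bun"),
  c("Adrak Chai", "Water Bottle 500 ML"),
  c("Adrak Chai", "Maska Bun", "Water Bottle 500 ML"),
  c("Kadak Chai")
)

# TRUE when a is strictly contained in b
is_proper_subset <- function(a, b) length(a) < length(b) && all(a %in% b)

# An itemset is maximal if no other frequent itemset strictly contains it
is_max <- vapply(seq_along(frequent), function(i) {
  !any(vapply(frequent[-i],
              function(other) is_proper_subset(frequent[[i]], other),
              logical(1)))
}, logical(1))

maximal <- frequent[is_max]
print(maximal)
```

Only the three-item set and {Kadak Chai} survive, because every other itemset is contained in the three-item set.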
Let’s look at them.
inspect(head(its_max, by = "support"))
## items support count
## [1] {Employee Meal,
## Kadak Chai} 0.018700319 1061
## [2] {Sultan’s Kaapi} 0.008389587 476
## [3] {Lemon Ice Tea,
## Water Bottle 500 ML} 0.008336711 473
## [4] {Orange Slush} 0.006133564 348
## [5] {Cinnamon Kaapi} 0.005851561 332
## [6] {Indian Filter Kaapi,
## Kadak Chai,
## Water Bottle 500 ML} 0.005587183 317
Association Rule Mining
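These rules are to be interpreted as If This Then That (IFTTT): if a basket contains the items on the left-hand side (lhs), it tends to contain the item on the right-hand side (rhs) as well.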
rules = apriori(trans, parameter = list(support = 0.001, confidence = 0.2))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.2 0.1 1 none FALSE TRUE 5 0.001 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 56
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[211 item(s), 56737 transaction(s)] done [0.00s].
## sorting and recoding items ... [123 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [131 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
length(rules)
## [1] 131
inspect(head(rules))
## lhs rhs support confidence
## [1] {} => {Kadak Chai} 0.245166294 0.2451663
## [2] {Kulladh Chai} => {Extra Adrak Flavor} 0.002132647 0.5845411
## [3] {Extra Adrak Flavor} => {Kulladh Chai} 0.002132647 0.2494845
## [4] {Adrak Chai Full} => {Adrak Chai} 0.001039886 0.2243346
## [5] {Extra Cheese Slice} => {CK Tadka Burger} 0.001004635 0.2968750
## [6] {Garlic Butter Bread Spread} => {Adrak Chai} 0.001833019 0.3623693
## coverage lift count
## [1] 1.000000000 1.000000 13910
## [2] 0.003648413 68.381662 121
## [3] 0.008548214 68.381662 121
## [4] 0.004635423 1.305711 59
## [5] 0.003384035 48.263028 57
## [6] 0.005058427 2.109125 104
Let’s look at their quality measures.
quality(head(rules))
## support confidence coverage lift count
## 1 0.245166294 0.2451663 1.000000000 1.000000 13910
## 2 0.002132647 0.5845411 0.003648413 68.381662 121
## 3 0.002132647 0.2494845 0.008548214 68.381662 121
## 4 0.001039886 0.2243346 0.004635423 1.305711 59
## 5 0.001004635 0.2968750 0.003384035 48.263028 57
## 6 0.001833019 0.3623693 0.005058427 2.109125 104
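These measures are linked by simple ratios, which can be checked by hand for rules 2 and 3 above. Coverage is the support of the lhs, confidence is the rule's support divided by its coverage, and lift is confidence divided by the support of the rhs. The sketch below uses only numbers copied from the table.

```r
# Rule 2: {Kulladh Chai} => {Extra Adrak Flavor}, numbers from the table above
supp_both <- 0.002132647           # support of the rule
supp_kulladh <- 0.003648413        # coverage of rule 2 = support of {Kulladh Chai}
supp_extra_adrak <- 0.008548214    # coverage of rule 3 = support of {Extra Adrak Flavor}

# Confidence: share of lhs baskets that also contain the rhs
conf_rule2 <- supp_both / supp_kulladh      # Kulladh Chai => Extra Adrak Flavor
conf_rule3 <- supp_both / supp_extra_adrak  # Extra Adrak Flavor => Kulladh Chai

# Lift: confidence relative to how common the rhs is overall
lift_rule2 <- conf_rule2 / supp_extra_adrak
lift_rule3 <- conf_rule3 / supp_kulladh

print(c(conf_rule2 = conf_rule2, conf_rule3 = conf_rule3))
print(c(lift_rule2 = lift_rule2, lift_rule3 = lift_rule3))
```

The two rules share the same lift (about 68.4) because lift is symmetric in lhs and rhs, while confidence depends on the direction of the rule (about 0.58 versus 0.25).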
Rules with highest lift
rules = sort(rules, by = "lift")
inspect(head(rules, n = 10))
## lhs rhs support confidence coverage lift count
## [1] {Kulladh Chai} => {Extra Adrak Flavor} 0.002132647 0.5845411 0.003648413 68.38166 121
## [2] {Extra Adrak Flavor} => {Kulladh Chai} 0.002132647 0.2494845 0.008548214 68.38166 121
## [3] {Adrak Chai,
## Kadak Chai,
## Maska Bun} => {Extra Adrak Flavor} 0.001022261 0.4603175 0.002220773 53.84955 58
## [4] {Extra Cheese Slice} => {CK Tadka Burger} 0.001004635 0.2968750 0.003384035 48.26303 57
## [5] {Adrak Chai,
## Extra Adrak Flavor,
## Kadak Chai} => {Maska Bun} 0.001022261 0.8923077 0.001145637 37.30793 58
## [6] {Extra Adrak Flavor,
## Kadak Chai} => {Maska Bun} 0.001956395 0.7872340 0.002485151 32.91474 111
## [7] {Kadak Chai,
## Paneer Puff} => {Bana Ke} 0.001445265 0.2029703 0.007120574 28.22531 82
## [8] {Bana Ke,
## Kadak Chai} => {Paneer Puff} 0.001445265 0.8817204 0.001639142 23.01112 82
## [9] {Bana Ke,
## Water Bottle 500 ML} => {Paneer Puff} 0.001797769 0.8571429 0.002097397 22.36969 102
## [10] {Bana Ke} => {Paneer Puff} 0.005745810 0.7990196 0.007191075 20.85279 326
Visualisation
You can also visualise the rules you created, thanks to the arulesViz package.
plot(rules)
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
Plot shaded by the order of each rule (the number of items in it).
plot(rules, shading = "order")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
Grouped plot
plot(rules, method = "grouped")
Graph plot
plot(rules, method = "graph")
## Warning: Too many rules supplied. Only plotting the best 100 using
## 'lift' (change control parameter max if needed).
## Warning: ggrepel: 6 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
There are too many rules to read the graph. Let’s raise the minimum confidence from 0.2 to 0.4 to get fewer rules.
rules = apriori(trans, parameter = list(support = 0.001, confidence = 0.4))
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.4 0.1 1 none FALSE TRUE 5 0.001 1
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 56
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[211 item(s), 56737 transaction(s)] done [0.00s].
## sorting and recoding items ... [123 item(s)] done [0.00s].
## creating transaction tree ... done [0.01s].
## checking subsets of size 1 2 3 4 done [0.00s].
## writing ... [26 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
plot(rules, method = "graph")
Interactive Table and Visualisation
You can also see the rules interactively.
Table of Rules
inspectDT(rules)
Plot of Rules
plot(rules, engine = "html")
Matrix of Rules
plot(rules, method = "matrix", engine = "html")
Graph of Rules
plot(rules, method = "graph", engine = "html")
Single-shot Analysis
You can also pass the data straight to ruleExplorer() to mine and visualise the rules interactively.
ruleExplorer(df)
Reference
A large part of this tutorial follows the book chapter, Association Analysis: Basic Concepts and Algorithms.
This was originally presented to MS (Business Analytics) students on November 21, 2022 at the Haslam College of Business, University of Tennessee in Prof Charles Liu’s class on Data Mining. Thanks to Prof Charles for providing me this opportunity and resources to make this class a success.