Danny love for Japanese food lead him to embark on risky venture opening a little cute resturant serving his 3 favourite foods : sushi , curry , ramen.
Danny is in need of assitance to help the restaurant stay afloat. He want to use basic data collected over few months of operation and answer few simple questions about his customer , their visiting pattern, how much money they've spent and which menu items are their favourite. Having this deeper connection with his customers will help him deliver better and more personalized experience for his loyal customers.
Further he plans on using these insights to help him decide whether he should expand the existing customer loyalty progran,
Load Extension
%load_ext sql
Initialize connection
It will create a database file with the name dannys_diner under datasets directory, it is not existing already else it will simply initialize the connection to existing database.
I'm using sqlite for this project
%sql sqlite:///datasets/dannys_diner.db
sales
¶The sales table with three columns - customer id, order_date, product_id.
capturing all customer_id level purchases with an corresponding order_date and product_id information for when and what menu items were ordered
%%sql
CREATE TABLE sales (
"customer_id" VARCHAR(1),
"order_date" DATE,
"product_id" INTEGER
);
INSERT INTO sales
("customer_id", "order_date", "product_id")
VALUES
('A', '2021-01-01', '1'),
('A', '2021-01-01', '2'),
('A', '2021-01-07', '2'),
('A', '2021-01-10', '3'),
('A', '2021-01-11', '3'),
('A', '2021-01-11', '3'),
('B', '2021-01-01', '2'),
('B', '2021-01-02', '2'),
('B', '2021-01-04', '1'),
('B', '2021-01-11', '1'),
('B', '2021-01-16', '3'),
('B', '2021-02-01', '3'),
('C', '2021-01-01', '3'),
('C', '2021-01-01', '3'),
('C', '2021-01-07', '3');
* sqlite:///datasets/dannys_diner.db Done. 15 rows affected.
[]
Querying table 'sales' just created
%%sql
SELECT *
FROM sales
* sqlite:///datasets/dannys_diner.db Done.
customer_id | order_date | product_id |
---|---|---|
A | 2021-01-01 | 1 |
A | 2021-01-01 | 2 |
A | 2021-01-07 | 2 |
A | 2021-01-10 | 3 |
A | 2021-01-11 | 3 |
A | 2021-01-11 | 3 |
B | 2021-01-01 | 2 |
B | 2021-01-02 | 2 |
B | 2021-01-04 | 1 |
B | 2021-01-11 | 1 |
B | 2021-01-16 | 3 |
B | 2021-02-01 | 3 |
C | 2021-01-01 | 3 |
C | 2021-01-01 | 3 |
C | 2021-01-07 | 3 |
menu
¶The menu table maps the product_id to the actual product_name and price of each menu item.
%%sql
CREATE TABLE menu(
"product_id" INTEGER,
"product_name" VARCHAR(5),
"price" INTEGER
);
INSERT INTO menu
("product_id","product_name","price")
VALUES
('1','sushi','10'),
('2','curry','15'),
('3','ramen','12');
* sqlite:///datasets/dannys_diner.db Done. 3 rows affected.
[]
%%sql
SELECT *
FROM menu
* sqlite:///datasets/dannys_diner.db Done.
product_id | product_name | price |
---|---|---|
1 | sushi | 10 |
2 | curry | 15 |
3 | ramen | 12 |
members
¶The final members table captures the join_date when a customer_id joined the beta version of the Danny’s Diner loyalty program
%%sql
DROP TABLE members
* sqlite:///datasets/dannys_diner.db Done.
[]
%%sql
CREATE TABLE members(
"customer_id" VARCHAR(1),
"join_date" DATE
);
INSERT INTO members
("customer_id","join_date")
VALUES
('A','2021-01-07'),
('B','2021-01-09');
* sqlite:///datasets/dannys_diner.db Done. 2 rows affected.
[]
%%sql
SELECT *
FROM members
* sqlite:///datasets/dannys_diner.db Done.
customer_id | join_date |
---|---|
A | 2021-01-07 |
B | 2021-01-09 |
%%sql
SELECT column_name, data_type
FROM INFORMATION_SCHEMA.TABLES
* sqlite:///datasets/dannys_diner.db (sqlite3.OperationalError) no such table: INFORMATION_SCHEMA.TABLES [SQL: SELECT column_name, data_type FROM INFORMATION_SCHEMA.TABLES] (Background on this error at: http://sqlalche.me/e/13/e3q8)
%%sql
SELECT name
FROM sqlite_master
WHERE TYPE ='table'
* sqlite:///datasets/dannys_diner.db Done.
name |
---|
sales |
menu |
members |
Generating temp table basic_table
so Danny and his team can easily inspect the data without needing to use SQL
%%sql
CREATE TEMP TABLE basic_table AS
SELECT
s.customer_id,
s.order_date,
m.product_id,
m.product_name,
m.price,
CASE
WHEN s.customer_id IN
(SELECT customer_id
FROM members mem
WHERE s.order_date >= mem.join_date
)
THEN 'Y'
ELSE 'N' END AS member
FROM sales s
JOIN menu m
ON s.product_id = m.product_id
ORDER BY s.customer_id,s.order_date , price DESC
* sqlite:///datasets/dannys_diner.db Done.
[]
%%sql
SELECT *
FROM basic_table
* sqlite:///datasets/dannys_diner.db Done.
customer_id | order_date | product_id | product_name | price | member |
---|---|---|---|---|---|
A | 2021-01-01 | 2 | curry | 15 | N |
A | 2021-01-01 | 1 | sushi | 10 | N |
A | 2021-01-07 | 2 | curry | 15 | Y |
A | 2021-01-10 | 3 | ramen | 12 | Y |
A | 2021-01-11 | 3 | ramen | 12 | Y |
A | 2021-01-11 | 3 | ramen | 12 | Y |
B | 2021-01-01 | 2 | curry | 15 | N |
B | 2021-01-02 | 2 | curry | 15 | N |
B | 2021-01-04 | 1 | sushi | 10 | N |
B | 2021-01-11 | 1 | sushi | 10 | Y |
B | 2021-01-16 | 3 | ramen | 12 | Y |
B | 2021-02-01 | 3 | ramen | 12 | Y |
C | 2021-01-01 | 3 | ramen | 12 | N |
C | 2021-01-01 | 3 | ramen | 12 | N |
C | 2021-01-07 | 3 | ramen | 12 | N |
%%sql
SELECT customer_id, SUM(price) AS total_spent
FROM basic_table
GROUP BY customer_id
ORDER BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | total_spent |
---|---|
A | 76 |
B | 74 |
C | 36 |
%%sql
SELECT
customer_id,
COUNT(DISTINCT order_date) AS days_vistied
FROM basic_table
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | days_vistied |
---|---|
A | 4 |
B | 6 |
C | 2 |
B visited often more than A and C. But customers may have visited restaurants twice a day.
%%sql
SELECT
customer_id,
COUNT(order_date) AS days_vistied
FROM basic_table
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | days_vistied |
---|---|
A | 6 |
B | 6 |
C | 3 |
%%sql
SELECT
customer_id,
GROUP_CONCAT(product_name,',') AS first_item
FROM basic_table t2
WHERE order_date = (
SELECT MIN(order_date)
FROM basic_table as t1
WHERE t1.customer_id = t2.customer_id
GROUP BY customer_id
)
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | first_item |
---|---|
A | curry,sushi |
B | curry |
C | ramen,ramen |
Ramen is popular item from the menu
%%sql
SELECT
product_name,
COUNT(*) AS n_times_purchased
FROM basic_table
GROUP BY product_name
ORDER BY n_times_purchased DESC
* sqlite:///datasets/dannys_diner.db Done.
product_name | n_times_purchased |
---|---|
ramen | 8 |
curry | 4 |
sushi | 3 |
%%sql
WITH purchase_table AS(
SELECT
customer_id,
product_name,
COUNT(*) AS n_times_purchased
FROM basic_table
GROUP BY customer_id,product_name
),
purchase_rank_table AS(
SELECT
customer_id,
product_name,
n_times_purchased,
RANK() OVER (
PARTITION BY customer_id
ORDER BY n_times_purchased DESC) AS rank
FROM purchase_table
)
SELECT
customer_id,
product_name,
n_times_purchased
FROM purchase_rank_table
WHERE rank = 1
* sqlite:///datasets/dannys_diner.db Done.
customer_id | product_name | n_times_purchased |
---|---|---|
A | ramen | 3 |
B | curry | 2 |
B | ramen | 2 |
B | sushi | 2 |
C | ramen | 3 |
Ramen is still popular among Danny's customers
%%sql
SELECT *
FROM
(
SELECT
customer_id,
product_name,
order_date,
RANK() OVER(
PARTITION BY customer_id
ORDER BY order_date) AS rank
FROM basic_table
WHERE member = 'Y'
)
WHERE rank = 1
* sqlite:///datasets/dannys_diner.db Done.
customer_id | product_name | order_date | rank |
---|---|---|---|
A | curry | 2021-01-07 | 1 |
B | sushi | 2021-01-11 | 1 |
%%sql
SELECT *
FROM members
* sqlite:///datasets/dannys_diner.db Done.
customer_id | join_date |
---|---|
A | 2021-01-07 |
B | 2021-01-09 |
C is still not a member of loyalty program. so we are excluding C for this enquiry
%%sql
WITH before_membership AS(
SELECT *
FROM basic_table
WHERE member = 'N' AND
customer_id IN (
SELECT
DISTINCT customer_id
FROM members
)
),
before_membership_rank AS (
SELECT customer_id,
order_date,
product_name,
RANK() OVER(PARTITION BY customer_id
ORDER BY order_date DESC) AS rank
FROM
before_membership
)
SELECT *
FROM before_membership_rank
WHERE rank = 1
* sqlite:///datasets/dannys_diner.db Done.
customer_id | order_date | product_name | rank |
---|---|---|---|
A | 2021-01-01 | curry | 1 |
A | 2021-01-01 | sushi | 1 |
B | 2021-01-04 | sushi | 1 |
A and B customer last purchased curry and sushi respectively before becoming member of loyalty program.
Again A and B customer first purchased curry and sushi respectively after becoming member.
%%sql
SELECT
customer_id,
COUNT(product_name) AS total_purchased,
SUM(price) AS total_spent
FROM basic_table
WHERE member = 'N'
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | total_purchased | total_spent |
---|---|---|
A | 2 | 25 |
B | 3 | 40 |
C | 3 | 36 |
%%sql
SELECT
customer_id,
COUNT(product_name) AS total_purchased,
SUM(price) AS total_spent
FROM basic_table
WHERE member = 'Y'
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | total_purchased | total_spent |
---|---|---|
A | 4 | 51 |
B | 3 | 34 |
%%sql
WITH loyalty_point_table AS(
SELECT
*,
CASE
WHEN product_id = 1 THEN price*20
ELSE price*10
END AS points
FROM basic_table
)
SELECT customer_id,
SUM(points) AS total_loyalty_points
FROM loyalty_point_table
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | total_loyalty_points |
---|---|
A | 860 |
B | 940 |
C | 360 |
Day 1 - Day 7 (1 week from the 1st day joined as Loyalty member) Each $1 spent on all items is 20 points.
Day x to Day 1 (first day as loyalty member) Or Week after Day 1 Each $1 spent on Sushi is 20 points And on rest item is 10 points
%%sql
SELECT *,
date(join_date, '7 days')
FROM members
* sqlite:///datasets/dannys_diner.db Done.
customer_id | join_date | date(join_date, '7 days') |
---|---|---|
A | 2021-01-07 | 2021-01-14 |
B | 2021-01-09 | 2021-01-16 |
Cte table dates_cte to record 7th day from join_date
%%sql
WITH dates_cte AS(
SELECT *,
date(join_date, '7 days') AS valid_date
FROM members
),
loyalty_point_table AS(
SELECT
t1.customer_id,
t1.order_date,
CASE
WHEN (t1.order_date < t2.join_date)
OR (t1.order_date > t2.valid_date)
THEN
CASE WHEN t1.product_id = 1
THEN t1.price*10*2
ELSE t1.price*10
END
WHEN (t1.order_date >= t2.join_date)
AND (t1.order_date <= t2.valid_date)
THEN price * 2*10
END AS points
FROM basic_table t1
JOIN dates_cte t2
ON t1.customer_id = t2.customer_id
)
SELECT
customer_id,
SUM(points) AS loyalty_points
FROM loyalty_point_table
WHERE STRFTIME('%m',order_date) = '01'
GROUP BY customer_id
* sqlite:///datasets/dannys_diner.db Done.
customer_id | loyalty_points |
---|---|
A | 1370 |
B | 940 |
%%sql
WITH first_order_table AS (
SELECT
customer_id,
MIN(order_date) AS first_order_date
FROM basic_table t1
WHERE customer_id IN (
SELECT DISTINCT customer_id
FROM members
)
GROUP BY customer_id
)
SELECT
customer_id,
CAST(JULIANDAY((SELECT join_date
FROM members t2
WHERE t1.customer_id = t2.customer_id)) - JULIANDAY(first_order_date)
AS INTEGER) AS days
FROM first_order_table t1
* sqlite:///datasets/dannys_diner.db Done.
customer_id | days |
---|---|
A | 6 |
B | 8 |
From the analysis, we discover few interesting insights that would be certainly useful for Danny