Case Study 1¶

Danny's Dinner¶

Source : 8 Week SQL Challenge

Table of Content¶

  • Introduction
  • Creating Tables in Database
  • Entity Relationship Diagram
  • Case study questions and solutions

Introduction¶

Danny love for Japanese food lead him to embark on risky venture opening a little cute resturant serving his 3 favourite foods : sushi , curry , ramen.

Danny is in need of assitance to help the restaurant stay afloat. He want to use basic data collected over few months of operation and answer few simple questions about his customer , their visiting pattern, how much money they've spent and which menu items are their favourite. Having this deeper connection with his customers will help him deliver better and more personalized experience for his loyal customers.

Further he plans on using these insights to help him decide whether he should expand the existing customer loyalty progran,

Load Extension

In [1]:
%load_ext sql

Initialize connection

It will create a database file with the name dannys_diner under datasets directory, it is not existing already else it will simply initialize the connection to existing database.

I'm using sqlite for this project

In [2]:
%sql sqlite:///datasets/dannys_diner.db

Creating tables in Database¶

Table 1 : sales¶

The sales table with three columns - customer id, order_date, product_id.

capturing all customer_id level purchases with an corresponding order_date and product_id information for when and what menu items were ordered

In [5]:
%%sql 

CREATE TABLE sales (
  "customer_id" VARCHAR(1),
  "order_date" DATE,
  "product_id" INTEGER
);
INSERT INTO sales
  ("customer_id", "order_date", "product_id")
VALUES
  ('A', '2021-01-01', '1'),
  ('A', '2021-01-01', '2'),
  ('A', '2021-01-07', '2'),
  ('A', '2021-01-10', '3'),
  ('A', '2021-01-11', '3'),
  ('A', '2021-01-11', '3'),
  ('B', '2021-01-01', '2'),
  ('B', '2021-01-02', '2'),
  ('B', '2021-01-04', '1'),
  ('B', '2021-01-11', '1'),
  ('B', '2021-01-16', '3'),
  ('B', '2021-02-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-01', '3'),
  ('C', '2021-01-07', '3');
 * sqlite:///datasets/dannys_diner.db
Done.
15 rows affected.
Out[5]:
[]

Querying table 'sales' just created

In [10]:
%%sql
SELECT *
FROM sales
 * sqlite:///datasets/dannys_diner.db
Done.
Out[10]:
customer_id order_date product_id
A 2021-01-01 1
A 2021-01-01 2
A 2021-01-07 2
A 2021-01-10 3
A 2021-01-11 3
A 2021-01-11 3
B 2021-01-01 2
B 2021-01-02 2
B 2021-01-04 1
B 2021-01-11 1
B 2021-01-16 3
B 2021-02-01 3
C 2021-01-01 3
C 2021-01-01 3
C 2021-01-07 3

Table 2 : menu¶

The menu table maps the product_id to the actual product_name and price of each menu item.

In [14]:
%%sql
CREATE TABLE menu(
    "product_id" INTEGER,
    "product_name" VARCHAR(5),
    "price" INTEGER
);
INSERT INTO menu
    ("product_id","product_name","price")
VALUES
    ('1','sushi','10'),
    ('2','curry','15'),
    ('3','ramen','12');
 * sqlite:///datasets/dannys_diner.db
Done.
3 rows affected.
Out[14]:
[]
In [15]:
%%sql
SELECT *
FROM menu
 * sqlite:///datasets/dannys_diner.db
Done.
Out[15]:
product_id product_name price
1 sushi 10
2 curry 15
3 ramen 12

Table 3: members¶

The final members table captures the join_date when a customer_id joined the beta version of the Danny’s Diner loyalty program

In [35]:
%%sql
DROP TABLE members
 * sqlite:///datasets/dannys_diner.db
Done.
Out[35]:
[]
In [36]:
%%sql
CREATE TABLE members(
    "customer_id" VARCHAR(1),
    "join_date" DATE
);
INSERT INTO members
    ("customer_id","join_date")
VALUES
    ('A','2021-01-07'),
    ('B','2021-01-09');
 * sqlite:///datasets/dannys_diner.db
Done.
2 rows affected.
Out[36]:
[]
In [37]:
%%sql
SELECT *
FROM members
 * sqlite:///datasets/dannys_diner.db
Done.
Out[37]:
customer_id join_date
A 2021-01-07
B 2021-01-09
In [52]:
%%sql

SELECT column_name, data_type
FROM INFORMATION_SCHEMA.TABLES
 * sqlite:///datasets/dannys_diner.db
(sqlite3.OperationalError) no such table: INFORMATION_SCHEMA.TABLES
[SQL: SELECT column_name, data_type
FROM INFORMATION_SCHEMA.TABLES]
(Background on this error at: http://sqlalche.me/e/13/e3q8)

Querying table names¶

In [19]:
%%sql
SELECT name
FROM sqlite_master
WHERE TYPE ='table'
 * sqlite:///datasets/dannys_diner.db
Done.
Out[19]:
name
sales
menu
members

Entity Relationship Diagram¶

Join All the Things¶

Generating temp table basic_table so Danny and his team can easily inspect the data without needing to use SQL

In [4]:
%%sql
CREATE TEMP TABLE basic_table AS
SELECT
    s.customer_id,
    s.order_date,
    m.product_id,
    m.product_name,
    m.price,
    CASE 
    WHEN s.customer_id IN 
        (SELECT customer_id 
         FROM members mem
         WHERE s.order_date >= mem.join_date
        )
    THEN 'Y'
    ELSE 'N' END AS member
FROM sales s
JOIN menu m
ON s.product_id = m.product_id
ORDER BY s.customer_id,s.order_date , price DESC
 * sqlite:///datasets/dannys_diner.db
Done.
Out[4]:
[]
In [56]:
%%sql
SELECT *
FROM basic_table
 * sqlite:///datasets/dannys_diner.db
Done.
Out[56]:
customer_id order_date product_id product_name price member
A 2021-01-01 2 curry 15 N
A 2021-01-01 1 sushi 10 N
A 2021-01-07 2 curry 15 Y
A 2021-01-10 3 ramen 12 Y
A 2021-01-11 3 ramen 12 Y
A 2021-01-11 3 ramen 12 Y
B 2021-01-01 2 curry 15 N
B 2021-01-02 2 curry 15 N
B 2021-01-04 1 sushi 10 N
B 2021-01-11 1 sushi 10 Y
B 2021-01-16 3 ramen 12 Y
B 2021-02-01 3 ramen 12 Y
C 2021-01-01 3 ramen 12 N
C 2021-01-01 3 ramen 12 N
C 2021-01-07 3 ramen 12 N

Case study questions¶

1. What is the total amount each customer spent at the restaurant?¶

In [58]:
%%sql
SELECT customer_id, SUM(price) AS total_spent
FROM basic_table
GROUP BY customer_id
ORDER BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[58]:
customer_id total_spent
A 76
B 74
C 36

2. How many days has each customer visited the restaurant?¶

In [62]:
%%sql

SELECT 
    customer_id,
    COUNT(DISTINCT order_date) AS days_vistied 
FROM basic_table
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[62]:
customer_id days_vistied
A 4
B 6
C 2

B visited often more than A and C. But customers may have visited restaurants twice a day.

In [64]:
%%sql
SELECT 
    customer_id,
    COUNT(order_date) AS days_vistied 
FROM basic_table
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[64]:
customer_id days_vistied
A 6
B 6
C 3

3. What was the first item from the menu purchased by each customer?¶

In [72]:
%%sql
SELECT
    customer_id,
    GROUP_CONCAT(product_name,',') AS first_item
FROM basic_table t2
WHERE order_date = (
    SELECT MIN(order_date)
    FROM basic_table as t1
    WHERE t1.customer_id = t2.customer_id 
    GROUP BY customer_id
)
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[72]:
customer_id first_item
A curry,sushi
B curry
C ramen,ramen

Ramen is popular item from the menu

4. What is the most purchased item on the menu and how many times was it purchased by all customers?¶

In [77]:
%%sql
SELECT
    product_name,
    COUNT(*) AS n_times_purchased
FROM basic_table
GROUP BY product_name
ORDER BY n_times_purchased DESC
 * sqlite:///datasets/dannys_diner.db
Done.
Out[77]:
product_name n_times_purchased
ramen 8
curry 4
sushi 3

5. Which item was the most popular for each customer?¶

In [33]:
%%sql
WITH purchase_table AS(
    SELECT
        customer_id,
        product_name,
        COUNT(*) AS n_times_purchased
    FROM basic_table
    GROUP BY customer_id,product_name
),
purchase_rank_table AS(
    SELECT
        customer_id,
        product_name,
        n_times_purchased,
        RANK() OVER (
            PARTITION BY customer_id
            ORDER BY n_times_purchased DESC) AS rank
    FROM purchase_table
)

SELECT 
    customer_id,
    product_name,
    n_times_purchased
FROM purchase_rank_table
WHERE rank = 1
 * sqlite:///datasets/dannys_diner.db
Done.
Out[33]:
customer_id product_name n_times_purchased
A ramen 3
B curry 2
B ramen 2
B sushi 2
C ramen 3

Ramen is still popular among Danny's customers

6. Which item was purchased first by the customer after they became a member?¶

In [5]:
%%sql

SELECT *
FROM
(
    SELECT
        customer_id,
        product_name,
        order_date,
        RANK() OVER(
        PARTITION BY customer_id
        ORDER BY order_date) AS rank
    FROM basic_table
    WHERE member = 'Y'
)
WHERE rank = 1
 * sqlite:///datasets/dannys_diner.db
Done.
Out[5]:
customer_id product_name order_date rank
A curry 2021-01-07 1
B sushi 2021-01-11 1
In [89]:
%%sql
SELECT *
FROM members
 * sqlite:///datasets/dannys_diner.db
Done.
Out[89]:
customer_id join_date
A 2021-01-07
B 2021-01-09

7. Which item was purchased just before the customer became a member¶

C is still not a member of loyalty program. so we are excluding C for this enquiry

In [117]:
%%sql
WITH before_membership AS(
    SELECT *
    FROM basic_table
    WHERE member = 'N' AND 
    customer_id IN (
        SELECT 
            DISTINCT customer_id
        FROM members
        )
),
before_membership_rank AS (
    SELECT customer_id,
        order_date,
        product_name,
        RANK() OVER(PARTITION BY customer_id
        ORDER BY order_date DESC) AS rank
    FROM
        before_membership
)
SELECT *
FROM before_membership_rank
WHERE rank = 1
 * sqlite:///datasets/dannys_diner.db
Done.
Out[117]:
customer_id order_date product_name rank
A 2021-01-01 curry 1
A 2021-01-01 sushi 1
B 2021-01-04 sushi 1

A and B customer last purchased curry and sushi respectively before becoming member of loyalty program.

Again A and B customer first purchased curry and sushi respectively after becoming member.

8. a. What is the total items and amount spent for each member before they became a member?¶

In [119]:
%%sql
SELECT 
    customer_id,
    COUNT(product_name) AS total_purchased,
    SUM(price) AS total_spent
FROM basic_table
WHERE member = 'N'
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[119]:
customer_id total_purchased total_spent
A 2 25
B 3 40
C 3 36

8.b. What is the total items and amount spent for each member after they became a member?¶

In [123]:
%%sql
SELECT 
    customer_id,
    COUNT(product_name) AS total_purchased,
    SUM(price) AS total_spent
FROM basic_table
WHERE member = 'Y'
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[123]:
customer_id total_purchased total_spent
A 4 51
B 3 34

9. If each 1dollar spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?¶

In [10]:
%%sql
WITH loyalty_point_table AS(
    SELECT
        *,
        CASE 
        WHEN product_id = 1 THEN price*20
        ELSE price*10
        END AS points
    FROM basic_table
)
SELECT customer_id,
    SUM(points) AS total_loyalty_points
FROM loyalty_point_table
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[10]:
customer_id total_loyalty_points
A 860
B 940
C 360

10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?¶

Day 1 - Day 7 (1 week from the 1st day joined as Loyalty member) Each $1 spent on all items is 20 points.

Day x to Day 1 (first day as loyalty member) Or Week after Day 1 Each $1 spent on Sushi is 20 points And on rest item is 10 points

In [15]:
%%sql
SELECT *,
    date(join_date, '7 days')
FROM members
 * sqlite:///datasets/dannys_diner.db
Done.
Out[15]:
customer_id join_date date(join_date, '7 days')
A 2021-01-07 2021-01-14
B 2021-01-09 2021-01-16

Cte table dates_cte to record 7th day from join_date

In [31]:
%%sql
WITH dates_cte AS(
    SELECT *,
        date(join_date, '7 days') AS valid_date
    FROM members
    
),
loyalty_point_table AS(
    SELECT 
        t1.customer_id,
        t1.order_date, 
        CASE 
            WHEN (t1.order_date < t2.join_date)
            OR (t1.order_date > t2.valid_date)
            THEN
                CASE WHEN t1.product_id = 1
                        THEN t1.price*10*2
                        ELSE t1.price*10
                END
            WHEN (t1.order_date >= t2.join_date) 
            AND (t1.order_date <= t2.valid_date)
            THEN price * 2*10
        END AS points
    FROM basic_table t1
    JOIN dates_cte t2
    ON t1.customer_id = t2.customer_id
)

SELECT
    customer_id,
    SUM(points) AS loyalty_points
FROM loyalty_point_table
WHERE STRFTIME('%m',order_date) = '01'
GROUP BY customer_id
 * sqlite:///datasets/dannys_diner.db
Done.
Out[31]:
customer_id loyalty_points
A 1370
B 940

11. Calculating days from first purchase till the day they become loyalty member?¶

In [190]:
%%sql
WITH first_order_table AS (
    SELECT
        customer_id,
        MIN(order_date) AS first_order_date
    FROM basic_table t1
    WHERE customer_id IN (
        SELECT DISTINCT customer_id
        FROM members
    )
    GROUP BY customer_id
)

SELECT 
    customer_id,
    CAST(JULIANDAY((SELECT join_date
        FROM members t2
        WHERE t1.customer_id = t2.customer_id)) - JULIANDAY(first_order_date)
    AS INTEGER) AS days
FROM first_order_table t1
 * sqlite:///datasets/dannys_diner.db
Done.
Out[190]:
customer_id days
A 6
B 8

Insights¶

From the analysis, we discover few interesting insights that would be certainly useful for Danny

  • Danny' Diner's most popular item is Ramen.
  • The last item Customer A and B order before becoming loyalty members are sushi and curry. Also the first thing they purchased after becoming member are sushi and curry. Apart from popular item as Ramen, other item sushi and curry must be really delicious.
  • Loyatly customer spend more at restaurant. Customer A & B spend more than customer C.
  • Loyalty customer visited more often those who are not in loyalty program
  • Both loyalty member signed up for this program within a week from their first visit.

Back to top

This Project (on Github)¶

In [ ]: