SQL Intermediate Guide — Hands-On with Joins, CTE's & Window Functions
SQL JOINs Explained: INNER, LEFT, RIGHT, FULL OUTER
In relational databases, data is often split across multiple tables. To extract meaningful insights, you frequently need to combine data from two or more tables — this is where SQL JOINs come in.
What is a JOIN?
A JOIN in SQL combines rows from two or more tables based on a related column (typically a primary-foreign key relationship). The resulting dataset will vary depending on the type of JOIN used.
INNER JOIN
Definition: Returns only the rows that have matching values in both tables.
Syntax:
SELECT table1.column, table2.column
FROM table1
INNER JOIN table2
ON table1.common_column = table2.common_column;
Example: Join customers and orders tables to get customers who have placed an order.
SELECT c.customer_name, o.order_date
FROM customers c
INNER JOIN orders o
ON c.customer_id = o.customer_id;
Key Points:
LEFT JOIN (LEFT OUTER JOIN)
Definition: Returns all rows from the left table, and matched rows from the right table. If there's no match, NULLs are returned for columns from the right table.
Syntax:
SELECT table1.column, table2.column
FROM table1
LEFT JOIN table2
ON table1.common_column = table2.common_column;
Example: List all customers and their orders (if any).
SELECT c.customer_name, o.order_date
FROM customers c
LEFT JOIN orders o
ON c.customer_id = o.customer_id;
Key Points:
RIGHT JOIN (RIGHT OUTER JOIN)
Definition: Returns all rows from the right table, and matched rows from the left table. If there's no match, NULLs are returned for columns from the left table.
Syntax:
SELECT table1.column, table2.column
FROM table1
RIGHT JOIN table2
ON table1.common_column = table2.common_column;
Example: List all orders and the customer who placed them. Even if the customer record no longer exists, the order will appear.
SELECT c.customer_name, o.order_date
FROM customers c
RIGHT JOIN orders o
ON c.customer_id = o.customer_id;
Key Points:
FULL OUTER JOIN
Definition: Returns all rows from both tables. Where no match exists, NULLs are returned in place.
Syntax:
SELECT table1.column, table2.column
FROM table1
FULL OUTER JOIN table2
ON table1.common_column = table2.common_column;
Example: List all customers and all orders, matching where possible.
SELECT c.customer_name, o.order_date
FROM customers c
FULL OUTER JOIN orders o
ON c.customer_id = o.customer_id;
Key Points:
Summary of JOIN Types
Pro Tips:
Subqueries vs CTEs (Common Table Expressions)
When writing SQL queries to solve real-world problems, you often need to structure your logic using temporary result sets. These can be created using either subqueries or Common Table Expressions (CTEs). Both approaches are used to improve modularity and handle complexity, but they differ in flexibility, readability, and performance.
1. Subqueries (Nested Queries)
Definition: A subquery is a query nested inside another SQL query. It can appear in the SELECT, FROM, or WHERE clause.
Syntax Example:
SELECT customer_id, customer_name
FROM customers
WHERE customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date >= '2024-01-01'
);
Common Use Cases:
Types of Subqueries:
Limitations:
2. Common Table Expressions (CTEs):
Definition: A CTE (Common Table Expression) is a named temporary result set that can be referenced within the main SQL query. It improves clarity and enables reuse of intermediate logic.
Syntax Example:
WITH recent_orders AS (
SELECT customer_id
FROM orders
WHERE order_date >= '2024-01-01'
)
SELECT customer_id, customer_name
FROM customers
WHERE customer_id IN (SELECT customer_id FROM recent_orders);
Common Use Cases:
Advantages:
Subquery vs CTE – Comparison Table
Real-World Scenario:
Requirement: List all customers who placed orders in the last 30 days and also show how many total orders they placed.
Approach 1: Using Subqueries
SELECT c.customer_name,
(SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.customer_id) AS total_orders
FROM customers c
WHERE c.customer_id IN (
SELECT customer_id
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
);
Approach 2: Using CTE
WITH recent_orders AS (
SELECT customer_id, COUNT(*) AS total_orders
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY customer_id
)
SELECT c.customer_name, ro.total_orders
FROM customers c
JOIN recent_orders ro ON c.customer_id = ro.customer_id;
Why CTE is better here
Summary:
Window Functions in SQL – Deep Dive
What is a Window Function?
A Window Function performs a calculation across a set of rows (a window) related to the current row. This is defined using the OVER() clause.
Unlike aggregate functions, window functions do not reduce the number of rows returned.
Syntax:
function_name(column) OVER (
PARTITION BY column1
ORDER BY column2
)
-- PARTITION BY: Divides the dataset into groups (optional).
-- RDER BY: Defines the order of rows within each partition (required for ranking functions).
1. ROW_NUMBER():
Definition: Assigns a unique sequential number to each row within a partition, ordered by the specified column(s).
Syntax:
SELECT column1, column2,
ROW_NUMBER() OVER (PARTITION BY columnA ORDER BY columnB) AS row_num
FROM table_name;
Example: Assign a row number to each employee within their department based on salary.
SELECT employee_id, department, salary,
ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) AS row_num
FROM employees;
Key Points:
2. RANK()
Definition: Assigns a rank to each row within a partition based on the ordering. Tied values receive the same rank, and the next rank(s) are skipped.
Syntax:
SELECT column1, column2,
RANK() OVER (PARTITION BY columnA ORDER BY columnB) AS ranking
FROM table_name;
Example: Rank employees by salary within departments.
SELECT employee_id, department, salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees;
Key Points:
3. LEAD()
Definition: Accesses the value from the next row in the result set, relative to the current row.
Syntax:
SELECT column1, column2,
LEAD(column_to_look_ahead) OVER (ORDER BY columnB) AS next_value
FROM table_name;
Example: Compare an employee’s salary to the one just below.
SELECT employee_id, salary,
LEAD(salary) OVER (ORDER BY salary DESC) AS next_salary
FROM employees;
Key Points:
4. LAG()
Definition: Accesses the value from the previous row in the result set, relative to the current row.
Syntax:
SELECT column1, column2,
LAG(column_to_look_behind) OVER (ORDER BY columnB) AS previous_value
FROM table_name;
Example: Compare current and previous salary.
SELECT employee_id, salary,
LAG(salary) OVER (ORDER BY salary DESC) AS previous_salary
FROM employees;
Key Points:
Summary Table:
UNION vs UNION ALL
Combining result sets from multiple SELECT queries is a common requirement in SQL. The UNION and UNION ALL operators serve this purpose — but with key differences in performance, behavior, and result handling.
UNION
Definition: UNION combines the result sets of two or more SELECT statements and removes duplicate rows from the final output.
Syntax:
SELECT column1, column2
FROM table1
UNION
SELECT column1, column2
FROM table2;
Example: Get a unique list of customer emails from both online_customers and retail_customers.
SELECT email FROM online_customers
UNION
SELECT email FROM retail_customers;
Key Points:
UNION ALL
Definition: UNION ALL combines result sets from two or more SELECT statements and retains all rows, including duplicates.
Syntax:
SELECT column1, column2
FROM table1
UNION ALL
SELECT column1, column2
FROM table2;
Example: Get all emails (including duplicates) from online_customers and retail_customers.
SELECT email FROM online_customers
UNION ALL
SELECT email FROM retail_customers;
Key Points:
Use Case Comparison:
UNION vs UNION ALL – Feature Comparison:
Real-Life Scenario Example
Scenario: You are analyzing all the customers who interacted via app or website.
Using UNION (for unique users):
SELECT customer_id FROM app_logins
UNION
SELECT customer_id FROM web_logins;
Using UNION ALL (for total login activity count):
SELECT customer_id FROM app_logins
UNION ALL
SELECT customer_id FROM web_logins;
Best Practices
CASE Statements in SQL
Definition:
The CASE expression allows you to implement conditional logic in SQL — similar to IF-ELSE statements in programming. It is often used to transform data, derive new columns, or apply logic-based categorization within queries.
Syntax:
CASE
WHEN condition1 THEN result1
WHEN condition2 THEN result2
...
ELSE default_result
END
Example 1: Categorizing Salary Levels
SELECT employee_id, salary,
CASE
WHEN salary < 40000 THEN 'Low'
WHEN salary BETWEEN 40000 AND 70000 THEN 'Medium'
ELSE 'High'
END AS salary_level
FROM employees;
Purpose: Classifies employees into salary bands based on their income.
Example 2: Flagging Order Status
SELECT order_id, delivery_date, CASE WHEN delivery_date IS NULL THEN 'Pending' WHEN delivery_date <= CURRENT_DATE THEN 'Delivered' ELSE 'In Transit' END AS status FROM orders;
Purpose: Derives a custom column to show human-readable order status.
Types of CASE
1. Simple CASE
Compares a single expression to multiple values.
CASE department
WHEN 'HR' THEN 'Human Resources'
WHEN 'IT' THEN 'Technology'
ELSE 'Other'
END
2. Searched CASE
Evaluates multiple logical conditions (most common form).
CASE
WHEN salary < 40000 THEN 'Low'
WHEN salary BETWEEN 40000 AND 70000 THEN 'Medium'
ELSE 'High'
END
Use Cases
Example 3: Conditional Aggregation
SELECT department,
COUNT(*) AS total_employees,
SUM(CASE WHEN gender = 'Female' THEN 1 ELSE 0 END) AS female_count
FROM employees
GROUP BY department;
Purpose: Aggregate conditionally to count only female employees per department.
Best Practices
✅ Enjoyed this Edition?
📩 Follow & Subscribe
To never miss an update, subscribe to DataTalks: SQL Edition for practical, real-world SQL learning — from basics to advanced.
🔗 Follow Ashish Zope 📬 Subscribe to this newsletter for more hands-on examples and insights.