SQL: Advanced Queries
GROUP BY - Grouping Data
The GROUP BY clause groups rows that have the same values in specified columns, allowing you to perform aggregate calculations on each group.
Basic GROUP BY Syntax
SELECT column1, aggregate_function(column2) FROM table GROUP BY column1;
Example: Count Products by Category
SELECT category, COUNT(*) AS product_count FROM products GROUP BY category;
Output:
+-------------+---------------+ | category | product_count | +-------------+---------------+ | Electronics | 25 | | Books | 48 | | Furniture | 12 | +-------------+---------------+
Multiple Aggregate Functions
-- Get category statistics
SELECT
category,
COUNT(*) AS product_count,
AVG(price) AS avg_price,
MIN(price) AS min_price,
MAX(price) AS max_price,
SUM(stock) AS total_stock
FROM products
GROUP BY category;
GROUP BY Multiple Columns
-- Count products by category and status
SELECT
category,
is_active,
COUNT(*) AS count
FROM products
GROUP BY category, is_active;
Output:
+-------------+-----------+-------+ | category | is_active | count | +-------------+-----------+-------+ | Electronics | 1 | 22 | | Electronics | 0 | 3 | | Books | 1 | 45 | | Books | 0 | 3 | +-------------+-----------+-------+
HAVING - Filtering Groups
The HAVING clause filters groups created by GROUP BY. It's like WHERE but for grouped data.
WHERE vs HAVING
- WHERE: Filters individual rows BEFORE grouping
- HAVING: Filters groups AFTER grouping
- WHERE: Cannot use aggregate functions (COUNT, SUM, AVG)
- HAVING: Can use aggregate functions
Example: Categories with More Than 10 Products
SELECT category, COUNT(*) AS product_count FROM products GROUP BY category HAVING COUNT(*) > 10;
Example: High-Value Categories
-- Categories with average price over $100
SELECT
category,
COUNT(*) AS products,
AVG(price) AS avg_price
FROM products
GROUP BY category
HAVING AVG(price) > 100
ORDER BY avg_price DESC;
Combining WHERE and HAVING
-- Active products, grouped by category, with more than 5 items
SELECT
category,
COUNT(*) AS active_count,
AVG(price) AS avg_price
FROM products
WHERE is_active = 1 -- Filter rows before grouping
GROUP BY category
HAVING COUNT(*) > 5 -- Filter groups after grouping
ORDER BY active_count DESC;
Subqueries (Nested Queries)
A subquery is a query nested inside another query. It can be used in SELECT, FROM, WHERE, or HAVING clauses.
Subquery in WHERE Clause
-- Find products more expensive than the average price SELECT name, price FROM products WHERE price > (SELECT AVG(price) FROM products);
Subquery with IN
-- Find users who have placed orders SELECT username, email FROM users WHERE id IN (SELECT DISTINCT user_id FROM orders);
Subquery in SELECT
-- Show each product with category average price
SELECT
name,
price,
category,
(SELECT AVG(price)
FROM products p2
WHERE p2.category = products.category) AS category_avg_price
FROM products;
Subquery in FROM (Derived Table)
-- Get categories with their stats, then filter
SELECT *
FROM (
SELECT
category,
COUNT(*) AS product_count,
AVG(price) AS avg_price
FROM products
GROUP BY category
) AS category_stats
WHERE avg_price > 50;
EXISTS and NOT EXISTS
Tests whether a subquery returns any rows.
EXISTS Example
-- Find users who have placed at least one order
SELECT username, email
FROM users
WHERE EXISTS (
SELECT 1
FROM orders
WHERE orders.user_id = users.id
);
NOT EXISTS Example
-- Find users who have NEVER placed an order
SELECT username, email
FROM users
WHERE NOT EXISTS (
SELECT 1
FROM orders
WHERE orders.user_id = users.id
);
Performance Tip: EXISTS is often faster than IN for large datasets because it stops searching as soon as it finds one matching row.
UNION - Combining Results
UNION combines results from multiple SELECT statements into one result set.
Basic UNION
-- Combine active and featured products SELECT id, name, 'active' AS type FROM products WHERE is_active = 1 UNION SELECT id, name, 'featured' AS type FROM products WHERE is_featured = 1;
UNION vs UNION ALL
- UNION: Removes duplicate rows (slower)
- UNION ALL: Keeps all rows including duplicates (faster)
-- UNION ALL: Faster, keeps duplicates SELECT name FROM products WHERE category = 'Electronics' UNION ALL SELECT name FROM products WHERE price > 100;
UNION Requirements
- All SELECT statements must have the same number of columns
- Columns must have compatible data types
- Column names come from the first SELECT
CASE Statements
Add conditional logic to your queries with CASE expressions.
Simple CASE
SELECT
name,
price,
CASE
WHEN price < 50 THEN 'Cheap'
WHEN price < 200 THEN 'Moderate'
WHEN price < 500 THEN 'Expensive'
ELSE 'Premium'
END AS price_category
FROM products;
CASE with GROUP BY
-- Count products in each price range
SELECT
CASE
WHEN price < 50 THEN 'Budget'
WHEN price < 200 THEN 'Mid-Range'
ELSE 'Premium'
END AS price_range,
COUNT(*) AS product_count
FROM products
GROUP BY
CASE
WHEN price < 50 THEN 'Budget'
WHEN price < 200 THEN 'Mid-Range'
ELSE 'Premium'
END;
Date and Time Functions
Current Date/Time
SELECT NOW(); -- Current date and time: 2024-01-15 14:30:45 SELECT CURDATE(); -- Current date: 2024-01-15 SELECT CURTIME(); -- Current time: 14:30:45
Date Arithmetic
-- Orders from last 7 days SELECT * FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 7 DAY); -- Orders from last month SELECT * FROM orders WHERE created_at >= DATE_SUB(NOW(), INTERVAL 1 MONTH); -- Future date SELECT DATE_ADD(NOW(), INTERVAL 30 DAY) AS delivery_date;
Date Formatting
SELECT
created_at,
DATE_FORMAT(created_at, '%Y-%m-%d') AS date_only,
DATE_FORMAT(created_at, '%M %d, %Y') AS formatted_date,
DATE_FORMAT(created_at, '%W') AS day_of_week
FROM orders;
Date Extraction
SELECT
YEAR(created_at) AS year,
MONTH(created_at) AS month,
DAY(created_at) AS day,
HOUR(created_at) AS hour
FROM orders;
String Functions
Common String Operations
SELECT
CONCAT(first_name, ' ', last_name) AS full_name,
UPPER(username) AS uppercase_name,
LOWER(email) AS lowercase_email,
LENGTH(description) AS desc_length,
SUBSTRING(name, 1, 10) AS short_name,
TRIM(' text ') AS trimmed,
REPLACE(description, 'old', 'new') AS updated_desc
FROM users;
Mathematical Functions
SELECT
price,
ROUND(price) AS rounded,
ROUND(price, 2) AS two_decimals,
CEIL(price) AS rounded_up,
FLOOR(price) AS rounded_down,
ABS(-50) AS absolute_value,
POWER(2, 3) AS power_result, -- 2^3 = 8
SQRT(16) AS square_root -- 4
FROM products;
Practical Advanced Query Examples
Example 1: Sales Report by Month
SELECT
DATE_FORMAT(created_at, '%Y-%m') AS month,
COUNT(*) AS total_orders,
SUM(total_amount) AS monthly_revenue,
AVG(total_amount) AS avg_order_value
FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 12 MONTH)
GROUP BY DATE_FORMAT(created_at, '%Y-%m')
ORDER BY month DESC;
Example 2: Top 5 Customers by Spending
SELECT
users.username,
users.email,
COUNT(orders.id) AS order_count,
SUM(orders.total_amount) AS total_spent
FROM users
INNER JOIN orders ON users.id = orders.user_id
GROUP BY users.id, users.username, users.email
ORDER BY total_spent DESC
LIMIT 5;
Example 3: Products Below Average Price by Category
SELECT
p1.name,
p1.category,
p1.price,
(SELECT AVG(p2.price)
FROM products p2
WHERE p2.category = p1.category) AS category_avg
FROM products p1
WHERE p1.price < (
SELECT AVG(p2.price)
FROM products p2
WHERE p2.category = p1.category
)
ORDER BY p1.category, p1.price;
Example 4: Customer Purchase Analysis
SELECT
username,
email,
CASE
WHEN total_orders = 0 THEN 'No Orders'
WHEN total_orders < 5 THEN 'Occasional'
WHEN total_orders < 20 THEN 'Regular'
ELSE 'VIP'
END AS customer_tier,
total_orders,
total_spent
FROM (
SELECT
u.username,
u.email,
COUNT(o.id) AS total_orders,
COALESCE(SUM(o.total_amount), 0) AS total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.username, u.email
) AS customer_stats
ORDER BY total_spent DESC;
Example 5: Inventory Alert System
SELECT
name,
stock,
CASE
WHEN stock = 0 THEN 'OUT OF STOCK'
WHEN stock < 5 THEN 'CRITICAL'
WHEN stock < 20 THEN 'LOW'
ELSE 'OK'
END AS stock_status,
price * stock AS inventory_value
FROM products
WHERE is_active = 1
HAVING stock_status IN ('OUT OF STOCK', 'CRITICAL', 'LOW')
ORDER BY
CASE stock_status
WHEN 'OUT OF STOCK' THEN 1
WHEN 'CRITICAL' THEN 2
WHEN 'LOW' THEN 3
END;
Exercise: Advanced Analytics Queries
Write SQL queries for the following scenarios:
- Find the top 3 most popular product categories by total sales count
- List users who have spent more than the average customer spending
- Show monthly revenue trends for the past 6 months
- Find products that have never been ordered
- Calculate the percentage of orders in each status (pending, shipped, delivered, cancelled)
- Get the day of the week with the most orders
- List categories where the average product price is above $200
- Find users who placed orders in January but not in February
Query Optimization Tips
Writing Efficient Queries
- Use indexes: Add indexes on columns used in WHERE, JOIN, ORDER BY, GROUP BY
- Select specific columns: Avoid
SELECT * - Limit results: Use LIMIT for testing and pagination
- Avoid functions on indexed columns:
WHERE YEAR(date) = 2024prevents index use - Use EXISTS instead of IN: For large subqueries
- Avoid wildcards at start:
LIKE '%text'can't use indexes - Use EXPLAIN: Analyze query execution plan
-- Check query performance EXPLAIN SELECT * FROM products WHERE category = 'Electronics';
Common Aggregate Functions Summary
| Function | Description | Example |
|---|---|---|
COUNT() |
Count rows | COUNT(*) |
SUM() |
Sum values | SUM(price) |
AVG() |
Average | AVG(rating) |
MIN() |
Minimum value | MIN(price) |
MAX() |
Maximum value | MAX(salary) |
GROUP_CONCAT() |
Concatenate values | GROUP_CONCAT(name) |
What's Next?
In the next lesson, you'll master SQL JOINs - learning how to combine data from multiple related tables using INNER JOIN, LEFT JOIN, RIGHT JOIN, and more advanced joining techniques.