条件过滤与聚合查询
条件过滤与聚合查询
只查询”全部数据”几乎没有任何实际意义。真实业务场景中,我们总是需要按条件过滤、排序结果、限制数量,以及对数据进行分组统计。MySQL 8.x 提供了丰富的子句来满足这些需求,本篇逐一讲解。
环境准备
使用前文的 shop.users 表演示:
USE shop;
-- 扩充测试数据
INSERT INTO users (username, email, age, city) VALUES
('alice', 'alice@example.com', 28, 'Beijing'),
('bob', 'bob@example.com', 22, 'Shanghai'),
('charlie', 'charlie@example.com', 35, 'Beijing'),
('david', 'david@example.com', 30, 'Guangzhou'),
('emma', 'emma@example.com', 25, 'Shanghai'),
('frank', 'frank@example.com', 40, 'Beijing'),
('grace', 'grace@example.com', 19, 'Chengdu'),
('henry', 'henry@example.com', 33, 'Shanghai'),
('ivy', 'ivy@example.com', 27, 'Chengdu'),
('jack', 'jack@example.com', 31, 'Beijing');
-- 创建订单表用于聚合演示
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT,
amount DECIMAL(10, 2),
status VARCHAR(20),
created_at DATE
);
INSERT INTO orders (user_id, amount, status, created_at) VALUES
(1, 150.00, 'completed', '2026-01-05'),
(1, 320.50, 'completed', '2026-02-12'),
(2, 89.00, 'pending', '2026-02-20'),
(3, 450.00, 'completed', '2026-01-18'),
(3, 75.00, 'refunded', '2026-02-25'),
(4, 200.00, 'completed', '2026-03-01'),
(5, 130.00, 'pending', '2026-03-10');WHERE 条件过滤
WHERE 子句在数据读取后立即过滤,只保留满足条件的行。
比较运算符
-- 基础比较
SELECT * FROM users WHERE age > 25; -- 大于
SELECT * FROM users WHERE age >= 25; -- 大于等于
SELECT * FROM users WHERE city = 'Beijing'; -- 等于(字符串需加引号)
SELECT * FROM users WHERE city <> 'Beijing'; -- 不等于(也可用 !=)
SELECT * FROM users WHERE age BETWEEN 20 AND 30; -- 闭区间逻辑运算符
-- AND:同时满足多个条件
SELECT * FROM users WHERE age > 25 AND city = 'Beijing';
-- OR:满足任一条件(注意优先级,复杂条件用括号包裹)
SELECT * FROM users WHERE city = 'Beijing' OR city = 'Shanghai';
-- NOT:取反
SELECT * FROM users WHERE NOT city = 'Beijing';
-- 组合:先 OR 后 AND(括号决定优先级)
SELECT * FROM users
WHERE (city = 'Beijing' OR city = 'Shanghai') AND age > 30;IN 与 NOT IN
-- IN:匹配列表中的任意值(替代多个 OR)
SELECT * FROM users WHERE city IN ('Beijing', 'Shanghai', 'Guangzhou');
-- NOT IN:排除列表中的值
SELECT * FROM users WHERE city NOT IN ('Beijing', 'Shanghai');LIKE 模糊匹配
-- %:匹配任意长度字符
SELECT * FROM users WHERE email LIKE '%@example.com'; -- 以 @example.com 结尾
SELECT * FROM users WHERE username LIKE 'a%'; -- 以 a 开头
-- _:匹配单个字符
SELECT * FROM users WHERE username LIKE 'j___'; -- 恰好 4 个字符,以 j 开头
-- LIKE 不区分大小写(默认 case-insensitive)
SELECT * FROM users WHERE username LIKE '%ICE%'; -- 匹配 alice、ICE...性能提示:
LIKE '前缀%'可以利用索引(如LIKE 'Beijing%'),但LIKE '%keyword%'或LIKE '%suffix'无法使用索引,不适合大表。
NULL 判断
判断空值必须使用 IS NULL,不能用 = NULL:
-- 正确:IS NULL / IS NOT NULL
SELECT * FROM users WHERE email IS NULL;
SELECT * FROM users WHERE email IS NOT NULL;
-- 错误(永远返回空):= NULL 不会匹配任何行
-- SELECT * FROM users WHERE email = NULL; -- 不要这样写ORDER BY 排序
ORDER BY 在所有过滤之后对结果集排序:
-- 单列升序(ASC,默认值,可省略)
SELECT * FROM users ORDER BY age;
-- 单列降序(DESC)
SELECT * FROM users ORDER BY age DESC;
-- 多列排序:先按第一列排序,相同时按第二列排序
SELECT * FROM users ORDER BY city ASC, age DESC;
-- 按表达式排序:按年龄距离 25 岁的偏差排序
SELECT * FROM users ORDER BY ABS(age - 25);
-- 按列位置排序(不推荐,可读性差)
SELECT username, age, city FROM users ORDER BY 2 DESC; -- 按第 2 列(age)降序LIMIT 分页
-- 前 3 条
SELECT * FROM users ORDER BY age DESC LIMIT 3;
-- 标准分页:第 N 页,每页 M 条
-- 第 1 页:LIMIT 0, 10
-- 第 2 页:LIMIT 10, 10
-- 第 3 页:LIMIT 20, 10
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 20;
-- 分页常配合 ORDER BY 使用,确保结果稳定
SELECT * FROM users ORDER BY id LIMIT 10 OFFSET 10;深度分页问题:在大表中
OFFSET N会跳过前 N 行(MySQL 先读取并丢弃),当 N 很大时性能急剧下降。生产环境建议使用游标分页(基于 ID > last_id 过滤)替代 OFFSET 分页。
GROUP BY 分组
GROUP BY 将行按一列或多列的值分组,常与聚合函数配合使用:
-- 按城市分组,统计每个城市的用户数量
SELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city;
-- 结果:
-- city | user_count
-- ------------+---------
-- Beijing | 4
-- Shanghai | 3
-- Guangzhou | 1
-- Chengdu | 2
-- 分组 + 多聚合函数
SELECT
city,
COUNT(*) AS user_count,
AVG(age) AS avg_age,
MIN(age) AS youngest,
MAX(age) AS oldest,
SUM(age) AS total_age
FROM users
GROUP BY city;MySQL 8.x 特性:
ONLY_FULL_GROUP_BY模式默认开启,SELECT列表中只能出现分组列和聚合函数,避免语义不明确。
常用聚合函数
-- COUNT:计数
SELECT COUNT(*) FROM users; -- 总行数
SELECT COUNT(email) FROM users; -- 非空 email 数量
SELECT COUNT(DISTINCT city) FROM users; -- 不同城市的数量
-- AVG / SUM / MIN / MAX:数值聚合
SELECT
AVG(amount) AS avg_order_amount,
SUM(amount) AS total_amount,
MIN(amount) AS min_order,
MAX(amount) AS max_order
FROM orders WHERE status = 'completed';HAVING 过滤分组
WHERE 过滤行,HAVING 过滤分组。两者的执行顺序不同:
-- WHERE 在 GROUP BY 之前执行(过滤原始行)
-- HAVING 在 GROUP BY 之后执行(过滤分组结果)
-- 找出用户数 >= 3 的城市
SELECT city, COUNT(*) AS user_count
FROM users
GROUP BY city
HAVING user_count >= 3;
-- 找出平均订单金额 > 200 的已完成订单
SELECT user_id, AVG(amount) AS avg_amount
FROM orders
WHERE status = 'completed'
GROUP BY user_id
HAVING avg_amount > 200
ORDER BY avg_amount DESC;完整查询示例
将所有子句组合起来:
-- 需求:统计各城市 25 岁以上用户的订单总额,只显示总额超过 300 的城市,按总额降序
SELECT
u.city AS 城市,
COUNT(DISTINCT u.id) AS 用户数,
SUM(o.amount) AS 订单总额,
ROUND(AVG(o.amount), 2) AS 平均订单金额
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.age > 25 AND o.status = 'completed' -- WHERE 过滤原始行
GROUP BY u.city -- 按城市分组
HAVING SUM(o.amount) > 300 -- HAVING 过滤分组
ORDER BY SUM(o.amount) DESC -- 按总额降序
LIMIT 10; -- 最多 10 条执行顺序:FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT
总结
| 子句 | 作用 | 执行时机 |
|---|---|---|
WHERE | 过滤行(原始数据) | GROUP BY 之前 |
GROUP BY | 分组 | WHERE 之后 |
HAVING | 过滤分组(聚合结果) | GROUP BY 之后 |
ORDER BY | 排序 | 最后 |
LIMIT | 限制行数 | 最后 |
下一篇文章我们将进入进阶技能部分,学习 多表查询与子查询,掌握 JOIN、UNION、EXISTS 等高级查询技巧。
评论
Written by
AI-Writer
Related Articles
MySQL 安装与配置
在 macOS、Linux、Windows 上安装 MySQL 8.x、启动服务、配置连接,并介绍常用客户端工具 MySQL Workbench、DBeaver 与命令行 CLI 的使用方法
Read More