MySQL部门工资最高的员工-蒲公英云

MySQL部门工资最高的员工

SQL架构
题目描述
题解
- 方法：使用 JOIN 和 IN 语句
- 方法2
- 方法： GROUP BY 语句
知识点
- exists 和 in
- - 1.原理
  - 2.分析
  - 3.总结
  - 4.效率
  - 5.举例说明

SQL架构

Create table If Not Exists Employee (Id int, Name varchar(255), Salary int, DepartmentId int);
Create table If Not Exists Department (Id int, Name varchar(255));
insert into Employee (Id, Name, Salary, DepartmentId) values ('1', 'Joe', '70000', '1');
insert into Employee (Id, Name, Salary, DepartmentId) values ('2', 'Jim', '90000', '1');
insert into Employee (Id, Name, Salary, DepartmentId) values ('3', 'Henry', '80000', '2');
insert into Employee (Id, Name, Salary, DepartmentId) values ('4', 'Sam', '60000', '2');
insert into Employee (Id, Name, Salary, DepartmentId) values ('5', 'Max', '90000', '1');
insert into Department (Id, Name) values ('1', 'IT');
insert into Department (Id, Name) values ('2', 'Sales');

题目描述

Employee 表包含所有员工信息，每个员工有其对应的 Id, salary 和 department Id。

+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Jim   | 90000  | 1            |
| 3  | Henry | 80000  | 2            |
| 4  | Sam   | 60000  | 2            |
| 5  | Max   | 90000  | 1            |
+----+-------+--------+--------------+

Department 表包含公司所有部门的信息。

+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+

编写一个 SQL 查询，找出每个部门工资最高的员工。对于上述表，您的 SQL 查询应返回以下行（行的顺序无关紧要）。

+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| IT         | Jim      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+

解释：

Max 和 Jim 在 IT 部门的工资都是最高的，Henry 在销售部的工资最高。

题解

方法：使用 JOIN 和 IN 语句

算法

因为 Employee 表包含 Salary 和 DepartmentId 字段，我们可以以此在部门内查询最高工资。

SELECT
    DepartmentId, MAX(Salary)
FROM
    Employee
GROUP BY DepartmentId;

注意：有可能有多个员工同时拥有最高工资，所以最好在这个查询中不包含雇员名字的信息。

+--------------+-------------+
| DepartmentId | MAX(Salary) |
+--------------+-------------+
|            1 |       90000 |
|            2 |       80000 |
+--------------+-------------+
2 rows in set (0.00 sec)

然后，我们可以把表 Employee 和 Department 连接，再在这张临时表里用 IN 语句查询部门名字和工资的关系。

SELECT
    Department.name AS 'Department',
    Employee.name AS 'Employee',
    Salary
FROM
    Employee
        JOIN
    Department ON Employee.DepartmentId = Department.Id
WHERE
    (Employee.DepartmentId , Salary) IN
    (   SELECT
            DepartmentId, MAX(Salary)
        FROM
            Employee
        GROUP BY DepartmentId
    );
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Jim      |  90000 |
| Sales      | Henry    |  80000 |
| IT         | Max      |  90000 |
+------------+----------+--------+
3 rows in set (0.00 sec)

方法2

select bb.Name as Department, aa.Employee, aa.Salary
from (
select 
       a.Id,
       a.Name as Employee,
       a.Salary,
       a.DepartmentId
  from Employee a
  join (
        select DepartmentId, max(Salary) as max_sal
          from Employee
         group by DepartmentId
       ) b on a.DepartmentId = b.DepartmentId and a.Salary = b.max_sal
) aa
join Department bb on aa.DepartmentId = bb.Id;
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Jim      |  90000 |
| Sales      | Henry    |  80000 |
| IT         | Max      |  90000 |
+------------+----------+--------+
3 rows in set (0.01 sec)

方法： GROUP BY 语句

SELECT
    Department.NAME AS Department,
    Employee.NAME AS Employee,
    Salary 
FROM
    Employee,
    Department 
WHERE
    Employee.DepartmentId = Department.Id 
    AND ( Employee.DepartmentId, Salary ) 
    IN (SELECT DepartmentId, max( Salary ) 
        FROM Employee 
        GROUP BY DepartmentId );
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Jim      |  90000 |
| Sales      | Henry    |  80000 |
| IT         | Max      |  90000 |
+------------+----------+--------+
3 rows in set (0.00 sec)
SELECT D.Name AS Department,
       E1.Name AS Employee,
       E1.Salary
  FROM Employee AS E1
       INNER JOIN Department AS D
       ON E1.DepartmentId = D.Id
 WHERE NOT EXISTS (SELECT * 
                     FROM Employee AS E2
                    WHERE E1.DepartmentId = E2.DepartmentId
                      AND E1.Salary < E2.Salary);

知识点

exists 和 in

1.原理

通过使用 EXISTS，Oracle会首先检查主查询，然后运行子查询直到它找到第一个匹配项，
这就节省了时间。

Oracle在执行IN子查询时，首先执行子查询，并将获得的结果列表存放在一个加了索引的临时表中。

在执行子查询之前，系统先将主查询挂起，待子查询执行完毕，存放在临时表中以后再执行主查询。

这也就是使用EXISTS比使用IN通常查询速度快的原因。

2.分析

in 是把外表和内表作hash 连接，而exists是对外表作loop循环。

每次loop循环再对内表进行查询not exists:做NLL，对子查询先查，有个虚表，有确定值，所以就算子查询有NULL最终也有值返回not in:做hash，对子查询表建立内存数组，用外表匹配，那子查询要是NULL那外表没的匹配最终无值返回。

一直以来认为exists比in效率高的说法是不准确的。

但是，如果查询的两个表大小相当，那么用in和exists差别不大。

3.总结

外表大，用IN；内表大，用EXISTS。

4.效率

- select * from T1 where exists(select 1 from T2 where T1.a=T2.a) ;
    T1数据量小而T2数据量非常大时，T1<<T2 时，1) 的查询效率高。
- select * from T1 where T1.a in (select T2.a from T2);
    T1数据量非常大而T2数据量小时，T1>>T2 时，2) 的查询效率高。

5.举例说明

例如：表A（小表），表B（大表） 
 1： select * from A where cc in (select cc from B) 
     效率低，用到了A表上cc列的索引； 
    select * from A where exists(select cc from B where cc=A.cc) 
    效率高，用到了B表上cc列的索引。
 2. select * from B where cc in (select cc from A) 
    效率高，用到了B表上cc列的索引； 
    select * from B where exists(select cc from A where cc=B.cc) 
    效率低，用到了A表上cc列的索引。