mysql 死锁原因（MySQL锁等待与死锁问题分析）

时间：2021-10-15 00:09:39类别：数据库

mysql 死锁原因

MySQL锁等待与死锁问题分析

前言：

在 mysql 运维过程中，锁等待和死锁问题是令各位 dba 及开发同学非常头痛的事。出现此类问题会造成业务回滚、卡顿等故障，特别是业务繁忙的系统，出现死锁问题后影响会更严重。本篇文章我们一起来学习下什么是锁等待及死锁，出现此类问题又应该如何分析处理呢？

1.了解锁等待与死锁

出现锁等待或死锁的原因是访问数据库需要加锁，那你可能要问了，为啥要加锁呢？原因是为了确保并发更新场景下的数据正确性，保证数据库事务的隔离性。

试想一个场景，如果你要去图书馆借一本《高性能mysql》，为了防止有人提前把这本书借走，你可以提前进行预约（加锁），这把锁可以怎么加？

封锁图书馆（数据库级别的锁）
把数据库相关的书都锁住（表级别的锁）
只锁 mysql 相关的书（页级别的锁）
只锁《高性能mysql》这本书（行级别的锁）

锁的粒度越细，并发级别越高，实现也更复杂。

锁等待也可称为事务等待，后执行的事务等待前面处理的事务释放锁，但是等待时间超过了 mysql 的锁等待时间，就会引发这个异常。等待超时后的报错为“lock wait timeout exceeded...”。

死锁发生的原因是两个事务互相等待对方释放相同资源的锁，从而造成的死循环。产生死锁后会立即报错“deadlock found when trying to get lock...”。

2.现象复现及处理

下面我们以 mysql 5.7.23 版本为例（隔离级别是 rr ），来复现下上述两种异常现象。

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41 mysql> show create table test_tb\g

*************************** 1. row ***************************

table: test_tb

create table: create table `test_tb` (

`id` int(11) not null auto_increment,

`col1` varchar(50) not null default '',

`col2` int(11) not null default '1',

`col3` varchar(20) not null default '',

primary key (`id`),

key `idx_col1` (`col1`)

) engine=innodb auto_increment=4 default charset=utf8

1 row in set (0.00 sec)

mysql> select * from test_tb;

+----+------+------+------+

| id | col1 | col2 | col3 |

+----+------+------+------+

| 1 | fdg | 1 | abc |

| 2 | a | 2 | fg |

| 3 | ghrv | 2 | rhdv |

+----+------+------+------+

3 rows in set (0.00 sec)

# 事务一首先执行

mysql> begin;

query ok, 0 rows affected (0.00 sec)

mysql> select * from test_tb where col1 = 'a' for update;

+----+------+------+------+

| id | col1 | col2 | col3 |

+----+------+------+------+

| 2 | a | 2 | fg |

+----+------+------+------+

1 row in set (0.00 sec)

# 事务二然后执行

mysql> begin;

query ok, 0 rows affected (0.01 sec)

mysql> update test_tb set col2 = 1 where col1 = 'a';

error 1205 (hy000): lock wait timeout exceeded; try restarting transaction

出现上种异常的原因是事务二在等待事务一的行锁，但事务一一直没提交，等待超时而报错。innodb 行锁等待超时时间由 innodb_lock_wait_timeout 参数控制，此参数默认值为 50 ，单位为秒，即默认情况下，事务二会等待 50s ，若仍拿不到行锁则会报等待超时异常并回滚此条语句。

对于 5.7 版本，出现锁等待时，我们可以查看 information_schema 中的几张系统表来查询事务状态。

innodb_trx 当前运行的所有事务。
innodb_locks 当前出现的锁。
innodb_lock_waits 锁等待的对应关系

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100 # 锁等待发生时查看innodb_trx表可以看到所有事务

# trx_state值为lock wait 则代表该事务处于等待状态

mysql> select * from information_schema.innodb_trx\g

*************************** 1. row ***************************

trx_id: 38511

trx_state: lock wait

trx_started: 2021-03-24 17:20:43

trx_requested_lock_id: 38511:156:4:2

trx_wait_started: 2021-03-24 17:20:43

trx_weight: 2

trx_mysql_thread_id: 1668447

trx_query: update test_tb set col2 = 1 where col1 = 'a'

trx_operation_state: starting index read

trx_tables_in_use: 1

trx_tables_locked: 1

trx_lock_structs: 2

trx_lock_memory_bytes: 1136

trx_rows_locked: 1

trx_rows_modified: 0

trx_concurrency_tickets: 0

trx_isolation_level: repeatable read

trx_unique_checks: 1

trx_foreign_key_checks: 1

trx_last_foreign_key_error: null

trx_adaptive_hash_latched: 0

trx_adaptive_hash_timeout: 0

trx_is_read_only: 0

trx_autocommit_non_locking: 0

*************************** 2. row ***************************

trx_id: 38510

trx_state: running

trx_started: 2021-03-24 17:18:54

trx_requested_lock_id: null

trx_wait_started: null

trx_weight: 4

trx_mysql_thread_id: 1667530

trx_query: null

trx_operation_state: null

trx_tables_in_use: 0

trx_tables_locked: 1

trx_lock_structs: 4

trx_lock_memory_bytes: 1136

trx_rows_locked: 3

trx_rows_modified: 0

trx_concurrency_tickets: 0

trx_isolation_level: repeatable read

trx_unique_checks: 1

trx_foreign_key_checks: 1

trx_last_foreign_key_error: null

trx_adaptive_hash_latched: 0

trx_adaptive_hash_timeout: 0

trx_is_read_only: 0

trx_autocommit_non_locking: 0

2 rows in set (0.00 sec)

# innodb_trx 字段值含义

trx_id：事务id。

trx_state：事务状态，有以下几种状态：running、lock wait、rolling back 和 committing。

trx_started：事务开始时间。

trx_requested_lock_id：事务当前正在等待锁的标识，可以和 innodb_locks 表 join 以得到更多详细信息。

trx_wait_started：事务开始等待的时间。

trx_weight：事务的权重。

trx_mysql_thread_id：事务线程 id，可以和 processlist 表 join。

trx_query：事务正在执行的 sql 语句。

trx_operation_state：事务当前操作状态。

trx_tables_in_use：当前事务执行的 sql 中使用的表的个数。

trx_tables_locked：当前执行 sql 的行锁数量。

trx_lock_structs：事务保留的锁数量。

trx_isolation_level：当前事务的隔离级别。

# sys.innodb_lock_waits 视图也可看到事务等待状况，且给出了杀链接的sql

mysql> select * from sys.innodb_lock_waits\g

*************************** 1. row ***************************

wait_started: 2021-03-24 17:20:43

wait_age: 00:00:22

wait_age_secs: 22

locked_table: `testdb`.`test_tb`

locked_index: idx_col1

locked_type: record

waiting_trx_id: 38511

waiting_trx_started: 2021-03-24 17:20:43

waiting_trx_age: 00:00:22

waiting_trx_rows_locked: 1

waiting_trx_rows_modified: 0

waiting_pid: 1668447

waiting_query: update test_tb set col2 = 1 where col1 = 'a'

waiting_lock_id: 38511:156:4:2

waiting_lock_mode: x

blocking_trx_id: 38510

blocking_pid: 1667530

blocking_query: null

blocking_lock_id: 38510:156:4:2

blocking_lock_mode: x

blocking_trx_started: 2021-03-24 17:18:54

blocking_trx_age: 00:02:11

blocking_trx_rows_locked: 3

blocking_trx_rows_modified: 0

sql_kill_blocking_query: kill query 1667530

sql_kill_blocking_connection: kill 1667530

sys.innodb_lock_waits 视图整合了事务等待状况，同时给出杀掉堵塞源端的 kill 语句。不过是否要杀掉链接还是需要综合考虑的。

死锁与锁等待稍有不同，我们同样也来简单复现下死锁现象。

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19 # 开启两个事务

# 事务一执行

mysql> update test_tb set col2 = 1 where col1 = 'a';

query ok, 1 row affected (0.00 sec)

rows matched: 1 changed: 1 warnings: 0

# 事务二执行

mysql> update test_tb set col2 = 1 where id = 3;

query ok, 1 row affected (0.00 sec)

rows matched: 1 changed: 1 warnings: 0

# 回到事务一执行回车后此条语句处于锁等待状态

mysql> update test_tb set col1 = 'abcd' where id = 3;

query ok, 1 row affected (5.71 sec)

rows matched: 1 changed: 1 warnings: 0

# 回到事务二再执行此时二者相互等待发生死锁

mysql> update test_tb set col3 = 'gddx' where col1 = 'a';

error 1213 (40001): deadlock found when trying to get lock; try restarting transaction

发生死锁后会选择一个事务进行回滚，想查明死锁原因，可以执行 show engine innodb status 来查看死锁日志，根据死锁日志，结合业务逻辑来进一步定位死锁原因。

在实际应用中，我们要尽量避免死锁现象的发生，可以从以下几个方面入手：

事务尽可能小，不要将复杂逻辑放进一个事务里。
涉及多行记录时，约定不同事务以相同顺序访问。
业务中要及时提交或者回滚事务，可减少死锁产生的概率。
表要有合适的索引。
可尝试将隔离级别改为 rc 。

总结：

本篇文章简单介绍了锁等待及死锁发生的原因，其实真实业务中发生死锁还是很难分析的，需要一定的经验积累。本篇文章只是面向初学者，希望各位对死锁能够有个初印象。

以上就是mysql锁等待与死锁问题分析的详细内容，更多关于mysql锁等待与死锁的资料请关注开心学习网其它相关文章！

原文链接：https://mp.weixin.qq.com/s/dnkM-Biu9lC7RbLRwG1Pyg

标签：

mysql 死锁锁等待

mysql 死锁原因（MySQL锁等待与死锁问题分析）