org.springframework.dao.DeadlockLoserDataAccessException: ### Error updating database. Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction ### The error may exist in com/atome/ops/mapper/CampaignJobInstanceMapper.java (best guess) ### The error may involve com.atome.ops.mapper.CampaignJobInstanceMapper.update-Inline ### The error occurred while setting parameters ### SQL: update `campaign_job_instance` SET `total_count`=?, `status`=?, `fetch_start_time`=?, `fetch_end_time`=?, `send_start_time`=?, `send_end_time`=?, `update_timestamp`=? where id=? ### Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction ; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
主键更新死锁,why?
## 执行以下语句查看最近一次死锁日志 show engine innodb status; ... ------------------------ LATEST DETECTED DEADLOCK ------------------------ 2022-02-07 21:00:00 0x7f3972be1700 *** (1) TRANSACTION: TRANSACTION 401368548, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 LOCK WAIT 4 lock struct(s), heap size 1136, 4 row lock(s), undo log entries 3 MySQL thread id 24621163, OS thread handle 139885409634048, query id 1074586194 172.21.32.245 root Updating update `campaign_job_instance` SET `total_count`=6, `status`='SENDING', `fetch_start_time`=1644238560063, `fetch_end_time`=1644238560115, `send_start_time`=1644238800550, `send_end_time`=null, `update_timestamp`=1644238800550 where id=5193 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 10003 page no 263 n bits 104 index PRIMARY of table `user_label`.`campaign_job_instance` trx id 401368548 lock_mode X locks rec but not gap waiting Record lock, heap no 35 PHYSICAL RECORD: n_fields 17; compact format; info bits 0 0: len 4; hex 00001449; asc I;; 1: len 6; hex 000017ec6739; asc g9;; 2: len 7; hex 390002800f1934; asc 9 4;; 3: len 28; hex 6a6f622d363166653865333464636531613630303031346636333166; asc job-61fe8e34dce1a600014f631f;; 4: len 24; hex 363230313136653033333863656530303031326637346337; asc 620116e0338cee00012f74c7;; 5: len 30; hex 3631666538653334646365316136303030313466363331642d3230323230; asc 61fe8e34dce1a600014f631d-20220; (total 37 bytes); 6: len 3; hex 534d53; asc SMS;; 7: len 24; hex 363166653865333464636531613630303031346636333164; asc 61fe8e34dce1a600014f631d;; 8: len 30; hex 66353464623661372d336263662d346361312d613630302d643337616234; asc f54db6a7-3bcf-4ca1-a600-d37ab4; (total 36 bytes); 9: len 4; hex 53454e54; asc SENT;; 10: len 4; hex 80000006; asc ;; 11: len 8; hex 8000017ed4415b3f; asc ~ A[?;; 12: len 8; hex 8000017ed4415b73; asc ~ A[s;; 13: len 8; hex 8000017ed44505ab; asc ~ E ;; 14: len 8; hex 8000017ed44506a1; asc ~ E ;; 15: len 8; hex 8000017ed4415b3f; asc ~ A[?;; 16: len 8; hex 8000017ed44506a1; asc ~ E ;; *** (2) TRANSACTION: TRANSACTION 401368889, ACTIVE 0 sec starting index read mysql tables in use 1, locked 1 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1 MySQL thread id 24621104, OS thread handle 139884714923776, query id 1074586215 172.21.32.245 root Updating update `campaign_job_instance` SET `total_count`=6, `status`='SENT', `fetch_start_time`=1644238560026, `fetch_end_time`=1644238560120, `send_start_time`=1644238800280, `send_end_time`=1644238800554, `update_timestamp`=1644238800554 where id=5192 *** (2) HOLDS THE LOCK(S): RECORD LOCKS space id 10003 page no 263 n bits 104 index PRIMARY of table `user_label`.`campaign_job_instance` trx id 401368889 lock_mode X locks rec but not gap Record lock, heap no 35 PHYSICAL RECORD: n_fields 17; compact format; info bits 0 0: len 4; hex 00001449; asc I;; 1: len 6; hex 000017ec6739; asc g9;; 2: len 7; hex 390002800f1934; asc 9 4;; 3: len 28; hex 6a6f622d363166653865333464636531613630303031346636333166; asc job-61fe8e34dce1a600014f631f;; 4: len 24; hex 363230313136653033333863656530303031326637346337; asc 620116e0338cee00012f74c7;; 5: len 30; hex 3631666538653334646365316136303030313466363331642d3230323230; asc 61fe8e34dce1a600014f631d-20220; (total 37 bytes); 6: len 3; hex 534d53; asc SMS;; 7: len 24; hex 363166653865333464636531613630303031346636333164; asc 61fe8e34dce1a600014f631d;; 8: len 30; hex 66353464623661372d336263662d346361312d613630302d643337616234; asc f54db6a7-3bcf-4ca1-a600-d37ab4; (total 36 bytes); 9: len 4; hex 53454e54; asc SENT;; 10: len 4; hex 80000006; asc ;; 11: len 8; hex 8000017ed4415b3f; asc ~ A[?;; 12: len 8; hex 8000017ed4415b73; asc ~ A[s;; 13: len 8; hex 8000017ed44505ab; asc ~ E ;; 14: len 8; hex 8000017ed44506a1; asc ~ E ;; 15: len 8; hex 8000017ed4415b3f; asc ~ A[?;; 16: len 8; hex 8000017ed44506a1; asc ~ E ;; *** (2) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 10003 page no 263 n bits 104 index PRIMARY of table `user_label`.`campaign_job_instance` trx id 401368889 lock_mode X locks rec but not gap waiting Record lock, heap no 23 PHYSICAL RECORD: n_fields 17; compact format; info bits 0 0: len 4; hex 00001448; asc H;; 1: len 6; hex 000017ec65e4; asc e ;; 2: len 7; hex 0e000002a31da0; asc ;; 3: len 28; hex 6a6f622d363230303832346464636531613630303031346636333261; asc job-6200824ddce1a600014f632a;; 4: len 24; hex 363230313136653033333863656530303031326637346334; asc 620116e0338cee00012f74c4;; 5: len 30; hex 3632303038323464646365316136303030313466363332382d3230323230; asc 6200824ddce1a600014f6328-20220; (total 37 bytes); 6: len 3; hex 534d53; asc SMS;; 7: len 24; hex 363230303832346464636531613630303031346636333238; asc 6200824ddce1a600014f6328;; 8: len 30; hex 33333839633535662d663262612d346630392d393432632d353839633865; asc 3389c55f-f2ba-4f09-942c-589c8e; (total 36 bytes); 9: len 7; hex 53454e44494e47; asc SENDING;; 10: len 4; hex 80000006; asc ;; 11: len 8; hex 8000017ed4415b1a; asc ~ A[ ;; 12: len 8; hex 8000017ed4415b78; asc ~ A[x;; 13: len 8; hex 8000017ed4450699; asc ~ E ;; 14: SQL NULL; 15: len 8; hex 8000017ed4415b1a; asc ~ A[ ;; 16: len 8; hex 8000017ed4450699; asc ~ E ;; *** WE ROLL BACK TRANSACTION (2) ------------ TRANSACTIONS ------------
mysql之show engine innodb status解读
两条不同主键更新的语句相互等待,why?
从上述日志可以看出以下几点:
一个误区:锁定的数据行并不一定是由锁冲突的sql语句引起的,应该关注同一事务中是否有其它语句加锁
class SaveUserJob implements SimpleJob { @Autowired @Transactional CampaignJobInstanceService campaignJobInstanceService @Override void execute(ShardingContext shardingContext) { log.info("save user:${JsonUtil.toString(shardingContext)}") List<CampaignJobInstance> jobInstances = campaignJobInstanceService.findByStatus(CampaignJobInstanceStatusEnum.SAVING) jobInstances.each { jobInstance -> ... updateById(jobInstance) ... } } } class SendJob implements DataflowJob<CampaignUserLog> { ... @Override @Transactional void processData(ShardingContext shardingContext, List<CampaignUserLog> logs) { connectorExecuteService.execute(logs.findAll { it.status == CampaignUserJobStatusEnum.SENDING }) logs.groupBy { it.jobInstanceId }.each { CampaignJobInstance jobInstance = findByInstanceId(it.key) ... updateById(jobInstance) } } ... }
SaveUserJob、SendJob中均涉及到遍历主键更新campaign_job_instance,且均开启事务,两个事务中更新顺序的不通便会造成相互持有对方需要的锁,从而死锁
为什么之前自测时没有触发?
圈选的用户量过少,只有1个,save user的瞬间还没来得及更新状态就发送完成了
把SendJob的频次由2分钟改为了5分钟,会累积多个jobInstance到一个事务中
减小锁粒度
class SaveUserJob implements SimpleJob { @Autowired CampaignJobInstanceService campaignJobInstanceService @Override // 外层仅查询,去除事务 void execute(ShardingContext shardingContext) { log.info("save user:${JsonUtil.toString(shardingContext)}") List<CampaignJobInstance> jobInstances = campaignJobInstanceService.findByStatus(CampaignJobInstanceStatusEnum.SAVING) jobInstances.each { jobInstance -> // 抽取到service层,加上事务 campaignJobInstanceService.saveUser(jobInstance) } } } class SendJob implements DataflowJob<CampaignUserLog> { ... @Override // 外层仅查询,去除事务 void processData(ShardingContext shardingContext, List<CampaignUserLog> logs) { ... logs.groupBy { it.jobInstanceId }.each { ... // 抽取到service层,加上事务 CampaignJobInstance jobInstance = findByInstanceId(it.key) campaignJobInstanceService.checkJobInstance(jobInstance) } } ... }