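RocketMQ version 4.6.0. These are my notes from reading the source code.
A scheduled (delayed) message is one that is not consumed right after it reaches the Broker; consumers can only see it once a specified amount of time has passed. RocketMQ does not support arbitrary delay precision, only a fixed set of delay levels.
When sending, you only need to set a delay level on the message, for example message.setDelayTimeLevel(3). Once the message reaches the Broker, it becomes consumable only after the fixed delay of that level.
The delay levels are as follows: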
private String messageDelayLevel = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";
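From 1s to 2h, these map to levels 1 through 18. Besides delays set explicitly by the producer, messages that fail consumption also go into the delay queues, so the time at which a message is actually delivered depends on the configured delay level and on the retry count.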
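To make the level numbering concrete, here is a minimal sketch of turning the messageDelayLevel string into a level-to-milliseconds table. The class name DelayLevelSketch and the method parse are made up for illustration; this is not the Broker's own parseDelayLevel, only an approximation of what such a table looks like.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of RocketMQ.
final class DelayLevelSketch {
    // Parses a spec such as "1s 5s 10s" into level -> delay in milliseconds.
    // Levels start at 1, in the order the entries appear in the string.
    static Map<Integer, Long> parse(String spec) {
        Map<Integer, Long> table = new LinkedHashMap<>();
        String[] parts = spec.trim().split("\\s+");
        for (int i = 0; i < parts.length; i++) {
            String part = parts[i];
            long unitMillis;
            switch (part.charAt(part.length() - 1)) {
                case 's': unitMillis = 1000L; break;
                case 'm': unitMillis = 60_000L; break;
                case 'h': unitMillis = 3_600_000L; break;
                default: throw new IllegalArgumentException("unknown unit in: " + part);
            }
            long amount = Long.parseLong(part.substring(0, part.length() - 1));
            table.put(i + 1, amount * unitMillis);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<Integer, Long> table =
            parse("1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h");
        System.out.println(table.size());   // 18
        System.out.println(table.get(3));   // 10000
        System.out.println(table.get(18));  // 7200000
    }
}
```

So message.setDelayTimeLevel(3) corresponds to a 10 second delay under the default configuration.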
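The Broker handles the first half of receiving a scheduled message exactly like a normal message. The difference shows up when CommitLog prepares to store the message.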
CommitLog
public PutMessageResult putMessage(final MessageExtBrokerInner msg) {
    // Set the storage time
    msg.setStoreTimestamp(System.currentTimeMillis());
    // Set the message body BODY CRC (consider the most appropriate setting
    // on the client)
    msg.setBodyCRC(UtilAll.crc32(msg.getBody()));
    // Back to Results
    AppendMessageResult result = null;

    StoreStatsService storeStatsService = this.defaultMessageStore.getStoreStatsService();

    String topic = msg.getTopic();
    int queueId = msg.getQueueId();

    final int tranType = MessageSysFlag.getTransactionValue(msg.getSysFlag());
    if (tranType == MessageSysFlag.TRANSACTION_NOT_TYPE
        || tranType == MessageSysFlag.TRANSACTION_COMMIT_TYPE) {
        // Delayed message: switch the topic
        if (msg.getDelayTimeLevel() > 0) {
            // Clamp to the maximum delay level if it is exceeded
            if (msg.getDelayTimeLevel() > this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel()) {
                msg.setDelayTimeLevel(this.defaultMessageStore.getScheduleMessageService().getMaxDelayLevel());
            }

            // The topic becomes SCHEDULE_TOPIC_XXXX
            topic = ScheduleMessageService.SCHEDULE_TOPIC;
            // Queue id = delay level - 1
            queueId = ScheduleMessageService.delayLevel2QueueId(msg.getDelayTimeLevel());

            // Back up the real topic and queue id: REAL_TOPIC is the topic before
            // the delay, REAL_QID is the queueId before the delay
            MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_TOPIC, msg.getTopic());
            MessageAccessor.putProperty(msg, MessageConst.PROPERTY_REAL_QUEUE_ID, String.valueOf(msg.getQueueId()));
            msg.setPropertiesString(MessageDecoder.messageProperties2String(msg.getProperties()));

            // A delayed message is stored under the schedule topic instead
            msg.setTopic(topic);
            msg.setQueueId(queueId);
        }
    }
    // The rest is omitted; it is the same as for a normal message
}
From there the message is handled like any normal message and is written to the CommitLog file.
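The redirect step in putMessage can be boiled down to a few lines. The sketch below uses a made-up Msg class in place of MessageExtBrokerInner and a plain map in place of the message properties; only the topic name SCHEDULE_TOPIC_XXXX, the property keys REAL_TOPIC and REAL_QID, and the rule queue id = delay level - 1 come from the code above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in types; only the constants mirror the excerpt above.
final class DelayRedirectSketch {
    static final String SCHEDULE_TOPIC = "SCHEDULE_TOPIC_XXXX";
    static final String REAL_TOPIC = "REAL_TOPIC";
    static final String REAL_QID = "REAL_QID";

    // Simplified message: just the fields the redirect touches.
    static final class Msg {
        String topic;
        int queueId;
        int delayLevel;
        final Map<String, String> properties = new HashMap<>();

        Msg(String topic, int queueId, int delayLevel) {
            this.topic = topic;
            this.queueId = queueId;
            this.delayLevel = delayLevel;
        }
    }

    // Rewrites a delayed message so it is stored under the schedule topic.
    static void redirect(Msg msg, int maxDelayLevel) {
        if (msg.delayLevel <= 0) {
            return; // not a delayed message, leave it untouched
        }
        if (msg.delayLevel > maxDelayLevel) {
            msg.delayLevel = maxDelayLevel; // clamp to the largest level
        }
        // Back up where the message really belongs.
        msg.properties.put(REAL_TOPIC, msg.topic);
        msg.properties.put(REAL_QID, String.valueOf(msg.queueId));
        // Point the message at the delay queue for its level.
        msg.topic = SCHEDULE_TOPIC;
        msg.queueId = msg.delayLevel - 1;
    }

    public static void main(String[] args) {
        Msg msg = new Msg("OrderTopic", 2, 3);
        redirect(msg, 18);
        System.out.println(msg.topic + " / " + msg.queueId + " / " + msg.properties);
    }
}
```

A message sent to OrderTopic, queue 2, with delay level 3 ends up in SCHEDULE_TOPIC_XXXX, queue 2 (level 3 minus 1), with its original destination kept in the properties.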
After that, a background service called ScheduleMessageService takes over. It is created, loaded, and started together with the message store component.
ScheduleMessageService
public class ScheduleMessageService extends ConfigManager {
    private static final InternalLogger log = InternalLoggerFactory.getLogger(LoggerName.STORE_LOGGER_NAME);

    public static final String SCHEDULE_TOPIC = "SCHEDULE_TOPIC_XXXX";
    private static final long FIRST_DELAY_TIME = 1000L;
    private static final long DELAY_FOR_A_WHILE = 100L;
    private static final long DELAY_FOR_A_PERIOD = 10000L;

    /**
     * Delay level table: key is the delay level, value is the delay in milliseconds
     */
    private final ConcurrentMap<Integer /* level */, Long/* delay timeMillis */> delayLevelTable =
        new ConcurrentHashMap<Integer, Long>(32);

    /**
     * Cached consume progress of the queue for each delay level
     */
    private final ConcurrentMap<Integer /* level */, Long/* offset */> offsetTable =
        new ConcurrentHashMap<Integer, Long>(32);

    // Other fields (started, timer, defaultMessageStore, ...) omitted

    public void start() {
        if (started.compareAndSet(false, true)) {
            this.timer = new Timer("ScheduleMessageTimerThread", true);
            // Schedule one timer task per delay level, first run after 1s
            for (Map.Entry<Integer, Long> entry : this.delayLevelTable.entrySet()) {
                Integer level = entry.getKey();
                Long timeDelay = entry.getValue();
                Long offset = this.offsetTable.get(level);
                if (null == offset) {
                    offset = 0L;
                }

                if (timeDelay != null) {
                    this.timer.schedule(new DeliverDelayedMessageTimerTask(level, offset), FIRST_DELAY_TIME);
                }
            }

            // Persist the delay queues' consume progress periodically:
            // first run after 10s, then every flushDelayOffsetInterval
            this.timer.scheduleAtFixedRate(new TimerTask() {

                @Override
                public void run() {
                    try {
                        if (started.get()) ScheduleMessageService.this.persist();
                    } catch (Throwable e) {
                        log.error("scheduleAtFixedRate flush exception", e);
                    }
                }
            }, 10000, this.defaultMessageStore.getMessageStoreConfig().getFlushDelayOffsetInterval());
        }
    }

    public boolean load() {
        // Load the consume progress into offsetTable
        boolean result = super.load();
        // Build the delay level table
        result = result && this.parseDelayLevel();
        return result;
    }
}
ScheduleMessageService schedules one task per delay level. As the code shows, the implementation class is DeliverDelayedMessageTimerTask.
/**
 * Timer task implementation for one delay level
 */
class DeliverDelayedMessageTimerTask extends TimerTask {
    private final int delayLevel;
    private final long offset;

    public DeliverDelayedMessageTimerTask(int delayLevel, long offset) {
        this.delayLevel = delayLevel;
        this.offset = offset;
    }

    @Override
    public void run() {
        try {
            if (isStarted()) {
                this.executeOnTimeup();
            }
        } catch (Exception e) {
            // XXX: warn and notify me
            log.error("ScheduleMessageService, executeOnTimeup exception", e);
            ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(
                this.delayLevel, this.offset), DELAY_FOR_A_PERIOD);
        }
    }

    public void executeOnTimeup() {
        // Look up the consume queue for this delay level
        ConsumeQueue cq =
            ScheduleMessageService.this.defaultMessageStore.findConsumeQueue(SCHEDULE_TOPIC,
                delayLevel2QueueId(delayLevel));

        long failScheduleOffset = offset;

        if (cq != null) {
            // Starting at offset, get all valid entries currently in the consume queue
            SelectMappedBufferResult bufferCQ = cq.getIndexBuffer(this.offset);
            if (bufferCQ != null) {
                try {
                    long nextOffset = offset;
                    int i = 0;
                    ConsumeQueueExt.CqExtUnit cqExtUnit = new ConsumeQueueExt.CqExtUnit();
                    // Each ConsumeQueue entry is 20 bytes; decode the entries one by one
                    // starting from the current consume progress
                    for (; i < bufferCQ.getSize(); i += ConsumeQueue.CQ_STORE_UNIT_SIZE) {
                        long offsetPy = bufferCQ.getByteBuffer().getLong();
                        int sizePy = bufferCQ.getByteBuffer().getInt();
                        long tagsCode = bufferCQ.getByteBuffer().getLong();

                        if (cq.isExtAddr(tagsCode)) {
                            if (cq.getExt(tagsCode, cqExtUnit)) {
                                tagsCode = cqExtUnit.getTagsCode();
                            } else {
                                //can't find ext content.So re compute tags code.
                                log.error("[BUG] can't find consume queue extend file content!addr={}, offsetPy={}, sizePy={}",
                                    tagsCode, offsetPy, sizePy);
                                long msgStoreTime = defaultMessageStore.getCommitLog().pickupStoreTimestamp(offsetPy, sizePy);
                                tagsCode = computeDeliverTimestamp(delayLevel, msgStoreTime);
                            }
                        }

                        long now = System.currentTimeMillis();
                        long deliverTimestamp = this.correctDeliverTimestamp(now, tagsCode);

                        // Progress to resume from on the next pull
                        nextOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);

                        long countdown = deliverTimestamp - now;

                        // Is the delayed message due? If so, fetch it from the commitLog
                        if (countdown <= 0) {
                            // Load the full message from the commitLog file by offset and size
                            MessageExt msgExt =
                                ScheduleMessageService.this.defaultMessageStore.lookMessageByOffset(
                                    offsetPy, sizePy);

                            if (msgExt != null) {
                                try {
                                    // Build a new message, clear the delay level property and
                                    // restore the original topic and queue.
                                    // For a normal message these are the topic and queueId the producer sent to.
                                    // For a retry message these are the retry topic and queueId; the topic is
                                    // restored to the message's own topic before consumption.
                                    MessageExtBrokerInner msgInner = this.messageTimeup(msgExt);
                                    // Store the new message in the commitLog so it is dispatched to
                                    // the original topic queue for consumers
                                    PutMessageResult putMessageResult =
                                        ScheduleMessageService.this.writeMessageStore
                                            .putMessage(msgInner);

                                    if (putMessageResult != null
                                        && putMessageResult.getPutMessageStatus() == PutMessageStatus.PUT_OK) {
                                        continue;
                                    } else {
                                        // XXX: warn and notify me
                                        log.error(
                                            "ScheduleMessageService, a message time up, but reput it failed, topic: {} msgId {}",
                                            msgExt.getTopic(), msgExt.getMsgId());
                                        ScheduleMessageService.this.timer.schedule(
                                            new DeliverDelayedMessageTimerTask(this.delayLevel,
                                                nextOffset), DELAY_FOR_A_PERIOD);
                                        ScheduleMessageService.this.updateOffset(this.delayLevel,
                                            nextOffset);
                                        return;
                                    }
                                } catch (Exception e) {
                                    /*
                                     * XXX: warn and notify me
                                     */
                                    log.error(
                                        "ScheduleMessageService, messageTimeup execute error, drop it. msgExt="
                                            + msgExt + ", nextOffset=" + nextOffset + ",offsetPy="
                                            + offsetPy + ",sizePy=" + sizePy, e);
                                }
                            }
                        } else {
                            // Not due yet: schedule a new task to run after the remaining countdown
                            ScheduleMessageService.this.timer.schedule(
                                new DeliverDelayedMessageTimerTask(this.delayLevel, nextOffset),
                                countdown);
                            // Record the progress for the entries that were already due
                            ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);
                            // Entries in a level's queue are ordered by delivery time, so if this one
                            // is not due the later ones are not due either; stop scanning and return
                            return;
                        }
                    } // end of for

                    nextOffset = offset + (i / ConsumeQueue.CQ_STORE_UNIT_SIZE);
                    ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(
                        this.delayLevel, nextOffset), DELAY_FOR_A_WHILE);
                    // Update the consume progress of this delay level
                    ScheduleMessageService.this.updateOffset(this.delayLevel, nextOffset);
                    return;
                } finally {

                    bufferCQ.release();
                }
            } // end of if (bufferCQ != null)
            else {
                // Get the minimum offset in the queue. If it is larger than the current
                // offset, the current offset is invalid, so the next pull starts from
                // the minimum offset instead
                long cqMinOffset = cq.getMinOffsetInQueue();
                if (offset < cqMinOffset) {
                    failScheduleOffset = cqMinOffset;
                    log.error("schedule CQ offset invalid. offset=" + offset + ", cqMinOffset="
                        + cqMinOffset + ", queueId=" + cq.getQueueId());
                }
            }
        } // end of if (cq != null)

        // No consume queue, or nothing readable at this offset: there is nothing to
        // deliver for this level yet, so schedule a new task to retry after 100ms
        ScheduleMessageService.this.timer.schedule(new DeliverDelayedMessageTimerTask(this.delayLevel,
            failScheduleOffset), DELAY_FOR_A_WHILE);
    }
}
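Step1: Find the consume queue for the delay level. If none is found, skip this run and schedule a new task.
Step2: Using offset, fetch bufferCQ, which holds all valid entries currently in the consume queue from that offset on. If nothing is returned, check whether offset is smaller than the queue's minimum offset. If it is, replace it, so that the newly scheduled task starts from the minimum offset.
Step3: Each ConsumeQueue entry is 20 bytes. Loop over the bufferCQ from Step2 and decode each entry's physical offset, message size, and tagsCode, in preparation for loading the message from CommitLog. For the schedule topic the tagsCode field is not used as a tag hash; the code passes it to correctDeliverTimestamp(now, tagsCode) as the delivery timestamp.
Step4: Check whether the delayed message is due. If it is, load the full message from the commitLog file using the physical offset and message size. If it is not, schedule a new task that runs after the remaining time (countdown), update the consume progress for the entries that were already due, and return. When that task fires, it goes through the same steps.
Step5: Build a new message object from the loaded message, clear its delay level, and restore the original topic and consume queue.
Step6: Store the new message in the commitLog so that it is dispatched to the original topic queue for consumers.
Step7: Update the pull progress of the delay queue.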
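The scanning part of these steps (Step3, Step4 and Step7) can be sketched with nothing but a ByteBuffer. This is an illustrative sketch, not the Broker code: DelayQueueScanSketch, ScanResult, scan and pack are made-up names, the 20-byte layout (8-byte offset, 4-byte size, 8-byte tagsCode holding the delivery timestamp) follows the loop in executeOnTimeup, and actual delivery is replaced by collecting the commit log offsets of the due entries.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the delay queue scan; not part of RocketMQ.
final class DelayQueueScanSketch {
    // One entry: 8-byte commit log offset + 4-byte size + 8-byte tagsCode.
    static final int CQ_STORE_UNIT_SIZE = 20;

    // What a scan decides: which entries are due, where to resume, when to run again.
    static final class ScanResult {
        final List<Long> dueOffsets = new ArrayList<>();
        long nextOffset;
        long nextDelayMillis;
    }

    static ScanResult scan(ByteBuffer buf, long startOffset, long now, long idleDelayMillis) {
        ScanResult result = new ScanResult();
        int total = buf.remaining();
        int i = 0;
        for (; i < total; i += CQ_STORE_UNIT_SIZE) {
            long offsetPy = buf.getLong();
            int sizePy = buf.getInt(); // would be needed to read the message; unused here
            long deliverTimestamp = buf.getLong();
            long countdown = deliverTimestamp - now;
            if (countdown > 0) {
                // First entry that is not due: resume from it after the countdown.
                result.nextOffset = startOffset + i / CQ_STORE_UNIT_SIZE;
                result.nextDelayMillis = countdown;
                return result;
            }
            result.dueOffsets.add(offsetPy);
        }
        // Everything in the buffer was due: move past it and poll again shortly.
        result.nextOffset = startOffset + i / CQ_STORE_UNIT_SIZE;
        result.nextDelayMillis = idleDelayMillis;
        return result;
    }

    // Packs {offsetPy, sizePy, deliverTimestamp} rows into a buffer ready for reading.
    static ByteBuffer pack(long[]... rows) {
        ByteBuffer buf = ByteBuffer.allocate(rows.length * CQ_STORE_UNIT_SIZE);
        for (long[] row : rows) {
            buf.putLong(row[0]).putInt((int) row[1]).putLong(row[2]);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = pack(
            new long[]{0, 100, 100},
            new long[]{100, 100, 200},
            new long[]{200, 100, 900});
        ScanResult r = scan(buf, 10, 500, 100);
        // prints: [0, 100] next=12 delay=400
        System.out.println(r.dueOffsets + " next=" + r.nextOffset + " delay=" + r.nextDelayMillis);
    }
}
```

With the clock at 500, the first two entries are due and the third is not, so the scan stops at queue offset 12 and asks to be run again in 400 ms, which mirrors the countdown branch of the real task.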
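Summary: delayed messages have a dedicated topic, SCHEDULE_TOPIC_XXXX, and every delayed message is stored under it. The number of queues in that topic equals the number of configured delay levels, with queue id = delay level - 1. Each delay level has its own timer task. When a message is sent, the Broker saves the message's original topic and queue id into the message properties, then changes the topic and queue so that the message lands in the consume queue of the matching delay queue.
The timer task of each level pulls entries from its delay queue according to that level's consume progress, loads the full message from the commitLog file, clears the delay level property, and restores the original topic and queue. It then stores a new message in the commitLog, which is dispatched to the original consume queue for consumers.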
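The restore step at the end of that flow is the mirror image of the redirect done at store time. Below is a minimal sketch with made-up types: Msg and restore are hypothetical, the property keys REAL_TOPIC and REAL_QID come from the comments in the putMessage excerpt, and the key "DELAY" for the delay level property is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of rebuilding the message once its delay has elapsed.
final class DelayRestoreSketch {
    static final String REAL_TOPIC = "REAL_TOPIC";
    static final String REAL_QID = "REAL_QID";
    static final String DELAY = "DELAY"; // assumed key of the delay level property

    // Simplified immutable view of a stored message.
    static final class Msg {
        final String topic;
        final int queueId;
        final Map<String, String> properties;

        Msg(String topic, int queueId, Map<String, String> properties) {
            this.topic = topic;
            this.queueId = queueId;
            this.properties = properties;
        }
    }

    // Builds a new message aimed at the original topic and queue, without a delay level.
    static Msg restore(Msg stored) {
        Map<String, String> props = new HashMap<>(stored.properties);
        props.remove(DELAY); // otherwise the message would be delayed again
        String realTopic = props.get(REAL_TOPIC);
        int realQueueId = Integer.parseInt(props.get(REAL_QID));
        return new Msg(realTopic, realQueueId, props);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(REAL_TOPIC, "OrderTopic");
        props.put(REAL_QID, "2");
        props.put(DELAY, "3");
        Msg restored = restore(new Msg("SCHEDULE_TOPIC_XXXX", 2, props));
        System.out.println(restored.topic + " / " + restored.queueId);
    }
}
```

Clearing the delay level matters: if it stayed set, storing the rebuilt message would send it straight back into the delay queue.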