MySQL 的用户群体很大,为了能够增强数据的实时性,很多解决方案会利用 binlog 将数据写入到 ClickHouse。为了能够监听 binlog 事件,我们需要用到类似 canal 这样的第三方中间件,这无疑增加了系统的复杂度。
ClickHouse 20.8.2.3版本新增加了 MaterializeMySQL 的 database 引擎,该 database 能映射到 MySQL 中的某个 database,并自动在 ClickHouse 中创建对应的 ReplacingMergeTree。ClickHouse服务做为MySQL副本,读取Binlog并执行DDL和DML请求,实现了基于MySQL Binlog机制的业务数据库实时同步功能。
ClickHouse 版本:21.7.3.14-2.
MySql 版本 5.7.16
my.cnf 配置
#数据库id server-id = 1 ##启动binlog,该参数的值会作为binlog的文件名 log-bin=mysql-bin ##binlog类型,maxwell要求为row类型 binlog_format=ROW gtid-mode=on # 设置为主从强一致性 enforce-gtid-consistency=1 # 记录日志 log-slave-updates=1
今天在测试 MaterializeMySQL 功能的时候除出现了一个现象,ClickHouse中创建 MaterializeMySQL 数据库后,ClickHouse数据库建立后,表没有同步到ClickHouse,检查了一下 my.cnf 配置信息,和ClickHouse同步信息,都没有问题;查看日之后发一个问题;
[root@hadoop201 hui]# tail -f /var/log/clickhouse-server/clickhouse-server.err.log
2022.08.06 12:18:23.930444 [ 2640 ] {} <Error> MaterializeMySQLSyncThread: Code: 48, e.displayText() = DB::Exception: The test.book_info cannot be materialized, because there is no primary keys.: While executing dump MySQL test.book_info table data., Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8d31b5a in /usr/bin/clickhouse
1. DB::MySQLInterpreter::InterpreterCreateImpl::getRewrittenQueries(DB::MySQLParser::ASTCreateQuery const&, std::__1::shared_ptr<DB::Context const>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xfe4d726 in /usr/bin/clickhouse
2. DB::MySQLInterpreter::InterpreterMySQLDDLQuery<DB::MySQLInterpreter::InterpreterCreateImpl>::execute() @ 0xfaafbdb in /usr/bin/clickhouse
3. DB::InterpreterExternalDDLQuery::execute() @ 0xfaade78 in /usr/bin/clickhouse
4. ? @ 0xfe22253 in /usr/bin/clickhouse
5. DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum, bool) @ 0xfe208e3 in /usr/bin/clickhouse
6. ? @ 0xf7ba406 in /usr/bin/clickhouse
7. ? @ 0xf7d4b88 in /usr/bin/clickhouse
8. DB::commitMetadata(std::__1::function<void ()> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0xf807632 in /usr/bin/clickhouse
9. DB::MaterializeMetadata::transaction(DB::MySQLReplication::Position const&, std::__1::function<void ()> const&) @ 0xf80a000 in /usr/bin/clickhouse
10. DB::MaterializeMySQLSyncThread::prepareSynchronized(DB::MaterializeMetadata&) @ 0xf7b2231 in /usr/bin/clickhouse
11. DB::MaterializeMySQLSyncThread::synchronization() @ 0xf7b169c in /usr/bin/clickhouse
12. ? @ 0xf7d4074 in /usr/bin/clickhouse
13. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8d72a1f in /usr/bin/clickhouse
14. ? @ 0x8d76303 in /usr/bin/clickhouse
15. start_thread @ 0x7ea5 in /usr/lib64/libpthread-2.17.so
16. __clone @ 0xfe9fd in /usr/lib64/libc-2.17.so
好像是说 MySQL 的表没设置主键,之前的建表语句
CREATE TABLE `book_info` ( `id` INT(11) , `name` VARCHAR(20) COLLATE utf8_german2_ci DEFAULT NULL, `author` VARCHAR(20) COLLATE utf8_german2_ci DEFAULT NULL ) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_german2_ci
确实没有设置主键,删表重建下
drop table test.book_info; CREATE TABLE `book_info` ( `id` INT(11), `name` VARCHAR(20) COLLATE utf8_german2_ci DEFAULT NULL, `author` VARCHAR(20) COLLATE utf8_german2_ci DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE = INNODB DEFAULT CHARSET = utf8 COLLATE = utf8_german2_ci
插入数据
INSERT INTO `book_info` (`id`, `name`, `author`) VALUES ('1001', '侠客行', '金庸'); INSERT INTO `book_info` (`id`, `name`, `author`) VALUES ('1002', '孔雀翎', '古龙'); INSERT INTO `book_info` (`id`, `name`, `author`) VALUES ('1003', '萍踪侠影', '梁羽生'); commit;
此时回到 ClickHouse ,把刚才建的数据库删了,重新建一下
drop database test_binlog; set allow_experimental_database_materialize_mysql=1; CREATE DATABASE test_binlog ENGINE = MaterializeMySQL('hadoop201:3306','test','root','mypwd');
现在查看 ClickHouse 下的表已经有了
select * from test_binlog.book_info
1002 孔雀翎 古龙
1003 萍踪侠影 梁羽生
1001 侠客行 金庸
现在更新一下Mysql 里的数据
UPDATE book_info SET NAME = '侠客行-1980' WHERE id = '1001'; COMMIT; SELECT * FROM book_info; 1001 侠客行-1980 金庸 1002 孔雀翎 古龙 1003 萍踪侠影 梁羽生
查看ClickHouse 结果
select * from test_binlog.book_info 1002 孔雀翎 古龙 1003 萍踪侠影 梁羽生 1001 侠客行-1980 金庸
至此:问题得以解决。