1.1 概述
现有postgresql数据库为单一数据库,没有备份数据,为保证数据安全正常运行,现将给定的3台服务器,组成一个1主2从的高可用架构。
1.2 基本信息
主机名 | IP地址 | 服务器角色 | 系统信息 | 数据库目录 |
---|---|---|---|---|
72_123 | 192.168.247.123 | Postgresql主节点 | Red Hat7.7 | /data |
72_124 | 192.168.247.124 | Postgresql从节点 | Red Hat7.7 | /data |
72_125 | 192.168.247.125 | Postgresql从节点 | Red Hat7.7 | /data |
1.3实施计划
先将247.124和247.125设置为247.123的从库,然后安装Keepalived 高可用组件,搭建高可用架构
2.1 各节点安装postgres (root用户)
准备postgresql-11-10.tar.gz软件包
https://www.postgresql.org/ftp/source/v11.10/
安装相关依赖包
yum -y install readline readline-devel zlib zlib-devel
创建用户、安装目录、数据库目录
useradd postgres password postgres mkdir /opt/pg11/ mkdir /data/ chown -R postgres:postgres /opt/pg11/ chown -R postgres:postgres /data/
解压并编译安装
tar -xvf postgresql-11.10.tar.gz cd postgresql-11.10/ ./configure --prefix=/opt/pg11/ make world make install-word
设置环境变量
su - postgres vim .bashrc export LD_LIBRARY_PATH=/opt/pg11/lib export PGDATA=/data/ export PGHOME=/opt/pg11/ export PATH=$PATH:/opt/pg11/bin/
初始化数据库
/opt/pg11/bin/initdb -D /data/
修改配置文件(参数值根据服务器配置填写)
listen_addresses = '*' port = 5432 max_connections = 100 shared_buffers = 128MB dynamic_shared_memory_type = posix wal_level = replica max_wal_size = 1GB min_wal_size = 80MB track_commit_timestamp = on log_timezone = 'PRC' datestyle = 'iso, mdy' timezone = 'PRC' lc_messages = 'en_US.UTF-8' lc_monetary = 'en_US.UTF-8' lc_numeric = 'en_US.UTF-8' lc_time = 'en_US.UTF-8' default_text_search_config = 'pg_catalog.english'
启动数据库
pg_ctl -D /data/ -l logfile start
如果是新建数据库可以将以下主从同步一同配置后再启动数据库
2.2 主库配置(postgres用户配置)
创建replicator 用户进行主从同步
create role replicator with login replication password 'replicator';
修改pg_hba.conf,允许replicator 用户来同步
host all all 192.168.247.0/24 md5 host replication replicator 192.168.247.0/24 md5
修改postgresql.conf,修改或添加以下内容
wal_level = replica #热备模式 max_wal_senders= 10 #设置最达流复制链接,控制主库最多可以有多少个并发standby数据库) wal_keep_segments = 512 #重要配置(配置保存wal日志大小) wal_sender_timeout = 60s
重启数据库生效
pg_ctl restart
2.3 从库配置(postgres用户配置)
将主库数据拷贝至从库(247.124,247.125)
pg_basebackup -h 192.168.247.123 -D /home/postgres/data/ -F p -X stream -v -P -U replicator 输入密码:replicator
修改或添加postgresql.conf以下内容
max_connections = 1100 #一般从库的最大连接数要大于主库 hot_standby = on #说明这台机器不仅仅是用于数据归档,也用于数据查询 max_standby_streaming_delay = 30s #数据流备份的最大延迟时间 wal_receiver_status_interval = 10s #多久向主报告一次从的状态,当然从每次数据复制都会向主报告状态,这里只是设置最长的间隔时间
配置recovery.conf
standby_mode = on primary_conninfo = ‘host=192.168.247.123 port=5432 user=replicator password=replicator’ recovery_target_timeline = ‘latest’
启动数据库(247.124,247.125)
pg_ctl start
2.4 安装keepalived(root用户配置)
主节点247.123,从节点247.124,247.125
yum -y install keepalived touch /etc/keepalived/pg_check.sh chmod +x /etc/keepalived/pg_check.sh
pg_check.sh脚本
#!/bin/bash #PG is alive or not? A=`ps -ef | grep postgres | grep "/opt/pg11/bin/postgres*" |grep -v grep| wc -l` # where the VIP floats to B=`ip a | grep 192.168.247.121 | wc -l` # the slave is in a waiting state? C=`ps -ef | grep postgres | grep 'startup process' |grep -v grep| wc -l` #the secondary is positive for linking to the master ? D=`ps -ef | grep postgres | grep 'receiver' |grep -v grep | wc -l` #he connection between the master and the slave library is normal? E=`ps -ef | grep postgres | grep 'sender' |grep -v grep| wc -l` #If pg dies, log the message and turn off keepalived if [ $A -eq 0 ];then echo "`date "+%Y-%m-%d--%H:%M:%S"` postgresql stop so vip stop " >> /etc/keepalived/log/check_pg.log systemctl stop keepalived else #When the master is dead, the VIP floats to slave, elevating slave's status to make him read and write if [ $B -eq 1 -a $C -eq 1 -a $D -eq 0 ];then su - postgres -c "/opt/pg11/bin/pg_ctl promote -D /data/" echo "`date "+%Y-%m-%d--%H:%M:%S"` standby promote " >> /etc/keepalived/log/check_pg.log fi #Judge yourself as Lord and lose contact with him if [ $B -eq 1 -a $C -eq 0 -a $D -eq 0 -a $E -eq 0 ];then sleep 10 echo "`date "+%Y-%m-%d--%H:%M:%S"` can't find standby " >> /etc/keepalived/log/check_pg.log fi fi
配置Keepalived日志
vim /etc/sysconfig/keepalived # Options for keepalived. See `keepalived --help' output and keepalived(8) and # keepalived.conf(5) man pages for a list of all options. Here are the most # common ones : # # --vrrp -P Only run with VRRP subsystem. # --check -C Only run with Health-checker subsystem. # --dont-release-vrrp -V Dont remove VRRP VIPs & VROUTEs on daemon stop. # --dont-release-ipvs -I Dont remove IPVS topology on daemon stop. # --dump-conf -d Dump the configuration data. # --log-detail -D Detailed log messages. # --log-facility -S 0-7 Set local syslog facility (default=LOG_DAEMON) KEEPALIVED_OPTIONS="-D -d -S 0"
修改/etc/rsyslog.conf文件,在末尾添加如下配置
local0.* /var/log/keepalived.log
重启日志记录服务
systemctl restart rsyslog
查看Keepalived日志
tail -f /var/log/keepalived.log
修改Keepalived配置文件
(主节点247.123)
mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived global_defs { notification_email { **@qq.com } notification_email_from **@163.com smtp_server smtp.163.com smtp_connect_timeout 30 router_id postgres_master } vrrp_script check_pg_alived { script "/etc/keepalived/pg_check.sh" interval 10 weight 2 } vrrp_instance VI_1 { state MASTER interface ens33 virtual_router_id 66 priority 100 advert_int 1 authentication { auth_type PASS auth_pass 1234 } track_script { check_pg_alived } virtual_ipaddress { 192.168.247.121 } }
(从节点247.124,247.125)
! Configuration File for keepalived global_defs { notification_email { **@qq.com } notification_email_from **@163.com smtp_server smtp.163.com smtp_connect_timeout 30 router_id postgres_master } vrrp_script check_pg_alived { script "/etc/keepalived/pg_check.sh" interval 10 fall 5 } vrrp_instance VI_1 { state BACKUP nopreempt interface ens33 virtual_router_id 66 priority 90 advert_int 1 authentication { auth_type PASS auth_pass 1234 } track_script { check_pg_alived } virtual_ipaddress { 192.168.247.121 } }
所有节点启动keepalived服务,并查看状态(root用户)
systemctl restart keepalived systemctl enable keepalived systemctl status keepalived
3.1各节点postgres进程正常,keepalived状态正常,查看VIP节点
[root@72-123 keepalived]# ip a |grep ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 inet 192.168.247.123/24 brd 192.168.247.255 scope global noprefixroute ens33 inet 192.168.247.121/32 scope global ens33
3.2 主节点宕机模拟
[postgres@72-123 ~]$ pg_ctl stop waiting for server to shut down....2021-10-23 01:29:37.950 PDT [59281] LOG: received fast shutdown request 2021-10-23 01:29:37.952 PDT [59281] LOG: aborting any active transactions 2021-10-23 01:29:37.953 PDT [59281] LOG: background worker "logical replication launcher" (PID 59288) exited with exit code 1 2021-10-23 01:29:37.953 PDT [59283] LOG: shutting down 2021-10-23 01:29:37.972 PDT [59281] LOG: database system is shut down done server stopped
3.3 VIP迁移到从节点
[postgres@72-125 ~]$2021-10-23 01:29:42.846 PDT [62019] FATAL: could not connect to the primary server: could not connect to server: Connection refused Is the server running on host "192.168.247.121" and accepting TCP/IP connections on port 5432? 2021-10-23 01:29:47.855 PDT [62052] LOG: connection received: host=192.168.247.121 port=53342 2021-10-23 01:29:47.856 PDT [62053] LOG: connection received: host=192.168.247.124 port=59440 2021-10-23 01:29:47.856 PDT [62052] LOG: replication connection authorized: user=replicator 2021-10-23 01:29:47.857 PDT [62053] LOG: replication connection authorized: user=replicator 2021-10-23 01:29:47.857 PDT [62051] LOG: started streaming WAL from primary at 0/A000000 on timeline 1
查看VIP
[postgres@72-125 ~]$ ip a |grep ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 inet 192.168.247.125/24 brd 192.168.247.255 scope global noprefixroute ens33 inet 192.168.247.121/32 scope global ens33
3.4连接测试
[postgres@72-124 ~]$ psql -h 192.168.247.121 -U postres -p psql (11.10) Type "help" for help. postgres=#
3.5恢复节点,启动数据库,并启动keepalived服务
[postgres@72-123 ~]$ pg_ctl start waiting for server to start....2021-10-23 01:37:13.120 PDT [60997] LOG: listening on IPv4 address "0.0.0.0", port 5432 2021-10-23 01:37:13.120 PDT [60997] LOG: listening on IPv6 address "::", port 5432 2021-10-23 01:37:13.122 PDT [60997] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2021-10-23 01:37:13.132 PDT [60998] LOG: database system was shut down at 2021-10-23 01:29:37 PDT 2021-10-23 01:37:13.134 PDT [60997] LOG: database system is ready to accept connections done server started
[root@72-123 keepalived]# systemctl start keepalived [root@72-123 keepalived]# ip a |grep ens33 2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 inet 192.168.247.123/24 brd 192.168.247.255 scope global noprefixroute ens33 inet 192.168.247.121/32 scope global ens33