Persistence
RDB (Redis DataBase)
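RDB writes a point-in-time snapshot of the in-memory dataset to disk at configured intervals. On recovery, the snapshot file is read straight back into memory.
Redis forks a separate child process (not a thread) to do the persistence work. The child first writes the data to a temporary file and, once the write has finished, that temporary file replaces the previous snapshot file.
The main process performs no disk I/O during this procedure, which protects its performance.
When a large dataset has to be restored and the completeness of the recovered data is not critical, RDB is more efficient than AOF. The drawback of RDB is that data written after the last snapshot can be lost.
fork creates a copy of the current process. All data in the new process (variables, environment variables, program counter, and so on) has the same values as in the original process, but it is an entirely new process that runs as a child of the original.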
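The fork, temporary file, and rename sequence described above can be sketched in Python. This is only an illustration under stated assumptions, not how Redis implements it: the names start_snapshot and wait_snapshot are hypothetical, JSON stands in for the binary RDB format, and os.fork limits the sketch to Unix-like systems.

```python
import json
import os
import tempfile


def start_snapshot(data, path):
    """Fork a child that dumps `data` to a temp file and renames it over `path`.

    Returns the child's pid. The parent returns immediately and does no disk I/O.
    """
    pid = os.fork()
    if pid != 0:
        return pid  # parent: carry on serving requests

    # Child: works on its own copy of the parent's memory as of the fork.
    status = 1
    try:
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        # Rename only after the write finished, so `path` is never half-written.
        os.replace(tmp, path)
        status = 0
    finally:
        os._exit(status)  # leave the child without running parent cleanup code


def wait_snapshot(pid):
    """Wait for the snapshot child and raise if it did not exit cleanly."""
    _, wait_status = os.waitpid(pid, 0)
    if not (os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0):
        raise RuntimeError("snapshot child failed")
```

A write made by the parent after start_snapshot returns does not appear in the file, because the child only sees the memory as it was at fork time. That is the point-in-time property the notes describe.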
FLUSHALL and SHUTDOWN also write dump.rdb immediately; after FLUSHALL the resulting file holds an empty dataset.
save
save does nothing but save: it blocks the whole server, and no writes are accepted until it finishes.
bgsave
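Redis takes the snapshot asynchronously in the background and keeps responding to client requests while it runs. The lastsave command returns the time of the last successful snapshot.
To restore data, move the backup file (dump.rdb) into the directory set by dir in the Redis configuration and start the server.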
redis-cli config set save ""
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""
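A save point has the form save <seconds> <changes>, for example save 900 1. As a rough sketch of how a list of save points decides whether a snapshot is due, assuming a hypothetical helper should_snapshot and (seconds, changes) tuples as input (the exact boundary comparison inside Redis may differ):

```python
def should_snapshot(save_points, seconds_since_last_save, changes_since_last_save):
    """Return True if any (seconds, changes) save point is satisfied.

    A save point is satisfied once more than `seconds` have elapsed since the
    last save and at least `changes` writes have happened in that time.
    An empty list (the effect of `save ""`) never triggers a snapshot.
    """
    return any(
        seconds_since_last_save > seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


# Hypothetical rule set: "save 900 1" plus "save 300 10".
rules = [(900, 1), (300, 10)]
print(should_snapshot(rules, 301, 5))   # neither rule is satisfied yet
print(should_snapshot(rules, 301, 10))  # the 300-second rule is satisfied
```

With several save points configured, any single one being satisfied is enough.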
RDB advantages
RDB is a very compact single-file point-in-time representation of your Redis data. RDB files are perfect for backups. For instance you may want to archive your RDB files every hour for the latest 24 hours, and to save an RDB snapshot every day for 30 days. This allows you to easily restore different versions of the data set in case of disasters.
RDB is very good for disaster recovery, being a single compact file that can be transferred to far data centers, or onto Amazon S3 (possibly encrypted).
RDB maximizes Redis performance since the only work the Redis parent process needs to do in order to persist is forking a child that will do all the rest. The parent instance will never perform disk I/O or the like.
RDB allows faster restarts with big datasets compared to AOF.
Summary
RDB disadvantages
RDB is NOT good if you need to minimize the chance of data loss in case Redis stops working (for example after a power outage). You can configure different save points where an RDB is produced (for instance after at least five minutes and 100 writes against the data set, but you can have multiple save points). However you'll usually create an RDB snapshot every five minutes or more, so in case of Redis stopping working without a correct shutdown for any reason you should be prepared to lose the latest minutes of data.
RDB needs to fork() often in order to persist on disk using a child process. fork() can be time consuming if the dataset is big, and may cause Redis to stop serving clients for some milliseconds, or even for one second if the dataset is very big and CPU performance is not great. AOF also needs to fork(), but you can tune how often you want to rewrite your logs without any trade-off on durability.
Summary
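AOF (Append Only File)
AOF records every write operation as a log: every write command that Redis executes is recorded (reads are not), and the file may only be appended to, never modified in place. On startup Redis reads the file and re-executes the write commands from first to last to rebuild the dataset.
AOF data is stored in the appendonly.aof file.
When AOF is enabled and both an AOF file and an RDB file exist, Redis loads the AOF file first on startup.
Because AOF only appends, the file keeps growing. To avoid this, a rewrite mechanism was added: when the AOF file exceeds the configured threshold, Redis compacts its content, keeping only the minimal set of commands needed to restore the data. A rewrite can also be started manually with the bgrewriteaof command.
When the AOF file has grown too large, Redis forks a new process to rewrite it (again writing to a temporary file first and renaming it at the end). The new process walks the data in its own memory and emits one command, such as SET, per record. The rewrite does not read the old AOF file; it writes the whole in-memory database out as commands into a new AOF file.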
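The idea behind the rewrite can be illustrated with a toy model. This is a sketch under heavy assumptions, not Redis code: the dataset is a plain dict of string keys, only SET, DEL, and INCR are modelled, and the names apply_command and rewrite_from_memory are hypothetical.

```python
def apply_command(state, command):
    """Apply one logged write command to the in-memory state (normal operation)."""
    name, *args = command
    name = name.upper()
    if name == "SET":
        key, value = args
        state[key] = value
    elif name == "DEL":
        for key in args:
            state.pop(key, None)
    elif name == "INCR":
        (key,) = args
        state[key] = str(int(state.get(key, "0")) + 1)
    else:
        raise ValueError("unsupported command in this sketch: %s" % name)


def rewrite_from_memory(state):
    """Build a new log from memory only: one SET per key that still exists.

    The old log is never read, which mirrors the description above.
    """
    return [("SET", key, value) for key, value in state.items()]


# Five logged writes collapse to a single command after the rewrite.
old_log = [
    ("SET", "a", "1"),
    ("INCR", "a"),
    ("INCR", "a"),
    ("SET", "b", "x"),
    ("DEL", "b"),
]
memory = {}
for cmd in old_log:
    apply_command(memory, cmd)
new_log = rewrite_from_memory(memory)
```

Replaying new_log into an empty dict yields the same state as replaying old_log, which is what makes the shorter log a valid replacement.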
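Redis remembers the size of the AOF file after the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has doubled in size since the last rewrite and is larger than 64 MB.
The no-appendfsync-on-rewrite option controls whether fsync is skipped while a rewrite is in progress. The default, no, is usually kept because it preserves data safety.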
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
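The two settings combine into a single trigger condition. A minimal sketch of that check, assuming a hypothetical helper should_rewrite_aof with sizes in bytes (an approximation of the rule, not the Redis source):

```python
MB = 1024 * 1024


def should_rewrite_aof(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF file now
    base_size:    size of the AOF file right after the last rewrite
    percentage:   auto-aof-rewrite-percentage (0 disables automatic rewrites)
    min_size:     auto-aof-rewrite-min-size
    """
    if percentage == 0:
        return False
    if current_size <= min_size:
        return False  # small files are never rewritten automatically
    base = base_size if base_size > 0 else 1  # avoid division by zero
    growth = current_size * 100 // base - 100  # growth since last rewrite, in %
    return growth >= percentage


print(should_rewrite_aof(80 * MB, 40 * MB))  # doubled and above 64 MB
print(should_rewrite_aof(30 * MB, 10 * MB))  # tripled but still below 64 MB
```

Both conditions have to hold, so a small file that grows quickly does not cause constant rewrites.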
AOF advantages
AOF disadvantages
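If Redis is used only as a cache, persistence can be left disabled.
If losing a few minutes of data is acceptable, RDB alone is enough.
The official documentation discourages enabling AOF alone, partly because of potential bugs in the AOF engine.
Use RDB files for backup purposes only. It is advisable to persist RDB files only on the Slave, and a backup every 15 minutes is enough: keep just the save 900 1 rule.
With AOF enabled, the benefit is that even in the worst case only about 1 second of data is lost, and loading the AOF file after startup restores the data. The costs are, first, continuous I/O and, second, the unavoidable blocking at the end of an AOF rewrite, when the new data produced during the rewrite is written to the new file. If disk space allows, raise auto-aof-rewrite-min-size, for example to 5G or more.
Without AOF, relying on Master-Slave replication alone for high availability is also workable. It saves a large amount of I/O and reduces the system jitter caused by rewrites. The cost is that if Master and Slave go down at the same time, ten or more minutes of data are lost, and before recovery the RDB files on Master and Slave must be compared so that the newer one is loaded.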
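Choosing the newer of two RDB files can be scripted. The sketch below assumes that "newer" means the most recent file modification time and that both files have already been copied to one machine; the helper name newest_rdb and the example paths are hypothetical.

```python
import os


def newest_rdb(paths):
    """Return the path of the most recently modified file among `paths`.

    Paths that do not exist are ignored; an error is raised if none exist.
    """
    existing = [p for p in paths if os.path.isfile(p)]
    if not existing:
        raise FileNotFoundError("no RDB file found among: %r" % (list(paths),))
    return max(existing, key=os.path.getmtime)


# Hypothetical usage, with copies of the Master and Slave dumps side by side:
# chosen = newest_rdb(["/backup/master/dump.rdb", "/backup/slave/dump.rdb"])
```

Modification time is only a proxy. If the clocks of the two hosts differ or the copy did not preserve timestamps, the comparison should use another signal.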