ICode9

精准搜索请尝试: 精确搜索
首页 > 其他分享> 文章详细

LOG: invalid primary checkpoint record/PANIC: could not locate a valid checkpoint record恢复方法

2022-04-23 17:00:38  阅读:230  来源: 互联网

标签:locate LOG 04 23 19 08 checkpoint record 2022


正常情况下lightdb/postgresql进程被杀或--mode=immediate模式停止的时候,是会进行正常恢复的。如下:

2022-04-23 08:27:28.085797C [unknown] zjh@postgres ::1(51378) walsender  57P03[2022-04-23 08:27:28 UTC] 0 [82739] FATAL:  the database system is in recovery mode
2022-04-23 08:27:58.093597C [unknown] zjh@postgres ::1(52034) walsender  57P03[2022-04-23 08:27:58 UTC] 0 [82851] FATAL:  the database system is in recovery mode
2022-04-23 08:28:28.098248C [unknown] zjh@postgres ::1(52690) walsender  57P03[2022-04-23 08:28:28 UTC] 0 [82974] FATAL:  the database system is in recovery mode
2022-04-23 08:28:58.106009C [unknown] zjh@postgres ::1(53290) walsender  57P03[2022-04-23 08:28:58 UTC] 0 [83085] FATAL:  the database system is in recovery mode
2022-04-23 08:29:28.112561C [unknown] zjh@postgres ::1(53948) walsender  57P03[2022-04-23 08:29:28 UTC] 0 [83212] FATAL:  the database system is in recovery mode
2022-04-23 08:29:53.571639C  @  postmaster  00000[2022-04-22 05:48:38 UTC] 0 [207357] LOG:  all server processes terminated; reinitializing
2022-04-23 08:29:54.990799C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  database system was interrupted; last known up at 2022-04-22 12:23:26 UTC
2022-04-23 08:29:55.023361C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  database system was not properly shut down; automatic recovery in progress
2022-04-23 08:29:55.023954C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  redo starts at 19/7F15410
2022-04-23 08:29:55.023972C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  invalid record length at 19/7F15448: wanted 24, got 0
2022-04-23 08:29:55.023978C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  redo done at 19/7F15410
2022-04-23 08:29:55.024782C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  checkpoint starting: end-of-recovery immediate
2022-04-23 08:29:55.062822C  @  startup  00000[2022-04-23 08:29:54 UTC] 0 [83315] LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.037 s, sync=0.001 s, total=0.039 s; sync files=0, longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB
2022-04-23 08:29:55.067348C  @  postmaster  00000[2022-04-22 05:48:38 UTC] 0 [207357] LOG:  database system is ready to accept connections

但是某些情况下,异常终止后,可能会导致主检查点记录无效(断电、坏道都有可能),例如:

2022-04-19 19:05:26.104431T [unknown] lightdb@repmgr 127.0.0.1(39248) client backend 57P03[2022-04-19 19:05:26 CST] 0 [45004] FATAL: the database system is shutting down
2022-04-19 19:05:36.105421T [unknown] lightdb@repmgr 127.0.0.1(39250) client backend 57P03[2022-04-19 19:05:36 CST] 0 [45036] FATAL: the database system is shutting down
2022-04-19 19:07:59.027435T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] LOG: server process (PID 90426) was terminated by signal 9: 已杀死
2022-04-19 19:07:59.027435T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] DETAIL: Failed process was running: select pg_backend_pid();
2022-04-19 19:07:59.027818T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] LOG: terminating any other active server processes
2022-04-19 19:07:59.030308T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] LOG: archiver process (PID 47644) exited with exit code 2
2022-04-19 19:07:59.121224T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] LOG: abnormal database system shutdown
2022-04-19 19:07:59.212224T @ postmaster 00000[2022-04-18 11:52:58 CST] 0 [47554] LOG: database system is shut down

 

[lightdb@hs-10-19-36-28 log]$ tail -fn 100 lightdb-2022-04-19_190953.log
2022-04-19 19:09:53.964212T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: starting LightDB 13.3-22.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (NeoKylin 4.8.5-36), 64-bit
2022-04-19 19:09:53.964485T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-04-19 19:09:53.964500T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: listening on IPv6 address "::", port 5432
2022-04-19 19:09:53.965481T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: listening on Unix socket "/tmp/.s.LIGHTDB.5432"
2022-04-19 19:09:53.999301T @ startup 00000[2022-04-19 19:09:53 CST] 0 [45593] LOG: database system was interrupted; last known up at 2022-04-19 18:19:48 CST
2022-04-19 19:09:55.838765T @ startup 00000[2022-04-19 19:09:53 CST] 0 [45593] LOG: invalid primary checkpoint record
2022-04-19 19:09:55.838796T @ startup XX000[2022-04-19 19:09:53 CST] 0 [45593] PANIC: could not locate a valid checkpoint record
2022-04-19 19:09:55.873920T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: startup process (PID 45593) was terminated by signal 6: Aborted
2022-04-19 19:09:55.873956T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: aborting startup due to startup process failure
2022-04-19 19:09:55.879035T @ postmaster 00000[2022-04-19 19:09:52 CST] 0 [45587] LOG: database system is shut down

  此时可以通过lt_resetwal -f $PGDATA强制清理wal,然后启动,虽然这样会丢失部分数据(在业务量较大的系统中,总是具有高可用环境或至少使用lt_probackup增量备份到其它服务器是非常重要的)。

 https://postgrespro.com/list/thread-id/2559322

https://cdmana.com/2022/03/202203241421490061.html

https://github.com/pgbackrest/pgbackrest/issues/839

https://dockerquestions.com/2020/11/15/docker-postgres-database-invalid-primary-checkpoint-record/

 

标签:locate,LOG,04,23,19,08,checkpoint,record,2022
来源: https://www.cnblogs.com/zhjh256/p/16169442.html

本站声明: 1. iCode9 技术分享网(下文简称本站)提供的所有内容,仅供技术学习、探讨和分享;
2. 关于本站的所有留言、评论、转载及引用,纯属内容发起人的个人观点,与本站观点和立场无关;
3. 关于本站的所有言论和文字,纯属内容发起人的个人观点,与本站观点和立场无关;
4. 本站文章均是网友提供,不完全保证技术分享内容的完整性、准确性、时效性、风险性和版权归属;如您发现该文章侵犯了您的权益,可联系我们第一时间进行删除;
5. 本站为非盈利性的个人网站,所有内容不会用来进行牟利,也不会利用任何形式的广告来间接获益,纯粹是为了广大技术爱好者提供技术内容和技术思想的分享性交流网站。

专注分享技术,共同学习,共同进步。侵权联系[81616952@qq.com]

Copyright (C)ICode9.com, All Rights Reserved.

ICode9版权所有