Fixing psql: FATAL: the database system is in recovery mode

青旅半醒 2022-05-31 07:45

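Error:
FATAL: the database system is in recovery mode
Troubleshooting steps:
On the HAWQ master node:
1. Run hawq state; it reports that the database is down.
2. Check for the HAWQ master process: ps aux | grep postgresql. The master process is not running.
3. Check the current day's log under pg_log: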

2018-02-11 16:34:32.089297 CST,,,p636599,th589375776,,,,0,,,seg-10000,,,,,"LOG","00000","seqserver process (PID 499050) exited with exit code 2",,,,,,,0,,"postmaster.c",4726,
2018-02-11 16:34:32.089388 CST,,,p636599,th589375776,,,,0,,,seg-10000,,,,,"LOG","00000","walsendserver process (PID 499051) exited with exit code

It turned out that the master process had been killed manually.
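
Exit messages like the ones in the log excerpt can be pulled out of the day's log with a small filter. The following is only a sketch: the function name `find_exited_processes` is hypothetical, and it assumes the CSV-style log format shown in the excerpt, where the quoted message contains "process (PID N) exited with".

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the timestamp and message of every
# "... process (PID N) exited with ..." entry in a master log file.
# Assumes the CSV-style format of the log excerpt (timestamp is field 1,
# the message is a double-quoted field).
find_exited_processes() {
    local logfile="$1"
    # Keep only the exit records, then reduce each one to
    # "<timestamp>  <message>".
    grep -E 'process \(PID [0-9]+\) exited with' "$logfile" |
        sed -E 's/^([^,]+),.*"([^"]*process \(PID [0-9]+\) exited with[^"]*)".*$/\1  \2/'
}
```

Call it with the path to the day's log file under pg_log. Records that are cut off before the closing quote are printed unchanged, because the sed pattern only rewrites complete records.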
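4. Start the master manually:
source /usr/local/hawq/greenplum.sh
su gpadmin
hawq start master
The first attempt does not succeed: because a master PID still exists, the system believes the master process is running. Force-stop the master manually:
hawq stop master -M immediate
hawq start master
The master now starts successfully, but the segments have not registered with the master.
5. Restart the whole cluster:
hawq restart cluster
Run hawq state again; everything is back to normal.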
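
The failed first start in step 4 came from a leftover master PID. Below is a minimal sketch of how one might check whether a PID file is stale before forcing a stop. The function name `pid_file_status` is hypothetical, and the sketch assumes a PostgreSQL-style `postmaster.pid` whose first line holds the PID; where the master data directory lives depends on the installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the process recorded in a PID file
# is still running. Assumes the first line of the file is the PID, as in
# a PostgreSQL-style postmaster.pid.
pid_file_status() {
    local pidfile="$1" pid=""
    if [ ! -f "$pidfile" ]; then
        echo "missing"
        return 0
    fi
    # Read only the first line; any later lines hold other metadata.
    IFS= read -r pid < "$pidfile" || true
    case "$pid" in
        ''|*[!0-9]*) echo "invalid"; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the PID can be
    # signalled. Run this as the process owner (gpadmin here), otherwise
    # a live process owned by another user is reported as stale.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

A result of "stale" matches the situation described in step 4. A result of "running" is not proof that the master is alive, since the operating system may have reused the PID for an unrelated process.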


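Root cause:
The HAWQ master and the NameNode run on the same node. When the NameNode failed to start, a colleague from operations force-killed the HAWQ master process without first identifying the real cause.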
