MongoDB fails to start after an unclean system shutdown

URL Link //n.sfs.tw/14613

2020-04-22 21:01:22 By jung

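One afternoon the IdP service host of a certain county suddenly became unreachable; neither SSH nor the web interface could be reached.

Their network center sent over a screenshot of the Hyper-V virtual machine console.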

 

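After a forced reboot, the machine got stuck in CentOS emergency mode.

After rescue work by their education network center, it could at least boot normally and accept remote logins again.

But WildFly 18 would not start.

The fix is to first edit /opt/wildfly18/standalone/configuration/standalone-full-ha.xml

and delete the records of the previously deployed WAR files: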

<deployments>
        <deployment name="cncauthserver-1.2.5.war" runtime-name="cncauthserver-1.2.5.war">
            <content sha1="e8e76e3007544c7e836c9a0ebe9a609c19515d0f"/>
        </deployment>
        <deployment name="CncResource-1.2.5.war" runtime-name="CncResource-1.2.5.war">
            <content sha1="09353dfe5c33fb863b62cf49d7eb50b27b595bb9"/>
        </deployment>
    </deployments>

Then restart the WildFly service and it comes up; the applications also need to be redeployed.
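The edit above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name strip_deployments and the .bak suffix are made up for illustration, and it assumes the <deployments> and </deployments> tags each sit on their own line, as in the excerpt. WildFly should be stopped before the file is touched.

```shell
#!/bin/bash
# strip_deployments FILE
# Hypothetical helper: back up FILE, then drop everything from the
# <deployments> line through the </deployments> line (inclusive).
strip_deployments() {
    local conf="$1"
    cp -p "$conf" "$conf.bak" || return 1
    # Read from the backup and write over the original in place,
    # so the original file keeps its owner and mode.
    sed '/<deployments>/,/<\/deployments>/d' "$conf.bak" > "$conf"
}

# Example (path taken from the post):
#   strip_deployments /opt/wildfly18/standalone/configuration/standalone-full-ha.xml
```

If something looks wrong afterwards, the untouched copy is still available next to the config as standalone-full-ha.xml.bak.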

Next came the problem of the mongo service failing to start:

● mongod.service - MongoDB Database Server
   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since 三 2020-04-22 18:58:26 CST; 24s ago
     Docs: https://docs.mongodb.org/manual
  Process: 4504 ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=14)
  Process: 4500 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 4497 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 4495 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)

 4月 22 18:58:26 matsu.sso.edu.tw systemd[1]: Starting MongoDB Database Server...
 4月 22 18:58:26 matsu.sso.edu.tw mongod[4504]: about to fork child process, waiting until server is ready for connections.
 4月 22 18:58:26 matsu.sso.edu.tw mongod[4504]: forked process: 4506
 4月 22 18:58:26 matsu.sso.edu.tw mongod[4504]: ERROR: child process failed, exited with error number 14
 4月 22 18:58:26 matsu.sso.edu.tw mongod[4504]: To see additional information in this output, start without the "--fork" option.
 4月 22 18:58:26 matsu.sso.edu.tw systemd[1]: mongod.service: control process exited, code=exited status=14
 4月 22 18:58:26 matsu.sso.edu.tw systemd[1]: Failed to start MongoDB Database Server.
 4月 22 18:58:26 matsu.sso.edu.tw systemd[1]: Unit mongod.service entered failed state.
 4月 22 18:58:26 matsu.sso.edu.tw systemd[1]: mongod.service failed.
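The status text only says that the child exited with status 14, so the real reason has to come from mongod's own log. A minimal sketch of that first look follows. The helper name exit_status_of is hypothetical, and the log path is an assumption (the usual default for the mongodb-org RPM); check systemLog.path in /etc/mongod.conf on your host.

```shell
#!/bin/bash
# exit_status_of: read `systemctl status` text on stdin and print the
# numeric exit status from the ExecStart= line (hypothetical helper).
# ExecStartPre= lines do not match because of the "Pre" before "=".
exit_status_of() {
    sed -n 's/.*ExecStart=.*status=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Typical use on the affected host (not executed here):
#   systemctl status mongod | exit_status_of
#   sudo tail -n 50 /var/log/mongodb/mongod.log   # assumed default log path
```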

After reading through a lot of documentation, I tried

mongod --dbpath /data/db --repair

but it still did not help, so I decided to simply remove mongo

yum erase $(rpm -qa | grep mongodb)

and install it again

yum install -y mongodb-org

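It still would not start at this point. The documentation I found pointed to a permission problem on /tmp/mongodb-27017.sock

Possibly because the installation ran as root, this sock file ended up owned by root.

First delete the sock file by hand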

$ sudo rm -rf /tmp/mongodb-27017.sock

then start the service again, and it comes up normally.

Checking the file owner shows it has become mongodb

$ ls -lsah /tmp/mongodb-27017.sock
0 srwx------ 1 mongodb mongodb 0 Aug 24 04:01 /tmp/mongodb-27017.sock
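The ownership check can be done before deleting anything. The sketch below is illustrative only: socket_owner_ok is a hypothetical helper, and the expected account name is an assumption that differs between systems (the unit file above chowns to mongod, while the listing shows mongodb).

```shell
#!/bin/bash
# socket_owner_ok SOCK USER
# Returns 0 when SOCK is absent or owned by USER, 1 otherwise.
# Uses GNU stat (-c '%U'), as shipped on CentOS.
socket_owner_ok() {
    local sock="$1" want="$2"
    [ -e "$sock" ] || return 0      # nothing there: mongod will create it
    [ "$(stat -c '%U' "$sock")" = "$want" ]
}

# Example (service account name is an assumption; verify it on your host):
#   socket_owner_ok /tmp/mongodb-27017.sock mongod || sudo rm -f /tmp/mongodb-27017.sock
```

Removing the socket only when the owner is wrong avoids deleting the socket of a mongod that is already running correctly.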

Next, restore the backed-up db data

mongorestore -d xxstore --drop /home/../mongobk/../xxstore/

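and everything is back to normal. As the saying goes: back up often while nothing is wrong, and nothing goes wrong when you back up often @@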
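Since the recovery depended entirely on having a dump to restore from, here is a rough sketch of a dated backup that produces a directory usable with the mongorestore command above. The function name backup_db, the DRY_RUN switch and the date-stamped layout are my own assumptions, not what was running on this host; --db and --out are standard mongodump options.

```shell
#!/bin/bash
# backup_db DB ROOT
# Dump DB into ROOT/YYYYmmdd using mongodump's --db and --out options.
# mongodump then writes the files under ROOT/YYYYmmdd/DB/.
# With DRY_RUN=1 the command is printed instead of executed.
backup_db() {
    local db="$1" root="$2"
    local out
    out="$root/$(date +%Y%m%d)"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mongodump --db $db --out $out"
    else
        mkdir -p "$out" && mongodump --db "$db" --out "$out"
    fi
}

# Example (backup root is a placeholder):
#   DRY_RUN=1 backup_db xxstore /path/to/mongobk
```

Run from cron, this keeps one directory per day; pruning old directories is left out of the sketch.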

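If you want to restore into a mongo container running in Docker:

because the dump comes from outside the container, the index definitions that mongo writes into the dump by default are of no use.

Skip them during the restore so that it does not fail; otherwise this message appears: no indexes to restore for collection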

mongorestore --db dbName ./dumpPath/  --noIndexRestore

Reference:

https://stackoverflow.com/questions/27933169/create-mongo-backup-restore-without-indexes
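For the Docker case the dump first has to get into the container. A possible sequence, sketched under assumptions: the container name, the /tmp/restore target (assumed not to exist yet inside the container) and the DRY_RUN switch are all hypothetical; docker cp, docker exec and mongorestore's --noIndexRestore are existing options.

```shell
#!/bin/bash
# restore_into_container CONTAINER DB DUMP_DIR
# Copy DUMP_DIR into the container as /tmp/restore, then run mongorestore
# there without rebuilding indexes.
# With DRY_RUN=1 the commands are only printed.
restore_into_container() {
    local ctr="$1" db="$2" dump="$3"
    local run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    $run docker cp "$dump" "$ctr:/tmp/restore" &&
    $run docker exec "$ctr" mongorestore --db "$db" --noIndexRestore /tmp/restore
}

# Example (container name is a placeholder):
#   DRY_RUN=1 restore_into_container mymongo dbName ./dumpPath
```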

 

To restore a single collection, locate its .bson file and use the command below (--port <port> is for connecting to a remote database):

mongorestore --port <port> --db <destination database> --collection <collection-name> <data-dump-path/dbname/collection.bson> --drop

https://docs.cloudmanager.mongodb.com/tutorial/restore-single-database/
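Finding the right .bson file is mostly a matter of listing the dump directory, since mongodump writes one <collection>.bson per collection. A small sketch (list_collections is a hypothetical helper name):

```shell
#!/bin/bash
# list_collections DB_DUMP_DIR
# Print one collection name per line, derived from the *.bson files
# found directly inside DB_DUMP_DIR.
list_collections() {
    local dir="$1" f
    for f in "$dir"/*.bson; do
        [ -e "$f" ] || continue     # no match: the glob stays literal, skip it
        basename "$f" .bson
    done
}

# Example:
#   list_collections <data-dump-path>/<dbname>
```

Each printed name can then be passed to --collection, with the matching .bson file as the path argument.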

 

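Later it occurred to me that the /sys/kernel/mm/transparent_hugepage/enabled issue also needs to be handled.

First add a startup script named disable-transparent-hugepages under /etc/init.d/

with the following content: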

#!/bin/bash
### BEGIN INIT INFO
# Required-Start: $local_fs
# Required-Stop:
# X-Start-Before: mongod mongodb-mms-automation-agent
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description: Disable Linux transparent huge pages, to improve
# database performance.
### END INIT INFO
case $1 in
  start)
    if [ -d /sys/kernel/mm/transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/transparent_hugepage
    elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/redhat_transparent_hugepage
    else
      # Nothing to do: this kernel exposes no THP directory
      exit 0
    fi

    echo 'never' > ${thp_path}/enabled
    echo 'never' > ${thp_path}/defrag

    re='^[0-1]+$'
    if [[ $(cat ${thp_path}/khugepaged/defrag) =~ $re ]]
    then
      # RHEL 7
      echo 0 > ${thp_path}/khugepaged/defrag
    else
      # RHEL 6
      echo 'no' > ${thp_path}/khugepaged/defrag
    fi

    unset re
    unset thp_path
    ;;
esac

 

Then register the script, run it once, and restart mongo:

# chkconfig --add disable-transparent-hugepages

# /etc/init.d/disable-transparent-hugepages start

# systemctl restart mongod
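To confirm the setting took effect, the kernel file can be read back: the active value is the one in square brackets, for example "always madvise [never]". A minimal sketch (thp_active_mode is a hypothetical helper name):

```shell
#!/bin/bash
# thp_active_mode: read the contents of a transparent_hugepage "enabled"
# (or "defrag") file on stdin and print the bracketed, currently active value.
thp_active_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Example on the host:
#   thp_active_mode < /sys/kernel/mm/transparent_hugepage/enabled
```

After the init script has run, this should print never for both the enabled and defrag files.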