mariadb cluster 我以前把節點(node)建在不同的機器之間組成cluster,這樣運作了五六年沒什麼問題。
邏輯上不同機器可以分散在不同的物理位置達到備援或分散的效果,事實上我還是放在同一個esxi中。
這近透濄重建cluster的機會,把多個節點放在同一個docker中,這樣一台機器就能達到cluster的效果,一開始我還懷疑這樣子效能行不行?
畢竟在同一台虛擬機中做叢集似乎就等於在同一台機器建一個mysql server一樣的意思?
事實證明docker又顛覆我的想法,真的感覺效能不錯。
這篇文章主要是參考[2] 的內容整理,移除了測試和監測的部分。
系統與架構
Rocky Linux release 9.4 (Blue Onyx)
Docker version 26.1.3, build b72abbb
Docker Compose version v2.27.0
架構上就是 DB1 DB2 DB3 三個組成叢集,透過haproxy給外部讀取
DOCKER[ (DB1 + DB2 + DB3) <--> HAPROXY ] <---> INTERNET/INTRANET
docker-compose.yml
先處理 mariadb-gelera,原本使用[1]提供的版本,後來一直弄不起來,最後採用[2]提供的文件弄起來,運作很順暢,他的完整設定檔放在[3]。
我略做修改如下:
services:
db01:
image: mariadb-galera
container_name: db01
hostname: db01
ports:
- 13306:3306 # <== 原則上測試連線用,測試完畢後移除
volumes:
- ./db01_data:/var/lib/mysql
- type: bind
source: ./db01/conf.d/galera.cnf
target: /etc/mysql/conf.d/galera.cnf
environment:
- MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=yes # <== 初始建立時方便查錯,建立完請移除
# - MARIADB_ROOT_PASSWORD=123456 <== 建立完畢後設定root密碼
- TZ=Asia/Taipei
restart: on-failure
networks:
db_network:
command: ["--wsrep-new-cluster"]
# db02 db03 基本上和 db01一樣,一點點不同
db02:
image: mariadb-galera
container_name: db02
hostname: db02
volumes:
- ./db02_data:/var/lib/mysql
- type: bind
source: ./db02/conf.d/galera.cnf
target: /etc/mysql/conf.d/galera.cnf
environment:
- MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=yes # <== 初始建立時方便查錯,建立完請移除
# - MARIADB_ROOT_PASSWORD=123456 <== 建立完畢後設定root密碼
- TZ=Asia/Taipei
restart: on-failure
depends_on:
- db01
networks:
db_network:
db03:
image: mariadb-galera
container_name: db03
hostname: db03
volumes:
- ./db03_data:/var/lib/mysql
- type: bind
source: ./db03/conf.d/galera.cnf
target: /etc/mysql/conf.d/galera.cnf
environment:
- MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=yes # <== 初始建立時方便查錯,建立完請移除
# - MARIADB_ROOT_PASSWORD=123456 <== 建立完畢後設定root密碼
- TZ=Asia/Taipei
restart: on-failure
depends_on:
- db01
networks:
db_network:
networks:
db_network:
driver: bridge
先別急著啟動,先完成設定檔
建立目錄 db01, db02, db03, 及裡面的conf.d/ 看起來是這樣的:
├── db01
│ └─── conf.d
│ └── galera.cnf
├── db02
│ └─── conf.d
│ └── galera.cnf
├── db03
│ └─── conf.d
│ └── galera.cnf
└── docker-compose.yml
galera.cnf 設定
三個目錄中的 galera.cnf 的設定都一樣:
[mysqld] log_bin ='/var/log/mysql/mariadb-bin' binlog_format='row' expire_logs_days=1 default-time-zone = "+08:00" #log_error # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/. # Read the manual for more InnoDB related options. There are many! default_storage_engine='InnoDB' # https://mariadb.com/kb/en/set-transaction/ transaction-isolation='READ-COMMITTED' # https://mariadb.com/docs/reference/mdb/cli/mariadbd/innodb-flush-log-at-trx-commit/ innodb-flush-log-at-trx-commit=1 # https://mariadb.com/docs/reference/mdb/system-variables/innodb_autoinc_lock_mode/ innodb_autoinc_lock_mode=2 # https://www.digitalocean.com/community/tutorials/how-to-change-a-mariadb-data-directory-to-a-new-location-on-centos-7 datadir='/var/lib/mysql' socket='/run/mysqld/mysqld.sock' # https://dba.stackexchange.com/questions/130922/error-wsrep-gcs-src-gcs-cppgcs-open1379-failed-to-open-channel-test-clu/131487 [galera] wsrep_on='ON' wsrep_provider='/usr/lib/galera/libgalera_smm.so' wsrep_cluster_address='gcomm://db01,db02,db03' wsrep_cluster_name='my_galera' wsrep_sst_method='rsync' wsrep_slave_threads=2 wsrep_node_address='db01' #wsrep_notify_cmd='/etc/mysql/scripts/my-wsrep-notify.sh' #pxc_strict_mode='PERMISSIVE' # https://qiita.com/chaspy/items/baad6947ae0f8b169868 wsrep_auto_increment_control='ON' wsrep_drupal_282555_workaround='ON' wsrep_retry_autocommit=10 #bind-address=0.0.0.0 #bind-address=127.0.0.1 #bind_address=* # Can't connect to MySQL server on '127.0.0.1' # https://takapi86.hatenablog.com/entry/2018/12/16/140314 # https://kamatimaru.hatenablog.com/entry/2020/05/30/023739 [client] protocol='TCP' #host='127.0.0.1' #port=3306
主要修改 [galera] 中的欄位。
先別急著啟動docker,請看下面的起動順序。
mariaDB CLUSTER起動順序
先啟動第一台 db01
# docker compose up -d db01
# docker ps -a
確定無誤後再依序啟動 db02 db03
# docker compose up -d db02 db03
# docker ps -a
這時會建立三個目錄,分別是資料庫存放的位置,這就正確了。
├── db01
├── db01_data
├── db02
├── db02_data
├── db03
├── db03_data
└── docker-compose.yml
測試
先安裝os層的mysql client程式:
# yum install mysql
# mysql -u root -h 127.0.0.1 -P 13306
能進得去就ok了。
mariaDB CLUSTER重新起動順序
如果不小時需要重新起動的話,修改 db01_data/grastate.dat
把
safe_to_bootstrap: 0
改成
safe_to_bootstrap: 1
再依上面的起動順序起動。
錯誤排除
[2024.0813]
今天遇到一個情況,docker把系統的空間吃光光了,導致後來的資料無法新增,出現 DISK FULL的報錯。
進到系統中建目錄查看發現是docker的目錄塞爆了空間
# du -ah --max-depth=1
於是當機立斷,把docker關閉/重建(down/up),並釋放占用的空間…結果
就開不起來了。
報錯的內容
[ERROR] Found 6 prepared transactions! It means that mysqld was not shut down properly last time and critical recovery information (last binlog or tc.log file) was manually deleted after a crash. You have to start mysqld with --tc-heuristic-recover switch to commit or rollback pending transactions.
[ERROR] Aborting
簡單來說就是有未完成的寫入,這是由於空間占滿的關係。
最後不斷的試錯,花了五個小時!!!,終於找到解法成功解決了,感謝網路諸多大神前輩不吝提供各種建議(別人的帖子)。
第一步,db01的命令寫成這樣
["--tc-heuristic-recover=COMMIT", "--wsrep-new-cluster"]
執行完畢後會 db01 exited with code 0,出現找不到節點的錯誤
db01 | 2024-08-13 21:32:21 0 [Note] WSREP: Last wsrep seqno to be recovered 1682372
db01 | 2024-08-13 21:32:21 0 [Note] Please restart without --tc-heuristic-recover
db01 | 2024-08-13 21:32:21 0 [ERROR] Can't init tc log
db01 | 2024-08-13 21:32:21 0 [ERROR] Aborting
db01 | 2024-08-13 21:32:21 2 [ERROR] WSREP: Exception: State wait was interrupted
db01 | 2024-08-13 21:32:21 2 [ERROR] WSREP: View callback failed. This is unrecoverable, restart required. (FATAL)
這時把命令列改成這樣,就能順利啟動了
[ "--wsrep-new-cluster"]
再重新建立就救活了。
設置haproxy
docker-compose.yml
haproxy:
image: haproxy:latest
container_name: haprox
ports:
- 3306:3306
volumes:
- ./haproxy:/usr/local/etc/haproxy
networks:
db_network:
請先建立目錄haproxy,目錄看起是這樣的:
├── db01
├── db01_data
├── db02
├── db02_data
├── db03
├── db03_data
├── haproxy
│ └── haproxy.cfg
└── docker-compose.yml
其中的 haproxy.cfg
global # pidfile /var/run/haproxy.pid maxconn 4000 user haproxy group haproxy daemon # turn on stats unix socket stats socket /var/lib/haproxy/stats # utilize system-wide crypto-policies ssl-default-bind-ciphers PROFILE=SYSTEM ssl-default-server-ciphers PROFILE=SYSTEM defaults mode tcp log 127.0.0.1 local2 warning retries 3 timeout http-request 30s timeout queue 1m timeout connect 30s timeout client 1m timeout server 1m timeout http-keep-alive 30s timeout check 10s maxconn 3000 frontend mysqldb bind :3306 default_backend mysqlpool backend mysqlpool balance leastconn server node1 db01:3306 check weight 1 server node2 db02:3306 check weight 1 server node3 db03:3306 check weight 1
設置完畢就可以啟動 haproxy
結論
上面看起來有點複雜,事實上真的很複雜,我去蕪存菁設定並測試了幾個禮拜確定無誤後才真正運行,辛苦非常的值得。
參考資料
[1] Bitnami Mariadb https://hub.docker.com/r/bitnami/mariadb-galera
[2] 日文 https://gkzz.dev/posts/mariadb-galera-cluster-docker-compose/
[3] https://github.com/gkzz/mariadb-galera-cluster/blob/main/README.md