DRBD+HB Basic Troubleshooting
Checking DRBD status
[root@mysql_01 ~]# drbd-overview
** When the connection is healthy
0:sktdrbd Connected Primary/Secondary UpToDate/UpToDate C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%
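The fields of a drbd-overview line can also be checked by a script rather than by eye. The following is a minimal sketch, assuming the column order shown in the sample output above (resource, connection state, roles, disk states); parse_drbd_line and drbd_is_healthy are hypothetical helper names, not part of DRBD.

```shell
#!/bin/sh
# Split one drbd-overview line into named fields.
# Assumption: the column order matches the sample output in this post.
parse_drbd_line() {
    # $1: one line of drbd-overview output
    printf '%s\n' "$1" | awk '{
        split($3, role, "/")   # e.g. Primary/Secondary
        split($4, disk, "/")   # e.g. UpToDate/UpToDate
        printf "resource=%s cstate=%s local_role=%s peer_role=%s local_disk=%s peer_disk=%s\n",
               $1, $2, role[1], role[2], disk[1], disk[2]
    }'
}

# Return 0 only when the line shows Connected and UpToDate/UpToDate.
drbd_is_healthy() {
    case "$(parse_drbd_line "$1")" in
        *"cstate=Connected "*"local_disk=UpToDate peer_disk=UpToDate") return 0 ;;
        *) return 1 ;;
    esac
}
```

On a real node this could be called as drbd_is_healthy "$(drbd-overview | head -n 1)", which assumes a single DRBD resource as in this setup.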
Case1. The DRBD connection is down, but there is no link down and no DRBD crash (for example, after a normal reboot)
[root@mysql_01 ~]# drbd-overview
0:sktdrbd WFConnection Primary/Unknown UpToDate/DUnknown C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%
[root@mysql_02 ~]# drbd-overview
0:sktdrbd StandAlone Secondary/Unknown UpToDate/DUnknown r-----
** Solution: run the command below on the DB node that is in the StandAlone state
[root@mysql_02 ~]# drbdadm connect all
[root@mysql_02 ~]# drbd-overview
0:sktdrbd Connected Secondary/Primary UpToDate/UpToDate C r-----
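The decision in Case1 depends only on the connection-state column, so it can be sketched as a small lookup. This is an illustration under the assumption that the second column of drbd-overview is the connection state, as in the output above; suggest_action and its return tokens are hypothetical.

```shell
#!/bin/sh
# Map the connection state of one drbd-overview line to a next step.
# Tokens: ok, connect (run "drbdadm connect all" on this node),
# check-peer (this node is waiting; look at the other node), unknown.
suggest_action() {
    cstate=$(printf '%s\n' "$1" | awk '{ print $2 }')
    case "$cstate" in
        Connected)    echo "ok" ;;
        StandAlone)   echo "connect" ;;
        WFConnection) echo "check-peer" ;;
        *)            echo "unknown" ;;
    esac
}
```

For the mysql_02 line shown above (StandAlone) this yields "connect", which corresponds to running drbdadm connect all on that node.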
Case2. All DB nodes are StandAlone (but this is not a split-brain situation)
[root@mysql_01 ~]# drbd-overview
0:sktdrbd StandAlone Primary/Unknown UpToDate/DUnknown r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%
[root@mysql_02 ~]# drbd-overview
0:sktdrbd StandAlone Secondary/Unknown UpToDate/DUnknown r-----
** Solution
1. Run the command below on every DB node
[root@mysql_01 ~]# drbdadm connect all
[root@mysql_01 ~]# drbd-overview
0:sktdrbd WFConnection Primary/Unknown UpToDate/DUnknown C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%
[root@mysql_02 ~]# drbdadm connect all
[root@mysql_02 ~]# drbd-overview
0:sktdrbd Connected Secondary/Primary UpToDate/UpToDate C r-----
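After drbdadm connect all has been run on both nodes, the first node stays in WFConnection until the peer joins. A polling loop is one way to wait for that; the sketch below is an assumption-laden example in which wait_for_connected is a hypothetical helper that takes the status command as an argument, so it can be tried without DRBD installed.

```shell
#!/bin/sh
# wait_for_connected CMD TRIES DELAY
# Runs CMD (expected to print drbd-overview style lines) up to TRIES times,
# sleeping DELAY seconds between attempts. Returns 0 once every line
# reports "Connected" in its second column, 1 otherwise.
wait_for_connected() {
    cmd=$1
    tries=$2
    delay=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Empty output counts as "not connected".
        if $cmd | awk '$2 != "Connected" { bad = 1 } END { exit (NR == 0 || bad) }'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}
```

A possible call on a node is wait_for_connected drbd-overview 10 3, which checks up to ten times at three-second intervals.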
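Case3. Split-brain
#Split-brain: a situation in which, while data is being synchronized by DRBD (or another redundancy solution), an external factor causes the two copies of the data to diverge, so that synchronization fails
Symptom: the nodes remain StandAlone even after running the drbdadm connect all command above
Solution
** All data on the Secondary must be discarded and then resynced.
1. Run drbdadm disconnect all on every DB node
2. On the Secondary DB node, run the command below to discard its data
drbdadm -- --discard-my-data connect all
3. Run drbdadm connect all on every DB node
--> Use caution: this discards the data on the Secondary DB node.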
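Because step 2 is destructive, it can help to print the per-node command sequence and review it before running anything. The sketch below mirrors the three steps listed above; split_brain_steps is a hypothetical helper that only prints commands and does not execute them.

```shell
#!/bin/sh
# Print the split-brain recovery commands for one node, in order.
# $1: "primary" (data is kept) or "secondary" (data is discarded).
# Nothing is executed; the output is meant to be reviewed first.
split_brain_steps() {
    case "$1" in
        primary|secondary) ;;
        *) echo "usage: split_brain_steps primary|secondary" >&2; return 1 ;;
    esac
    echo "drbdadm disconnect all"                         # step 1, every node
    if [ "$1" = secondary ]; then
        echo "drbdadm -- --discard-my-data connect all"   # step 2, Secondary only
    fi
    echo "drbdadm connect all"                            # step 3, every node
}
```

Running split_brain_steps secondary on the node whose data should be thrown away lists three commands; once they have been checked, they can be entered by hand in that order.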
Checking heartbeat status
** When the connection is healthy
[root@mysql_01 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Mon Nov 26 10:56:40 2012
Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)
2 Nodes configured.
1 Resources configured.
============
Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online
Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online
Resource Group: group_1
IPaddr_200_200_200_209 (heartbeat::ocf:IPaddr): Started mysql_01
IPaddr_172_18_79_209 (heartbeat::ocf:IPaddr): Started mysql_01
drbddisk_3 (heartbeat:drbddisk): Started mysql_01
Filesystem_4 (heartbeat::ocf:Filesystem): Started mysql_01
mysql_5 (heartbeat::ocf:mysql): Started mysql_01
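For monitoring, the resource lines of this output can be filtered so that only resources that are not Started remain. This is a rough sketch that assumes the crm_mon line format shown above, where the second field of a resource line begins with "(heartbeat:"; resource_states and stopped_resources are hypothetical names.

```shell
#!/bin/sh
# Read crm_mon one-shot output on stdin and print "<resource> <state>"
# for each resource line (lines whose 2nd field starts with "(heartbeat:").
resource_states() {
    awk '$2 ~ /^\(heartbeat:/ { print $1, $3 }'
}

# Print only the names of resources whose state is not "Started".
stopped_resources() {
    awk '$2 ~ /^\(heartbeat:/ && $3 != "Started" { print $1 }'
}
```

On a host where crm_mon defaults to one-shot mode, as in the output above, crm_mon | stopped_resources prints nothing when every resource in the group is running.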
Case1. Commands to stop and restart mysql
[root@mysql_01 ~]# crm_resource -r mysql_5 -p target_role -v stopped
[root@mysql_01 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Mon Nov 26 10:58:19 2012
Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)
2 Nodes configured.
1 Resources configured.
============
Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online
Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online
Resource Group: group_1
IPaddr_200_200_200_209 (heartbeat::ocf:IPaddr): Started mysql_01
IPaddr_172_18_79_209 (heartbeat::ocf:IPaddr): Started mysql_01
drbddisk_3 (heartbeat:drbddisk): Started mysql_01
Filesystem_4 (heartbeat::ocf:Filesystem): Started mysql_01
mysql_5 (heartbeat::ocf:mysql): Stopped
** Restart mysql
[root@mysql_01 ~]# crm_resource -r mysql_5 -p target_role -v started
[root@mysql_01 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Mon Nov 26 10:59:33 2012
Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)
2 Nodes configured.
1 Resources configured.
============
Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online
Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online
Resource Group: group_1
IPaddr_200_200_200_209 (heartbeat::ocf:IPaddr): Started mysql_01
IPaddr_172_18_79_209 (heartbeat::ocf:IPaddr): Started mysql_01
drbddisk_3 (heartbeat:drbddisk): Started mysql_01
Filesystem_4 (heartbeat::ocf:Filesystem): Started mysql_01
mysql_5 (heartbeat::ocf:mysql): Started mysql_01
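The stop and start commands differ only in the value passed to target_role, so a small wrapper can guard against typos in that value. The sketch below only builds the command line used above; target_role_cmd is a hypothetical helper and it executes nothing.

```shell
#!/bin/sh
# Print the crm_resource command that sets target_role for a resource.
# $1: resource name (e.g. mysql_5)   $2: started | stopped
target_role_cmd() {
    res=$1
    role=$2
    case "$role" in
        started|stopped) ;;
        *) echo "role must be 'started' or 'stopped'" >&2; return 1 ;;
    esac
    echo "crm_resource -r $res -p target_role -v $role"
}
```

For example, target_role_cmd mysql_5 stopped prints the same stop command that was run by hand at the start of this case.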
Case2. mysql fails to start during a Heartbeat failover (an error message is shown)
[root@mysql_01 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Mon Nov 26 11:01:18 2012
Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)
2 Nodes configured.
1 Resources configured.
============
Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online
Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online
Resource Group: group_1
IPaddr_200_200_200_209 (heartbeat::ocf:IPaddr): Started mysql_01
IPaddr_172_18_79_209 (heartbeat::ocf:IPaddr): Started mysql_01
drbddisk_3 (heartbeat:drbddisk): Stopped
Filesystem_4 (heartbeat::ocf:Filesystem): Stopped
mysql_5 (heartbeat::ocf:mysql): Stopped
Failed actions:
mysql_5_start_0 (node=mysql_01, call=38, rc=-2): Timed Out
Solution
[root@mysql_01 ~]# crm_resource -H mysql_01 -r mysql_5 -C    # clear the error message
[root@mysql_01 ~]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode
============
Last updated: Mon Nov 26 11:02:23 2012
Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)
2 Nodes configured.
1 Resources configured.
============
Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online
Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online
Resource Group: group_1
IPaddr_200_200_200_209 (heartbeat::ocf:IPaddr): Started mysql_01
IPaddr_172_18_79_209 (heartbeat::ocf:IPaddr): Started mysql_01
drbddisk_3 (heartbeat:drbddisk): Started mysql_01
Filesystem_4 (heartbeat::ocf:Filesystem): Started mysql_01
mysql_5 (heartbeat::ocf:mysql): Started mysql_01
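The node and resource names needed for the cleanup command both appear in the "Failed actions" entry, so the command can be derived from the crm_mon output. The following sketch assumes entries of the form <resource>_<action>_<interval> (node=<name>, ...) as shown above; cleanup_cmds is a hypothetical helper that prints commands without running them.

```shell
#!/bin/sh
# Read crm_mon output on stdin and print one "crm_resource ... -C"
# command per entry listed under "Failed actions:".
cleanup_cmds() {
    awk '
        /Failed actions:/ { in_failed = 1; next }
        in_failed && match($0, /node=[^,]+/) {
            node = substr($0, RSTART + 5, RLENGTH - 5)   # text after "node="
            res = $1                                     # e.g. mysql_5_start_0
            sub(/_[a-z]+_[0-9]+$/, "", res)              # drop _<action>_<interval>
            printf "crm_resource -H %s -r %s -C\n", node, res
        }'
}
```

With the failed start shown in this case, the output is the same crm_resource -H mysql_01 -r mysql_5 -C command that was used above.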
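Case3. mysql fails to start during a Heartbeat failover (the DB still fails to start even after the error message is cleared)
=> In this case, start the DB in single mode, attempt DB recovery, and then start H/B again to try to bring the DB back to normal
[root@mysql_01 ~]# /etc/init.d/heartbeat stop    (or: killall heartbeat)
(If heartbeat does not stop cleanly, forcibly kill the heartbeat-related daemons.)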
[root@mysql_01 ~]# drbdadm primary all
[root@mysql_01 ~]# mount /dev/drbd0 /MYSQL/DATA
[root@mysql_01 ~]# /MYSQL/mysql/bin/mysqld_safe --user=mysql &
(InnoDB attempts recovery automatically; confirm that the DB has started normally.)
[root@mysql_01 ~]# /MYSQL/mysql/bin/mysqladmin -uroot -p shutdown
[root@mysql_01 ~]# umount /MYSQL/DATA
[root@mysql_01 ~]# drbdadm secondary all
[root@mysql_01 ~]# /etc/init.d/heartbeat start
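The storage-related steps before and after the manual mysqld run can be grouped into two functions with a dry-run switch, so the sequence can be reviewed before it touches a live node. This is a sketch only: run, take_over_storage, hand_back_to_heartbeat and the DRY_RUN variable are hypothetical, the mount point is passed in by the caller, and starting and shutting down mysqld is left as a manual step in between.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Before manual recovery: stop heartbeat, promote DRBD, mount the volume.
# $1: mount point of the DRBD-backed file system
take_over_storage() {
    mnt=$1
    run /etc/init.d/heartbeat stop &&
    run drbdadm primary all &&
    run mount /dev/drbd0 "$mnt"
}

# After mysqld has been shut down again: unmount, demote, restart heartbeat.
hand_back_to_heartbeat() {
    mnt=$1
    run umount "$mnt" &&
    run drbdadm secondary all &&
    run /etc/init.d/heartbeat start
}
```

Each function stops at the first failing command because the steps are chained with &&, which keeps a failed mount from being followed by later steps.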