DRBD+HB Basic Troubleshooting

swhwang 2014. 9. 24. 14:49



Checking DRBD status

[root@mysql_01 ~]# drbd-overview

** When the connection is healthy

0:sktdrbd  Connected Primary/Secondary UpToDate/UpToDate C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%
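
The status line above can also be checked from a script. The following is a minimal sketch, assuming the column layout seen in the sample output (resource, connection state, roles, disk states); the function names parse_drbd_line and drbd_line_is_healthy are made up for this illustration and are not part of DRBD.

```shell
# Split one drbd-overview line into named fields.
# Assumed layout, taken from the sample output in this post:
#   <minor>:<resource> <cstate> <role>/<peer role> <disk>/<peer disk> ...
parse_drbd_line() {
    printf '%s\n' "$1" | awk '{
        split($1, res, ":"); split($3, role, "/"); split($4, disk, "/")
        printf "resource=%s cstate=%s role=%s peer_role=%s disk=%s peer_disk=%s\n",
               res[2], $2, role[1], role[2], disk[1], disk[2]
    }'
}

# Succeeds only when the line shows Connected and UpToDate/UpToDate.
drbd_line_is_healthy() {
    printf '%s\n' "$1" |
        awk '$2 == "Connected" && $4 == "UpToDate/UpToDate" { ok = 1 } END { exit !ok }'
}
```

Example use: drbd_line_is_healthy "$(drbd-overview | head -n 1)" || echo "DRBD needs attention"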


Case1. The DRBD connection is down, but there is no link down and DRBD has not crashed (normal reboot)

[root@mysql_01 ~]# drbd-overview

0:sktdrbd  WFConnection Primary/Unknown UpToDate/DUnknown C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%

[root@mysql_02 ~]# drbd-overview

  0:sktdrbd  StandAlone Secondary/Unknown UpToDate/DUnknown r-----


** Fix: run the following command on the DB node that is in the StandAlone state

[root@mysql_02 ~]# drbdadm connect all

[root@mysql_02 ~]# drbd-overview

  0:sktdrbd  Connected Secondary/Primary UpToDate/UpToDate C r-----


Case2. All DB nodes are StandAlone (not a split-brain situation)

[root@mysql_01 ~]# drbd-overview

  0:sktdrbd  StandAlone Primary/Unknown UpToDate/DUnknown r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%

[root@mysql_02 ~]# drbd-overview

  0:sktdrbd  StandAlone Secondary/Unknown UpToDate/DUnknown r-----


** Fix

1. Run the following command on all DB nodes

[root@mysql_01 ~]# drbdadm connect all

[root@mysql_01 ~]# drbd-overview

  0:sktdrbd  WFConnection Primary/Unknown UpToDate/DUnknown C r----- /MYSQL/DATA ext3 9.9G 3.7G 5.8G 39%

[root@mysql_02 ~]# drbdadm connect all

[root@mysql_02 ~]# drbd-overview

  0:sktdrbd  Connected Secondary/Primary UpToDate/UpToDate C r-----
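
Case1 and Case2 come down to the same decision: look at the connection state and reconnect only the node that reports StandAlone. A small sketch of that decision follows; suggest_drbd_action is a hypothetical helper name, and it only prints a suggestion instead of running anything.

```shell
# Print the suggested action for one drbd-overview line.
# The second whitespace-separated field is assumed to be the connection state.
suggest_drbd_action() {
    cstate=$(printf '%s\n' "$1" | awk '{ print $2 }')
    case "$cstate" in
        Connected)    echo "nothing to do" ;;
        WFConnection) echo "waiting for peer: check the other node" ;;
        StandAlone)   echo "drbdadm connect all" ;;
        *)            echo "unhandled state: $cstate" ;;
    esac
}
```

Example use: suggest_drbd_action "$(drbd-overview | head -n 1)"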



Case3. Split-brain

#Split-brain: while data is being replicated by DRBD (or another redundancy solution), an external cause makes the two copies of the data diverge, and synchronization fails

Symptom: the nodes stay in the StandAlone state even after running drbdadm connect all as above

Fix

** The Secondary's data must be discarded and then resynchronized.

1. Run drbdadm disconnect all on all DB nodes

2. On the Secondary DB, run the following command to discard its data

drbdadm -- --discard-my-data connect all

3. Run drbdadm connect all on all DB nodes

--> Caution: this discards the data on the Secondary DB
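
Because step 2 is destructive, it can help to print the commands for each node before typing them. The sketch below only echoes the commands from the steps above and runs nothing; split_brain_plan is a name invented for this example. It assumes the node given as "secondary" is the one whose data is discarded, and since that node's step 2 already issues a connect, the sketch does not repeat a plain connect there.

```shell
# Print (do not run) the split-brain recovery commands for one node.
#   secondary = the node whose data will be discarded
#   primary   = the node whose data is kept
split_brain_plan() {
    case "$1" in
        secondary)
            echo "drbdadm disconnect all"
            echo "drbdadm -- --discard-my-data connect all"
            ;;
        primary)
            echo "drbdadm disconnect all"
            echo "drbdadm connect all"
            ;;
        *)
            echo "usage: split_brain_plan primary|secondary" >&2
            return 1
            ;;
    esac
}
```

Example use: run split_brain_plan secondary on mysql_02, compare the output with the steps above, then run the commands by hand.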

Checking Heartbeat status

** When the connection is healthy

[root@mysql_01 ~]# crm_mon

Defaulting to one-shot mode

You need to have curses available at compile time to enable console mode


============

Last updated: Mon Nov 26 10:56:40 2012

Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)

2 Nodes configured.

1 Resources configured.

============


Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online

Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online


Resource Group: group_1

    IPaddr_200_200_200_209      (heartbeat::ocf:IPaddr):        Started mysql_01

    IPaddr_172_18_79_209        (heartbeat::ocf:IPaddr):        Started mysql_01

    drbddisk_3  (heartbeat:drbddisk):   Started mysql_01

    Filesystem_4        (heartbeat::ocf:Filesystem):    Started mysql_01

    mysql_5     (heartbeat::ocf:mysql): Started mysql_01
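
To spot a problem without reading the whole report, the resource lines can be filtered. This is a rough sketch that assumes resource lines look like the ones above ("name (heartbeat...): state [node]"); list_unstarted_resources is a hypothetical helper name that reads crm_mon output on stdin.

```shell
# Read crm_mon one-shot output on stdin and print every resource
# whose state is anything other than "Started".
# Assumes the second field is the agent in the form "(heartbeat...):".
list_unstarted_resources() {
    awk '$2 ~ /^\(heartbeat.*\):$/ && $3 != "Started" { print $1, $3 }'
}
```

Example use: crm_mon | list_unstarted_resources (prints nothing when every resource is Started).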




Case1. Commands to stop and restart mysql

[root@mysql_01 ~]# crm_resource -r mysql_5 -p target_role -v stopped

[root@mysql_01 ~]# crm_mon

Defaulting to one-shot mode

You need to have curses available at compile time to enable console mode



============

Last updated: Mon Nov 26 10:58:19 2012

Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)

2 Nodes configured.

1 Resources configured.

============


Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online

Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online


Resource Group: group_1

    IPaddr_200_200_200_209      (heartbeat::ocf:IPaddr):        Started mysql_01

    IPaddr_172_18_79_209        (heartbeat::ocf:IPaddr):        Started mysql_01

    drbddisk_3  (heartbeat:drbddisk):   Started mysql_01

    Filesystem_4        (heartbeat::ocf:Filesystem):    Started mysql_01

    mysql_5     (heartbeat::ocf:mysql): Stopped


** Restarting mysql

[root@mysql_01 ~]# crm_resource -r mysql_5 -p target_role -v started

[root@mysql_01 ~]# crm_mon

Defaulting to one-shot mode

You need to have curses available at compile time to enable console mode



============

Last updated: Mon Nov 26 10:59:33 2012

Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)

2 Nodes configured.

1 Resources configured.

============


Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online

Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online


Resource Group: group_1

    IPaddr_200_200_200_209      (heartbeat::ocf:IPaddr):        Started mysql_01

    IPaddr_172_18_79_209        (heartbeat::ocf:IPaddr):        Started mysql_01

    drbddisk_3  (heartbeat:drbddisk):   Started mysql_01

    Filesystem_4        (heartbeat::ocf:Filesystem):    Started mysql_01

    mysql_5     (heartbeat::ocf:mysql): Started mysql_01
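
After changing target_role, the new state may take a moment to appear in crm_mon. The sketch below polls until a resource reports the wanted state. wait_for_resource_state and the CRM_MON_CMD variable are assumptions made for this example; the default "crm_mon -1" requests a one-shot report, which is what the sessions above fall back to anyway.

```shell
# Command used to fetch the cluster status; override it for testing.
CRM_MON_CMD=${CRM_MON_CMD:-"crm_mon -1"}

# wait_for_resource_state <resource> <state> [tries]
# Polls once per second; succeeds as soon as the resource line shows <state>.
wait_for_resource_state() {
    resource=$1
    wanted=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        # Third field of the resource line is assumed to be its state.
        state=$($CRM_MON_CMD 2>/dev/null |
                awk -v r="$resource" '$1 == r { print $3; exit }')
        [ "$state" = "$wanted" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Example use: crm_resource -r mysql_5 -p target_role -v started && wait_for_resource_state mysql_5 Started 60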


Case2. mysql fails to start on Heartbeat failover (an error message is shown)

[root@mysql_01 ~]# crm_mon

Defaulting to one-shot mode

You need to have curses available at compile time to enable console mode


============

Last updated: Mon Nov 26 11:01:18 2012

Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)

2 Nodes configured.

1 Resources configured.

============

Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online

Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online


Resource Group: group_1

    IPaddr_200_200_200_209      (heartbeat::ocf:IPaddr):        Started mysql_01

    IPaddr_172_18_79_209        (heartbeat::ocf:IPaddr):        Started mysql_01

    drbddisk_3  (heartbeat:drbddisk):   Stopped

    Filesystem_4        (heartbeat::ocf:Filesystem):    Stopped

    mysql_5     (heartbeat::ocf:mysql): Stopped


Failed actions:

    mysql_5_start_0 (node=mysql_01, call=38, rc=-2): Timed Out


Fix

[root@mysql_01 ~]# crm_resource -H mysql_01 -r mysql_5 -C    # clear the error message

[root@mysql_01 ~]# crm_mon

Defaulting to one-shot mode

You need to have curses available at compile time to enable console mode


============

Last updated: Mon Nov 26 11:02:23 2012

Current DC: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44)

2 Nodes configured.

1 Resources configured.

============


Node: mysql_01 (ad6f9cc8-6269-4ad6-bcce-ad4c074821e4): online

Node: mysql_02 (bad978b4-0686-4335-b49e-f6edb0f50e44): online


Resource Group: group_1

    IPaddr_200_200_200_209      (heartbeat::ocf:IPaddr):        Started mysql_01

    IPaddr_172_18_79_209        (heartbeat::ocf:IPaddr):        Started mysql_01

    drbddisk_3  (heartbeat:drbddisk):   Started mysql_01

    Filesystem_4        (heartbeat::ocf:Filesystem):    Started mysql_01

    mysql_5     (heartbeat::ocf:mysql): Started mysql_01
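
The node and resource for the cleanup command can be read straight from the "Failed actions" lines. The following sketch assumes each failed-action line has the form shown above, "<resource>_<action>_<interval> (node=<node>, ...)"; failed_action_cleanup_cmds is a hypothetical name, and the function only prints the commands.

```shell
# Read crm_mon output on stdin and print one cleanup command
# per failed action. Nothing is executed.
failed_action_cleanup_cmds() {
    awk '/\(node=/ {
        # Drop the trailing "_<action>_<interval>" to recover the resource name.
        n = split($1, part, "_")
        res = part[1]
        for (i = 2; i <= n - 2; i++) res = res "_" part[i]
        # Second field looks like "(node=mysql_01,"
        node = $2
        sub(/^\(node=/, "", node)
        sub(/,$/, "", node)
        print "crm_resource -H " node " -r " res " -C"
    }'
}
```

Example use: crm_mon | failed_action_cleanup_cmds, then review the printed commands before running them.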



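Case3. mysql fails to start on Heartbeat failover (the DB still fails to start after the error message is cleared)

=> In this case, start the DB in single mode (outside Heartbeat), attempt DB recovery, then start Heartbeat again to bring the DB back to normal

[root@mysql_01 ~]# /etc/init.d/heartbeat stop    (or: killall heartbeat)

(If heartbeat does not stop cleanly, forcibly kill the heartbeat-related daemons)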

[root@mysql_01 ~]# drbdadm primary all 

[root@mysql_01 ~]# mount /dev/drbd0 /MYSQL/DATA

[root@mysql_01 ~]# /MYSQL/mysql/bin/mysqld_safe --user=mysql &

(InnoDB attempts recovery automatically; confirm that the DB starts normally)

[root@mysql_01 ~]# /MYSQL/mysql/bin/mysqladmin -uroot -p shutdown

[root@mysql_01 ~]# umount /MYSQL/DATA

[root@mysql_01 ~]# drbdadm secondary all

[root@mysql_01 ~]# /etc/init.d/heartbeat start
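
The order of these steps matters: for example, the mount should not be attempted if promoting DRBD to primary failed. A minimal sketch of the same sequence that stops at the first failing step is shown below. run_step, recovery_prepare and recovery_finish are names made up for this example, the mount point is passed in as an argument, and DRY_RUN defaults to 1 so that the commands are only printed.

```shell
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run_step() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] && return 0
    "$@"
}

# Steps before starting mysqld_safe by hand: stop at the first failure.
recovery_prepare() {
    mount_point=$1
    run_step /etc/init.d/heartbeat stop &&
    run_step drbdadm primary all &&
    run_step mount /dev/drbd0 "$mount_point"
}

# Steps after the DB has recovered: hand control back to Heartbeat.
recovery_finish() {
    mount_point=$1
    run_step /MYSQL/mysql/bin/mysqladmin -uroot -p shutdown &&
    run_step umount "$mount_point" &&
    run_step drbdadm secondary all &&
    run_step /etc/init.d/heartbeat start
}
```

Example use: recovery_prepare <mount point>, start mysqld_safe and check the DB manually, then recovery_finish <mount point>. Set DRY_RUN=0 only after reviewing the printed commands.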


