High availability with corosync + drbd + mysql

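Requirements:

1. The nodes can communicate directly on the same network segment.

2. Each node name matches the output of uname -n and resolves to that node's IP address; configure this in the local /etc/hosts.

3. SSH mutual trust (key-based login) between the nodes.

4. Time is kept synchronized.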
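The name-resolution requirement is easy to verify before going further. The following is a minimal sketch, assuming bash and awk are available; the function name resolves_in_hosts is hypothetical, and the demo runs against a temporary file rather than the real /etc/hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if NAME appears as a hostname or alias
# on a non-comment line of the given hosts file (default /etc/hosts).
resolves_in_hosts() {
    local name=$1 file=${2:-/etc/hosts}
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }    # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$file"
}

# Demo against a throwaway file that mirrors the entries used in this article.
sample=$(mktemp)
printf '%s\n' \
    '192.168.10.55 test1.local test1' \
    '192.168.10.56 test2.local test2' > "$sample"

for node in test1.local test2.local; do
    if resolves_in_hosts "$node" "$sample"; then
        echo "$node: ok"
    else
        echo "$node: missing"
    fi
done
rm -f "$sample"
```

On a real node, calling resolves_in_hosts "$(uname -n)" with no second argument checks the local name against /etc/hosts.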

Environment preparation:

Configuration on test1 (192.168.10.55)

1. Configure the IP address

[root@test1 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

[root@test1 ~]# uname -n
[root@test1 ~]# hostname test1.local    # takes effect immediately, lost on reboot
[root@test1 ~]# vim /etc/sysconfig/network    # permanent

3. Configure hostname resolution

[root@test1 ~]# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2. Test communication by hostname

[root@test1 ~]# ping test1.local
[root@test1 ~]# ping test2.local

4. Configure SSH mutual trust (copy this node's public key to the peer)

[root@test1 ~]# ssh-keygen -t rsa -P ''
[root@test1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.10.56

5. Synchronize time with ntp

Add an ntpdate job to crontab that runs every 5 minutes so that the server clocks stay in sync.

[root@test1 ~]# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 > /dev/null 2>&1

Configuration on test2 (192.168.10.56)

1. Configure the IP address

[root@test2 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

[root@test2 ~]# uname -n
[root@test2 ~]# hostname test2.local    # takes effect immediately, lost on reboot
[root@test2 ~]# vim /etc/sysconfig/network    # permanent

3. Configure hostname resolution

[root@test2 ~]# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2. Test communication by hostname

[root@test2 ~]# ping test1.local
[root@test2 ~]# ping test1

4. Configure SSH mutual trust (copy this node's public key to the peer)

[root@test2 ~]# ssh-keygen -t rsa -P ''
[root@test2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.10.55

5. Synchronize time with ntp

Add an ntpdate job to crontab that runs every 5 minutes so that the server clocks stay in sync.

[root@test2 ~]# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 > /dev/null 2>&1

Install and configure heartbeat

Installing heartbeat directly with yum on CentOS fails with a "no package available" error.

Fix: install the EPEL repository first.

[root@test1 src]# wget http://mirrors.sohu.com/fedora-epel/6/i386/epel-release-6-8.noarch.rpm
[root@test1 src]# rpm -ivh epel-release-6-8.noarch.rpm

6.1. Install heartbeat:

[root@test1 src]# yum install heartbeat

6.2. Copy the sample configuration files:

[root@test1 src]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

6.3. Configure the authentication file:

[root@test1 src]# dd if=/dev/random count=1 bs=512 | md5sum    # generate a random string
[root@test1 src]# vim /etc/ha.d/authkeys
auth 1
1 md5 d0f70c79eeca5293902aiamheartbeat
[root@test1 src]# chmod 600 /etc/ha.d/authkeys
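The authkeys file can also be produced in one step instead of pasting a hash by hand. This is only a sketch under a few assumptions: the function name make_authkeys is hypothetical, md5 is kept as the method because that is what this article uses, and /dev/urandom replaces /dev/random so that the read cannot block on an idle machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two lines of an authkeys file,
# "auth <id>" followed by "<id> <method> <key>".
make_authkeys() {
    local key
    # /dev/urandom is used instead of /dev/random so the read never blocks.
    key=$(head -c 512 /dev/urandom | md5sum | cut -d' ' -f1)
    printf 'auth 1\n1 md5 %s\n' "$key"
}

# Usage (as root, on one node only; then copy the file to the peer):
#   umask 077
#   make_authkeys > /etc/ha.d/authkeys
make_authkeys
```

Both nodes must end up with the same file, so generate it once and copy it rather than running the helper on each node.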

Installing heartbeat on test2 is identical to test1 and is omitted here.

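6.4. Parameters of the main heartbeat configuration file:

[root@test2 ~]# vim /etc/ha.d/ha.cf
#debugfile /var/log/ha-debug    # debug log
logfile    # log file location
keepalive 2    # interval between heartbeats; default 2 seconds, ms units are allowed
deadtime 30    # how long without a heartbeat before the peer is declared dead
warntime 10    # how long without a heartbeat before a "late heartbeat" warning is logged
initdead 120    # how long the first node waits for the other nodes after it starts
baud 19200    # speed of the serial link
auto_failback on    # whether resources move back once the failed node recovers
ping 10.10.10.254    # ping node: a host to ping to check this node's own connectivity
ping_group group1 10.10.10.254 10.10.10.253    # ping node group: it counts as reachable if any one member answers
respawn hacluster /usr/lib/heartbeat/ipfail    # run ipfail as user hacluster and restart it whenever it exits
deadping 30    # how long a ping node may stay unreachable before it is declared dead
# serial serialportname ...    # which serial device to use
serial /dev/ttyS0    # Linux
serial /dev/cuaa0    # FreeBSD
serial /dev/cuad0    # FreeBSD 6.x
serial /dev/cua/a    # Solaris
# What interfaces to broadcast heartbeats over?
# With Ethernet, choose whether heartbeats are sent by broadcast, multicast or unicast
bcast eth0    # broadcast
mcast eth0 225.0.0.1 694 1 0    # multicast
ucast eth0 192.168.1.2    # unicast; only used when there are just two nodes
# Define the STONITH hosts
stonith_host * baytech 10.0.0.3 mylogin mysecretpassword
stonith_host ken3 rps10 /dev/ttyS1 kathy 0
stonith_host kathy rps10 /dev/ttyS1 ken3 0
# Tell what machines are in the cluster
# List every cluster node, one "node <hostname>" line each; the hostname must match uname -n
node ken3
node kathy

In most cases it is enough to define how heartbeats are sent and which nodes belong to the cluster:

bcast eth0
node test1.local
node test2.local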

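6.5. Define the haresources resource file:

[root@test2 ~]# vim /etc/ha.d/haresources
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
# Fields: the hostname of the preferred node (it must match uname -n), the VIP, then the resources. This example mounts a device on a directory with the given filesystem type. The arguments of a resource are separated by double colons.
#just.linux-ha.org 135.9.216.110 http
# Same layout as above. The http resource is a script under /etc/rc.d/init.d/; heartbeat looks in /etc/ha.d/resource.d/ first and falls back to /etc/rc.d/init.d/.
test1.local IPaddr::192.168.10.2/24/eth0 mysqld    # the IPaddr script configures the VIP
# Variant with DRBD-backed storage (the startup log in 7.1 shows only the line above in use):
#test1.local IPaddr::192.168.10.2/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld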
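To make the double-colon syntax concrete, the sketch below expands one resource spec into the script name and arguments that heartbeat passes when starting it, which matches log lines such as "Running /etc/ha.d/resource.d/IPaddr 192.168.10.2/24/eth0 start". The function name resource_start_cmd is hypothetical, and the sketch does not handle the bare-IP shorthand or resolve the script directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expand one haresources resource spec into the
# "<script> <args...> start" form, e.g. "IPaddr::192.168.10.2/24/eth0".
resource_start_cmd() {
    local spec=$1 script args
    script=${spec%%::*}            # text before the first "::"
    if [ "$script" = "$spec" ]; then
        args=""                    # no "::" means the resource takes no arguments
    else
        args=${spec#*::}           # everything after the first "::"
        args=${args//::/ }         # remaining "::" separate the arguments
    fi
    echo "$script${args:+ $args} start"
}

resource_start_cmd 'IPaddr::192.168.10.2/24/eth0'
resource_start_cmd 'Filesystem::/dev/drbd1::/data::ext3'
resource_start_cmd 'mysqld'
```

Resources on a line are started left to right and stopped in the reverse order, which is why the VIP and the filesystem are listed before mysqld.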

6.6. Copy the configuration files from test1.local to test2.local

[root@test1 ~]# scp -p /etc/ha.d/{ha.cf,haresources,authkeys} test2.local:/etc/ha.d/

7. Start heartbeat

[root@test1 ~]# service heartbeat start
[root@test1 ~]# ssh test2.local 'service heartbeat start'    # start heartbeat on test2 over ssh from test1

7.1. Check the heartbeat startup log

[root@test1 ~]# tail -f /var/log/messages
Feb 16 15:12:45 test-1 heartbeat: [16056]: info: Configuration validated. Starting heartbeat 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: heartbeat: version 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Heartbeat generation: 1455603909
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: ping heartbeat started.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Local status now set to: 'up'
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link 192.168.10.1:192.168.10.1 up.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Status update for node 192.168.10.1: status ping
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link test1.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Link test2.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Status update for node test2.local: status up
Feb 16 15:12:51 test-1 harc(default)[16068]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:12:52 test-1 heartbeat: [16057]: WARN: 1 lost packet(s) for [test2.local] [3:5]
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: No pkts missing from test2.local!
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Comm_now_up(): updating status to active
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Local status now set to: 'active'
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Status update for node test2.local: status active
Feb 16 15:12:52 test-1 harc(default)[16086]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: local resource transition completed.
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: Initial resource acquisition complete (T_RESOURCES(us))
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: remote resource transition completed.
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16138]: INFO: Resource is stopped
Feb 16 15:13:02 test-1 heartbeat: [16102]: info: Local Resource acquisition completed.
Feb 16 15:13:02 test-1 harc(default)[16219]: info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
Feb 16 15:13:02 test-1 ip-request-resp(default)[16219]: received ip-request-resp IPaddr::192.168.10.2/24/eth0 OK yes
Feb 16 15:13:02 test-1 ResourceManager(default)[16238]: info: Acquiring resource group: test1.local IPaddr::192.168.10.2/24/eth0 mysqld
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16264]: INFO: Resource is stopped
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.10.2/24/eth0 start
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Adding inet address 192.168.10.2/24 with broadcast address 192.168.10.255 to device eth0
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Bringing device eth0 up
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.10.2 eth0 192.168.10.2 auto not_used not_used
Feb 16 15:13:03 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16360]: INFO: Success
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/init.d/mysqld start
Feb 16 15:13:04 test-1 ntpd[1605]: Listen normally on 15 eth0 192.168.10.2 UDP 123

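Notes:

1. "Link test1.local:eth0 up" and "Link test2.local:eth0 up": both nodes are connected and in the up state.

2. "Link 192.168.10.1:192.168.10.1 up": the ping node is reachable as well.

3. "info: Running /etc/init.d/mysqld start": heartbeat has started mysql.

4. "Listen normally on 15 eth0 192.168.10.2 UDP 123": ntpd is now listening on the VIP, which shows that the VIP has been brought up.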
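The "Link ... up" entries are the quickest thing to look for in a long log. A minimal sketch that extracts them, assuming the log line format shown in this article; the function name links_up is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the "<node>:<interface>" part of every
# "info: Link ... up." line in a heartbeat log (default /var/log/messages).
links_up() {
    local file=${1:-/var/log/messages}
    sed -n 's/.*info: Link \([^ ]*\) up\.$/\1/p' "$file"
}

# Usage on a node:  links_up /var/log/messages
```

With the log above this would list the ping node and both cluster nodes; a node missing from the list has not been seen on the heartbeat link.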

7.2. Check the heartbeat VIP

[root@test1 ha.d]# ip add |grep "10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

[root@test2 ha.d]# ip add |grep "10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0

Note: the VIP is currently on test1.local; test2.local does not hold it.
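Note that grep "10.2" also matches the broadcast address 192.168.10.255, which is why the primary address lines show up in the output above. A stricter check compares the address field exactly. This is a sketch only; the function name has_vip is hypothetical and it reads `ip addr` style text from standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ip addr` output on stdin and succeed only if
# an "inet" entry carries exactly the given address (prefix length ignored).
has_vip() {
    local vip=$1
    awk -v vip="$vip" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")    # "192.168.10.2/24" -> "192.168.10.2"
                    if (a[1] == vip) found = 1
                }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on a node:
#   ip -4 addr show dev eth0 | has_vip 192.168.10.2 && echo "VIP is on this node"
```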

8. Test the result

8.1. Connect to mysql under normal conditions

[root@test3 ha.d]# mysql -uroot -h'192.168.10.2' -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0.00 sec)
mysql>

8.2. Stop heartbeat on test1.local

[root@test1 ha.d]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test1 ha.d]# ip add |grep "192.168.10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0
[root@test2 ha.d]# ip add |grep "192.168.10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

Note: the VIP has now moved to test2.local. Connect to mysql again and check server_id.

[root@test2 ha.d]# mysql -uroot -h'192.168.10.2' -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 2     |
+---------------+-------+
1 row in set (0.00 sec)
mysql>

Note: server_id has changed from 1 to 2, which proves that the VIP now reaches the mysql service on test2.local.

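Testing is complete. Next, configure drbd so that the two mysql servers use the same replicated filesystem, which gives mysql high availability for writes.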

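9. Configure DRBD

DRBD (Distributed Replicated Block Device) is a module in the Linux kernel. As a disk mirror it normally runs in a primary/secondary arrangement: only one node may read and write at a time, and the secondary node can neither read, write nor mount the device. DRBD does also have a dual-primary mode, and the primary and secondary roles can be switched. DRBD turns a disk or partition on each of two hosts into one mirrored device. When a client program sends a write request to the primary node, the data is replicated bit for bit over TCP/IP to the standby node, so whatever is stored on the primary has an identical copy on the standby. Because the mirror spans two hosts, DRBD does its work inside the kernel as a module, unlike a RAID 1 mirror, which lives within a single host.

How the dual-primary model works: when a node accesses data it loads the data and metadata into memory, and a lock that its kernel places on a file is invisible to the other node's kernel. Dual-primary therefore needs a way for each node to tell the other about the locks it holds. That requires a message layer (heartbeat or corosync), pacemaker (with DRBD defined as a resource), and formatting the mirrored device on both hosts with a cluster filesystem (GFS2/OCFS2). The dual-primary model is thus built on a Distributed Lock Manager (DLM) combined with a cluster filesystem. A DRBD cluster allows only two nodes, either dual-primary or primary/secondary.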

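9.1. The three DRBD replication protocols

A. The write is reported to the application as successful as soon as the local DRBD device has stored it. Asynchronous: efficient and fast, but data that has not yet reached the peer can be lost.

B. The write is reported as successful once it has been stored locally and all of the data has been sent over TCP/IP to the secondary DRBD. Semi-synchronous; rarely used.

C. The write is reported as successful only after it has been stored locally, sent over TCP/IP to the secondary DRBD, and stored there too. Synchronous: slower, but the safest and most reliable, and the most widely used.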

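9.2. DRBD resources

1. Resource name: any ASCII characters except whitespace.

2. DRBD device: the device file for this resource on both nodes, normally /dev/drbdN. All DRBD devices share the major number 147; the minor number N distinguishes the individual devices.

3. Disk configuration: the backing storage each node provides. It can be a partition, an LVM volume, or any other kind of block device.

4. Network configuration: the network settings the two nodes use when replicating data.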

9.3. Install DRBD

drbd was merged into the mainline kernel only in 2.6.33. The kernel used here is 2.6.32, so the module has to be built from source.

9.3.1. Download drbd

[root@test1 ~]# wget -P /usr/local/src http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz

9.3.2. Install the drbd software

[root@test1 ~]# cd /usr/local/src
[root@test1 src]# tar -zxvf drbd-8.4.3.tar.gz
[root@test1 src]# cd /usr/local/src/drbd-8.4.3
[root@test1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km
[root@test1 drbd-8.4.3]# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64
[root@test1 drbd-8.4.3]# make install
[root@test1 drbd-8.4.3]# mkdir -p /usr/local/drbd/var/run/drbd
[root@test1 drbd-8.4.3]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/

9.3.3. Install the drbd kernel module

[root@test1 drbd-8.4.3]# cd drbd/
[root@test1 drbd]# make clean
[root@test1 drbd]# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64
[root@test1 drbd]# cp drbd.ko /lib/modules/`uname -r`/kernel/lib/
[root@test1 drbd]# depmod -a    # refresh the module index so that modprobe can find drbd.ko
[root@test1 drbd]# modprobe drbd
[root@test1 drbd]# lsmod | grep drbd

9.3.4. Create a new partition for drbd

[root@test1 drbd]# fdisk /dev/sdb
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1305, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1305, default 1305): +9G
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot.
Syncing disks.
[root@test1 drbd]# partprobe /dev/sdb

The drbd installation and partitioning steps on test2 are the same as on test1 and are omitted. The drbd configuration files on test2 must be identical to those on test1; copy them over with scp.

10. Configure drbd

10.1. Configure the common drbd configuration file

[root@test1 drbd.d]# cd /usr/local/drbd/etc/drbd.d
[root@test1 drbd.d]# vim global_common.conf
global {    # global settings
    usage-count no;    # whether to take part in the vendor's count of drbd installations
    # minor-count dialog-refresh disable-ip-verification
}
common {    # defaults shared by all resources
    protocol C;    # use protocol C, the synchronous model, by default
    handlers {    # handlers: commands drbd runs when a failure event occurs
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when chosing your poison.
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";    # action after a split brain
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";    # action after a local I/O error
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
        # The two nodes must connect when the device starts; wait times and timeouts are set here
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        on-io-error detach;    # on an I/O error, detach the backing disk and stop replicating to it
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }
    net {    # network buffer and cache sizes, connection timing and so on
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
        cram-hmac-alg "sha1";    # algorithm used to authenticate the peer
        shared-secret "mydrbd1fa2jg8";    # shared secret for that authentication
    }
    syncer {
        rate 200M;    # resynchronization rate
    }
}

10.2. Configure the resource file; name the file after the resource it defines

[root@test1 drbd.d]# vim mydrbd.res
resource mydrbd {    # resource name: any ASCII characters except whitespace
    on test1.local {    # node 1; each node must be able to resolve the other by name
        device /dev/drbd0;    # name of the drbd device file
        disk /dev/sdb1;    # backing partition
        address 192.168.10.55:7789;    # node IP and listening port
        meta-disk internal;    # where the drbd metadata lives; internal keeps it on the backing device itself
    }
    on test2.local {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.10.56:7789;
        meta-disk internal;
    }
}

10.3. Both nodes use the same configuration files, so copy them to the other node

[root@test1 drbd.d]# scp -r /usr/local/drbd/etc/drbd.* test2.local:/usr/local/drbd/etc/

10.4. Initialize the defined resource on each node

[root@test1 drbd.d]# drbdadm create-md mydrbd
--== Thank you for participating in the global usage survey ==--
The server's response is:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@test1 drbd.d]#
[root@test2 drbd.d]# drbdadm create-md mydrbd
--== Thank you for participating in the global usage survey ==--
The server's response is:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@test2 drbd.d]#

10.5. Start the drbd service on both nodes

[root@test1 drbd.d]# service drbd start
[root@test2 drbd.d]# service drbd start

11. Test drbd synchronization

11.1. Check the drbd state after startup

[root@test1 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test1.local, 2016-02-23 10:23:03
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Note: both nodes are Secondary, and one of them can now be promoted to Primary. Inconsistent means the data has not been synchronized yet.
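The three fields that matter in this output are cs (connection state), ro (roles, local/peer) and ds (disk states, local/peer). A minimal sketch that pulls them out for one device, assuming the 8.4 /proc/drbd layout shown above; the function name drbd_state is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "cs ro ds" for a DRBD minor number, reading
# /proc/drbd-formatted text from the file given as $2 (default /proc/drbd).
drbd_state() {
    local minor=$1 file=${2:-/proc/drbd}
    awk -v m="$minor:" '
        $1 == m {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)    # connection state
                if ($i ~ /^ro:/) ro = substr($i, 4)    # roles, local/peer
                if ($i ~ /^ds:/) ds = substr($i, 4)    # disk states, local/peer
            }
            print cs, ro, ds
        }
    ' "$file"
}

# Usage on a node:  drbd_state 0
```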

11.2. Promote one node to primary and overwrite the drbd data on the peer. Run this on the node that is to become primary.

[root@test1 drbd.d]# drbdadm -- --overwrite-data-of-peer primary mydrbd

11.3. Check the synchronization state on the primary node

[root@test1 drbd.d]# watch -n 1 cat /proc/drbd
Every 1.0s: cat /proc/drbd                              Tue Feb 23 17:10:55 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test1.local, 2016-02-23 10:23:03
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:619656 nr:0 dw:0 dr:627840 al:0 bm:37 lo:1 pe:8 ua:64 ap:0 ep:1 wo:b oos:369144
    [=============>.......] sync'ed: 10.3% (369144/987896)K
    finish: 0:00:12 speed: 25,632 (25,464) K/sec

11.4. Check the state on the secondary node

[root@test2 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
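Before formatting or switching roles it is safer to wait until both sides report UpToDate. The sketch below polls for that state with a timeout; the function name wait_for_uptodate and the 600-second default are assumptions for illustration, not drbd tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a /proc/drbd-style file until minor $1 reports
# ds:UpToDate/UpToDate, or give up after $2 seconds (default 600).
wait_for_uptodate() {
    local minor=$1 timeout=${2:-600} file=${3:-/proc/drbd} waited=0
    until grep -Eq "^ *${minor}: .* ds:UpToDate/UpToDate( |\$)" "$file"; do
        [ "$waited" -ge "$timeout" ] && return 1    # still syncing at the deadline
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage on a node:
#   wait_for_uptodate 0 900 && echo "initial sync finished"
```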

11.5. On the primary node, create a filesystem, mount it, and write some test data

[root@test1 drbd]# mke2fs -j /dev/drbd0
[root@test1 drbd]# mkdir /mydrbd
[root@test1 drbd]# mount /dev/drbd0 /mydrbd
[root@test1 drbd]# cd /mydrbd
[root@test1 mydrbd]# touch drbd_test_file
[root@test1 mydrbd]# ls /mydrbd/
drbd_test_file lost+found

11.6. Demote the primary node to secondary, promote the secondary to primary, and check that the data was replicated

11.6.1. On the primary node

11.6.1.1. Unmount the filesystem. Leave the mount directory first, otherwise umount reports that the device is busy.

[root@test1 mydrbd]# cd ~
[root@test1 ~]# umount /mydrbd
[root@test1 ~]# drbdadm secondary mydrbd

11.6.1.2. Check the current drbd state

[root@test1 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: both drbd nodes are now Secondary. Next, promote the former secondary node to primary.

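11.6.2. On the secondary node

11.6.2.1. Promote it

[root@test2 ~]# drbdadm primary mydrbd

11.6.2.2. Mount the drbd device

[root@test2 ~]# mkdir /mydrbd
[root@test2 ~]# mount /dev/drbd0 /mydrbd/

11.6.2.3. Check that the data is there

[root@test2 ~]# ls /mydrbd/
drbd_test_file lost+found

Note: after the secondary was promoted, the replicated data is present. The drbd setup is complete. Next, combine it with corosync and mysql for dual-master high availability.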
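The manual switchover above always follows the same order: unmount and demote on the old primary, then promote and mount on the new one. A sketch of that sequence as a script, with the caveat that the function names run_on and switch_primary are hypothetical and that DRY_RUN defaults to 1, so the commands are only printed unless it is set to 0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: move the DRBD primary role from one node to the other.
# With DRY_RUN=1 (the default here) the commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run_on() {
    local host=$1; shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "[$host] $*"
    else
        ssh "root@$host" "$*"    # relies on the SSH trust set up earlier
    fi
}

switch_primary() {
    local from=$1 to=$2 res=${3:-mydrbd} dev=${4:-/dev/drbd0} mnt=${5:-/mydrbd}
    run_on "$from" umount "$mnt"            || return 1
    run_on "$from" drbdadm secondary "$res" || return 1
    run_on "$to"   drbdadm primary "$res"   || return 1
    run_on "$to"   mount "$dev" "$mnt"
}

switch_primary test1.local test2.local
```

Each step stops the sequence if it fails, so a busy mount point on the old primary never leads to a promotion attempt on the peer. This is the work that the cluster manager takes over in the next section.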

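12. Combine corosync + drbd + mysql for dual-master database high availability

Configuring drbd as a resource in a two-node corosync high-availability cluster makes the primary/secondary role switch automatic. Note that any service managed as a cluster resource must not be started automatically at boot.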

12.1. Disable drbd autostart on both nodes

12.1.1. On the primary node

[root@test1 drbd.d]# chkconfig drbd off
[root@test1 drbd.d]# chkconfig --list |grep drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

12.1.2. On the secondary node

[root@test2 drbd.d]# chkconfig drbd off
[root@test2 drbd.d]# chkconfig --list |grep drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off

12.2. Unmount the drbd filesystem and demote the primary node to secondary

12.2.1. On the secondary node. Note that this node was promoted to primary in the previous test, so demote it now.

[root@test2 drbd]# umount /mydrbd/
[root@test2 drbd]# drbdadm secondary mydrbd
[root@test2 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:8 nr:9728024 dw:9728032 dr:1073 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: make sure both nodes are Secondary.

12.3. Stop the drbd service on both nodes

12.3.1. On the secondary node

[root@test2 drbd]# service drbd stop
Stopping all DRBD resources: .
[root@test2 drbd]#

12.3.2. On the primary node

[root@test1 drbd.d]# service drbd stop
Stopping all DRBD resources: .
[root@test1 drbd.d]#

12.4. Install corosync and create the log directory

12.4.1. On the primary node

[root@test1 drbd.d]# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
[root@test1 drbd.d]# yum install corosync pacemaker crmsh
[root@test1 drbd.d]# mkdir /var/log/cluster

12.4.2. On the secondary node

[root@test2 drbd.d]# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
[root@test2 drbd.d]# mkdir /var/log/cluster
[root@test2 drbd.d]# yum install corosync pacemaker crmsh

12.5. The corosync configuration file

12.5.1. On the primary node

[root@test1 drbd.d]# cd /etc/corosync/
[root@test1 corosync]# cp corosync.conf.example corosync.conf

12.6. Edit the configuration file on the primary node, generate the corosync key file, and copy both the key and the main configuration file to the secondary node

12.6.1. On the primary node

[root@test1 corosync]# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
    version: 2
    # secauth: Enable mutual node authentication. If you choose to
    # enable this ("on"), then do remember to create a shared
    # secret with "corosync-keygen".
    secauth: on
    threads: 2
    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option.
        bindnetaddr: 192.168.10.0
        # However, if you have multiple physical network
        # interfaces configured for the same subnet, then the
        # network address alone is not sufficient to identify
        # the interface Corosync should bind to. In that case,
        # configure the *host* address of the interface
        # instead:
        bindnetaddr: 192.168.10.0
        # When selecting a multicast address, consider RFC
        # 2365 (which, among other things, specifies that
        # 239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        mcastaddr: 239.212.16.19
        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405
        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1    # packets are not allowed to cross a router
    }
}
logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off
    # Log to standard error. When in doubt, set to no. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: no
    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: no
    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off
    # Log messages with time stamps. When in doubt, set to on
    # (unless you are only logging to syslog, where double
    # timestamps can be annoying).
    timestamp: on
    logger_subsys