
High availability with corosync + DRBD + MySQL


Prerequisites:

1. The nodes must be able to communicate directly on the same network segment.

2. Each node's name must match the output of uname -n, and node names must resolve to the nodes' IP addresses; configure this in the local /etc/hosts.

3. Passwordless SSH trust between the nodes.

4. Time must be kept in sync.

Environment preparation:

test1 (192.168.10.55) configuration

1. Configure the IP address

[root@test1 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

[root@test1 ~]# uname -n
[root@test1 ~]# hostname test1.local                # takes effect immediately, lost on reboot
[root@test1 ~]# vim /etc/sysconfig/network          # persistent across reboots

3. Configure hostname resolution

[root@test1 ~]# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2 Test hostname resolution

[root@test1 ~]# ping test1.local
[root@test1 ~]# ping test2.local

4. Configure SSH key trust

[root@test1 ~]# ssh-keygen -t rsa -P ''
[root@test1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.10.56

5. Sync time with ntp

Add a crontab entry that runs ntpdate every five minutes to keep the server clocks in sync:

[root@test1 ~]# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 &> /dev/null

test2 (192.168.10.56) configuration

1. Configure the IP address

[root@test2 ~]# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

[root@test2 ~]# uname -n
[root@test2 ~]# hostname test2.local                # takes effect immediately, lost on reboot
[root@test2 ~]# vim /etc/sysconfig/network          # persistent across reboots

3. Configure hostname resolution

[root@test2 ~]# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2 Test hostname resolution

[root@test2 ~]# ping test1.local
[root@test2 ~]# ping test1

4. Configure SSH key trust

[root@test2 ~]# ssh-keygen -t rsa -P ''
[root@test2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.10.55

5. Sync time with ntp

Add a crontab entry that runs ntpdate every five minutes to keep the server clocks in sync:

[root@test2 ~]# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 &> /dev/null

6. Install and configure heartbeat

Installing heartbeat on CentOS with yum directly fails with "no package available"; heartbeat lives in the EPEL repository, which has to be added first.

Workaround:

[root@test1 src]# wget http://mirrors.sohu.com/fedora-epel/6/i386/epel-release-6-8.noarch.rpm
[root@test1 src]# rpm -ivh epel-release-6-8.noarch.rpm

6.1 Install heartbeat:

[root@test1 src]# yum install heartbeat

6.2 Copy the sample configuration files:

[root@test1 src]# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

6.3 Configure the authentication file:

[root@test1 src]# dd if=/dev/random count=1 bs=512 | md5sum    # generate a random string for the key
[root@test1 src]# vim /etc/ha.d/authkeys
auth 1
1 md5 d0f70c79eeca5293902aiamheartbeat
[root@test1 src]# chmod 600 /etc/ha.d/authkeys

Installing heartbeat on the test2 node is identical to test1 and is omitted here.

6.4 Parameters of the main heartbeat configuration file:

[root@test2 ~]# vim /etc/ha.d/ha.cf
#debugfile /var/log/ha-debug    # debug log
logfile                         # log file location
keepalive 2                     # heartbeat interval; defaults to 2 seconds, ms can be used
deadtime 30                     # how long without a heartbeat before the peer is declared dead
warntime 10                     # how long without a heartbeat before a warning is logged
initdead 120                    # how long the first node to boot waits for the other nodes
baud 19200                      # baud rate of the serial link
auto_failback on                # whether resources move back after the failed node recovers
ping 10.10.10.254               # ping node, used to verify connectivity when the peer is unreachable
ping_group group1 10.10.10.254 10.10.10.253    # ping node group; one reachable host in the group is enough
respawn hacluster /usr/lib/heartbeat/ipfail    # restart this process if it exits
deadping 30                     # how long a ping node may be unreachable before it counts as down
# serial serialportname ...     # which serial device to use
serial /dev/ttyS0               # Linux
serial /dev/cuaa0               # FreeBSD
serial /dev/cuad0               # FreeBSD 6.x
serial /dev/cua/a               # Solaris
# What interfaces to broadcast heartbeats over?
# With Ethernet, choose unicast, multicast or broadcast:
bcast eth0                      # broadcast
mcast eth0 225.0.0.1 694 1 0    # multicast
ucast eth0 192.168.1.2          # unicast; only sensible with exactly two nodes
# Define stonith hosts:
stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
stonith_host ken3  rps10 /dev/ttyS1 kathy 0
stonith_host kathy rps10 /dev/ttyS1 ken3 0
# Tell what machines are in the cluster: one "node <name>" line per node,
# and the name must match uname -n:
node ken3
node kathy

Usually it is enough to define the heartbeat transport and the cluster nodes:

bcast eth0
node test1.local
node test2.local
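Putting the essentials together, a minimal ha.cf for this particular pair of nodes might look like the following sketch. The ping node 192.168.10.1 matches the "Link 192.168.10.1" line in the startup log below; the logfile path is an assumption, since the file above leaves it unset:

logfile /var/log/ha-log    # assumed log location
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
bcast eth0
auto_failback on
ping 192.168.10.1
node test1.local
node test2.local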

6.5 Define resources in the haresources file:

[root@test2 ~]# vim /etc/ha.d/haresources
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
# Each line starts with the hostname of the default primary node (must match
# uname -n), followed by the VIP and the resources: which device to mount,
# where, and with which filesystem type. Resource parameters are separated
# by double colons.
#just.linux-ha.org 135.9.216.110 http
# Same as above; resources are searched for first in /etc/ha.d/resource.d/
# and then, if not found there, in /etc/rc.d/init.d/.
test1.local IPaddr::192.168.10.2/24/eth0 mysqld
test1.local IPaddr::192.168.10.2/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld
# IPaddr is the script used to configure the VIP

6.6 Copy the configuration files from test1.local to test2.local

[root@test1 ~]# scp -p ha.cf haresources authkeys test2.local:/etc/ha.d/

7. Start heartbeat

[root@test1 ~]# service heartbeat start
[root@test1 ~]# ssh test2.local 'service heartbeat start'    # always start the test2 node's heartbeat over SSH from test1

7.1 Check the heartbeat startup log

[root@test1 ~]# tail -f /var/log/messages
Feb 16 15:12:45 test-1 heartbeat: [16056]: info: Configuration validated. Starting heartbeat 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: heartbeat: version 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Heartbeat generation: 1455603909
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: ping heartbeat started.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Local status now set to: 'up'
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link 192.168.10.1:192.168.10.1 up.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Status update for node 192.168.10.1: status ping
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link test1.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Link test2.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Status update for node test2.local: status up
Feb 16 15:12:51 test-1 harc(default)[16068]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:12:52 test-1 heartbeat: [16057]: WARN: 1 lost packet(s) for [test2.local] [3:5]
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: No pkts missing from test2.local!
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Comm_now_up(): updating status to active
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Local status now set to: 'active'
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Status update for node test2.local: status active
Feb 16 15:12:52 test-1 harc(default)[16086]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: local resource transition completed.
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: Initial resource acquisition complete (T_RESOURCES(us))
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: remote resource transition completed.
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16138]: INFO:  Resource is stopped
Feb 16 15:13:02 test-1 heartbeat: [16102]: info: Local Resource acquisition completed.
Feb 16 15:13:02 test-1 harc(default)[16219]: info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
Feb 16 15:13:02 test-1 ip-request-resp(default)[16219]: received ip-request-resp IPaddr::192.168.10.2/24/eth0 OK yes
Feb 16 15:13:02 test-1 ResourceManager(default)[16238]: info: Acquiring resource group: test1.local IPaddr::192.168.10.2/24/eth0 mysqld
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16264]: INFO:  Resource is stopped
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.10.2/24/eth0 start
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Adding inet address 192.168.10.2/24 with broadcast address 192.168.10.255 to device eth0
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Bringing device eth0 up
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.10.2 eth0 192.168.10.2 auto not_used not_used
Feb 16 15:13:03 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16360]: INFO:  Success
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/init.d/mysqld start
Feb 16 15:13:04 test-1 ntpd[1605]: Listen normally on 15 eth0 192.168.10.2 UDP 123

Notes:

1. "Link test1.local:eth0 up", "Link test2.local:eth0 up": the links to both nodes are connected and up.

2. "Link 192.168.10.1:192.168.10.1 up": the ping node is up as well.

3. "info: Running /etc/init.d/mysqld start": mysql was started successfully.

4. "Listen normally on 15 eth0 192.168.10.2 UDP 123": the VIP is up (ntpd began listening on it).

7.2 Check heartbeat's VIP

[root@test1 ha.d]# ip add | grep "10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

[root@test-2 ha.d]# ip add | grep "10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0

Note: the VIP currently lives on test1.local, while test2.local does not have it.

8. Testing

8.1 Connect to mysql under normal conditions

[root@test3 ha.d]# mysql -uroot -h'192.168.10.2' -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0.00 sec)

mysql>

8.2 Stop heartbeat on test1.local

[root@test1 ha.d]# service heartbeat stop
Stopping High-Availability services: Done.
[root@test1 ha.d]# ip add | grep "192.168.10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0

[root@test2 ha.d]# ip add | grep "192.168.10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

Note: the VIP has now floated to test2.local. Connect to mysql again and check server_id:

[root@test2 ha.d]# mysql -uroot -h'192.168.10.2' -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 2     |
+---------------+-------+
1 row in set (0.00 sec)

mysql>

Note: server_id has changed from 1 to 2, which proves we are now talking to the mysql service on test2.local.
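Because ha.cf sets auto_failback on, the VIP should move back to test1.local once its heartbeat is started again. A quick check, with hypothetical output for this lab:

[root@test1 ha.d]# service heartbeat start
[root@test1 ha.d]# ip add | grep "192.168.10.2"
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0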

Testing is complete. Next, configure DRBD so the two mysql servers share the same filesystem, making mysql writes highly available as well.

9. Configure DRBD

DRBD (Distributed Replicated Block Device) is a Linux kernel module. As a disk mirror it is inherently primary/secondary: it never allows both nodes to read and write at the same time; only one node can read and write, and the secondary can neither write nor even mount the device.

DRBD does have a dual-primary concept, and the primary and secondary roles can be switched. DRBD turns a disk or partition on each of two hosts into a mirrored device: when a client program issues a write to the primary node, the data is also replicated bit for bit, underneath, to the backup node over TCP/IP.

This guarantees that whatever is stored on the primary node also exists, bit for bit, on the backup node. Because this mirroring happens between two hosts, DRBD works inside the kernel, unlike RAID1, whose mirror lives within a single host.

Implementing the DRBD dual-primary model: when a node accesses data, it loads the data and metadata into memory, and the locks one node's kernel takes on a file are invisible to the other node's kernel; dual-primary operation only works if each node can propagate its locks to its peer.

For that we need a messaging layer (heartbeat or corosync both work) and pacemaker (with DRBD defined as a resource), and the mirrored devices on the two hosts must be formatted with a cluster filesystem (GFS2/OCFS2).

This is the dual-primary model, built on a Distributed Lock Manager (DLM) combined with a cluster filesystem. A DRBD cluster allows exactly two nodes, either dual-primary or primary/secondary.
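For reference, dual-primary mode is switched on in a resource's startup and net sections. A sketch against the mydrbd resource defined later in this guide; do not enable this without a cluster filesystem, a DLM, and fencing in place:

resource mydrbd {
    startup {
        become-primary-on-both;     # promote both nodes when drbd starts
    }
    net {
        allow-two-primaries yes;    # permit Primary/Primary
    }
    # device/disk/address sections as in 10.2 below
}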

9.1 DRBD's three replication protocols

A. The write is reported complete to the application as soon as it is stored on the local DRBD: the asynchronous model. Efficient and fast, but data can be lost.

B. The write is reported complete once it is stored locally and the data has been sent over TCP/IP to the peer DRBD: the semi-synchronous model. Rarely used.

C. The write is reported complete only after the data is stored locally, sent over TCP/IP, and stored by the peer DRBD: the synchronous model. The slowest and weakest-performing, but the safest and most reliable, and by far the most widely used.
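The protocol is selected once in the configuration; this guide's global_common.conf in section 10.1 below uses the synchronous model:

common {
    protocol C;    # A = async, B = semi-sync, C = sync
}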

9.2 DRBD resources

1. A resource name: any ASCII string that contains no whitespace.

2. A DRBD device: the device file on both nodes, normally /dev/drbdN; the major number is always 147, and the minor number distinguishes the individual devices.

3. A disk: the backing storage each node contributes; it can be a partition, any other type of block device, or an LVM volume.

4. Network settings: the connection properties the two sides use when synchronizing data.
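Once the resource file is written (section 10.2 below), drbdadm can print back how these four pieces were parsed, which makes a quick sanity check of the definition; mydrbd is the resource name used later in this guide:

[root@test1 ~]# drbdadm dump mydrbd    # print the parsed resource definition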

9.3 Install DRBD

DRBD was only merged into the mainline kernel in 2.6.33, so on this CentOS 6 kernel (2.6.32) the module has to be built separately.
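A quick way to confirm that the running kernel predates 2.6.33 and does not yet have the module (illustrative):

[root@test1 ~]# uname -r        # 2.6.32-573.18.1.el6.x86_64 on this lab's nodes
[root@test1 ~]# modinfo drbd    # fails until the module is built and installed below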

9.3.1 Download drbd

[root@test1 ~]# wget -P /usr/local/src http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz

9.3.2 Build and install the drbd userland

[root@test1 ~]# cd /usr/local/src
[root@test1 src]# tar -zxvf drbd-8.4.3.tar.gz
[root@test1 src]# cd /usr/local/src/drbd-8.4.3
[root@test1 drbd-8.4.3]# ./configure --prefix=/usr/local/drbd --with-km
[root@test1 drbd-8.4.3]# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64

[root@test1 drbd-8.4.3]# make install
[root@test1 drbd-8.4.3]# mkdir -p /usr/local/drbd/var/run/drbd
[root@test1 drbd-8.4.3]# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/

9.3.3 Build and install the drbd kernel module

[root@test1 drbd-8.4.3]# cd drbd/
[root@test1 drbd]# make clean
[root@test1 drbd]# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64
[root@test1 drbd]# cp drbd.ko /lib/modules/`uname -r`/kernel/lib/
[root@test1 drbd]# depmod -a      # refresh module dependencies so modprobe can find drbd.ko
[root@test1 drbd]# modprobe drbd
[root@test1 drbd]# lsmod | grep drbd

9.3.4 Create a new partition for drbd

[root@test1 drbd]# fdisk /dev/sdb
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1305, default 1):
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1305, default 1305): +9G
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
[root@test1 drbd]# partprobe /dev/sdb
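To confirm the kernel picked up the new partition after partprobe (illustrative check):

[root@test1 drbd]# grep sdb /proc/partitions    # sdb1 should now be listed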

Installing drbd and partitioning on the test2 node are the same as on test1 and are omitted. Keep test2's drbd configuration files identical to test1's by copying them over with scp.

10. Configure drbd

10.1 Configure drbd's common configuration file

[root@test1 drbd.d]# cd /usr/local/drbd/etc/drbd.d
[root@test1 drbd.d]# vim global_common.conf

global {                 # global settings
    usage-count no;      # whether to report to LINBIT's count of drbd users
    # minor-count dialog-refresh disable-ip-verification
}

common {                 # defaults shared by all resources
    protocol C;          # protocol C, the synchronous model

    handlers {           # handlers for drbd failure events
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when chosing your poison.
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";    # action after split brain
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";                # action after a local I/O error
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }

    startup {            # wait times and timeouts while the nodes sync at startup
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
    }

    options {
        # cpu-mask on-no-data-accessible
    }

    disk {
        on-io-error detach;    # on an I/O error, detach the disk rather than keep replicating
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }

    net {                # network buffer/cache sizes, timeouts, and so on
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
        cram-hmac-alg "sha1";             # algorithm used for peer authentication
        shared-secret "mydrbd1fa2jg8";    # shared secret
    }

    syncer {
        rate 200M;       # resync transfer rate
    }
}

10.2 Configure the resource file; its file name must match the resource name defined inside it

[root@test1 drbd.d]# vim mydrbd.res

resource mydrbd {                    # resource name: any ASCII string without whitespace
    on test1.local {                 # node 1; each node must be able to resolve its peer by name
        device /dev/drbd0;           # name of the drbd device
        disk /dev/sdb1;              # backing partition
        address 192.168.10.55:7789;  # node IP and listening port
        meta-disk internal;          # where drbd keeps its metadata; internal = on the device itself
    }
    on test2.local {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.10.56:7789;
        meta-disk internal;
    }
}

10.3 Both nodes use identical configuration files, so copy them over to the other node:

[root@test1 drbd.d]# scp -r /usr/local/drbd/etc/drbd.* test2.local:/usr/local/drbd/etc/

10.4 Initialize the defined resource on each node

[root@test1 drbd.d]# drbdadm create-md mydrbd
  --==  Thank you for participating in the global usage survey  ==--
The server's response is:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@test1 drbd.d]#

[root@test2 drbd.d]# drbdadm create-md mydrbd
  --==  Thank you for participating in the global usage survey  ==--
The server's response is:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@test2 drbd.d]#

10.5 Start the drbd service on both nodes

[root@test1 drbd.d]# service drbd start
[root@test2 drbd.d]# service drbd start

11. Test drbd synchronization

11.1 Check drbd's startup state

[root@test1 drbd.d]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test1.local, 2016-02-23 10:23:03
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Both nodes start out as Secondary; one of them will be promoted to Primary shortly. "Inconsistent" means the two sides have not synchronized yet.

11.2 Promote one node to Primary, overwriting the drbd partition data on the peer. Run this on the node to be promoted:

[root@test1 drbd.d]# drbdadm -- --overwrite-data-of-peer primary mydrbd

11.3 Watch the sync progress on the primary

[root@test1 drbd.d]# watch -n 1 cat /proc/drbd
Every 1.0s: cat /proc/drbd                                    Tue Feb 23 17:10:55 2016

version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test1.local, 2016-02-23 10:23:03
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
    ns:619656 nr:0 dw:0 dr:627840 al:0 bm:37 lo:1 pe:8 ua:64 ap:0 ep:1 wo:b oos:369144
    [=============>.......] sync'ed: 10.3% (369144/987896)K
    finish: 0:00:12 speed: 25,632 (25,464) K/sec

11.4 Check the secondary's state

[root@test2 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

11.5 On the primary, format the device, mount it, and write test data

[root@test1 drbd]# mke2fs -j /dev/drbd0
[root@test1 drbd]# mkdir /mydrbd
[root@test1 drbd]# mount /dev/drbd0 /mydrbd
[root@test1 drbd]# cd /mydrbd
[root@test1 mydrbd]# touch drbd_test_file
[root@test1 mydrbd]# ls /mydrbd/
drbd_test_file  lost+found

11.6 Demote the primary to secondary, promote the secondary to primary, and verify the data was synchronized

11.6.1 On the primary

11.6.1.1 Unmount the device. Note that you must leave the mounted directory first, otherwise the device shows as busy and cannot be unmounted.

[root@test1 mydrbd]# cd ~
[root@test1 ~]# umount /mydrbd
[root@test1 ~]# drbdadm secondary mydrbd

11.6.1.2 Check the current drbd state

[root@test1 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: both drbd nodes are now Secondary. Next, promote the former secondary to Primary.

11.6.2 On the secondary

11.6.2.1 Promote it

[root@test2 ~]# drbdadm primary mydrbd

11.6.2.2 Mount the drbd device

[root@test2 ~]# mkdir /mydrbd
[root@test2 ~]# mount /dev/drbd0 /mydrbd/

11.6.2.3 Verify the data is there

[root@test2 ~]# ls /mydrbd/
drbd_test_file  lost+found

Note: after the secondary was switched to primary, the data was already synchronized; the drbd setup is complete. Next, combine it with corosync and mysql for dual-primary-style high availability.

12. Combine corosync + drbd + mysql for a highly available database

Configure drbd as a resource in a two-node corosync high-availability cluster so that the Primary/Secondary roles switch automatically. Note: any service that is to be managed as a cluster resource must not be allowed to start automatically at boot.
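The same rule will apply to mysqld once the cluster is in charge of it; assuming mysqld is installed as a SysV service, as earlier in this guide, its autostart would be disabled the same way on both nodes:

[root@test1 ~]# chkconfig mysqld off
[root@test2 ~]# chkconfig mysqld off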

12.1 Disable drbd autostart on both nodes

12.1.1 On the primary

[root@test1 drbd.d]# chkconfig drbd off
[root@test1 drbd.d]# chkconfig --list | grep drbd
drbd            0:off   1:off   2:off   3:off   4:off   5:off   6:off

12.1.2 On the secondary

[root@test2 drbd.d]# chkconfig drbd off
[root@test2 drbd.d]# chkconfig --list | grep drbd
drbd            0:off   1:off   2:off   3:off   4:off   5:off   6:off

12.2 Unmount the drbd filesystem and demote the primary to secondary

12.2.1 On test2. Note: this node was just promoted to primary; demote it now.

[root@test2 drbd]# umount /mydrbd/
[root@test2 drbd]# drbdadm secondary mydrbd
[root@test2 drbd]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@test2.local, 2016-02-22 16:05:34
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:8 nr:9728024 dw:9728032 dr:1073 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: make sure both nodes are Secondary.

12.3 Stop the drbd service on both nodes

12.3.1 On the secondary

[root@test2 drbd]# service drbd stop
Stopping all DRBD resources: .
[root@test2 drbd]#

12.3.2 On the primary

[root@test1 drbd.d]# service drbd stop
Stopping all DRBD resources: .
[root@test1 drbd.d]#

12.4 Install corosync and create the log directory

12.4.1 On the primary

[root@test1 drbd.d]# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
[root@test1 drbd.d]# yum install corosync pacemaker crmsh
[root@test1 drbd.d]# mkdir /var/log/cluster

12.4.2 On the secondary

[root@test2 drbd.d]# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
[root@test2 drbd.d]# mkdir /var/log/cluster
[root@test2 drbd.d]# yum install corosync pacemaker crmsh

12.5 The corosync configuration file

12.5.1 On the primary

[root@test1 drbd.d]# cd /etc/corosync/
[root@test1 corosync]# cp corosync.conf.example corosync.conf

12.6 Edit the configuration file on the primary, generate the corosync key file, and copy both to the secondary

12.6.1 On the primary

[root@test1 corosync]# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
    version: 2

    # secauth: Enable mutual node authentication. If you choose to
    # enable this ("on"), then do remember to create a shared
    # secret with "corosync-keygen".
    secauth: on

    threads: 2

    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0

        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option. However, if you have multiple
        # physical network interfaces configured for the same
        # subnet, then the network address alone is not
        # sufficient to identify the interface Corosync should
        # bind to; in that case, configure the *host* address
        # of the interface instead.
        bindnetaddr: 192.168.10.0

        # When selecting a multicast address, consider RFC
        # 2365 (which, among other things, specifies that
        # 239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        mcastaddr: 239.212.16.19

        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405

        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1            # packets are not allowed to cross a router
    }
}

logging {
    # Log the source file and line where messages are being
    # generated. When in doubt, leave off. Potentially useful for
    # debugging.
    fileline: off

    # Log to standard error. When in doubt, set to no. Useful when
    # running in the foreground (when invoking "corosync -f")
    to_stderr: no

    # Log to a log file. When set to "no", the "logfile" option
    # must not be set.
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log

    # Log to the system log daemon. When in doubt, set to yes.
    to_syslog: no

    # Log debug messages (very verbose). When in doubt, leave off.
    debug: off

    # Log messages with time stamps. When in doubt, set to on
    # (unless you are only logging to syslog, where double
    # timestamps can be annoying).
    timestamp: on

    logger_subsys {
        subsys: AMF
        debug: off
    }
}
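The heading of 12.6 also calls for generating the corosync authentication key and copying it, together with corosync.conf, to the other node. A sketch of that step, assuming the default /etc/corosync paths:

[root@test1 corosync]# corosync-keygen    # gathers entropy from /dev/random and writes /etc/corosync/authkey
[root@test1 corosync]# scp -p authkey corosync.conf test2.local:/etc/corosync/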

