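Overview:
1. Building a high-availability (HA) file server cluster:
This lab deploys a DRBD + Heartbeat + NFS environment to build a highly available (HA) file server cluster. In this design, DRBD keeps the data on the two servers complete and consistent. DRBD behaves like RAID-1 over the network: when data is written to the local file system, it is also pushed to a second host on the network and recorded there in the same form on another file system, so the primary and standby nodes stay synchronized in real time. If the local primary server fails, the standby server still holds an identical copy of the data and can keep serving it. In an HA setup DRBD can therefore replace a shared disk array, because the data exists on both the primary and the standby server. On failover, the remote host only needs to use its own copy to provide the same service as the primary, and clients do not notice that the primary has failed.
2. Simplified topology diagram: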
I. Preparation and basic configuration
1. Basic configuration of node1 and creating a new partition
1.1 Check the hostname and configure the IP address
[root@mail ~]# cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=node1.gjp.com
[root@mail ~]# hostname node1.gjp.com
[root@mail ~]# logout
Log out and log back in for the new hostname to show up in the prompt.
Xshell:\> ssh 192.168.10.2
[root@node1 ~]# setup
[root@node1 ~]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Disabling IPv4 packet forwarding: net.ipv4.ip_forward = 0 [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
[root@node1 ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:0C:29:F9:1C:6F
     inet addr:192.168.10.2 Bcast:192.168.10.255 Mask:255.255.255.0
1.2 Check system information and synchronize the time
[root@node1 ~]# uname -rv
2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009
[root@node1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
[root@node1 ~]# hwclock -s
[root@node1 ~]# date
Thu Oct 18 14:32:02 CST 2012
1.3 Configure /etc/hosts (so that DNS is not needed)
[root@node1 ~]# echo "192.168.10.2 node1.gjp.com node1 " >>/etc/hosts
[root@node1 ~]# echo "192.168.10.6 node2.gjp.com node2 " >>/etc/hosts
[root@node1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.2 node1.gjp.com node1
192.168.10.6 node2.gjp.com node2
1.4 Create a new partition (disk space) to back the DRBD device
[root@node1 ~]# fdisk -l                 // show the current partitions
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
[root@node1 ~]# fdisk /dev/sda           // start creating the partition
The number of cylinders for this disk is set to 2610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n                  // new partition
Command action
   e   extended
   p   primary partition (1-4)
e                                        // extended partition
Selected partition 4
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610):
Using default value 2610
Command (m for help): p                  // print the partition table
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
/dev/sda4         1354  2610   10096852+   5  Extended
Command (m for help): n
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +500m     // logical partition, size 500 MB
Command (m for help): p
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
/dev/sda4         1354  2610   10096852+   5  Extended
/dev/sda5         1354  1415     497983+  83  Linux
Command (m for help): w                  // write the table and exit
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@node1 ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
/dev/sda4         1354  2610   10096852+   5  Extended
/dev/sda5         1354  1415     497983+  83  Linux
Make the kernel pick up the new partition without a reboot:
[root@node1 ~]# partprobe /dev/sda
[root@node1 ~]# cat /proc/partitions
major minor  #blocks  name
   8     0   20971520 sda
   8     1     104391 sda1
   8     2   10241437 sda2
   8     3     522112 sda3
   8     4          0 sda4
   8     5     497983 sda5
2. Basic configuration of node2 and creating a new partition
2.1 Check the hostname and set the IP address
[root@node2 ~]# cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=node2.gjp.com
[root@node2 ~]# hostname node2.gjp.com
[root@node2 ~]# setup
[root@node2 ~]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Disabling IPv4 packet forwarding: net.ipv4.ip_forward = 0 [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
[root@node2 ~]# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:0C:29:F5:92:A1
     inet addr:192.168.10.6 Bcast:192.168.10.255 Mask:255.255.255.0
2.2 Check system information and synchronize the time
[root@node2 ~]# uname -rv
2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009
[root@node2 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
[root@node2 ~]# hwclock -s
[root@node2 ~]# date
Thu Oct 18 14:55:46 CST 2012
2.3 Configure /etc/hosts
[root@node2 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.2 node1.gjp.com node1
192.168.10.6 node2.gjp.com node2
2.4 Create a new partition to back the DRBD device
[root@node2 ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
[root@node2 ~]# fdisk /dev/sda
The number of cylinders for this disk is set to 2610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs (e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
e
Selected partition 4
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610):
Using default value 2610
Command (m for help): n
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +500m
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
[root@node2 ~]# fdisk -l
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot   Start   End     Blocks   Id  System
/dev/sda1   *        1    13     104391   83  Linux
/dev/sda2           14  1288   10241437+  83  Linux
/dev/sda3         1289  1353     522112+  82  Linux swap / Solaris
/dev/sda4         1354  2610   10096852+   5  Extended
/dev/sda5         1354  1415     497983+  83  Linux
[root@node2 ~]# partprobe /dev/sda
[root@node2 ~]# cat /proc/partitions
major minor  #blocks  name
   8     0   20971520 sda
   8     1     104391 sda1
   8     2   10241437 sda2
   8     3     522112 sda3
   8     4          0 sda4
   8     5     497983 sda5
Check that each node can resolve the other's name:
[root@node1 ~]# ping node2.gjp.com
PING node2.gjp.com (192.168.10.6) 56(84) bytes of data.
64 bytes from node2.gjp.com (192.168.10.6): icmp_seq=1 ttl=64 time=4.29 ms
[root@node2 ~]# ping node1.gjp.com
PING node1.gjp.com (192.168.10.2) 56(84) bytes of data.
64 bytes from node1.gjp.com (192.168.10.2): icmp_seq=1 ttl=64 time=5.81 ms
64 bytes from node1.gjp.com (192.168.10.2): icmp_seq=2 ttl=64 time=0.491 ms
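With name resolution working in both directions, note that the echo ... >> /etc/hosts commands from step 1.3 append a duplicate line every time they are rerun. A minimal sketch of an idempotent variant follows; add_host_entry is a hypothetical helper name used only for illustration, not part of any package.

```shell
#!/bin/sh
# add_host_entry FILE IP FQDN ALIAS
# Hypothetical helper: append "IP FQDN ALIAS" to FILE only when FQDN is
# not already listed there, so rerunning the setup does not add duplicates.
add_host_entry() {
    hosts_file=$1
    ip=$2
    fqdn=$3
    short=$4
    # -F: fixed string, -w: whole-word match, -q: no output
    if ! grep -qwF "$fqdn" "$hosts_file" 2>/dev/null; then
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
    fi
}
```

On each node this could be called as add_host_entry /etc/hosts 192.168.10.2 node1.gjp.com node1, and likewise for node2.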
3. Configure SSH keys on node1 and node2
This lets each node run commands on the other directly, without a password prompt.
3.1 Configure the SSH key on node1
[root@node1 ~]# ssh-
ssh-add ssh-agent ssh-copy-id ssh-keygen ssh-keyscan
[root@node1 ~]# ssh-keygen -t rsa        // generate the key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
21:8b:79:61:6f:7e:4d:b6:61:18:11:cb:47:83:d7:98
Copy the generated public key to the peer so that later file transfers need no password:
[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node2.gjp.com
The authenticity of host 'node2.gjp.com (192.168.10.6)' can't be established.
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2.gjp.com,192.168.10.6' (RSA) to the list of known hosts.
root@node2.gjp.com's password:
Now try logging into the machine, with "ssh 'root@node2.gjp.com'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
3.2 Configure the SSH key on node2:
[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node1.gjp.com
3.3 Test:
[root@node2 ~]# ssh node1.gjp.com 'ifconfig'
The command runs on the peer without asking for a password:
eth0 Link encap:Ethernet HWaddr 00:0C:29:F9:1C:6F
     inet addr:192.168.10.2 Bcast:192.168.10.255 Mask:255.255.255.0
     inet6 addr: fe80::20c:29ff:fef9:1c6f/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
     RX packets:1968 errors:0 dropped:0 overruns:0 frame:0
     TX packets:1450 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:1000
     RX bytes:182130 (177.8 KiB) TX bytes:198413 (193.7 KiB)
     Interrupt:67 Base address:0x2000
II. DRBD installation and configuration
Upload the required packages to node1 and node2.
First upload them to /root on node2:
[root@node2 ~]# ll drbd* kmod*
-rw-r--r-- 1 root root 221868 Oct 18 15:29 drbd83-8.3.8-1.el5.centos.i386.rpm
-rw-r--r-- 1 root root 125974 Oct 18 15:29 kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
1. Install the DRBD packages
The uploaded packages have no further dependencies, so they can be installed directly with rpm.
[root@node2 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm
warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing... ########################################### [100%]
1:drbd83 ########################################### [100%]
[root@node2 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing... ########################################### [100%]
1:kmod-drbd83 ########################################### [100%]
Copy the packages to node1:
[root@node2 ~]# scp drbd* kmod* node1.gjp.com:/root
drbd83-8.3.8-1.el5.centos.i386.rpm 100% 217KB 216.7KB/s 00:00
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm 100% 123KB 123.0KB/s 00:00
Install them on node1:
[root@node1 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm
warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing... ########################################### [100%]
1:drbd83 ########################################### [100%]
[root@node1 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing... ########################################### [100%]
1:kmod-drbd83 ########################################### [100%]
2. Load the DRBD kernel module
[root@node1 ~]# modprobe drbd
[root@node1 ~]# lsmod |grep drbd
drbd 228528 0
[root@node2 ~]# modprobe drbd
[root@node2 ~]# lsmod |grep drbd
drbd 228528 0
3. Edit the configuration files
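The drbd.conf file: at run time DRBD reads the configuration file /etc/drbd.conf, which describes the mapping between DRBD devices and disk partitions.
3.1 Configuration on node1
[root@node1 ~]# vim /etc/drbd.conf
In vim's command-line mode, run:
:r /usr/share/doc/drbd83-8.3.8/drbd.conf
This reads in the sample file.
The paths after include are relative to /etc/, the directory that holds drbd.conf itself:
[root@node1 ~]# cd /etc/drbd.d/
[root@node1 drbd.d]# ll
total 4
-rwxr-xr-x 1 root root 1418 Jun 4 2010 global_common.conf
[root@node1 drbd.d]# cp global_common.conf global_common.conf.bak
Back the file up before editing, so that a bad edit can be undone.
Edit the global configuration file:
[root@node1 drbd.d]# vim global_common.conf
In command-line mode, type :1,$d to clear the original contents, then enter the following: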
global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}
common {
        protocol C;

        startup {
                wfc-timeout 120;
                degr-wfc-timeout 120;
        }
        disk {
                on-io-error detach;
                fencing resource-only;
        }
        net {
                cram-hmac-alg "sha1";
                shared-secret "mydrbdlab";      # shared key used for peer authentication
        }
        syncer {
                rate 100M;                      # resynchronization rate
        }
}
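To get a feel for what rate 100M means for this lab, the arithmetic below estimates how long a full resynchronization of the 497983-block (KiB) partition would take. It is only a rough sketch: it assumes the suffix M denotes MiB per second (my reading of the DRBD 8.3 syncer option) and that the configured rate is actually sustained, which a real network or disk may not allow. sync_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# sync_seconds BLOCKS_KIB RATE_MIB
# Hypothetical helper: estimate full-resync time in whole seconds
# (rounded up) for a device of BLOCKS_KIB KiB at RATE_MIB MiB/s.
sync_seconds() {
    blocks_kib=$1
    rate_mib=$2
    rate_kib=$((rate_mib * 1024))
    # integer ceiling division
    echo $(( (blocks_kib + rate_kib - 1) / rate_kib ))
}
```

For the partition in this lab, sync_seconds 497983 100 gives an estimate of about 5 seconds under those assumptions.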
Create the resource configuration file:
[root@node1 drbd.d]# vim /etc/drbd.d/nfs.res          // a new file
[root@node1 drbd.d]# cat /etc/drbd.d/nfs.res
resource nfs {
        on node1.gjp.com {
                device /dev/drbd0;
                disk /dev/sda5;
                address 192.168.10.2:7789;
                meta-disk internal;
        }
        on node2.gjp.com {
                device /dev/drbd0;
                disk /dev/sda5;
                address 192.168.10.6:7789;
                meta-disk internal;
        }
}
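Since the two on sections of the resource file differ only in host name and address, the file can also be generated rather than typed. The following is a minimal sketch under that assumption; make_drbd_res is a hypothetical helper, and it only prints text, so the result should still be reviewed before it is written to /etc/drbd.d/.

```shell
#!/bin/sh
# make_drbd_res NAME DEVICE DISK PORT HOST1 IP1 [HOST2 IP2 ...]
# Hypothetical helper: print a DRBD resource definition in which every
# host uses the same device, backing disk, port and internal meta-data.
make_drbd_res() {
    name=$1
    device=$2
    disk=$3
    port=$4
    shift 4
    printf 'resource %s {\n' "$name"
    # remaining arguments are HOST IP pairs
    while [ $# -ge 2 ]; do
        printf '        on %s {\n' "$1"
        printf '                device %s;\n' "$device"
        printf '                disk %s;\n' "$disk"
        printf '                address %s:%s;\n' "$2" "$port"
        printf '                meta-disk internal;\n'
        printf '        }\n'
        shift 2
    done
    printf '}\n'
}
```

Calling make_drbd_res nfs /dev/drbd0 /dev/sda5 7789 node1.gjp.com 192.168.10.2 node2.gjp.com 192.168.10.6 prints a definition equivalent to the nfs.res shown above.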
3.2 Copy the configuration to node2:
[root@node1 drbd.d]# pwd
/etc/drbd.d
[root@node1 drbd.d]# scp /etc/drbc.conf node2:/etc/     // file name mistyped (drbc); aborted at the host-key prompt
The authenticity of host 'node2 (192.168.10.6)' can't be established.
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e.
Are you sure you want to continue connecting (yes/no)? no
Host key verification failed.
lost connection
[root@node1 drbd.d]# scp /etc/drbd.conf node2:/etc/     // the host key for the short name node2 is confirmed only on first use
The authenticity of host 'node2 (192.168.10.6)' can't be established.
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2' (RSA) to the list of known hosts.
drbd.conf 100% 233 0.2KB/s 00:00
[root@node1 drbd.d]# scp ./* node2:/etc/drbd.d/
global_common.conf 100% 538 0.5KB/s 00:00
global_common.conf.bak 100% 1418 1.4KB/s 00:00
nfs.res 100% 344 0.3KB/s
3.3 Check the configuration
[root@node1 drbd.d]# drbdadm adjust nfs
0: Failure: (119) No valid meta-data signature found.
==> Use 'drbdadm create-md res' to initialize meta-data area. <==
Command 'drbdsetup 0 disk /dev/sda5 /dev/sda5 internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10
[root@node2 etc]# drbdadm adjust nfs
0: Failure: (119) No valid meta-data signature found.
==> Use 'drbdadm create-md res' to initialize meta-data area. <==
Command 'drbdsetup 0 disk /dev/sda5 /dev/sda5 internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10
The failure is expected at this point because the meta-data area has not been created yet. Once it exists (steps 3.4 and 3.5), the same command only prints a range warning:
[root@node2 etc]# drbdadm adjust nfs
drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.
3.4 Create the nfs resource
3.4.1 Create the nfs resource on node1
[root@node1 drbd.d]# drbdadm create-md nfs
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@node1 drbd.d]# ll /dev/drbd
drbd/ drbd1 drbd11 drbd13 drbd15 drbd3 drbd5 drbd7 drbd9
drbd0 drbd10 drbd12 drbd14 drbd2 drbd4 drbd6 drbd8
[root@node1 drbd.d]# ll /dev/drbd0
brw-r----- 1 root disk 147, 0 Oct 18 15:37 /dev/drbd0
3.5 Create the nfs resource on node2
Run the command for node2 from node1 over SSH, creating the nfs resource on node2 remotely:
[root@node1 drbd.d]# ssh node2 'drbdadm create-md nfs'
NOT initialized bitmap
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
[root@node1 drbd.d]# ssh node2 'ls -l /dev/drbd0'
brw-r----- 1 root disk 147, 0 Oct 18 15:33 /dev/drbd0
3.6 Start the DRBD service and check its status
[root@node1 drbd.d]# service drbd start
Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.
[root@node1 drbd.d]# ssh node2 'service drbd start'
Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.
Check the service status.
Note: in this setup the usage-count no; line in the global configuration file had to be identical on both machines; otherwise the primary/secondary roles were not displayed.
[root@node1 drbd.d]# service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C
In the ro column, the value on the left is the role of the local node and the value on the right is the role of the peer.
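Because the local and peer values are always written as local/peer, a status line like the one above is easy to pick apart in a script. The sketch below assumes the field order shown in this transcript (resource, connection state, roles, disk states); drbd_state is a hypothetical helper that reads one such line on standard input.

```shell
#!/bin/sh
# drbd_state FIELD
# Hypothetical helper: read one status line such as
#   0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C
# on stdin and print the requested part:
#   cs | local_role | peer_role | local_disk | peer_disk
drbd_state() {
    awk -v want="$1" '{
        split($3, role, "/")   # roles:       local/peer
        split($4, disk, "/")   # disk states: local/peer
        if (want == "cs")              print $2
        else if (want == "local_role") print role[1]
        else if (want == "peer_role")  print role[2]
        else if (want == "local_disk") print disk[1]
        else if (want == "peer_disk")  print disk[2]
    }'
}
```

For example, piping a single resource line from drbd-overview into drbd_state local_role would print Primary or Secondary for the node the command runs on.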
[root@node1 drbd.d]# ssh node2 'service drbd status'
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C
The following command also shows the roles:
[root@node1 drbd.d]# drbd-overview
0:nfs Connected Secondary/Secondary Inconsistent/Inconsistent C r----
Enable the service at boot:
[root@node1 drbd.d]# chkconfig drbd on
[root@node1 drbd.d]# chkconfig --list drbd
drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node1 drbd.d]# ssh node2 'chkconfig drbd on'
[root@node1 drbd.d]# ssh node2 'chkconfig --list drbd'
drbd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
3.7 Make node1 the primary node, create the file system, and check the mount
[root@node1 ~]# mkdir /mnt/nfs
[root@node1 ~]# ssh node2 'mkdir /mnt/nfs'
[root@node1 ~]# drbdsetup /dev/drbd0 primary -o        // make this node primary, overwriting the peer's data
[root@node1 ~]# mkfs.ext3 /dev/drbd0
[root@node1 ~]# mount /dev/drbd0 /mnt/nfs/
[root@node1 ~]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/drbd0 on /mnt/nfs type ext3 (rw)
[root@node1 ~]# ll /mnt/nfs/
total 12
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
Check the state of both nodes again:
[root@node1 ~]# drbd-overview
0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
[root@node1 ~]# ssh node2 'drbd-overview'
0:nfs Connected Secondary/Primary UpToDate/UpToDate C r----
Test: are files actually replicated?
[root@node1 ~]# cd /mnt/nfs
[root@node1 nfs]# echo "my name is guo jiping">gjp
Now check whether the file exists on node2.
The DRBD device can only be mounted on the primary node, so node1 must first be demoted to secondary and node2 promoted to primary. Leave /mnt/nfs and unmount it on node1 before demoting, because a mounted device cannot be switched to secondary.
[root@node1 ~]# drbdadm secondary nfs
[root@node1 ~]# drbd-overview
0:nfs Connected Secondary/Secondary UpToDate/UpToDate C r----
[root@node2 ~]# drbdadm primary nfs
[root@node2 ~]# drbd-overview
0:nfs Connected Primary/Secondary UpToDate/UpToDate C r----
Only now can node2 mount the device and see the data:
[root@node2 ~]# mount /dev/drbd0 /mnt/nfs
[root@node2 ~]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/drbd0 on /mnt/nfs type ext3 (rw)
[root@node2 ~]# cd /mnt/nfs
[root@node2 nfs]# ll
total 13
-rw-r--r-- 1 root root 22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root 0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
III. NFS configuration
On both servers, edit the NFS exports file and the NFS init script.
1. Detailed configuration on node1.gjp.com:
[root@node1 ~]# vim /etc/exports
[root@node1 ~]# cat /etc/exports
/mnt/nfs *(rw,sync,insecure,no_root_squash,no_wdelay)
[root@node1 ~]# chkconfig portmap on
[root@node1 ~]# chkconfig --list portmap
portmap 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node1 ~]# service portmap start
Starting portmap: [ OK ]
[root@node1 ~]# chkconfig nfs on
[root@node1 ~]# chkconfig --list nfs
nfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node1 ~]# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]
[root@node1 ~]# vim /etc/init.d/nfs
2. Detailed configuration on node2.gjp.com:
[root@node2 nfs]# vim /etc/exports
[root@node2 nfs]# cat /etc/exports
/mnt/nfs *(rw,sync,insecure,no_root_squash,no_wdelay)
[root@node2 nfs]# service portmap start
Starting portmap: [ OK ]
[root@node2 nfs]# chkconfig portmap on
[root@node2 nfs]# chkconfig --list portmap
portmap 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node2 nfs]# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS daemon: [ OK ]
Starting NFS mountd: [ OK ]
[root@node2 nfs]# chkconfig nfs on
[root@node2 nfs]# chkconfig --list nfs
nfs 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node2 nfs]# vim /etc/init.d/nfs
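IV. Heartbeat configuration
1. Upload and install the required packages:
[root@node1 ~]# mount /dev/cdrom /mnt/cdrom
mount: block device /dev/cdrom is write-protected, mounting read-only
The packages on the installation disc are used to resolve dependencies; without them the installation fails.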
[root@node1 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck
[root@node2 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck
2. Copy and edit the configuration files
2.1 Copy and edit on node1
[root@node1 doc]# cd /usr/share/doc/heartbeat-2.1.4/
[root@node1 heartbeat-2.1.4]# ls
apphbd.cf     DirectoryMap.txt     HardwareGuide.html  heartbeat_api.txt  rsync.txt
authkeys      faqntips.html        HardwareGuide.txt   logd.cf            startstop
AUTHORS       faqntips.txt         haresources         README
ChangeLog     GettingStarted.html  hb_report.html      Requirements.html
COPYING       GettingStarted.txt   hb_report.txt       Requirements.txt
COPYING.LGPL  ha.cf                heartbeat_api.html  rsync.html
[root@node1 heartbeat-2.1.4]# cp authkeys ha.cf haresources /etc/ha.d/
[root@node1 heartbeat-2.1.4]# cd /etc/ha.d/
[root@node1 ha.d]# ls
authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs
[root@node1 ha.d]# vim ha.cf
Enable or add the following directives:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 10
udpport 694
ucast eth0 192.168.10.6          # unicast; the IP address of the peer server
auto_failback on
node node1.gjp.com               # added
node node2.gjp.com               # added
ping 10.10.10.3
[root@node1 ha.d]# vim haresources
Add the following as a single line:
node1.gjp.com IPaddr::192.168.10.16/24/eth0 drbddisk::nfs Filesystem::/dev/drbd0::/mnt/nfs::ext3 killnfsd
[root@node1 ha.d]# vim authkeys
Append at the end:
auth 1
1 crc
[root@node1 ha.d]# echo "killall -9 nfsd; /etc/init.d/nfs restart;exit 0">>resource.d/killnfsd
[root@node1 ha.d]# chmod 600 /etc/ha.d/authkeys
[root@node1 ha.d]# chmod 755 /etc/ha.d/resource.d/killnfsd
2.2 Copy and edit on node2
Copy the configuration just created to node2.gjp.com:
[root@node1 ha.d]# ls
authkeys ha.cf harc haresources rc.d README.config resource.d shellfuncs
[root@node1 ha.d]# scp ha.cf authkeys haresources node2:/etc/ha.d/
ha.cf 100% 10KB 10.3KB/s 00:00
authkeys 100% 659 0.6KB/s 00:00
haresources 100% 6013 5.9KB/s 00:00
[root@node1 ha.d]# scp resource.d/killnfsd node2:/etc/ha.d/resource.d/
killnfsd 100% 48 0.1KB/s 00:00
[root@node1 ha.d]# pwd
/etc/ha.d
[root@node2 ~]# cd /etc/ha.d/
[root@node2 ha.d]# ls
authkeys ha.cf haresources resource.d
[root@node2 ha.d]# vim ha.cf
On node2, point the unicast directive at node1 instead:
ucast eth0 192.168.10.2
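The only per-node difference in ha.cf is that ucast directive, which must name the other node's address. A small sketch of deriving it from the node list, so the same procedure can be run on either machine; ucast_lines is a hypothetical helper and the interface name is passed in rather than detected.

```shell
#!/bin/sh
# ucast_lines IFACE LOCAL_IP IP [IP ...]
# Hypothetical helper: print one "ucast IFACE IP" directive for every
# listed address except the local one.
ucast_lines() {
    iface=$1
    local_ip=$2
    shift 2
    for ip in "$@"; do
        if [ "$ip" != "$local_ip" ]; then
            printf 'ucast %s %s\n' "$iface" "$ip"
        fi
    done
}
```

On node2, ucast_lines eth0 192.168.10.6 192.168.10.2 192.168.10.6 prints the line used above; on node1 the same list with 192.168.10.2 as the local address yields the directive pointing at 192.168.10.6.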
3. Start the service
[root@node1 ha.d]# chkconfig heartbeat on
[root@node1 ha.d]# service heartbeat start
Starting High-Availability services:
2012/10/18_18:36:00 INFO: Resource is stopped [ OK ]
[root@node1 ha.d]# drbd-overview
0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
[root@node2 ~]# service heartbeat start
Starting High-Availability services:
2012/10/18_18:40:36 INFO: Resource is stopped [ OK ]
[root@node2 ~]# chkconfig heartbeat on
[root@node2 ha.d]# drbd-overview
0:nfs Connected Secondary/Primary UpToDate/UpToDate C r----
[root@node1 ha.d]# ifconfig eth0:0
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:F9:1C:6F
       inet addr:192.168.10.16 Bcast:192.168.10.255 Mask:255.255.255.0
       UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
       Interrupt:67 Base address:0x2000
V. Testing
1. On the test machine, mount 192.168.10.16:/mnt/nfs on the local directory /info
Set the test machine's IP address:
[root@gjp99 ~]# setup
[root@gjp99 ~]# mkdir /info
[root@gjp99 info]# mount 192.168.10.16:/mnt/nfs/ /info/
[root@gjp99 info]# cd /info
[root@gjp99 info]# ll
total 13
-rw-r--r-- 1 root root 22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root 0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
[root@gjp99 info]# touch test
[root@gjp99 info]# ll
total 13
-rw-r--r-- 1 root root 22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root 0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
-rw-r--r-- 1 root root 0 Oct 18 19:16 test
[root@gjp99 info]# echo "guo jiping">test
2. On the test machine, create a shell script that probes the mount once per second
[root@gjp99 ~]# vim tesnfs.sh
[root@gjp99 ~]# chmod +x tesnfs.sh       // make it executable
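The body of tesnfs.sh is not shown in this write-up. Judging only from the output it prints later ("---> try touch x:" and "<---done touch x:" with a timestamp), a probe along the following lines would behave similarly. This is a reconstruction under that assumption, not the author's script; probe_loop and its arguments are hypothetical names.

```shell
#!/bin/sh
# probe_loop DIR [COUNT]
# Hypothetical reconstruction of a probe like tesnfs.sh: once per second,
# touch the file x inside DIR and log the time before and after.
# COUNT limits the number of rounds; 0 (the default) loops forever.
probe_loop() {
    dir=$1
    count=${2:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        echo "---> try touch x:$(date)"
        touch "$dir/x"                 # may stall while the NFS server is unreachable
        echo "<---done touch x:$(date)"
        i=$((i + 1))
        sleep 1
    done
}
```

Run as probe_loop /info on the test machine. A visible delay between a "try" line and its "done" line would indicate how long the export was unavailable during a switchover.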
3. Stop the heartbeat service on the primary node node1; the standby node node2 takes over
[root@node1 ~]# service heartbeat stop
Stopping High-Availability services: [ OK ]
[root@node1 ~]# drbd-overview
0:nfs Connected Secondary/Primary UpToDate/UpToDate C r----
[root@node1 ~]# ifconfig eth0:0
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:F9:1C:6F
       UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
       Interrupt:67 Base address:0x2000
[root@node2 ha.d]# drbd-overview
0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
[root@node2 ha.d]# ifconfig eth0:0
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:F5:92:A1
       inet addr:192.168.10.16 Bcast:192.168.10.255 Mask:255.255.255.0
       UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
       Interrupt:67 Base address:0x2000
[root@node2 ha.d]# service heartbeat status
heartbeat OK [pid 7543 et al] is running on node2.gjp.com [node2.gjp.com]...
4. Run the tesnfs.sh probe on the client; it keeps printing output like the following:
[root@gjp99 ~]# ./tesnfs.sh
---> try touch x:Thu Oct 18 19:43:40 CST 2012
<---done touch x:Thu Oct 18 19:43:41 CST 2012
---> try touch x:Thu Oct 18 19:43:42 CST 2012
<---done touch x:Thu Oct 18 19:43:42 CST 2012
---> try touch x:Thu Oct 18 19:43:43 CST 2012
<---done touch x:Thu Oct 18 19:43:43 CST 2012
5. The client's mount is still in place and the disk remains usable
[root@gjp99 ~]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
192.168.10.16:/mnt/nfs/ on /info type nfs (rw,addr=192.168.10.16)
192.168.10.16:/mnt/nfs/ on /info type nfs (rw,addr=192.168.10.16)
[root@gjp99 ~]# ll /info/
total 14
-rw-r--r-- 1 root root 22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root 0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
-rw-r--r-- 1 root root 11 Oct 18 19:17 test
node2 has taken over the service successfully, so the lab achieves its goal. You can also create files by hand in the NFS-mounted directory and switch the DRBD service back and forth between node1 and node2 as a further test.
6. Restore node1 as the primary node
[root@node1 ~]# service heartbeat start
Starting High-Availability services:
2012/10/18_19:51:01 INFO: Resource is stopped [ OK ]
[root@node1 ~]# drbd-overview
0:nfs Connected Secondary/Primary UpToDate/UpToDate C r----
[root@node1 ~]# drbd-overview
0:nfs Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
Note that the switch back is not instantaneous: the first check still shows node1 as secondary, and only a later one shows it as primary.
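To put a number on that switchover delay, one option is to have the client probe log epoch seconds (for example with date +%s, which is an assumed change to the probe, since the transcript above logs human-readable dates) and then look for the largest gap between consecutive successful writes. A minimal sketch; max_gap is a hypothetical helper.

```shell
#!/bin/sh
# max_gap
# Hypothetical helper: read epoch-second timestamps, one per line, on
# stdin and print the largest difference between consecutive lines.
# Prints 0 when fewer than two timestamps are given.
max_gap() {
    awk 'NR > 1 { d = $1 - prev; if (d > max) max = d }
         { prev = $1 }
         END { print max + 0 }'
}
```

With a probe that normally writes once per second, a result well above 1 approximates the number of seconds during which the NFS export did not respond.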
The End