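Overview:

1. Building a high-availability (HA) cluster for a file server:

This lab deploys DRBD + Heartbeat + NFS to build a high-availability (HA) file server cluster. In this design, DRBD keeps the data on the two servers complete and consistent. DRBD works much like RAID-1 over the network: when data is written to the local file system, it is also pushed to another host on the network and recorded there in the same form on a second file system, so the primary and backup nodes stay synchronized in real time. If the primary server fails, the backup server still holds an identical copy of the data and can carry on. In an HA setup DRBD can therefore replace a shared disk array, because the data exists on both the primary and the backup server. On failover, the remote host simply uses its own copy of the data to provide the same service as the primary, and client users do not notice that the primary has failed.

2. Simplified topology diagram: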

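I. Preparation and basic configuration

1. Basic configuration of node1 and creating a new partition

1.1 Check the hostname and configure the IP address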

[root@mail ~]# cat /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=node1.gjp.com
[root@mail ~]# hostname node1.gjp.com

[root@mail ~]# logout   // log out and log back in so that the prompt picks up the new hostname

Xshell:\> ssh  192.168.10.2

[root@node1 ~]# setup

[root@node1 ~]# service network restart

Shutting down interface eth0:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Disabling IPv4 packet forwarding:  net.ipv4.ip_forward = 0
                                                           [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:                                [  OK  ]

[root@node1 ~]# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:0C:29:F9:1C:6F  
   inet addr:192.168.10.2  Bcast:192.168.10.255  Mask:255.255.255.0

1.2 Check system information and synchronize the time

[root@node1 ~]# uname -rv

2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009
[root@node1 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)

[root@node1 ~]# hwclock -s

[root@node1 ~]# date
Thu Oct 18 14:32:02 CST 2012

1.3 Configure /etc/hosts (so that DNS is not needed)

[root@node1 ~]# echo "192.168.10.2 node1.gjp.com node1 " >>/etc/hosts

[root@node1 ~]# echo "192.168.10.6 node2.gjp.com node2 " >>/etc/hosts
[root@node1 ~]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain  localhost
::1        localhost6.localdomain6 localhost6
192.168.10.2 node1.gjp.com node1
192.168.10.6 node2.gjp.com node2 
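
Running the two echo commands above a second time would append duplicate lines to /etc/hosts. The following is a minimal sketch of an idempotent alternative; the function name add_host_entry is hypothetical and not part of the original procedure.

```shell
#!/bin/sh
# add_host_entry FILE IP FQDN SHORTNAME   (hypothetical helper)
# Appends "IP FQDN SHORTNAME" to FILE unless a line whose first field is IP
# already exists. The parenthesized body runs in a subshell, so the
# variables stay local to the function.
add_host_entry() (
    file=$1; ip=$2; fqdn=$3; short=$4
    if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$file" 2>/dev/null; then
        :   # an entry for this IP is already present; leave the file alone
    else
        printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$file"
    fi
)

# Example (as root, on both nodes):
#   add_host_entry /etc/hosts 192.168.10.2 node1.gjp.com node1
#   add_host_entry /etc/hosts 192.168.10.6 node2.gjp.com node2
```

Comparing the first field exactly avoids the false matches that a plain grep on the IP string could produce (for example 192.168.10.2 matching 192.168.10.20).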

1.4 Create a new partition to serve as the DRBD backing device

[root@node1 ~]# fdisk -l     // list the current partitions

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
[root@node1 ~]# fdisk /dev/sda    // start creating the new partition

The number of cylinders for this disk is set to 2610.

There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n   // create a new partition

Command action
   e   extended
   p   primary partition (1-4)
e  // choose an extended partition
Selected partition 4
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610):
Using default value 2610

Command (m for help): p   // print the partition table

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
/dev/sda4            1354        2610    10096852+   5  Extended

Command (m for help): n   

First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +500m 

// this creates a logical partition of 500 MB inside the extended partition

Command (m for help): p

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
/dev/sda4            1354        2610    10096852+   5  Extended
/dev/sda5            1354        1415      497983+  83  Linux

Command (m for help): w   // write the table and exit

The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.

The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

[root@node1 ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
/dev/sda4            1354        2610    10096852+   5  Extended
/dev/sda5            1354        1415      497983+  83  Linux

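Because the kernel could not re-read the partition table (see the warning above), make the new partition take effect without a reboot: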

[root@node1 ~]# partprobe /dev/sda
[root@node1 ~]# cat /proc/partitions  
major minor  #blocks  name

   8     0   20971520 sda

   8     1     104391 sda1
   8     2   10241437 sda2
   8     3     522112 sda3 
   8     4          0 sda4
   8     5     497983 sda5
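
Before going further it is worth confirming that the kernel really lists the new partition. The following is a small sketch of such a check; the function name partition_blocks is hypothetical, and the optional FILE argument exists only so the check can be tried against a saved copy of the listing.

```shell
#!/bin/sh
# partition_blocks NAME [FILE]   (hypothetical helper)
# Prints the #blocks column for partition NAME as listed in FILE
# (default /proc/partitions). Returns non-zero if NAME is not listed.
partition_blocks() (
    name=$1; file=${2:-/proc/partitions}
    awk -v n="$name" '$4 == n { print $3; found = 1 } END { exit !found }' "$file"
)

# Example on either node:
#   partition_blocks sda5 || echo "sda5 is not visible yet; run partprobe or reboot"
```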

2. Basic configuration of node2 and creating a new partition

2.1 Check the hostname and change the IP address

[root@node2 ~]# cat /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=node2.gjp.com
[root@node2 ~]# hostname
node2.gjp.com
[root@node2 ~]# setup

[root@node2 ~]# service network restart
Shutting down interface eth0:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Disabling IPv4 packet forwarding:  net.ipv4.ip_forward = 0
                                                           [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:                                [  OK  ]

[root@node2 ~]# ifconfig eth0

eth0      Link encap:Ethernet  HWaddr 00:0C:29:F5:92:A1 
          inet addr:192.168.10.6  Bcast:192.168.10.255  Mask:255.255.255.0

2.2 Check system information and synchronize the time

[root@node2 ~]# uname -rv

2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:54 EDT 2009  
[root@node2 ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.4 (Tikanga)
[root@node2 ~]# hwclock -s
[root@node2 ~]# date
Thu Oct 18 14:55:46 CST 2012

2.3 Configure /etc/hosts

[root@node2 ~]# cat /etc/hosts

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1   localhost.localdomain  localhost
::1        localhost6.localdomain6 localhost6
192.168.10.2    node1.gjp.com    node1
192.168.10.6    node2.gjp.com    node2

2.4 Create a new partition to serve as the DRBD backing device

[root@node2 ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
[root@node2 ~]# fdisk /dev/sda

The number of cylinders for this disk is set to 2610.

There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): n

Command action
   e   extended
   p   primary partition (1-4)
e
Selected partition 4
First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610):
Using default value 2610

Command (m for help): n

First cylinder (1354-2610, default 1354):
Using default value 1354
Last cylinder or +size or +sizeM or +sizeK (1354-2610, default 2610): +500m

Command (m for help): w

The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.

The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

[root@node2 ~]# fdisk -l

Disk /dev/sda: 21.4 GB, 21474836480 bytes

255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System

/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14        1288    10241437+  83  Linux
/dev/sda3            1289        1353      522112+  82  Linux swap / Solaris
/dev/sda4            1354        2610    10096852+   5  Extended
/dev/sda5            1354        1415      497983+  83  Linux
[root@node2 ~]# partprobe /dev/sda
[root@node2 ~]# cat /proc/partitions
major minor  #blocks  name

   8     0   20971520 sda

   8     1     104391 sda1
   8     2   10241437 sda2
   8     3     522112 sda3
   8     4          0 sda4
   8     5     497983 sda5

Verify that each node can resolve the other's name:

[root@node1 ~]# ping node2.gjp.com

PING node2.gjp.com (192.168.10.6) 56(84) bytes of data.
64 bytes from node2.gjp.com (192.168.10.6): icmp_seq=1 ttl=64 time=4.29 ms

[root@node2 ~]# ping node1.gjp.com

PING node1.gjp.com (192.168.10.2) 56(84) bytes of data.
64 bytes from node1.gjp.com (192.168.10.2): icmp_seq=1 ttl=64 time=5.81 ms
64 bytes from node1.gjp.com (192.168.10.2): icmp_seq=2 ttl=64 time=0.491 ms

 

 

3. Set up SSH keys on node1 and node2

This lets one node run commands on the other directly, without password prompts getting in the way.

3.1 Set up the SSH key on node1

[root@node1 ~]# ssh-keygen -t rsa   // generate the key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
21:8b:79:61:6f:7e:4d:b6:61:18:11:cb:47:83:d7:98

Copy the public key to the peer so that later file transfers need no password:

[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node2.gjp.com
15
The authenticity of host 'node2.gjp.com (192.168.10.6)' can't be established.
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2.gjp.com,192.168.10.6' (RSA) to the list of known hosts.
root@node2.gjp.com's password:
Now try logging into the machine, with "ssh 'root@node2.gjp.com'", and check in:

  .ssh/authorized_keys

to make sure we haven't added extra keys that you weren't expecting.

3.2 Set up the SSH key on node2:

[root@node2 ~]# ssh-keygen -t rsa

[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node1.gjp.com

3.3 Test:

[root@node2 ~]# ssh node1.gjp.com 'ifconfig'

The command runs on the peer machine without asking for a password:

eth0      Link encap:Ethernet  HWaddr 00:0C:29:F9:1C:6F 
          inet addr:192.168.10.2  Bcast:192.168.10.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fef9:1c6f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1968 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1450 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:182130 (177.8 KiB)  TX bytes:198413 (193.7 KiB)
          Interrupt:67 Base address:0x2000

II. DRBD installation and configuration

Upload the required packages to node1 and node2.

First upload them to /root on node2:

[root@node2 ~]# ll drbd* kmod*

-rw-r--r-- 1 root root 221868 Oct 18 15:29 drbd83-8.3.8-1.el5.centos.i386.rpm
-rw-r--r-- 1 root root 125974 Oct 18 15:29 kmod-drbd83-8.3.8-1.el5.centos.i686.rpm

1. Install the DRBD packages

The uploaded packages have no unmet dependencies, so they can be installed directly with rpm.

[root@node2 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing...                ########################################### [100%]
   1:drbd83                 ########################################### [100%]
[root@node2 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing...                ########################################### [100%]
   1:kmod-drbd83            ########################################### [100%]

Copy the uploaded packages to node1:

[root@node2 ~]# scp drbd* kmod* node1.gjp.com:/root

drbd83-8.3.8-1.el5.centos.i386.rpm               100%  217KB 216.7KB/s   00:00   
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm          100%  123KB 123.0KB/s   00:00

Install them on node1:

[root@node1 ~]# rpm -ivh drbd83-8.3.8-1.el5.centos.i386.rpm

warning: drbd83-8.3.8-1.el5.centos.i386.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing...                ########################################### [100%]
   1:drbd83                 ########################################### [100%]
[root@node1 ~]# rpm -ivh kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
warning: kmod-drbd83-8.3.8-1.el5.centos.i686.rpm: Header V3 DSA signature: NOKEY, key ID e8562897
Preparing...                ########################################### [100%]
   1:kmod-drbd83            ########################################### [100%]

2. Load the DRBD kernel module

[root@node1 ~]# modprobe drbd

[root@node1 ~]# lsmod |grep drbd
drbd                  228528  0

[root@node2 ~]# modprobe drbd

[root@node2 ~]# lsmod |grep drbd
drbd                  228528  0

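3. Edit the configuration files

When DRBD runs it reads the configuration file /etc/drbd.conf, which describes the mapping between DRBD devices and disk partitions.

3.1 Configure node1 as follows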

[root@node1 ~]# vim /etc/drbd.conf

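In vim's command-line mode, run the following to read in the sample file:

:r /usr/share/doc/drbd83-8.3.8/drbd.conf

The paths in its include directives are resolved relative to /etc/, where drbd.conf itself lives, so the included files are found under /etc/drbd.d/: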

[root@node1 ~]# cd /etc/drbd.d/

[root@node1 drbd.d]# ll
total 4
-rwxr-xr-x 1 root root 1418 Jun  4  2010 global_common.conf
[root@node1 drbd.d]# cp global_common.conf global_common.conf.bak

Back the file up before editing it, in case a mistake needs to be undone.

Edit the global configuration file:

[root@node1 drbd.d]# vim global_common.conf

In vim's command-line mode, run :1,$d to clear the original contents, then enter the following:

global {
        usage-count no;
        # minor-count dialog-refresh disable-ip-verification
}
common {
        protocol C;

        startup {
                wfc-timeout  120;
                degr-wfc-timeout 120;
        }
        disk {
                on-io-error detach;
                fencing resource-only;
        }
        net {
                cram-hmac-alg "sha1";
                shared-secret  "mydrbdlab";   # shared secret used to authenticate the peer
        }
        syncer {
                rate  100M;   # resynchronization rate
        }
}

Create the resource configuration file:

[root@node1 drbd.d]# vim /etc/drbd.d/nfs.res   // a new file

[root@node1 drbd.d]# cat /etc/drbd.d/nfs.res
resource nfs{
        on node1.gjp.com{
        device   /dev/drbd0;
        disk    /dev/sda5;
        address  192.168.10.2:7789;
        meta-disk       internal;
        }  

        on node2.gjp.com {

        device   /dev/drbd0;
        disk    /dev/sda5;
        address  192.168.10.6:7789;
        meta-disk       internal;
        }  
}
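
The two "on" sections in nfs.res differ only in host name and IP address, and each host name has to match what uname -n prints on that node. As an illustration of the file's structure, here is a sketch that prints such a resource file from parameters. The function names are hypothetical; the device /dev/drbd0, port 7789 and internal metadata are simply the values used above.

```shell
#!/bin/sh
# drbd_on_section HOST IP DISK   (hypothetical helper)
# Prints one "on HOST { ... }" section of a DRBD 8.3 resource file.
drbd_on_section() {
    printf '        on %s {\n' "$1"
    printf '                device    /dev/drbd0;\n'
    printf '                disk      %s;\n' "$3"
    printf '                address   %s:7789;\n' "$2"
    printf '                meta-disk internal;\n'
    printf '        }\n'
}

# render_drbd_resource NAME DISK HOST1 IP1 HOST2 IP2   (hypothetical helper)
# Prints a complete two-node resource definition on standard output.
render_drbd_resource() {
    printf 'resource %s {\n' "$1"
    drbd_on_section "$3" "$4" "$2"
    drbd_on_section "$5" "$6" "$2"
    printf '}\n'
}

# Example:
#   render_drbd_resource nfs /dev/sda5 \
#       node1.gjp.com 192.168.10.2 node2.gjp.com 192.168.10.6 > /etc/drbd.d/nfs.res
```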

3.2 Copy the configuration to node2:

[root@node1 drbd.d]# pwd

/etc/drbd.d
[root@node1 drbd.d]# scp /etc/drbd.conf node2:/etc/

// The first connection to the short name node2 asks you to confirm its host key (answer yes); later copies run without any prompt.

The authenticity of host 'node2 (192.168.10.6)' can't be established.
RSA key fingerprint is 87:be:8b:a4:bd:11:11:10:c2:ec:2d:ef:02:68:f6:0e.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2' (RSA) to the list of known hosts.
drbd.conf                                        100%  233     0.2KB/s   00:00   
[root@node1 drbd.d]# scp ./*  node2:/etc/drbd.d/
global_common.conf                               100%  538     0.5KB/s   00:00   
global_common.conf.bak                           100% 1418     1.4KB/s   00:00   
nfs.res                                          100%  344     0.3KB/s 

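3.3 Check the configuration (the failure below is expected at this point, because the metadata has not been created yet)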

[root@node1 drbd.d]# drbdadm adjust nfs

0: Failure: (119) No valid meta-data signature found.

    ==> Use 'drbdadm create-md res' to initialize meta-data area. <==

Command 'drbdsetup 0 disk /dev/sda5 /dev/sda5 internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10

[root@node2 etc]# drbdadm adjust nfs

0: Failure: (119) No valid meta-data signature found.

    ==> Use 'drbdadm create-md res' to initialize meta-data area. <==

Command 'drbdsetup 0 disk /dev/sda5 /dev/sda5 internal --set-defaults --create-device --fencing=resource-only --on-io-error=detach' terminated with exit code 10

[root@node2 etc]# drbdadm adjust nfs
drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

3.4 Create the metadata for the nfs resource

3.4.1 Create the nfs resource metadata on node1

[root@node1 drbd.d]# drbdadm create-md nfs

Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@node1 drbd.d]# ll /dev/drbd
drbd/   drbd1   drbd11  drbd13  drbd15  drbd3   drbd5   drbd7   drbd9  
drbd0   drbd10  drbd12  drbd14  drbd2   drbd4   drbd6   drbd8  
[root@node1 drbd.d]# ll /dev/drbd0
brw-r----- 1 root disk 147, 0 Oct 18 15:37 /dev/drbd0

3.5 Create the nfs resource metadata on node2

Run the command for node2 from node1 over SSH:

[root@node1 drbd.d]# ssh node2 'drbdadm create-md nfs'

NOT initialized bitmap
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
[root@node1 drbd.d]# ssh node2 'ls -l /dev/drbd0'
brw-r----- 1 root disk 147, 0 Oct 18 15:33 /dev/drbd0

3.6 Start the DRBD service and check its status

[root@node1 drbd.d]# service drbd start

Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

[root@node1 drbd.d]# ssh node2 'service drbd start'

Starting DRBD resources: drbdsetup 0 show:5: delay-probe-volume 0k => 0k out of range [4..1048576]k.

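Check the service status.

Note: the "usage-count no;" setting in the global configuration file must be the same on both machines; otherwise the primary/secondary roles are not shown.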

[root@node1 drbd.d]# service drbd status

drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
m:res  cs         ro                   ds                         p  mounted  fstype
0:nfs  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C

In the ro column, the value on the left is the local node's role and the value on the right is the peer's role.

[root@node1 drbd.d]# ssh node2 'service drbd status'

drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
m:res  cs         ro                   ds                         p  mounted  fstype
0:nfs  Connected  Secondary/Secondary  Inconsistent/Inconsistent  C

The following command also shows the roles of the two nodes:

[root@node1 drbd.d]# drbd-overview

  0:nfs  Connected Secondary/Secondary Inconsistent/Inconsistent C r----
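
The local/peer convention described above (left of the slash is the local node, right is the peer) applies to both the role and the disk-state columns. The sketch below pulls individual fields out of one drbd-overview line, which is handy in scripts that must wait for a particular state. The function name drbd_field and its field names are hypothetical.

```shell
#!/bin/sh
# drbd_field LINE FIELD   (hypothetical helper)
# LINE  is one line of drbd-overview output, for example
#       "0:nfs  Connected Secondary/Secondary Inconsistent/Inconsistent C r----"
# FIELD is one of: cstate, local_role, peer_role, local_disk, peer_disk
# Prints the requested value, or returns non-zero for an unknown FIELD.
drbd_field() {
    printf '%s\n' "$1" | awk -v want="$2" '{
        split($3, role, "/")          # column 3: local role / peer role
        split($4, disk, "/")          # column 4: local disk state / peer disk state
        v["cstate"] = $2
        v["local_role"] = role[1]; v["peer_role"] = role[2]
        v["local_disk"] = disk[1]; v["peer_disk"] = disk[2]
        if (want in v) {
            print v[want]
        } else {
            exit 1
        }
    }'
}

# Example:
#   drbd_field "$(drbd-overview | grep ':nfs ')" local_role
```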

Enable the service at boot:

[root@node1 drbd.d]# chkconfig drbd on

[root@node1 drbd.d]# chkconfig --list drbd
drbd               0:off    1:off    2:on    3:on    4:on    5:on    6:off
[root@node1 drbd.d]# ssh node2 'chkconfig drbd on'
[root@node1 drbd.d]# ssh node2 'chkconfig --list drbd'
drbd               0:off    1:off    2:on    3:on    4:on    5:on    6:off

3.7 On node1 (the primary node), do the following and check the mount

[root@node1 ~]# mkdir /mnt/nfs

[root@node1 ~]# ssh node2 'mkdir /mnt/nfs'
[root@node1 ~]# drbdsetup /dev/drbd0 primary -o  // make node1 primary; -o overwrites the peer's data for the initial sync
[root@node1 ~]# mkfs.ext3 /dev/drbd0

[root@node1 ~]# mount /dev/drbd0 /mnt/nfs/

[root@node1 ~]# mount

/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/drbd0 on /mnt/nfs type ext3 (rw)

[root@node1 ~]# ll /mnt/nfs/

total 12
drwx------ 2 root root 12288 Oct 18 16:55 lost+found

Check the state of both nodes again:

[root@node1 ~]# drbd-overview

  0:nfs  Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
[root@node1 ~]# ssh node2 'drbd-overview'
  0:nfs  Connected Secondary/Primary UpToDate/UpToDate C r----

Test: are files actually replicated?

[root@node1 ~]# cd /mnt/nfs

[root@node1 nfs]# echo "my name is guo jiping">gjp

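Now check on node2 whether the file exists.

A DRBD device can be mounted only on the primary node. So node1 must first be demoted to secondary (unmount /mnt/nfs there beforehand, since a device that is still in use cannot be demoted), and node2 must then be promoted to primary: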

[root@node1 ~]# drbdadm secondary nfs

[root@node1 ~]# drbd-overview
  0:nfs  Connected Secondary/Secondary UpToDate/UpToDate C r----

[root@node2 ~]# drbdadm primary nfs

[root@node2 ~]# drbd-overview
  0:nfs  Connected Primary/Secondary UpToDate/UpToDate C r----

Only now can node2 mount the device and see the files:

[root@node2 ~]# mount /dev/drbd0 /mnt/nfs

[root@node2 ~]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/drbd0 on /mnt/nfs type ext3 (rw)
[root@node2 ~]# cd /mnt/nfs
[root@node2 nfs]# ll
total 13
-rw-r--r-- 1 root root    22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root     0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
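
The order of the manual role switch matters: release the device on the old primary before promoting the new one. As a sketch of that order, the function below only prints the commands (it does not run them), so the plan can be reviewed first. The function name is hypothetical, and it assumes the passwordless SSH set up earlier.

```shell
#!/bin/sh
# print_switchover_plan RESOURCE DEVICE MOUNTPOINT OLD_PRIMARY NEW_PRIMARY
# (hypothetical helper) Prints, in order, the commands for a manual switch:
#   1. on the old primary: unmount the file system, then demote
#   2. on the new primary: promote, then mount
print_switchover_plan() {
    printf "ssh %s 'umount %s && drbdadm secondary %s'\n" "$4" "$3" "$1"
    printf "ssh %s 'drbdadm primary %s && mount %s %s'\n" "$5" "$1" "$2" "$3"
}

# Example:
#   print_switchover_plan nfs /dev/drbd0 /mnt/nfs node1 node2
```

The unmount fails if any shell on the old primary still has its working directory inside the mount point, so leave /mnt/nfs before switching.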

III. NFS configuration

On both servers, edit the NFS exports file and the NFS init script:

1. Detailed configuration on node1.gjp.com:

[root@node1 ~]# vim /etc/exports

[root@node1 ~]# cat /etc/exports
/mnt/nfs *(rw,sync,insecure,no_root_squash,no_wdelay)
[root@node1 ~]# chkconfig portmap on
[root@node1 ~]# chkconfig --list portmap
portmap            0:off    1:off    2:on    3:on    4:on    5:on    6:off
[root@node1 ~]# service portmap start
Starting portmap:                                          [  OK  ]
[root@node1 ~]# chkconfig nfs on
[root@node1 ~]# chkconfig --list nfs
nfs                0:off    1:off    2:on    3:on    4:on    5:on    6:off
[root@node1 ~]# service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
[root@node1 ~]# vim /etc/init.d/nfs

2. Detailed configuration on node2.gjp.com:

 

[root@node2 nfs]# vim /etc/exports

[root@node2 nfs]# cat /etc/exports
/mnt/nfs *(rw,sync,insecure,no_root_squash,no_wdelay)
[root@node2 nfs]# service portmap start
Starting portmap:                                          [  OK  ]
[root@node2 nfs]# chkconfig portmap on
[root@node2 nfs]# chkconfig --list portmap
portmap            0:off    1:off    2:on    3:on    4:on    5:on    6:off
[root@node2 nfs]# service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
[root@node2 nfs]# chkconfig nfs on
[root@node2 nfs]# chkconfig --list nfs
nfs                0:off    1:off    2:on    3:on    4:on    5:on    6:off

[root@node2 nfs]# vim /etc/init.d/nfs

IV. Heartbeat configuration

1. Upload the required packages and install them:

[root@node1 ~]# mount /dev/cdrom /mnt/cdrom

mount: block device /dev/cdrom is write-protected, mounting read-only

Use the packages on the installation disc to resolve dependencies; otherwise the installation fails.

[root@node1 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck

[root@node2 ~]# yum localinstall -y heartbeat-2.1.4-9.el5.i386.rpm heartbeat-pils-2.1.4-10.el5.i386.rpm heartbeat-stonith-2.1.4-10.el5.i386.rpm libnet-1.1.4-3.el5.i386.rpm perl-MailTools-1.77-1.el5.noarch.rpm --nogpgcheck

2. Copy and edit the configuration files

2.1 Copy and edit on node1

[root@node1 doc]# cd /usr/share/doc/heartbeat-2.1.4/

[root@node1 heartbeat-2.1.4]# ls
apphbd.cf     DirectoryMap.txt     HardwareGuide.html  heartbeat_api.txt  rsync.txt
authkeys      faqntips.html        HardwareGuide.txt   logd.cf            startstop
AUTHORS       faqntips.txt         haresources         README
ChangeLog     GettingStarted.html  hb_report.html      Requirements.html
COPYING       GettingStarted.txt   hb_report.txt       Requirements.txt
COPYING.LGPL  ha.cf                heartbeat_api.html  rsync.html
[root@node1 heartbeat-2.1.4]# cp authkeys ha.cf haresources /etc/ha.d/
[root@node1 heartbeat-2.1.4]# cd /etc/ha.d/
[root@node1 ha.d]# ls
authkeys  ha.cf  harc  haresources  rc.d  README.config  resource.d  shellfuncs
[root@node1 ha.d]# vim ha.cf

Set the following directives (uncomment or add them as needed):

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility     local0
keepalive 2
deadtime 10
udpport 694
# unicast: fill in the peer server's IP address
ucast eth0 192.168.10.6
auto_failback on
# the two node lines are additions
node node1.gjp.com
node node2.gjp.com
ping 10.10.10.3

[root@node1 ha.d]# vim haresources

node1.gjp.com   IPaddr::192.168.10.16/24/eth0 drbddisk::nfs  Filesystem::/dev/drbd0::/mnt/nfs::ext3 killnfsd
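
A haresources line names the preferred node first, followed by the resources that node should run. Heartbeat starts them from left to right and stops them from right to left, and the arguments to each resource script are separated by "::". The sketch below breaks a line into those parts to make the start order explicit. The function name explain_haresources is hypothetical.

```shell
#!/bin/sh
# explain_haresources LINE   (hypothetical helper)
# Prints the preferred node, then one numbered line per resource in start
# order, with the "::"-separated arguments shown in parentheses.
explain_haresources() {
    printf '%s\n' "$1" | awk '{
        print "preferred node: " $1
        for (i = 2; i <= NF; i++) {
            n = split($i, part, "::")
            args = ""
            for (j = 2; j <= n; j++) {
                args = args (j > 2 ? " " : "") part[j]
            }
            print (i - 1) ". " part[1] (args == "" ? "" : " (" args ")")
        }
    }'
}

# Example:
#   explain_haresources "$(grep -v '^#' /etc/ha.d/haresources | grep .)"
```

For the line used in this lab, the order is: bring up the service IP 192.168.10.16, promote the DRBD resource nfs, mount /dev/drbd0 on /mnt/nfs, and finally run killnfsd to restart NFS.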

[root@node1 ha.d]# vim authkeys

Append at the end of the file:

auth 1
1 crc
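
The crc method only detects corrupted packets and provides no authentication, which is acceptable on an isolated lab network. On a shared network the sha1 method with a secret key is the usual choice. The following sketch, which is a swapped-in alternative and not what this lab uses, prints an authkeys body with a random sha1 key; the function name make_authkeys is hypothetical.

```shell
#!/bin/sh
# make_authkeys   (hypothetical helper)
# Prints an authkeys file body that selects key 1 and defines it as a
# sha1 key made of 16 random bytes (32 hex characters).
make_authkeys() (
    key=$(dd if=/dev/urandom bs=16 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
    printf 'auth 1\n1 sha1 %s\n' "$key"
)

# Example (the same file must then be copied to the other node):
#   make_authkeys > /etc/ha.d/authkeys && chmod 600 /etc/ha.d/authkeys
```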

[root@node1 ha.d]# echo "killall -9 nfsd; /etc/init.d/nfs restart;exit 0">>resource.d/killnfsd

[root@node1 ha.d]# chmod 600 /etc/ha.d/authkeys
[root@node1 ha.d]# chmod 755 /etc/ha.d/resource.d/killnfsd

2.2 Copy and edit on node2

Copy the configuration just created to node2.gjp.com:

[root@node1 ha.d]# ls

authkeys  ha.cf  harc  haresources  rc.d  README.config  resource.d  shellfuncs
[root@node1 ha.d]# scp ha.cf authkeys haresources node2:/etc/ha.d/
ha.cf                                                              100%   10KB  10.3KB/s   00:00   
authkeys                                                           100%  659     0.6KB/s   00:00   
haresources                                                        100% 6013     5.9KB/s   00:00   
[root@node1 ha.d]# scp resource.d/killnfsd node2:/etc/ha.d/resource.d/
killnfsd                                                           100%   48     0.1KB/s   00:00   
[root@node1 ha.d]# pwd
/etc/ha.d

[root@node2 ~]# cd /etc/ha.d/

[root@node2 ha.d]# ls 
authkeys  ha.cf  haresources  resource.d
[root@node2 ha.d]# vim ha.cf

ucast eth0 192.168.10.2   // on node2 the peer is node1
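
The only line of ha.cf that differs between the two nodes is the ucast address, which must point at the other node. Instead of editing the copy by hand, the address can be rewritten in transit. This is a sketch under that assumption; the function name retarget_ucast is hypothetical.

```shell
#!/bin/sh
# retarget_ucast PEER_IP   (hypothetical helper)
# Reads an ha.cf on standard input and writes it to standard output with
# the address of every active "ucast <iface> <ip>" line replaced by PEER_IP.
# Commented-out lines (first field "#ucast") and all other lines pass
# through unchanged.
retarget_ucast() {
    awk -v peer="$1" '
        $1 == "ucast" && NF >= 3 { print $1, $2, peer; next }
        { print }
    '
}

# Example, run on node1:
#   retarget_ucast 192.168.10.2 < /etc/ha.d/ha.cf | ssh node2 'cat > /etc/ha.d/ha.cf'
```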

 

3. Start the services

 

 

[root@node1 ha.d]# chkconfig heartbeat on

[root@node1 ha.d]# service heartbeat start
Starting High-Availability services:
2012/10/18_18:36:00 INFO:  Resource is stopped
                                                           [  OK  ]

[root@node1 ha.d]# drbd-overview

  0:nfs  Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%

[root@node2 ~]# service heartbeat start

Starting High-Availability services:
2012/10/18_18:40:36 INFO:  Resource is stopped
                                                           [  OK  ]

[root@node2 ~]# chkconfig heartbeat on

 

[root@node2 ha.d]# drbd-overview

  0:nfs  Connected Secondary/Primary UpToDate/UpToDate C r----
 

[root@node1 ha.d]# ifconfig eth0:0

eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:F9:1C:6F 
          inet addr:192.168.10.16  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:67 Base address:0x2000

V. Testing

1. On the test machine, mount 192.168.10.16:/mnt/nfs at the local directory /info

IP address of the test machine:

[root@gjp99 ~]# setup

 

[root@gjp99 ~]# mkdir /info

[root@gjp99 info]# mount 192.168.10.16:/mnt/nfs/ /info/

[root@gjp99 info]# cd /info
[root@gjp99 info]# ll
total 13
-rw-r--r-- 1 root root    22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root     0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found

[root@gjp99 info]# touch test

[root@gjp99 info]# ll
total 13
-rw-r--r-- 1 root root    22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root     0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
-rw-r--r-- 1 root root     0 Oct 18 19:16 test

 

[root@gjp99 info]# echo "guo jiping">test

2. On the test machine, create a shell script that tests the mount once per second

[root@gjp99 ~]# vim tesnfs.sh

[root@gjp99 ~]# chmod +x tesnfs.sh     // make it executable
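
The contents of tesnfs.sh are not shown in this write-up. The following is a hypothetical reconstruction, inferred only from the "try touch x" / "done touch x" output format that the script prints; the function names, the --run switch and the optional count are assumptions.

```shell
#!/bin/sh
# Hypothetical reconstruction of a probe like tesnfs.sh.

# probe_once DIR: touch DIR/x and print a timestamped line before and after.
probe_once() {
    printf '%s%s\n' '---> try touch x:' "$(date)"
    touch "$1/x"
    printf '%s%s\n' '<---done touch x:' "$(date)"
}

# run_probe DIR [COUNT]: probe once per second; loop forever if COUNT is omitted.
run_probe() {
    dir=$1; remaining=${2:-}
    while [ -z "$remaining" ] || [ "$remaining" -gt 0 ]; do
        probe_once "$dir"
        sleep 1
        [ -z "$remaining" ] || remaining=$((remaining - 1))
    done
}

# Start the loop only when asked to, so the functions can also be sourced:
#   ./tesnfs.sh --run          probes /info
#   ./tesnfs.sh --run /mnt/x   probes another directory
if [ "${1:-}" = "--run" ]; then
    run_probe "${2:-/info}"
fi
```

While the NFS server is unreachable, the touch call does not return, so the gap between a "try" line and its "done" line gives a rough measure of how long a failover took.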

3. Stop the heartbeat service on the primary node node1; the backup node node2 then takes over the service

[root@node1 ~]# service heartbeat stop

Stopping High-Availability services:
                                                           [  OK  ]
[root@node1 ~]# drbd-overview
  0:nfs  Connected Secondary/Primary UpToDate/UpToDate C r----
[root@node1 ~]# ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:F9:1C:6F 
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:67 Base address:0x2000

 

[root@node2 ha.d]# drbd-overview

  0:nfs  Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%
[root@node2 ha.d]# ifconfig eth0:0
eth0:0    Link encap:Ethernet  HWaddr 00:0C:29:F5:92:A1 
          inet addr:192.168.10.16  Bcast:192.168.10.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:67 Base address:0x2000

[root@node2 ha.d]# service heartbeat status

heartbeat OK [pid 7543 et al] is running on node2.gjp.com [node2.gjp.com]...

4. Run the tesnfs.sh test script on the client; it keeps printing output like the following:

[root@gjp99 ~]# ./tesnfs.sh

---> try touch x:Thu Oct 18 19:43:40 CST 2012
<---done touch x:Thu Oct 18 19:43:41 CST 2012

---> try touch x:Thu Oct 18 19:43:42 CST 2012

<---done touch x:Thu Oct 18 19:43:42 CST 2012

---> try touch x:Thu Oct 18 19:43:43 CST 2012

<---done touch x:Thu Oct 18 19:43:43 CST 2012

5. The client's mount remains in place and the share is still usable

[root@gjp99 ~]# mount

/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
192.168.10.16:/mnt/nfs/ on /info type nfs (rw,addr=192.168.10.16)
192.168.10.16:/mnt/nfs/ on /info type nfs (rw,addr=192.168.10.16)
[root@gjp99 ~]# ll /info/
total 14
-rw-r--r-- 1 root root    22 Oct 18 17:01 gjp
-rw-r--r-- 1 root root     0 Oct 18 17:00 guojiping
drwx------ 2 root root 12288 Oct 18 16:55 lost+found
-rw-r--r-- 1 root root    11 Oct 18 19:17 test

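node2 has taken over the service successfully, so the lab achieves what it set out to do. You can also create files by hand in the NFS-mounted directory and switch the DRBD roles of node1 and node2 back and forth to test further.

6. Restore node1 as the primary node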

[root@node1 ~]# service heartbeat start

Starting High-Availability services:
2012/10/18_19:51:01 INFO:  Resource is stopped
                                                           [  OK  ]
[root@node1 ~]# drbd-overview
  0:nfs  Connected Secondary/Primary UpToDate/UpToDate C r----
[root@node1 ~]# drbd-overview 
  0:nfs  Connected Primary/Secondary UpToDate/UpToDate C r---- /mnt/nfs ext3 471M 11M 437M 3%

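Note that the switchover is not instantaneous: as the two drbd-overview runs above show, there is a short interval before node1 becomes primary again.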

 

 

The End