一、环境准备

192.168.100.100    master
192.168.100.101    node01
192.168.100.102    node02/manager

三台服务器均需生成免认证密码文件

ssh-keygen

拷贝免登录文件到当前Linux服务器和另外两台Linux服务器

ssh-copy-id -i /root/.ssh/id_rsa [email protected]
ssh-copy-id -i /root/.ssh/id_rsa [email protected]
ssh-copy-id -i /root/.ssh/id_rsa [email protected]

测试免密

ssh [email protected]
ssh [email protected]
ssh [email protected]

二、安装MySQL

下载安装包 mysql80-community-release-el7-8.noarch.rpm

wget http://dev.mysql.com/get/mysql80-community-release-el7-8.noarch.rpm

安装mysql源

yum localinstall -y mysql80-community-release-el7-8.noarch.rpm

查看yum源默认安装版本

yum repolist enabled | grep mysql

使用 yum 安装 mysql

yum -y install mysql-community-server --nogpgcheck

修改MySQL配置文件

vi /etc/my.cnf
主节点 master 的配置

[mysqld]
server_id = 1                                   
skip_name_resolve   
gtid-mode = on                                  
enforce-gtid-consistency = true 
binlog_format = ROW
log-slave-updates = 1                   
log-bin = mysql-bin                             
relay-log = relay-log                   
default_authentication_plugin = mysql_native_password

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

从节点node01的配置

[mysqld]

server_id = 2
skip_name_resolve   
gtid-mode = on                                  
enforce-gtid-consistency = true 
binlog_format = ROW
log-slave-updates = 1                   
log-bin = mysql-bin                             
relay-log = relay-log                   
read_only = ON                                          
relay_log_purge = 0                                     
default_authentication_plugin = mysql_native_password

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

从节点node02的配置

[mysqld]

server_id = 3
skip_name_resolve   
gtid-mode = on                                  
enforce-gtid-consistency = true 
binlog_format = ROW
log-slave-updates = 1                   
log-bin = mysql-bin                             
relay-log = relay-log                   
read_only = ON                                          
relay_log_purge = 0                                     
default_authentication_plugin = mysql_native_password

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

启动MySQL并查看MySQL临时密码

systemctl start mysqld
cat /var/log/mysqld.log | grep password

登录后修改密码并开启远程登陆

mysql -uroot -p
#修改本地登陆密码
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'yourpassword';
#在主节点创建远程登陆账号
create user 'root'@'%' identified with mysql_native_password by 'yourpassword';
grant all privileges on *.* to 'root'@'%' with grant option;
flush privileges;

三、配置主从

主节点上创建复制用户

create user rep@'%' identified with mysql_native_password by 'yourpassword';
grant replication slave,replication client on *.* to rep@'%';

从节点绑定主节点

#每个从节点执行
change master to
master_host='192.168.100.100',
master_port=3306,
master_user='rep',
master_password='yourpassword',
master_auto_position=1;
start slave;

查看主从连接状态

show slave status\G

四、安装配置MHA

三台服务器安装所需要的依赖包

yum install -y epel-release
yum install -y perl-DBD-MySQL ncftp perl-DBI

安装监控服务器依赖包

因资源不足选的node02节点,最好单独在一个服务器上部署和数据库节点分开
yum install -y perl-Config-Tiny.noarch perl-Time-HiRes.x86_64 perl-Parallel-ForkManager perl-Log-Dispatch.noarch

三台服务器上传和安装MHA-node包

rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

在监控节点安装MHA-manager包

这里选择的是node02节点
rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

主节点创建mha用户

create user 'mhaadmin'@'%' identified WITH mysql_native_password by 'yourpassword';
grant all privileges on *.* to mhaadmin@'%';

三台服务器上创建mha目录

mkdir -p /data/mha

管理节点(node02)创建mha服务目录

mkdir -p /home/mha

配置管理节点的配置文件

vi /data/mha/mysql-mha.conf 

[server default]
user=mhaadmin
password=yourpassword
manager_workdir=/home/mha
manager_log=/home/mha/manager.log
remote_workdir=/home/mha
ssh_user=root
repl_user=rep
repl_password=yourpassword
ping_interval=3
master_binlog_dir=/var/lib/mysql
ssh_port=22
master_ip_failover_script = /usr/bin/master_ip_failover   # 若使用keepalived进行vip飘逸则注释此行master_ip_failover_script = /usr/bin/master_ip_failover
secondary_check_script = /usr/bin/masterha_secondary_check -s 192.168.100.101 -s 192.168.100.102
[server1]
hostname=192.168.100.100
candidate_master=1
[server2]
hostname=192.168.100.101
candidate_master=1
[server3]
hostname=192.168.100.102
no_master=1

配置管理节点的vip飘逸文件

vi /usr/bin/master_ip_failover

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command, $orig_master_host, $orig_master_ip,$ssh_user,
    $orig_master_port, $new_master_host, $new_master_ip,$new_master_port,
    $orig_master_ssh_port,$new_master_ssh_port,$new_master_user,$new_master_password
);

my $vip = '192.168.100.10/24';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $ssh_Bcast_arp= "/sbin/arping -I ens33 -c 3 -A 192.168.100.10";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'orig_master_ssh_port=i' => \$orig_master_ssh_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
    'new_master_ssh_port' => \$new_master_ssh_port,
    'new_master_user' => \$new_master_user,
    'new_master_password' => \$new_master_password

);

exit &main();

sub main {
    $ssh_user = defined $ssh_user ? $ssh_user : 'root';
    print "\n\nIN SCRIPT TEST====$ssh_user|$ssh_stop_vip==$ssh_user|$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
	    &start_arp();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub start_arp() {
    `ssh $ssh_user\@$new_master_host \" $ssh_Bcast_arp \"`;
}
sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --ssh_user=user --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

添加权限

chmod +x /usr/bin/master_ip_failover

在主数据库手动执行vip添加命令

/sbin/ifconfig ens33:1 192.168.100.10/24

利用脚本检查ssh互信配置

masterha_check_ssh --conf=/data/mha/mysql-mha.conf

检查主从配置

masterha_check_repl --conf=/data/mha/mysql-mha.conf
Ubuntu系统可能会出现at /usr/share/perl5/MHA/NodeUtil.pm line 195;报错,按照以下将原有的注释掉并加入以下未注释内容

vim /usr/share/perl5/MHA/NodeUtil.pm +195



#sub parse_mysql_version($) {
#  my $str = shift;
#  my $result = sprintf( '%03d%03d%03d', $str =~ m/(\d+)/g );
#  return $result;
#}
sub parse_mysql_version($) {
  my $str = shift;
 ($str) =  $str =~ m/^[^-]*/g;
  my $result = sprintf( '%03d%03d%03d', $str =~ m/(\d+)/g );
  return $result;
}

#sub parse_mysql_major_version($) {
#  my $str = shift;
#  my $result = sprintf( '%03d%03d', $str =~ m/(\d+)/g );
#  return $result;
#}
sub parse_mysql_major_version($) {
  my $str = shift;
  $str =~ /(\d+)\.(\d+)/;
  my $strmajor = "$1.$2";
  my $result = sprintf( '%03d%03d', $strmajor =~ m/(\d+)/g );
  return $result;
}

启动MHA

masterha_manager --conf=/data/mha/mysql-mha.conf &

配置systemctl管理MHA(centos)

vi /etc/init.d/masterha_managerd

#!/bin/bash
# chkconfig: 35 80 20
# description: MHA management script.

STARTEXEC="/usr/bin/masterha_manager --conf"
STOPEXEC="/usr/bin/masterha_stop --conf"
CONF="/data/mha/mysql-mha.conf"
process_count=`ps -ef |grep -w masterha_manager|grep -v grep|wc -l`
PARAMS="--ignore_last_failover"

case "$1" in
  start)
      if [ $process_count -gt 1 ]
      then
              echo "masterha_manager exists, process is already running"
      else
              echo "Starting Masterha Manager"
              $STARTEXEC $CONF $PARAMS < /dev/null > /home/mha/manager.log 2>&1 &
      fi
      ;;
  stop)
      if [ $process_count -eq 0 ]
      then
              echo "Masterha Manager does not exist, process is not running"
      else
              echo "Stopping ..."
              $STOPEXEC $CONF
              while(true)
              do
                  process_count=`ps -ef |grep -w masterha_manager|grep -v grep|wc -l`
                  if [ $process_count -gt 0 ]
                      then
                           sleep 1
                      else
                           break
                  fi
              done
              echo "Master Manager stopped"
      fi
      ;;
  *)
      echo "Please use start or stop as first argument"
      ;;
esac

添加权限加载配置

chmod +x /etc/init.d/masterha_managerd
chkconfig --add masterha_managerd
chkconfig masterha_managerd on
systemctl start masterha_managerd
systemctl status masterha_managerd

配置systemctl管理MHA(Ubuntu)

vi /etc/systemd/system/masterha_managerd.service

[Unit]
Description=MHA Manager Service
After=network.target mysql.service

[Service]
Type=simple
User=root
ExecStart=/usr/bin/masterha_manager --conf=/data/mha/mysql-mha.conf
Restart=on-failure

[Install]
WantedBy=multi-user.target


查看MHA状态

masterha_check_status --conf=/data/mha/mysql-mha.conf

查看日志

tail -100 /home/mha/manager.log

五、故障模拟

停掉主库mysql

systemctl stop mysqld

查看mha日志

tail -f /home/mha/manager.log
发现node01已经变成新的主库,查看node02的主从状态,发现所属主也已经变成node01
show slave status\G
查看原主库的vip 发现已经没有,在新主库node01查看 发现vip已经漂移到了这里
管理节点查看集群状态 发现MHA已经停止失效,故障转移完成后, manager将会自动停止
masterha_check_status --conf=/data/mha/mysql-mha.conf

恢复MHA

将停机的旧主启动 配置主从 主库为刚才的新主库
systemctl start mysqld.service

change master to
master_host='192.168.100.101',
master_port=3306,
master_user='rep',
master_password='yourpassword',
master_auto_position=1;
start slave;

检查集群节点状态

masterha_check_repl --conf=/data/mha/mysql-mha.conf

删除failover.complete文件

MHA的监控进程,在执行一次切换后,就会结束退出,并留下一个禁止文件 /home/mha/mysql-mha.failover.complete
该文件存在的话,不会执行下一次切换

rm -rf /home/mha/mysql-mha.failover.complete

重启MHA集群

#masterha_manager --conf=/data/mha/mysql-mha.conf &
systemctl restart masterha_managerd

查看MHA集群状态

masterha_check_status --conf=/data/mha/mysql-mha.conf