MYSQL MHA 5.7

服务器

浏览数:96

2020-5-31


1.   概述

它由日本DeNA公司youshimaton(现就职于Facebook公司)开发,MHA来保证数据库系统的高可用.在宕机的时间内(通常10—30秒内),完成故障切换,部署MHA,可避免主从一致性问题,节约购买新服务器的费用,不影响服务器性能,易安装,不改变现有部署。

   还支持在线切换,从当前运行master切换到一个新的master上面,只需要很短的时间(0.5-2秒内),此时仅仅阻塞写操作,并不影响读操作,便于主机硬件维护。

 

工作原理

(1)从宕机崩溃的master保存二进制日志事件(binlog events);

(2)识别含有最新更新的slave;

(3)应用差异的中继日志(relay log)到其他的slave;

(4)应用从master保存的二进制日志事件(binlog events);

(5)提升一个slave为新的master;

(6)使其他的slave连接新的master进行复制;

2.   搭建环境

1.  架构说明:

IP

主机名

数据库ID

类型

192.168.2.51

mysql01

251

MySQL Master + mha-node

192.168.2.52

mysql02

252

Mha-node+ Candidate Slave

192.168.2.53

mysql03

223

Mha-node

192.168.2.54

mysql-mha01

 

mha-manager

 

OS:CentOS 6.9

Mysql:5.7.20

Mha:0.57

2:配置/etc/my.cnf相关参数,在3各节点中分别配置GTID

log-bin=binlog
binlog-format=ROW

log-slave-updates=true

gtid-mode=on

enforce-gtid-consistency=true

master-info-repository=TABLE

relay-log-info-repository=TABLE

sync-master-info=1

slave-parallel-workers=2

binlog-checksum=CRC32

master-verify-checksum=1

slave-sql-verify-checksum=1

binlog-rows-query-log_events=1

 

重新初始化MySQL(非必须,这部是由于自动化脚本无法判定密码,要自动化必须剔除密码)

mysqld –initialize-insecure –user=mysql

service mysqld restart

 

设置root密码,创建复制用户:
mysql> use mysql;
mysql> GRANT ALL PRIVILEGES ON *.* TO root@”%” IDENTIFIED BY “123456”;

mysql> GRANT GRANT REPLICATION SLAVE ON *.* TO ‘repl’@’%’ identified by ‘123456’;

mysql> grant all privileges on *.* to ‘mha’@’%’ identified by ‘123456’;

mysql> flush privileges;
 

在主导出mysqldump -uroot -p –all-databases –triggers –routines –events > /root/all.sql

在备库导入

3:在mysql2、mysql3配置Gtid复制
CHANGE MASTER TO 
MASTER_HOST = ‘192.168.2.51’,
MASTER_PORT = 3306,
MASTER_USER = ‘repl’,
MASTER_PASSWORD = ‘123456’,
MASTER_AUTO_POSITION = 1;

 

3.   MHA Manager安装

3.1.  manager节点安装依赖软件

yum install perl-DBD-MySQL perl-ExtUtils-MakeMaker perl-CPAN perl-Mail-Sender perl-Log-Dispatch perl-Time-HiRes perl-Config-Tiny perl-Parallel-ForkManager perl-Module-Install perl-Config-IniFiles –y
 

3.2.  安装MHA-Manager的RPM

git clone https://gitee.com/gibsonxue/MHA.git
cd mha ; yum install mha4mysql-manager-0.57-0.el6.noarch.rpm ; mha4mysql-node-0.57-0.el6.noarch.rpm
 

3.3.  免秘钥登录配置

在manager节点需要配置到所有node节点的ssh免密码登录。
# ssh-keygen
#依次添加信任sh-copy-id  root@ip{x.x.x.x}
#scp –r ~/.ssh root@
-rw------- 1 root root  398 Nov 19 11:24 authorized_keys
-rw------- 1 root root 1671 Nov 19 11:19 id_rsa
-rw-r--r-- 1 root root  398 Nov 19 11:20 id_rsa.pub
-rw-r--r-- 1 root root 1576 Nov 19 11:51 known_hosts

拷贝整个.ssh目录到所有其他服务器

 

3.4.  脚本定义

定义的脚本如下放入/usr/bin/下面:

3.4.1.    master_ip_failover_01 故障切换时执行脚本

#!/usr/bin/env perl

use strict;

use warnings FATAL =>’all’;

 

use Getopt::Long;

 

my (

$command,          $ssh_user,        $orig_master_host, $orig_master_ip,

$orig_master_port, $new_master_host, $new_master_ip,    $new_master_port

);

 

my $vip = ‘192.168.2.50/24’;  # Virtual IP

my $key = “2”;

my $ssh_start_vip = “/sbin/ifconfig eth0:$key $vip”;

my $ssh_stop_vip = “/sbin/ifconfig eth0:$key down”;

my $exit_code = 0;

 

GetOptions(

‘command=s’          => \$command,

‘ssh_user=s’         => \$ssh_user,

‘orig_master_host=s’ => \$orig_master_host,

‘orig_master_ip=s’   => \$orig_master_ip,

‘orig_master_port=i’ => \$orig_master_port,

‘new_master_host=s’  => \$new_master_host,

‘new_master_ip=s’    => \$new_master_ip,

‘new_master_port=i’  => \$new_master_port,

);

 

exit &main();

 

sub main {

 

#print “\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n”;

 

if ( $command eq “stop” || $command eq “stopssh” ) {

 

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

        # If you manage master ip address at global catalog database,

        # invalidate orig_master_ip here.

        my $exit_code = 1;

        eval {

            print “\n\n\n***************************************************************\n”;

            print “Disabling the VIP – $vip on old master: $orig_master_host\n”;

            print “***************************************************************\n\n\n\n”;

&stop_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn “Got Error: $@\n”;

            exit $exit_code;

        }

        exit $exit_code;

}

elsif ( $command eq “start” ) {

 

        # all arguments are passed.

        # If you manage master ip address at global catalog database,

        # activate new_master_ip here.

        # You can also grant write access (create user, set read_only=0, etc) here.

my $exit_code = 10;

        eval {

            print “\n\n\n***************************************************************\n”;

            print “Enabling the VIP – $vip on new master: $new_master_host \n”;

            print “***************************************************************\n\n\n\n”;

&start_vip();

            $exit_code = 0;

        };

        if ($@) {

            warn $@;

            exit $exit_code;

        }

        exit $exit_code;

}

elsif ( $command eq “status” ) {

        print “Checking the Status of the script.. OK \n”;

        `ssh $ssh_user\@$orig_master_host \” $ssh_start_vip \”`;

        exit 0;

}

else {

&usage();

        exit 1;

}

}

 

# A simple system call that enable the VIP on the new master

sub start_vip() {

`ssh $ssh_user\@$new_master_host \” $ssh_start_vip \”`;

}

# A simple system call that disable the VIP on the old_master

sub stop_vip() {

`ssh $ssh_user\@$orig_master_host \” $ssh_stop_vip \”`;

}

 

sub usage {

print

“Usage: master_ip_failover –command=start|stop|stopssh|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=po

rt –new_master_host=host –new_master_ip=ip –new_master_port=port\n”;

}

 

3.4.2.    master_ip_online_change_01无故障维护切换使用

#!/usr/bin/env perl

 

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

 

## Note: This is a sample script and is not complete. Modify the script based on your environment.

 

use strict;

use warnings FATAL => ‘all’;

 

use Getopt::Long;

use MHA::DBHelper;

use MHA::NodeUtil;

use Time::HiRes qw( sleep gettimeofday tv_interval );

use Data::Dumper;

 

my $_tstart;

my $_running_interval = 0.1;

 

my $vip = ‘192.168.2.50’; # Virtual IP

my $key = “2”;

my $ssh_start_vip = “/sbin/ifconfig eth0:$key $vip/24”;

my $ssh_stop_vip = “/sbin/ifconfig eth0:$key down”;

my $ssh_send_garp = “/sbin/arping -U $vip -I eth0 -c 1”;

 

my (

  $command,              $orig_master_is_new_slave, $orig_master_host,

  $orig_master_ip,       $orig_master_port,         $orig_master_user,

  $orig_master_password, $orig_master_ssh_user,

  $new_master_host,      $new_master_ip,            $new_master_port,         

  $new_master_user,      $new_master_password,      $new_master_ssh_user,

);

 

GetOptions(

  ‘command=s’                => \$command,

  ‘orig_master_is_new_slave’ => \$orig_master_is_new_slave,

  ‘orig_master_host=s’       => \$orig_master_host,

  ‘orig_master_ip=s’         => \$orig_master_ip,

  ‘orig_master_port=i’       => \$orig_master_port,

  ‘orig_master_user=s’       => \$orig_master_user,

  ‘orig_master_password=s’   => \$orig_master_password,

  ‘orig_master_ssh_user=s’   => \$orig_master_ssh_user,

  ‘new_master_host=s’        => \$new_master_host,

  ‘new_master_ip=s’          => \$new_master_ip,

  ‘new_master_port=i’        => \$new_master_port,

  ‘new_master_user=s’        => \$new_master_user,

  ‘new_master_password=s’    => \$new_master_password,

  ‘new_master_ssh_user=s’    => \$new_master_ssh_user,

);

 

exit &main();

 

sub start_vip(){

    `ssh  $new_master_ssh_user\@$new_master_host \” $ssh_start_vip \”`;

    `ssh  $new_master_ssh_user\@$new_master_host \” $ssh_send_garp \”`;

}

 

sub stop_vip(){

    `ssh $orig_master_ssh_user\@$orig_master_host \” $ssh_stop_vip \”`;

}

 

 

sub current_time_us {

  my ( $sec, $microsec ) = gettimeofday();

  my $curdate = localtime($sec);

  return $curdate . ” ” . sprintf( “%06d”, $microsec );

}

 

sub sleep_until {

  my $elapsed = tv_interval($_tstart);

  if ( $_running_interval > $elapsed ) {

    sleep( $_running_interval – $elapsed );

  }

}

 

sub get_threads_util {

  my $dbh                    = shift;

  my $my_connection_id       = shift;

  my $running_time_threshold = shift;

  my $type                   = shift;

  $running_time_threshold = 0 unless ($running_time_threshold);

  $type                   = 0 unless ($type);

  my @threads;

 

  my $sth = $dbh->prepare(“SHOW PROCESSLIST”);

  $sth->execute();

 

  while ( my $ref = $sth->fetchrow_hashref() ) {

    my $id         = $ref->{Id};

    my $user       = $ref->{User};

    my $host       = $ref->{Host};

    my $command    = $ref->{Command};

    my $state      = $ref->{State};

    my $query_time = $ref->{Time};

    my $info       = $ref->{Info};

    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);

    next if ( $my_connection_id == $id );

    next if ( defined($query_time) && $query_time < $running_time_threshold );

    next if ( defined($command)    && $command eq “Binlog Dump” );

    next if ( defined($user)       && $user eq “system user” );

    next

      if ( defined($command)

      && $command eq “Sleep”

      && defined($query_time)

      && $query_time >= 1 );

 

    if ( $type >= 1 ) {

      next if ( defined($command) && $command eq “Sleep” );

      next if ( defined($command) && $command eq “Connect” );

    }

 

    if ( $type >= 2 ) {

      next if ( defined($info) && $info =~ m/^select/i );

      next if ( defined($info) && $info =~ m/^show/i );

    }

 

    push @threads, $ref;

  }

  return @threads;

}

 

sub main {

  if ( $command eq “stop” ) {

    ## Gracefully killing connections on the current master

    # 1. Set read_only= 1 on the new master

    # 2. DROP USER so that no app user can establish new connections

    # 3. Set read_only= 1 on the current master

    # 4. Kill current queries

    # * Any database access failure will result in script die.

    my $exit_code = 1;

    eval {

      ## Setting read_only=1 on the new master (to avoid accident)

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      print current_time_us() . ” Set read_only on the new master.. “;

      $new_master_handler->enable_read_only();

      if ( $new_master_handler->is_read_only() ) {

        print “ok.\n”;

      }

      else {

        die “Failed!\n”;

      }

      $new_master_handler->disconnect();

 

      # Connecting to the orig master, die if any database error happens

      my $orig_master_handler = new MHA::DBHelper();

      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,

        $orig_master_user, $orig_master_password, 1 );

 

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand

      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . ” Drpping app user on the orig master..\n”;

      #drop_app_user($orig_master_handler);

 

      ## Waiting for N * 100 milliseconds so that current connections can exit

      my $time_until_read_only = 15;

      $_tstart = [gettimeofday];

      my @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_read_only > 0 && $#threads >= 0 ) {

        if ( $time_until_read_only % 5 == 0 ) {

          printf

“%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n”,

            current_time_us(), $#threads + 1, $time_until_read_only * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . “\n”

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_read_only–;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write

      print current_time_us() . ” Set read_only=1 on the orig master.. “;

      $orig_master_handler->enable_read_only();

      if ( $orig_master_handler->is_read_only() ) {

        print “ok.\n”;

      }

      else {

        die “Failed!\n”;

      }

 

      ## Waiting for M * 100 milliseconds so that current update queries can complete

      my $time_until_kill_threads = 5;

      @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {

        if ( $time_until_kill_threads % 5 == 0 ) {

          printf

“%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n”,

            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . “\n”

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_kill_threads–;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

 

      ## Terminating all threads

      print current_time_us() . ” Killing all application threads..\n”;

      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );

      print current_time_us() . ” done.\n”;

      $orig_master_handler->enable_log_bin_local();

      $orig_master_handler->disconnect();

 

      ## Droping the VIP    

      print “Disabling the VIP an old master: $orig_master_host \n”;

      &stop_vip();

 

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK

      $exit_code = 0;

    };

    if ($@) {

      warn “Got Error: $@\n”;

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq “start” ) {

    ## Activating master ip on the new master

    # 1. Create app user with write privileges

    # 2. Moving backup script if needed

    # 3. Register new master’s ip to the catalog database

 

# We don’t return error even though activating updatable accounts/ip failed so that we don’t interrupt slaves’ recovery.

# If exit code is 0 or 10, MHA does not abort

    my $exit_code = 10;

    eval {

      my $new_master_handler = new MHA::DBHelper();

 

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

 

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print current_time_us() . ” Set read_only=0 on the new master.\n”;

      $new_master_handler->disable_read_only();

 

      ## Creating an app user on the new master

      #print current_time_us() . ” Creating app user on the new master..\n”;

      # create_app_user($new_master_handler);

      print “Enabling the VIP $vip on the new master: $new_master_host \n”;

      &start_vip();

      $new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

 

      ## Update master ip on the catalog database, etc

      $exit_code = 0;

    };

    if ($@) {

      warn “Got Error: $@\n”;

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq “status” ) {

 

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

 

sub usage {

  print

“Usage: master_ip_online_change –command=start|stop|status –orig_master_host=host –orig_master_ip=ip –orig_master_port=port –new_master_host=host –new_master_ip=ip –new_master_port=port\n”;

  die;

}

 

3.4.3.    发送报告使用send_report.py

#!/usr/bin/env python

#-*- encoding:utf-8 -*-

#——————————————————————————-

# Name:        send_report.py

# Author:      zhoujy

#———————————————-

import os

import sys

import time

import datetime

import smtplib

import subprocess

import fileinput

import getopt

from email.mime.text import MIMEText

from email.mime.multipart import MIMEMultipart

from email.Utils import COMMASPACE, formatdate

 

reload(sys)

sys.setdefaultencoding(‘utf8’)

 

def send_mail(to, subject, text, from_mail, server=”localhost”):

    message = MIMEMultipart()

    message[‘From’] = from_mail

    message[‘To’] = COMMASPACE.join(to)

    message[‘Date’] = formatdate(localtime=True)

    message[‘Subject’] = subject

    message.attach(MIMEText(text,_charset=’utf-8′))

    smtp = smtplib.SMTP(server)

    smtp.sendmail(from_mail, to, message.as_string())

    smtp.close()

 

if __name__ == “__main__”:

    opts,args = getopt.getopt(sys.argv[1:],”h”,[“orig_master_host=”,”new_master_host=”,”new_slave_hosts=”,”conf=”,”subject=”,”body=”,”app_vip=”,”new_master_ssh_port=”,”ssh_user=”])

#    print opts,args

    for lines in opts:

        key,values = lines

        if key == ‘–orig_master_host’:

            orig_master_host = values

        if key == ‘–new_master_host’:

            new_master_host = values

        if key == ‘–new_slave_hosts’:

            new_slave_hosts = values

        if key == ‘–subject’:

            subject = values

        if key == ‘–body’:

            body = values

#    text = sys.stdin.read()

    mail_list = [‘xueyuanfeng@cangoonline.com’]

    send_mail(mail_list, subject.encode(“utf8”), body, “MHA_Monitor@smtp.cangoonline.com”, server=”127.0.0.1″)

 

3.4.4.    编辑配置文件

[server default]

manager_log=/var/log/masterha/app01/manager.log

manager_workdir=/var/log/masterha/app01

master_binlog_dir=/var/lib/mysql

master_ip_failover_script=/usr/bin/master_ip_failover_01

master_ip_online_change_script=/usr/bin/master_ip_online_change_01

user=mha

password=123456

ping_interval=2

remote_workdir=/tmp

repl_user=repl

repl_password=123456

report_script=/usr/bin/send_report.py

secondary_check_script=/usr/bin/masterha_secondary_check -s MYSQL01 –user=root –port=22 –master_host=MYSQL02 –master_port=3306

shutdown_script=””

ssh_user=root

ssh_port=22

ping_type=CONNECT

 

[server1]

candidate_master=1

hostname=MYSQL01

port=3306

 

[server2]

candidate_master=1

hostname=MYSQL02

port=3306

 

[server3]

hostname=MYSQL03

port=3306

 

3.4.5.    启动MHA

nohup masterha_manager –conf=/etc/masterha/app01/app01.conf < /dev/null > /var/log/masterha/app01/manager.log 2>&1 &

 

4.   MHA Node安装

4.1.  manager节点需要安装node软件。

yum install perl-DBD-MySQL perl-ExtUtils-MakeMaker perl-CPAN perl-Mail-Sender perl-Log-Dispatch perl-Time-HiRes perl-Config-Tiny perl-Parallel-ForkManager perl-Module-Install perl-Config-IniFiles –y
 

4.2.  安装MHA-Node的RPM

git clone https://gitee.com/gibsonxue/MHA.git
cd mha ; mha4mysql-node-0.57-0.el6.noarch.rpm
 

5.   MHA部署检查

5.1.  参数文件检查

检查文件权限是否为600。

5.2.  masterha_check_ssh

masterha_check_ssh --conf=/etc/app01.cnf

 

检查结果必须是:

All SSH connection tests passed successfully.

5.3.  masterha_check_repl

masterha_check_repl --conf=/etc/app01.cnf
检查结果必须是
MySQL Replication Health is OK.

5.4.  masterha_secondary_check

具体脚本见配置文件里面secondary_check_script后面的内容。
检查结果必须是:
Master is reachable from

5.5.  send_report

send_report --orig_master_host=x.x.x.x --new_master_host=y.y.y.y --new_slave_hosts=x.x.x.x --subject=/etc/masterha/app01.cnf --body="这是一封测试邮件"
必须要可以发送成功。

5.6.  master_ip_failover_script

重点检查 vip 和 网卡 设置等是否正确。

5.7.  master_ip_online_change_script

重点检查 vip 和 网卡 设置等是否正确。

5.8.  manager.log

检查日志 /var/log/masterha/app01/manager.log 是否有报错。

作者:gibsonxue