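This article is a case study from a production environment. The goal is multi-line disaster recovery (failover) through a reverse proxy. On top of the existing setup we upgraded openssl, openssh, and nginx, and added several nginx modules to meet our needs.
Normally the reverse proxy sends requests to the hosts in the online upstream, and the sticky module keeps each session pinned to one backend. If every host in online goes down, the proxy returns a 502 error (or 404, depending on your environment); at that point it switches to the hosts in the failover upstream, which gives us line-level disaster recovery. There are several ways to implement the failover itself, for example marking a server with backup. Backend health checks are handled by nginx_upstream_check_module.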
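- Redeploy nginx as a new build
- 1. Upgrade openssh
- 2. Upgrade nginx
- 3. Add nginx modules
- This upgrade is driven mainly by disaster recovery and by the security of the reverse proxy. Upgrading openssh closes vulnerabilities present in older versions.
- Features added to nginx:
- a. Support for multiple SSL certificates
- b. HTTP health checks for backend servers
- c. Session stickiness via nginx-sticky-module // supports sticky+rr and sticky+weight
- d. Multi-line disaster recovery through nginx weights plus stickiness
- e. GeoIP module support, with a view to combining smart CDN and GeoIP later (the nginx front end decides from the source IP which data center to fetch data from)
- Install the telnet server (a fallback login while openssh is being replaced):
- # yum install -y telnet-server telnet
- Enable the service managed by xinetd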
- # chkconfig telnet on
- #
- # /etc/init.d/xinetd restart
- Stopping xinetd: [FAILED]
- Starting xinetd: [ OK ]
- # netstat -tnlp
- Active Internet connections (only servers)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2632/sshd
- tcp 0 0 0.0.0.0:23 0.0.0.0:* LISTEN 21977/xinetd
- // Add a firewall rule that allows port 23 from your own IP
- Create a regular user for logging in
- # useradd sshinstall
- # echo "123456@sshinstall" | passwd --stdin sshinstall
- Changing password for user sshinstall.
- passwd: all authentication tokens updated successfully.
- Give this user sudo rights
- echo "sshinstall ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers
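The fixed password used for the temporary account above is easy to guess, and telnet sends it in clear text. A minimal sketch of generating a throwaway password instead, assuming /dev/urandom, tr, and cut are available; gen_temp_password is a made-up helper name, not part of the original procedure:

```shell
# gen_temp_password [LENGTH]: print a random alphanumeric password.
# Hypothetical helper for illustration only.
gen_temp_password() {
    len="${1:-16}"
    # Read a fixed chunk of random bytes, keep only letters and digits,
    # then cut the result down to the requested length.
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-$len"
}

# Usage (as root), replacing the fixed string piped to passwd --stdin:
#   pw=$(gen_temp_password 20)
#   echo "$pw" | passwd --stdin sshinstall
```

The 1024-byte chunk yields a couple of hundred usable characters on average, so this sketch is only meant for short passwords.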
- Now install the packages, starting with openssl
- # tar -xzf openssl-1.0.1c.tar.gz
- # cd openssl-1.0.1c
- # ./config enable-tlsext --prefix=/usr/local/openssl-1.0.0c
- # make
- # make test
- # make install
- # echo /usr/local/openssl-1.0.0c/lib/ >> /etc/ld.so.conf
- # ln -s /usr/local/openssl-1.0.0c/ /usr/local/openssl
- echo '
- PATH=/usr/local/openssl/bin:$PATH
- export PATH' >> /etc/profile
- # source /etc/profile
- # openssl version -a
- OpenSSL 1.0.1c 10 May 2012
- built on: Fri Jan 4 00:32:23 CST 2013
- platform: linux-x86_64
- options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) idea(int) blowfish(idx)
- compiler: gcc -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -DTERMIO -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
- OPENSSLDIR: "/usr/local/openssl-1.0.0c/ssl"
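The >> appends to /etc/ld.so.conf and /etc/profile above add a duplicate entry each time the steps are re-run. A small sketch of an idempotent append, assuming a POSIX shell and grep; append_line_once is a hypothetical helper name:

```shell
# append_line_once FILE LINE: append LINE to FILE only if FILE does not
# already contain exactly that line.
append_line_once() {
    file="$1"
    line="$2"
    # -x matches whole lines, -F treats LINE as a fixed string, -q is silent.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Usage (as root), mirroring the steps above:
#   append_line_once /etc/ld.so.conf /usr/local/openssl-1.0.0c/lib/
#   append_line_once /etc/profile 'export PATH=/usr/local/openssl/bin:$PATH'
```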
- Remove the old openssh packages
- # rpm -e openssh-server-4.3p2-41.el5 --nodeps
- # rpm -e openssh-4.3p2-41.el5 --nodeps
- # rpm -e openssh-askpass-4.3p2-41.el5 --nodeps
- # rpm -e openssh-clients-4.3p2-41.el5 --nodeps
- # rm -rf /etc/ssh/
- Install the new openssh
- # tar -xzf openssh-6.1p1.tar.gz
- # cd openssh-6.1p1
- # ./configure --prefix=/usr --sysconfdir=/etc/ssh --with-pam --with-ssl-dir=/usr/local/openssl-1.0.0c --with-md5-passwords --mandir=/usr/share/man
- # make
- # make install
- Register sshd as a system service
- # cp ./contrib/redhat/sshd.init /etc/init.d/sshd
- # chmod u+x /etc/init.d/sshd
- # chkconfig --add sshd
- # chkconfig sshd on
- # service sshd start
- Starting sshd: [ OK ]
- # ssh -V
- OpenSSH_6.1p1, OpenSSL 1.0.1c 10 May 2012
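Before telnet is switched off it is worth confirming that the new sshd really is listening, since a failed start would lock you out of the machine. A sketch that scans netstat output for a listening port; port_listening is a hypothetical helper and assumes the column layout shown in the netstat output earlier:

```shell
# port_listening PORT: read `netstat -tnl`-style output on stdin and
# succeed if some TCP socket in LISTEN state is bound to PORT.
port_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            # Local address is column 4; the port is the part after the last ":".
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage:
#   netstat -tnl | port_listening 22 && echo "sshd is listening, safe to disable telnet"
```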
- Turn off the telnet server and delete the sshinstall user
- # chkconfig telnet off
- # /etc/init.d/xinetd restart
- Stopping xinetd: [ OK ]
- Starting xinetd: [ OK ]
- # netstat -tnlp
- Active Internet connections (only servers)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 29602/sshd
- # userdel -r sshinstall
- Install Nginx (dependencies first: libunwind, google-perftools, pcre)
- # tar zxvf libunwind-0.99.tar.gz
- # cd libunwind-0.99/
- # CFLAGS=-fPIC ./configure && make CFLAGS=-fPIC
- # make CFLAGS=-fPIC install
- # tar xzf google-perftools-1.6.tar.gz
- # cd google-perftools-1.6
- # ./configure
- # make && make install
- # tar -xzf pcre-8.12.tar.gz
- # cd pcre-8.12
- # ./configure && make && make install
- Install GeoIP
- # wget http://geolite.maxmind.com/download/geoip/api/c/GeoIP.tar.gz
- # tar -xzf GeoIP.tar.gz
- # cd GeoIP-1.4.8/
- # ./configure && make && make install
- # wget http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz
- # gunzip GeoIP.dat.gz
- # echo '/usr/local/lib' > /etc/ld.so.conf.d/geoip.conf
- # ldconfig
- Unpack the modules; they are compiled in when nginx is built
- # unzip nginx_upstream_jvm_route.zip //tomcat session
- Archive: nginx_upstream_jvm_route.zip
- creating: nginx-upstream-jvm-route/
- creating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/
- inflating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/CHANGES
- inflating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/config
- inflating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/jvm_route.patch // patch file, must be applied by hand
- inflating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/ngx_http_upstream_jvm_route_module.c
- inflating: nginx-upstream-jvm-route/nginx_upstream_jvm_route/README
- # unzip master.zip // nginx_upstream_check_module
- # tar -xzf nginx-sticky-module-1.1.tar.gz // session stickiness
- # tar -xzf nginx-1.2.6.tar.gz
- # cd nginx-1.2.6
- Apply the patches
- # patch -p0 < /root/upgrade/nginx-upstream-jvm-route/nginx_upstream_jvm_route/jvm_route.patch
- patching file src/http/ngx_http_upstream.c
- Hunk #1 succeeded at 4117 (offset 380 lines).
- Hunk #3 succeeded at 4249 (offset 380 lines).
- Hunk #5 succeeded at 4348 (offset 380 lines).
- patching file src/http/ngx_http_upstream.h
- Hunk #1 succeeded at 90 (offset 5 lines).
- Hunk #3 succeeded at 118 (offset 5 lines).
- # patch -p1 < /root/upgrade/nginx_upstream_check_module-master/check_1.2.6+.patch
- patching file src/http/modules/ngx_http_upstream_ip_hash_module.c
- patching file src/http/modules/ngx_http_upstream_least_conn_module.c
- patching file src/http/ngx_http_upstream_round_robin.c
- patching file src/http/ngx_http_upstream_round_robin.h
- # ./configure --prefix=/usr/local/nginx --user=nobody --group=nobody --with-http_stub_status_module --with-http_gzip_static_module --with-http_realip_module --with-http_sub_module --with-http_geoip_module --with-http_ssl_module --with-http_ssl_module --with-openssl=/root/upgrade/openssl-1.0.1c --with-pcre=/root/upgrade/pcre-8.12 --add-module=/root/upgrade/nginx-upstream-jvm-route/nginx_upstream_jvm_route/ --add-module=/root/upgrade/nginx_upstream_check_module-master/ --add-module=/root/upgrade/nginx-sticky-module-1.1/ --with-google_perftools_module
- # make && make install
- # /usr/local/nginx/sbin/nginx -v
- nginx version: nginx/1.2.6
- If nginx is already running, upgrade the running instance online.
- # ps aux | grep master
- root 13589 0.0 0.0 26772 3884 ? S 2012 0:01 nginx: master process /usr/local/nginx/sbin/nginx
- root 20834 0.0 0.0 61140 768 pts/4 S+ 17:14 0:00 grep master
- Replace the running process
- # kill -USR2 13589
- # ps aux | grep master
- root 13589 0.0 0.0 26772 3884 ? S 2012 0:01 nginx: master process /usr/local/nginx/sbin/nginx
- root 21395 0.5 0.0 40272 3504 ? S 17:16 0:00 nginx: master process /usr/local/nginx/sbin/nginx
- root 21416 0.0 0.0 61140 768 pts/4 S+ 17:16 0:00 grep master
- # kill -WINCH 13589 // send WINCH to the old nginx master so that it gracefully shuts down its worker processes
- # kill -QUIT 13589 // make the old nginx master exit
- # ps aux |grep master
- root 21395 0.0 0.0 40272 3504 ? S 17:16 0:00 nginx: master process /usr/local/nginx/sbin/nginx
- root 21749 0.0 0.0 61140 772 pts/4 S+ 17:16 0:00 grep master
- Remove the old binary
- # rm -rf /usr/local/nginx/sbin/nginx.old
- Check the current version
- # /usr/local/nginx/sbin/nginx -v
- nginx version: nginx/1.2.6
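The USR2, WINCH, QUIT sequence used for the online upgrade can be wrapped in a small function so that the PID does not have to be copied by hand from ps. This is only a sketch under stated assumptions: the pid file is taken to be /usr/local/nginx/logs/nginx.pid (the default for this --prefix unless a pid directive overrides it), it relies on nginx renaming that file to nginx.pid.oldbin after USR2, and both the function name and the KILL_CMD hook are made up for illustration:

```shell
# nginx_online_upgrade [PID_FILE]: replace the running nginx binary in place.
nginx_online_upgrade() {
    pid_file="${1:-/usr/local/nginx/logs/nginx.pid}"
    old_pid_file="$pid_file.oldbin"
    kill_cmd="${KILL_CMD:-kill}"    # hypothetical hook, overridable for dry runs

    [ -s "$pid_file" ] || { echo "no master pid file: $pid_file" >&2; return 1; }
    old_pid=$(cat "$pid_file")

    # USR2: the old master starts a new master (and workers) from the new binary.
    "$kill_cmd" -USR2 "$old_pid" || return 1

    # Wait up to about 10 seconds for the old pid file to be renamed.
    i=0
    while [ ! -s "$old_pid_file" ] && [ "$i" -lt 10 ]; do
        sleep 1
        i=$((i + 1))
    done
    [ -s "$old_pid_file" ] || { echo "new master did not start" >&2; return 1; }

    "$kill_cmd" -WINCH "$old_pid"   # old workers finish their requests and exit
    "$kill_cmd" -QUIT "$old_pid"    # old master exits
}
```

Sending QUIT straight after WINCH mirrors the manual steps; pausing between the two to watch the new workers would leave room to roll back.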
- All installation steps are done!
- # /usr/local/nginx/sbin/nginx -V
- nginx version: nginx/1.2.6
- built by gcc 4.1.2 20080704 (Red Hat 4.1.2-52)
- TLS SNI support enabled // needed to serve SSL certificates for multiple domains
- configure arguments: --prefix=/usr/local/nginx --user=nobody --group=nobody --with-http_stub_status_module --with-http_gzip_static_module --with-http_realip_module --with-http_sub_module --with-http_geoip_module --with-http_ssl_module --with-http_ssl_module --with-openssl=/root/upgrade/openssl-1.0.1c --with-pcre=/root/upgrade/pcre-8.12 --add-module=/root/upgrade/nginx-upstream-jvm-route/nginx_upstream_jvm_route/ --add-module=/root/upgrade/nginx_upstream_check_module-master/ --add-module=/root/upgrade/nginx-sticky-module-1.1/ --with-google_perftools_module
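- Test environment:
- One reverse proxy in front and two data centers in different locations behind it (A and B). Data center A has more bandwidth; data center B is the standby.
- Requirement: under normal conditions all traffic goes to data center A; when A is unavailable, all traffic goes to B. In theory this can be done without session stickiness, but with more data centers planned, stickiness will be needed. Two modules are available: nginx_upstream_jvm_route (requires configuration on tomcat/resin and so fits fewer environments) and nginx-sticky-module-1.1. Choose according to the situation.
- There are currently two cases:
- 1. One primary line and one backup line
- Requirement: use the primary line whenever it is available; use the backup line only when the primary line fails.
- Actual setup: one primary line in data center A and one backup line in data center B (kept as backup because its line quality is poor)
- Configuration: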
- upstream.conf
- //
- upstream online {
- server 172.28.10.161:8080 max_fails=0 fail_timeout=3s ;
- server 172.28.10.163:8080 backup;
- check interval=3000 rise=2 fall=1 timeout=1000 type=http;
- check_http_send "GET / HTTP/1.0\r\n\r\n";
- check_http_expect_alive http_2xx http_3xx;
- }
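- 2. Multiple primary lines and one backup line
- Requirement: load-balance across the primary lines; use the backup line only when all primary lines have failed.
- Actual setup: lines A and C are load-balanced; line B is the last-resort backup.
- Configuration: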
- server.conf
- //
- server {
- ......
- location / {
- proxy_pass http://online;
- }
- error_page 404 502 = @failover; # 502 is included because when every server in the online upstream is down, the live system returns 502 rather than 404
- location @failover {
- proxy_pass http://backup;
- }
- location /status {
- check_status;
- access_log off;
- allow all; # in production, allow only specific IPs
- }
- }
- ......
- }
- upstream.conf
- //
- proxy_next_upstream http_404 http_502; # let 404 errors count toward max_fails
- upstream online {
- sticky;
- server 172.28.70.161:8080 max_fails=0 fail_timeout=3s ;
- server 172.28.70.163:8080 max_fails=0 fail_timeout=3s ;
- check interval=3000 rise=2 fall=1 timeout=1000 type=http;
- check_http_send "GET / HTTP/1.0\r\n\r\n";
- check_http_expect_alive http_2xx http_3xx;
- }
- upstream backup {
- server 172.28.22.29:7777 max_fails=0 fail_timeout=3s;
- }
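After editing server.conf and upstream.conf, a syntax error would take the proxy down on restart, so it helps to test the configuration before reloading. A minimal sketch, assuming the binary lives at /usr/local/nginx/sbin/nginx as built above; reload_if_valid is a hypothetical helper name:

```shell
# reload_if_valid [NGINX_BIN]: test the configuration, reload only on success.
reload_if_valid() {
    nginx_bin="${1:-/usr/local/nginx/sbin/nginx}"
    if "$nginx_bin" -t; then
        # Configuration parsed cleanly: ask the master to reload it.
        "$nginx_bin" -s reload
    else
        echo "configuration test failed, not reloading" >&2
        return 1
    fi
}

# Usage (as root):
#   reload_if_valid
```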
- If every host in the upstream is down, the error log shows:
- 2013/01/12 22:57:37 [error] 7627#0: *23641 no live upstreams while connecting to upstream, client: 100.120.111.94, server: *.mydomain.com, request: "GET http://www.mydomain.com/.....(omitted) HTTP/1.1", upstream: "http://online/.....(omitted), host: "www.mydomain.com", referrer: "http://www.mydomain.com/.....(omitted)"
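To see how often each upstream ran out of live servers, the error log can be summarized per upstream name. A sketch using awk, assuming the log line format shown above; no_live_upstreams_by_name is a hypothetical helper, and the log path in the usage line assumes the default location under this --prefix:

```shell
# no_live_upstreams_by_name [FILE...]: count "no live upstreams" errors
# per upstream name, reading the given files or stdin.
no_live_upstreams_by_name() {
    awk '
        /no live upstreams while connecting to upstream/ {
            # The upstream field looks like: upstream: "http://NAME/...
            if (match($0, "upstream: \"http://[^/\"]+")) {
                # The fixed prefix (upstream: "http://) is 18 characters long.
                name = substr($0, RSTART + 18, RLENGTH - 18)
                count[name]++
            }
        }
        END { for (n in count) print n, count[n] }
    ' "$@" | sort
}

# Usage:
#   no_live_upstreams_by_name /usr/local/nginx/logs/error.log
```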
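One last point is logging on the backend: behind a reverse proxy, backend logs record the proxy's address instead of the client's. This was taken into account when nginx was upgraded, by building in http_realip_module.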
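As a rough illustration of how http_realip_module is typically wired up, the fragment below passes the client address from the proxy in a header and restores it on the receiving side. It assumes the backend is itself an nginx built with the realip module; the proxy address 172.28.10.1 and the header choice are placeholders, not values from this deployment. For Tomcat or other backends the same header would instead be written to the access log by the application server.

```nginx
# On the reverse proxy: pass the client address to the backend.
location / {
    proxy_set_header Host            $host;
    proxy_set_header X-Real-IP       $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_pass http://online;
}

# On a backend running nginx with http_realip_module:
# trust the header only when the request comes from the proxy.
set_real_ip_from 172.28.10.1;    # hypothetical address of the reverse proxy
real_ip_header   X-Real-IP;
```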