linux27-nagios

Posted: 04-16 13:54 | Views: 1518

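Summary: an introduction to Nagios (linux27-nagios).

Nagios monitors the services, load, and other metrics of a large number of machines, and can send alerts.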

[root@li ~]# ls /share/soft/soft/monitor2/

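nagios-3.2.3.tar.gz --main program package

nagios-plugins-1.4.15.tar.gz --plugin package

nrpe-2.12.tar.gz --client (agent) package

--Note: the version numbers of the plugin and NRPE packages do not have to match the main package

1. Set up an RPM-based LAMP stack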

# yum install httpd* gd gd-devel

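2. Create the user and group

# useradd nagios

# groupadd nagiosgroup

# usermod -a -G nagiosgroup nagios --with -a the user keeps its existing supplementary groups

# usermod -a -G nagiosgroup apache

3. Install the Nagios main package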

# tar xf /share/soft/soft/monitor2/nagios-3.2.3.tar.gz -C /usr/src/

# cd /usr/src/nagios-3.2.3/

# ./configure --with-nagios-user=nagios --with-nagios-group=nagiosgroup

# make all

# make install

# make install-init

# make install-commandmode

# make install-config

# make install-webconf

make install

- This installs the main program, CGIs, and HTML files

make install-init

- This installs the init script in /etc/rc.d/init.d

make install-commandmode

- This installs and configures permissions on the directory for holding the external command file

make install-config

- This installs *SAMPLE* config files in /usr/local/nagios/etc. You'll have to modify these sample files before you can use Nagios. Read the HTML documentation for more info on doing this. Pay particular attention to the docs on object configuration files, as they determine what/how things get monitored!

make install-webconf

- This installs the Apache config file for the Nagios web interface

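# ls /usr/local/nagios/

bin etc libexec sbin share var

--libexec is empty at this point; the check commands and scripts appear only after the plugin package is installed

4. Install the Nagios plugin package --contains the programs, commands, and scripts that collect data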

# tar xf /share/soft/soft/monitor2/nagios-plugins-1.4.15.tar.gz -C /usr/src/

# cd /usr/src/nagios-plugins-1.4.15/

# ./configure --with-nagios-user=nagios --with-nagios-group=nagiosgroup

# make && make install

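5. Create the user that authenticates to the web interface

/etc/httpd/conf.d/nagios.conf --this file already configures Apache authentication for Nagios; we only need to create the user

# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin --keep both the file path and the user name; the sample configuration expects exactly these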

New password:

Re-type new password:

Adding password for user nagiosadmin

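6. Overview of the Nagios configuration files

/usr/local/nagios/etc/nagios.cfg --main configuration file

/usr/local/nagios/etc/objects/ --directory holding the object configuration files

localhost.cfg --a sample file; by default it defines 8 services monitored on the local machine

templates.cfg --template definitions

commands.cfg --command definitions

contacts.cfg --contact and notification definitions

timeperiods.cfg --time period definitions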

==================================================

Example: how the Nagios configuration files relate to each other

# vim /usr/local/nagios/etc/nagios.cfg

cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# vim /usr/local/nagios/etc/objects/localhost.cfg

define host{

use linux-server --template

host_name localhost --host name

alias localhost --host alias

address 127.0.0.1 --IP address of the monitored machine

}

define hostgroup{

hostgroup_name linux-servers

alias Linux Servers

members localhost --the Linux Servers group currently has only one member, localhost

}

--Below are the 8 services defined by default; the disk-usage service serves as the example

define service{

use local-service --template, defined in templates.cfg

host_name localhost --host name; refers to the host_name set in define host in this same file

service_description Root Partition --description, shown as the title in the web interface

check_command check_local_disk!20%!10%!/ --disk check: WARNING when free space falls below 20%, CRITICAL below 10%

}

# vim /usr/local/nagios/etc/objects/templates.cfg

define host{

name linux-server

use generic-host --the linux-server template itself uses a template called generic-host, also in templates.cfg

check_period 24x7 --time period defined in timeperiods.cfg

check_interval 5

retry_interval 1

max_check_attempts 10

check_command check-host-alive --command defined in commands.cfg

notification_period workhours --notification time period, defined in timeperiods.cfg

notification_interval 120 --notification interval

notification_options d,u,r --notification options

contact_groups admins --contact group, defined in contacts.cfg

}

# vim /usr/local/nagios/etc/objects/commands.cfg

define command{

command_name check-host-alive

command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5

}

--All check commands live under libexec; run any of them with --help to see its options

# /usr/local/nagios/libexec/check_ping --help
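The chain described in this example starts at the cfg_file= lines of nagios.cfg. As a minimal sketch, the helper below lists which object files the main configuration pulls in. The function name list_cfg_files is made up for illustration, and the default path assumes the source install used in this article.

```shell
#!/usr/bin/env bash
# list_cfg_files (hypothetical helper): print every object file that the
# main configuration references through a cfg_file= line.
# The default path assumes the /usr/local/nagios install from this article.
list_cfg_files() {
    local main_cfg="${1:-/usr/local/nagios/etc/nagios.cfg}"
    # Commented-out lines start with "#" and therefore do not match.
    grep -E '^[[:space:]]*cfg_file=' "$main_cfg" \
        | sed -E 's/^[[:space:]]*cfg_file=//'
}
```

Running list_cfg_files with no argument reads the default nagios.cfg; pass another path to inspect a copy.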

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

--Leave the configuration files at their defaults for now and check that they are valid

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# /etc/init.d/nagios restart

# /etc/init.d/httpd restart
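Since nagios -v exits with a non-zero status when the configuration is invalid, the check and the reload can be chained so that a broken file never reaches the running daemon. The following is a sketch only: reload_if_valid and the three variables are illustrative names, and the paths are the ones used in this article.

```shell
#!/usr/bin/env bash
# Paths follow the source install used in this article; override them
# through the environment if your layout differs.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}"
NAGIOS_CFG="${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}"
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios}"

# reload_if_valid (hypothetical helper): run the pre-flight check and
# reload the daemon only when the check exits with status 0.
reload_if_valid() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        "$NAGIOS_INIT" reload
    else
        echo "configuration check failed; not reloading" >&2
        return 1
    fi
}
```

When the check fails, rerun the nagios -v command by hand to read the error messages.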

URL: http://172.19.1.17/nagios

User name: nagiosadmin

Password: the one set earlier with htpasswd -c

===================================================

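The web interface now shows only localhost, with its 8 default services.

A few small operations:

1. If the HTTP service shows yellow (WARNING), add an index page to the web server's document root; an empty document root triggers the warning.

The status turns OK only at the next check. To force a check, click HTTP and then, on the right, click "Re-schedule the next check of this service".

2. Notifications are disabled by default for HTTP and SSH: both service definitions in localhost.cfg contain notifications_enabled 0.

They can also be enabled by hand: open the service and click "Enable notifications for this service" on the right.

3. Stop the ssh service and refresh the web interface: the status is still not CRITICAL.

Click SSH to see the time of the next scheduled check. To avoid waiting, click "Re-schedule the next check of this service" on the right to force a check, then refresh; the status becomes CRITICAL.

4. Change the check interval for SSH

# vim /usr/local/nagios/etc/objects/localhost.cfg

define service{

use local-service --this is the template in use, so the interval is changed in the template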

host_name localhost

service_description SSH

check_command check_ssh

notifications_enabled 0

}

# vim /usr/local/nagios/etc/objects/templates.cfg

define service{

name local-service

.............

normal_check_interval 1 --changed from 5 minutes to 1 minute

.............

}

# /etc/init.d/nagios reload

--Verify in the web interface: the check interval is now 1 minute

========================================================

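Example 1: on top of the 8 default services, add monitoring for another local service such as FTP

Steps:

1. Check whether libexec/ contains a command that checks FTP; if not, download one or write your own

2. Define the service in localhost.cfg

3. Define the command in commands.cfg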
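For the "write your own" case in step 1, a plugin is just a program that prints one status line and reports its result through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN. The sketch below shows that contract with a made-up helper, check_threshold, which treats larger values as worse; the name and the integer-only input handling are assumptions for illustration.

```shell
#!/usr/bin/env bash
# check_threshold (hypothetical helper): compare a measured value with a
# warning and a critical threshold, where a higher value is worse.
# Exit codes follow the plugin convention:
#   0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
check_threshold() {
    local label="$1" value="$2" warn="$3" crit="$4" n
    # Anything that is not a non-negative integer is reported as UNKNOWN.
    for n in "$value" "$warn" "$crit"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "UNKNOWN - $label: non-numeric input"
                return 3
                ;;
        esac
    done
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - $label is $value|$label=$value;$warn;$crit"
        return 2
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - $label is $value|$label=$value;$warn;$crit"
        return 1
    fi
    echo "OK - $label is $value|$label=$value;$warn;$crit"
    return 0
}
```

A script placed in libexec/ could end with a call such as check_threshold users "$(who | wc -l)" 5 10 followed by exit $?, so that the function's return value becomes the plugin's exit code.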

# vim /usr/local/nagios/etc/objects/localhost.cfg

define service{

use local-service

host_name localhost

service_description FTP

check_command check_ftp!3!6

}

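# vim /usr/local/nagios/etc/objects/commands.cfg --the sample file already contains a check_ftp entry, so change that entry instead of adding a second one with the same name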

define command{

command_name check_ftp

command_line $USER1$/check_ftp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$

}

# /etc/init.d/nagios restart

Exercises:

1. How do you monitor the local FTP service when it listens on port 2121?

# vim /etc/vsftpd/vsftpd.conf

listen_port=2121 --add this line

# netstat -ntlup |grep ftp

tcp 0 0 0.0.0.0:2121 0.0.0.0:* LISTEN 29883/vsftpd

# vim /usr/local/nagios/etc/objects/localhost.cfg

--Add the following block

define service{

use local-service

host_name localhost

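service_description FTP --title set to FTP

check_command check_ftp_2121!3!6!2121

--This command does not exist yet: commands.cfg ships a check_ftp by default but no check_ftp_2121, so it has to be added by hand

--! is the argument separator: 3 is the first argument, 6 the second, and 2121 the third; they map to the -w, -c, and -p options defined below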

}

# vim /usr/local/nagios/etc/objects/commands.cfg

define command{

command_name check_ftp_2121

command_line $USER1$/check_ftp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p $ARG3$

}
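To make the mapping between the "!"-separated fields and $ARG1$, $ARG2$, $ARG3$ concrete, here is a simplified model of the substitution in plain bash. The helper name expand_check_command is invented for this sketch, and it deliberately ignores other macros such as $USER1$ and $HOSTADDRESS$ as well as any escaping rules.

```shell
#!/usr/bin/env bash
# expand_check_command (hypothetical helper): mimic how the "!"-separated
# fields of check_command fill $ARG1$, $ARG2$, ... in a command_line.
# Simplified model: other macros and escaping rules are left untouched.
expand_check_command() {
    local check_command="$1" command_line="$2"
    local -a fields
    local i
    # Split on "!": fields[0] is the command name, the rest are arguments.
    IFS='!' read -r -a fields <<< "$check_command"
    for (( i = 1; i < ${#fields[@]}; i++ )); do
        command_line="${command_line//\$ARG${i}\$/${fields[i]}}"
    done
    printf '%s\n' "$command_line"
}

expand_check_command 'check_ftp_2121!3!6!2121' \
    '$USER1$/check_ftp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p $ARG3$'
# prints: $USER1$/check_ftp -H $HOSTADDRESS$ -w 3 -c 6 -p 2121
```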

--Run the check command by hand first; it reports OK

# /usr/local/nagios/libexec/check_ftp -w 3 -c 6 -p 2121

FTP OK - 0.004 second response time on port 2121 [220-#############################

220-#]|time=0.003835s;3.000000;6.000000;0.000000;10.000000

# /etc/init.d/nagios reload

--After the reload, the web interface shows the local FTP service being monitored
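The part of the plugin output after the "|" is performance data in the form label=value[unit];warn;crit;min;max. The sketch below splits one such item into named fields; parse_perfdata is a made-up helper, and the sample line is an abbreviated version of the check_ftp output shown above.

```shell
#!/usr/bin/env bash
# parse_perfdata (hypothetical helper): split one performance-data item of
# the form label=value[UOM];warn;crit;min;max into named fields.
parse_perfdata() {
    local item="$1" label rest value warn crit min max
    label="${item%%=*}"
    rest="${item#*=}"
    IFS=';' read -r value warn crit min max <<< "$rest"
    printf 'label=%s value=%s warn=%s crit=%s min=%s max=%s\n' \
        "$label" "$value" "$warn" "$crit" "$min" "$max"
}

# Everything after the first "|" in the plugin output is performance data.
output='FTP OK - 0.004 second response time on port 2121|time=0.003835s;3.000000;6.000000;0.000000;10.000000'
parse_perfdata "${output#*|}"
# prints: label=time value=0.003835s warn=3.000000 crit=6.000000 min=0.000000 max=10.000000
```

The warn and crit fields echo the -w 3 and -c 6 thresholds passed on the command line.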

2. Monitor MySQL on the local machine

# vim /usr/local/nagios/etc/objects/localhost.cfg

define service{

use local-service

host_name localhost

service_description MYSQL

check_command check_mysql!root!123

}

# vim /usr/local/nagios/etc/objects/commands.cfg

define command{

command_name check_mysql

command_line $USER1$/check_mysql -H $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ --the first argument is the user root from above, the second is the password 123

}

--Check MySQL by hand: OK

# /usr/local/nagios/libexec/check_mysql -u root -p123

Uptime: 189 Threads: 1 Questions: 5 Slow queries: 0 Opens: 12 Flush tables: 1 Open tables: 6 Queries per second avg: 0.026

# /etc/init.d/nagios reload

--Refresh the Nagios web interface to confirm: OK

===========================================================

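Monitored services fall into two kinds: public and private.

Public: ssh, http, ftp, mysql, and so on. These can be checked over the network, so they are configured directly whether the host is local or remote.

Private: load, users, disk usage, and so on. On the local machine they are configured directly; on a remote machine, NRPE must be installed on both the Nagios server and the monitored host.

Example: monitoring ordinary (public) services on a remote server, such as ssh, http, ftp, and mysql

The monitored host here has the IP 172.19.1.125

1. In the main configuration file on the Nagios server, add the host configuration file for 125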

# vim /usr/local/nagios/etc/nagios.cfg

cfg_file=/usr/local/nagios/etc/objects/125.cfg

2. Create 125.cfg

# cd /usr/local/nagios/etc/objects/

# cp localhost.cfg 125.cfg

# vim 125.cfg

define host{

use linux-server

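host_name 172.19.1.125 --host name; ideally map it to the IP in /etc/hosts, but I have not done that here, so the IP is used directly

alias 172.19.1.125 --name shown in the web interface

address 172.19.1.125 --actual IP of the monitored host

}

--Public services follow; only five are listed here, add more as needed

define service{

use local-service

host_name 172.19.1.125

service_description PING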

check_command check_ping!100.0,20%!500.0,60%

}

define service{

use local-service

host_name 172.19.1.125

service_description SSH

check_command check_ssh

}

define service{

use local-service

host_name 172.19.1.125

service_description HTTP

check_command check_http

}

define service{

use local-service

host_name 172.19.1.125

service_description FTP

check_command check_ftp!3!6

}

define service{

use local-service

host_name 172.19.1.125

service_description MYSQL

check_command check_mysql!root!123

}

--Verify the configuration, then reload the service

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

# /etc/init.d/nagios reload
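The five service blocks above differ only in their description and check command, so when more hosts are added they can be generated instead of copied. This is a sketch under that assumption; emit_service and emit_public_services are illustrative names, and the output is meant to be reviewed and appended to a host file such as 125.cfg by hand.

```shell
#!/usr/bin/env bash
# emit_service (hypothetical helper): print one "define service" block.
emit_service() {
    local host="$1" description="$2" check_command="$3"
    printf 'define service{\n'
    printf '    use                  local-service\n'
    printf '    host_name            %s\n' "$host"
    printf '    service_description  %s\n' "$description"
    printf '    check_command        %s\n' "$check_command"
    printf '}\n\n'
}

# emit_public_services (hypothetical helper): the five public services
# from this example, generated for a single host.
emit_public_services() {
    local host="$1"
    emit_service "$host" PING  'check_ping!100.0,20%!500.0,60%'
    emit_service "$host" SSH   'check_ssh'
    emit_service "$host" HTTP  'check_http'
    emit_service "$host" FTP   'check_ftp!3!6'
    emit_service "$host" MYSQL 'check_mysql!root!123'
}

emit_public_services 172.19.1.125
```

The check commands and their arguments are copied from the definitions above; the host block itself still has to be written separately.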

=====================================================

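Example: monitoring private services on a remote host

1. Install the NRPE plugin on the Nagios server

# tar xf /share/soft/soft/monitor2/nrpe-2.12.tar.gz -C /usr/src/

# cd /usr/src/nrpe-2.12/

# ./configure && make && make install

--After installation the following plugin is available

/usr/local/nagios/libexec/check_nrpe

2. Add the check_nrpe command to commands.cfg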

# vim /usr/local/nagios/etc/objects/commands.cfg

define command{

command_name check_nrpe

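command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ --the -c option takes a command name; in other words, check_nrpe can invoke other check commands on the remote side

}

3. On the Nagios server, add the remote private services to the configuration file for 125

# vim /usr/local/nagios/etc/objects/125.cfg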

define service{

use local-service

host_name 172.19.1.125

service_description Root Partition

check_command check_nrpe!check_remote_root

--check_remote_root is the command that check_nrpe's -c option calls; it does not exist in commands.cfg on the Nagios server and is defined on the monitored host in a later step

}

define service{

use local-service

host_name 172.19.1.125

service_description Current Users

check_command check_nrpe!check_remote_users

}

define service{

use local-service

host_name 172.19.1.125

service_description Total Processes

check_command check_nrpe!check_remote_total_procs

}

define service{

use local-service

host_name 172.19.1.125

service_description Current Load

check_command check_nrpe!check_remote_load

}

define service{

use local-service

host_name 172.19.1.125

service_description Swap Usage

check_command check_nrpe!check_remote_swap

}

# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

--Check that the configuration is valid. If it is, the server side is done; do not reload Nagios yet, wait until the monitored host is configured

==============================================================

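Now install on the monitored host, 125

1. Create the user and group

# useradd nagios

# groupadd nagiosgroup

# usermod -a -G nagiosgroup nagios

# usermod -a -G nagiosgroup apache

2. Install the plugin package, which contains the data-collection commands and scripts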

# tar xf nagios-plugins-1.4.15.tar.gz -C /usr/src/

# cd /usr/src/nagios-plugins-1.4.15/

# ./configure --with-nagios-user=nagios --with-nagios-group=nagiosgroup

# make && make install

3. Install NRPE

# tar xf nrpe-2.12.tar.gz -C /usr/src/

# cd /usr/src/nrpe-2.12/

# ./configure && make && make install

# make install-plugin

# make install-daemon

# make install-daemon-config

# make install-xinetd

4. Edit the xinetd (super daemon) configuration file for NRPE

# vim /etc/xinetd.d/nrpe

service nrpe

{

flags = REUSE

socket_type = stream

port = 5666

wait = no

user = nagios

group = nagios

server = /usr/local/nagios/bin/nrpe

server_args = -c /usr/local/nagios/etc/nrpe.cfg --inetd

log_on_failure = USERID

disable = no

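only_from = 127.0.0.1 172.19.1.17 --add the IP of the Nagios server so that it is allowed to connect

}

# vim /etc/services --add the following line

nrpe 5666/tcp # NRPE

5. Define the check commands in the NRPE configuration file so that the Nagios server can call them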

# vim /usr/local/nagios/etc/nrpe.cfg

command[check_remote_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

command[check_remote_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_remote_root]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2 --/dev/sda2 is the root partition on the monitored host; writing / works as well

command[check_remote_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

command[check_remote_swap]=/usr/local/nagios/libexec/check_swap -w 80% -c 60% --not present by default; add it because the Nagios server is configured to call it

command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z --present by default, but unused here because the Nagios server does not reference it

# /etc/init.d/xinetd restart --restart the super daemon

# netstat -ntlup |grep 5666 --the port is now being listened on

tcp 0 0 0.0.0.0:5666 0.0.0.0:* LISTEN 22120/xinetd

6. Test locally and from the Nagios server

--Test on the monitored host: success

# /usr/local/nagios/libexec/check_users -w 5 -c 10

USERS OK - 3 users currently logged in |users=3;5;10;0

--Test from the Nagios server: success

# /usr/local/nagios/libexec/check_nrpe -H 172.19.1.125 -c check_remote_users

USERS OK - 3 users currently logged in |users=3;5;10;0

7. Back on the Nagios server, restart the service

# /etc/init.d/nagios restart
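A common mistake in this setup is a name referenced as check_nrpe!NAME on the server that has no matching command[NAME] line in nrpe.cfg on the monitored host. The sketch below compares the two lists. All three function names are invented for illustration, and since the files live on different machines, it assumes a copy of the remote nrpe.cfg is available locally.

```shell
#!/usr/bin/env bash
# nrpe_commands_defined (hypothetical helper): names taken from
# command[NAME]=... lines in an nrpe.cfg file.
nrpe_commands_defined() {
    sed -n -E 's/^[[:space:]]*command\[([^]]+)\]=.*/\1/p' "$1" | sort -u
}

# nrpe_commands_used (hypothetical helper): names referenced as
# check_nrpe!NAME in a host configuration file on the Nagios server.
nrpe_commands_used() {
    sed -n -E 's/^[[:space:]]*check_command[[:space:]]+check_nrpe!([^![:space:]]+).*/\1/p' "$1" | sort -u
}

# missing_nrpe_commands (hypothetical helper): names used on the server
# but not defined in the given nrpe.cfg.
missing_nrpe_commands() {
    local server_cfg="$1" nrpe_cfg="$2"
    comm -23 <(nrpe_commands_used "$server_cfg") \
             <(nrpe_commands_defined "$nrpe_cfg")
}
```

For example, missing_nrpe_commands /usr/local/nagios/etc/objects/125.cfg ./nrpe.cfg prints nothing when every referenced command is defined.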

=================================================================

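Nagios email alerts

Alert methods:

1. Sound alerts

2. Email alerts

3. SMS alerts

Here I use the bundled sendmail for a simple setup

1. Set up the mail system --covered earlier, so omitted here; I only test whether the local sendmail can deliver mail

# yum install sendmail* m4 -y

# /etc/init.d/sendmail restart

# echo 'test' | mail root@li.cluster.com --send a mail to the local root user to confirm delivery

2. Edit the contact-related configuration files on the Nagios server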

# vim /usr/local/nagios/etc/objects/contacts.cfg

define contact{

contact_name nagiosadmin

use generic-contact

alias Nagios Admin

email root@li.cluster.com --change to the address that should receive the alerts

}

# vim /usr/local/nagios/etc/objects/templates.cfg --keep the defaults

define contact{

name generic-contact

host_notification_period 24x7

service_notification_options w,u,c,r,f,s

host_notification_options d,u,r,f,s

service_notification_commands notify-service-by-email --service notification command

host_notification_commands notify-host-by-email --host notification command

}

# vim /usr/local/nagios/etc/objects/commands.cfg --both commands are defined by default; keep them as they are

define command{

command_name notify-host-by-email

command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$

}

define command{

command_name notify-service-by-email

command_line /usr/bin/printf "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$

}

# /etc/init.d/nagios restart

--Stop a few services to test, and watch the mailbox with tail /var/mail/root
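The notification commands rely on printf "%b" to turn the \n sequences in the message template into real line breaks before the text is piped to mail. The sketch below reproduces only that formatting step, with plain shell variables standing in for the Nagios macros; render_service_alert is a made-up name and nothing is mailed.

```shell
#!/usr/bin/env bash
# render_service_alert (hypothetical helper): build a message body in the
# style of notify-service-by-email, with shell variables standing in for
# the Nagios macros. Nothing is sent; the text is written to stdout.
render_service_alert() {
    local type="$1" service="$2" host="$3" state="$4"
    # "%b" makes printf interpret the \n escapes inside the argument.
    printf "%b" "***** Nagios *****\n\nNotification Type: $type\n\nService: $service\nHost: $host\nState: $state\n"
}

render_service_alert PROBLEM SSH localhost CRITICAL
```

Piping this output to mail -s with a subject and an address is the step the real command adds.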

===============================================================

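Using a free 139 mailbox for SMS notification

Register a mailbox at https://mail.10086.cn/Register/default.aspx

Mine is:

158xxxxxxxx@139.com

Log in to the mailbox:

Go to Settings --> Mail arrival notification --> Enable

Then send a test mail from this machine to that mailbox; the phone is notified right away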

# mail -s 'test' 158xxxxxxxx@139.com

# vim /usr/local/nagios/etc/objects/contacts.cfg

define contact{

contact_name nagiosadmin

use generic-contact

alias Nagios Admin

email 158xxxxxxxx@139.com --change to the address that should receive the alerts

}

# /etc/init.d/nagios restart

--Then stop and start a few services and wait for the notification mail; once it is sent, it reaches the phone

=============================================================

＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝

awstats

AWStats analyzes Apache logs, and can also analyze squid and nginx logs; the log format should be combined

Package:

ls /share/soft/soft/awstats/awstats-6.95.tar.gz

Unpack:

[root@li cronolog-1.6.2]# tar xvf /share/soft/soft/awstats/awstats-6.95.tar.gz -C /usr/local/

[root@li cronolog-1.6.2]# cd /usr/local/

[root@li local]# mv awstats-6.95/ awstats

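[root@li local]# cd /usr/local/awstats/

[root@li awstats]# ./tools/awstats_configure.pl --use this tool to configure the site to analyze and generate the report pages

The tool looks for the Apache configuration file automatically; if it cannot find it, you may have to enter the location of httpd.conf by hand

It also edits the Apache configuration to switch the LogFormat to combined, which lets AWStats report client details such as operating system and browser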

-----> Need to create a new config file ?

Do you want me to build a new AWStats config/profile

file (required if first install) [y/N] ? y --enter y to generate the AWStats configuration file

-----> Define config file name to create

What is the name of your web site or profile analysis ?

Example: www.mysite.com

Example: demo

Your web site, virtual server or profile name:

> www.station35.cluster.com --name of the site to analyze; any name will do

-----> Define config file path

In which directory do you plan to store your config file(s) ?

Default: /etc/awstats

Directory path to store config file(s) (Enter for default):

> --path for the site's configuration file; pressing Enter accepts the default

-----> Create config file '/etc/awstats/awstats.www.station35.cluster.com.conf'

Config file /etc/awstats/awstats.www.station35.cluster.com.conf created.

--This message confirms that /etc/awstats/awstats.www.station35.cluster.com.conf was generated

Edit the newly generated configuration file

vim /etc/awstats/awstats.www.station35.cluster.com.conf

LogFile="/usr/local/apache2/logs/access.log" --path of the access.log to analyze

LogType=W --confirm that this is W, which means a web log

DirData="/var/lib/awstats" --directory where AWStats stores its data; it must actually exist

[root@li ~]# mkdir /var/lib/awstats
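Before running the first update, it can help to confirm that the log really is in combined format, which appends a quoted referer and a quoted user agent after the status code and response size. The check below is a rough sketch: is_combined_line is a made-up helper, the sample lines are invented, and the pattern does not handle escaped quotes inside fields.

```shell
#!/usr/bin/env bash
# is_combined_line (hypothetical helper): succeed when a log line ends
# with the request, status, size, and the two extra quoted fields
# (referer and user agent) that the combined format adds.
# Rough check only: escaped quotes inside a field are not handled.
is_combined_line() {
    local re='"[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
    [[ "$1" =~ $re ]]
}

# Invented sample lines for illustration.
combined='10.1.1.8 - - [16/Apr/2013:13:54:00 +0800] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"'
common='10.1.1.8 - - [16/Apr/2013:13:54:00 +0800] "GET / HTTP/1.1" 200 1024'

is_combined_line "$combined" && echo "combined format"
is_combined_line "$common"   || echo "not combined format"
```

Against a live log, the last line can be tested with is_combined_line "$(tail -n 1 /usr/local/apache2/logs/access.log)".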

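Update the site statistics:

[root@li ~]# /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.station35.cluster.com

Access the report at:

http://10.1.1.35/awstats/awstats.pl?config=www.station35.cluster.com

Add a scheduled update to crontab:

01 1 * * * /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=www.station35.cluster.com > /dev/null 2>&1

Restrict access so that other people cannot view your site's statistics.

In the Apache configuration file, add htpasswd authentication to the <Directory> block that covers AWStats: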

Options FollowSymLinks

AllowOverride all

order allow,deny

allow from all

</Directory>

Then create the encrypted password file with htpasswd

--------------------------------------------

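Notes on analyzing multiple websites

1. With virtual hosts, configure AWStats once per virtual host so that each virtual host's access_log is analyzed separately.

2. To analyze a website on another machine, AWStats needs that machine's access_log, which leads to two cases:

If the access_log is on shared storage, point LogFile at it directly.

Otherwise, write a script that transfers the remote machine's access_log to the analyzing machine:

1. First set up passwordless SSH login between the two machines

2. Write the scripts

On the monitored host (node100):

vim /bin/accesslogcp.sh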

#!/bin/bash

cat /etc/httpd/logs/access_log > /tmp/node100.log

: > /etc/httpd/logs/access_log --empty the log once it has been copied

On the monitoring host:

vim /bin/nod100logcp.sh

#!/bin/bash

ssh 10.1.1.100 /bin/accesslogcp.sh > /dev/null 2>&1

scp 10.1.1.100:/tmp/node100.log /var/log/ >/dev/null 2>&1

3. On the monitoring host, point AWStats at /var/log/node100.log as the access log to analyze
