Monitoring Microsoft Windows from a Linux server

When monitoring a Microsoft Windows server or workstation, one of the greatest challenges is doing it without a local agent, so that no third-party software has to be installed manually on the system. Our monitoring system relies heavily on WMI for this. In this post, we explain what WMI is, how it is configured, and how to use it from a Linux machine.

Windows Management Instrumentation (WMI) is a set of specifications from Microsoft for consolidating the management of devices and applications in a network from Windows computer systems. WMI is Microsoft’s implementation of the Web-Based Enterprise Management (WBEM) and Common Information Model (CIM) standards.

What does WMI offer?

It provides users with information about the status of local or remote computer systems. It also supports such actions as the configuration of security settings, setting and changing system properties, setting and changing permissions for authorized users and user groups, assigning and changing drive labels, scheduling processes to run at specific times, backing up the object repository, and enabling or disabling error logging.

Requirements

WMI is installed on all computers running Windows 2000 or later. Out of the box, it provides a lot of functionality and access to detailed metrics within the Microsoft Windows system. We want to collect system information from Windows machines without installing additional agents on the Windows side. Instructions for enabling WMI remote access can be found here.

Configuration

On our Linux machine, we need the WMI command-line tool (WMIC). WMIC is designed to ease the retrieval of WMI information about a system by using some simple keywords (aliases). WMIC is available on Windows XP Professional and later, and there is a Linux port of it too. To use it, we need a Windows user account that is a member of the Administrators group; its credentials are passed as command-line parameters.
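
As a rough illustration of what a direct query looks like, the sketch below builds the argument list for the Linux wmic client and can optionally run it. It assumes the commonly packaged client, which takes the credentials as -U DOMAIN/user%password, the target as //host, and a WQL query string. The function names, host, account and query are placeholders for illustration, not values from our setup.

```python
import subprocess


def build_wmic_command(host, user, password, query, domain=None):
    """Return the argv list for the Linux wmic client (assumed syntax)."""
    account = f"{domain}/{user}" if domain else user
    return ["wmic", "-U", f"{account}%{password}", f"//{host}", query]


def run_wmic(host, user, password, query, domain=None, timeout=30):
    """Run the query and return raw stdout; raises if wmic exits non-zero."""
    argv = build_wmic_command(host, user, password, query, domain)
    result = subprocess.run(argv, capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; nothing is sent over the network here.
    print(build_wmic_command("192.0.2.10", "monitor", "secret",
                             "SELECT Caption FROM Win32_OperatingSystem",
                             domain="EXAMPLE"))
```

Passing the password on the command line exposes it in the process list, which is one reason to keep credentials in a protected file instead.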

To build a monitoring system, we also need monitoring software. If we have a lot of machines that need constant checks of functionality and resource usage, we need dedicated software that does this for us. One example is Icinga 2, which is part of the VA-Monitoring application.

Icinga 2

Icinga 2 is an open source monitoring system that checks the availability of your network resources, notifies users of outages, and generates performance data for reporting. It is scalable and extensible and can monitor large, complex environments across multiple locations. The software can use WMI through a plugin named check_wmi_plus. Check WMI Plus is a wrapper for the WMIC command-line tool that parses its output and provides human-readable information for each configured sensor.

The standard sensors are:

  • CPU,
  • RAM,
  • swap,
  • HDD usage,
  • network card utilization,
  • errors and warnings in the Application, System and Security logs,
  • Domain Controller status,
  • critical services, as well as additional details for MS SQL and IIS.

For every sensor, there is a selected set of parameters which should be passed to the check_wmi_plus.pl executable, which is usually located in /usr/lib/nagios/plugins/.

For checking CPU, with warning level at 75% and critical level at 90%, the command has to be invoked with the following arguments:

# /usr/lib/nagios/plugins/check_wmi_plus.pl -H 10.107.150.16 -A /etc/check_wmi_plus/evo12.txt -m checkcpu -w 75 -c 90
OK (Sample Period 16 sec) - Average CPU Utilisation 0.05%|'Avg CPU Utilisation'=0.05%;75;90;
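
The output line follows the usual Nagios plugin layout: a status word, a human-readable message, then performance data after the pipe character. The sketch below is one possible way to split such a line in a script. The function name and the returned structure are illustrative choices, and only the label, value, unit, warning and critical fields are extracted.

```python
import re

# 'label'=value[unit];warn;crit;min;max  (the label may also be unquoted)
PERF_RE = re.compile(
    r"(?:'([^']+)'|(\S+?))=([-\d.]+)([^;\s]*)((?:;[^;\s]*)*)"
)


def parse_plugin_output(line):
    """Split one plugin output line into status, message and metrics."""
    text, _, perf = line.partition("|")
    text = text.strip()
    status = text.split(None, 1)[0]  # first word: OK, WARNING, CRITICAL...
    metrics = {}
    for quoted, bare, value, unit, rest in PERF_RE.findall(perf):
        fields = rest.split(";")[1:] if rest else []
        fields += [""] * (4 - len(fields))  # pad missing warn/crit/min/max
        metrics[quoted or bare] = {
            "value": float(value),
            "unit": unit,
            "warn": fields[0],
            "crit": fields[1],
        }
    return status, text, metrics


if __name__ == "__main__":
    sample = ("OK (Sample Period 16 sec) - Average CPU Utilisation 0.05%"
              "|'Avg CPU Utilisation'=0.05%;75;90;")
    print(parse_plugin_output(sample))
```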

In the command above, the credentials for the server 10.107.150.16 are stored in the file /etc/check_wmi_plus/evo12.txt, in the following format:

username={user, example administrator}
password={pass}
domain={shortdomain}
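
Because this file holds an administrator password, it is worth creating it with restrictive permissions. The following is a minimal sketch, assuming a Linux host and the three-key format shown above. The helper name is hypothetical, and file ownership is left to the caller: the account that runs the checks must be able to read the file.

```python
import os


def write_auth_file(path, username, password, domain=""):
    """Write a check_wmi_plus credentials file readable only by its owner."""
    content = (
        f"username={username}\n"
        f"password={password}\n"
        f"domain={domain}\n"
    )
    # Create with mode 0600 so the password is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # Enforce the mode even if the file already existed.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(content)
```

A call such as write_auth_file("/etc/check_wmi_plus/evo12.txt", "administrator", "...", "SHORTDOMAIN") would produce the layout above; the values here are placeholders.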

This plugin is added to Icinga 2 with the following configuration segment:

object CheckCommand "check_wmi" {
  import "plugin-check-command"
  command = [ "/usr/lib/nagios/plugins/check_wmi_plus.pl" ]
  arguments = {
    "--inidir" = "$wmi_inidir$"
    "-H" = "$host.address$"
    "-A" = "$wmi_authfile_path$"
    "-m" = "$check_mode$"
    "-s" = "$wmi_submode$"
    "-a" = "$wmi_arg1$"
    "-o" = "$wmi_arg2$"
    "-3" = "$wmi_arg3$"
    "-4" = "$wmi_arg4$"
    "-y" = "$wmi_delay$"
    "-w" = "$wmi_warn$"
    "-c" = "$wmi_crit$"
    "-exc" = "$wmi_exclude$"
    "--nodatamode" = {
      set_if = "$wmi_nodatamode$"
    }
  }
}

After adding the plugin to Icinga 2, the various sensors can be configured in a .conf file in /etc/icinga2/conf.d. For example, to monitor the server 10.107.150.16 with multiple sensors, we create a configuration file named winserver.conf:

object Host "winserver" {     # the host identifier or fqdn
  import "generic-host"     # import generic configuration
  address = "10.107.150.16"     # the host IP address
  display_name = "Secure Corporate Windows Server"     # host friendly name
  vars.notification["mail"] = { groups = [ "icingaadmins", "va-masters" ] }     # which user groups are notified of problems/recoveries
  vars.wmi_authfile_path = "/etc/check_wmi_plus/evo12.txt"     # where the credentials for this server are stored
  vars.os = "Windows"     # variable for assigning checks
}

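We use the vars.os variable to assign multiple WMI checks. The checks are defined in a separate file and apply to every server whose host configuration sets vars.os = "Windows". Each rule imports a service template named wmi-service, which is not shown here; it is expected to set check_command to the check_wmi command defined above.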

  • Check CPU
    • apply Service "CPU" {
      import "wmi-service"
      check_interval = 1m
      retry_interval = 1m
      max_check_attempts = 3
      vars.check_mode = "checkcpu"
      vars.wmi_warn = "_AvgCPU=65"
      vars.wmi_crit = "_AvgCPU=75"
      vars.tovamaster = "true"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • Check RAM Memory
    • apply Service "Memory" {
      import "wmi-service"
      check_interval = 1m
      retry_interval = 1m
      max_check_attempts = 3
      vars.check_mode = "checkmem"
      vars.wmi_warn = "70"
      vars.wmi_crit = "80"
      vars.tovamaster = "true"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • Check Disk capacity
    • apply Service "Disk" {
      import "wmi-service"
      check_interval = 10m
      retry_interval = 10m
      max_check_attempts = 2
      vars.check_mode = "checkdrivesize"
      vars.wmi_arg1 = "."
      vars.wmi_warn = "95"
      vars.wmi_crit = "98"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • Check Disk IO
    • apply Service "Disk IO C:" {
      import "wmi-service"
      vars.check_mode = "checkio"
      vars.wmi_submode = "logical"
      vars.wmi_arg1 = "C:"
      assign where host.vars.os == "WindowsX"
      ignore where host.vars.disable_wmi
      }
  • EventLog Application
    • apply Service "Log: Application" {
      import "wmi-service"
      check_interval = 15m
      retry_interval = 5m
      max_check_attempts = 1
      vars.check_mode = "checkeventlog"
      vars.wmi_arg1 = "application"
      vars.wmi_arg2 = "1"
      vars.wmi_arg3 = "1"
      vars.wmi_warn = "1"
      vars.wmi_crit = "10"
      vars.hidden = "True"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • EventLog Security
    • apply Service "Log: Security" {
      import "wmi-service"
      check_interval = 15m
      retry_interval = 5m
      max_check_attempts = 1
      vars.check_mode = "checkeventlog"
      vars.wmi_arg1 = "security"
      vars.wmi_arg2 = ",5"
      vars.wmi_arg3 = "1"
      vars.wmi_warn = "1"
      vars.wmi_crit = "5"
      vars.hidden = "True"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • EventLog System
    • apply Service "Log: System" {
      import "wmi-service"
      vars.check_mode = "checkeventlog"
      vars.wmi_arg1 = "system"
      vars.wmi_arg2 = "1"
      vars.wmi_arg3 = "1"
      vars.wmi_warn = "1"
      vars.wmi_crit = "5"
      #vars.wmi_exclude = "DistributedCOM"
      vars.hidden = "True"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }
  • Network Throughput
    • apply Service "Network Interfaces" {
      import "wmi-service"
      vars.check_mode = "checknetwork"
      vars.wmi_arg1 = "10.107.150."
      vars.wmi_warn = "_SendBytesUtilisation=0.1"
      vars.wmi_crit = "_SendBytesUtilisation=0.3"
      vars.hidden = "True"
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      }

Apart from these standard checks, there are two very useful sensors that show which applications consume the most CPU and memory. Knowing that CPU usage is high is useful, but it is more important to know which process is responsible, particularly when one cannot immediately connect to the system and debug it.

That is why we created two sensors which show which processes and applications are consuming the CPU/Memory resources.

  • CPU Hungry Apps
    • apply Service "CPU Hungry Apps" {
      import "wmi-service"
      check_interval = 2m
      retry_interval = 2m
      max_check_attempts = 3
      vars.check_mode = "checkproc"
      vars.wmi_submode = "cpuabove"
      vars.wmi_arg1 = "%"
      vars.wmi_exclude = "_AvgCPU=@0:10"     # ignore processes consuming below 10% cpu usage
      vars.wmi_delay = "0"
      vars.wmi_warn = "80"
      vars.wmi_crit = "90"
      vars.wmi_nodatamode = true
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      vars.hidden = "True"
      }
  • Memory Hungry Apps
    • apply Service "Memory Hungry Apps" {
      import "wmi-service"
      check_interval = 2m
      retry_interval = 2m
      max_check_attempts = 3
      vars.check_mode = "checkproc"
      vars.wmi_submode = "memory"
      vars.wmi_arg1 = "%"
      vars.wmi_arg2 = "svchost%"
      vars.wmi_exclude = "WorkingSet=@0:80M"
      vars.wmi_delay = "0"
      vars.wmi_warn = "WorkingSet=80M"
      vars.wmi_crit = "WorkingSet=500M"
      vars.wmi_nodatamode = true
      assign where host.vars.os == "Windows"
      ignore where host.vars.disable_wmi
      vars.hidden = "True"
      }
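
Values such as "80" and "@0:10" in the rules above use the Nagios range notation. The sketch below models the standard semantics of that notation as we understand them; it is an illustration rather than the plugin's own code. It deliberately ignores the check_wmi_plus extensions seen above, namely the field prefix (for example _AvgCPU=) and multiplier suffixes such as 80M.

```python
def parse_range(spec):
    """Parse a Nagios-style range into (alert_inside, low, high)."""
    inside = spec.startswith("@")  # '@' inverts: alert when inside range
    if inside:
        spec = spec[1:]
    if ":" in spec:
        low_s, high_s = spec.split(":", 1)
    else:
        low_s, high_s = "0", spec  # plain "N" means the range 0..N
    low = float("-inf") if low_s == "~" else float(low_s or 0)
    high = float("inf") if high_s == "" else float(high_s)
    return inside, low, high


def alerts(value, spec):
    """Return True if value triggers the threshold described by spec."""
    inside, low, high = parse_range(spec)
    in_range = low <= value <= high
    return in_range if inside else not in_range


if __name__ == "__main__":
    print(alerts(95, "90"))       # above a plain threshold
    print(alerts(5, "@0:10"))     # inside an '@' range
    print(alerts(11.5, "@0:10"))  # outside an '@' range
```

Read this way, the exclude value "@0:10" matches any process whose average CPU lies between 0 and 10 percent, which is how the low consumers are filtered out of the CPU Hungry Apps result.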

For example, we can also run the CPU Hungry Apps check from the command line and get immediate feedback about CPU usage on the Windows server.

# /usr/lib/nagios/plugins/check_wmi_plus.pl -H 10.107.150.16 -A /etc/check_wmi_plus/evo12.txt -m checkproc -s cpuabove -a "%" -w 80 -c 90

OK (Sample Period 41 sec) - Total Process Count=53 (Process details on next line)|'Process Count'=53;
OK - CPU for FoxitConnectedPDFService (PID=1132)=11.5%
OK - CPU for vmtoolsd#2 (PID=8360)=0.2%
OK - CPU for SppExtComObj (PID=6516)=0.1%
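
When this output is captured by a script, the per-process lines can be ranked to find the heaviest consumers. The following sketch assumes the line layout shown in the sample above; the function name and the default limit are illustrative choices.

```python
import re

# Matches lines such as: "OK - CPU for vmtoolsd#2 (PID=8360)=0.2%"
PROC_RE = re.compile(
    r"CPU for (?P<name>.+?) \(PID=(?P<pid>\d+)\)=(?P<cpu>[\d.]+)%"
)


def top_cpu_processes(output, limit=3):
    """Return (name, pid, cpu_percent) tuples, highest CPU first."""
    procs = [
        (m["name"], int(m["pid"]), float(m["cpu"]))
        for m in PROC_RE.finditer(output)
    ]
    return sorted(procs, key=lambda p: p[2], reverse=True)[:limit]


if __name__ == "__main__":
    sample = "\n".join([
        "OK - CPU for SppExtComObj (PID=6516)=0.1%",
        "OK - CPU for FoxitConnectedPDFService (PID=1132)=11.5%",
        "OK - CPU for vmtoolsd#2 (PID=8360)=0.2%",
    ])
    for name, pid, cpu in top_cpu_processes(sample):
        print(f"{cpu:5.1f}%  {name} (PID {pid})")
```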

This way, we can see over time which process is consuming the system resources, and gain better insight into what should be corrected on the system.

We encourage you to try the software because it includes many settings, which can aid your work from a broad perspective. If you have any questions or you want help in deploying it feel free to contact us at support@vapour-apps.com or support@va.mk.

Published September 22nd, 2017 in Monitoring, Private Cloud, Virtualization
