Previous  |  Contents  |  Next   

Chapter 3: NodeWatch

What Is NodeWatch?
NodeWatch Terms
NodeWatch Home Page
Managing Node Groups
   Creating a Node Group
   Assigning User Access to Certain Node Groups
   Editing / Deleting Node Groups
Managing Nodes
   Add a Single Node
   Add Node Types
   Import a List of Nodes
   Autodiscovery
   Edit / Delete Nodes
   Pausing and Resuming Node Monitoring
Defining Alerts
Monitoring Nodes from Different Remote Locations
Graphing Nodes and Node Groups
   Status Map
   Accessing / Reading Graphs
Node Summary Data
NodeWatch Reports
   Recent Failures
   Active Alerts
   Top N Reports
   Daily / Monthly / Yearly Report
   Service Level Agreement (SLA)
   Node Alert Details Reports
   Exporting / Emailing Reports




What Is NodeWatch?

NodeWatch is a module of the Chroniker Suite that monitors servers, routers, and other nodes from the network layer perspective. NodeWatch runs an ICMP ping check against the servers, or a TCP connect check against the applications, that you specify. Response time data is collected and reported.



NodeWatch Terms

Some common terms that are critical for using NodeWatch are listed below:

Node
A node is an IP address (e.g. server, router, switch, etc.) or a TCP connect based application to be monitored by NodeWatch.

Node Alias
Nodes are listed by an alias instead of the host or IP address to make the NodeWatch interface more readable. You specify the unique node alias.

Node Group
A node group is a collection of nodes that share the following characteristics: group alias, check frequency, connect time-out, and profile. An admin can assign a user as the owner of the group and also specify the sharing privileges for the node group.

Reaction
A reaction is an e-mail, numerical page, SNMP trap, or command that is sent or executed when tests exceed monitoring parameters.

Status Map
A Status Map shows how the nodes are related to each other and displays the parent-child relationships between them.

Report
A Report is historical monitoring data in a tabular format.



NodeWatch Home Page

Access all node management functions from the NodeWatch home page. To open it, click the NodeWatch link in the Modules menu on the Chroniker main page. Set up Profiles, Reactions, Events, and Groups before you create a Node to monitor. NodeWatch itself places no limit on the number of Groups or Nodes it can manage; the number of Nodes is limited only by the Watch Elements you have available.

Buttons and Displays in Top Section

•  The Refresh button (two arrows in a circular pattern) updates the page to show the most current status of each node test.

•  The Add Node... button opens a menu of three options:
       Add a single node
       Add nodes by autodiscovering your network
       Import nodes

•  The Manage Nodes button opens a page that allows you to assign or change parents of nodes, groups, etc.

•  The Recent Failures button opens the Failure page, which lists the failed nodes.

•  The Active Alerts button opens the Alert page, which displays the current alerts on nodes that are down or exceeding thresholds.

•  Filter View shows only those nodes within a particular group or with a particular alert status. This is especially useful when you are watching dozens or hundreds of nodes.

•  Current status and overall status are displayed in a box at the top right of the NodeWatch home page. Current status is the current network uptime percentage (the weighted-average response time of all available nodes divided by the number of nodes tested). The overall status is the total number of successful tests of all nodes being monitored divided by the total number of attempts.
The color scheme is as follows:
       Red - network is less than 70% available
       Orange - network is between 70% and 85% available
       Yellow - network is between 85% and 95% available
       Green - network is between 95% and 100% available
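
The color scheme above can be sketched in a few lines of Python. This is only an illustration, not Chroniker's own code: the function name status_color is hypothetical, and because the manual does not say which band a boundary value (70, 85, or 95) falls into, the sketch assumes each boundary belongs to the higher band.

```python
def status_color(availability_pct: float) -> str:
    """Map a network availability percentage to the documented color bands.

    Assumption: a boundary value (70, 85, 95) belongs to the higher band.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100")
    if availability_pct < 70.0:
        return "Red"
    if availability_pct < 85.0:
        return "Orange"
    if availability_pct < 95.0:
        return "Yellow"
    return "Green"


print(status_color(90.0))  # Yellow
```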

Reading the Nodes Table

Table Columns

Left most column control buttons:
•  The Pause icon and Resume icon toggle to pause and resume the testing of a node.
•  The Trash can icon deletes the node.
•  The Edit icon opens the Node Detail Page and allows you to edit the node definition.

Alert Status column displays a colored ball indicating the current alert status of the node.
•  Alert Status Color Codes:
       Green - no alert
       Blue - information alert
       Yellow - warning alert
       Orange - error alert
       Red circle with white arrow icon - down, or not found
       Grey circle with black question mark icon - unknown, because the parent node is down

Node Alias is the name you gave to the node. It is also a link to a response time graph of that node.

Group column displays the group to which the node belongs. Click the group link to display a response time graph of the entire node group.

Status column displays the date and time at which the current up or down status took effect for that node. Example: Up since 12-09-2004 14:05:27.

Response Time (ms) column displays the response time from the most recent test (in milliseconds) of the node.

Uptime (%) column displays the uptime percentage of the node.
•  Example: If a node is monitored every 15 minutes for twenty-four hours, it will have 96 attempts. If the node fails once, the Uptime% is 95 (number of successes) divided by 96 (number of attempts), or approximately 98.96%.
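
The arithmetic in this example can be checked with a short Python sketch. The function names are illustrative only and are not part of Chroniker.

```python
def expected_attempts(period_minutes: int, check_frequency_minutes: int) -> int:
    """Number of tests that fit into a monitoring period."""
    return period_minutes // check_frequency_minutes


def uptime_percent(successes: int, attempts: int) -> float:
    """Uptime% = successful tests divided by attempts, as a percentage."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


attempts = expected_attempts(24 * 60, 15)
print(attempts)                                          # 96
print(round(uptime_percent(attempts - 1, attempts), 2))  # 98.96
```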

Sorting the Nodes Table

The node table can be sorted by clicking on one of the following table headings: Node Alias, Group Name, Status, Response Time, and Uptime%. An arrow in the column heading indicates how the table is currently sorted, ascending or descending. The Alias and Group Name columns are sorted alphabetically. The Status, Response Time, and Uptime% columns are sorted numerically.



Managing Node Groups

A node group is a collection of nodes that share the following characteristics: alias, check frequency, connect time-out, and profile. There is no limit to the number of Groups that can be created. An admin can assign a user as the owner of the group and also specify the sharing privileges for the node group.

Group Page Overview

The Group page displays a table of the defined groups, listing the group name, connection time-out and check frequency parameters for each.

•  The Add New Group button opens the New Group page to define new groups.

•  The Pause icon toggles with the Resume icon to start and stop the monitoring of all the nodes belonging to the group.

•  The Trash can icon deletes the group.

•  The Edit icon opens a page to edit the group's parameters.

•  The Group Name field displays the name of the group.

•  The Connect Time-out (Seconds) field displays the number of seconds before the response is considered a failure.

•  The Check Frequency (Minutes) field displays the amount of time between node response time tests.

Creating a Node Group

•  Click on the Groups link in the Tools menu on the NodeWatch page to open the node Groups page.

•  Click on the Add New Group button.

•  Type a unique group name in the Group Alias field. Nodes can be sorted and graphed by group.

•  Check Frequency (min.): Type the number of minutes indicating how often you want the group to be tested.

•  Connect Timeout (sec.): Type the number of seconds to wait for a response before the test is considered a failure.

•  Profile: Select a profile from the drop down list.

•  Assign the Group Owner: (only the admin has this section of the form available)
       Select the User Group that the owner belongs to
       Select the User that will be the group owner
Note: Refer to the Assigning User Access to Certain Node Groups section for more help.

•  Check the sharing privilege you want for this group:
       None - Choosing this allows only the group owner and admin to have access to this group.
       User Group - Choose this to allow all the users belonging to the same User Group as the owner access to this group.
       Everyone - All Chroniker users have access.
Note: Refer to the Assigning User Access to Certain Node Groups section for more help.

•  Description field: Type a description of the group in this text field. When you have many groups defined, this helps you and other users remember the group's purpose.

Assigning User Access to Certain Node Groups

With Chroniker, the admin or owner of a node group can assign access privileges for the node group, so that only certain users will be able to access or modify the node group and the node objects under this group.

User access to a node group can be assigned when creating the node group or when editing it depending on your user privileges:

Admin

The admin can assign nodes to any user group or owners of user groups.  The admin can also assign someone as a group owner, and change this assignment at will.

In the Add New Group or Edit Group page, the admin will see:

•  Group Owner: Here the admin can assign a Chroniker user to be the owner of this node group.
       Select the User Group that the owner belongs to
       Select the User that will be the group owner

Note: If no user is selected as the owner, then the admin is assigned as the group owner.

•  Sharing: This is where the admin selects the access privileges for this node group.
       None - Only the group owner (and admin) has access to it.
       User Group - All the users belonging to the same User Group as the owner have access to this node group.
       Everyone - All Chroniker users have access to this node group.

Owner of node group

A Chroniker user is the owner of a node group if they created the group or if the admin assigned the node group to them. The owner of a node group can grant access to the node group to themselves only, to the users in their User Group, or to all Chroniker users.

In the Add New Group or Edit Group page, the group owner will see:

•  Sharing: This is where the group owner selects the access privileges for this node group.
       None - Only the group owner (and admin) has access to it.
       User Group - All the users belonging to the same User Group as the owner have access to this node group.
       Everyone - All Chroniker users have access to this node group.

Other Chroniker Users

Users who are not owners of the node group cannot assign access privileges.


Editing a Node Group

•  Click on the Groups link in the Tools menu to open the Groups page.

•  Find the Group you wish to edit and click its Edit icon.

•  Edit the Group parameters.
      Important: If you edit Check Frequency, the new check frequency takes effect after the next originally scheduled check. For example, with an original 15-minute frequency, checks run at 6:15 and 6:30. If you change the frequency to 5 minutes at 6:16, the 5-minute frequency does not begin until the 6:30 check has run.

•  Click the Update button.
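
The Check Frequency rule described under Important above can be illustrated with a small Python sketch. It is a reading of the documented example, not Chroniker's scheduler: the function name upcoming_checks is hypothetical, and the sketch assumes that the first check on the new frequency happens one new interval after the last originally scheduled check.

```python
from datetime import datetime, timedelta


def upcoming_checks(last_check: datetime, old_freq_min: int,
                    new_freq_min: int, count: int) -> list:
    """Return the next `count` check times after a frequency edit.

    last_check is the most recent check that ran before the edit.
    """
    # The next check still runs on the original schedule.
    checks = [last_check + timedelta(minutes=old_freq_min)]
    # After that, checks follow the new frequency (assumption, see lead-in).
    while len(checks) < count:
        checks.append(checks[-1] + timedelta(minutes=new_freq_min))
    return checks[:count]


# Frequency changed from 15 to 5 minutes at 6:16, after the 6:15 check.
for moment in upcoming_checks(datetime(2004, 12, 9, 6, 15), 15, 5, 3):
    print(moment.strftime("%H:%M"))
```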

Deleting a Node Group

•  Click on the Groups link in the Tools menu.

•  Find the Group you wish to delete, then click its Trash can icon.

•  Click the OK button to confirm the deletion.



Managing Nodes

There are three ways to add nodes:
       Add a Single Node
       Import a CSV list
       Autodiscovery

Each node uses one Watch Element. Chroniker has no hard limit to the number of Nodes that you can define. You are limited only by the number of Watch Elements you have available.

Before adding a node, you must either create reactions and events or use the pre-defined ones. To create new reactions and events, please refer to the “Managing Reactions” and “Events” sections in Chapter 2.

Adding a Single Node

•  Click on the Add New Node button from the NodeWatch home page.

•  Select a Group to which the new Node will belong from the drop down list.
            Note: Select <New> to create a new group. Please refer to the “Creating a Node Group” section under Managing Node Groups for more help with the New Group form.

•  Node Alias field: Type a descriptive name, which should be unique.

•  Host/IP field: Type a URL, host name, or IP address, for example www.yourbusiness.com, your_server_name, or ###.###.#.#

•  Port field: Type in the application port number if you are monitoring an application. Leave this field blank for a Ping test.

•  Type field: Select the type of this node from the drop-down list.

•  SubType field: Select the sub type from the drop down list.
            Note:  You can add more sub types to the drop down list if needed. For instructions on how to do this, please refer to the “Adding Node Types and Sub Types” section below.

•  Dependency: Move the nodes that you want this new node to depend on from the ‘Available Nodes' box to the ‘Parent Nodes' box using the “>>” button.
            Note: If the Parent Node goes down, its dependents will show as Unreachable. In this case, you will receive only one alert, stating that the parent is down. When the Parent Node is up again, reactions resume as individually configured for the dependents.

•  Select the Events for NodeWatch at each level from the corresponding drop down list. For an overview of Chroniker events please refer to Chapter 2: Features Common to All Modules: Events.

•  Click on the Add button at the bottom of the page.
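
The Dependency note in the steps above can be sketched as follows. This is an illustration of the documented behavior only, not Chroniker's implementation: the function name resolve_statuses and the node names are hypothetical, and the sketch assumes that a node with several parents is Unreachable as soon as any one parent is down or unreachable, and that dependencies contain no cycles.

```python
def resolve_statuses(is_up: dict[str, bool],
                     parents: dict[str, list[str]]) -> dict[str, str]:
    """Return 'Up', 'Down', or 'Unreachable' for every node."""
    resolved: dict[str, str] = {}

    def resolve(node: str) -> str:
        if node not in resolved:
            parent_states = [resolve(parent) for parent in parents.get(node, [])]
            if any(state != "Up" for state in parent_states):
                # A parent is down, so this node's own result is not reported.
                resolved[node] = "Unreachable"
            else:
                resolved[node] = "Up" if is_up[node] else "Down"
        return resolved[node]

    for name in is_up:
        resolve(name)
    return resolved


# Hypothetical nodes: two servers that depend on one router.
statuses = resolve_statuses(
    {"router": False, "web": False, "db": True},
    {"web": ["router"], "db": ["router"]},
)
# Only nodes that are Down raise an alert; Unreachable dependents do not.
print(sorted(name for name, state in statuses.items() if state == "Down"))
```

In this example only the router is reported as down, which matches the single alert described in the note.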



Adding Node Types and Sub Types

To add more types and sub types to the “Add New Node” form, edit the following file:
                      “<Chroniker Installation Folder>/apps/includes/nodetypes.txt”

The file has the following format:
                      <Type>,<SubType>

Every entry needs to be on a new line.
                      Example: To add a new OS type Amiga, add the following to the end of the file:
                                            OS,Amiga

                      This type will now be selectable from the sub type list.
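
If you prefer to script this edit, the following Python sketch appends an entry in the <Type>,<SubType> format described above. The function names are illustrative, and the path you pass to add_node_type is whatever nodetypes.txt location applies to your installation. The sketch assumes the file is plain text with one entry per line.

```python
from pathlib import Path


def with_node_type(contents: str, node_type: str, sub_type: str) -> str:
    """Return the file contents with '<Type>,<SubType>' appended on its own line."""
    if "," in node_type or "," in sub_type:
        raise ValueError("type and sub type must not contain a comma")
    entry = f"{node_type},{sub_type}"
    if entry in (line.strip() for line in contents.splitlines()):
        return contents  # already listed, nothing to add
    if contents and not contents.endswith("\n"):
        contents += "\n"  # every entry needs to be on a new line
    return contents + entry + "\n"


def add_node_type(nodetypes_file: Path, node_type: str, sub_type: str) -> None:
    """Rewrite the given nodetypes.txt with the new entry appended."""
    original = nodetypes_file.read_text()
    nodetypes_file.write_text(with_node_type(original, node_type, sub_type))
```

Keeping the text handling in with_node_type separate from the file access makes it easy to check the result before anything is written to disk.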

Sub Type Image:

The value selected in the sub type drop-down list is also used to determine the icon used for this Node when displayed in the Status Map. To add an image for the types you added:

•  Create an image file in your image editing software.

•  Name it <sub type>.gif. The image name must be the same as the sub type.
                      Example: Amiga.gif, using the sub type created in the example above.
                      Note:  For ideal results the image dimensions should be 40x40 pixels.
                                 At the moment only GIF images are supported.

•  Copy the image file to:
                       “<Chroniker Installation Folder>/apps/images/statusmap_icons/”



Importing a List of Nodes

To import nodes into NodeWatch, the file must be a CSV file with the following characteristics:

• The file should contain the following columns in the order listed:
                    (1) Node Alias
                    (2) Node Hostname / IP
                    (3) Port: can be left blank - however the column still needs to be in the file (see example below)

• The columns should be separated using one of the following allowed characters:
                    Ampersand
                    Comma
                    Hash sign
                    Semi-colon
                    Tab

• The file should not contain any column headers

• The Maximum allowable file size is 2MB

Example:
myserver, localhost, 8888
anotherserver, 10.0.0.127,
thirdserver, another_hostname, 1234

Note: The number of nodes that can be imported depends on the number of Watch Elements allowed by your Chroniker license.
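
Before importing, you may want to check a file against the rules listed above. The following Python sketch is one way to do so; it is not part of Chroniker. The names parse_node_list, ALLOWED_SEPARATORS, and MAX_FILE_BYTES are hypothetical, and the sketch assumes that 2MB means 2 * 1024 * 1024 bytes, that blank lines are ignored, and that spaces around a value are not significant.

```python
ALLOWED_SEPARATORS = {"&", ",", "#", ";", "\t"}
MAX_FILE_BYTES = 2 * 1024 * 1024  # assumption: "2MB" means 2 * 1024 * 1024 bytes


def parse_node_list(text: str, separator: str = ",") -> list:
    """Return a list of (alias, host, port) tuples; port is None when left blank."""
    if separator not in ALLOWED_SEPARATORS:
        raise ValueError(f"separator {separator!r} is not allowed")
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        raise ValueError("file is larger than 2MB")

    nodes = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = [field.strip() for field in line.split(separator)]
        if len(fields) != 3:
            raise ValueError(f"line {number}: expected 3 columns, found {len(fields)}")
        alias, host, port = fields
        if not alias or not host:
            raise ValueError(f"line {number}: alias and host are required")
        try:
            port_number = int(port) if port else None
        except ValueError:
            raise ValueError(f"line {number}: port {port!r} is not a number") from None
        nodes.append((alias, host, port_number))

    # The manual states that at least one port must be defined in the file.
    if not any(port is not None for _, _, port in nodes):
        raise ValueError("at least one node must define a port")
    return nodes


sample = "myserver, localhost, 8888\nanotherserver, 10.0.0.127,\n"
print(parse_node_list(sample))
```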

Create a CSV File

• Open your text editing program, e.g. Notepad.

• Type one node per line, using one of the following allowed characters to separate the Node Alias, Node Hostname / IP, and Port.

                      Allowed Separators: Ampersand, Comma, Hash sign, Semi-colon, and Tab

                     Example: myserver, 10.0.0.77, 8888

                     Important: If a node has no port, leave the port column blank, but make sure to keep the separator after the node IP.
                                          Example: nrg77, 10.0.0.77,

• At least one port must be defined in the file or Chroniker will not import the CSV file.

• Save the file with the extension .csv.

Note: The maximum allowable file size is 2MB.

Create a CSV File Using Excel

• Open Microsoft Excel.

• Type each field of a node into its own data cell in a record. A record is one of the numbered horizontal rows in Excel, which are made up of specific cells. For example, record 1 is made up of cells A1, B1, C1. There are currently three fields available in an upload, in this order: Node Alias, Node Hostname / IP, and Port. Do not type the field names as a header row, because the import file must not contain column headers.

• Type a record for each node, beginning with record 1 (cell A1).

• Go to the "File" menu and select "Save As."

• Save as type:
                    - CSV (comma delimited)(*.csv)
• Click "Save."

Import CSV File into NodeWatch

•  On the NodeWatch home page, select Import Nodes from the Setup menu.

•  Select a group for these nodes, or define a new group by selecting <NEW>.

•  Click Browse and select the CSV file (plain text) containing the list of nodes to import into NodeWatch.

Note: The number of nodes that can be imported depends on the number of Watch Elements allowed by your Chroniker license.



Autodiscovery

•  Select “Auto Discover” from the “Add Nodes” button.

•  Click on “Add/Delete Ports” to define ports on your network.

•  Click on “Select Ports to Scan” to specify the ports to check for this run of Autodiscovery.

•  The default ports will appear. Check the box next to the ports you do NOT want to scan. At the bottom, add the names and numbers of any ports you want to ADD. Click Update. When the ports are correct, click Auto Discover Nodes.

•  Enter the lowest IP address in the range you want to scan. (Press Tab to move to the next field.)

•  Enter the last field of the ending IP address to scan.

•  Click Enter.

•  Wait a few minutes for Chroniker to scan your network.
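
To see which addresses a given entry covers, the range in the steps above can be expanded with Python's standard ipaddress module. This sketch is an interpretation, not Chroniker's scanner: the function name scan_addresses is hypothetical, and it assumes that the "last field" is the final part of an IPv4 address, so the range stays within the first three fields of the starting address.

```python
import ipaddress


def scan_addresses(start_ip: str, end_last_field: int) -> list:
    """List every address from start_ip up to the given ending last field."""
    start = ipaddress.IPv4Address(start_ip)
    start_last_field = int(start) & 0xFF  # final field of the starting address
    if not start_last_field <= end_last_field <= 255:
        raise ValueError("ending field must be between the starting field and 255")
    network_part = int(start) - start_last_field
    return [
        str(ipaddress.IPv4Address(network_part + field))
        for field in range(start_last_field, end_last_field + 1)
    ]


print(scan_addresses("10.0.0.5", 8))
```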



Editing a Single Node

• Find the node you wish to edit on the NodeWatch home page.

• Click the Edit icon corresponding to the node. This brings up the Edit Node page.

Editing Multiple Nodes

• Click on Manage Nodes on the NodeWatch home page.

• To change groups, select all the nodes you want to move, assign a group, and click Update.

• To change types, select all the nodes that share a common type and subtype, select the type and subtype, and click Update.

• To change parents, select all the nodes that share a common parent, move the common parent to the right box, and click Update.

• Click on Status Map to check that all the nodes are configured as you intended.

• To change events, select all the nodes that you want to share common events and click Update.

• Repeat as needed to update other node configurations.

Deleting a Node

•  Find the node you wish to delete on the NodeWatch home page.

•  Click the Trash can icon corresponding to the node you wish to delete. Click OK to confirm.



Pausing and Resuming Node Monitoring

The Pause icon and Resume icon toggle to pause and resume the testing of a node. Monitoring can be paused and then re-started when necessary if there is planned downtime for a node. Pausing monitoring will prevent unwanted reactions. Double red vertical bars are visible when the node test is active.

To Pause Node Monitoring

•  Find the node you wish to stop monitoring on the NodeWatch home page.

•  Click its Pause icon (two vertical red lines).

•  Click the OK button to confirm the pause.

To Resume Node Monitoring

•  Find the node you want to resume monitoring on the NodeWatch home page. When the node test is paused, the button shows the resume icon.

•  Click on the corresponding Resume icon for the node.

•  Click the OK button to confirm.



Defining Alerts

To set up alerts that will be triggered when your node goes down or its response time exceeds a certain threshold, you need to:

  • Define a Reaction. Reactions are enacted when thresholds are exceeded.
    1. On the Nodes page, click on the Reactions link in the Tools menu.
    2. Click on "Add New Reaction" and select the type of reaction you want to define: email, custom, numerical page, restart, or SNMP trap.
    3. Fill out the form.
  • Define an Event that the reaction will be applied to. Events launch your predefined reactions when conditions you define are met. You can assign up to 2 reactions per event.
    1. In the NodeWatch module, click on the Events link in the Tools menu.
    2. Click on the "Add New Event" button and fill out the form.
  • Apply this event to your node(s):
    1. On the Events page, click on the “Apply Event” button, which is the first button to the left of the event you want to apply.
    2. Select your node(s) from the list.
    3. Click “Apply”.

Note: You can also apply a predefined event to a node by clicking on the Edit button next to the node on the Nodes page.

For detailed help on setting up reactions and events, please refer to Chapter 2: Features Common to All Modules.


Monitoring Nodes from Different Remote Locations

With Chroniker Remote Locations, you can monitor the response time of your critical servers and nodes from different locations. Each defined Chroniker Remote Location reports to Chroniker Base, so you can easily manage and compare the monitored data from a central location.

Download Chroniker Remote Location from: http://www.nrgglobal.com/downloads/chroniker_remote_locations_downloads.php

Note: Chroniker Remote Location needs to be installed on a different server from the one hosting Chroniker Base.


Graphing Nodes and Node Groups

Status Map

Status Map is a graphical map showing how the nodes are related to each other (parent-child relationship).

Status Map Options:

Status Map Options allows you to specify the layout and the graphical options of the map.

•  Click on the Status Map link from the Tools menu on the NodeWatch home page.

•  Click on the Status Map Options button.

•  Enter the full path for a custom background image.

•  Check “Show Popups” if you want the alert message to be displayed when the mouse pointer moves over the node icon.

•  Check “Show Chroniker Icon” if you want the Chroniker icon to be displayed in the map.

•  Select the Default Image Scaling percentage to increase or decrease the final image size.
             Note: Increasing the scale will increase the status map generation time.

•  Select the Default Layout Method for drawing the status map.

•  Click on Save Settings button at the bottom of the page.



Accessing Graphs

To access graphs in NodeWatch:

•  Go to the NodeWatch home page and click a Node Alias link or Group link to display the corresponding response time graph.

•  See Chapter 2: Features Common to All Modules, Graphs section for details on the graphs page.

Reading Historical Response Time Graphs

By default, graphs show the thirty most recent data points. For example, if the test frequency of a particular node is set to one minute, the graph spans the most recent half hour of response times.

Node Group Graph Interpretation

Each data point is the average of the response times of each node in that group.
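
The two rules above (thirty data points per graph, and group points as averages) can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that only nodes that returned a response time contribute to a group data point, which the manual does not specify.

```python
from statistics import mean


def graph_span_minutes(check_frequency_minutes: int, points: int = 30) -> int:
    """Time covered by a default graph of the most recent data points."""
    return points * check_frequency_minutes


def group_data_point(node_response_times_ms: list) -> float:
    """One group graph point: the average response time of the group's nodes."""
    return mean(node_response_times_ms)


print(graph_span_minutes(1))                 # 30
print(group_data_point([10.0, 20.0, 30.0]))  # 20.0
```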

Shifting the Time Scale

•  Click either the go backward [<] or go forward [>] button beneath the graph to view a different time frame.

Zooming

•  Click the zoom out icon (magnifying glass with a minus sign in the middle) to expand the time frame shown. For example, if the graph shows a 30-minute duration on the x-axis (hour:min), click the zoom out icon to display hours (24-hour period), click it again to display days (1-month period), and click again to display months (1-year period).

•  To zoom in, point to the body of the graph; a magnifying glass icon with a plus sign in the middle appears. Click at the point you want to zoom in on.



Node Summary Data

Summary data for a node or a group is shown to the right of the graph. The summary contains the following data:

Group
This is the Group Alias of the group to which the node belongs.

Current Status
Reports whether the node is up or down, with a time stamp for when this state began.

Last Response Time
The response time from the most recent test.

Current Uptime
The time span the node has been up: the time since the last down state, or since the first test if the node has never been down. Zero is reported if the node has been down for two or more consecutive tests (that is, for a time equal to or greater than the check frequency).

      ** Important: A node must test down for two consecutive tests for Current Uptime to be zero. A single test reporting a down status will not cause Current Uptime to be zero.

     ** Note: If the Chroniker host server is turned off or goes down, or Chroniker is shut down, this does not count as downtime, and the counter for Current Uptime is not restarted.

Current Downtime
The time span the node has been down. Zero is reported if the node is up.

Last Uptime
The date and time of the last time the node came back to the up state or the time of the first test if the node has never been down.

Last Failure
The date and time of the last time the node went down. It is blank if the node has never been down.

Total Uptime
The total time the node has been up from the first test.

Total Downtime
The total time that node has been down from the first test.

Uptime Ratio
The percentage of time the node has been up.

Max. Response Time for [current month]
The longest response time reported this month.

Min. Response Time for [current month]
The shortest response time reported this month.

Avg. Response Time for [current month]
The average response time this month.

No. of Failures in [current month]
Number of node failures reported this month.

Last Checked At
When the most recent check was made.

Next Check Scheduled At
When the next check will happen.



NodeWatch Reports

Recent Failures

Click on the Recent Failures button on the NodeWatch home page to view recent failures. This displays a list of node failures with the most recent at the top; the time stamp and node alias are listed for each.

The recent failures list is purged periodically. The purge schedule for Recent Failures follows the “Purge Real-time data after X days” setting, which is edited on the Global Options home page, reached by clicking the Setup link in the main top bar.

Active Alerts

Click the Active Alerts button on the NodeWatch home page to view Active Alerts. Active Alerts shows at a glance what is currently exceeding thresholds. The table shows the Alert Time, the Alias, the Group, and a message.

The colored ball indicates the alert status of the group or individual test.

Alert Status Color Codes

Blue ball icon - informational alert

Yellow ball icon - warning alert

Orange ball icon - error alert

Red circle white arrow icon - means "down" or "not found"

Grey circle with black question mark icon - means unknown status because the parent node is down



Top N Reports

Select a number N (10, 15, 25, or 50) from the left-hand drop down list, then click the radio button for a particular test result statistic to get the top N nodes for that statistic.

Daily / Monthly / Yearly Report

This report displays Uptime and Performance information in three different time frames: daily, monthly, and yearly. Columns show the Uptime% or the Performance in milliseconds for the time frames of Business Hours or Twenty Four Seven (24/7).

Uptime Percentage and Performance in Milliseconds
Uptime Percentage is the numerical value for the amount of time the test is active.

Performance (ms)
This measurement is broken down by column into either the 24/7 time period or Business Hours.

Business Hours
Uptime% during the hours of 9:00 a.m. to 5:00 p.m., Monday through Friday. This reference to business hours is not related to any profiles you may have defined. These hours are built into Chroniker and cannot be changed.

24/7
Total uptime% without any time restrictions placed on it.
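
The Business Hours window defined above can be expressed as a simple check. This Python sketch is illustrative only: the function name in_business_hours is hypothetical, and it assumes that 9:00 a.m. is included, 5:00 p.m. is excluded, and times are in the local time of the Chroniker server.

```python
from datetime import datetime, time


def in_business_hours(moment: datetime) -> bool:
    """True for Monday through Friday, from 9:00 a.m. up to (not including) 5:00 p.m."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and time(9, 0) <= moment.time() < time(17, 0)


print(in_business_hours(datetime(2004, 12, 9, 14, 5)))   # True (a Thursday afternoon)
print(in_business_hours(datetime(2004, 12, 11, 14, 5)))  # False (a Saturday)
```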



Service Level Agreement (SLA)

Displays test Uptime and Performance information in three different time frames: daily, weekly, and monthly.

Uptime Percentage and Performance in Milliseconds

Uptime Percentage
The numerical value for the amount of time the test is active.

Performance (ms)
This is the time it takes to get a response from an active test.

These columns are split into Today and Yesterday for the Day, and Current and Previous for the Week and Month.

A high uptime percentage is desirable, while a low performance number (i.e. response time) is desirable. When comparing today to yesterday, or the current week or month to the previous one, green arrows mean good and red arrows mean bad. Each arrow points up or down to show the direction of change within its own category.

For example, a green up arrow for today's uptime of 98% compared to yesterday's 69% is good. A red up arrow for today's performance of 1000.24 compared to yesterday's 50.12 is bad.
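
The arrow logic in the SLA report can be summarized in a short Python sketch. This is a reading of the description above rather than Chroniker's code: the function name sla_indicator is hypothetical, and the behavior when the two values are equal is an assumption, since the manual does not describe that case.

```python
def sla_indicator(metric: str, current: float, previous: float) -> tuple:
    """Return (arrow direction, color) for an SLA comparison.

    metric is "uptime" (higher is better) or "performance" (lower is better).
    """
    if metric not in ("uptime", "performance"):
        raise ValueError("metric must be 'uptime' or 'performance'")
    if current == previous:
        return ("none", "none")  # assumption: no arrow when nothing changed
    direction = "up" if current > previous else "down"
    higher_is_better = metric == "uptime"
    improved = (current > previous) == higher_is_better
    return (direction, "green" if improved else "red")


print(sla_indicator("uptime", 98.0, 69.0))          # ('up', 'green')
print(sla_indicator("performance", 1000.24, 50.12))  # ('up', 'red')
```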



Node Alert Details Report

Node Alert Details give users information on the Nodes that have most recently gone down.

Nodes
This is the name of the Node that went down.

Alert Duration
The amount of time that the node was down.

Alert Started
This is the time at which the node was first measured as being down.

Alert Ended
This is the time at which the node was measured as being up again.

Alert Message
This tells whether it was down or could not be found and what parameter it exceeded.



Exporting Reports

Reports generated from the Reports page can be exported. To export a report to Excel:

•  Create the report you wish to export.

•  Click the Export to Excel button.

•  Choose to either Open the report immediately or Save the report. To save the report, specify the path to save the file to.

E-Mailing Reports

Reports generated from the Reports page can be e-mailed. Note that the recipient must have HTML enabled for their inbox. To e-mail a report:

•  Create the report you wish to e-mail.

•  Click the E-Mail this Report button.

•  Type the name and e-mail address of the recipient in the proper fields.

•  Type your name and E-mail address in the proper fields.

•  Type a message you wish to accompany the report in the body of the E-mail.

•  Click the Send E-mail button.




