Monitoring
To open the Monitoring dashboard, click Monitoring in the side menu bar. The dashboard gives a high-level view of the state of your Kazuhm resources and of any ongoing compute jobs in your Kazuhm instance.
Data Tiles
Data Tiles show the latest usage metrics from your environment running Kazuhm.
Metrics displayed in the Data Tiles are real-time snapshots and hence are unaffected by selecting different date ranges.
Note that all metrics other than GPU Usage are measured at the host level and therefore reflect ALL activity on the host, not only Kazuhm activity.
Connected Hosts - data tile displaying the total number of hosts currently included in your Kazuhm instance.
CPUs Available - data tile displaying the total number of CPUs available within your Kazuhm instance across all Connected Hosts.
CPU Usage - expressed as a percentage of average CPU, this data tile represents the average CPU consumption across all Connected Hosts in your Kazuhm instance.
Available GPU RAM - data tile displaying the total amount of GPU RAM available within your Kazuhm instance across all Connected (GPU-capable) Hosts.
GPU Usage - expressed as a percentage of average GPU, this data tile represents the average GPU consumption across all GPU-capable Connected Hosts in your Kazuhm instance.
Memory Available - data tile displaying the total amount of RAM available within your Kazuhm instance across all Connected Hosts.
Memory Usage - expressed as a percentage of total memory, this data tile represents the average memory usage across all Connected Hosts in your Kazuhm instance.
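The relationship between per-host readings and the tile values described above can be sketched as follows. This is an illustration only, not Kazuhm's implementation: the HostSnapshot type, its field names, and the summarize function are hypothetical, and the sketch assumes that totals are summed and that usage percentages are a simple unweighted average across hosts.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class HostSnapshot:
    """Hypothetical per-host reading; field names are illustrative only."""
    name: str
    cpus: int
    cpu_usage_pct: float
    memory_gb: float
    memory_usage_pct: float


def summarize(hosts):
    """Aggregate per-host readings into tile-style values.

    Assumption: totals are summed across hosts and usage figures are
    unweighted averages of the per-host percentages.
    """
    if not hosts:
        # No connected hosts: report zeros rather than averaging nothing.
        return {
            "connected_hosts": 0,
            "cpus_available": 0,
            "cpu_usage_pct": 0.0,
            "memory_available_gb": 0.0,
            "memory_usage_pct": 0.0,
        }
    return {
        "connected_hosts": len(hosts),
        "cpus_available": sum(h.cpus for h in hosts),
        "cpu_usage_pct": mean(h.cpu_usage_pct for h in hosts),
        "memory_available_gb": sum(h.memory_gb for h in hosts),
        "memory_usage_pct": mean(h.memory_usage_pct for h in hosts),
    }


# Example with two made-up hosts.
hosts = [
    HostSnapshot("host-a", cpus=4, cpu_usage_pct=20.0, memory_gb=16.0, memory_usage_pct=50.0),
    HostSnapshot("host-b", cpus=8, cpu_usage_pct=40.0, memory_gb=32.0, memory_usage_pct=30.0),
]
tiles = summarize(hosts)
```

Because the readings are taken at the host level, a host that is busy with non-Kazuhm work raises the averaged usage figures in the same way as a host running Kazuhm jobs.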
Graphs
Graphs provide insight into past trends on resource consumption.
Note that all metrics are measured at the host level and therefore reflect ALL activity on the host, not only Kazuhm activity.
CPU Usage - expressed as a percentage of average CPU, this graph tracks the average CPU consumption across all Connected Hosts in your Kazuhm instance.
GPU Usage - expressed as a percentage of average GPU, this graph tracks the average GPU consumption across all Connected Hosts in your Kazuhm instance.
Memory Usage - expressed in GB, this graph tracks the average memory usage across all Connected Hosts in your Kazuhm instance.
Disk Usage - expressed as a percentage of available disk space, this graph tracks the average disk usage across all Connected Hosts in your Kazuhm instance.
Hosts - this graph tracks the total number of hosts included in your Kazuhm instance over time, giving visibility into when hosts are added or removed.
Network Traffic In - expressed in kilobytes per second, this graph tracks the total network traffic coming into all Connected Hosts in your Kazuhm instance.
Network Traffic Out - expressed in kilobytes per second, this graph tracks the total network traffic going out from all Connected Hosts in your Kazuhm instance.
Data Selection
Hovering the cursor over a graph opens a dialog box showing the data point under the cursor and its timestamp. The dialog box follows the cursor as it moves across the graph, showing further data points and giving more detail on the trend being observed.
Date Range
Selecting a date range via the available filters shows the most recent data points for that interval. For example, selecting the 30 Days filter expands the range of the graphs to show the trend for the last 30 days.
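Conceptually, a date range filter keeps only the data points whose timestamps fall within the chosen number of days before the present. The sketch below illustrates that idea on made-up data. The select_range function and the sample points are hypothetical and do not describe how the dashboard stores or queries its data.

```python
from datetime import datetime, timedelta


def select_range(points, days, now):
    """Return the (timestamp, value) pairs from the last `days` days.

    `points` is a list of (datetime, float) tuples. `now` is passed in
    explicitly so the result is reproducible.
    """
    cutoff = now - timedelta(days=days)
    return [(ts, value) for ts, value in points if cutoff <= ts <= now]


# Made-up CPU usage samples (percent).
now = datetime(2024, 1, 31)
points = [
    (datetime(2023, 12, 1), 35.0),
    (datetime(2024, 1, 15), 42.0),
    (datetime(2024, 1, 30), 51.0),
]

last_30_days = select_range(points, days=30, now=now)
last_7_days = select_range(points, days=7, now=now)
```

Here the 30-day selection keeps the two January samples, while the 7-day selection keeps only the most recent one.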
Data Refresh
The refresh button will reset and refresh all graph data.