Monitoring Metrics and Performance Insights
The Monitoring tab provides a filter to view Performance Insights, OS metrics, DB metrics, or a combination of all three. It also includes a search bar to quickly locate a specific metric dashboard. Metrics can be manually refreshed using the Refresh button at the top right.
Viewing the Monitoring Dashboard
Select a time interval of 1h, 3h, 6h, 12h, or 24h to display data.
Default interval: 1h
Graphs show averaged values over the selected period
For a custom duration:
Select Custom
Choose the desired date and time
Click Apply
Granularity
The Granularity dropdown at the top right of the Monitoring tab allows you to control the time interval at which the metrics are displayed. You can select from the following options:
1 Min - Displays metrics at 1-minute intervals.
5 Mins - Displays metrics at 5-minute intervals.
15 Mins - Displays metrics at 15-minute intervals.
30 Mins - Displays metrics at 30-minute intervals.
1 Hour - Displays metrics aggregated over 1-hour intervals.
Metrics Visualization Features
The graphs display the most recent metrics for your database services, logically grouped into OS Metrics, DB Metrics, and Performance Insights. By default, the graphs represent the monitoring metrics of the current primary node.
For services with HA/RR/DR configurations, you can select a specific node to view its corresponding metrics.
Each graph provides the following capabilities:
Full-Screen View
To view a graph in expanded full-screen mode, click the full-screen icon located in the top-right corner.
A new window will open, displaying the graph with metrics available at granular time intervals such as 1 min, 5 min, 15 min, 30 min, and 1 hour. Users can also choose the statistic type as Average (Avg), Minimum (Min), or Maximum (Max).
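To illustrate how granularity and the statistic type interact, the following is a minimal sketch of rolling 1-minute samples up into coarser buckets. It is not Tessell's implementation; the function name downsample and the sample values are hypothetical.

```python
from statistics import mean

def downsample(samples, bucket_minutes, stat="avg"):
    """Roll (minute, value) samples up into fixed-size time buckets.

    samples: list of (minute, value) pairs, with minute counted from the
             start of the selected window.
    stat:    "avg", "min", or "max", mirroring the statistic types above.
    """
    reducers = {"avg": mean, "min": min, "max": max}
    reduce_fn = reducers[stat]
    buckets = {}
    for minute, value in samples:
        # Align each sample to the start of its bucket.
        start = (minute // bucket_minutes) * bucket_minutes
        buckets.setdefault(start, []).append(value)
    return {start: reduce_fn(values) for start, values in sorted(buckets.items())}

# Hypothetical CPU usage (%) sampled once per minute.
cpu = [(0, 10.0), (1, 20.0), (2, 90.0), (3, 20.0), (4, 10.0), (5, 40.0)]
print(downsample(cpu, 5, "avg"))  # {0: 30.0, 5: 40.0}
print(downsample(cpu, 5, "max"))  # {0: 90.0, 5: 40.0}
```

In this example the 90% spike at minute 2 is averaged away at 5-minute granularity but remains visible with the Max statistic, which is why a finer granularity or Max is often more useful when looking for short spikes.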
Zoom for Detailed Analysis
To examine a specific section of the graph in detail, use the selection zoom option from the top-right corner.
Click and drag over the desired area in the graph to zoom in and view more granular metric data.
Drag and Explore
Use the crosshair cursor to select and drag across a specific portion of the graph for focused analysis.
Download Options
To download the graph, use the menu option available in the top-right corner. The graph can be exported in the following formats:
SVG / PNG: Downloads the graph as an image, preserving its current visual representation.
CSV: Downloads the underlying metric data in a tabular format, suitable for analysis in spreadsheet tools (for example, Excel).
Service Availability
The Service Availability graph displays database service uptime, with green indicating availability and gray representing unavailability.
OS Metrics
CPU Usage :
This graph displays the percentage of CPU resources utilized on the database instance.
System Load Average :
This graph displays the average number of processes that are actively using CPU or waiting for resources on the database instance.
Values are reported over 1, 5, and 15 minutes and are expressed as a count.
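As a rough aid for reading the three load average values, the sketch below normalizes the 1-minute value by the number of CPU cores and compares the three windows to infer a trend. The function name assess_load and the per-core limit of 1.0 are assumptions (a common rule of thumb), not Tessell settings.

```python
def assess_load(load_1m, load_5m, load_15m, cpu_count, per_core_limit=1.0):
    """Summarize 1-, 5-, and 15-minute load averages for a host.

    per_core_limit is a hypothetical threshold: a load above roughly one
    runnable process per core suggests that processes are queuing.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    per_core = load_1m / cpu_count
    if load_1m > load_5m > load_15m:
        trend = "rising"
    elif load_1m < load_5m < load_15m:
        trend = "falling"
    else:
        trend = "mixed"
    return {
        "per_core": per_core,
        "saturated": per_core > per_core_limit,
        "trend": trend,
    }

# Hypothetical values for a 4-core instance.
print(assess_load(6.0, 4.0, 2.0, cpu_count=4))
```

A 1-minute value higher than the 15-minute value suggests that load is increasing; the reverse suggests that a spike is subsiding.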
Memory Usage :
This graph displays the amount of used and available memory on the database instance, measured in GiB.
Swap Usage :
This graph displays the amount of swap space used and available on the database instance, measured in GiB.
Swap usage indicates memory pressure when physical memory is insufficient.
Filesystem - Root :
This graph shows the used and available space on the root filesystem of the database instance, measured in GiB.
It typically includes OS files, OS logs, DB logs, and system binaries.
Filesystem - Data :
This graph displays the used and available space on the data filesystem where database data files are stored, measured in GiB.
Filesystem - Archive :
This graph displays the used and available space on the archive filesystem, measured in GiB.
This filesystem is typically used for storing binary logs.
Filesystem - DB Software :
This graph displays the used and available space on the filesystem where database software binaries and related components are installed, measured in GiB.
Throughput - Data disk :
This graph displays the rate of data read from and written to the data disk per second, measured in either MiB/s or KiB/s.
It reflects the volume of I/O operations in terms of data transfer.
Throughput - Archive disk :
This graph displays the rate of data read from and written to the archive disk per second, measured in either MiB/s or KiB/s.
IOPS - Data disk :
This graph displays the number of read and write I/O operations performed per second on the data disk.
It indicates how frequently the disk is being accessed.
IOPS - Archive disk :
This graph displays the number of read and write I/O operations performed per second on the archive disk.
Network Usage :
This graph displays the rate of data transmitted and received over the network interface, measured in either MiB/s or KiB/s.
It reflects both customer database traffic and Tessell traffic used for monitoring and replication.
Top Processes by CPU :
This section lists the processes consuming the highest CPU resources on the database instance, helping identify CPU-intensive workloads.
The table has three columns: Process ID (PID), Processes, and the average CPU usage as a percentage. These process details are independent of the selected time range; only current data is shown, not historical data.
Top Processes by Memory :
This section lists the processes consuming the most memory on the database instance, helping identify memory-intensive applications or potential memory leaks.
The table has three columns: Process ID (PID), Processes, and the average memory usage as a percentage. These process details are independent of the selected time range; only current data is shown, not historical data.
DB Metrics
Connections
This graph displays the number of client connections to the database instance, measured in count.
Max Used Connections : Maximum number of concurrent connections that have been used since the database was started.
Total Connections : Total number of connections (both active and inactive) established to the database.
Active Connections : Number of connections currently in use or executing queries.
Queries
This graph shows the rate of query execution on the database, measured in queries per second. It helps you understand the workload type (read-heavy versus write-heavy).
Questions: Total number of queries executed per second.
SELECT: Number of SELECT statements executed per second.
INSERT: Number of INSERT statements executed per second.
UPDATE: Number of UPDATE statements executed per second.
DELETE: Number of DELETE statements executed per second.
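One way to characterize the workload type from these series is to compare the SELECT rate with the combined INSERT, UPDATE, and DELETE rates. The sketch below assumes you have read the four per-second values from the graph; the function name read_write_mix and the numbers are hypothetical.

```python
def read_write_mix(select_qps, insert_qps, update_qps, delete_qps):
    """Return the read and write share of DML traffic as percentages.

    Returns None when there is no SELECT/INSERT/UPDATE/DELETE traffic.
    """
    writes = insert_qps + update_qps + delete_qps
    total = select_qps + writes
    if total == 0:
        return None
    return {
        "read_pct": 100.0 * select_qps / total,
        "write_pct": 100.0 * writes / total,
    }

# Hypothetical rates read from the Queries graph.
print(read_write_mix(800, 100, 75, 25))  # {'read_pct': 80.0, 'write_pct': 20.0}
```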
IO
This graph displays buffer pool activity. It is measured in operations per second. High disk reads may indicate poor cache efficiency.
Buffer Pool Reads: Number of logical reads served from memory (buffer pool).
Buffer Pool Writes: Number of logical writes made to the InnoDB buffer pool. High values indicate heavy write activity. If the value of the status variable Innodb_buffer_pool_wait_free is high relative to this metric, it suggests the buffer pool is too small and is struggling to flush dirty pages, causing writes to wait.
Disk Reads: Physical reads from disk (when data is not in memory).
InnoDB Writes
This graph shows write-related activity in the InnoDB storage engine, measured in writes per second. It indicates the write workload and the durability overhead.
Data Writes: Number of write operations performed to data files on disk.
Double Writes: Number of write operations performed to the doublewrite buffer on disk for crash safety. Before writing a page to its actual location in the data files, InnoDB first writes it to this doublewrite buffer for data protection.
Log Writes: Number of write operations performed to the InnoDB redo log files (transaction log) on disk.
Buffer Pool Utilization
This graph shows how much of the InnoDB buffer pool is in use, measured as a percentage.
Low utilization may indicate an oversized buffer pool, or poorly optimized queries that frequently evict data from the buffer pool and cause more disk reads.
High utilization is generally expected. If utilization is very high and the buffer pool hit ratio stays high, the system is likely running fine. If the hit ratio starts to drop while utilization remains high, the current buffer pool size is likely too small for your workload.
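The buffer pool hit ratio is commonly derived from two MySQL status variables: Innodb_buffer_pool_read_requests (logical read requests) and Innodb_buffer_pool_reads (reads that had to go to disk). The sketch below shows that calculation over a sampling interval. How these variables map onto the graph series in the IO section is an assumption here, and the function name is hypothetical.

```python
def buffer_pool_hit_ratio(read_requests, disk_reads):
    """Percentage of logical reads served from the buffer pool.

    read_requests: increase in Innodb_buffer_pool_read_requests over the interval.
    disk_reads:    increase in Innodb_buffer_pool_reads over the same interval.
    Returns None when there were no read requests.
    """
    if read_requests == 0:
        return None
    return 100.0 * (read_requests - disk_reads) / read_requests

# Hypothetical deltas over one interval.
ratio = buffer_pool_hit_ratio(read_requests=1_000_000, disk_reads=5_000)
print(f"{ratio:.2f}%")  # 99.50%
```

Using deltas over an interval rather than the raw counters avoids the ratio being dominated by activity since the last restart.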
Error Counts
This graph displays the number of SQL statement errors encountered by the database, measured in count. These errors include constraint violations, table-not-found errors, deadlocks, timeout failures, permission errors, and so on. A persistent increase indicates application or query issues.
Open Tables
This graph shows the number of tables currently open in the MySQL server's table cache. Higher values indicate active workload. If Open_tables is equal to your table_open_cache setting, it means your table cache is full. If you also see the Opened_tables status variable growing rapidly over time, you should consider increasing your table_open_cache to improve performance.
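The check described above can be sketched as follows, assuming you have the current Open_tables value, the table_open_cache setting, and two readings of Opened_tables taken some seconds apart. The function name and the max_opens_per_second threshold are hypothetical; what counts as "growing rapidly" depends on your workload.

```python
def table_cache_check(open_tables, table_open_cache,
                      opened_tables_before, opened_tables_after,
                      interval_seconds, max_opens_per_second=1.0):
    """Flag a table cache that is full and churning.

    max_opens_per_second is an illustrative threshold, not a MySQL or
    Tessell default.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    cache_full = open_tables >= table_open_cache
    opens_per_second = (opened_tables_after - opened_tables_before) / interval_seconds
    return {
        "cache_full": cache_full,
        "opens_per_second": opens_per_second,
        "consider_increase": cache_full and opens_per_second > max_opens_per_second,
    }

# Hypothetical readings taken 60 seconds apart.
print(table_cache_check(4000, 4000, 120_000, 120_600, interval_seconds=60))
```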
Temp Tables
This graph displays the number of temporary tables created by the MySQL server, measured in tables created per second. This includes both in-memory and on-disk temporary tables.
Average Row Lock Time
This graph represents the average time spent waiting to acquire row-level locks, measured in milliseconds. High values indicate high lock contention.
Aborted Clients
This graph displays the number of client connections that were aborted unexpectedly, measured in count. Causes include clients that disconnected improperly, connection timeouts, network issues, and clients that sent packets larger than the max_allowed_packet limit.
Aborted Connects
This metric represents the number of failed connection attempts to the database, measured in count. This includes cases such as authentication failures, connection handshake errors, client disconnects before authentication, network issues, etc.
DB Uptime
This metric shows the total time elapsed since the database instance was last started, measured in seconds. It is useful for identifying DB restarts.
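Because uptime only increases while the database is running, a drop between two consecutive samples implies a restart in between. The sketch below applies that idea to exported samples; the function name and the (timestamp, uptime) layout are assumptions for illustration.

```python
def find_restarts(uptime_samples):
    """Return approximate start times of restarts seen in the samples.

    uptime_samples: list of (timestamp_seconds, uptime_seconds) pairs in
                    chronological order.
    """
    restarts = []
    for (_, prev_uptime), (ts, uptime) in zip(uptime_samples, uptime_samples[1:]):
        if uptime < prev_uptime:
            # Uptime went backwards, so the instance started about
            # `uptime` seconds before this sample was taken.
            restarts.append(ts - uptime)
    return restarts

# Hypothetical samples taken every 60 seconds.
samples = [(1000, 500), (1060, 560), (1120, 20), (1180, 80)]
print(find_restarts(samples))  # [1100]
```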
Node Status
This metric indicates the availability state of the database node. It is mainly used in clustered or replicated environments for monitoring node health.
1 (ONLINE): Node is up and serving requests
0 (OFFLINE): Node is down or not reachable
-1 (NA): Node status is not applicable to a standalone DB instance
Node Read-only
This metric indicates whether the database node is operating in read-only or read-write mode. It is useful for validating failover status and monitoring the role (primary/replica).
1 (Read-only): Node accepts only read operations (typically replica)
0 (Read-write): Node accepts both read and write operations (typically primary)
Replication Lag
This metric represents the amount of time, measured in seconds, by which the replica node is lagging behind the source (primary) database instance.
A higher replication lag indicates that changes executed on the source database are taking longer to be applied on the replica. This can impact read consistency and failover readiness in replication environments.
Typical causes of replication lag include:
high write activity on the source database
long-running queries on the replica
insufficient CPU, memory, or I/O resources
network latency between source and replica
Interpretation:
0: Replica is fully synchronized with the source database
>0: Replica is delayed by the displayed number of seconds
-1: Replication failure or replication thread is not running on the replica node
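The interpretation of the three value ranges can be expressed as a small helper, for example when post-processing exported metric data. This is a sketch; the function name and the returned wording are hypothetical.

```python
def describe_replication_lag(lag_seconds):
    """Map a Replication Lag value to the interpretation listed above."""
    if lag_seconds == -1:
        return "replication failure or replication thread not running"
    if lag_seconds == 0:
        return "replica fully synchronized with the source"
    if lag_seconds > 0:
        return f"replica delayed by {lag_seconds} seconds"
    # Any other negative value is not defined by the metric.
    raise ValueError(f"unexpected replication lag value: {lag_seconds}")

for value in (0, 42, -1):
    print(value, "->", describe_replication_lag(value))
```

Treating -1 as a distinct state rather than as a small number matters for alerting: a threshold rule such as "lag greater than 60" would never fire on a stopped replication thread.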
Performance Insights
Tessell Performance Insights is a database performance monitoring tool that allows you to assess and analyze the load on the database within a specified timeframe. It helps you identify bottlenecks and pinpoint areas where performance improvements are needed.
To enable performance insights, ensure that you create a Monitoring Infra in the Monitoring Performance Insights Infrastructure application under the Infrastructure Management app family.
After the monitoring infrastructure is deployed, this feature can be optionally enabled for each database service, either at the time of provisioning or later through the Performance Insights option under the Settings tab for existing DB services.
Database Load
Database load measures the level of session activity in the database. The key metric in Performance Insights is DBLoad, which is collected every second. The unit of database load is Average Active Sessions (AAS), the average number of active sessions over a specific timeframe.
Active Sessions
A session is active when it is either running on CPU or waiting for a resource to become available so that it can proceed. For example, an active session might wait for a page (or block) to be read into memory, and then consume CPU while it reads data from the page.
Average Active Sessions
It measures how many database sessions are concurrently active, on average, within a given timeframe.
Every second, Performance Insights samples the number of sessions concurrently running a query. For each active session, Performance Insights collects the following data:
Wait - Session state (running on CPU or waiting)
SQL statement
Host
User running the SQL
Database on which the SQL is running
Performance Insights calculates the AAS by dividing the total number of active sessions counted across all samples by the number of samples for a specific time period.
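The calculation can be sketched as follows. This is an illustration of the arithmetic, not Tessell's implementation: each sample is modeled as the list of sessions found active in one second, and the session fields and wait names are hypothetical.

```python
from collections import Counter

def average_active_sessions(samples):
    """AAS = total active sessions across all samples / number of samples."""
    if not samples:
        return 0.0
    total_active = sum(len(sample) for sample in samples)
    return total_active / len(samples)

def load_by_wait(samples):
    """Split the AAS by the session state recorded for each active session."""
    if not samples:
        return {}
    counts = Counter(session["wait"] for sample in samples for session in sample)
    return {wait: n / len(samples) for wait, n in counts.items()}

# Four hypothetical one-second samples.
samples = [
    [{"wait": "CPU"}, {"wait": "io"}],
    [{"wait": "CPU"}],
    [],
    [{"wait": "CPU"}, {"wait": "io"}, {"wait": "lock"}],
]
print(average_active_sessions(samples))  # 1.5
print(load_by_wait(samples))             # {'CPU': 0.75, 'io': 0.5, 'lock': 0.25}
```

The per-wait values add up to the overall AAS. Grouping the same samples by SQL statement, host, user, or database instead of by wait gives the corresponding breakdown along those dimensions.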
Top Dimensions
Top dimensions are the dimensions of the data corresponding to the DB Load within a given timeframe. These include:
Top Waits: Shows where the database is actually spending time in a given timeframe.
Top SQLs: Shows the top SQL statements that are contributing to the DB load in a given timeframe.
Top Hosts: Shows the top client hosts that are contributing most of the DB load on the database in a given timeframe.
Top Users: Shows the top database users that are contributing most of the DB load on the database in a given timeframe.
Top Databases: Shows the top databases that are contributing most of the DB load on a multi-tenant database service in a given timeframe.
Data Retention for Performance Insights
By default, Performance Insights includes 20 days of performance data history. If you want to extend the data retention, you can reach out to Tessell Support.
Viewing the Performance Insights Dashboard
Click View Detailed Insights to view the Performance Insights Dashboard in full-screen view.
Select a time interval of 1h, 3h, 6h, 12h, or 24h to display the data for that duration on the graph. The default time interval is 1h.
For a custom duration, select the Custom option, choose the desired date and time, and click Apply.