deeb00 commented Apr 17, 2025

Hey everyone! 👋

First of all, thanks for this awesome exporter - it's been super helpful in monitoring our infrastructure!

While using it, I noticed we were missing some important RDS metrics that could give us better insights into our database performance. So I decided to add them! Here's what's new:

CPU & Credit Metrics:

  • BurstBalance - to track gp2 burst-bucket I/O credits
  • CPUCreditBalance & CPUCreditUsage - for monitoring CPU credits
  • CPUSurplusCreditBalance & CPUSurplusCreditsCharged - for tracking surplus CPU credits

EBS & Storage Metrics:

  • EBSByteBalance% & EBSIOBalance% - for monitoring the remaining EBS throughput and I/O burst credits
  • DiskQueueDepth - to track outstanding I/O operations
  • ReadLatency & WriteLatency - for measuring disk I/O performance

Network & Replication:

  • NetworkReceiveThroughput & NetworkTransmitThroughput - for network traffic monitoring
  • OldestReplicationSlotLag - to track replication delays
  • CheckpointLag - for monitoring WAL consistency
  • TransactionLogsGeneration - to track transaction log activity

These additions should give us a more complete picture of our RDS instances' performance and resource utilization. Let me know if you'd like any adjustments or have questions!

Cheers! 🚀

deeb00 force-pushed the feature/add-new-rds-metrics branch 2 times, most recently from 301bdb1 to 0a5ff13 on April 17, 2025 13:49
- Add BurstBalance metric for gp2 burst-bucket I/O credits
- Add CheckpointLag metric for WAL data consistency
- Add CPU credit metrics (Balance, Usage, Surplus)
- Add DiskQueueDepth metric for outstanding IOs
- Add EBS metrics (ByteBalance, IOBalance)
- Add Network metrics (Receive/Transmit Throughput)
- Add Replication metrics (OldestSlotLag)
- Add Latency metrics (Read/Write)
- Add TransactionLogsGeneration metric

Signed-off-by: Semyon Koshel <[email protected]>
deeb00 force-pushed the feature/add-new-rds-metrics branch from 0a5ff13 to 04ef515 on April 17, 2025 13:51
qfritz self-assigned this Apr 17, 2025

serge-r commented Apr 18, 2025

Good job! We need these metrics too and currently use YACE for them. It would be nice to have the full set of metrics in one exporter.

…tch queries

- Remove % symbol from query IDs while preserving original metric names
- Fix EBSByteBalance% and EBSIOBalance% metrics collection

Signed-off-by: Semyon Koshel <[email protected]>

deeb00 commented Apr 21, 2025

Hey! 👋

I did another round of testing after submitting the PR and noticed that the metrics weren't being collected. Here's what I saw in the debug output:
{"time":"2025-04-21T13:54:00.978701+07:00","level":"DEBUG","msg":"cloudwatch metrics fetched","metrics":{"Instances":null}}

Turns out CloudWatch restricts the query IDs used in metric data queries: an ID may contain only letters, numbers, and underscores, and must start with a lowercase letter, so it can't include %. But we still need to keep the % in the actual metric names when making API calls.

I've fixed this by:

  • Stripping % from query IDs
  • Keeping the original metric names with % for API calls
  • Making sure the responses are processed correctly

Now all the EBS balance metrics (EBSByteBalance% and EBSIOBalance%) should be collected properly!
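
For anyone curious, the idea can be sketched roughly as follows. This is an illustrative Python sketch, not the exporter's actual code: `query_id`, `build_queries`, and the `db-1` instance identifier are hypothetical names, and it assumes one DB instance per batch and metric names that begin with a letter. The dictionaries follow the shape of CloudWatch `MetricDataQueries` entries.

```python
import re

# Hypothetical helpers for illustration; not taken from the exporter's source.
_INVALID_ID_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def query_id(metric_name: str) -> str:
    """Derive a query Id from a metric name by dropping characters such as %."""
    cleaned = _INVALID_ID_CHARS.sub("", metric_name)
    # Query Ids must start with a lowercase letter.
    # Assumption: the metric name itself starts with a letter.
    return cleaned[:1].lower() + cleaned[1:]


def build_queries(metric_names, instance_id, period=60, stat="Average"):
    """Return (queries, id_to_name) for a single DB instance."""
    queries = []
    id_to_name = {}
    for name in metric_names:
        qid = query_id(name)
        if qid in id_to_name:
            raise ValueError(f"query id collision for {name!r}")
        id_to_name[qid] = name
        queries.append({
            "Id": qid,  # sanitized: no % allowed here
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/RDS",
                    "MetricName": name,  # original name, % preserved
                    "Dimensions": [
                        {"Name": "DBInstanceIdentifier", "Value": instance_id},
                    ],
                },
                "Period": period,
                "Stat": stat,
            },
        })
    return queries, id_to_name


queries, id_to_name = build_queries(["EBSByteBalance%", "EBSIOBalance%"], "db-1")
```

When the results come back, each one carries the sanitized Id, so `id_to_name[result_id]` recovers the original metric name (with the %) for labeling.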

deeb00 force-pushed the feature/add-new-rds-metrics branch from d62b56b to a55f0e2 on April 23, 2025 10:17
deeb00 force-pushed the feature/add-new-rds-metrics branch from 938ab04 to 4f8be76 on April 24, 2025 09:43
…loudwatch API throttling

Signed-off-by: Semyon Koshel <[email protected]>
deeb00 force-pushed the feature/add-new-rds-metrics branch from 4f8be76 to 59fe136 on April 24, 2025 09:57