-
Notifications
You must be signed in to change notification settings - Fork 355
FEAT: (NVIDIA) support multiple gpus status individually #369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughThree GPU monitoring shell scripts changed output formatting from plain numeric strings to pipe-delimited, labeled strings for NVIDIA GPU paths; logic and gating remain unchanged. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/gpu_usage.sh (1)
46-46: Verify compatibility withnormalize_percent_lenfor pipe-delimited format.The change from a plain percentage to pipe-delimited format
|<percentage>%|is appropriate for multi-GPU support. Confirm that thenormalize_percent_lenfunction (line 54) correctly handles the new format with embedded pipes for multiple values.Additionally, consider adding error handling if
nvidia-smifails or returns no output, as the script currently has no guard against malformed input.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
scripts/gpu_power.sh(1 hunks)scripts/gpu_ram_info.sh(1 hunks)scripts/gpu_usage.sh(1 hunks)
🔇 Additional comments (2)
scripts/gpu_power.sh (1)
47-47: Clarify output format:echo "Power"inclusion in usage variable may create malformed output.Lines 47 and 49 include
echo "Power"in the command chain that sets theusagevariable. This produces a multi-line output where "Power" appears on the first line and the pipe-delimited metrics on subsequent lines. When echoed inmain()(line 74), this produces:GPU Power\n|<values>|, which differs from the pipe-delimited format in other scripts.Is the "Power" label intentional as part of the usage output, or should it be handled separately? Please verify this matches the intended output format shown in the PR screenshots.
Also applies to: 49-49
scripts/gpu_ram_info.sh (1)
45-45: Add explicit field separator for CSV parsing in awk to improve robustness and clarity.Line 45 uses awk without specifying a field separator when parsing CSV output from nvidia-smi. While the implicit numeric coercion of
$0currently works, explicitly setting-F ', *'and using$1instead of$0follows CSV parsing best practices and prevents fragility if input format varies. Apply the suggested diff to improve code maintainability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (1)
scripts/gpu_power.sh (1)
47-47: Fix awk field reference in power percentage calculation.Line 47 uses
$0 / $2 * 100in the awk expression, but with-F ', *'field separator,$0is the entire input line. The percentage should divide the first field (power draw) by the second field (power limit). Change$0 / $2 * 100to$1 / $2 * 100.This issue was flagged in a prior review but the
$0reference was not corrected.- usage=$(echo "Power" && nvidia-smi --query-gpu=power.draw,power.limit --format=csv,noheader,nounits | awk -F ', *' '{ printf("|%d%%", $0 / $2 * 100) }' && echo "|") + usage=$(echo "Power" && nvidia-smi --query-gpu=power.draw,power.limit --format=csv,noheader,nounits | awk -F ', *' '{ printf("|%d%%", $1 / $2 * 100) }' && echo "|")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
scripts/gpu_power.sh(1 hunks)scripts/gpu_ram_info.sh(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scripts/gpu_ram_info.sh (1)
scripts/utils.sh (2)
normalize_percent_len(26-35)get_tmux_option(3-12)
🪛 Shellcheck (0.11.0)
scripts/gpu_ram_info.sh
[warning] 50-50: used_accuracy appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 51-51: total_accuracy appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (2)
scripts/gpu_power.sh (1)
49-49: Field references in wattage format are correct.Line 49 correctly uses
$1and$2in the printf format string. No changes needed here.scripts/gpu_ram_info.sh (1)
50-51: Verify variable usage forused_accuracyandtotal_accuracy.Lines 50–51 assign
used_accuracyandtotal_accuracyfrom tmux options. Confirm whether these variables are referenced after these lines (particularly in line 52's printf format). If unused, remove the assignments; if intended for formatting, apply them to the output.
Relevant PRs
|
Hello!
I want to suggest to show gpus status independently from the experience that each gpu runs different jobs, if there are multiple gpus. (especially when users share the machine)
Please refer the attached image below.
It still support a one gpu as well.
Any additional suggestions are welcome.
Thank you.