mmengine - INFO #1489
anaamansari
started this conversation in
General
Replies: 1 comment
Hi, I have a question about the output logs of the training script.
What are the units of time, data_time, and memory in the output below?
02/08 06:41:02 - mmengine - INFO - Epoch(train) [1][ 50/1931] lr: 7.9756e-05 eta: 7:52:46 time: 2.4590 data_time: 0.3534 memory: 42306 grad_norm: 3328.0067 loss: 634.0465 loss_heatmap: 613.3816 layer_-1_loss_cls: 7.7629 layer_-1_loss_bbox: 12.9020 matched_ious: 0.0015
02/08 06:42:41 - mmengine - INFO - Epoch(train) [1][ 100/1931] lr: 9.3103e-05 eta: 7:05:07 time: 1.9825 data_time: 0.0618 memory: 42732 grad_norm: 70.0995 loss: 23.1417 loss_heatmap: 7.4082 layer_-1_loss_cls: 5.3415 layer_-1_loss_bbox: 10.3920 matched_ious: 0.0288
Also, nvidia-smi shows that I am using about 78G of memory on an H100 GPU. Please let me know how the memory reported in the log relates to the memory reported by nvidia-smi.
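
For the time field, the two log lines above allow a rough consistency check. This is a minimal sketch, not something taken from the MMEngine documentation: it assumes the leading timestamps are wall-clock times and that 50 iterations separate the two lines (iteration 50 to iteration 100). The variable names are only illustrative.

```python
from datetime import datetime

# Wall-clock timestamps copied from the two log lines above
# (only the time-of-day part is needed, both lines are from 02/08).
FMT = "%H:%M:%S"
t_start = datetime.strptime("06:41:02", FMT)  # logged at iteration 50
t_end = datetime.strptime("06:42:41", FMT)    # logged at iteration 100
iters = 100 - 50

# Elapsed wall-clock seconds divided by the number of iterations in between.
elapsed_s = (t_end - t_start).total_seconds()
sec_per_iter = elapsed_s / iters

print(f"{elapsed_s:.0f} s over {iters} iterations = {sec_per_iter:.2f} s/iter")
# prints "99 s over 50 iterations = 1.98 s/iter"
```

The result of about 1.98 s per iteration is close to the logged time: 1.9825, which is consistent with time being measured in seconds per iteration. If that holds, data_time would presumably use the same unit, but that is an inference from the log rather than a confirmed fact. The log alone does not allow a similar check for memory.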