
How can I know how many resources my training is using? #10186

@JesusSilvaUtrera

Question

Hi, I wanted to know why the value of gpu_mem increases as the epochs pass instead of staying constant (or nearly constant). Every epoch uses the same batch size and input size, so the memory usage should be the same, shouldn't it?

I would also like to know how to determine how much RAM a training run needs, for example when deploying a service on a cluster to train models.
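One possible way to approach the RAM part is to run a short trial and record the peak resident memory of the process after each epoch. The following is only a minimal sketch under stated assumptions: it relies on Python's standard `resource` module (Linux or macOS only), and `run_training` is a hypothetical stand-in for the real training loop, not code from any library.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_training(epochs: int) -> list:
    """Hypothetical stand-in for a real training loop (assumption)."""
    peaks = []
    held = []  # keeps buffers alive so memory is not released between epochs
    for _ in range(epochs):
        # Allocate an 8 MiB buffer per epoch to imitate growing host memory use.
        held.append(bytearray(8 * 1024 * 1024))
        peaks.append(peak_rss_mib())
    return peaks


if __name__ == "__main__":
    for epoch, peak in enumerate(run_training(3)):
        print(f"epoch {epoch}: peak RSS {peak:.1f} MiB")
```

Because `ru_maxrss` is a high-water mark, the reported value can only stay flat or grow, so the figure after the last epoch gives a rough lower bound for the RAM to request on the cluster. If the training uses PyTorch, `torch.cuda.max_memory_allocated()` and `torch.cuda.memory_reserved()` report the GPU side; whether gpu_mem in the training log corresponds to either of them is an assumption here.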

Thanks in advance.

Additional

No response

Labels: Stale (stale and scheduled for closing soon), question (further information is requested)
