
CUDA Out of Memory #18

@ArezooGhodsifard

Description


Hello,

I'm encountering a CUDA Out of Memory (OOM) error when PyTorch tries to allocate an additional 768.00 MiB for model inference, even though only about 2 GiB appears to be in use on my NVIDIA GeForce RTX 3060 (6 GB total capacity). The exact warning message is as follows:

UserWarning: cuda device found but got the error CUDA out of memory. Tried to allocate 768.00 MiB (GPU 0; 5.80 GiB total capacity; 1.95 GiB already allocated; 356.75 MiB free; 2.00 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF - using CPU for inference

This occurs when I launch my application with Uvicorn; the model registrations performed at startup appear to trigger the failing allocation. My environment is a Conda environment on Ubuntu with CUDA 11.7 and a PyTorch build compatible with that CUDA version.

Could you provide insights or suggestions on how to manage or mitigate this OOM issue? Are there recommended practices for memory management or configurations specific to PyTorch that I should consider to optimize GPU memory usage and avoid hitting this limit?
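In case it helps frame the question, here is a minimal sketch of what I understand the warning to be suggesting, i.e. configuring the allocator via `PYTORCH_CUDA_ALLOC_CONF` before CUDA is initialized. The `128` value is purely illustrative, not a recommended setting, and the `model`/`batch` names in the comments are hypothetical:

```python
import os

# Allocator tweak mentioned in the warning: cap the block split size to
# reduce fragmentation. The 128 MiB value is an illustrative starting
# point; this must be set before the first CUDA allocation (i.e. before
# torch initializes its CUDA context).
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# Once torch is imported, running inference without autograd state and
# releasing cached blocks between requests can also lower peak usage:
# import torch
# with torch.no_grad():
#     output = model(batch)      # hypothetical model and input
# torch.cuda.empty_cache()       # return cached blocks to the driver
```

Is this the kind of configuration you would recommend here, or are there other settings (batch size, precision, per-worker model loading under Uvicorn) that matter more in practice?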

Thank you for your support.
