torch.cuda.memory_stats¶
-
torch.cuda.
memory_stats
(device=None)[source]¶ Returns a dictionary of CUDA memory allocator statistics for a given device.
The return value of this function is a dictionary of statistics, each of which is a non-negative integer.
Core statistics:
"allocated.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: number of allocation requests received by the memory allocator."allocated_bytes.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: amount of allocated memory."segment.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: number of reserved segments fromcudaMalloc()
."reserved_bytes.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: amount of reserved memory."active.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: number of active memory blocks."active_bytes.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: amount of active memory."inactive_split.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: number of inactive, non-releasable memory blocks."inactive_split_bytes.{all,large_pool,small_pool}.{current,peak,allocated,freed}"
: amount of inactive, non-releasable memory.
For these core statistics, values are broken down as follows.
Pool type:
all
: combined statistics across all memory pools.large_pool
: statistics for the large allocation pool (as of October 2019, for size >= 1MB allocations).small_pool
: statistics for the small allocation pool (as of October 2019, for size < 1MB allocations).
Metric type:
current
: current value of this metric.peak
: maximum value of this metric.allocated
: historical total increase in this metric.freed
: historical total decrease in this metric.
In addition to the core statistics, we also provide some simple event counters:
"num_alloc_retries"
: number of failedcudaMalloc
calls that result in a cache flush and retry."num_ooms"
: number of out-of-memory errors thrown.
The caching allocator can be configured via ENV to not split blocks larger than a defined size (see Memory Management section of the Cuda Semantics documentation). This helps avoid memory framentation but may have a performance penalty. Additional outputs to assist with tuning and evaluating impact:
"max_split_size"
: blocks above this size will not be split."oversize_allocations.{current,peak,allocated,freed}"
: number of over-size allocation requests received by the memory allocator."oversize_segments.{current,peak,allocated,freed}"
: number of over-size reserved segments fromcudaMalloc()
.
- Parameters
device (torch.device or int, optional) – selected device. Returns statistics for the current device, given by
current_device()
, ifdevice
isNone
(default).
Note
See Memory management for more details about GPU memory management.