Description
Small notes
I'm running this on the latest versions of PyTorch and Lightning, which requires one small change to the code in training_utils.py: removing line 385, `accelerator_connector._register_external_accelerators_and_strategies()`. This change has no effect on the issue described below.
MPS and Apple Silicon
MPS (Metal Performance Shaders) is, roughly speaking, Apple's equivalent of CUDA; see the official documentation. On Apple Silicon there is no distinction between VRAM and regular RAM, since memory is unified and shared, so I will simply call it RAM here.
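For context, selecting the MPS backend in PyTorch looks like the following minimal sketch (standard PyTorch API, with a CPU fallback so it also runs on non-Apple machines):

```python
import torch

# Use the MPS (Metal) device when available, otherwise fall back to CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

x = torch.ones(2, 2, device=device)
print(device, x.sum().item())
```

On Apple Silicon, tensors placed on this device live in the same unified RAM pool as everything else, which is why a leak shows up as overall system memory pressure rather than as exhausted VRAM.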
The issue
While training an acoustic model, DiffSinger uses an abnormally high amount of RAM, and worse, usage keeps growing without leveling off, which suggests a memory leak somewhere in the code. I don't believe the cause is my dataset or my config, since I have trained with both on CUDA without issue. There are some known PyTorch memory-leak issues on MPS, but since I don't know the AI side of the code or Python as a language, I can't confirm whether they apply here: pytorch/pytorch#154329, pytorch/pytorch#145374
If someone could add PyTorch memory profiling, or some other way of seeing where this issue may be, that would be greatly appreciated. I know that most of the dev team doesn't have Macs, and that I'm the first person to really try training on one, so I want to help as much as I can.
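In case it helps whoever picks this up: recent PyTorch versions expose a couple of MPS allocator counters that could be logged once per training step (for example from a Lightning callback). The helper below is my own rough sketch, not anything that exists in the DiffSinger codebase; a steadily climbing `allocated` value across steps would point at tensors PyTorch itself is retaining, while a climbing `driver` value with flat `allocated` would point at the Metal driver or a lower-level leak:

```python
import torch

def mps_memory_snapshot(tag: str = "") -> int:
    """Print MPS allocator stats and return bytes currently allocated.

    Returns 0 when the MPS backend is unavailable (e.g. non-Apple hardware),
    so the call is safe to leave in cross-platform training code.
    """
    if not (hasattr(torch.backends, "mps") and torch.backends.mps.is_available()):
        return 0
    allocated = torch.mps.current_allocated_memory()  # bytes held by PyTorch tensors
    driver = torch.mps.driver_allocated_memory()      # total bytes held by the Metal driver
    print(f"[{tag}] allocated={allocated / 2**20:.1f} MiB  "
          f"driver={driver / 2**20:.1f} MiB")
    return allocated

# Hypothetical usage: call once per step, e.g. from a Lightning
# on_train_batch_end hook, and watch whether the numbers keep growing.
snap = mps_memory_snapshot("after-step")
```

`torch.mps.empty_cache()` could also be called periodically to rule out simple cache growth versus a true leak.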
