Optimize GPU usage in reward models

Some of the validators, including the test validator, intermittently hit CUDA out-of-memory (OOM) errors.

https://wandb.ai/opentensor-dev/openvalidators/runs/7p6prmo1/logs?workspace=user-opentensor-pedro

My initial hypothesis is that allocations accumulate on the GPU until memory reaches the limit. Since a validator is expected to run for days, it would be good to identify where GPU memory management can be improved so that we never reach the OOM point.
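One way to check the hypothesis is to log GPU memory at regular intervals and to make sure that nothing kept between steps still lives on the GPU. The snippet below is only a rough sketch and assumes the reward models run on PyTorch. The names `gpu_memory_mib`, `release_gpu_memory`, `to_host` and `run_steps` are hypothetical helpers and are not part of the validator code.

```python
import gc

try:
    import torch  # assumption: the reward models run on PyTorch
except ImportError:
    torch = None  # without PyTorch the helpers below degrade to no-ops


def cuda_available() -> bool:
    """True only when PyTorch is installed and a CUDA device is visible."""
    return torch is not None and torch.cuda.is_available()


def gpu_memory_mib() -> dict:
    """Return allocated and reserved CUDA memory in MiB (zeros without a GPU)."""
    if not cuda_available():
        return {"allocated": 0.0, "reserved": 0.0}
    mib = 1024 ** 2
    return {
        "allocated": torch.cuda.memory_allocated() / mib,
        "reserved": torch.cuda.memory_reserved() / mib,
    }


def release_gpu_memory() -> dict:
    """Collect unreachable Python objects, then release unused cached blocks."""
    gc.collect()
    if cuda_available():
        torch.cuda.empty_cache()
    return gpu_memory_mib()


def to_host(value):
    """Detach a tensor and copy it to the CPU so that keeping it does not hold GPU memory."""
    if torch is not None and isinstance(value, torch.Tensor):
        return value.detach().cpu()
    return value


def run_steps(step_fn, num_steps, log_every=100):
    """Hypothetical loop: store results on the host and log memory periodically."""
    history = []
    for step in range(num_steps):
        history.append(to_host(step_fn(step)))
        if (step + 1) % log_every == 0:
            print(f"step {step + 1}: {release_gpu_memory()}")
    return history
```

If the `allocated` value keeps growing across log lines, something is still holding references to GPU tensors. `torch.cuda.empty_cache()` only returns unused cached blocks, so it cannot free tensors that are still referenced. If gradients are not needed when scoring, running the reward models under `torch.no_grad()` also keeps autograd graphs from being retained.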
