
Conversation

@jianyicheng
Collaborator

No description provided.

@JeffreyWong20

It seems like it fails because there is no space left on the device.

@jianyicheng
Collaborator Author

Oops... there are two options:

  1. remove unnecessary packages in the Dockerfile to save space
  2. build and push the container upstream offline (on your local machine)
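For option 1, the usual space savers are apt and pip cache cleanup. A minimal sketch follows; these are generic cleanup commands, not taken from the repo's actual Dockerfile, and the script just prints candidate `RUN`-line commands for review:

```shell
#!/bin/sh
# Sketch for option 1 above: common Dockerfile space savers. These are
# generic cleanup commands (assumptions, not the repo's real Dockerfile);
# they are printed here for review rather than executed.
set -eu
for CMD in \
    "apt-get clean" \
    "rm -rf /var/lib/apt/lists/*" \
    "pip cache purge"
do
    echo "RUN ${CMD}"
done
```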

@Aaron-Zhao123
Collaborator

Somehow it is building the GPU Docker image, which triggers the whole CUDA install and then exceeds the size limit?

@Aaron-Zhao123
Collaborator

Aaron-Zhao123 commented Dec 7, 2025

@JeffreyWong20 would you mind running something along the lines of:

TAG=$(date -u +%Y%m%d%H%M%S) docker build --no-cache --build-arg VHLS_PATH=/mnt/applications/Xilinx/23.1 --build-arg VHLS_VERSION=2023.1 -f Dockerfile-cpu --tag docker.io/aaronyirenzhao/mase-cpu:$TAG .

You may need your own Docker ID/password.

Then you can run docker push.

This basically mirrors:

docker build --no-cache --build-arg VHLS_PATH=/mnt/applications/Xilinx/23.1 \

Then, if you fix the mase repository to point to that Docker Hub image, things should work.
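Putting the steps above together, here is a minimal sketch of the build-and-push sequence. `DOCKER_USER` defaults to the ID from the command above; substitute your own Docker Hub ID and credentials. The script prints the commands instead of executing them, so it is safe to review first:

```shell
#!/bin/sh
# Sketch of the build-and-push flow described above. DOCKER_USER is an
# assumption; replace it with your own Docker Hub ID before running the
# printed commands (docker push will ask for your credentials).
set -eu
DOCKER_USER="${DOCKER_USER:-aaronyirenzhao}"
TAG=$(date -u +%Y%m%d%H%M%S)        # UTC timestamp tag, 14 digits
IMAGE="docker.io/${DOCKER_USER}/mase-cpu:${TAG}"

# Print each step for review; copy-paste (or pipe to sh) to execute.
echo "docker build --no-cache \\"
echo "  --build-arg VHLS_PATH=/mnt/applications/Xilinx/23.1 \\"
echo "  --build-arg VHLS_VERSION=2023.1 \\"
echo "  -f Dockerfile-cpu --tag ${IMAGE} ."
echo "docker push ${IMAGE}"
```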

@bingleilou
Collaborator

Emm... I have now figured out why the Docker image was hitting the space limit, and it turns out MLIR wasn't the problem at all.

The issue is that although we install the CPU version of PyTorch at the start, the subsequent mase installation selects a default CUDA-based version that overwrites the original CPU one.

I have temporarily replaced the image with bingleilou/mase-docker-cpu:latest to bypass this. The proper solution is to remove the CUDA torch and all NVIDIA-related packages and reinstall a CPU torch.
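The fix described above can be sketched roughly as follows. The `nvidia-*` package pattern is an assumption about what the CUDA torch drags in; the index URL is PyTorch's official CPU wheel index. The commands are printed for review rather than executed:

```shell
#!/bin/sh
# Sketch of the fix above: drop the CUDA torch and the nvidia-* wheels it
# pulls in, then reinstall the CPU-only torch wheel. Commands are printed
# for review, not executed.
set -eu
CPU_INDEX="https://download.pytorch.org/whl/cpu"   # PyTorch's CPU wheel index
echo "pip uninstall -y torch"
echo "pip list --format=freeze | grep '^nvidia-' | cut -d'=' -f1 | xargs -r pip uninstall -y"
echo "pip install --no-cache-dir torch --index-url ${CPU_INDEX}"
```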

