@williamberman
Contributor

This will help with benchmarking, since we no longer have to include xformers in it :P

Testing the MOVQ:

from muse.modeling_movq import MOVQ
from PIL import Image
import numpy as np
import torch

torch.set_grad_enabled(False)
torch.manual_seed(0)

image = Image.open('input.png')

image = image.convert("RGB")
image = np.array(image)
image = image.astype(np.float32)
image = image / 255
image = image[None, :, :, :]
image = image.transpose(0, 3, 1, 2)
image = torch.from_numpy(image).to('cuda')

vae = MOVQ.from_pretrained("openMUSE/movq-lion-high-res-f8-16384")
vae.to('cuda')
vae.set_use_memory_efficient_attention_xformers(True) # comment out if running on this branch

out = vae(image)
out = out[0]

print(out.abs().sum())

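out = (out * 255).clamp(0, 255)  # keep values inside the valid 8-bit range
out = out.permute(0, 2, 3, 1)  # NCHW -> NHWC
out = out.cpu()
out = out.numpy()
out = out.astype(np.uint8)  # PIL RGB data is unsigned 8-bit; int8 overflows above 127
out = out[0, :, :, :]
out = Image.fromarray(out, mode='RGB')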
out.save('out.png')

torch native: tensor(63837.2969, device='cuda:0')
xformers: tensor(63861.8047, device='cuda:0')
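To put the two sums above in perspective, here is a minimal sketch that computes their relative difference. The helper name `relative_difference` is made up for illustration; the two numbers are copied from the printed results.

```python
def relative_difference(reference: float, other: float) -> float:
    """Return |other - reference| / |reference|."""
    return abs(other - reference) / abs(reference)


# Values printed by `out.abs().sum()` in the two runs above.
torch_native_sum = 63837.2969
xformers_sum = 63861.8047

rel = relative_difference(torch_native_sum, xformers_sum)
print(f"relative difference: {rel:.4%}")
```

The difference comes out at roughly 0.04% of the total, which is the kind of gap you would expect from two attention kernels that accumulate floating-point values in a different order.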

input:
man_in_forest

torch native out:
movq_out_torch_native

xformers out:
movq_out_xformers
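Eyeballing the two reconstructions works, but a per-pixel comparison is a quick sanity check too. The sketch below is not part of this PR: `pixel_diff_stats` is a hypothetical helper, and it runs on small synthetic arrays so that it stays self-contained. In practice you would pass the two saved outputs loaded with `np.array(Image.open(...))`.

```python
import numpy as np


def pixel_diff_stats(a: np.ndarray, b: np.ndarray) -> tuple[int, float]:
    """Return (max, mean) absolute per-channel difference of two uint8 images."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Widen to int16 first so the subtraction cannot wrap around in uint8.
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return int(diff.max()), float(diff.mean())


# Synthetic stand-ins for the torch native and xformers outputs.
native = np.full((4, 4, 3), 100, dtype=np.uint8)
other = native.copy()
other[0, 0, 0] = 103  # a single channel value that differs by 3

max_diff, mean_diff = pixel_diff_stats(native, other)
print(max_diff, mean_diff)
```

A max difference of a few intensity levels with a near-zero mean would suggest the two attention paths agree up to numerical noise.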

Testing the transformer:

from muse import PipelineMuse
import torch

torch.manual_seed(0)

model = "openMUSE/muse-cc12m-uvit-clip-130k"

pipe = PipelineMuse.from_pretrained(model).to("cuda")
pipe.transformer.set_use_memory_efficient_attention_xformers(True) # comment out if running on this branch

pipe("a person in the forest", timesteps=12)[0].save("out.png")

torch native:
transformer_torch_native

xformers:
transformer_xformers
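Since the motivation here is benchmarking, a rough timing harness for comparing the two attention paths could look like the sketch below. It is an assumption on my part, not code from this repo: `time_call` and its `sync` parameter are hypothetical names. For GPU runs you would likely pass `torch.cuda.synchronize` as `sync` so that queued kernels are included in the measurement.

```python
import time
from statistics import median
from typing import Callable, Optional


def time_call(
    fn: Callable[[], object],
    warmup: int = 2,
    iters: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the median wall-clock seconds of `fn` over `iters` timed runs."""
    # Warm-up runs are not timed; they absorb one-off costs such as caching.
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()  # wait for asynchronous work before stopping the clock
        samples.append(time.perf_counter() - start)
    return median(samples)


# CPU-only stand-in workload, used here just to exercise the helper.
elapsed = time_call(lambda: sum(range(10_000)))
print(f"median: {elapsed * 1e3:.3f} ms")
```

With the pipeline above, `fn` could be something like `lambda: pipe("a person in the forest", timesteps=12)`, run once with xformers enabled and once on this branch.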

@williamberman williamberman requested a review from patil-suraj May 30, 2023 22:56