fix: Metal backend #150

PABannier · 2024-04-16T21:37:23Z

This PR allows users to use the Metal (macOS) and cuBLAS backends by:

Exposing the n_gpu_layers parameter in the CLI
Using the Metal backend in the forward pass
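
For readers unfamiliar with the flag, here is a minimal sketch of how an n_gpu_layers option could be read from the command line. It is an illustration only, not the code in this PR: the struct cli_params, the function parse_ngl, and the long form --n-gpu-layers are hypothetical names. The short form -ngl is the one used in the command shown later in this thread.

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical parameter struct; the real project may organise this differently.
struct cli_params {
    int n_gpu_layers = 0;  // 0 is assumed here to mean "keep everything on the CPU"
};

// Hypothetical helper: scans the arguments for -ngl (or the assumed long form
// --n-gpu-layers) and stores the integer that follows it.
// Returns false if the value is missing or is not a valid integer.
bool parse_ngl(const std::vector<std::string>& args, cli_params& params) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "-ngl" || args[i] == "--n-gpu-layers") {
            if (i + 1 >= args.size()) {
                return false;  // flag given without a value
            }
            try {
                params.n_gpu_layers = std::stoi(args[++i]);
            } catch (const std::exception&) {
                return false;  // value was not an integer or was out of range
            }
        }
    }
    return true;
}
```

Under these assumptions, calling parse_ngl with the arguments -ngl 100 would leave params.n_gpu_layers set to 100, and the backend would then decide how many layers to offload.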

siraben · 2024-04-19T17:52:59Z

After it creates the tokens and runs ggml_metal_init, I get this:

ggml_metal_init: GPU name:   Apple M1 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 21845.34 MB
ggml_metal_init: maxTransferRate               = built-in GPU
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =    54.36 MB, (   54.98 / 21845.34)
encodec_load_model_weights: model size =    44.36 MB
encodec_load_model: n_q = 32
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =   314.06 MB, (  369.05 / 21845.34)
encodec_eval: compute buffer size: 314.05 MB

ggml_metal_graph_compute_block_invoke: error: node   0, op =   REPEAT not implemented
GGML_ASSERT: /Users/siraben/Git/bark.cpp/encodec.cpp/ggml/src/ggml-metal.m:1428: false
ggml_metal_graph_compute_block_invoke: error: node 4677, op = MAP_CUSTOM2_F32 not implemented
[1]    9701 abort      ./examples/main/main -ngl 100 -t 8 -m ./ggml_weights/ggml_weights.bin -em  -p

PABannier · 2024-04-20T13:48:53Z

Hello @siraben !
Indeed, it seems that some operations (e.g., repeat, which is used to broadcast computations) do not have a corresponding Metal kernel implemented in ggml. I'll open a PR to implement them.
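
To make the missing operation concrete, here is a rough CPU sketch of what a repeat-style broadcast computes: the source is tiled until it fills the destination shape. This is not ggml's implementation and not the Metal kernel; repeat_2d is a hypothetical helper, and it assumes a row-major 2-D layout where each destination dimension is a whole multiple of the corresponding source dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ggml: tiles a row-major (rows x cols)
// matrix so that it fills a (dst_rows x dst_cols) matrix.
// Assumes each destination dimension is a multiple of the source dimension.
std::vector<float> repeat_2d(const std::vector<float>& src,
                             std::size_t rows, std::size_t cols,
                             std::size_t dst_rows, std::size_t dst_cols) {
    assert(rows > 0 && cols > 0);
    assert(src.size() == rows * cols);
    assert(dst_rows % rows == 0 && dst_cols % cols == 0);

    std::vector<float> dst(dst_rows * dst_cols);
    for (std::size_t r = 0; r < dst_rows; ++r) {
        for (std::size_t c = 0; c < dst_cols; ++c) {
            // Wrap the destination index back into the source with a modulo.
            dst[r * dst_cols + c] = src[(r % rows) * cols + (c % cols)];
        }
    }
    return dst;
}
```

A GPU kernel for the same operation would typically assign one thread per destination element and apply the same modulo indexing.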

normatovjj · 2024-04-23T23:50:53Z

When I try to run cmake -DGGML_CUBLAS=ON .. I get:

CMake Warning at encodec.cpp/ggml/src/CMakeLists.txt:219 (message):
  cuBLAS not found

normatovjj · 2024-04-26T02:52:28Z

When I try to run cmake -DGGML_CUBLAS=ON .. I get:
CMake Warning at encodec.cpp/ggml/src/CMakeLists.txt:219 (message):
  cuBLAS not found

I also tried CMAKE_ARGS='-DLLAMA_CUBLAS=on' cmake .. and applied all the changes proposed in this pull request, but without success.

PABannier added 2 commits April 16, 2024 23:35

Exposed n_gpu_layers for Metal backend + build works

c8449db

Merge branch 'main' into fix_metal

d4431a9
