@PABannier
Owner

This PR lets users run on the Metal (macOS) and cuBLAS backends by:

  • Exposing the n_gpu_layers parameter in the CLI
  • Using the Metal backend in the forward pass
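For illustration, exposing such a parameter in a CLI could look roughly like the sketch below. It is not the code from this PR: `cli_params` and `parse_cli` are hypothetical names, and the only flags handled are the ones that appear in the command line quoted in this thread (`-ngl`, `-t`, `-m`).

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical parameter struct; the field names are illustrative only
// and are not taken from the actual project sources.
struct cli_params {
    int n_threads = 4;
    int n_gpu_layers = 0;     // assumption: 0 means "keep everything on the CPU"
    std::string model_path;
};

// Parses "-ngl <n>", "-t <n>" and "-m <path>".
// Returns false on an unknown flag or a flag with no value after it.
bool parse_cli(const std::vector<std::string>& args, cli_params& params) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        const bool has_value = i + 1 < args.size();
        if (arg == "-ngl" && has_value) {
            params.n_gpu_layers = std::atoi(args[++i].c_str());
        } else if (arg == "-t" && has_value) {
            params.n_threads = std::atoi(args[++i].c_str());
        } else if (arg == "-m" && has_value) {
            params.model_path = args[++i];
        } else {
            return false;
        }
    }
    return true;
}
```

The parsed `n_gpu_layers` value would then be passed to the model loader, which decides which backend to initialise.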

@siraben

siraben commented Apr 19, 2024

After it creates the tokens and runs ggml_metal_init, I get this:

ggml_metal_init: GPU name:   Apple M1 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 21845.34 MB
ggml_metal_init: maxTransferRate               = built-in GPU
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =    54.36 MB, (   54.98 / 21845.34)
encodec_load_model_weights: model size =    44.36 MB
encodec_load_model: n_q = 32
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =   314.06 MB, (  369.05 / 21845.34)
encodec_eval: compute buffer size: 314.05 MB

ggml_metal_graph_compute_block_invoke: error: node   0, op =   REPEAT not implemented
GGML_ASSERT: /Users/siraben/Git/bark.cpp/encodec.cpp/ggml/src/ggml-metal.m:1428: false
ggml_metal_graph_compute_block_invoke: error: node 4677, op = MAP_CUSTOM2_F32 not implemented
[1]    9701 abort      ./examples/main/main -ngl 100 -t 8 -m ./ggml_weights/ggml_weights.bin -em  -p

@PABannier
Owner Author

Hello @siraben!
Indeed, it seems that some operations (e.g., repeat, which is used to broadcast computations) do not yet have a corresponding Metal kernel in ggml. I'll open a PR to implement them.
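To make concrete what a repeat kernel has to compute, here is a minimal CPU sketch of the broadcast for the 2D case. It is a standalone illustration, not ggml's implementation: `repeat_2d` is a hypothetical helper, and it assumes row-major storage and destination dimensions that are whole multiples of the source dimensions.

```cpp
#include <cstddef>
#include <vector>

// Tiles a row-major (src_rows x src_cols) matrix into a
// (dst_rows x dst_cols) matrix by repeating it along both axes.
// Returns an empty vector if the shapes are inconsistent or the
// destination dimensions are not multiples of the source dimensions.
std::vector<float> repeat_2d(const std::vector<float>& src,
                             std::size_t src_rows, std::size_t src_cols,
                             std::size_t dst_rows, std::size_t dst_cols) {
    if (src_rows == 0 || src_cols == 0 ||
        src.size() != src_rows * src_cols ||
        dst_rows % src_rows != 0 || dst_cols % src_cols != 0) {
        return {};
    }
    std::vector<float> dst(dst_rows * dst_cols);
    for (std::size_t r = 0; r < dst_rows; ++r) {
        for (std::size_t c = 0; c < dst_cols; ++c) {
            // Each destination element reads the source element at the
            // same position modulo the source shape.
            dst[r * dst_cols + c] = src[(r % src_rows) * src_cols + (c % src_cols)];
        }
    }
    return dst;
}
```

For example, repeating a 1 x 3 bias row into a 2 x 3 matrix copies the row twice, which is the typical pattern when a bias is added to every row of an activation. A GPU kernel would compute the same index mapping, with one thread per destination element or row.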

@normatovjj

When I try to run cmake -DGGML_CUBLAS=ON .. I get:

CMake Warning at encodec.cpp/ggml/src/CMakeLists.txt:219 (message):
  cuBLAS not found

I also tried CMAKE_ARGS='-DLLAMA_CUBLAS=on' cmake .. and applied all the changes proposed in this pull request, but without success.
