
How to run the server on GPU? #553

@moseshu

Description


My launch command is below, but the server is not running on the GPU. How do I run the server on the GPU? Also, the generated output is short; it seems that n_ctx doesn't work.

python3 -m llama_cpp.server --model ggml-model-f16.bin --port 7777 --host 192.168.0.1 --n_gpu_layers 30 --n_threads 4 --n_ctx 2048
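A likely cause, assuming llama-cpp-python was installed from the default PyPI wheel, is that the package was built without GPU support, in which case --n_gpu_layers has no effect. A common fix (assuming an NVIDIA GPU with the CUDA toolkit installed) is to reinstall with cuBLAS enabled:

CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python

After restarting the server, the startup log should report BLAS = 1 and a line along the lines of "offloaded 30/N layers to GPU" if offloading is actually active.

On the short output: n_ctx only sets the context window size; the generated length is capped by the per-request max_tokens parameter, which defaults to a small value. A sketch of a request that sets it explicitly (host/port taken from the command above, prompt is a placeholder):

curl http://192.168.0.1:7777/v1/completions -H "Content-Type: application/json" -d '{"prompt": "Hello", "max_tokens": 512}'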
