This repository showcases example workflows and code built specifically to take advantage of Cerebras’ ultra-fast inference capabilities. Whether you’re building agent pipelines or multi-turn tools, these examples demonstrate how to harness low-latency inference for more interactive and iterative LLM applications. For the full guide to each example in this repository, visit the cookbook section of our developer docs.
To get started, you’ll need access to the Cerebras Inference API and an API key. If you don't have one already, you can get one here.
Once you have your key, set it as an environment variable:
```bash
export CEREBRAS_API_KEY=<your API key>
```

For more resources, including production tips and the API reference, check out the Cerebras Inference documentation.
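To confirm your setup works, you can send a quick test request. The sketch below assumes the official Cerebras Python SDK (`pip install cerebras_cloud_sdk`); the model name is illustrative, so substitute any model available on your account.

```python
import os

from cerebras.cloud.sdk import Cerebras

# The client also reads CEREBRAS_API_KEY from the environment by default;
# it is passed explicitly here for clarity.
client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))

# Send a single-turn chat completion request.
response = client.chat.completions.create(
    model="llama3.1-8b",  # illustrative model name; pick one from your account
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)

print(response.choices[0].message.content)
```

If the request succeeds and prints a response, your key is configured correctly and you can move on to the examples.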