Elk: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques (Artifact)

1. Abstract

This is the artifact of the Elk paper, which is accepted at MICRO'2025. In this artifact, we provide the source code of Elk's compilation, simulation, and evaluation framework. Then, we guide readers to explore how Elk improves the model serving performance on a variety of ICCA chips (i.e., Figure 17–24 in our paper). To run this artifact, please use a Linux machine with at least 200 GB of main memory and at least 20 GB of disk space.

2. Artifact Checklist

Algorithm: Inductive tensor operator scheduling, cost-aware on-chip memory allocation, and ICCA chip design space exploration.
Neural Network Models: Llama2-13B, Gemma2-27B, OPT-30B, Llama2-70B, and DIT-XL. Their execution graphs are included in the repo.
Run-time environment: Ubuntu 20.04 or newer, Python 3.10.
Metrics: Execution time, hardware utilization.
Output: Trace files and result figures.
Experiments: Generate experiments using supplied scripts.
How much main memory required (approximately): 200 GB
How much disk space required (approximately): 20 GB
How much time to prepare workflow (approximately): 10 minutes
How much time to complete experiments (approximately): 30 hours on a machine with 64 CPU cores and 200 GB main memory.

3. Description

3.1 Hardware Dependencies

The Elk simulation and evaluation framework can run on any x86 machine with at least 200 GB of main memory and at least 20 GB of disk space.

3.2 Software Dependencies

The framework needs a Linux environment (preferably Ubuntu) with Python 3.10 installed.

4. Installation

Start by downloading the Elk artifact from GitHub:

git clone https://github.com/platformxlab/elk.git
cd elk

Please make sure all prerequisites are successfully installed:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.10 tmux -y
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
python3.10 -m pip install -r requirements.txt

5. Experiment Workflow

To compile DL models into programs and obtain program execution traces from the Elk simulator, we provide a one-click script benchmark_scripts/generate_data_from_sim.py for you to launch all test cases in one place. However, the script may take more than 30 hours to finish, so we recommend using tmux:

tmux

Then, run the one-click script within the new tmux terminal:

python3.10 benchmark_scripts/generate_data_from_sim.py

To return from the tmux terminal without pausing the script, press Ctrl+B and then press D on your keyboard. To attach back to the original tmux terminal where the script is running, use:

tmux attach -t 0

For more tips on using tmux, refer to https://tmuxcheatsheet.com.

5.1 Handle Errors

If the script encounters an error, the most common cause is that the artifact runs on too many CPU cores and overflows the main memory. In such events, (1) go to launch.py, (2) change the CORE_REDUCE macro in line 22 to a larger value (e.g., CORE_REDUCE=8), and (3) rerun the script:

python3.10 benchmark_scripts/generate_data_from_sim.py

The script should automatically skip any completed test cases and resume from the failed one.

6. Evaluation and Expected Results

After the completion of all experiments, please run the following script to evaluate the results:

./run_artifact_eval_graph_gen.sh

This script gathers all data from the execution trace and draws all figures. To verify the results, one can compare the generated figures with those in the paper.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
benchmark_scripts		benchmark_scripts
models		models
.gitignore		.gitignore
DNNModels.py		DNNModels.py
DNNProgram.py		DNNProgram.py
OpPartitionSearch.py		OpPartitionSearch.py
README.md		README.md
TensorExpression.py		TensorExpression.py
fast_perms.py		fast_perms.py
launch.py		launch.py
predictor.py		predictor.py
requirements.txt		requirements.txt
run_artifact_eval_graph_gen.sh		run_artifact_eval_graph_gen.sh
run_mem_monitor.sh		run_mem_monitor.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Elk: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques (Artifact)

1. Abstract

2. Artifact Checklist

3. Description

3.1 Hardware Dependencies

3.2 Software Dependencies

4. Installation

5. Experiment Workflow

5.1 Handle Errors

6. Evaluation and Expected Results

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

platformxlab/elk

Folders and files

Latest commit

History

Repository files navigation

Elk: Exploring the Efficiency of Inter-core Connected AI Chips with Deep Learning Compiler Techniques (Artifact)

1. Abstract

2. Artifact Checklist

3. Description

3.1 Hardware Dependencies

3.2 Software Dependencies

4. Installation

5. Experiment Workflow

5.1 Handle Errors

6. Evaluation and Expected Results

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages