72 commits
68089ce
build bug fix
alexfrater Jun 12, 2024
ba68158
SIM Wave Windows
alexfrater Jun 15, 2024
dd7c3ca
Temp Sys Path
alexfrater Jun 15, 2024
75a700d
Fixed Weight Bank Bug
alexfrater Jun 16, 2024
63dcaca
Simualtion Files and MLP model
alexfrater Jun 21, 2024
40dcb98
Debugging Full Simulation
alexfrater Jul 3, 2024
fd90818
functions to save JIT model and graph
alexfrater Jul 8, 2024
93b1b22
gnn model software integration in testbench
alexfrater Jul 8, 2024
d1729b1
9 jul
alexfrater Jul 9, 2024
a803129
nodeslot writeback monitor checker
alexfrater Jul 10, 2024
dfabf34
bug fix: empty weight FIFOs set to full and dumping undefined values …
alexfrater Jul 10, 2024
d50adb4
gating non-valid fp input values to prevent x-propogation
alexfrater Jul 10, 2024
239440c
fte writeback logic
alexfrater Jul 10, 2024
7539061
sim files
alexfrater Jul 10, 2024
91c17d0
bug generating 4 meshes
alexfrater Jul 11, 2024
4e9905e
bug generating 4 meshes 2
alexfrater Jul 11, 2024
5b8f72b
aggregation feature write count - enable feature counts of arbitrary …
alexfrater Jul 12, 2024
31b372d
correct feature input order
alexfrater Jul 12, 2024
f3217d1
testbench output data and size comparision
alexfrater Jul 12, 2024
38d071c
fix wrong neigbour message address
alexfrater Jul 15, 2024
f8227a7
allow writeback of non fully populated fte
alexfrater Jul 16, 2024
aaa4300
fast multiplication bug fix
alexfrater Jul 16, 2024
5577d9a
valid row fix for nodeslot count greater than tfe channel count (test…
alexfrater Jul 16, 2024
e2ebdac
simulation edits
alexfrater Jul 16, 2024
7b80cb2
bug fix buffer slot reset after transform
alexfrater Jul 17, 2024
5b99d67
fix bug last row of fte not reset
alexfrater Jul 17, 2024
36fadb4
backpressure enabled on agc feature aggregator and bug fix valid sign…
alexfrater Jul 17, 2024
f4c766c
mapping memory for varying feature widths in multi-layer mlp
alexfrater Jul 17, 2024
a65f167
FTE writeback to HBM
alexfrater Jul 18, 2024
6a3c4d9
multi layer weights mapping in sdk
alexfrater Jul 18, 2024
3fa3e83
reset row fifos between layers (MLP testbench passed)
alexfrater Jul 18, 2024
5ec5b34
read from previous layer features in multi-layer models
alexfrater Jul 18, 2024
45d46c0
replaced Xilinx IP for simulation with rtl models of aggregation and …
alexfrater Jul 19, 2024
52ecffa
verialtor make file
alexfrater Jul 19, 2024
27ed41a
linting actions
alexfrater Jul 21, 2024
626ccb2
verilator_linting
alexfrater Jul 21, 2024
6638d51
verilator --Wwarn
alexfrater Jul 21, 2024
10d51ec
changed dockerfile location
alexfrater Jul 21, 2024
c254de6
dockerfile 2
alexfrater Jul 21, 2024
c7e4095
lint test
alexfrater Jul 21, 2024
0133713
Werror verilator linting
alexfrater Jul 21, 2024
e5240be
lint2
alexfrater Jul 21, 2024
356a30e
lint3
alexfrater Jul 21, 2024
10061d7
Update verilog_linting.yml
alexfrater Jul 21, 2024
06f3833
fix cocotb modelsim error
alexfrater Jul 23, 2024
04d60e2
updated tb - fix bias removal for arbitrary layers
alexfrater Jul 23, 2024
735bf43
fixed gcn aggregation bug
alexfrater Jul 24, 2024
ba288a2
added layer config to enable aggregation (issues with some features i…
alexfrater Jul 24, 2024
be51854
fpga clock cycle benchmarking
alexfrater Jul 25, 2024
ef92b44
dynamically add workarea path
alexfrater Jul 25, 2024
40d7a42
requirments system agnostic
alexfrater Jul 25, 2024
2ec25f0
mkl service
alexfrater Jul 25, 2024
34e2c29
added model tb precision
alexfrater Jul 25, 2024
e0a3c69
added arguments to tb and changed logging levels
alexfrater Jul 25, 2024
c45750d
minor changes
alexfrater Jul 26, 2024
22233e6
cleaning up code 1
alexfrater Jul 27, 2024
333582d
cleaning up code 2
alexfrater Jul 27, 2024
5e1cc86
Merge pull request #3 from alexfrater/mlp-feature-branch
alexfrater Jul 27, 2024
68ad548
updated gitignore and READM
alexfrater Jul 27, 2024
72dbffb
remove imports
alexfrater Jul 27, 2024
660396c
working adj list offset for each layer
alexfrater Jul 28, 2024
0fb028f
removed redunant signals
alexfrater Jul 29, 2024
6a38e41
fix_bug
alexfrater Jul 29, 2024
e91769e
added precision to gcn_mlp
alexfrater Jul 29, 2024
cd8418f
updated sim
alexfrater Jul 29, 2024
04bef35
debugging
alexfrater Jul 29, 2024
6e522e3
more trials in gpu bmark
Jul 30, 2024
be024bb
brand publish
alexfrater Jul 30, 2024
6800fc1
temp fixes from float_silu branch to allow verilator compilation
alexfrater Jul 30, 2024
d66de93
Merge branch 'multilayer_models' of https://github.com/alexfrater/agi…
alexfrater Jul 30, 2024
5d74b9a
makefile
alexfrater Jul 30, 2024
a4c2de6
Merge branch 'edge_embeddings' into multilayer_models
alexfrater Jul 30, 2024
39 changes: 39 additions & 0 deletions .github/workflows/verilator-linting.yml
@@ -0,0 +1,39 @@
name: Verilog Code Linting with Verilator

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  workflow_dispatch:
    inputs:
      logLevel:
        description: 'Log level'
        required: true
        default: 'warning'
        type: choice
        options:
        - info
        - warning
        - debug

jobs:
  verilog-lint:
    runs-on: ubuntu-latest
    container:
      image: verilator/verilator:latest # Use your Docker image

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Run Verilator Linter
      run: |
        mkdir -p lint_reports
        find . -name "*.sv" -or -name "*.v" | xargs -I {} sh -c 'verilator --lint-only {} > lint_reports/$(basename {}).lint 2>&1 || true'

    - name: Upload lint reports
      uses: actions/upload-artifact@v4
      with:
        name: verilator-lint-reports
        path: lint_reports/
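
Review note: the lint step names each report after the `basename` of its source file, so two files that share a name in different directories would write to the same report. The Python sketch below shows one possible collision-free naming scheme. `report_name` is a hypothetical helper for illustration only and is not part of this PR.

```python
from pathlib import PurePosixPath


def report_name(src: str) -> str:
    """Map a source path to a unique report file name (hypothetical helper).

    Path components are joined with '__' so that files sharing a basename
    in different directories still get distinct reports.
    """
    parts = [p for p in PurePosixPath(src).parts if p not in (".", "/")]
    return "__".join(parts) + ".lint"


if __name__ == "__main__":
    # Two files with the same basename map to different report names.
    print(report_name("./hw/ip/a/rtl/top.sv"))
    print(report_name("./hw/ip/b/rtl/top.sv"))
```

The same idea could be expressed directly in the shell step by replacing `/` in the path instead of calling `basename`.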
47 changes: 47 additions & 0 deletions .github/workflows/verilog_linting.yml
@@ -0,0 +1,47 @@
name: Verilog Code Linting with Verible

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]
  workflow_dispatch:
    inputs:
      logLevel:
        description: 'Log level'
        required: true
        default: 'warning'
        type: choice
        options:
        - info
        - warning
        - debug

jobs:
  verilog-lint:
    runs-on: ubuntu-latest
    container:
      image: alexanderlwhite/ample-image:latest # Use your Docker image with Verilator and Verible

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Cache Verible Lint Configuration
      uses: actions/cache@v3
      with:
        path: .verible_lint.rules
        key: ${{ runner.os }}-verible-lint-config-${{ hashFiles('.verible_lint.rules') }}
        restore-keys: |
          ${{ runner.os }}-verible-lint-config-

    - name: Run Verible Linter
      run: |
        mkdir -p lint_reports
        find . -name "*.sv" -or -name "*.v" | xargs -I {} sh -c 'verible-verilog-lint --rules_config .verible_lint.rules {} > lint_reports/$(basename {}).lint || true'

    - name: Upload lint reports
      uses: actions/upload-artifact@v4
      with:
        name: verible-lint-reports
        path: lint_reports/
26 changes: 25 additions & 1 deletion .gitignore
@@ -9,6 +9,9 @@ vivado_*
*.runs
*.sim
*.tmp
*.pt
*.pth
*.csv

*.log
*.jou
@@ -18,9 +21,13 @@ vivado_*
*.vdb
*.sdb




# Simulation
xsim.dir/
.Xil/
.nfs*

vagrant/

@@ -42,6 +49,7 @@ graph_dump.txt
nodeslot_programming.json
layer_config.json
*.mem
Cora/

hw/build/
hw/sim/xsim.ini
@@ -53,6 +61,7 @@ hw/update_regbanks.tcl
# Modelsim simulation
sim_build/
modelsim_lib/
verilator_build/

# Cocotb
results.xml
@@ -61,4 +70,19 @@ transcript
*.wlf
compile.do
simulate.do
opt.do
opt.do


.vscode/


hw/sim/-debugDB
hw/sim/sim_cycles.txt
hw/sim/vsim.dbg
hw/sim/sim_time.txt
imports/verilog-axi
imports/nocrouter
imports/

hw/tb/module_tests/lib/test/*

18 changes: 18 additions & 0 deletions Dockerfile
@@ -0,0 +1,18 @@
FROM ubuntu:latest

# Install dependencies
RUN apt-get update && apt-get install -y \
wget \
tar \
xz-utils \
build-essential

# Install Verible
RUN apt-get update && apt-get upgrade -y \
&& wget https://github.com/chipsalliance/verible/releases/download/v0.0-3724-gdec56671/verible-v0.0-3724-gdec56671-linux-static-x86_64.tar.gz \
&& tar -xzf verible-v0.0-3724-gdec56671-linux-static-x86_64.tar.gz \
&& mv verible-v0.0-3724-gdec56671/bin/* /usr/local/bin/

CMD ["bash"]


15 changes: 14 additions & 1 deletion README.md
@@ -117,7 +117,7 @@ source $WORKAREA/scripts/build.sh
6. Generate the simulation payloads. For example, for the KarateClub dataset:

```bash
$WORKAREA/scripts/initialize.py --karate --gcn --payloads --random
python3 $WORKAREA/scripts/initialize.py --karate --gcn --payloads --random
```

7. Build the testbench.
@@ -131,6 +131,19 @@ make build
make sim GUI=1
```


8. Benchmark AMPLE against a CPU
```bash
make build
python3 $WORKAREA/scripts/initialize.py --karate --mlp --payloads --random --layers 10 --sim --cpu --tb_log_level INFO
```



Note: Defining ```SIMULATION``` in the ModelSim makefile simulates without Xilinx IP, and defining ```SIMULATION_QUICK``` simulates without FP units, activation, bias, or aggregation.



<p align="right">(<a href="#readme-top">back to top</a>)</p>


5 changes: 5 additions & 0 deletions hw/ip/aggregation_engine/include/age_pkg.sv
@@ -21,6 +21,11 @@ typedef struct packed {

logic [MAX_AGC_PER_NODE-1:0] [$clog2(noc_pkg::MAX_MESH_COLS)-1:0] coords_x;
logic [MAX_AGC_PER_NODE-1:0] [$clog2(noc_pkg::MAX_MESH_ROWS)-1:0] coords_y;

logic [MAX_AGC_PER_NODE-1:0] [$clog2(top_pkg::MAX_FEATURE_COUNT)-1:0] num_features;



} AGE_AGM_REQ_t;

typedef struct packed {
43 changes: 37 additions & 6 deletions hw/ip/aggregation_engine/rtl/aggregation_core.sv
@@ -35,15 +35,17 @@ module aggregation_core #(
);

parameter ALLOCATION_PKT_AGGR_FUNC_OFFSET = $clog2(top_pkg::MAX_NODESLOT_COUNT);
parameter ALLOCATION_PKT_NUM_FEATURES_OFFSET = ALLOCATION_PKT_AGGR_FUNC_OFFSET + $bits(top_pkg::AGGREGATION_FUNCTION_e) ;

parameter EXPECTED_FLITS_PER_PACKET = 1;

typedef enum logic [3:0] {
typedef enum logic [4:0] {
AGC_FSM_IDLE,
AGC_FSM_NODESLOT_ALLOCATION,
AGC_FSM_WAIT_FEATURE_HEAD,
AGC_FSM_WAIT_FEATURE_BODY,
AGC_FSM_UPDATE_ACCS,
AGC_FSM_WAIT_UPDATE_ACCS,
AGC_FSM_WAIT_BUFFER_REQ,
AGC_FSM_SEND_BUFF_MAN,
AGC_FSM_WAIT_DRAIN
@@ -119,6 +121,11 @@ logic noc_router_waiting;

logic [SCALE_FACTOR_QUEUE_READ_WIDTH-1:0] scale_factor_q;


logic [$clog2(top_pkg::MAX_FEATURE_COUNT)-1:0] num_features;
logic [MESH_NODE_ID_WIDTH - 1 : 0] agc_loc;


// ==================================================================================================================================================
// Instantiations
// ==================================================================================================================================================
@@ -202,16 +209,23 @@ always_comb begin
end

AGC_FSM_UPDATE_ACCS: begin
agc_state_n = feature_aggregator_in_feature_ready ? AGC_FSM_WAIT_UPDATE_ACCS
: AGC_FSM_UPDATE_ACCS;
end


AGC_FSM_WAIT_UPDATE_ACCS: begin
agc_state_n =
// Updating last feature accumulator and last packet flag has already been received
(received_flits == EXPECTED_FLITS_PER_PACKET[3:0]) && &feature_updated && aggregation_manager_packet_last_q ? AGC_FSM_WAIT_BUFFER_REQ // updating final features

// Updating last feature accumulator but packets still pending
: (received_flits == EXPECTED_FLITS_PER_PACKET[3:0]) && &feature_updated ? AGC_FSM_WAIT_FEATURE_HEAD

: AGC_FSM_UPDATE_ACCS;
: AGC_FSM_WAIT_UPDATE_ACCS;
end


AGC_FSM_WAIT_BUFFER_REQ: begin
agc_state_n = router_aggregation_core_valid && aggregation_manager_pkt && received_buffer_req_head && tail_packet && correct_pkt_dest ? AGC_FSM_SEND_BUFF_MAN
: AGC_FSM_WAIT_BUFFER_REQ;
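
Review note: this hunk splits the accumulator update into two states. `AGC_FSM_UPDATE_ACCS` now only waits for `feature_aggregator_in_feature_ready`, and the completion checks move to the new `AGC_FSM_WAIT_UPDATE_ACCS` state. The Python sketch below models just those two transitions as read from the diff. The argument names are simplified stand-ins for the RTL signals, and all other states are left out.

```python
from enum import Enum, auto


class AgcState(Enum):
    WAIT_FEATURE_HEAD = auto()
    UPDATE_ACCS = auto()
    WAIT_UPDATE_ACCS = auto()
    WAIT_BUFFER_REQ = auto()


EXPECTED_FLITS_PER_PACKET = 1  # same value as the RTL parameter


def next_state(state, aggregator_ready, received_flits,
               all_features_updated, last_packet_seen):
    """Next-state function for the two states touched by this hunk."""
    if state is AgcState.UPDATE_ACCS:
        # Hold until the feature aggregator accepts the feature (backpressure).
        return AgcState.WAIT_UPDATE_ACCS if aggregator_ready else AgcState.UPDATE_ACCS
    if state is AgcState.WAIT_UPDATE_ACCS:
        done = (received_flits == EXPECTED_FLITS_PER_PACKET) and all_features_updated
        if done and last_packet_seen:
            return AgcState.WAIT_BUFFER_REQ      # final features updated
        if done:
            return AgcState.WAIT_FEATURE_HEAD    # more packets still pending
        return AgcState.WAIT_UPDATE_ACCS
    return state  # states not modelled here keep their value
```

With this split, a stalled aggregator keeps the core in `UPDATE_ACCS` rather than letting the completion checks run early.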
@@ -240,6 +254,7 @@ always_comb begin
head_packet = (router_aggregation_core_data.flit_label == noc_pkg::HEAD);
tail_packet = (router_aggregation_core_data.flit_label == noc_pkg::TAIL);


packet_source = router_aggregation_core_data.data.head_data.head_pl[noc_pkg::HEAD_PAYLOAD_SIZE-1 : noc_pkg::HEAD_PAYLOAD_SIZE-MESH_NODE_ID_WIDTH];
packet_source_col = packet_source[MESH_NODE_ID_WIDTH - 1 : MESH_NODE_ID_WIDTH - $clog2(noc_pkg::MAX_MESH_COLS)];
packet_source_row = packet_source[$clog2(noc_pkg::MAX_MESH_ROWS)-1:0];
@@ -351,10 +366,23 @@ always_comb begin
: sent_flits_counter == EXPECTED_FLITS_PER_PACKET[$clog2(FEATURE_COUNT/2)-1:0] ? noc_pkg::TAIL
: noc_pkg::BODY;

aggregation_core_router_data.data.bt_pl = {buffer_manager_pkt_dest_col, buffer_manager_pkt_dest_row,
X_COORD[$clog2(MAX_MESH_COLS)-1:0], Y_COORD[$clog2(MAX_MESH_ROWS)-1:0], // source node coordinates
bm_chosen_data };
end
if (sent_flits_counter == '0) begin

aggregation_core_router_data.data.head_data.x_dest = {buffer_manager_pkt_dest_col};
aggregation_core_router_data.data.head_data.y_dest = {buffer_manager_pkt_dest_row};

agc_loc[MESH_NODE_ID_WIDTH - 1 : MESH_NODE_ID_WIDTH - $clog2(MAX_MESH_COLS)] = X_COORD[$clog2(MAX_MESH_COLS)-1:0];
agc_loc[$clog2(MAX_MESH_ROWS)-1:0] = Y_COORD[$clog2(MAX_MESH_ROWS)-1:0];

aggregation_core_router_data.data.head_data.head_pl[noc_pkg::HEAD_PAYLOAD_SIZE-1 : noc_pkg::HEAD_PAYLOAD_SIZE-MESH_NODE_ID_WIDTH] = agc_loc;

aggregation_core_router_data.data.head_data.head_pl[noc_pkg::HEAD_PAYLOAD_SIZE-MESH_NODE_ID_WIDTH- 1 : noc_pkg::HEAD_PAYLOAD_SIZE-MESH_NODE_ID_WIDTH - $bits(num_features)] = {num_features};


end else begin
aggregation_core_router_data.data.bt_pl = {bm_chosen_data};
end
end

// Nodeslot allocation
// --------------------------------------------
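
Review note: in the hunk above, the head flit now carries the source AGC location in the top `MESH_NODE_ID_WIDTH` bits of `head_pl`, with `num_features` in the field directly below it. The Python sketch below reproduces that layout. The default bit widths are illustrative assumptions, and `MESH_NODE_ID_WIDTH` is assumed to equal column bits plus row bits, which this diff does not confirm.

```python
def pack_head_payload(x, y, num_features, *, payload_bits=32,
                      col_bits=4, row_bits=4, feat_bits=10):
    """Pack source coordinates and feature count into the top of a head payload."""
    node_id_bits = col_bits + row_bits  # assumed definition of MESH_NODE_ID_WIDTH
    feat_lsb = payload_bits - node_id_bits - feat_bits
    if feat_lsb < 0:
        raise ValueError("fields do not fit in the payload")
    for value, width in ((x, col_bits), (y, row_bits), (num_features, feat_bits)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
    agc_loc = (x << row_bits) | y  # X in the upper bits, Y in the lower bits
    return (agc_loc << (payload_bits - node_id_bits)) | (num_features << feat_lsb)


def unpack_head_payload(payload, *, payload_bits=32,
                        col_bits=4, row_bits=4, feat_bits=10):
    """Inverse of pack_head_payload: returns (x, y, num_features)."""
    node_id_bits = col_bits + row_bits
    feat_lsb = payload_bits - node_id_bits - feat_bits
    agc_loc = payload >> (payload_bits - node_id_bits)
    num_features = (payload >> feat_lsb) & ((1 << feat_bits) - 1)
    return agc_loc >> row_bits, agc_loc & ((1 << row_bits) - 1), num_features
```

A round trip through both functions is a quick way to check that the sender and the receiver agree on the field boundaries.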
@@ -385,7 +413,10 @@ always_ff @(posedge core_clk or negedge resetn) begin
: router_aggregation_core_data.data.bt_pl [ALLOCATION_PKT_AGGR_FUNC_OFFSET + $bits(top_pkg::AGGREGATION_FUNCTION_e) - 1 : ALLOCATION_PKT_AGGR_FUNC_OFFSET] == 2'd2 ? top_pkg::WEIGHTED_SUM
: AGGR_FUNC_RESERVED;

// Slice width follows the declared width of num_features rather than $bits of the parameter
num_features <= router_aggregation_core_data.data.bt_pl [ALLOCATION_PKT_NUM_FEATURES_OFFSET + $bits(num_features) - 1 : ALLOCATION_PKT_NUM_FEATURES_OFFSET];

end

end
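
Review note: the allocation packet now carries `num_features` directly above the aggregation function field, at `ALLOCATION_PKT_NUM_FEATURES_OFFSET`. The Python sketch below decodes such a payload using the offsets defined in this file. Two things are assumptions: the nodeslot is taken to occupy the bits below `ALLOCATION_PKT_AGGR_FUNC_OFFSET`, and the feature field is taken to be `$clog2(MAX_FEATURE_COUNT)` bits wide, matching the declaration of `num_features`. The parameter values in the usage line are illustrative only.

```python
def clog2(n: int) -> int:
    """Ceiling of log2, matching SystemVerilog $clog2 for n >= 1."""
    return (n - 1).bit_length()


def field(word: int, lsb: int, width: int) -> int:
    """Extract `width` bits of `word` starting at bit `lsb`."""
    return (word >> lsb) & ((1 << width) - 1)


def decode_allocation(bt_pl, *, max_nodeslots, aggr_func_bits, max_feature_count):
    """Decode an allocation payload (field layout partly assumed, see note above)."""
    aggr_off = clog2(max_nodeslots)           # ALLOCATION_PKT_AGGR_FUNC_OFFSET
    feat_off = aggr_off + aggr_func_bits      # ALLOCATION_PKT_NUM_FEATURES_OFFSET
    return {
        "nodeslot": field(bt_pl, 0, aggr_off),
        "aggr_func": field(bt_pl, aggr_off, aggr_func_bits),
        "num_features": field(bt_pl, feat_off, clog2(max_feature_count)),
    }


if __name__ == "__main__":
    word = (12 << 8) | (2 << 6) | 37  # 12 features, function 2, nodeslot 37
    print(decode_allocation(word, max_nodeslots=64, aggr_func_bits=2,
                            max_feature_count=1024))
```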


12 changes: 9 additions & 3 deletions hw/ip/aggregation_engine/rtl/aggregation_core_allocator.sv
@@ -74,18 +74,24 @@ if (ALLOCATION_MODE == age_pkg::AGC_ALLOCATION_MODE_STATIC) begin
agm_req.nsb_req = allocation_req;
agm_req.required_agcs = layer_config_in_features_count[9:4] + (|layer_config_in_features_count[3:0] ? 1'b1 : '0);

// Mask of allocated cores, size NUM_CORES = (AGGREGATION_ROWS * AGGREGATION_COLUMNS)
// Used later for deallocating AGCs when AGM is finished so they become available for next AGM
// Not needed here since allocation is static
agm_req.allocated_cores = '0;
end

for (genvar allocation_slot = 0; allocation_slot < age_pkg::MAX_AGC_PER_NODE; allocation_slot++) begin
always_comb begin
agm_req.coords_x [allocation_slot] = 1'b1 + allocation_slot;
agm_req.coords_y [allocation_slot] = (allocation_req.nodeslot % NUM_MANAGERS);

// Final AGC may take fewer than 16 features
// Blocking assignments: this logic sits inside always_comb
if (allocation_slot == (agm_req.required_agcs-1) && |layer_config_in_features_count[3:0]) begin
agm_req.num_features [allocation_slot] = layer_config_in_features_count[3:0];
end
else begin
agm_req.num_features [allocation_slot] = 16;
end
end
end


end
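
Review note: the static allocator assigns 16 features to each AGC and gives the final AGC the remainder when the feature count is not a multiple of 16. The Python sketch below mirrors that arithmetic. It lists only the required slots, whereas the RTL drives every one of the `MAX_AGC_PER_NODE` slots, and the function names are illustrative.

```python
FEATURES_PER_AGC = 16  # one AGC handles up to 16 features, as in the RTL


def required_agcs(in_features: int) -> int:
    """Equivalent of in_features[9:4] + (|in_features[3:0] ? 1 : 0)."""
    return (in_features >> 4) + (1 if in_features & 0xF else 0)


def features_per_slot(in_features: int) -> list:
    """Feature count handled by each required AGC slot, in slot order."""
    count = required_agcs(in_features)
    remainder = in_features & 0xF
    return [
        remainder if (slot == count - 1 and remainder) else FEATURES_PER_AGC
        for slot in range(count)
    ]


if __name__ == "__main__":
    for n in (5, 32, 40):
        print(n, required_agcs(n), features_per_slot(n))
```

The per-slot counts always sum to the layer's input feature count, which makes a convenient testbench check.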

@@ -1,7 +1,7 @@
import age_pkg::*;
import noc_pkg::*;

module aggregation_core_allocator #(
module aggregation_core_allocator_sequential_rr #(
parameter NUM_CORES = top_pkg::TRANSFORMATION_CHANNELS * top_pkg::AGGREGATION_CHANNELS,
parameter NUM_MANAGERS = top_pkg::TRANSFORMATION_CHANNELS,
parameter AGGREGATION_COLUMNS = top_pkg::AGGREGATION_CHANNELS
@@ -66,7 +66,6 @@ always_comb begin
// Static AGM req payloads
agm_req.nsb_req = allocation_req_q;
agm_req.required_agcs = layer_config_in_features_count[9:4] + (|layer_config_in_features_count[3:0] ? 1'b1 : '0);

allocation_req_ready = !busy;
done = (agc_counter == agm_req.required_agcs);
end
@@ -102,10 +101,17 @@ for (genvar allocation_slot = 0; allocation_slot < age_pkg::MAX_AGC_PER_NODE; al
if (!resetn) begin
agm_req.coords_x [allocation_slot] <= '0;
agm_req.coords_y [allocation_slot] <= '0;

agm_req.num_features [allocation_slot] <= 0;
end else if (busy && !done && (agc_counter == allocation_slot)) begin
agm_req.coords_x [allocation_slot] <= 1'b1 + (allocated_core_bin % AGGREGATION_COLUMNS); // TO DO: replace with AGGREGATION_COLS when merging with generalized aggregation mesh
agm_req.coords_y [allocation_slot] <= allocated_core_bin / AGGREGATION_COLUMNS;

agm_req.num_features [allocation_slot] <= 16;

//Final AGC may take less than 16 features
if (allocation_slot == (agm_req.required_agcs-1) && |layer_config_in_features_count[3:0]) begin
agm_req.num_features [allocation_slot] <= layer_config_in_features_count[3:0];
end
end
end
end
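
Review note: the sequential round-robin allocator turns a flat core index (`allocated_core_bin`) into mesh coordinates, with the column offset by one. The Python sketch below restates that mapping. The offset is copied from the RTL as-is, and this diff does not state the reason for it.

```python
def agc_coords(core_index: int, aggregation_columns: int) -> tuple:
    """Return (x, y) mesh coordinates for a flat core index.

    Mirrors the RTL: x = 1 + (index % columns), y = index / columns.
    """
    if aggregation_columns <= 0:
        raise ValueError("aggregation_columns must be positive")
    return 1 + core_index % aggregation_columns, core_index // aggregation_columns


if __name__ == "__main__":
    # Walk an illustrative mesh with 4 columns and 2 rows.
    for index in range(8):
        print(index, agc_coords(index, 4))
```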