omaralkhatib03 commented Mar 27, 2025

Description

This PR introduces a significant refactor of the MXINT implementation. It strengthens verification and fully integrates MXINT quantization into the Mase framework. These changes enable the emission of deep fully-connected PyTorch networks and facilitate software analysis of MXINT quantization using Optuna. Additionally, we enhance the NeRF model and use it for evaluation.
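As context for the Optuna analysis mentioned above, a multi-objective study over MXINT parameters can be set up roughly as follows. This is a minimal sketch with placeholder objectives and illustrative parameter names; the real study trains the quantised network (accuracy/PSNR) and reads hardware resource estimates rather than computing dummy values:

```python
import optuna

def objective(trial: optuna.Trial) -> tuple[float, float]:
    # Illustrative search space: MXINT mantissa width and MLP hidden size.
    mantissa_bits = trial.suggest_int("mantissa_bits", 2, 8)
    hidden_size = trial.suggest_int("hidden_size", 16, 256, log=True)

    # Placeholder objectives: the real study trains the quantised network
    # and estimates hardware resources (e.g. LUTs/DSPs) after synthesis.
    accuracy = 1.0 - 2.0 ** (-mantissa_bits)  # dummy: more bits, better
    resources = mantissa_bits * hidden_size   # dummy: more bits, bigger

    return accuracy, resources

# Multi-objective study: maximise accuracy while minimising resources.
study = optuna.create_study(directions=["maximize", "minimize"])
study.optimize(objective, n_trials=50)
print(study.best_trials)  # the Pareto-optimal trials
```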

Contributions

  1. An automated toolflow for emitting PyTorch models after applying MXINT quantisation.
  2. A set of verified, OCP-compliant, parametric Verilog modules (a numeric sketch of the MXINT block format follows this list):
     • MXINT Linear Layer
       • MXINT Accumulator
       • MXINT Cast
       • MXINT Dot Product
     • MXINT ReLU Layer
     • MXINT Concatenation
  3. Evaluation of MXINT quantisation on the NeRF model.
  4. Evaluation of the MXINT resource-accuracy Pareto front of an MLP network using an Optuna multi-objective study (sketched above).
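To make the format concrete, here is a minimal numeric sketch of the block quantisation these modules operate on: each block of values shares one power-of-two exponent, and the elements are small integer mantissas. This deliberately simplifies the OCP MX spec (no special rounding modes, and the spec's block size is 32); it is illustrative, not the PR's exact quantizer:

```python
import numpy as np

def mxint_quantize(x: np.ndarray, mantissa_bits: int = 8, block_size: int = 4):
    """Quantize to a shared power-of-two exponent plus integer mantissas.

    Simplified sketch of the MXINT idea, not the exact OCP rounding rules.
    """
    x = x.reshape(-1, block_size)
    # Shared exponent per block, chosen so the largest magnitude fits.
    max_abs = np.abs(x).max(axis=1, keepdims=True)
    exponent = np.ceil(np.log2(np.maximum(max_abs, 2.0 ** -126)))
    # Scale values into the signed mantissa range and round; clip saturates.
    scale = 2.0 ** (exponent - (mantissa_bits - 1))
    mantissa = np.clip(np.round(x / scale),
                       -(2 ** (mantissa_bits - 1)),
                       2 ** (mantissa_bits - 1) - 1)
    return exponent, mantissa, mantissa * scale  # dequantised values last

exp, man, xq = mxint_quantize(np.array([0.11, -0.42, 0.07, 0.35]))
# exp = [[-1.]], man = [[ 28. -108.  18.  90.]], xq ~ the inputs
```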

Detailed Implementation Notes

Changes to the analysis
  • As an overall change, rename all modules referring to integer to fixed-point, so the same format does not go by two names
  • Changes to the add_hardware_metadata pass
    • Introduce the ability to support different data_types for the hardware modules
      • Define a python type variable to capture the structure of the INTERNAL_COMP dictionary and the individual ip entries
      • Define the list of supported hw quantizations as a string literal (this is used when checking for compatibility when running the pass)
      • Change the analysis to account for different datatypes
    • Support the mxint_relu module
  • Changes to the quantize pass
    • Automatically quantize the data_out variables, using the data_in config when a data_out config is not explicitly defined (see the sketch after this section)
    • Override integer to fixed-point wherever it is encountered
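The data_out fallback rule can be pictured with the helper below. The helper itself is illustrative (not the pass's actual code), and the config key names are assumptions modelled on typical quantizer configs:

```python
def resolve_out_config(node_config: dict) -> dict:
    """Derive data_out settings: fall back to data_in, let explicit keys win."""
    fallback = {
        k.replace("data_in", "data_out"): v
        for k, v in node_config.items()
        if k.startswith("data_in")
    }
    explicit = {k: v for k, v in node_config.items() if k.startswith("data_out")}
    return {**fallback, **explicit}  # explicit data_out keys override the fallback

# Hypothetical config: no data_out keys, so data_in settings are mirrored.
config = {"name": "mxint", "data_in_width": 8, "data_in_exponent_width": 4}
print(resolve_out_config(config))
# {'data_out_width': 8, 'data_out_exponent_width': 4}
```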
Changes to emission
  • At a high level, allow for non-fixed-point datatypes, specifically the mxint variety
  • BRAM emission (emit_bram)
    • Introduce an emit_parameters_in_mem_internal_mxint function
      • Computes the required constants and inserts them into the mxint_bram_template
      • Writes the BRAM to a file
    • Refactor emit_parameters_in_dat_internal to handle the mxint datatype
      • Write the values out as binary instead of hex, because of issues encoding packed arrays whose bitwidths are not multiples of 4 (see the sketch below)
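The binary-instead-of-hex point matters because hex memory files pad each value to whole hex digits, which corrupts packed arrays whose element width is not a multiple of 4; a binary dump read with `$readmemb` sidesteps this. A minimal sketch of writing one packed word per line in two's complement (`to_bin` is an illustrative helper, not the PR's function):

```python
def to_bin(value: int, width: int) -> str:
    """Two's-complement binary string of exactly `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")

# e.g. three 6-bit mantissas packed into one memory word per line
words = [-17, 5, 30]
with open("weights.dat", "w") as f:
    f.write("".join(to_bin(w, 6) for w in words) + "\n")
# -17 -> 101111, 5 -> 000101, 30 -> 011110
```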
  • Testbench (emit_tb)
    • Handle non-fixed-point datatypes, namely mxint
    • Refactor to have custom drivers and monitors based on the data type (inheriting from the standard ones)
      • Introduce the custom monitor for mxint off-by-one errors, as described in the report
    • Move the quantisation step to occur inside the custom driver/monitor
  • Top emission (emit_top)
    • Handle non-fixed-point datatypes
      • Abstract the wiring/port/interface generation logic into common functions
      • Handle the type-specific cases within these abstracted functions (see the sketch below)
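One way to picture this abstraction (a hypothetical helper, not the PR's actual emit_top code): the shared logic builds the wiring, and only the datatype-specific port shapes are dispatched per type. Port and parameter names below are assumptions for illustration:

```python
def emit_ports(name: str, dtype: str, config: dict) -> list[str]:
    """Emit the port declarations for one interface, dispatching on datatype."""
    if dtype == "fixed":
        return [f"logic [{config['width'] - 1}:0] {name} [{config['parallelism'] - 1}:0]"]
    if dtype == "mxint":
        # MXINT carries a shared exponent alongside the mantissa bus.
        return [
            f"logic [{config['width'] - 1}:0] m{name} [{config['parallelism'] - 1}:0]",
            f"logic [{config['exponent_width'] - 1}:0] e{name}",
        ]
    raise NotImplementedError(dtype)

print(emit_ports("data_in_0", "mxint",
                 {"width": 8, "exponent_width": 4, "parallelism": 4}))
```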
Changes to the cocotb interfaces
  • Introduce a generic off-by-one multi-signal monitor (its comparison rule is sketched below)
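The tolerance check at the heart of that monitor is simple: a DUT output beat is accepted if every element matches the software model exactly or differs by at most one LSB (the rounding ambiguity described in the report). A sketch of just the comparison rule, independent of any cocotb base class:

```python
def off_by_one_match(expected: list[int], actual: list[int]) -> bool:
    """Accept an output beat if every element is within one LSB of the model."""
    return len(expected) == len(actual) and all(
        abs(e - a) <= 1 for e, a in zip(expected, actual)
    )

assert off_by_one_match([12, -3, 7], [12, -4, 7])      # one LSB off: accepted
assert not off_by_one_match([12, -3, 7], [12, -5, 7])  # two LSBs off: rejected
```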
Changes to the testing
  • Introduce test_emit_verilog_linear_mxint
    • This test is responsible for ensuring that Verilog emission works for the mxint datatype
    • Has two test functions that each take a seed and generate some parameters to test on
      • ..._mlp generates and tests a multi-layer perceptron with a random number of layers and random dimensions (see the sketch after this list)
      • ..._linear simply tests the linear layer, to ensure the weights and biases being emitted are correct
    • In all cases the test does the following:
      • Given some model and some configuration, it will quantize the model and run the `add_hardware_metadata` pass before emitting and simulating the design
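The seeded model generation in the ..._mlp test can be pictured as follows. This is illustrative, not the test's exact code, but it shows why a single seed makes failures reproducible: the seed fixes both the depth and the layer widths.

```python
import random

import torch.nn as nn

def random_mlp(seed: int) -> nn.Sequential:
    """Build a reproducible MLP with a random depth and random dimensions."""
    rng = random.Random(seed)
    dims = [rng.choice([4, 8, 16]) for _ in range(rng.randint(2, 5))]
    layers = []
    for d_in, d_out in zip(dims, dims[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU()]
    return nn.Sequential(*layers[:-1])  # drop the trailing ReLU

model = random_mlp(seed=42)  # same seed, same network every run
```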
