Feature/int4 lut gemm #32

5000user5000 · 2025-05-15T06:43:53Z

This PR refactors the lookup table implementation to support mixed-precision matrix multiplication, specifically INT4 weights with FP16 or FP32 activations. The ProductLookupTable class now supports both scalar lookups and dynamic, row-wise generation of lookup vectors from the rows of the activation matrix, so the kernel can replace runtime multiplications with table lookups. The fast GEMM kernel is updated accordingly to rebuild the LUT for each activation row, improving cache locality and performance. The original AVX2-specific code has been removed to simplify the implementation and improve readability. All relevant tests pass, including those covering INT4 and mixed-precision computation.
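The row-wise LUT idea described above can be illustrated with a minimal sketch. It assumes signed INT4 weights in the range [-8, 7] and floating-point activations. The names `build_row_lut` and `lut_gemm` are hypothetical and are not the API of ProductLookupTable or the PR's kernel; plain Python is used only for readability.

```python
# Illustrative sketch only: hypothetical helper names, not the PR's actual code.
INT4_MIN = -8      # assumed signed INT4 range [-8, 7]
INT4_LEVELS = 16   # number of distinct INT4 values


def build_row_lut(activation_row):
    """Build one 16-entry table per activation element.

    lut[k][q] holds activation_row[k] * (q + INT4_MIN), so every possible
    product of that activation with an INT4 weight is precomputed once.
    """
    return [[a * (q + INT4_MIN) for q in range(INT4_LEVELS)]
            for a in activation_row]


def lut_gemm(activations, weights_int4):
    """Compute activations (M x K, float) times weights_int4 (K x N, int4).

    The LUT is rebuilt for each activation row; the accumulation loop then
    uses only lookups and additions.
    """
    k_dim = len(weights_int4)
    n_dim = len(weights_int4[0]) if k_dim else 0
    out = []
    for row in activations:
        lut = build_row_lut(row)  # the only multiplications happen here
        out_row = []
        for n in range(n_dim):
            acc = 0.0
            for k in range(k_dim):
                # Shift the signed weight into a 0..15 table index.
                acc += lut[k][weights_int4[k][n] - INT4_MIN]
            out_row.append(acc)
        out.append(out_row)
    return out
```

In this sketch each activation row costs 16 × K multiplications to build its table, and that table is then reused across all N output columns, which is where a lookup-based kernel would be expected to save work when N is large.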

5000user5000 added 4 commits May 9, 2025 13:24

int4 quant support

3a661fb

fix proposal img link

e3bdfaa

feat:int4 support

ef5ccb0

remove debug message printout

d44d3ea

5000user5000 mentioned this pull request May 15, 2025

Add LUT multiplication support for signed integers #29

Closed

5000user5000 merged commit c79c398 into main May 15, 2025
2 checks passed

2 participants
