Commit 9b59641
CUDA: quantized KV support for FA vec (ggml-org#7527)
* CUDA: quantized KV support for FA vec
* try CI fix
* fix commented-out kernel variants
* add q8_0 q4_0 tests
* fix nwarps > batch size
* split fattn compile via extern templates
* fix flake8
* fix metal tests
* fix cmake
* make generate_cu_files.py executable
* add autogenerated .cu files
* fix AMD
* error if type_v != FP16 and not flash_attn
* remove obsolete code
1 parent a323ec6 commit 9b59641
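The "split fattn compile via extern templates" item, together with the autogenerated .cu files, refers to a general C++ technique: declare template instantiations as `extern` where they are used, and define each one explicitly in its own source file so the variants can compile in parallel. The following is a minimal sketch of that technique only. The name `dot_sum` and the sizes 4 and 8 are hypothetical and are not taken from the commit, whose kernel names and file layout are not shown on this page.

```cpp
// Sketch only: dot_sum and the sizes used here are hypothetical illustrations,
// not identifiers from the commit.

// "Header" part: the template definition, visible to every translation unit.
template <int D>
int dot_sum(const int * x, const int * y) {
    int sum = 0;
    for (int i = 0; i < D; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Explicit instantiation declarations: they suppress implicit instantiation of
// these variants in any translation unit that includes the header.
extern template int dot_sum<4>(const int *, const int *);
extern template int dot_sum<8>(const int *, const int *);

// "Instance file" part: explicit instantiation definitions. In a split build,
// each line would sit in its own source file (assumed layout); they are kept in
// one file here so the example links on its own.
template int dot_sum<4>(const int *, const int *);
template int dot_sum<8>(const int *, const int *);
```

Callers use `dot_sum<4>(a, b)` as usual. The difference is in build time: the body of each variant is compiled once, in the file that holds its explicit instantiation definition, rather than in every file that calls it.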
File tree
110 files changed (+2649, -1152)
- ggml-cuda
- template-instances
- tests