This repository contains the code related to the article "Efficiently Parallelizable Strassen-Based Multiplication of a Matrix by its Transpose" (V. Arrigoni, F. Maggioli, A. Massini, E. Rodolà).
Each folder contains a separate project, with its own build instructions and sample applications. Please refer to the README file of each sub-project for details.
Here's a list of the projects and their content:
- sequential: contains the implementation of the sequential AtA algorithm, and provides an efficient implementation of Strassen's algorithm.
- shared: contains the implementation of the parallel AtA algorithm in a shared memory model. The same implementation of Strassen's algorithm found in sequential is also included here.
- distributed: contains the implementation of the parallel AtA algorithm in a distributed memory model.