Spartan is a Python library for distributed array programming. Programmers build up array expressions using NumPy-like operations. These expressions are then compiled, optimized, and run on a distributed array backend across multiple machines.
Spartan is actively maintained by researchers in the Systems Group of New York University's Computer Science Department.
Let's start with array programming. From Wikipedia:
In computer science, array programming languages (also known as vector or multidimensional languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays.
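As a small illustration of that definition, the sketch below uses plain NumPy (not Spartan) to show one arithmetic expression applied unchanged to a scalar, a vector, and a matrix. The variable names and temperature values are made up for the example.

```python
import numpy as np

# The conversion written for a single scalar value.
celsius = 25.0
fahrenheit = celsius * 9.0 / 5.0 + 32.0

# The same expression applied to a 1-D array: no explicit loop is needed,
# the arithmetic is applied element by element.
temps_c = np.array([0.0, 25.0, 100.0])
temps_f = temps_c * 9.0 / 5.0 + 32.0

# And again, unchanged, for a 2-D array.
grid_c = np.array([[0.0, 10.0], [20.0, 30.0]])
grid_f = grid_c * 9.0 / 5.0 + 32.0
```

The expression is identical in all three cases; only the shape of the operand changes.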
Distributed array programming extends this by allowing the data to be
distributed among many threads on a single machine (useful for testing) or,
in the more common use case, among many CPUs on a cluster of machines.
You can tell Spartan how to tile your data (split it into smaller arrays)
through the tile_hint parameter available on many array creation functions.
Alternatively, Spartan can tile your data automatically using our custom algorithms.
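To make the idea of tiling concrete, here is a minimal sketch in plain NumPy and the standard library. It does not use Spartan's API, and it makes no claim about how tile_hint is passed or how Spartan chooses tiles. The helpers split_into_tiles and tiled_sum are hypothetical names introduced only for this illustration: the array is split into row blocks, each block is reduced on a worker thread, and the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def split_into_tiles(array, tile_rows):
    """Split a 2-D array into row blocks of at most tile_rows rows.

    Hypothetical helper for illustration; not part of Spartan.
    """
    return [array[start:start + tile_rows]
            for start in range(0, array.shape[0], tile_rows)]


def tiled_sum(array, tile_rows, num_workers=4):
    """Sum an array by reducing each tile on a worker thread."""
    tiles = split_into_tiles(array, tile_rows)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Each tile is reduced independently, then the partial sums are combined.
        partial_sums = list(pool.map(np.sum, tiles))
    return sum(partial_sums)


data = np.arange(12, dtype=np.float64).reshape(6, 2)
total = tiled_sum(data, tile_rows=2)
```

Threads on one machine stand in here for the single-machine testing mode described above; on a cluster, the tiles would live on different machines instead.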
A related question: what problems does this solve? Look at some of the examples we have implemented (PageRank, regressions, Black-Scholes, and more).
See a comparison of the other tools here.
No. This project relies heavily on NumPy and extends it to large arrays. Instead of being limited to the memory and processing power of a single machine, Spartan lets you use the familiar NumPy operations you've come to know and love on much larger datasets.
Even I don't know!
You can contact the maintainers at robert.gardner@nyu.edu