Robert Gardner edited this page Oct 31, 2014 · 1 revision

Spartan FAQ

What is Spartan?

From Spartan's README:

Spartan is a Python library for distributed array programming. Programmers build up array expressions (using Numpy-like operations). These expressions are then compiled and optimized and run on a distributed array backend across multiple machines.

Who are the developers?

Spartan is actively maintained by researchers in the Systems Group of New York University's Computer Science Department.

What is distributed array programming?

Let's start with array programming. From Wikipedia:

In computer science, array programming languages (also known as vector or multidimensional languages) generalize operations on scalars to apply transparently to vectors, matrices, and higher-dimensional arrays.
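As a small illustration of that definition, here is a sketch in plain NumPy (not Spartan itself). The function name `scale_and_shift` and its parameters are made up for this example. The same expression works unchanged on a scalar, a vector, and a matrix:

```python
import numpy as np

def scale_and_shift(x, a=2.0, b=1.0):
    """Compute a * x + b; x may be a scalar or an array of any shape."""
    return a * x + b

# The same function applied to a scalar, a 1-D array, and a 2-D array.
scalar_result = scale_and_shift(3.0)
vector_result = scale_and_shift(np.array([1.0, 2.0, 3.0]))
matrix_result = scale_and_shift(np.ones((2, 2)))
```

No explicit loop is written in any of the three calls; the array library applies the operation element by element.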

Distributed array programming extends this by distributing the data among many threads on a single machine (useful for testing) or, more commonly, among many CPUs on a cluster of machines. You can tell Spartan how to tile your data (split it into smaller arrays) through the tile_hint parameter of many array creation functions. Alternatively, Spartan can tile your data automatically using our custom algorithms.
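The idea of tiling can be sketched with plain NumPy and a thread pool. This is only a conceptual illustration, not Spartan's API or its tiling algorithm. The helpers `split_into_tiles` and `tiled_map` and the `tile_rows` argument are hypothetical names invented for this example; `tile_rows` loosely plays the role that a tiling hint would play:

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def split_into_tiles(array, tile_rows):
    """Split a 2-D array into row blocks of at most tile_rows rows each."""
    return [array[i:i + tile_rows] for i in range(0, array.shape[0], tile_rows)]

def tiled_map(func, array, tile_rows):
    """Apply func to each tile on a thread pool, then reassemble the result."""
    tiles = split_into_tiles(array, tile_rows)
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so tiles are reassembled correctly.
        results = list(pool.map(func, tiles))
    return np.concatenate(results, axis=0)

data = np.arange(12, dtype=float).reshape(6, 2)
tiled = tiled_map(np.sqrt, data, tile_rows=2)  # three tiles of shape (2, 2)
```

Because the operation is element-wise, processing each tile independently gives the same answer as processing the whole array at once. A distributed backend would place the tiles on different machines rather than on different threads.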

Why Spartan?

A related question is: what problems does this solve? See some of the examples we have implemented (PageRank, regressions, Black-Scholes, and more).

What related work is there?

See a comparison of the other tools here.

Does this replace NumPy?

No. Spartan relies heavily on NumPy but extends it to large arrays. Instead of being limited to the memory and processing power of a single machine, you can use the familiar NumPy operations you've come to know and love on much larger datasets.

What are the limitations of this system?

Where does the name come from?

Even I don't know!

Question not covered here?

You can contact the maintainers at robert.gardner@nyu.edu.