ALP User Documentation 0.7.alpha
Algebraic Programming User Documentation
|
ALP/Pregel enables vertex-centric programming. More...
Classes | |
struct | ConnectedComponents< VertexIDType > |
A vertex-centric Connected Components algorithm. More... | |
struct | PageRank< IOType, localConverge > |
A Pregel-style PageRank-like algorithm. More... | |
class | Pregel< MatrixEntryType > |
A Pregel run-time instance. More... | |
struct | PregelState |
The state of the vertex-center Pregel program that the user may interface with. More... | |
Enumerations | |
enum | SparsificationStrategy { NONE = 0 , ALWAYS , WHEN_REDUCED , WHEN_HALVED } |
The set of sparsification strategies supported by the ALP/Pregel interface. More... | |
Variables | |
constexpr const SparsificationStrategy | out_sparsify = NONE |
What sparsification strategy should be applied to the outgoing messages. | |
ALP/Pregel enables vertex-centric programming.
With vertex-centric programming, graph algorithms are written from the perspective of a vertex within an input graph. Each vertex executes a program on a round-by-round basis, while between rounds all vertex programs pass messages to neighbour vertices using the edges of the input graph. Edges may be directed or undirected; in the former, messages travel from the source vertex to the destination vertex only. Each vertex program sends the same message to all of its neighbours – i.e., it broadcasts a single given message. In ALP/Pregel, incoming messages are furthermore accumulated using a grb::Monoid. The accumulation of incoming messages is typically used by the vertex-centric program during the next round it executes.
Pregel programs thus execute on a given graph, and hence constructing a grb::interfaces::Pregel instance requires passing input iterators corresponding to the graph on which ALP/Pregel programs are executed. Such an instance logically corresponds to an execution engine of vertex-centric programs for a specific graph. Multiple grb::interfaces::Pregel instances, each potentially built using a different input graph, may exist simultaneously.
ALP/Pregel programs then are executed using grb::interfaces::Pregel::execute. The first template argument to this function is the binary operator of the monoid to be used for accumalating incoming messages, while the second template argument corresponds to its identity– see grb::operators and grb::identities for example operator and identities. The remainder template arguments to grb::interfaces::Pregel::execute are automatically inferred.
The first non-template argument is the vertex-centric program, for example, grb::algorithms::pregel::ConnectedComponents– a vertex-centric program in ALP/GraphBLAS hence is a class where the program is given as a public static function named program. This function takes five arguments:
The types of arguments 1-4 are defined by the program, but must be plain old data (POD) types– similar to the requirements of an ALP operator. An example of an ALP/Pregel algorithm that has non-trivial algorithm parameters is grb::algorithms::pregel::PageRank: grb::algorithms::pregel::PageRank::Data.
The type of the 5th argument to grb::interfaces::Pregel::execute is an instance of grb::interfaces::PregelState. Some of the ALP/Pregel state fields are read-only, such as the current round number grb::interfaces::PregelState::round, while others are read-write. Please see the corresponding documentation for what read-only states may be inspected during program execution. Some fields are global (such as again the current round number), while others are specific to the vertex a program is running on (such as grb::interfaces::PregelState::indegree).
Read-write ALP/Pregel state is used for determining termination conditions. There are two associated flags:
Each vertex has its own state of these two flags, with the defaults being true
for the former and false
for the latter.
If, by the end of any round, a vertex sets its active
flag to false
, that vertex will not participate in any future rounds. For any neighbouring vertices it shall be as though the inactive vertex keeps broadcasting the identity of the given accumulation monoid.
If at the end of any round all vertices are inactive, the program terminates. Similarly, if by the end of a round all vertices have the voteToHalt
flag set to true
, then that Pregel program terminates as well.
By convention, ALP/Pregel algorithms allow for a simplified way of executing them that does not require the Pregel algorithm user to pass the right monoid to grb::interfaces::Pregel::execute each time they call one, such as, for example,
These functions only take the Pregel instance that is to execute the Pregel program, as well as a vector of initial states as mandatory input. As usual, optional parameters indicate the maximum number of rounds allotted to the program (zero for unbounded), and where to write back the number of rounds after which the program has terminated (NULL
for no write back).
All pre-defined ALP/Pregel algorithms reside in the grb::algorithms::pregel namespace.
The ALP/Pregel run-time system manages state for every vertex in the underlying graph. The execution time of a single round is always proportional to the number of active vertices. Since inactive vertices stay inactive in subsequent rounds, their state could be erased. This has two potential benefits:
We may opt to always attempt to sparsify state, use some heuristic to determine when to sparsify, or just simply never attempt such sparsification.
This choice is configurable via grb::interfaces::config::out_sparsify; see grb::interfaces::config::SparsificationStrategy for options and more details.
The set of sparsification strategies supported by the ALP/Pregel interface.
Enumerator | |
---|---|
NONE | No sparsification of internal and user-defined vertex states, beyond that which is necessary to bound the run-time by the number of active vertices. |
ALWAYS | Always applies the sparsification procedure on both internal and user- defined vertex states. Does not consider whether the resulting operation would reduce the number of vertex entries. This variant was tested against NONE for out_sparsify, and found to be slower always. |
WHEN_REDUCED | Sparsify only when the resulting vector would indeed be sparser. While this sounds like it should be a minimal condition to check for before applying sparsification, this check itself comes at non-trivial overhead for any backend. The performance of this strategy versus ALWAYS hence is a trade-off, one that varies with underlying graphs as well as with the vertex-centric program chosen. |
WHEN_HALVED | Sparsify only when the resulting vector would have half (or less) its current number of nonzeroes. This is a simple heuristic that balances the trade-off of applying sparsification by amortising its overhead. The overhead described at WHEN_REDUCED corresponding to determining the gain of sparsification, however, remains the same. |