ALP User Documentation 0.7.alpha
Algebraic Programming User Documentation
|
The namespace for ALP/GraphBLAS algorithms. More...
Namespaces | |
namespace | pregel |
The namespace for ALP/Pregel algorithms. | |
Functions | |
template<Descriptor descr = descriptors::no_operation, typename IOType , typename NonzeroType , typename InputType , typename ResidualType , class Semiring = Semiring< operators::add< InputType, InputType, InputType >, operators::mul< IOType, NonzeroType, InputType >, identities::zero, identities::one >, class Minus = operators::subtract< ResidualType >, class Divide = operators::divide< ResidualType >> | |
RC | bicgstab (grb::Vector< IOType > &x, const grb::Matrix< NonzeroType > &A, const grb::Vector< InputType > &b, const size_t max_iterations, ResidualType tol, size_t &iterations, ResidualType &residual, Vector< InputType > &r, Vector< InputType > &rhat, Vector< InputType > &p, Vector< InputType > &v, Vector< InputType > &s, Vector< InputType > &t, const Semiring &semiring=Semiring(), const Minus &minus=Minus(), const Divide ÷=Divide()) |
Solves a linear system \( b = Ax \) with \( x \) unknown by using the bi-conjugate gradient (bi-CG) stabilised method; i.e., BiCGstab. | |
template<Descriptor descr = descriptors::no_operation, typename IOType , typename ResidualType , typename NonzeroType , typename InputType , class Ring = Semiring< grb::operators::add< IOType >, grb::operators::mul< IOType >, grb::identities::zero, grb::identities::one >, class Minus = operators::subtract< IOType >, class Divide = operators::divide< IOType >> | |
grb::RC | conjugate_gradient (grb::Vector< IOType > &x, const grb::Matrix< NonzeroType > &A, const grb::Vector< InputType > &b, const size_t max_iterations, ResidualType tol, size_t &iterations, ResidualType &residual, grb::Vector< IOType > &r, grb::Vector< IOType > &u, grb::Vector< IOType > &temp, const Ring &ring=Ring(), const Minus &minus=Minus(), const Divide ÷=Divide()) |
Solves a linear system \( b = Ax \) with \( x \) unknown by the Conjugate Gradients (CG) method on general fields. | |
template<Descriptor descr = descriptors::no_operation, typename OutputType , typename InputType1 , typename InputType2 , class Ring , class Division = grb::operators::divide< typename Ring::D3, typename Ring::D3, typename Ring::D4 >> | |
RC | cosine_similarity (OutputType &similarity, const Vector< InputType1 > &x, const Vector< InputType2 > &y, const Ring &ring=Ring(), const Division &div=Division()) |
Computes the cosine similarity. | |
template<Descriptor descr = descriptors::no_operation, typename IOType = double, class Operator = operators::square_diff< IOType, IOType, IOType >> | |
RC | kmeans_iteration (Matrix< IOType > &K, Vector< std::pair< size_t, IOType > > &clusters_and_distances, const Matrix< IOType > &X, const size_t max_iter=1000, const Operator &dist_op=Operator()) |
The kmeans iteration given an initialisation. | |
template<Descriptor descr, typename OutputType , typename InputType > | |
RC | knn (Vector< OutputType > &u, const Matrix< InputType > &A, const size_t source, const size_t k, Vector< bool > &buf1) |
Given a graph and a source vertex, indicates which vertices are contained within k hops. | |
template<Descriptor descr = descriptors::no_operation, typename IOType = double, class Operator = operators::square_diff< IOType, IOType, IOType >> | |
RC | kpp_initialisation (Matrix< IOType > &K, const Matrix< IOType > &X, const Operator &dist_op=Operator()) |
a simple implementation of the k++ initialisation algorithm for kmeans | |
template<typename IOType > | |
RC | label (Vector< IOType > &out, const Vector< IOType > &y, const Matrix< IOType > &W, const size_t n, const size_t l, const size_t maxIterations=1000) |
The label propagation algorithm. | |
template<Descriptor descr, class Ring , typename IOType , typename InputType > | |
RC | mpv (Vector< IOType > &u, const Matrix< InputType > &A, const size_t k, const Vector< IOType > &v, Vector< IOType > &temp, const Ring &ring) |
The matrix powers kernel. | |
template<Descriptor descr = descriptors::no_operation, class Ring , typename InputType , typename OutputType , Backend backend, typename Coords > | |
RC | norm2 (OutputType &x, const Vector< InputType, backend, Coords > &y, const Ring &ring=Ring(), const typename std::enable_if< std::is_floating_point< OutputType >::value, void >::type *const =nullptr) |
Provides a generic implementation of the 2-norm computation. | |
template<Descriptor descr = descriptors::no_operation, typename IOType , typename NonzeroT > | |
RC | simple_pagerank (Vector< IOType > &pr, const Matrix< NonzeroT > &L, Vector< IOType > &pr_next, Vector< IOType > &pr_nextnext, Vector< IOType > &row_sum, const IOType alpha=0.85, const IOType conv=0.0000001, const size_t max=1000, size_t *const iterations=nullptr, double *const quality=nullptr) |
The canonical PageRank algorithm. | |
template<Descriptor descr = descriptors::no_operation, typename IOType , typename WeightType , typename BiasType , typename ThresholdType = IOType, class MinMonoid = Monoid< grb::operators::min< IOType >, grb::identities::infinity >, class ReluMonoid = Monoid< grb::operators::relu< IOType >, grb::identities::negative_infinity >, class Ring = Semiring< grb::operators::add< IOType >, grb::operators::mul< IOType >, grb::identities::zero, grb::identities::one >> | |
grb::RC | sparse_nn_single_inference (grb::Vector< IOType > &out, const grb::Vector< IOType > &in, const std::vector< grb::Matrix< WeightType > > &layers, const std::vector< BiasType > &biases, const ThresholdType threshold, grb::Vector< IOType > &temp, const ReluMonoid &relu=ReluMonoid(), const MinMonoid &min=MinMonoid(), const Ring &ring=Ring()) |
Performs an inference step of a single data element through a Sparse Neural Network defined by num_layers sparse weight matrices and num_layers biases. | |
template<Descriptor descr = descriptors::no_operation, typename IOType , typename WeightType , typename BiasType , class ReluMonoid = Monoid< grb::operators::relu< IOType >, grb::identities::negative_infinity >, class Ring = Semiring< grb::operators::add< IOType >, grb::operators::mul< IOType >, grb::identities::zero, grb::identities::one >> | |
grb::RC | sparse_nn_single_inference (grb::Vector< IOType > &out, const grb::Vector< IOType > &in, const std::vector< grb::Matrix< WeightType > > &layers, const std::vector< BiasType > &biases, grb::Vector< IOType > &temp, const ReluMonoid &relu=ReluMonoid(), const Ring &ring=Ring()) |
Performs an inference step of a single data element through a Sparse Neural Network defined by num_layers sparse weight matrices and num_layers biases. | |
template<bool normalize = false, typename IOType > | |
RC | spy (grb::Matrix< IOType > &out, const grb::Matrix< bool > &in) |
Specialisation for boolean input matrices in. | |
template<bool normalize = false, typename IOType , typename InputType > | |
RC | spy (grb::Matrix< IOType > &out, const grb::Matrix< InputType > &in) |
Given an input matrix and a smaller output matrix, map nonzeroes from the input matrix into the smaller one and count the number of nonzeroes that are mapped from the bigger matrix into the smaller. | |
template<bool normalize = false, typename IOType > | |
RC | spy (grb::Matrix< IOType > &out, const grb::Matrix< void > &in) |
Specialisation for void input matrices in. | |
The namespace for ALP/GraphBLAS algorithms.
RC bicgstab | ( | grb::Vector< IOType > & | x, |
const grb::Matrix< NonzeroType > & | A, | ||
const grb::Vector< InputType > & | b, | ||
const size_t | max_iterations, | ||
ResidualType | tol, | ||
size_t & | iterations, | ||
ResidualType & | residual, | ||
Vector< InputType > & | r, | ||
Vector< InputType > & | rhat, | ||
Vector< InputType > & | p, | ||
Vector< InputType > & | v, | ||
Vector< InputType > & | s, | ||
Vector< InputType > & | t, | ||
const Semiring & | semiring = Semiring() , |
||
const Minus & | minus = Minus() , |
||
const Divide & | divide = Divide() |
||
) |
Solves a linear system \( b = Ax \) with \( x \) unknown by using the bi-conjugate gradient (bi-CG) stabilised method; i.e., BiCGstab.
descr | Any descriptor to use for the computation (optional). |
IOType | The solution vector element type. |
NonzeroType | The system matrix entry type. |
InputType | The element type of the right-hand side vector. |
ResidualType | The type of the residuals used during computation. |
Semiring | The semiring under which to perform the BiCGstab |
Minus | The inverse operator of the additive operator of Semiring. |
Divide | The inverse of the multiplicative operator of Semiring. |
By default, these will be the regular add, mul, subtract, and divide over the types IOType, NonzeroType, InputType, and/or ResidualType, as appropriate.
Does not perform any preconditioning.
[in,out] | x | On input: an initial guess to the solution \( Ax = b \). On output: if grb::SUCCESS is returned, the solution to \( Ax=b \) within the given tolerance tol. Otherwise, the last computed approximation to the solution is returned. |
[in] | A | The square non-singular system matrix \( A \). |
[in] | b | The right-hand side vector \( b \). |
If the size of \( A \) is \( n \times n \), then the sizes of x and b must be \( n \) also. The vector x must have capacity \( n \).
Mandatory inputs to the BiCGstab algorithm:
[in] | max_iterations | The maximum number of iterations this algorithm may perform. |
[in] | tol | The relative tolerance which determines when an an approximated solution \( x \) becomes acceptable. Must be positive and non-zero. |
Additional outputs of this algorithm:
[out] | iterations | When grb::SUCCESS is returned, the number of iterations that were required to obtain an acceptable approximate solution. |
[out] | residual | When grb::SUCCESS is returned, the square of the 2-norm of the residual; i.e., \( (r,r) \), where \( r = b - Ax \). |
To operate, this algorithm requires a workspace consisting of six vectors of length and capacity \( n \). If vectors with less capacity are passed as arguments, grb::ILLEGAL will be returned.
[in] | r,rhat,p,v,s,t | Workspace vectors required for BiCGstab. |
The BiCGstab operates over a field defined by the following algebraic structures:
[in] | semiring | Defines the domains as well as the additive and the multicative monoid. |
[in] | minus | The inverse of the additive operator. |
[in] | divide | The inverse of the multiplicative operator. |
_DEBUG
macro defined, the print-out statements require sqrt
as an additional algebraic concept. This concept presently lives "outside" of ALP.Valid descriptors to this algorithm are:
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
grb::RC conjugate_gradient | ( | grb::Vector< IOType > & | x, |
const grb::Matrix< NonzeroType > & | A, | ||
const grb::Vector< InputType > & | b, | ||
const size_t | max_iterations, | ||
ResidualType | tol, | ||
size_t & | iterations, | ||
ResidualType & | residual, | ||
grb::Vector< IOType > & | r, | ||
grb::Vector< IOType > & | u, | ||
grb::Vector< IOType > & | temp, | ||
const Ring & | ring = Ring() , |
||
const Minus & | minus = Minus() , |
||
const Divide & | divide = Divide() |
||
) |
Solves a linear system \( b = Ax \) with \( x \) unknown by the Conjugate Gradients (CG) method on general fields.
Does not perform any preconditioning.
descr | The user descriptor |
IOType | The input/output vector nonzero type |
ResidualType | The type of the residual |
NonzeroType | The matrix nonzero type |
InputType | The right-hand side vector nonzero type |
Ring | The semiring under which to perform CG |
Minus | The minus operator corresponding to the inverse of the additive operator of the given Ring. |
Divide | The division operator corresponding to the inverse of the multiplicative operator of the given Ring. |
Valid descriptors to this algorithm are:
By default, i.e., if none of ring, minus, or divide (nor their types) are explicitly provided by the user, the natural field on double data types will be assumed.
[in,out] | x | On input: an initial guess to the solution. On output: the last computed approximation. |
[in] | A | The (square) positive semi-definite system matrix. |
[in] | b | The known right-hand side in \( Ax = b \). Must be structurally dense. |
If A is \( n \times n \), then x and b must have matching length \( n \). The vector x furthermore must have a capacity of \( n \).
CG algorithm inputs:
[in] | max_iterations | The maximum number of CG iterations. |
[in] | tol | The requested relative tolerance. |
Additional outputs (besides x):
[out] | iterations | The number of iterations the algorithm has started. |
[out] | residual | The residual corresponding to output x. |
The CG algorithm requires three workspace buffers with capacity \( n \):
[in,out] | r | A temporary vector of the same size as x. |
[in,out] | u | A temporary vector of the same size as x. |
[in,out] | temp | A temporary vector of the same size as x. |
Finally, the algebraic structures over which the CG is executed are given:
[in] | ring | The semiring under which to perform the CG. |
[in] | minus | The inverse of the additive operator of ring. |
[in] | divide | The inverse of the multiplicative operator of ring. |
This algorithm may return one of the following error codes:
On output, the contents of the workspace r, u, and temp are always undefined. For non-grb::SUCCESS error codes, additional containers or states may be left undefined:
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC cosine_similarity | ( | OutputType & | similarity, |
const Vector< InputType1 > & | x, | ||
const Vector< InputType2 > & | y, | ||
const Ring & | ring = Ring() , |
||
const Division & | div = Division() |
||
) |
Computes the cosine similarity.
Given two vectors \( x, y \) of equal size \( n \), this function computes \( \alpha = \frac{ (x,y) }{ ||x||_2\cdot||y||_2 } \).
The 2-norms and inner products are computed according to the given semi- ring. However, the norms make use of the standard sqrt
and so the algorithm assumes a regular field is used. Effectively, hence, the semiring controls the precision / data types under which the computation is performed.
descr | The descriptor under which to perform the computation. |
OutputType | The type of the output element (scalar). |
InputType1 | The type of the first vector. |
InputType2 | The type of the second vector. |
Ring | The semiring used. |
Division | Which binary operator correspond to division corresponding to the given Ring. |
[out] | similarity | Where to fold the result into. |
[in] | x | The non-zero left-hand input vector. |
[in] | y | The non-zero right-hand input vector. |
[in] | ring | The semiring to compute over. |
[in] | div | The division operator corresponding to ring. |
The argument div is optional. It will map to grb::operators::divide by default.
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC kmeans_iteration | ( | Matrix< IOType > & | K, |
Vector< std::pair< size_t, IOType > > & | clusters_and_distances, | ||
const Matrix< IOType > & | X, | ||
const size_t | max_iter = 1000 , |
||
const Operator & | dist_op = Operator() |
||
) |
The kmeans iteration given an initialisation.
[in,out] | K | k by m matrix containing the current k means as row vectors |
[in] | clusters_and_distances | Vector containing the class and distance to centroid for each point |
[in] | X | m by n matrix containing the n points to be classified as column vectors |
[in] | max_iter | Maximum number of iterations |
[in] | dist_op | Coordinatewise distance operator, squared difference by default |
RC knn | ( | Vector< OutputType > & | u, |
const Matrix< InputType > & | A, | ||
const size_t | source, | ||
const size_t | k, | ||
Vector< bool > & | buf1 | ||
) |
Given a graph and a source vertex, indicates which vertices are contained within k hops.
This implementation is based on the matrix powers kernel over a Boolean semiring.
[out] | u | The distance-k neighbourhood. Any prior contents will be ignored. |
[in] | A | The input graph in (square) matrix form |
[in] | source | The source vertex index. |
[in] | k | The neighbourhood distance, or the maximum number of hops in a breadth-first search. |
This algorithm requires the following workspace:
[in,out] | buf1 | A buffer vector. Must match the size of A. |
For \( n \times n \) matrices A, the capacity of u, buf1, and buf2 must equal \( n \).
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC kpp_initialisation | ( | Matrix< IOType > & | K, |
const Matrix< IOType > & | X, | ||
const Operator & | dist_op = Operator() |
||
) |
a simple implementation of the k++ initialisation algorithm for kmeans
[in,out] | K | k by m matrix containing the current k means as row vectors |
[in] | X | m by n matrix containing the n points to be classified as column vectors |
[in] | dist_op | Coordinatewise distance operator, squared difference by default |
RC label | ( | Vector< IOType > & | out, |
const Vector< IOType > & | y, | ||
const Matrix< IOType > & | W, | ||
const size_t | n, | ||
const size_t | l, | ||
const size_t | maxIterations = 1000 |
||
) |
The label propagation algorithm.
IOType | The value type of the label vector. This will determine the precision of all computations this algorithm performs. |
[out] | out | The resulting labelled vector representing the n vertices. |
[in] | y | Vector holding the initial labels from a total set of n vertices. The initial labels are assumed to correspond to the vertices corresponding to the first l entries of this vector. The labels must be either 0 ar 1. |
[in] | W | Sparse symmetric matrix of size n by n, holding the weights between the n vertices. The weights must be positive (larger than 0). The matrix may be defective while the corresponding graph may not be connected. |
[in] | n | The total number of vertices. If zero, and all of out, y, and W are empty, calling this function is equivalent to a no-op. |
[in] | l | The number of vertices with an initial label. Must be larger than zero. |
[in] | maxIterations | The maximum number of iterations this algorithm may execute. Optional. Default value: 1000. |
[1] Kamvar, Haveliwala, Manning, Golub; ‘Extrapolation methods for accelerating the PageRank computation’, ACM Press, 2003.
RC mpv | ( | Vector< IOType > & | u, |
const Matrix< InputType > & | A, | ||
const size_t | k, | ||
const Vector< IOType > & | v, | ||
Vector< IOType > & | temp, | ||
const Ring & | ring | ||
) |
The matrix powers kernel.
Calculates \( y = A^k x \) for some integer \( k \geq 0 \) using the given semiring.
descr | The descriptor used to perform this operation. |
Ring | The semiring used. |
IOType | The output vector type. |
InputType | The nonzero type of matrix elements. |
implementation | Which implementation to use. |
[out] | u | The output vector. Contents shall be overwritten. The supplied vector must match the row dimension size of A. |
[in] | A | The square input matrix A. The supplied matrix must match the dimensions of u and v. |
[in] | k | How many matrix–vector multiplications are requested. |
[in] | v | The input vector v. The supplied vector must match the column dimension size of A. It may not be the same vector as u. |
[in] | ring | The semiring to be used. This defines the additive and multiplicative monoids to be used. |
This algorithm requires workspace:
[in,out] | temp | A workspace buffer of matching size to the row dimension of A. May be the same vector as v, though note that the contents of temp on output are undefined. |
This algorithm assumes that u and temp have full capacity. If this assumption does not hold, then a two-stage mpv must be employed instead.
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC norm2 | ( | OutputType & | x, |
const Vector< InputType, backend, Coords > & | y, | ||
const Ring & | ring = Ring() , |
||
const typename std::enable_if< std::is_floating_point< OutputType >::value, void >::type * const | = nullptr |
||
) |
Provides a generic implementation of the 2-norm computation.
Proceeds by computing a dot-product on itself and then taking the square root of the result.
This function is only available when the output type is floating point.
For return codes, exception behaviour, performance semantics, template and non-template arguments,
[out] | x | The 2-norm of y. The input value of x will be ignored. |
[in] | y | The vector to compute the norm of. |
[in] | ring | The Semiring under which the 2-norm is to be computed. |
sqrt
function must be applied on the result of a dot-product of y with itself under the supplied semiring.RC simple_pagerank | ( | Vector< IOType > & | pr, |
const Matrix< NonzeroT > & | L, | ||
Vector< IOType > & | pr_next, | ||
Vector< IOType > & | pr_nextnext, | ||
Vector< IOType > & | row_sum, | ||
const IOType | alpha = 0.85 , |
||
const IOType | conv = 0.0000001 , |
||
const size_t | max = 1000 , |
||
size_t *const | iterations = nullptr , |
||
double *const | quality = nullptr |
||
) |
The canonical PageRank algorithm.
descr | The descriptor under which to perform the computation. |
IOType | The value type of the pagerank vector. This will determine the precision of all computations this algorithm performs. |
NonzeroT | The type of the elements of the nonzero matrix. |
[in,out] | pr | Vector of size and capacity \( n \), where \( n \) is the vertex size of the input graph L. On input, the contents of this vector will be taken as the initial guess to the final result, but only if the vector is dense; if it is, all entries of the initial guess must be nonzero, while it is not, this algorithm will make an initial guess. On output, if grb::SUCCESS is returned, the PageRank vector corresponding to L. |
[in] | L | The input graph as a square link matrix of size \( n \). |
To operate, this algorithm requires a workspace of three vectors. The size and capacities of these must equal \( n \). The contents on input are ignored, and the contents on output are undefined.
This algorithm does not explicitly materialise the Google matrix \( G = \alpha L + (1-\alpha)ee^T \) over which the power iterations are exectuted.
[in,out] | pr_next | Buffer for the PageRank algorithm. |
[in,out] | pr_nextnext | Buffer for the PageRank algorithm. |
[in,out] | row_sum | Buffer for the PageRank algorithm. |
The PageRank algorithm holds the following optional parameters:
[in] | alpha | The scaling factor. The default value is 0.85. This value must be smaller than 1, and larger than 0. |
[in] | conv | If the difference between two successive iterations, in terms of its one-norm, is less than this value, then the PageRank vector is considered converged and this algorithm exits successfully. The default value is \( 10^{-8} \). If this value is set to zero, then the algorithm will continue until max iterations are reached. May not be negative. |
[in] | max | The maximum number of power iterations. The default value is 1000. This value must be larger than 0. |
The PageRank algorithm reports the following optional output:
[out] | iterations | If not nullptr , the number of iterations the call to this algorithm took will be written to the location pointed to. |
[out] | quality | If not nullptr , the last computed residual will be written to the location pointed to. |
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
grb::RC sparse_nn_single_inference | ( | grb::Vector< IOType > & | out, |
const grb::Vector< IOType > & | in, | ||
const std::vector< grb::Matrix< WeightType > > & | layers, | ||
const std::vector< BiasType > & | biases, | ||
const ThresholdType | threshold, | ||
grb::Vector< IOType > & | temp, | ||
const ReluMonoid & | relu = ReluMonoid() , |
||
const MinMonoid & | min = MinMonoid() , |
||
const Ring & | ring = Ring() |
||
) |
Performs an inference step of a single data element through a Sparse Neural Network defined by num_layers sparse weight matrices and num_layers biases.
The initial single data element may be sparse also, such as common in Graph Neural Networks (GNNs).
Inference here is a repeated sequence of application of a sparse linear layer, addition of a bias factor, and the application of a ReLU.
We employ a linear algebraic formulation where the ReLU and the bias application are jointly applied via a max-operator.
This formalism follows closely the linear algebraic approach to the related IEEE/MIT GraphChallenge problem, such as for example described in
Combinatorial Tiling for Sparse Neural Networks F. Pawlowski, R. H. Bisseling, B. Uçar and A. N. Yzelman 2020 IEEE High Performance Extreme Computing (HPEC) Conference
[out] | out | The result of inference through the neural network. |
[in] | in | The input vector, may be sparse or dense. |
[in] | layers | A collection of linear layers. Each layer is assumed to be square and of the equal size to one another. |
This implies that all layers are \( n \times n \). The vectors in and out hence must be of length \( n \).
Commonly, as an input propagates through a network, the features become increasingly dense. Hence out is assumed to have full capacity in order to potentially store a fully dense activation vector.
Inference proceeds under a set of biases, one for each layer. Activation vectors are added a constant bias value prior to applying the given relu function. After application, the resulting vector is furthermore tresholded. The treshold is assumed constant over all layers.
[in] | biases | An array of num_layers bias factors. |
[in] | threshold | The value used for thresholding. |
Inference is done using a single buffer that is alternated with out:
[in,out] | temp | A buffer of size and capacity \( n \). |
Finally, optional arguments define the algebraic structures under which inference proceeds:
[in] | relu | The non-linear ReLU function to apply element-wise. |
[in] | min | Operator used for thresholding. Maximum feature value is hard-coded to 32, as per the GraphChallenge. |
[in] | ring | The semiring under which to perform the inference. |
The default algebraic structures are standard relu (i.e., max), min for tresholding, and the real (semi-) ring.
Valid descriptors for this algorithm are:
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
grb::RC sparse_nn_single_inference | ( | grb::Vector< IOType > & | out, |
const grb::Vector< IOType > & | in, | ||
const std::vector< grb::Matrix< WeightType > > & | layers, | ||
const std::vector< BiasType > & | biases, | ||
grb::Vector< IOType > & | temp, | ||
const ReluMonoid & | relu = ReluMonoid() , |
||
const Ring & | ring = Ring() |
||
) |
Performs an inference step of a single data element through a Sparse Neural Network defined by num_layers sparse weight matrices and num_layers biases.
The initial single data element may be sparse also, such as common in Graph Neural Networks (GNNs).
Inference here is a repeated sequence of application of a sparse linear layer, addition of a bias factor, and the application of a ReLU.
We employ a linear algebraic formulation where the ReLU and the bias application are jointly applied via a max-operator.
This formalism follows closely the linear algebraic approach to the related IEEE/MIT GraphChallenge problem, such as, for example, described in
Combinatorial Tiling for Sparse Neural Networks F. Pawlowski, R. H. Bisseling, B. Uçar and A. N. Yzelman 2020 IEEE High Performance Extreme Computing (HPEC) Conference
[out] | out | The result of inference through the neural network. |
[in] | in | The input vector, may be sparse or dense. |
[in] | layers | A collection of linear layers. Each layer is assumed to be square and of the equal size to one another. |
This implies that all layers are \( n \times n \). The vectors in and out hence must be of length \( n \).
Commonly, as an input propagates through a network, the features become increasingly dense. Hence out is assumed to have full capacity in order to potentially store a fully dense activation vector.
Inference proceeds under a set of biases, one for each layer. Activation vectors are added a constant bias value prior to applying the given relu function. This function does not perform tresholding.
[in] | biases | An array of num_layers bias factors. |
Inference is done using a single buffer that is alternated with out:
[in,out] | temp | A buffer of size and capacity \( n \). |
Finally, optional arguments define the algebraic structures under which inference proceeds:
[in] | relu | The non-linear ReLU function to apply element-wise. |
[in] | ring | The semiring under which to perform the inference. |
The default algebraic structures are standard relu (i.e., max), min for tresholding, and the real (semi-) ring.
Valid descriptors for this algorithm are:
For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC spy | ( | grb::Matrix< IOType > & | out, |
const grb::Matrix< bool > & | in | ||
) |
Specialisation for boolean input matrices in.
See grb::algorithms::spy.
RC spy | ( | grb::Matrix< IOType > & | out, |
const grb::Matrix< InputType > & | in | ||
) |
Given an input matrix and a smaller output matrix, map nonzeroes from the input matrix into the smaller one and count the number of nonzeroes that are mapped from the bigger matrix into the smaller.
normalize | If set to true, will not compute a number of mapped nonzeroes, but its inverse instead (one divided by the count). The default value for this template parameter is false . |
[out] | out | The smaller output matrix. |
[in] | in | The larger input matrix. |
false
) will be counted as a nonzero by this algorithm.grb::set( tmp, in, true );
, with tmp a Boolean or pattern matrix of the same size as in.For performance semantics regarding work, inter-process data movement, intra-process data movement, synchronisations, and memory use, please see the specification of the ALP primitives this function relies on. These performance semantics, with the exception of getters such as grb::nnz, are specific to the backend selected during compilation.
RC spy | ( | grb::Matrix< IOType > & | out, |
const grb::Matrix< void > & | in | ||
) |
Specialisation for void input matrices in.
See grb::algorithms::spy.