Albert-Jan N. Yzelman
What's new
- Stale synchronicity, originally posited and popularised in machine learning by Xing's Petuum framework, relaxes the strict superstep structure in Bulk Synchronous Parallel (BSP) processing in a structured, bounded fashion. BSP can be relaxed even further to fit, for example, inference-only deep learning workloads: by exploiting sparse neural network connectivity while relaxing barrier implementations, or by relaxing collective semantics. These relaxations retain the advantages of structured general-purpose parallel programming whilst achieving performance that exceeds the state of the art. With thanks to co-authors Raphael Steiner, Christos Matzarakos, András Papp, and Toni Böhnlein, you can now also enjoy the benefits of stale synchronicity for sparse triangular solves: on 32 realistic matrices for SpTRSV workloads and using a full 48-core AMD EPYC, we achieve a geometric mean speedup of 26% versus SpMP.
- In partitioning matrices, replicating separators instead of incurring communication on them can trade memory for data movement. This classic trade-off has become known as 2.5D methods; the idea, however, goes back as far as 1999 in classic computations and (at least) to 2011 as a general concept in sparse matrix partitioning. We now release a pre-print that studies replication applied to scheduling general computations.
- After more than eleven years, I have moved on from Huawei. Follow me on LinkedIn for more details and for what's next!
- A summary of our recent papers on scheduling and applications was presented at the 19th workshop on Scheduling for Large-Scale Systems. The slides are available.
Contact info
| E-mail: | albert-jan@<last name>.net |
| Telephone: | +41 76 771 29 29 |
| ORCID: | 0000-0001-8842-3689 |
Overview
- Publications
- Presentations
- ALP/Pregel & ALP/GraphBLAS (gitee), MulticoreBSP, and other software
- A short biography, and an even shorter one