Show simple item record

dc.contributor.author: Agarwal, Anant (en_US)
dc.contributor.author: Kranz, David A. (en_US)
dc.contributor.author: Natarajan, Venkat (en_US)
dc.date.accessioned: 2023-03-29T14:39:29Z
dc.date.available: 2023-03-29T14:39:29Z
dc.date.issued: 1995-09
dc.identifier.uri: https://hdl.handle.net/1721.1/149251
dc.description.abstract: This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. While several previous papers have looked at hyperplane partitioning of iteration spaces to reduce communication traffic, the problem of deriving the optimal tiling parameters for minimal communication in loops with general affine index expressions had remained open. Our paper solves this open problem by presenting a method for deriving an optimal hyperparallelepiped tiling of iteration spaces for minimal communication in multiprocessors with caches. We show that the same theoretical framework can also be used to determine optimal tiling parameters for both data and loop partitioning in distributed memory multicomputers. Our framework uses matrices to represent iteration and data space mappings and the notion of uniformly intersecting references to capture temporal locality in array references. We introduce the notion of data footprints to estimate the communication traffic between processors and use linear algebraic methods and lattice theory to compute precisely the size of data footprints. We have implemented this framework in a compiler for Alewife, a distributed shared-memory multiprocessor. (en_US)
dc.relation.ispartofseries: MIT-LCS-TM-538
dc.title: Automatic Partitioning of Parallel Loops and Data Arrays for Distributed Shared-memory Multiprocessors (en_US)

