Using PLAPACK

by van de Geijn, Robert A.; Alpatov, Philip; Baker, Greg; Edwards, Carter; Gunnels, John; Morrow, Greg; Overfelt, James

ISBN13: 9780262720267

ISBN10: 0262720264

Format: Paperback

Pub. Date: 1997-03-15

Publisher(s): MIT Press

List Price: ~~$45.00~~

Summary

PLAPACK is a library infrastructure for the parallel implementation of linear algebra algorithms and applications on distributed-memory supercomputers such as the Intel Paragon, IBM SP2, Cray T3D/T3E, SGI PowerChallenge, and Convex Exemplar. This infrastructure allows library developers, scientists, and engineers to exploit a natural approach to encoding so-called blocked algorithms, which achieve high performance by operating on submatrices and subvectors. This feature, as well as the use of an alternative, more application-centric approach to data distribution, sets PLAPACK apart from other parallel linear algebra libraries, allowing for strong performance and significantly less programming by the user. This book is a comprehensive introduction to all the components of a high-performance parallel linear algebra library, as well as a guide to the PLAPACK infrastructure.

Series: Scientific and Engineering Computation
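To make the idea of a blocked algorithm concrete, here is a minimal single-process sketch in plain Python. It is not PLAPACK code and uses none of its routines; the function name `blocked_matmul` and the block-size parameter `nb` are assumptions chosen for illustration. It computes a matrix product by updating one submatrix of the result at a time, which is the access pattern the summary refers to.

```python
def blocked_matmul(A, B, nb=2):
    """Return C = A * B for list-of-lists matrices, one nb-by-nb block at a time.

    Illustrative only: `nb` is a hypothetical block size, not a PLAPACK parameter.
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    # Loop over blocks of C (i0, j0) and over blocks along the shared dimension (p0).
    for i0 in range(0, m, nb):
        for j0 in range(0, n, nb):
            for p0 in range(0, k, nb):
                # Update the block of C starting at (i0, j0) with the product
                # of one block of A and one block of B.
                for i in range(i0, min(i0 + nb, m)):
                    for j in range(j0, min(j0 + nb, n)):
                        s = C[i][j]
                        for p in range(p0, min(p0 + nb, k)):
                            s += A[i][p] * B[p][j]
                        C[i][j] = s
    return C


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]
    B = [[7, 8], [9, 10], [11, 12]]
    print(blocked_matmul(A, B))
```

In a real high-performance library the innermost block update would typically be handed to an optimized kernel; the nested loops here only show how the work is partitioned.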

Table of Contents

Series Foreword

Preface

1 Introduction

1.1 Why a New Infrastructure?

1.2 Natural Description of Linear Algebra Algorithms

1.3 Physically Based Matrix Distribution

1.4 Redistributing and Duplicating Matrices and Vectors

1.4.1 Redistributing vectors, matrix rows, and matrix columns

1.4.2 Spreading vectors, matrix rows, and matrix columns

1.4.3 Reducing vectors, matrix rows, and matrix columns

1.4.4 Terminology

1.5 Implementation of Basic Matrix-Vector Operations (Preview)

1.5.1 Matrix-vector multiplication

1.5.2 Rank-1 update

1.6 Implementation of Basic Matrix-Matrix Operations (Preview)

1.6.1 Matrix-matrix multiplication

1.6.2 Attaining better performance

1.7 Basic Linear Algebra Subprograms

1.8 Message-Passing Interface

1.9 Parallel Sparse Linear Algebra

1.10 FORTRAN Interface

1.11 Availability

2 Templates and Linear Algebra Objects

2.1 Initializing PLAPACK

2.2 Distribution Templates

2.2.1 Template creation

2.2.2 Template destruction

2.2.3 Template inquiry routines

2.3 Linear Algebra Objects

2.3.1 Linear algebra object creation

2.3.2 Linear algebra object destruction

2.3.3 Linear algebra object inquiry routines

2.3.4 Extracting and setting local data

2.3.5 Initializing object data

2.4 Example

2.5 Return Values

2.6 More Operations and Information

3 Advanced Linear Algebra Object Manipulation

3.1 Creating Views into Objects

3.2 Splitting of Linear Algebra Objects

3.2.1 Splitting into four quadrants

3.2.2 Splitting into two parts

3.3 Shifting of Linear Algebra Objects

3.4 Determining Where to Split

3.5 Creating Objects "Conformal to" Other Objects

3.6 Annotating Object Orientation

3.7 Casting Object Types

3.8 More Operations and Information

4 Application Program Interface

4.1 Introduction

4.2 API-Activation

4.3 Opening and Closing an Object

4.4 Accessing a Vector

4.5 Accessing a Matrix

4.6 Completion and Synchronization

4.7 Examples

4.8 More Operations and Information

5 Data Duplication and Consolidation

5.1 Copy

5.1.1 Copy involving vectors

5.1.2 Copy involving multivectors

5.1.3 Copy involving matrices

5.1.4 Copy involving multiscalars

5.1.5 Specialized copy

5.2 Reduce

5.3 Pipelining Computation and Communication

5.4 A Building Block Approach to Implementing Copy and Reduce

5.4.1 Collective communication operations

5.4.2 Efficient implementation of collective communication

5.4.3 Implementation of the copy

5.4.4 Implementation of the reduce

5.5 More Operations and Information

6 Vector-Vector Operations

6.1 Copy

6.1.1 Standard FORTRAN call

6.1.2 PLAPACK FORTRAN-C interface

6.1.3 PLAPACK calls

6.2 Swap

6.2.1 Standard FORTRAN call

6.2.2 PLAPACK FORTRAN-C interface

6.2.3 PLAPACK calls

6.3 Scaling a Vector (Object)

6.3.1 Standard FORTRAN call

6.3.2 PLAPACK FORTRAN-C interface

6.3.3 PLAPACK calls

6.4 Scaled Vector (Object) Addition

6.4.1 Standard FORTRAN call

6.4.2 PLAPACK FORTRAN-C interface

6.4.3 PLAPACK calls

6.5 Inner Product of Vectors

6.5.1 Standard FORTRAN call

6.5.2 PLAPACK FORTRAN-C interface

6.5.3 PLAPACK calls

6.6 Norms of Vectors

6.6.1 Standard FORTRAN call

6.6.2 PLAPACK FORTRAN-C interface

6.6.3 PLAPACK calls

6.7 Maximum Absolute Value in Vector

6.7.1 Standard FORTRAN call

6.7.2 PLAPACK FORTRAN-C interface

6.7.3 PLAPACK calls

6.8 Example: Parallelizing Inner Product

6.9 Example: Parallelizing "axpy" for Vector Objects

6.10 More Operations and Information

7 Matrix-Vector Operations

7.1 General Matrix-Vector Multiplication

7.1.1 Standard FORTRAN call

7.1.2 PLAPACK FORTRAN-C interface

7.1.3 PLAPACK calls

7.2 Symmetric Matrix-Vector Multiplication

7.2.1 Standard FORTRAN call

7.2.2 PLAPACK FORTRAN-C interface

7.2.3 PLAPACK calls

7.3 Triangular Matrix-Vector Multiplication

7.3.1 Standard FORTRAN call

7.3.2 PLAPACK FORTRAN-C interface

7.3.3 PLAPACK calls

7.4 Triangular Solve

7.4.1 Standard FORTRAN call

7.4.2 PLAPACK FORTRAN-C interface

7.4.3 PLAPACK calls

7.5 Rank-1 Update

7.5.1 Standard FORTRAN call

7.5.2 PLAPACK FORTRAN-C interface

7.5.3 PLAPACK calls

7.6 Symmetric Rank-1 Update

7.6.1 Standard FORTRAN call

7.6.2 PLAPACK FORTRAN-C interface

7.6.3 PLAPACK calls

7.7 Symmetric Rank-2 Update

7.7.1 Standard FORTRAN call

7.7.2 PLAPACK FORTRAN-C interface

7.7.3 PLAPACK calls

7.8 Example: Parallelizing Matrix-Vector Multiplication

7.8.1 Simple implementation

7.8.2 General implementation

7.9 Example: Parallelizing Rank-1 Update

7.9.1 Simple implementation

7.9.2 General implementation

7.10 More Operations and Information

8 Matrix-Matrix Operations

8.1 General Matrix-Matrix Multiplication

8.1.1 Standard FORTRAN call

8.1.2 PLAPACK FORTRAN-C interface

8.1.3 PLAPACK calls

8.2 Symmetric Matrix-Matrix Multiplication

8.2.1 Standard FORTRAN call

8.2.2 PLAPACK FORTRAN-C interface

8.2.3 PLAPACK calls

8.3 Symmetric Rank-k Update

8.3.1 Standard FORTRAN call

8.3.2 PLAPACK FORTRAN-C interface

8.3.3 PLAPACK calls

8.4 Symmetric Rank-2k Update

8.4.1 Standard FORTRAN call

8.4.2 PLAPACK FORTRAN-C interface

8.4.3 PLAPACK calls

8.5 Triangular Matrix-Matrix Multiplication

8.5.1 Standard FORTRAN call

8.5.2 PLAPACK FORTRAN-C interface

8.5.3 PLAPACK calls

8.6 Triangular Solve with Multiple Right-Hand-Sides

8.6.1 Standard FORTRAN call

8.6.2 PLAPACK FORTRAN-C interface

8.6.3 PLAPACK calls

8.7 Example: Parallelizing Matrix-Matrix Multiplication

8.7.1 Forming C = AB + βC

8.7.2 Forming C = AB^T + βC

8.7.3 Forming C = A^T B + βC

8.7.4 Forming C = A^T B^T + βC

8.7.5 A more general approach

8.7.6 Performance results

8.8 Querying Algorithmic Blocking Size

8.9 More Operations and Information

9 Application of the Infrastructure

9.1 Cholesky Factorization

9.2 Right-Looking Variant

9.2.1 Level-2 BLAS implementation

9.2.2 Level-3 BLAS implementation

9.3 Left-Looking Variant

9.3.1 Level-2 BLAS implementation

9.3.2 Level-3 BLAS implementation

9.3.3 Towards further performance improvements

9.4 More Operations and Information

A Summary of PLAPACK Routines and Their Arguments

B Summary of BLAS Related Routines

Bibliography

Index

Constants Index

Function Index
