Name: Introduction to High Performance Scientific Computing
ISBN: 978-1-257-99254-6

Victor Eijkhout

Computers & Technology

Free

The field of high performance scientific computing lies at the
crossroads of a number of disciplines and skill sets.
Correspondingly, using high performance computing successfully
in science requires at least elementary knowledge of and skills
in all these areas. This book brings together the strands of
numerical modeling, numerical linear algebra, computer
architecture, parallel computing, and performance optimization
in a unified manner.

The contents of this book are a combination of theoretical material
and self-guided tutorials on various practical skills. Together,
these teach a graduate student or advanced undergraduate the
skills needed to be a successful computational scientist.

Victor Eijkhout is a research scientist at the
Texas Advanced Computing Center of
The University of Texas at Austin.

This book is released under a CC-BY license, thanks to a gift from the Saylor Foundation. Print copies and course materials are available from the author's web page.

Language

English

I Theory

Single-processor Computing

The Von Neumann architecture

Modern processors

The processing cores

8-bit, 16-bit, 32-bit, 64-bit

Caches: on-chip memory

Graphics, controllers, special purpose hardware

Superscalar processing and instruction-level parallelism

Memory Hierarchies

Busses

Latency and Bandwidth

Registers

Caches

Prefetch streams

Concurrency and memory transfer

Memory banks

TLB and virtual memory

Multicore architectures

Cache coherence

Computations on multicore chips

Locality and data reuse

Data reuse and arithmetic intensity

Locality

Programming strategies for high performance

Peak performance

Pipelining

Cache size

Cache lines

TLB

Cache associativity

Loop tiling

Optimization strategies

Cache aware and cache oblivious programming

Case study: Matrix-vector product

Case study: Goto matrix-matrix product

Power consumption

Derivation of scaling properties

Multicore

Total computer power

Review questions

Parallel Computing

Introduction

Quantifying parallelism

Definitions

Asymptotics

Amdahl's law

Scalability

Simulation scaling

Parallel Computers Architectures

SIMD

MIMD / SPMD computers

The commoditization of supercomputers

Different types of memory access

Symmetric Multi-Processors: Uniform Memory Access

Non-Uniform Memory Access

Logically and physically distributed memory

Granularity of parallelism

Data parallelism

Instruction-level parallelism

Task-level parallelism

Conveniently parallel computing

Medium-grain data parallelism

Task granularity

Parallel programming

Thread parallelism

OpenMP

Distributed memory programming through message passing

Hybrid shared/distributed memory computing

Parallel languages

OS-based approaches

Active messages

Bulk synchronous parallelism

Program design for parallelism

Topologies

Some graph theory

Busses

Linear arrays and rings

2D and 3D arrays

Hypercubes

Switched networks

Cluster networks

Bandwidth and latency

Locality in parallel computing

Multi-threaded architectures

Co-processors

A little history

Bottlenecks

GPU computing

Intel Xeon Phi

Remaining topics

Load balancing

Distributed computing, grid computing, cloud computing

Usage scenarios

Characterization

Capability versus capacity computing

FPGA computing

MapReduce

The top500 list

The top500 list as a recent history of supercomputing

Heterogeneous computing

Computer Arithmetic

Integers

Real numbers

They're not really real numbers

Representation of real numbers

Limitations

Normalized and unnormalized numbers

Representation error

Machine precision

The IEEE 754 standard for floating point numbers

Round-off error analysis

Correct rounding

Addition

Multiplication

Subtraction

Examples

Roundoff error in parallel computations

Compilers and round-off

More about floating point arithmetic

Programming languages

Other computer arithmetic systems

Extended precision

Fixed-point arithmetic

Complex numbers

Conclusions

Numerical treatment of differential equations

Initial value problems

Error and stability

Finite difference approximation: Euler explicit and implicit methods

Boundary value problems

General PDE theory

The Poisson equation in one space dimension

The Poisson equation in two space dimensions

Difference stencils

Other discretization techniques

Initial boundary value problem

Discretization

Stability analysis

Numerical linear algebra

Elimination of unknowns

Linear algebra in computer arithmetic

Roundoff control during elimination

Influence of roundoff on eigenvalue computations

LU factorization

The algorithm

The Cholesky factorization

Uniqueness

Pivoting

Solving the system

Complexity

Block algorithms

Sparse matrices

Storage of sparse matrices

Sparse matrices and graph theory

LU factorizations of sparse matrices

Iterative methods

Abstract presentation

Convergence and error analysis

Computational form

Convergence of the method

Choice of K

Stopping tests

Theory of general iterative methods

Iterating by orthogonalization

Coupled recurrences form of iterative methods

The method of Conjugate Gradients

Derivation from minimization

GMRES

Complexity