Table Of ContentMathematics Chapman & Hall/CRC
D
e Numerical Analysis and Scientific Computing
Many of today’s complex scientific applications now require a vast s
amount of computational power. General purpose graphics processing ig
units (GPGPUs) enable researchers in a variety of fields to benefit from n
Designing Scientific
the computational power of all the cores available inside graphics cards. in
g
Designing Scientific Applications on GPUs shows you how to use GPUs
for applications in diverse scientific fields, from physics and mathematics S Applications on GPUs
c
to computer science. The book explains the methods necessary for
i
e
designing or porting your scientific application on GPUs. It will improve your
n
knowledge about image processing, numerical applications, methodology
t
to design efficient applications, optimization methods, and much more. ifi
c
The first part of the book introduces the GPUs and NVIDIA’s CUDA
A
programming model, currently the most widespread environment for
p
designing GPU applications. The second part focuses on significant
p
image processing applications on GPUs. The third part presents general
l
i
methodologies for software development on GPUs and the fourth part c
describes the use of GPUs for addressing several optimization problems. a
t
The fifth part covers many numerical applications, including obstacle i
o
problems, fluid simulation, and atomic physics models. The last part
n
illustrates agent-based simulations, pseudorandom number generation, s
and the solution of large sparse linear systems for integer factorization. o
Some of the codes presented in the book are available online. n
G
Features
• Provides a comprehensive, research-level treatment of the use of P
U
GPUs for scientific applications
s
• Explains how to port scientific applications on GPUs
• Discusses the latest algorithms and their use with classical
applications
• Describes key elements for choosing the context and way of
implementing the algorithms C
o
u
t
u
r
ie
r Edited by
K16551 Raphaël Couturier
K16551_Cover.indd 1 10/16/13 8:45 AM
Designing Scientific
Applications on GPUs
CHAPMAN & HALL/CRC
Numerical Analysis and Scientific Computing
Aims and scope:
Scientific computing and numerical analysis provide invaluable tools for the sciences and engineering.
This series aims to capture new developments and summarize state-of-the-art methods over the whole
spectrum of these fields. It will include a broad range of textbooks, monographs, and handbooks.
Volumes in theory, including discretisation techniques, numerical algorithms, multiscale techniques,
parallel and distributed algorithms, as well as applications of these methods in multi-disciplinary fields,
are welcome. The inclusion of concrete real-world examples is highly encouraged. This series is meant
to appeal to students and researchers in mathematics, engineering, and computational science.
Editors
Choi-Hong Lai Frédéric Magoulès
School of Computing and Applied Mathematics and
Mathematical Sciences Systems Laboratory
University of Greenwich Ecole Centrale Paris
Editorial Advisory Board
Mark Ainsworth Peter Jimack
Mathematics Department School of Computing
Strathclyde University University of Leeds
Todd Arbogast Takashi Kako
Institute for Computational Department of Computer Science
Engineering and Sciences The University of Electro-Communications
The University of Texas at Austin
Peter Monk
Craig C. Douglas Department of Mathematical Sciences
Computer Science Department University of Delaware
University of Kentucky
Francois-Xavier Roux
Ivan Graham ONERA
Department of Mathematical Sciences
University of Bath Arthur E.P. Veldman
Institute of Mathematics and Computing Science
University of Groningen
Proposals for the series should be submitted to one of the series editors above or directly to:
CRC Press, Taylor & Francis Group
4th, Floor, Albert House
1-4 Singer Street
London EC2A 4BQ
UK
Published Titles
Classical and Modern Numerical Analysis: Theory, Methods and Practice
Azmy S. Ackleh, Edward James Allen, Ralph Baker Kearfott, and
Padmanabhan Seshaiyer
Cloud Computing: Data-Intensive Computing and Scheduling
Frédéric Magoulès, Jie Pan, and Fei Teng
Computational Fluid Dynamics
Frédéric Magoulès
A Concise Introduction to Image Processing using C++
Meiqing Wang and Choi-Hong Lai
Decomposition Methods for Differential Equations: Theory and Applications
Juergen Geiser
Designing Scientific Applications on GPUs
Raphaël Couturier
Desktop Grid Computing
Christophe Cérin and Gilles Fedak
Discrete Dynamical Systems and Chaotic Machines: Theory and Applications
Jacques M. Bahi and Christophe Guyeux
Discrete Variational Derivative Method: A Structure-Preserving Numerical Method for
Partial Differential Equations
Daisuke Furihata and Takayasu Matsuo
Grid Resource Management: Toward Virtual and Services Compliant Grid Computing
Frédéric Magoulès, Thi-Mai-Huong Nguyen, and Lei Yu
Fundamentals of Grid Computing: Theory, Algorithms and Technologies
Frédéric Magoulès
Handbook of Sinc Numerical Methods
Frank Stenger
Introduction to Grid Computing
Frédéric Magoulès, Jie Pan, Kiat-An Tan, and Abhinit Kumar
Iterative Splitting Methods for Differential Equations
Juergen Geiser
Mathematical Objects in C++: Computational Tools in a Unified Object-Oriented Approach
Yair Shapira
Numerical Linear Approximation in C
Nabih N. Abdelmalek and William A. Malek
Numerical Techniques for Direct and Large-Eddy Simulations
Xi Jiang and Choi-Hong Lai
Parallel Algorithms
Henri Casanova, Arnaud Legrand, and Yves Robert
Parallel Iterative Algorithms: From Sequential to Grid Computing
Jacques M. Bahi, Sylvain Contassot-Vivier, and Raphaël Couturier
Particle Swarm Optimisation: Classical and Quantum Perspectives
Jun Sun, Choi-Hong Lai, and Xiao-Jun Wu
XML in Scientific Computing
C. Pozrikidis
TThhiiss ppaaggee iinntteennttiioonnaallllyy lleefftt bbllaannkk
Designing Scientific
Applications on GPUs
Edited by
Raphaël Couturier
University of Franche-Comte
Belfort, France
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2014 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business
No claim to original U.S. Government works
Version Date: 20130827
International Standard Book Number-13: 978-1-4665-7164-8 (eBook - PDF)
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts
have been made to publish reliable data and information, but the author and publisher cannot assume
responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to
copyright holders if permission to publish in this form has not been obtained. If any copyright material has
not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmit-
ted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented,
including photocopying, microfilming, and recording, or in any information storage or retrieval system,
without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.
com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood
Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and
registration for a variety of users. For organizations that have been granted a photocopy license by the CCC,
a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used
only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Contents
List of Figures xi
List of Tables xvii
Preface xxi
I Presentation of GPUs 1
1 Presentation of the GPU architecture and of the CUDA
environment 3
Rapha¨el Couturier
2 Introduction to CUDA 13
Rapha¨el Couturier
II Image processing 23
3 Setting up the environment 25
Gilles Perrot
4 Implementing a fast median filter 31
Gilles Perrot
5 Implementing an efficient convolution operation on GPU 53
Gilles Perrot
III Software development 71
6 Development of software components for heterogeneous
many-core architectures 73
Stefan L. Glimberg, Allan P. Engsig-Karup, Allan S. Nielsen, and
Bernd Dammann
vii
viii Contents
7 Development methodologies for GPU and cluster of GPUs 105
Sylvain Contassot-Vivier, Stephane Vialle, and Jens Gustedt
IV Optimization 151
8 GPU-accelerated tree-based exact optimization methods 153
Imen Chakroun and Nouredine Melab
9 Parallel GPU-accelerated metaheuristics 183
Malika Mehdi and Ahc`ene Bendjoudi, Lakhdar Loukil, and
Nouredine Melab
10 Linear programming on a GPU: a case study 215
Xavier Meyer and Bastien Chopard and Paul Albuquerque
V Numerical applications 249
11 Fast hydrodynamics on heterogeneous many-core hardware 251
Allan P. Engsig-Karup, Stefan L. Glimberg, Allan S. Nielsen, and
Ole Lindberg
12 Parallel monotone spline interpolation and approximation on
GPUs 295
Gleb Beliakov and Shaowu Liu
13 Solving sparse linear systems with GMRES and CG methods
on GPU clusters 311
Lilia Ziane Khodja, Rapha¨el Couturier, and Jacques Bahi
14 Solving sparse nonlinear systems of obstacle problems on
GPU clusters 331
Lilia Ziane Khodja, Rapha¨el Couturier, and Jacques Bahi, Ming
Chau, and Pierre Spit´eri
15 Ludwig:multipleGPUsforacomplexfluidlatticeBoltzmann
application 355
Alan Gray and Kevin Stratford
16 NumericalvalidationandperformanceoptimizationonGPUs
of an application in atomic physics 371
Rachid Habel, Pierre Fortin, Fabienne J´ez´equel, and Jean-Luc
Lamotte, and Stan Scott
Contents ix
17 A GPU-accelerated envelope-following method for switching
power converter simulation 395
Xuexin Liu, Sheldon Xiang-Dong Tan, Hai Wang, and Hao Yu
VI Other 413
18 Implementing multi-agent systems on GPU 415
Guillaume Laville, Christophe Lang, B´en´edicte Herrmann, and
Laurent Philippe, Kamel Mazouzi, and Nicolas Marilleau
19 Pseudorandom number generator on GPU 441
Rapha¨el Couturier and Christophe Guyeux
20 Solving large sparse linear systems for integer factorization
on GPUs 453
Bertil Schmidt and Hoang-Vu Dang
Index 473