Sunday, September 18, 2011

OpenCL Programming Guide - A book review

Hi everyone,

This post is about a book I helped review over the past month. The book is called OpenCL Programming Guide, published by Addison-Wesley, and one of its co-authors is the main guy behind OpenCL, Dr. Tim G. Mattson.

First, let me begin with how I feel about the book, and then I'll proceed to a more detailed review. When I first saw the book, I thought it was rather large (approx. 600 pages) for a technology as young as OpenCL. The reason is that the book is split into 3 major sections: Section 1: All about OpenCL, Section 2: Case studies, Section 3: OpenCL API guide. My next impression was that the overall cover design is lacklustre, but the layout of the contents is good, clear and concise. The book fits well in the hand, and I reckon it can be carried around on the bus, to the library or the shopping mall, for whenever you feel like reading. I thoroughly enjoyed the case studies since those are real-world applications of OpenCL, and where more detail is desired, the appropriate references are provided.

Overall, I think this is a good guide for OpenCL and CUDA programmers, and a general audience would benefit, too. Thanks

This book covers the OpenCL 1.1 specification in its entirety, which is a good thing. I don't think I'll cover all the chapters, but I'll point out the ones that stood out for me.

Chapter 1:
---------------
What I liked about this chapter was the cautionary note to readers that the implementation of work-groups (including execution concurrency) in OpenCL is largely vendor dependent. This marks a great difference from the equivalent in NVIDIA's CUDA model. I also liked that the authors reminded readers to be aware of IEEE 754 floating-point support issues, while making it known that the support would suffice for current heterogeneous platforms. The authors took on the painstaking task, I believe, of laying the groundwork for OpenCL by going through the core concepts with clear explanations. I thought it was a pleasure to read :)

What I didn't like about this chapter was that there was too much write-up about the API. Perhaps they could have illustrated a few calls and pointed readers to the specs for details?

Chapter 2:
---------------
This chapter begins with a sample OpenCL program, the classic "Hello World" application, and HOWTOs for configuring 3 IDEs to use OpenCL. I think this is helpful for folks who come to OpenCL having used Eclipse, CodeBlocks or Microsoft Visual Studio. I didn't quite appreciate the lengthy explanation of the example, but since the book assumes you have no experience with OpenCL, I think it still has tremendous value.

Chapter 3:
---------------
This section also suffered, IMO, from elaborate explanations of the APIs; in other words, it is kind of boring but nonetheless important. It will serve you well when you hit those bugs that come from not reading the API docs carefully. This OpenCL 1.1 reference card is also handy (http://www.khronos.org/files/opencl-1-1-quick-reference-card.pdf)

Chapters 4, 5, 6:
---------------
These are the chapters I liked. They go into great detail on the data types in OpenCL, like the "vector" and "half" types, and illustrate how to manipulate them through code snippets. They cover implicit conversions among types, IEEE 754 rounding modes and the usual arithmetic operations you can apply to the new data types in OpenCL. Something that caught my eye was that there is no short-circuit evaluation for "vector" data types. These chapters also introduce the kernel attribute qualifiers used for compiler optimization. They cover the kinds of memory spaces available in OpenCL, and I liked that they pointed out that pointer types cannot be re-interpreted across different memory address spaces. Cool.
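
To make the short-circuit point concrete, here is a minimal sketch in plain C (not OpenCL C, and not taken from the book) that emulates a 4-wide integer vector on the host. The names int4_emu, calls, counted, counted_scalar and vec_logical_and are made up for illustration. The sketch assumes the behaviour described in the OpenCL 1.1 specification: for vector operands, && evaluates both sides and works per component, giving -1 (all bits set) for true and 0 for false.

```c
#include <stdint.h>

/* Hypothetical stand-in for OpenCL C's built-in int4 type. */
typedef struct { int32_t s[4]; } int4_emu;

/* Counts how many operands were evaluated (illustration only). */
int calls = 0;

/* Return the argument unchanged, recording that it was evaluated. */
int4_emu counted(int4_emu v) { ++calls; return v; }
int counted_scalar(int x) { ++calls; return x; }

/* Emulates `a && b` on vectors: both operands have already been
   evaluated by the time we get here, and the result is computed per
   component as -1 (true) or 0 (false). */
int4_emu vec_logical_and(int4_emu a, int4_emu b) {
    int4_emu r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = (a.s[i] != 0 && b.s[i] != 0) ? -1 : 0;
    return r;
}
```

With a = (0, 1, 2, 0) and b = (5, 0, 7, 0), vec_logical_and(counted(a), counted(b)) evaluates both operands and gives (0, 0, -1, 0), whereas the scalar expression counted_scalar(0) && counted_scalar(1) stops after the first operand.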

Coming back to vector data types, the authors didn't show whether binary operations apply between two vectors, but they did show what happens when scalars are applied to vectors.
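
For what it's worth, my reading of the OpenCL 1.1 specification is that arithmetic operators between two vectors of the same type work component by component, and that a scalar operand is first widened to the vector type. Below is a rough sketch of both cases in plain C (again an emulation, not OpenCL C). The names float4_emu, splat, vec_mul and vec_mul_scalar are hypothetical.

```c
/* Hypothetical stand-in for OpenCL C's built-in float4 type. */
typedef struct { float s[4]; } float4_emu;

/* Widen a scalar to a vector by copying it into every component. */
float4_emu splat(float x) {
    float4_emu r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = x;
    return r;
}

/* Vector * vector: multiply matching components.
   Mirrors `a * b` for two float4 values in OpenCL C. */
float4_emu vec_mul(float4_emu a, float4_emu b) {
    float4_emu r;
    for (int i = 0; i < 4; ++i)
        r.s[i] = a.s[i] * b.s[i];
    return r;
}

/* Vector * scalar: widen the scalar, then multiply per component.
   Mirrors `a * 2.0f` for a float4 value in OpenCL C. */
float4_emu vec_mul_scalar(float4_emu a, float x) {
    return vec_mul(a, splat(x));
}
```

For a = (1, 2, 3, 4), vec_mul_scalar(a, 2.0f) gives (2, 4, 6, 8) and vec_mul(a, a) gives (1, 4, 9, 16).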

Atomic and synchronization functions are introduced in Chapter 5, and they look more or less similar to CUDA's synchronization primitives.

In Chapter 6, I liked that the authors mentioned that you can load NVIDIA's PTX into OpenCL. That is nice for interoperability, but I suspect it's only pertinent to the OpenCL driver provided by NVIDIA.

Case studies
---------------
I totally love this section of the book, which presents many examples of real-world, working OpenCL applications. This is where the money's at, I think.

3 comments:

Łukasz said...

I just got back from the 5th Parallel Tools Workshop in Dresden, where it was said that OpenCL is far behind CUDA due to the lack of a debugger and slow native kernels. Do you agree?

Raymond Tay said...

Hi Lukasz,

On debugger support, what I'll say is that debugging OpenCL kernels running on GPUs needs an approach of few threads consuming little data, instead of stepping through a kernel that launches tens of thousands of threads, since there's a good chance that will freeze the OS. I'm speaking from experience. So I'll both agree and disagree on this.

As for the claim of slow native kernels, I think there are two factors involved: first, the kind of problem OpenCL is applied to; second, the vendor's implementation. I won't dwell on the first point, but on the second I'll say that OpenCL, in general, runs better on GPUs than on CPUs. On CPUs, OpenCL takes advantage of the processor's wide registers, but there are only so many of them (compared with GPUs, well, GPUs win hands down).

What do you think, Lukasz?

company profile report said...

Every time I come here I see something very new