Dr Paul Richmond

Research Software Engineer

Course Information

Welcome to the 2021/2022 module page for COM4521/COM6521 Parallel Computing with GPUs.

Accelerator architectures are discrete processing units which supplement a base processor with the objective of providing advanced performance at lower energy cost. Performance is gained by a design which favours a high number of parallel compute cores at the expense of imposing significant software challenges. This module looks at accelerated computing from multi-core CPUs to GPU accelerators with many TFLOPS of theoretical performance. The module will give insight into how to write high performance code, with specific emphasis on GPU programming with NVIDIA CUDA GPUs. A key aspect of the module will be understanding how program code affects the underlying hardware so that the code can be optimised.

The module's aims, objectives and assessment details are available on the module's public teaching page.

Software for the Module

The module programming exercises are designed to be completed on PCs in the Diamond compute labs. As the module's teaching is online, these machines are available via virtual desktop. All Diamond compute lab machines have Visual Studio 2019 and CUDA 11.1. If you intend to use your own machine for programming exercises (on the CUDA part of the module) then you must install the latest Community version of Visual Studio 2019 before you install the CUDA 11.1 toolkit.

If you want to complete the exercises on Linux then example Makefiles will be provided with the lab starting code and solutions. It is not possible to build Linux CUDA programs on PCs in the Diamond compute labs.

Computers and Labs Available

As the module requires access to a machine with a GPU, the following have been made available to you:

  • All Diamond compute labs (other than the high spec lab) - All Diamond all-in-one machines have an NVIDIA GTX 1050 (Pascal generation) GPU. Dedicated labs have been reserved each week to use these. You can find machine availability outside of lab times by using Find a PC.
  • Diamond High Spec Lab Reservation - These are higher spec machines with NVIDIA Quadro 5200 (Pascal generation) GPUs.
  • Your own Windows/Linux machine - Follow the instructions under “Software for the Module”.
  • A GPU backed cloud instance (at your own risk and cost)

Course Attendance Monitoring

Lab attendance is recorded through the use of the Lab Feedback Form. All attendees on the course are expected to complete this each week. The purpose of the lab feedback form is to monitor engagement with the course content and to allow the course instructors to identify areas for group discussion.

Important Note: It is not possible to properly understand the course material without completing the labs and reviewing the solutions. If you do not complete the labs then you will find the assignment difficult. The first lecture will provide some insight into how course engagement affects assessment performance.


With the exception of the first lecture, which will be delivered in person, the course will be run using a flipped classroom approach. Lecture content has been pre-recorded in bite-sized chunks of ~10-15 minutes each. There is a lecture list for each week which you are expected to listen to in advance of the next week's lab.

In Person Lab Classes

The lab classes have been designed to reinforce the material which you will observe in the on-line lectures by applying the techniques and approaches to specific problems. You should aim to attempt the lab class exercises prior to attending the lab class (i.e. the week before) and use the labs to obtain help in understanding and applying the taught content. The lab class solutions are commented to provide insight. The solutions are available in advance of the lab, so if you are stuck on a particular exercise then review these to move on and seek help in understanding the problem and solution in the lab class. Within the labs, pair programming or work within small groups is encouraged but left to personal preference. Discussion is encouraged.

During the lab class there will be an opportunity to discuss and review lecture content, lecture examples and lab solutions. Guided walkthroughs of certain parts of the lab solutions will be provided.

Although the labs are structured around the online lecture material each week, you can (and should) ask for help regarding any of the labs during the scheduled lab time. The labs are also used for assignment help. You should start the assignment early.

You should complete the Lab Feedback Form each week by editing your response. Please update your responses at the start of each lab.

Course Assessment

In response to student feedback the number of assignments has been reduced to a single assignment which reflects the reduced number of learning objectives for the module. The assignment will be released Monday the 28th February 2022 (week 4) and is due 17:00 on Wednesday 25th May (week 13). The assignment forms 80% of your mark. You are expected to ask for feedback on your assignment work during the scheduled lab classes.

The remaining 20% of the module mark is from two MOLE quizzes which must be completed under exam conditions, i.e. within the specified locations, with the lock down browser, under invigilation from demonstrators.

  • Week 5 - 08/03/2022 13:00 in Diamond Computer Room 3
  • Week 10 - 03/05/2022 13:00 in Computer room A04, Alfred Denny Building

Quizzes will take 40 minutes (and must be completed within the hour) and will be multiple choice, in a similar format to the weekly lab question.

DDP students and Staff Candidates

PhD students and research/academic staff are not required to undertake assessment but DDP students are expected to attend labs as evidence of participation in the module. You should enrol for the course via DDP to ensure that you have access to Blackboard. If you are a staff member attendee and require access to Blackboard then please contact me.

Discussion, Announcements and Requests for Help

A Google group has been created for announcements, help and discussion. Any important announcements relating to the module will be made via this group. All students enrolled on the module on the 3rd February 2021 have been added to this group already. Likewise, any staff or PhD students who expressed an interest in the course via the Google form have been added. If you have transferred via Add/Drop then you will need to manually join the group yourself. The group is monitored by the teaching staff (including lab assistants) as well as additional PhD students who can provide help. The purpose of the mailing list is to ask for general support and guidance with the course material (e.g. with concepts and ideas) rather than posting your own code. You should not post your assignment code on this forum. If you require personal assistance with your assignment code then you should request this during the lab hours. Any lab class can be used for assignment help in addition to the lab exercises which are set each week.


Course Material

Lectures are pre-recorded and are available on the COM4521 Parallel Computing with Graphical Processing Units Kaltura Channel or as downloadable PDFs on Google Drive. Each week's practical activities (the labs) follow the ideas presented in the lectures, so it is important that you follow the lecture and lab timetable below.

Week 01

Lecture 01 (In Person) - Course Introduction and Overview

On-line On Demand Lectures - Introduction to C

Week 02

Introduction to Visual Studio and C Programming Lab

On-line On Demand Lectures - Memory

On-line On Demand Lectures - Optimisation

Week 03

Memory and Performance Lab

On-line On Demand Lectures - OpenMP

On-line On Demand Lectures - OpenMP Part II

Week 04

OpenMP Lab

On-line On Demand Lectures - GPU Architectures

On-line On Demand Lectures - Introduction to CUDA

Assignment Handout

The assignment will be handed out on the 28th February via Blackboard.

Week 05

Introduction to CUDA Lab

On-line On Demand Lectures - CUDA Memory

On-Demand Lectures - Using GPU Backed Cloud Instances (Optional Content)

Note/Disclaimer: Any use of cloud is entirely at your own risk and cost. The videos provide a short overview of the use of cloud instances but you are responsible for your own accounts and any costs associated with them. You are encouraged to read up on aspects such as billing notifications and the specific charges for on demand pricing and storage.

  • Creating an EC2 instance using the template AMI (recording)
  • Connecting to your cloud instance (recording)
  • Setting restrictions on inbound traffic to your instance (recording)

Blackboard Quiz

There is an assessed Blackboard quiz this week. Date, time and location are in the course Google calendar.

Week 06

CUDA Memory Lab

On-line On Demand Lectures - CUDA Shared Memory

On-line On Demand Lectures - CUDA Performance

Week 07

Shared Memory Lab

On-line On Demand Lectures - Warp Level CUDA

On-line On Demand Lectures - Parallel Patterns

Week 08

Atomics and Primitives Lab

On-line On Demand Lectures - Performance Optimisation

  • Performance Profiling - Guest Lecture by Dr Robert Chisholm (pdf, recording)


Week 09

Profiling Lab

  • Profile Lecture Example Code - There is no lab sheet for this lab. Examine the source code and try changing the STEP macro to compile different iterations of the code to run through the profiler.

On-line On Demand Lectures - Sorting and Libraries

On-line On Demand Lectures - CUDA Streams

Week 10

Blackboard Quiz

There is an assessed Blackboard quiz this week. Date, time and location are in the course Google calendar.

Libraries and Streams Lab (Note: This is on Friday due to the bank holiday)

In Person Invited Lecture - TBC

Previous On Demand Invited Lectures (Optional)

Please find below a list of previous invited lectures which may be of interest.

Week 11

Assignment Help Lab

No Lectures

Week 12

Assignment Help Lab

No Lectures


You can add the course calendar to your University of Sheffield Google Calendar by searching for COM4521 and COM6521.

Recommended Reading

The following are useful resources but not required reading.

  • Edward Kandrot, Jason Sanders, “CUDA by Example: An Introduction to General-Purpose GPU Programming”, Addison Wesley 2010.
  • Brian Kernighan, Dennis Ritchie, “The C Programming Language (2nd Edition)”, Prentice Hall 1988.
  • NVIDIA, CUDA C Programming Guide