Concurrent Programming

20 April 2021

Obed N Munoz

Cloud Software Engineer

Concurrent vs Parallel


Parallel Programming - Serial Computing


Parallel Programming - Parallel Computing


Parallel Programming - Parallel Computers 1/2

Stand-alone Computers
- Multiple functional units (Lx Caches, prefetch, decode, floating point, GPU, etc)
- Multiple execution units/cores
- Multiple hardware threads


Parallel Programming - Parallel Computers 2/2

Multiple stand-alone computers connected to form a larger parallel computer (cluster).


Parallel Computing - Where?


Parallel Computing - Why?


Jargon of Parallel Programming

von Neumann Architecture

So what? Who cares?
Parallel computers still follow this basic design, just multiplied in units. The basic, fundamental architecture remains the same.


Jargon of Parallel Programming - Terms


Flynn's Taxonomy

Distinguishes multi-processor computer architectures according to how they can be classified along the two independent dimensions of Instruction Stream and Data Stream. Each of these dimensions can have only one of two possible states: Single or Multiple.


Parallel Architectures - Shared Memory

Uniform Memory Access (UMA)


Parallel Architectures - Shared Memory

Non-Uniform Memory Access (NUMA)


Parallel Architectures - Distributed Memory

Distributed memory systems require a communication network to connect inter-processor memory.


Parallel Architectures - Hybrid Distributed-Shared Memory

The largest and fastest computers in the world today employ both shared and distributed memory architectures.


Parallel Programming Patterns - Definition

The book Patterns for Parallel Programming by Mattson, Sanders and Massingill presents a pattern language that helps with understanding and designing parallel programs.


Finding Concurrency

Programmers should start their design of a parallel solution by analyzing the problem within the problem domain to expose exploitable concurrency.

Is the problem large enough, and are the results significant enough, to justify the effort of parallelizing it?


Algorithm Structure

Our goal is to refine the design and move it closer to a program that can execute tasks concurrently by mapping the concurrency onto multiple units of execution (UEs) running on a parallel computer.

The key issue at this stage is to decide which pattern or patterns are most appropriate for the problem.


Supporting Structures

We call these patterns Supporting Structures because they describe software constructions or "structures" that support the expression of parallel algorithms.


Implementation mechanisms

Every parallel program needs to:

1. Create the set of UEs.
2. Manage interactions between them and their access to shared resources.
3. Exchange information between UEs.
4. Shut them down in an orderly manner.

Parallel Programming Models

More details at:


Parallel Programming Design Considerations

More at:


Pthreads - Introduction

Take a look at:


Pthreads - Threads

A thread is defined as an independent stream of instructions that can be scheduled to run as such by the operating system.


Pthreads - Create / Termination (exit)

#include <pthread.h>

int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start)(void *), void *arg);
                 // Returns 0 on success, or a positive error number on error
void pthread_exit(void *retval);

Take a look at:
- src/07/pthread_create.c


Pthreads - Thread IDs

#include <pthread.h>

pthread_t pthread_self(void);
                 // Returns the thread ID of the calling thread

int pthread_equal(pthread_t t1, pthread_t t2);
                 // Returns nonzero value if t1 and t2 are equal, otherwise 0

Pthreads - Join

#include <pthread.h>

int pthread_join(pthread_t thread, void **retval);
                 // Returns 0 on success, or a positive error number on error

Take a look at threads/simple_thread.c from The Linux Programming Interface.


Pthreads - Detach

#include <pthread.h>

int pthread_detach(pthread_t thread);
                   // Returns 0 on success, or a positive error number on error

Take a look at threads/detached_attrib.c from The Linux Programming Interface.

Quick Question:
What happens if we call pthread_detach(pthread_self()); inside threadFunc in threads/simple_thread.c from The Linux Programming Interface?


Threads vs Processes Discussion

- Sharing data between threads is easy.
- Thread creation is faster than process creation.
- We need to ensure that the functions we call are thread-safe.
- A bug in one thread can damage all of the threads in the process.
- Each thread is competing for use of the finite virtual address space of the host process.
- Dealing with signals in a multithreaded application requires careful design.
- In a multithreaded application, all threads must be running the same program.
  In a multiprocess application, different processes can run different programs.
- Aside from data, threads also share certain other information.
  e.g. file descriptors, signal dispositions, current working directory, and user and group IDs.

Synchronization: Mutexes


Synchronization: Condition Variables


Let's code: Matrix Multiplication with Pthreads


Resources and Credits

This material was generated thanks to extracts from the following resources:


Thank you

Obed N Munoz

Cloud Software Engineer
