• LOGIN
  • No products in the cart.

You must be logged in to take this course  →   LOGIN | REGISTER NOW

The goal of this course is to give solid foundations for developing, analyzing, and implementing parallel and locality-efficient algorithms. This course covers theoretical underpinnings. To give a practical feeling for how algorithms map to and behave on real systems, we’ll supplement algorithmic theory with hands-on exercises on modern HPC systems, such as Cilk Plus or OpenMP on shared memory nodes and CUDA for graphics co-processors (GPUs).

Assessment

This course does not involve any written exams. Students need to answer 5 assignment questions to complete the course, the answers will be in the form of written work in pdf or word. Students can write the answers in their own time. Each answer needs to be 200 words (1 Page). Once the answers are submitted, the tutor will check and assess the work.

Certification

Edukite courses are free to study. To successfully complete a course you must submit all the assignment of the course as part of assessment. Upon successful completion of a course, you can choose to make your achievement formal by obtaining your Certificate at a cost of £49.

Having an Official Edukite Certification is a great way to celebrate and share your success. You can:

  • Add the certificate to your CV or resume and brighten up your career
  • Show it to prove your success

 

 

Course Credit: Georgia Institute of Technology

Course Curriculum

Introduction to High Performance Computing
Welcome 00:01:00
Philosophy and Logistics 00:02:00
What is HPC? 00:01:00
Supercomputing in a Nutshell 00:02:00
Topics 00:04:00
Intro to the Work-Span Model
Introduction 00:02:00
The Multithreaded DAG Model 00:03:00
Example Sample Reduction 00:04:00
Quiz: A Reduction Tree 00:01:00
Work and Span 00:03:00
Quiz: Work and Span Quiz 00:01:00
Quiz: Work and Span for Reduction 00:01:00
Basic Work Span Laws 00:03:00
Brent’s Theorem – Part 1 00:05:00
Quiz: Brent’s Theorem Aside 00:01:00
Brent’s Theorem – Part 2 00:03:00
Quiz: Applying Brent’s Theorem 00:01:00
Quiz: The Slack in Brent’s Bound – Part 1 00:02:00
Desiderata Speedup, Work Optimality and Weak scaling 00:06:00
Quiz: Which Parallel Algorithm is Better 00:01:00
Basic Concurrency Primitives 00:05:00
Quiz: A Subtle Point about Spawn 00:01:00
Basic Analysis of Work and Span 00:03:00
Quiz: Solve a Recurrence 00:01:00
Desiderata For Work and Span 00:02:00
Concurrency Primitive Parallel For 00:02:00
Quiz: Implementing Par For 00:01:00
Implementing Par For – Part 2 00:02:00
Quiz: Matrix Vector Multiply 00:03:00
Data Races and Race Conditions 00:02:00
Quiz: Putting It All Together 1 00:01:00
Quiz: Putting It All Together 2 00:01:00
Vector Notation 00:03:00
Conclusion 00:02:00
Comparison-based Sorting
Introduction 00:01:00
Comparator Networks 00:03:00
Quiz: Sort 4 Values 00:01:00
Bitonic Sequences 00:02:00
Quiz: Bitonic Sequences Quiz 00:01:00
Bitonic Splits 00:02:00
Quiz: Bitonic Split Quiz 00:01:00
Bitonic Splits A Parallel Scheme 00:01:00
Bitonic Merge 00:02:00
Quiz: Bitonic Merge Networks 00:02:00
Quiz: Generate a Bitonic Sequence 00:01:00
Bitonic Sort 00:02:00
Conclusion 00:02:00
Scans and List Ranking
Introduction 00:03:00
Prefix Sums Definitions 00:01:00
Quiz: Prefix Sums Quiz 00:01:00
Scans 00:02:00
Quiz: Parallel Scan 00:01:00
Parallel Scans 00:03:00
Quiz: Parallel Scans Quiz 00:01:00
Parallel Scan Analysis 00:02:00
Parallel Quicksort 00:02:00
Quiz: Parallel Partitioning Quiz 00:01:00
Conditional Gathers Using Scans 00:05:00
Segmented Scans 00:03:00
Quiz: Scan Ingredients 00:01:00
List Ranking Definitions 00:02:00
Quiz: List Ranking Quiz 00:01:00
Linked Lists as Array Pools 00:03:00
Quiz: Linked Lists as Array Pools Quiz 00:01:00
Quiz: What Does This Do 00:01:00
A Parallel List Ranker 00:05:00
Quiz: How Many Jumps 00:01:00
A Parallel List Ranker (Formal) 00:01:00
Quiz: A Parallel List Ranker Quiz 00:01:00
Conclusion 00:01:00
Tree Computations
Introduction 00:02:00
Tree Warm Up 00:04:00
Quiz: Parallel Root Finder 00:01:00
Quiz: Work-Optimal List Scan/ Prefix-Sum – Part 1 00:02:00
Parallel Independent Sets – Part 1 00:04:00
Quiz: Parallel Independent Sets – Part 2 00:01:00
Quiz: Parallel Independent Sets – Part 3 00:01:00
Quiz: Parallel Independent Sets – Part 4 00:01:00
Work-Optimal List Scan/ Prefix-Sum – Part 2 00:03:00
Quiz: Work-Optimal List Scan/ Prefix-Sum – Part 3 00:01:00
A Seemingly Sequential Tree Algorithm 00:02:00
Quiz: Another Tree Traversal 00:01:00
The Euler Tour Technique 00:03:00
Quiz: The Span of a Euler Tour 00:01:00
Quiz: Computing Levels 00:02:00
Implementing Euler Tours 00:03:00
Quiz: Successor or Failure-or 00:01:00
Conclusion 00:01:00
Shared Memory Parallel BFS
Introduction 00:01:00
BFS 101 00:02:00
BFS 101 (cont) 00:03:00
Quiz: BFS Example 00:02:00
Analysis -Is BFS Inherently Sequential 00:02:00
Intuition -Why we might do better 00:02:00
Quiz: BFS Example 2 00:01:00
High Level Approach to Parallel BFS 00:02:00
Bags: Key Properties 00:02:00
Pennants, Building Blocks For Bags 00:03:00
Quiz: Pennants 00:19:00
Combining Pennants Into Bags 00:02:00
Duality Between Bags and Binary Math 00:02:00
Quiz: What is the Cost of Insertion 00:01:00
Quiz: Combining Two Bags 00:01:00
Quiz: Cost of Other Operations 00:01:00
Bag Splitting 00:02:00
Finishing the Parallel BFS with Bags 00:03:00
Conclusion 00:01:00
Intro to Dist. Memory Models
Introduction 00:02:00
Quiz: Simulating the Brain 00:02:00
A Basic Model of Distributed Memory 00:05:00
Quiz: Getting a Feel for the Alpha Beta Model 00:02:00
Applying the Rules 00:01:00
Quiz: Scenario 2 00:01:00
Quiz: Scenario 3 00:01:00
Quiz: Scenario 4 00:01:00
Collective Operations – Part 1 00:04:00
Point to Point 00:04:00
Point to Point Completion Semantics 00:03:00
Quiz: Send and Recieve in Action 00:01:00
Quiz: Send and Recieve Revisited 00:01:00
All to One Reduce Pseudocode 00:02:00
Quiz: All to One Reduce Pseudocode 00:01:00
Vector Reductions 00:01:00
Quiz: Vector Reductions 00:01:00
More Collectives 00:03:00
A Pseudocode API for Collectives 00:05:00
Quiz: All Gather, From Building Blocks 00:01:00
Collectives Lower Bounds 00:03:00
Quiz: All Gather Quiz 00:01:00
Quiz: Implement Scatter Quiz 00:01:00
Implementing Scatter and Gather – Part 2 00:03:00
Quiz: When to Use Tree Based Reduce 00:01:00
Quiz: What’s Wrong with Tree Based Reduce 00:01:00
Bucketing Algorithms for Collectives 00:03:00
Quiz: Bandwidth Optimal Broadcast 00:01:00
Quiz: All Reduce 00:01:00
Conclusion 00:01:00
Intro to MPI
Introduction 00:01:00
Topology
Introduction 00:01:00
Intro to Network Models Links and Diameter 00:03:00
Quiz: Improve the Diameter of a 2-D Mesh 00:01:00
Bisection (Band)Width 00:03:00
Some Other Network Topologies 00:05:00
Quiz: Diameter and Bisection 00:01:00
Mappings and Congestion 00:05:00
Quiz: 2-D to 1-D Congestion 00:01:00
A Lower Bound on Congestion 00:03:00
Exploiting Higher Dimensions 00:03:00
Quiz: 2D Broadcast 00:01:00
All to All Personalized Exchange 00:06:00
Quiz: All to All in Higher Dimensions 00:01:00
Conclusion 00:01:00
Dist. Dense Matrix Multiply
Introduction 00:01:00
Matrix Multiply Basic Defiitions 00:04:00
Quiz: Definitions Check 00:01:00
A Geometrical View 00:03:00
Quiz: Applying Loomis Whitney 00:01:00
1D Algorithms 00:04:00
Quiz: 1D Algorithm Cost 00:01:00
Quiz: 1D Algorithm Cost – Part 2 00:01:00
Efficiency and the 1D Algorithm 00:03:00
Quiz: Isoefficiency 00:01:00
A 2D Algorithm SUMMA 00:03:00
Quiz: SUMMA Communication Time 00:01:00
Efficiency of 2D SUMMA 00:02:00
Quiz: SUMMA Memory 00:01:00
A Lower Bound on Communication 00:07:00
Quiz: A Lower Bound on Communication Quiz 00:01:00
Matching (Or Beating!) The Lower Bounds 00:05:00
Conclusion 00:02:00
Dist. Memory Sorting
Introduction 00:02:00
Distributed Bitonic Merge Via Binary Exchange 00:04:00
Quiz: Pick a Network 00:01:00
Quiz: Communication Cost of a Bitonic Merge 00:01:00
Bitonic Merge Via Transposes 00:03:00
Quiz: Butterfly Trivia 00:01:00
Bitonic Sort Cost Computation 00:02:00
Quiz: Bitonic Sort Cost Communication 00:01:00
Linear Time Distributed Sort – Part 1 00:03:00
Quiz: Distributed Bucket Sort 00:02:00
Linear Time Distributed Sort – Part 2 00:03:00
Quiz: Cost of Distributed Sample Sort 00:01:00
Conclusion 00:02:00
Distributed BFS
Introduction 00:01:00
Graphs and Adjacency Matrices 00:02:00
Quiz: The Adjacency Matrix of a Directed Graph 00:01:00
Quiz: Losing Your Direction 00:01:00
Breadth First Search Review 00:02:00
Matrix Based BFS 00:03:00
Quiz: Matrix based BFS The Quiz! 00:01:00
1 D Distributed BFS 00:03:00
Quiz: 2 D Distributed BFS Quiz! 00:01:00
Conclusion 00:01:00
Graph Partitioning
Introduction 00:01:00
The Graph Partitioning Problem 00:03:00
Quiz: Do You Really Want a Graph Partition? 00:01:00
Graph Bisection and Planar Separators 00:03:00
Quiz: Partitioning via Breadth First Search 00:01:00
Kernighan Lin, Part 1: No Gain is Pain 00:03:00
Quiz: Kernighan Lin Algorithm Quiz! 00:01:00
Kernighan Lin Algorithm 00:03:00
Graph Coarsening 00:04:00
Quiz: Coarsen Me, Baby 00:01:00
Maximal and Maximum Matchings 00:02:00
Quiz: Find the Maximal Matching 00:01:00
Quiz: A Fact about Maximal Matchings 00:01:00
Computing a Maximal Matching 00:03:00
Fine-to Coarse and Back Again 00:02:00
Quiz: Partition Refinement 00:01:00
Spectral Partitioning, Part 1: The Graph Laplacian 00:04:00
Quiz: Graph Laplacian 00:01:00
Spectral Partitioning. Part 2: Springs Fling 00:04:00
Quiz: A 2-D Laplacian 00:01:00
Quiz: Counting Edge Cuts 00:02:00
Spectral Partitioning, Part 4 00:03:00
Conclusion 00:02:00
Basic Model of Locality
Introduction 00:01:00
A First, Basic Model 00:04:00
Quiz: Two Level Memories 00:01:00
Quiz: Alignment 00:01:00
Quiz: Minimum Transfers to Sort 00:01:00
Quiz: Minimum Transfers to Multiply Matricies 00:01:00
I/O Example Reduction 00:03:00
Quiz: Matrix Vector Multiply 00:03:00
Algorithmic Design Goals 00:02:00
Quiz: Which is Better? 00:01:00
Intensity, Balance and Time 00:03:00
Quiz: Roofline Plots 00:02:00
Quiz: Intensity of Conventional Matrix Multiply 00:02:00
Quiz: Intensity of Conventional Matrix Multiply, Part 2 00:02:00
Quiz: Informing the Architecture 00:02:00
Conclusion 00:02:00
Algorithmic Time, Energy and Power
Introduction 00:01:00
Quiz: Speed Trends 00:01:00
Quiz: Speed Limits 00:02:00
Quiz: Space Limits 00:01:00
Quiz: Balance in Time 00:02:00
Balance Principles 00:06:00
Quiz: Double, Double Toil and Trouble 00:02:00
Power Limits 00:03:00
The Dynamic Power Equation 00:04:00
Quiz: Power Motivates Parallelism 00:01:00
Quiz: Power Knobs 00:01:00
Quiz: Powerless to Choose 00:01:00
Quiz: Exploiting DVFS 00:01:00
Quiz: Algorithmic Energy 00:02:00
Quiz: Algorithmic Dynamic Power 00:01:00
Quiz: Parallelism and DVFS 00:02:00
Conclusion (Lesson 18) 00:02:00
I/O-Avoiding Algorithms
Introduction 00:01:00
Quiz: A Sense of Scale 00:03:00
External Memory Mergesort 00:02:00
Quiz: Partitioned Sorting Step Analysis 00:02:00
Two Way External Memory Merging 00:03:00
Quiz: External Memory Mergesort 00:01:00
Quiz: What’s Wrong with 2 Way Merging 00:02:00
Mulitway Merging 00:04:00
Quiz: Cost of Multiway Merge 00:02:00
A Lower Bound on External Memory Sorting 00:03:00
Quiz: How Many Transfers in Binary Search 00:01:00
Quiz: Lower Bounds for Search 00:01:00
Quiz: I/O Efficient Data Structures 00:01:00
Conclusion 00:02:00
Cache-Oblivious Algorithms
Introduction 00:01:00
An Introductory Example 00:03:00
The Ideal-Cache Model 00:05:00
Quiz: What Would an Ideal Cache Do? 00:02:00
Quiz: LRU Replacement 00:01:00
How Ideal is an Ideal Cache? 00:03:00
Proof of the LRU-OPT Competitiveness Lemma 00:04:00
Quiz: LRU Sorting 00:01:00
The Tall-Cache Assumption 00:03:00
Quiz: Which “Cache” is Tall? 00:01:00
Cache-Oblivious Matrix Multiply 00:05:00
Cache-Oblivious Binary Search 00:04:00
Quiz: Van Emde Boas Layout 00:01:00
Conclusion 00:01:00
Conclusion
So long and thanks for all the…spawns and syncs 00:02:00
Thank you and goodnight 00:02:00
Assessment
Submit Your Assignment 00:00:00
Certification 00:00:00

Course Reviews

4.7

4.7
9 ratings
  • 5 stars0
  • 4 stars0
  • 3 stars0
  • 2 stars0
  • 1 stars0

No Reviews found for this course.

386 STUDENTS ENROLLED
©2021 Edukite. All Rights Resereved
Edukite is A Part Of Ebrahim College, Charity Commission
Reg No 110841