Welcome
Syllabus
Lectures
1) First Class: Reproducibility and Git
2) The Linux Filesystem and commands
3) Community Projects
4) Introduction to Computer Architectures
5) Introduction to Vectorization
6) Measuring Performance
7) CPU Optimization: Matrix-Matrix Multiply
8) Blocked Matrix-Matrix Multiplication
9) Packing for cache
10) Introduction to Multithreading
11) Introduction to Parallel Scaling
12) Introduction to OpenMP
13) More on OpenMP and OpenMP Tasks
14) Case Study 1: libCEED
15) Case Study 2: ClimaCore
16) Parallel reductions and scans
17) Introduction to MPI
18) Introduction to Batch Jobs
19) Parallel Linear Algebra
20) Introduction to MPI.jl
21) Blocking and non-blocking communication
22) Collective communication
23) Coprocessor architectures
24) GPUs and CUDA
25) Practical CUDA
26) Intro to GPU programming in Julia
27) Memory management with CUDA.jl
28) Parallel reductions with CUDA.jl
29) ISPC, OpenMP target, OpenACC, and all that
30) I/O in HPC
31) Careers in HPSC