🚀 LeetGPU Solutions & Progress Tracker

🔗 Profile: lzyrapx on LeetGPU | 🎯 Challenges: LeetGPU Challenges

Progress Summary: Working through LeetGPU's GPU programming challenges across multiple frameworks. The current focus is CUDA and PyTorch, with ongoing exploration of newer compilers and languages such as Triton, Mojo, and TinyGrad.

🧮 Matrix & Linear Algebra

Core BLAS operations, matrix manipulation, and quantized variations.

Problems	CUDA	PyTorch	Triton	Mojo	TinyGrad	CuTe DSL
Batched Matrix Multiplication	✅	✅
Dot Product	✅	✅
FP16 Batched Matrix Multiplication	✅
FP16 Dot Product	✅
GEMM (FP16)	✅	✅
INT8 Quantized MatMul	✅	✅
Matrix Addition	✅
Matrix Copy	✅	✅		✅
Matrix Multiplication	✅	✅	✅	✅	✅	✅
Matrix Power	✅
Matrix Transpose	✅	✅	✅	✅		✅
Sparse Matrix-Vector Multiplication	✅	✅

🧠 Deep Learning & Neural Network Layers

Attention mechanisms, normalizations, activations, and modern LLM kernels.

Problems	CUDA	PyTorch	Mojo
Attention with Linear Biases	✅
Batch Normalization	✅
Categorical Cross Entropy Loss	✅	✅
Gaussian Error Gated Linear Unit	✅
Leaky ReLU	✅	✅	✅
Linear Self-Attention	✅
LoRA Linear	✅
Mean Squared Error	✅	✅
Multi-Head Self-Attention	✅
ReLU	✅	✅	✅
RMS Normalization	✅
Rotary Positional Embedding	✅
Sigmoid Activation	✅
Sigmoid Linear Unit	✅
Simple Inference		✅
Sliding Window Self-Attention	✅
Softmax	✅	✅
Softmax Attention	✅	✅
Swish-Gated Linear Unit	✅
Weight Dequantization	✅

🖼️ Convolutions, Image & Signal Processing

Filtering, FFT, max pooling, and spatial transformations.

Problems	CUDA	PyTorch	Triton	Mojo	TinyGrad	CuTe DSL
1D Convolution	✅	✅	✅	✅	✅
2D Convolution	✅	✅
2D Max Pooling	✅
3D Convolution	✅
Color Inversion	✅	✅		✅		✅
Fast Fourier Transform	✅
Gaussian Blur	✅	✅
RGB to Grayscale	✅

🧩 Core Algorithms, Memory & Arrays

Parallel reductions, prefix sums, sorting, and array manipulations.

Problems	CUDA	PyTorch	Triton	Mojo	TinyGrad	CuTe DSL
2D Subarray Sum	✅
3D Subarray Sum	✅
Count Array Element	✅	✅
Count 2D Array Element	✅	✅
Count 3D Array Element	✅
Histogramming	✅	✅
Interleave Arrays	✅
Max Subarray Sum	✅
Merge Sorted Arrays	✅
Parallel Merge	✅
Prefix Sum	✅	✅
Radix Sort	✅	✅
Reduction	✅	✅
Reverse Array	✅	✅		✅
Sorting	✅	✅
Subarray Sum	✅
Top-K Selection	✅	✅
Value Clipping	✅
Vector Addition	✅	✅	✅	✅	✅	✅

⚙️ Machine Learning, Graph & Others

Stencils, regressions, graph traversal, and simulation algorithms.

Problems	CUDA	PyTorch	Mojo
2D Jacobi Stencil	✅
All-Pairs Shortest Paths	✅
BFS Shortest Path	✅
K-Means Clustering	✅
Linear Recurrence	✅
Logistic Regression	✅	✅
Monte Carlo Integration	✅	✅	✅
Multi-Agent Simulation	✅
Nearest Neighbor	✅
Ordinary Least Squares	✅	✅
Password Cracking	✅
Rainbow Table	✅	✅	✅
