Xinyi Li
Blogs
AMD matrix cores
CG -- Eigenvalues (Eigenvectors) and convergency
CG -- Gram-Schmidt Conjugation
CG -- The Method of Conjugate Directions
CG -- The Method of Steepest Descent
Conjugate Gradient
FPchecker - CMakeLists.txt
FPChecker and FLiT -- Analysis for benchmarks
FPChecker and FLiT Analysis -- tiny
FPChecker exploration -- mixed precision
FPChecker Installation
FPChecker issue -- cannot link the openMP lib
FPChecker
FPChecker_PLDI22 - precision parts
How NaN generates
IEEE - Special values (NaN, INF, subnormal)
Implement neural network
Investigation about mixed precision in NVIDIA (APEX.AMP)
Investigation on NVIDIA Tensor cores v.s. AMD Matrix cores
SASS Semantics -- Half instructions store pattern
Matrix Multiplication Background
NVIDIA GPU Performance Background
NVIDIA tensor cores
Programming tensor cores using nvcuda-wmma
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance -- Hiroyuki Ootomo, Rio Yokota
Summarize NaN issues
Tiled Matrix Multiplication -- CUDA implementation
Warp Matrix Functions
Which NaN, INF matters
Paper Reading Annotate
Training Mixed Precision User Guide
Training Neural Networks with Mixed Precision
Paper Summary
Learning Concise Models from Long Execution Traces (trace2model)
Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance
THE EFFECTS OF NUMERICAL PRECISION IN SCIENTIFIC APPLICATIONS
FPChecker_PLDI22 - precision parts
Blogs Map
Homepage
Publications
pytorch -- nan
FPChecker and FLiT -- Analysis for benchmarks
FPChecker and FLiT Analysis -- tiny