Xinyi Li

Blogs

AMD matrix cores

CG -- Eigenvalues (Eigenvectors) and convergency

CG -- Gram-Schmidt Conjugation

CG -- The Method of Conjugate Directions

CG -- The Method of Steepest Descent

Conjugate Gradient

FPchecker - CMakeLists.txt

FPChecker and FLiT -- Analysis for benchmarks

FPChecker and FLiT Analysis -- tiny

FPChecker exploration -- mixed precision

FPChecker Installation

FPChecker issue -- cannot link the openMP lib

FPChecker_PLDI22 - precision parts

How NaN generates

HPC system - CHPC

IEEE - Special values (NaN, INF, subnormal)

Implement neural network

Investigation about mixed precision in NVIDIA (APEX.AMP)

Investigation on NVIDIA Tensor cores v.s. AMD Matrix cores

SASS Semantics -- Half instructions store pattern

Matrix Multiplication Background

NVIDIA GPU Performance Background

NVIDIA tensor cores

Programming tensor cores using nvcuda-wmma

Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance -- Hiroyuki Ootomo, Rio Yokota

Summarize NaN issues

Tiled Matrix Multiplication -- CUDA implementation

Warp Matrix Functions

Which NaN, INF matters

Paper Reading Annotate

Training Mixed Precision User Guide

Training Neural Networks with Mixed Precision

Paper Summary

Learning Concise Models from Long Execution Traces (trace2model)

Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance

THE EFFECTS OF NUMERICAL PRECISION IN SCIENTIFIC APPLICATIONS

FPChecker_PLDI22 - precision parts

FPChecker and FLiT -- Analysis for benchmarks

FPChecker and FLiT Analysis -- tiny
