PyTorch -- NaN

Issue: NaN loss
Guess:

Training with mixed precision may cause unscaled gradients to become NaN, which in turn makes the final loss NaN.
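
If the guess is right, the usual remedy in PyTorch is to scale the loss before the backward pass with a gradient scaler. Below is a minimal sketch of that pattern, not the code from the failing run: the toy `nn.Linear` model, the random data, the learning rate, the clipping threshold, and the helper name `train_step` are all assumptions made for illustration. It uses `torch.cuda.amp.GradScaler` (newer PyTorch releases prefer `torch.amp.GradScaler` and may print a deprecation warning) and falls back to plain float32 when no GPU is available.

```python
import torch
from torch import nn

# Assumption: AMP is only switched on when CUDA is present; on CPU the
# scaler and autocast are disabled and the step runs in float32.
use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"
amp_dtype = torch.float16 if use_amp else torch.bfloat16


def train_step(model, optimizer, scaler, x, y):
    """Hypothetical helper: one training step with loss scaling."""
    optimizer.zero_grad(set_to_none=True)
    # Forward pass runs in reduced precision only when AMP is enabled.
    with torch.autocast(device_type=device, dtype=amp_dtype, enabled=use_amp):
        loss = nn.functional.mse_loss(model(x), y)
    # Scale the loss so that small gradients do not underflow in float16.
    scaler.scale(loss).backward()
    # Unscale first so clipping sees the true gradient magnitudes.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    # step() skips the update when a gradient is inf/NaN;
    # update() then adjusts the scale factor for the next iteration.
    scaler.step(optimizer)
    scaler.update()
    return loss.item()


torch.manual_seed(0)
model = nn.Linear(4, 1).to(device)          # toy model (assumption)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(8, 4, device=device)        # random inputs (assumption)
y = torch.randn(8, 1, device=device)        # random targets (assumption)

losses = [train_step(model, optimizer, scaler, x, y) for _ in range(5)]
print(losses)
```

If the loss is still NaN with the scaler in place, the NaN probably originates in the forward pass rather than in the gradients, and `torch.autograd.set_detect_anomaly(True)` can help locate the operation that produces it.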