SPLASH2 LU (2)
• Applied batched checks to the inner-loop daxpy function
• 5.3× speedup with 8 PEs
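
The slide does not spell out what a batched check looks like, so the following is only a minimal sketch. It assumes that "checks" are software access checks inserted before loads and stores, and that batching means replacing the per-element checks inside the daxpy loop with one check per array range before the loop. The names check_range and check_count are hypothetical and exist only for illustration; the daxpy body follows the usual a[i] += alpha * b[i] form.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter, only here to make the number of checks visible. */
static long check_count = 0;

/* Hypothetical stand-in for an access check. A real system would verify
   that [addr, addr + bytes) is locally accessible and fetch it if not;
   this stub only counts how often it is called. */
static void check_range(const void *addr, size_t bytes) {
    assert(addr != NULL || bytes == 0);
    check_count++;
}

/* Per-access checks: one check for every load and store in the loop. */
static void daxpy_checked(double *a, const double *b, size_t n, double alpha) {
    for (size_t i = 0; i < n; i++) {
        check_range(&b[i], sizeof b[i]);
        check_range(&a[i], sizeof a[i]);
        a[i] += alpha * b[i];
    }
}

/* Batched checks (assumed form): one check per array range, hoisted
   out of the loop, so the loop body contains no checks at all. */
static void daxpy_batched(double *a, const double *b, size_t n, double alpha) {
    check_range(b, n * sizeof *b);
    check_range(a, n * sizeof *a);
    for (size_t i = 0; i < n; i++) {
        a[i] += alpha * b[i];
    }
}
```

For vectors of length n, the per-access version in this sketch performs 2n checks while the batched version performs 2, and both compute the same result.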