Improve the reliability of scientific software by analyzing and debugging floating-point code.
Ignacio is developing tutorials on tools for debugging and improving the reliability of scientific applications, with an emphasis on floating-point defects that are difficult to debug. Scientific software relies heavily on floating-point arithmetic, which approximates real numbers with finite-precision numbers. Approximations occur at all levels, and programmers must handle exceptional cases and scenarios that are sometimes hard to debug. As systems grow larger and more complex, for example by using accelerators that do not fully implement the IEEE 754 floating-point standard, such tools are becoming increasingly important for ensuring the reliability of future scientific software. These tutorials will present tools that can be used today (as opposed to tools still in the research phase) to help programmers detect and fix abnormal floating-point behavior on GPUs, isolate compiler optimizations that can perturb the program output or trade performance for accuracy, and understand the sensitivity of floating-point operations in the code, among other issues.
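As a small illustration of the kinds of behavior described above, the following sketch shows rounding error, order-dependent results, and exceptional values in ordinary double-precision arithmetic. It is not taken from the tutorials; the function name floating_point_surprises and the chosen values are assumptions made for this example only.

```python
import math


def floating_point_surprises():
    """Return a dict of small, self-checking floating-point observations.

    Hypothetical helper for illustration; not part of any tutorial tool.
    """
    results = {}

    # Rounding: 0.1, 0.2, and 0.3 have no exact binary representation,
    # so the computed sum differs slightly from the literal 0.3.
    results["sum_equals_literal"] = (0.1 + 0.2 == 0.3)

    # Non-associativity: the grouping of additions changes the result,
    # which is why reordering by a compiler can perturb program output.
    a, b, c = 1e20, -1e20, 1.0
    results["left_to_right"] = (a + b) + c   # cancellation happens first
    results["right_to_left"] = a + (b + c)   # c is absorbed by b first

    # Exceptional values: overflow silently produces infinity, and
    # infinity minus infinity produces NaN, which is unequal to itself.
    overflowed = 1e308 * 10.0
    not_a_number = overflowed - overflowed
    results["overflow_is_inf"] = math.isinf(overflowed)
    results["inf_minus_inf_is_nan"] = math.isnan(not_a_number)
    results["nan_equals_itself"] = (not_a_number == not_a_number)

    return results


if __name__ == "__main__":
    for name, value in floating_point_surprises().items():
        print(f"{name}: {value}")
```

None of these cases raises an error by default, which is part of what makes such defects hard to find without dedicated tools.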
Ignacio is a Computer Scientist at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL). His main area of research is high-performance computing (HPC), and within HPC he focuses on programming models and systems.
Selected resources
HPC Best Practices Webinar: Tools and Techniques for Floating-Point Analysis
Tutorial on Floating-Point Analysis Tools
Tutorial on Floating-Point Analysis Tools Presented at PEARC19
Tutorial on Floating-Point Analysis Tools Presented at SC19
Tutorial on Floating-Point Analysis Tools Presented at Los Alamos National Laboratory