2024
Accelerating Machine Learning Inference on GPUs with SYCL
Proceedings of the 12th International Workshop on OpenCL and SYCL. April 2024.
2022
FPGA Roofline modeling and its Application to Visual SLAM
International Conference on Field Programmable Logic and Applications (FPL). August 2022. Belfast, United Kingdom.
2020
Dynamic Undervolting to Improve Energy Efficiency on Multicore X86 CPUs
IEEE Transactions on Parallel and Distributed Systems. December 2020.
2018
Exploring the Effects of Code Optimizations on CPU Frequency Margins
Workshop on Approximate and Transprecision Computing on Emerging Technologies (ATCET), in conjunction with the International Supercomputing Conference (ISC). June 2018. Frankfurt, Germany.
A Framework for Evaluating Software on Reduced Margins Hardware
48th International Conference on Dependable Systems and Networks (DSN). June 2018. Luxembourg.
2017
Significance-Aware Program Execution on Unreliable Hardware
ACM Transactions on Architecture and Code Optimization (TACO). April 2017.
2016
Towards automatic significance analysis for approximate computing
International Symposium on Code Generation and Optimization (CGO). March 2016. Barcelona, Spain.
Exploiting Significance of Computations for Energy-Constrained Approximate Computing
International Journal of Parallel Programming (IJPP). March 2016.
2015
A significance-driven programming framework for energy-constrained approximate computing
ACM International Conference on Computing Frontiers (CF). May 2015. Ischia, Italy.
A programming model and runtime system for significance-aware energy-efficient computing
ACM 20th Symposium on Principles and Practice of Parallel Programming (PPoPP). February 2015. San Francisco, CA.
2014
GemFI: A Fault Injection Tool for Studying the Behavior of Applications on Unreliable Substrates
International Conference on Dependable Systems and Networks (DSN). June 2014. Atlanta, GA.
2011
Massively parallel programming models used as hardware description languages: The OpenCL case
International Conference on Computer-Aided Design (ICCAD). November 2011. San Jose, CA.
Implementation and Performance Comparison of the Motion Compensation Kernel of the AVS Video Decoder on FPGA, GPU and Multicore Processors
IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). May 2011. Salt Lake City, UT.
Synthesis of Platform Architectures from OpenCL Programs
IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). May 2011. Salt Lake City, UT.
GLOpenCL: OpenCL support on hardware- and software-managed cache multicores
6th International Conference on High Performance Embedded Architectures & Compilers (HiPEAC). January 2011. Heraklion, Greece.
2010
Fisheye lens distortion correction on multicore and hardware accelerator platforms
24th International Parallel and Distributed Processing Symposium (IPDPS). April 2010. Atlanta, GA.
2009
Implementation of a wide-angle lens distortion correction algorithm on the cell broadband engine
23rd International Conference on Supercomputing (ICS). June 2009. New York Metro Area, NY.