All Papers
Author: J. Gomez-Luna
2022
Exploiting Near-Data Processing to Accelerate Time Series Analysis [doi][arXiv]
I. Fernandez, R. Quislant, C. Giannoula, M. Alser, J. Gomez-Luna, E. Gutierrez, O. Plata, O. Mutlu
IEEE Annual Symposium on VLSI (ISVLSI'22),
Nicosia (Cyprus), July 2022
(arXiv:2206.00938 [cs.AR])
Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures [doi]
C. Giannoula, I. Fernandez, J. Gomez-Luna, N. Koziris, G. Goumas, O. Mutlu
ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS/PERFORMANCE’22),
Mumbai, India, June 2022
Benchmarking a New Paradigm: Experimental Analysis and Characterization of a Real Processing-in-Memory System [doi][arXiv]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
IEEE Access,
10, May 2022, pp. 52565-52608
(arXiv:2105.03814 [cs.AR])
CAVLCU: An Efficient GPU-based Implementation of CAVLC [doi]
A. Fuentes-Alventosa, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil, R. Medina-Carnicer
The Journal of Supercomputing,
78, April 2022, pp. 7556-7590
SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures [doi]
C. Giannoula, I. Fernandez, J. Gomez-Luna, N. Koziris, G. Goumas, O. Mutlu
Proceedings of the ACM on Measurement and Analysis of Computing Systems,
6 (1), March 2022, pp. 1-49
2021
Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-In-Memory Hardware [doi]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
12th International Green and Sustainable Computing Conference (IGSC'21),
Pullman (WA), USA, October 2021
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks [doi][arXiv]
G.F. Oliveira, J. Gomez-Luna, L. Orosa, S. Ghose, N. Vijaykumar, I. Fernandez, M. Sadrosadati, O. Mutlu
IEEE Access,
9, September 2021, pp. 134457-134502
(arXiv:2105.03725 [cs.AR])
DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks [arXiv]
G.F. Oliveira, J. Gomez-Luna, L. Orosa, S. Ghose, N. Vijaykumar, I. Fernandez, M. Sadrosadati, O. Mutlu
arXiv:2105.03725 [cs.AR],
July 2021
Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture [arXiv]
J. Gomez-Luna, I. El Hajj, I. Fernandez, C. Giannoula, G.F. Oliveira, O. Mutlu
arXiv:2105.03814 [cs.AR],
July 2021
SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures [doi][arXiv]
C. Giannoula, N. Vijaykumar, N. Papadopoulou, V. Karakostas, I. Fernandez, J. Gomez-Luna, L. Orosa, N. Koziris, G. Goumas, O. Mutlu
27th IEEE International Symposium on High-Performance Computer Architecture (HPCA'21),
Seoul (South Korea), February-March 2021
(arXiv:2101.07557 [cs.AR])
2020
NATSA: A Near-Data Processing Accelerator for Time Series Analysis [doi][arXiv]
I. Fernandez, R. Quislant, C. Giannoula, M. Alser, J. Gomez-Luna, E. Gutierrez, O. Plata, O. Mutlu
38th IEEE International Conference on Computer Design (ICCD'20),
Hartford (CT), USA, October 2020
(arXiv:2010.02079 [cs.AR])
2018
High-Performance Computation of Bézier Surfaces on Parallel and Heterogeneous Platforms [doi]
R. Palomar, J. Gomez-Luna, F.A. Cheikh, J. Olivares-Bueno, O.J. Elle
International Journal of Parallel Programming,
46 (6), December 2018, pp. 1035-1062
Improving Tasks Throughput on Accelerators Using OpenCL Command Concurrency [arXiv]
A.J. Lazaro, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
arXiv:1806.10113 [cs.DC],
July 2018
2017
A Tasks Reordering Model to Reduce Transfers Overhead on GPUs [doi]
A.J. Lazaro-Muñoz, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
Journal of Parallel and Distributed Computing,
109, November 2017, pp. 258-271
Efficient OpenCL-based Concurrent Tasks Offloading on Accelerators [doi]
A.J. Lazaro-Muñoz, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
International Conference on Computational Science (ICCS’17),
Zurich (Switzerland), June 2017
(Elsevier Procedia Computer Science, Vol. 108, P. Koumoutsakos, M. Lees, V. Krzhizhanovskaya, J. Dongarra and P. Sloot, Eds., pp. 1353-2357)
Collaborative Computing for Heterogeneous Integrated Systems [doi]
L-W. Chang, J. Gomez-Luna, I.E. Hajj, S. Huang, D. Chen, W-M. Hwu
8th ACM/SPEC on International Conference on Performance Engineering (ICPE’17),
L’Aquila (Italy), April 2017
Chai: Collaborative Heterogeneous Applications for Integrated-Architectures [doi]
J. Gomez-Luna, I.E. Hajj, L-W. Chang, V. Garcia-Flores, S.G. de Gonzalo, T.B. Jablin, A.J. Peña, W-M. Hwu
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS’17),
Santa Rosa (CA), USA, April 2017
2016
Configurable XOR Hash Functions for Banked Scratchpad Memories in GPUs [doi]
G-J. van den Braak, J. Gomez-Luna, J.M. Gonzalez-Linares, H. Corporaal, N. Guil
IEEE Transactions on Computers,
65 (7), July 2016, pp. 2045-2058
In-Place Matrix Transposition on GPUs [doi]
J. Gomez-Luna, I-J. Sung, L-W. Chang, J.M. Gonzalez-Linares, N. Guil, W-M. W. Hwu
IEEE Transactions on Parallel and Distributed Systems,
27 (3), March 2016, pp. 776-788
2015
Calculation of Dense Trajectory Descriptors on a Heterogeneous Embedded Architecture [doi]
J.R. Cozar, M.J. Marin-Jimenez, J.M. Gonzalez-Linares, N. Guil, J. Gomez-Luna
Journal of Systems Architecture,
61 (10), November 2015, pp. 659-667
2014
CUVLE: Variable-Length Encoding on CUDA [doi]
A. Fuentes-Alventosa, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil
Conference on Design & Architectures for Signal & Image Processing (DASIP’14),
Madrid (Spain), October 2014
Asynchronous Tasks Queue Scheme on GPU [link]
A.J. Lazaro-Muñoz, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil
XXV Jornadas de Paralelismo (JJPP'14) (part of the Jornadas Sarteco),
Valladolid (Spain), September 2014
Low-Textured Regions Detection for Improving Stereoscopy Algorithms [doi]
S. Ibarra-Delgado, J.R. Cozar, J.M. Gonzalez-Linares, J. Gomez-Luna, N. Guil
International Conference on High Performance Computing & Simulation (HPCS’14),
Bologna (Italy), July 2014, pp. 676-680
In-Place Transposition of Rectangular Matrices on Accelerators [doi]
I-J. Sung, J. Gomez-Luna, J.M. Gonzalez-Linares, N. Guil, W-M. Hwu
19th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’14),
Orlando (FL), USA, February 2014, pp. 207-218
2013
A Robust and Low Resource FPGA-based Stereoscopic Vision Algorithm [doi]
S. Ibarra-Delgado, M. Hernandez-Calviño, N. Guil, J. Gomez-Luna
International Conference on Reconfigurable Computing and FPGAs (ReConFig'13),
Cancun (Mexico), December 2013
Performance Modeling of Atomic Additions on GPU Scratchpad Memory [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
IEEE Transactions on Parallel and Distributed Systems,
24 (11), November 2013, pp. 2273-2282
K-Means con Ordenación, Actualización y Desigualdad Triangular en GPU [link]
A.J. Lazaro-Muñoz, N. Guil, J.M. Gonzalez-Linares, J. Gomez-Luna
XXIV Jornadas de Paralelismo (JJPP'13) (part of the Jornadas Sarteco),
Madrid (Spain), September 2013
An Optimized Approach to Histogram Computation on GPU [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
Machine Vision and Applications,
24 (5), July 2013, pp. 899-908
2012
Performance Models for Asynchronous Data Transfers on Consumer Graphics Processing Units [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
Journal of Parallel and Distributed Computing,
72 (9), September 2012
2011
Egomotion Compensation and Moving Objects Detection Algorithm on GPU [doi]
J. Gomez-Luna, H. Endt, W. Stechele, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
International Conference on Parallel Computing (ParCo’11),
Ghent (Belgium), August-September 2011
(Advances in Parallel Computing, Vol. 22: Applications, Tools and Techniques on the Road to Exascale Computing, IOS Press, pp. 183-190, 2012)
Load Balancing Versus Occupancy Maximization on Graphics Processing Units: The Generalized Hough Transform as a Case Study [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, E.L. Zapata, N. Guil
International Journal of High Performance Computing Applications,
25 (2), May 2011, pp. 205-222
2009
FPGA Implementation of the Generalized Hough Transform [doi]
S.R. Geninatti, J.I. Benavides, M. Hernandez-Calviño, N. Guil, J. Gomez-Luna
International Conference on Reconfigurable Computing and FPGAs (ReConFig'09),
Quintana Roo, Mexico, December 2009
Analisis de la Capacidad Stream Management de CUDA para Procesamiento de Video
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
XXI Jornadas de Paralelismo (JJPP'09),
A Coruña (Spain), September 2009
Parallelization of a Video Segmentation Algorithm on CUDA-Enabled Graphics Processing Units [doi]
J. Gomez-Luna, J.M. Gonzalez-Linares, J.I. Benavides, N. Guil
15th International Conference on Parallel and Distributed Computing (Euro-Par'09),
Delft, The Netherlands, August 2009
(Springer, LNCS 5704, H. Sips, D. Epema and H-X. Lin, Eds., pp. 924-935)