(Intel GmbH, Germany Microprocessor Lab, Braunschweig)
My research assesses the implications of future DRAM technologies (such as eDRAM, graphics RAM and HBM) and Non-Volatile Memory (NVM) technologies (such as Phase-Change Memory [PCM], 3D XPoint and Spin-Transfer Torque [STT] RAM) on the organization of the memory hierarchy at several levels, including software layers. My contributions range from designing individual controller features to whole memory subsystem setups that orchestrate the movement of data according to design objectives for energy efficiency, performance and cost. Architectural choices can affect the resilience and security of the system, as well as the programming model (e.g., for software-managed persistent memories). My evaluations are based on co-emulation of hardware and software components with FPGAs and system simulators in order to assess the benefits that large-capacity, durable, byte-addressable NVM brings to in-memory computing, data mining & analytics, and responsiveness in handheld (tablet, smartphone) and server applications.
Looking at System-on-Chip (SoC) architectures over the next decade, we will have to handle highly concurrent and heterogeneous chip architectures that are constrained by power dissipation and thermal limits. Traditionally, thermal, power and workload management have been treated separately when optimizing for performance and energy efficiency. High-end SoC architectures already have 50+ manageable processing and communication resources (processors, accelerators, caches, buses, memory channels), so new integrated approaches are needed to fully use these architectures by orchestrating individual workload, power and performance states. Together with partners at the University of Bologna, University of Edinburgh and TU München, I particularly investigate how application-level profiling and instrumentation steps under the programmer's control (such as access to dynamic voltage and frequency scaling [DVFS] and sensor input) can steer lower-level resource management decisions.
I was a member of the DDR3 memory controller team responsible for specification, design and validation of the four controllers in the Single-chip Cloud Computer (SCC), implemented in 45nm CMOS. Intel Labs announced the experimental chip on Dec. 2nd, 2009 at several locations including Braunschweig and San Francisco. Details of the chip were presented at ISSCC in Feb. 2010 and two symposia. Technical material about SCC was shared in Intel's former MARC community (communities.intel.com/community/marc).
I gave a talk about the SCC at the 3rd Workshop on Emerging Applications and Many-Core Architecture (EAMA), held at the 37th Int'l Symposium on Computer Architecture (ISCA) in June 2010. I also regularly reviewed contributions to Intel's Manycore Applications Research Community (MARC) and associated symposia from 2010 to 2013.
M. Gries, U. Hoffmann, M. Konow, M. Riepen: SCC: A Flexible Architecture for Many-Core Platform Research, Novel Architectures Column, Computing in Science and Engineering, vol. 13(6), pages 79-83, Nov./Dec. 2011
N. Ioannou, M. Kauschke, M. Gries, M. Cintra: Phase-based Application-driven Hierarchical Power Management on the Single-chip Cloud Computer, 20th Int'l Conference on Parallel Architectures and Compilation Techniques (PACT), pages 131-142, Oct. 2011
J. Howard, S. Dighe, S.R. Vangal, G. Ruhl, N. Borkar, S. Jain, V. Erraguntla, M. Konow, M. Riepen, M. Gries, G. Droege, T. Lund-Larsen, S. Steibl, S. Borkar, V.K. De, R. Van Der Wijngaart: A 48-Core IA-32 Processor in 45nm CMOS Using On-Die Message-Passing and DVFS for Performance and Power Scaling, IEEE Journal of Solid-State Circuits (JSSC), vol. 46(1), pages 173-183, Jan. 2011
B. Dietrich, S. Nunna, D. Goswami, S. Chakraborty, M. Gries: LMS-based Low-Complexity Game Workload Prediction for DVFS, 28th IEEE Int'l Conference on Computer Design (ICCD), pages 417-424, Oct. 2010
A. Bartolini, M. Cacciari, A. Tilli, L. Benini, M. Gries: A Virtual Platform Environment for Exploring Power, Thermal and Reliability Management Control Strategies in High-performance Multicores, ACM Great Lakes Symposium on VLSI (GLSVLSI), pages 311-316, May 2010