I am an Assistant Professor of Computer
Science at Stony Brook University. I co-direct the
Computer Architecture Stony Brook (COMPAS) Lab. Prior to joining
Stony Brook, I completed my Ph.D. at Carnegie Mellon University
(CMU) under the
supervision of Babak Falsafi. While completing my
dissertation, I spent several years working remotely from Ecole
Polytechnique Fédérale de Lausanne (EPFL).
My research interests are in the area of computer
architecture, with emphasis on the design of server systems. I
work on the entire computing stack, from server software and
operating systems, to networks and processor microarchitecture.
My current research projects include FPGA accelerator integration
into server environments (e.g., Intel HARP, Microsoft Catapult,
and Amazon F1), FPGA programmability (e.g., virtual memory and
high-level synthesis), accelerators for machine learning (e.g.,
convolutional neural networks), efficient network processing and
software-defined networking, speculative performance and
energy-enhancing techniques for high-performance processors, and
programming models and mechanisms for emerging memory
technologies (e.g., HBM and 3D XPoint).
If you are a PhD student at Stony Brook and want to work
with me, please send me an email to
arrange an appointment.
- Proactive Instruction Fetch. In 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2011.
- Toward Dark Silicon in Servers. In IEEE Micro, volume 31, 2011.
- Cuckoo Directory: A Scalable Directory for Many-Core Systems. In 17th IEEE International Symposium on High Performance Computer Architecture (HPCA), 2011. (selected by the program committee for Best Student Papers session)
- Spatial Memory Streaming. In Journal of Instruction-Level Parallelism (JILP), volume 13, 2011.
Courses I have taught or am currently teaching:
- Fall '17 - CSE 506 - Operating Systems (grad)
- Spring '17 - CSE 502 - Computer Architecture (grad)
- Spring '17 - CSE 356 - Cloud Computing
- Fall '15 - CSE 506 - Operating Systems (grad)
- Fall '15 - CSE 391(356) - Cloud Computing
- Spring '15 - CSE 506 - Operating Systems (grad)
- Fall '14 - CSE 602 - Advanced Computer Architecture (grad)
- Spring '14 - CSE 502 - Computer Architecture (grad)
- Fall '13 - CSE 506 - Operating Systems (grad)
- Spring '13 - CSE 502 - Computer Architecture (grad)
- Fall '12 - CSE 59x(602) - Datacenters (grad)
Assistant Professor, Computer Science, Stony Brook University
Co-Director, Computer Architecture at Stony Brook (COMPAS) Laboratory
Curriculum Vitae - June 2018
+1 (631) 632-8449
Department of Computer Science
343 New Computer Science
Stony Brook, NY 11794-2424
Computer architecture, with particular emphasis on the design of efficient server systems.
Most recently, my main focus has been on Machine Learning Accelerators, developing hardware techniques to enable fast and efficient implementations of deep learning, and making FPGA-based accelerators more practical and easier to program.
More broadly, my work seeks to understand the fundamental properties and interactions of application software, operating systems, networks, processor microarchitecture, and datacenter dynamics, to enable software and hardware co-design of high-performance, power-efficient, and compact servers.
- Carnegie Mellon University Pittsburgh, PA
- Ph.D. in Electrical and Computer Engineering June 2012
- M.S. in Electrical and Computer Engineering December 2002
- B.S. in Electrical and Computer Engineering December 2002
- B.S. in Computer Science May 2002
Honors and Awards
- David R. Smith Young Scholar in Computer Science Award (2016-2020)
- NSF CAREER Award (2015)
- Graduate Teaching Award (2014)
- Best Paper Award at the 11th International Conference on Virtual Execution Environments (VEE) for "A Comprehensive Implementation and Evaluation of Direct Interrupt Delivery."
- IEEE Micro Top Picks from Computer Architecture Conferences of 2013, "A Case for Specialized Processors for Scale-Out Workloads."
- Best Paper Award at the 17th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) for "Clearing the Clouds: A Study of Emerging Scale-out Workloads on Modern Hardware."
- Best Paper Finalist at the 17th International Symposium on High-Performance Computer Architecture (HPCA) for "Cuckoo Directory: A Scalable Directory for Many-Core Systems."
- Paper Award from the European Network of Excellence on High Performance and Embedded Architecture and Compilation (HiPEAC) for "Cuckoo Directory: A Scalable Directory for Many-Core Systems."
- IEEE Micro Top Picks from Computer Architecture Conferences of 2009, "R-NUCA: Data Placement in Distributed Shared Caches."
- IEEE Micro Top Picks from Computer Architecture Conferences of 2009, "Practical Off-chip Meta-data for Temporal Memory Streaming."
- 2005 DARPA Grand Challenge driverless desert race, 2nd and 3rd place autonomous vehicles for RedTeam.
Publications (Total: 35, Conference: 23, Journal: 7; Google Scholar: 2449 citations; ISI Web of Science: 505 citations)
- System and Method for Fused Computation of Convolutional Neural Network Layers.
Manoj Alwani, Michael Ferdman, and Peter Milder. Filed October 11, 2016. (pending)
- Stony Brook University Stony Brook, NY
- CSE 506 - Graduate Operating Systems Fall 2017
- CSE 502 - Graduate Computer Architecture Spring 2017
- CSE 356 - Cloud Computing Spring 2017
- CSE 506 - Graduate Operating Systems Fall 2015
- CSE 356(391) - Cloud Computing Fall 2015
- CSE 506 - Graduate Operating Systems Spring 2015
- CSE 602 - Graduate Advanced Computer Architecture Fall 2014
- CSE 502 - Graduate Computer Architecture Spring 2014
- CSE 506 - Graduate Operating Systems Fall 2013
- CSE 502 - Graduate Computer Architecture Spring 2013
- CSE 602 - Graduate Advanced Computer Architecture Fall 2012
- Ecole Polytechnique Fédérale de Lausanne Lausanne, Switzerland
- TA - Advanced Topics on Memory Systems (graduate) Spring 2009 (Babak Falsafi)
- TA - Multiprocessor Architecture (graduate) Fall 2008 (Babak Falsafi)
- Carnegie Mellon University Pittsburgh, PA
- TA - Multiprocessor Architecture (graduate) Spring 2006 (Babak Falsafi)
- TA - Advanced Techniques in Microprocessors (PhD) Fall 2005 (Babak Falsafi)
- TA - Operating Systems (undergraduate) Fall 2001 (Gregory Kesden)
- TA - Embedded Systems (undergraduate) Fall 2001 (Raj Rajkumar)
- Telinta, Inc. Springfield, NJ
- Chief Technology Officer 2002-
- Cadence Design Systems Pittsburgh, PA
- Software Engineer April 2004-August 2007
- Neolinear, Inc. (startup acquired by Cadence) Pittsburgh, PA
- Software Engineer March 2003-April 2004
- Automatika, Inc. Pittsburgh, PA
- Independent Contractor September 2002-January 2003
- National Robotics Engineering Consortium Pittsburgh, PA
- Circuit Designer and Software Engineer February 2001-May 2002
- Organizing committees: ISCA'17 (finance chair), IISWC'17 (travel grant chair), HPCA'17 (workshops & tutorials chair), ISPASS'17 (workshops & tutorials chair), ISPASS'16 (publication chair), ACM SRC at CGO'15 (local organizer), ISPASS'15 (publication chair), MICRO'14 (publication chair), ISPASS'14 (web chair)
- Program committees: HPCA'19, MICRO'18, ICCD'18, IISWC'18, ISCA'18, DAC'18, GLSVLSI'18, HPCA'18, MICRO'17, ISCA'17, HPCA'17, CRC'17, ISCA'16, IISWC'16, ISPASS'16, MICRO'15, IISWC'15, ISCA'15, CGO'15, MICRO'14, ICS'14, ICPP'14, HiPEAC'14, ICCD'13, WIVOSCA'13, DATE'13, CCGrid'13, ISPASS'13, IPDPS'13
- NSF invited workshops: Workshop on Sustainable Data Centers '15, XPS Workshop on Exploiting Parallelism and Scalability '15
- External reviewer: CAL'17, IEEE Micro'17, ACM TACO'17, ACM TOS'16, MICRO'16, ACM TACO'16, HPCA'16, ACM TACO'15, CAL'15, HPCA'15, ASPLOS'15, CF'14, ISCA'14, TC'14, HPCA'14, PPoPP'14, CAL'13, DAC'13, HPCA'13, JCST'13, MICRO'12, IISWC'12, CAL'12, HPCA'12, IISWC'11, MICPRO/DSD'11, ICS'11, ISCA'11, HPCA'11, HiPEAC'10, ISCA'10, HPCA'10, JPDC'09
- NSF service: 2016 (panelist, reviewer), 2014 (panelist)
- Invited Lectures and Talks: Cloud Computing course at HiPEAC ACACES'17, Keynote at RAPIDO'13
- PhD committees: Weicheng Liu (Low Voltage Clocking Methodologies for Nanoscale ICs), Tan Li (Harness Multicore Parallelism for High Performance Data Replication), Fatima Zarinni (Understanding and Improving Performance in Next-Generation WiFi and Cellular Networks), Mingwei Zhang (Static Binary Instrumentation with Applications to COTS Software Security), Niranjan Hasabnis (Infrastructure for Architecture-independent Binary Analysis and Transformation), Vasily Tarasov (Multi-dimensional Analysis of I/O Workloads for Modern Storage Systems), Zhichao Li (GreenDM: A Versatile Tiering Hybrid Drive for the Trade-Off Evaluation of Performance, Energy, and Endurance), Cheng-Chun (William) Tu (Memory-based Rack-area Network)
- MS committees: Bharath Kumar Reddy Vangoor (To FUSE or not to FUSE?), Kavita Agarwal (A Study of Virtualization Overheads), Arun Olappamanna Vasudevan (Finding the right balance - Security vs Performance with Network Storage Systems)
- Department Service: CS Operations Committee (S'17, F'16, S'16), Graduate Committee (S'17, F'16, S'16), Undergraduate Committee (S'16), Open House Chair (S'17, S'16, S'15), Graduate Admission Committee (S'17, F'16, S'16, F'15, S'15, S'14, F'14, F'13, S'13, F'12), Faculty Recruitment Committee (S'17, S'14, F'14), Department Orientation Organizer (F'16)
- Co-developer of CloudSuite, a benchmark suite for scale-out workloads.
- Co-developer (2005 to 2012) of FLEXUS, a scalable, full-system, cycle-accurate multi-processor and multi-core simulation framework.
- SIMFLEX and ProtoFlex: Fast, Accurate, and Flexible Simulation of Computer Systems Tutorial at
- 2010 IEEE International Symposium on Workload Characterization (IISWC). Atlanta, GA, December 2010 with Eric Chung, Pejman Lotfi-Kamran, and Michael Papamichael.
- 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO). New York, NY, December 2009 with Eric Chung and Michael Papamichael.
- 17th International Conference on Parallel Architectures and Compilation Techniques (PACT), Toronto, Canada, October 2008 with Eric Chung and Nikos Hardavellas.
- Organizer of the Fall 2009 weekly seminar of the Systems Labs at Ecole Polytechnique Fédérale de Lausanne.
- Organizer of the Fall 2007 weekly seminar of the Computer Architecture Lab at Carnegie Mellon (CALCM).
- Member, IEEE Computer Society, ACM SIGARCH, ACM SIGMICRO, ACM SIGOPS, HiPEAC Associate.
- Intel Corporation - FPGA Hardware for research
Donation, equipment ($5,500), 6/22/2018
- National Science Foundation - SPX: Harnessing the Power of High-Bandwidth Memory via Provably Efficient Parallel Algorithms
PI, $750,000 ($500,000 SBU, $250,000 WUSTL), 9/15/2017 - 8/14/2021
- Xilinx Corporation - FPGA Hardware for research
Donation, equipment ($7,000), 7/28/2017
- Samsung - SSD Hardware for research
Donation, equipment ($2,000), 7/6/2017
- National Science Foundation - Domestic student travel grant funding for IISWC
PI, $15,000, 6/01/2017 - 12/31/2017
- National Science Foundation - EAGER: Measuring the Stability of Web Links
Co-PI, $89,200, 4/15/2017 - 10/15/2017
- National Science Foundation - Research Experiences for Undergraduates: Secure and Efficient Cloud Infrastructure and Accessibility Services
PI, $21,900, 8/10/2016 - 8/9/2017
- National Science Foundation - EAGER: Preliminary Study to Demonstrate Feasibility and Advantages of Massively Parallel Server Processors
Co-PI, $146,000, 10/1/2016 - 9/30/2017
- Oracle Labs - Exploring Custom Graph Algorithms with PGX and Green-Marl
Gift, $55,000, 8/17/2016
- Google - Taming the Killer Microsecond
Gift, $58,500, 9/2/2016
- National Science Foundation - XPS: FPGA Cloud Platform for Deep Learning, Applications in Computer Vision
PI, $875,000 ($574,000 SBU, $301,000 UNC), 9/1/2015 - 8/31/2019
- Intel Corporation - Hardware for research
Donation, equipment ($21,600), 8/6/2015
- National Science Foundation - CAREER: Leveraging temporal streams for micro-architectural innovation in data center servers
PI, $500,000, 2/15/2015 - 1/31/2020
- National Science Foundation - EAGER: Preliminary Study to Demonstrate the Performance and Power Advantages of FPGAs for Deep Learning in Computer Vision
PI, $95,000, 8/1/2014 - 7/31/2016
- Altera Corporation - FPGA Hardware for research
Donation, equipment ($16,000), 10/22/2014
- Cavium - Support of research activities
Gift, $34,400 + equipment, 7/17/2014
- National Science Foundation - CRI: Secure and Efficient Cloud Infrastructure and Accessibility Services
PI, $200,000, 9/1/2014 - 8/31/2017
- Semiconductor Research Corporation - Flexible Hardware Acceleration of the Network Stack for Performance and Energy Efficiency
PI, $300,000, 1/1/2014 - 1/31/2017
PhD (5 students)
- Varun Agrawal, 2013-present
- Shenghsun Cho, 2014-present
- Yongming Shen, 2014-present
- Mina Abbasi Dinani, 2016-present
- Sergey Madaminov, 2016-present
MS Thesis (2 students)
- Tapti Palit, 2014-2015, Benchmarking Network-Intensive Applications
- Manoj Alwani, 2015-2016, Fused Convolutional Neural Network Accelerators
MS Advanced Project (50 students)
BS Honors Project (1 student)
BS Research (4 students)
Chaitanya Chakka Krishna,
Dhruva Kumar Devineni,
Jerrin Shaji George,
Rajendra Kumar Raghupatruni,
Ravi Prakash Pandey,
Shreyas Prabhu Binnamangala,
Shyam Sundar Chandrasekaran,
Srinath Battula Yagna Reddy,
Sunad S Bhandary,
These days, it seems like everyone's favorite hobby is to travel. Below is a map that shows the countries I have visited.