Research Work

VSSAD, Intel Massachusetts Inc., Hudson, MA




Current Status:

Aamer Jaleel currently works with VSSAD, an advanced development group at Intel Massachusetts, Inc. His current research focuses on memory system optimizations for CMPs and on workload characterization.

Research Interests:

  • Micro-architecture
  • Memory Systems
  • Workload Characterization

Publications:

    HPCA - 2010 "Explaining Cache SER Anomaly Using DUE AVF Measurement", Arijit Biswas, Charles Recchia, Shubhendu Mukherjee, Vinod Ambrose, Leo Chan, Aamer Jaleel, Athanasios Papathanasiou, Mike Plaster, and Norbert Seifert. To appear in the International Symposium on High Performance Computer Architecture (HPCA), Bangalore, India, January 2010.
    ISPASS - 2009 "CMPSched$im: Evaluating OS/CMP Interaction on Shared Cache Management", Jaideep Moses, Konstantinos Aisopos, Aamer Jaleel, Ravishanker Iyer, Ramesh Illikkal, Don Newell, and Srihari Makineni. In the International Symposium on Performance Analysis of Systems and Software (ISPASS), Boston, MA, April 2009.
    PACT - 2008 "Adaptive Insertion Policies for Managing Shared Caches on CMPs", Aamer Jaleel, William Hasenplaugh, Moinuddin Qureshi, Julien Sebot, Simon C. Steely Jr., and Joel Emer. In the International Conference on Parallel Architectures and Compilation Techniques (PACT), Toronto, Canada, October 2008. (slides, sample code)
    MoBS - 2008 "CMP$im: A Pin-Based On-The-Fly Multi-Core Cache Simulator", Aamer Jaleel, Robert S. Cohn, Chi-Keung Luk, and Bruce Jacob. In the Fourth Annual Workshop on Modeling, Benchmarking and Simulation (MoBS), co-located with ISCA'2008.
    CAECW - 2008 "Memory Characterization of SPEC CPU2006 Benchmark Suite", Jun Min Lin, Yu Chen, Wenlong Li, Zhao Tang, and Aamer Jaleel. In Workshop for Computer Architecture Evaluation of Commercial Workloads (CAECW), co-located with HPCA'2008.
    CAECW - 2008 "Memory Characterization of Emerging Recognition-Mining-Synthesis Workloads for Multi-Core Processors", Yu Chen, Wenlong Li, Jun Min Lin, Zhao Tang, and Aamer Jaleel. In Workshop for Computer Architecture Evaluation of Commercial Workloads (CAECW), co-located with HPCA'2008.
    Top Picks - 2008 "Set-Dueling-Controlled Adaptive Insertion For High-Performance Caching", Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely Jr., and Joel Emer. To appear in IEEE Micro, Special Issue: Micro's Top Picks from 2007 Computer Architecture Conferences (MICRO TOP PICKS).
    VSSAD TR - 2007 "Memory Characterization of Workloads Using Instrumentation-Driven Simulation -- A Pin-based Memory Characterization of the SPEC CPU2000 and SPEC CPU2006 Benchmark Suites", Aamer Jaleel, VSSAD Technical Report 2007. (Workload Characterization)
    ISCA - 2007 "Adaptive Insertion Policies for High-Performance Caching", Moinuddin K. Qureshi, Aamer Jaleel, Yale N. Patt, Simon C. Steely Jr., and Joel Emer. In Proceedings of the 34th International Symposium on Computer Architecture (ISCA), San Diego, CA, June 2007. (One of the 10 computer architecture papers of 2007 selected as Top Picks by IEEE Micro.)
    ISPASS - 2007 "Using Hardware-Software Co-Simulation To Understand The Memory Performance of Parallel Data-Mining Workloads on Small, Medium, and Large-Scale CMPs", Wenlong Li, Eric Li, Aamer Jaleel, Jiulong Shan, Yurong Chen, Qigang Wang, Ravi Iyer, Ramesh Illikkal, Yimin Zhang, Dong Liu, Michael Liao, Wei Wei, and John Du. In Proceedings of the 7th International Symposium on Performance Analysis of Systems and Software (ISPASS), San Jose, CA, April 2007.
    ISPASS - 2007 "Cross Binary Simulation Points", Erez Perelman, Jeremy Lau, Harish Patil, Aamer Jaleel, Greg Hamerly, and Brad Calder. In Proceedings of the 7th International Symposium on Performance Analysis of Systems and Software (ISPASS), San Jose, CA, April 2007.
    HPCA - 2007 "Fully-Buffered DIMM memory architectures: Understanding mechanisms, overheads and scaling", Brinda Ganesh, Aamer Jaleel, David Wang, and Bruce Jacob. In Proceedings of the 13th International Symposium on High Performance Computer Architecture (HPCA), Phoenix, Arizona, February 2007.
    DTTC - 2006 "CMP$im: Using Pin to Characterize Memory Behavior of Emerging Workloads on CMPs", Aamer Jaleel, Robert S. Cohn, and Chi-Keung Luk. Intel Design, Test, and Technologies Conference (DTTC 2006), San Jose, CA, August 2006.
    Tutorial - 2006 "Using the Pin Instrumentation Tool for Computer Architecture Research", Aamer Jaleel, Chi-Keung Luk, Bobbie Manne, and Harish Patil. Tutorial held in conjunction with ISCA 2006, Boston, MA, June 2006. (Webpage)
    HPCA - 2006 "Last Level Cache (LLC) Performance of Data Mining Workloads On A CMP -- A Case Study of Parallel Bioinformatics Workloads", Aamer Jaleel, Matthew Mattina, and Bruce Jacob. In Proceedings of the 12th International Symposium on High Performance Computer Architecture (HPCA), Austin, Texas, February 2006. (slides) (BioParallel Suite)
    IEEE-TC - 2006 "In-Line Interrupt Handling and Lock-Up Free Translation Look-Aside Buffers (TLBs)", Aamer Jaleel and Bruce Jacob. In IEEE Transactions on Computers, Vol. 55, No. 5, May 2006.
    CAN - 2005 "DRAMsim: A memory-system simulator", David Wang, Brinda Ganesh, Nuengwong Tuaycharoen, Katie Baynes, Aamer Jaleel, and Bruce Jacob. SIGARCH Computer Architecture News (CAN), vol. 33, no. 4, pp. 100-107, September 2005.
    ISPASS - 2005 "BioBench: A Benchmark Suite of Bioinformatics Applications", Kursad Albayraktaroglu, Aamer Jaleel, Xue Wu, Manoj Franklin, Bruce Jacob, Chau-Wen Tseng and Donald Yeung. In Proceedings of the 5th International Symposium on Performance Analysis of Systems and Software (ISPASS), Austin, Texas, March 2005. (BioBench Website)
    HPCA - 2005 "Using Virtual Load/Store Queues (VLSQs) to Reduce the Negative Effects of Reordered Memory Instructions", Aamer Jaleel and Bruce Jacob. In Proceedings of the 11th International Symposium on High Performance Computer Architecture (HPCA), San Francisco, California, February 2005. (Awarded Best Presentation) (slides)
    HiPC - 2001 "Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers", Aamer Jaleel and Bruce Jacob. In Proceedings of the 8th International Conference on High Performance Computing (HiPC), Hyderabad, India, December 2001. (slides)
    ICCD - 2001 "In-Line Interrupt Handling for Software-Managed TLBs", Aamer Jaleel and Bruce Jacob. In Proceedings of the 19th International Conference on Computer Design (ICCD), Austin, Texas, September 2001. (slides)

Honors/Awards:

  • Best Student Paper Presentation, High Performance Computer Architecture (HPCA), February 2005
  • George Corcoran Award for Excellence in Teaching, September 2001
  • Undergraduate Teaching Fellowship, Fall 1999, Spring 2000

Technical Reports:

    [UMD-SCA 2006] "CMP$im: A Binary Instrumentation Approach to Modeling Memory Behavior of Workloads on CMPs", Aamer Jaleel, Robert S. Cohn, Chi-Keung Luk, and Bruce Jacob. Technical Report - UMD-SCA-2006-01

    [UMD-SCA 2005] "Disorder - The Reordering of Memory Instructions In Out-of-Order Systems", Aamer Jaleel and Bruce Jacob. Technical Report - UMD-SCA-2005-01


Theses/Proposals:

    [Master's Thesis] "In-Line Interrupt Handling and Lock-Up Free TLBs", Aamer Jaleel. Master's Thesis, University of Maryland, College Park. April 2002.

    [Ph.D. Research Proposal] "The Effects of OoO Execution on the Memory System", Aamer Jaleel. Ph.D. Research Proposal, University of Maryland, College Park. November 2004. (Proposal Exam Slides)


Non-Conference Talks:

    "Characterizing The LLC Cache Behavior of OpenMP Bioinformatic Workloads". Aamer Jaleel, May 2005, Intel Massachusetts Inc.

    "Design Trade-Offs of On-Chip Routers in High Performance Microprocessors". Aamer Jaleel, January 17th 2002, Intel Santa Clara.

    "The Impact of Out-of-Order Execution on Memory System Behavior", Summer Intern Presentation, Compaq Western Research Lab (WRL), August 29, 2001.


Industry Internship Experience:

    My experiences as an intern in industry.

Grad School Projects:

    Fully Buffered DIMMs (FBDs): Explored the design parameters involved in implementing Fully Buffered DIMMs. Helped implement a performance model for Fully Buffered DIMMs based on information presented at Denali MemCon.

    Embedded Systems: Investigated hardware-based methods to improve the performance of embedded operating systems. Reverse-engineered the Texas Instruments MSP430 embedded microcontroller in Verilog. The Verilog RTL supports interrupts, serial I/O, and a watchdog timer, and is capable of running an embedded OS (e.g. uCOS) and simple embedded applications. The RTL is expected to be fabricated in 0.25 µm technology through MOSIS. Expected PTQ is August 2004. My contributions to the Verilog RTL include the implementation, verification, and synthesis of the following modules: Serial I/O, SRAM, Register File, Watchdog Timer, and the Control Logic for the CPU core.

    Page-Based Commands for DRAM Systems: Investigated the need for replacing frequent, sequential reads and writes to memory with page-based commands. Modified and recompiled Linux kernel 2.4.19 to gather statistics on the frequency and parameter sizes of calls to the functions memcpy and memset (in both user and kernel domains). Proposed adding simple new commands, issued by the processor to the memory controller, to implement page-based commands. Discussed the possible improvements in latency and bandwidth on several representative SPEC2000 benchmarks.

    Impact of Aggressive Out-Of-Order Techniques on Memory Systems: Studied the impact of aggressive out-of-order techniques (increased issue widths and reorder buffer sizes) on system performance for several representative SPEC2000 benchmarks. This study contributed to a conference paper (HPCA'05).

    Lock-Up Free TLBs: Designed and simulated a novel method to speed up the processing of software-managed interrupts, TLB interrupts in particular. This study was part of my Master's thesis and contributed to two conference papers (ICCD'01, HiPC'01) and a journal paper in IEEE Transactions on Computers.

    Virtual Memory: Simulated the page table mechanisms of different architectures including Mach, Intel, PA-RISC, PowerPC, UltraSparc, PUMA, Winchip, and Ultrix. Characterized these mechanisms by whether they can guarantee performance and whether they can complete in a reasonable amount of time. This study contributed to a journal paper in progress.

    Operating Systems: Designed a software environment to simulate a 16-bit pipelined architecture, with a simple operating system that ran atop the simulated architecture. The simulator was written in C; the operating system was written in the assembly language (DLX) of the simulated architecture. The full-system emulation will be used to teach future sections of ENEE 350: Computer Organization.


Education:

  • Ph.D. (Electrical Engineering), University of Maryland - College Park MD, 2006
  • M.S. (Electrical Engineering), University of Maryland - College Park MD, 2002
  • B.S. (Computer Engineering), University of Maryland - College Park MD, 2000

Ph.D. Advisor:

    Dr. Bruce L. Jacob (blj at umd dot edu) www: http://www.ece.umd.edu/~blj/

Class Projects:

  • Implemented the Bluetooth wireless communication protocol in Verilog. The implementation was then synthesized onto an FPGA using Xilinx tools. The Verilog model of Bluetooth supported communication between seven clients and also included reliable communication.
  • Implemented an x86 operating system with keyboard/screen drivers, multiprogramming, communication and synchronization, memory management, and resource scheduling policies.
  • Implemented encryption algorithms such as Vigenere, IDEA, and RSA in Java. Implemented a secure chat client and server application using the Cryptix Java security library.
  • Implemented a packet-sniffing application for integrity analysis of file transfers across a network. The application captured all files being shared over a network and maintained a "network seen file system".
  • Implemented the interactive computer game of Battleship, with artificial intelligence, and network gaming capability.


Please send comments to: ajaleel at glue dot umd dot edu