Omer Khan

Assistant Professor
Electrical and Computer Engineering
University of Connecticut

Contact Information:
Office: ITE Building 447
Email: khan[_at_]uconn.edu
Phone: (860) 486-2192
Mail: 371 Fairfield Way, Unit 4157, Storrs, CT 06269 USA
Department Page

Publications (DBLP, Google Scholar)

2017

Accelerating Graph and Machine Learning Workloads Using a Shared Memory Multicore Architecture with Auxiliary Support for in-Hardware Explicit Messaging, H. Dogan, M. Ahmad, F. Hijaz, B. Kahne, P. Wilson, O. Khan, IEEE International Parallel and Distributed Processing Symposium, (IPDPS), May/June 2017.
Exploiting the Tradeoff between Program Accuracy and Soft-error Resiliency Overhead for Machine Learning Workloads, Q. Shi, H. Omar, O. Khan, IEEE Workshop on Silicon Errors in Logic - System Effects, (SELSE), March, 2017.
Efficient Situational Scheduling of Graph Workloads on Single Chip Multicores and GPUs, M. Ahmad, C. Michael, O. Khan, IEEE Micro Special Issue on Cognitive Architectures, (IEEE Micro), vol. 37, pp. 30-40, Jan.-Feb. 2017.
Advancing the State-of-the-Art in Hardware Trojans Detection, S. K. Haider, C. Jin, M. Ahmad, D. M. Shila, O. Khan, M. van Dijk, IEEE Transactions on Dependable and Secure Computing, (TDSC), 2017.
Towards Resilient yet Efficient Parallel Execution of Convolutional Neural Networks, Q. Shi, H. Omar, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2017.
Situationally Adaptive Scheduling of Graph Algorithms on Single-Chip Parallel Machines, M. Ahmad, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2017.

2016

LDAC: Locality-aware Data Access Control for Large-scale Multicore Cache Hierarchies, Q. Shi, G. Kurian, F. Hijaz, S. Devadas, O. Khan, ACM Transactions on Architecture and Code Optimization, (TACO), vol. 13, iss. 4, November, 2016. Presentation at 12th International Conference on High-Performance Embedded Architectures and Compilers, (HiPEAC), January, 2017.
GPU Concurrency Choices in Graph Analytics, M. Ahmad, O. Khan, IEEE International Symposium on Workload Characterization, (IISWC), September 2016.
A Lightweight Spatio-temporally Partitioned Multicore Architecture for Concurrent Execution of Safety Critical Workloads, Q. Shi, K. Lakshminarasimhan, C. Noll, E. Scholte, O. Khan, SAE 2016 Aerospace Systems and Technology Conference, (ASTC), September, 2016. [Public Technical Report]
A Case for a Situationally Adaptive Many-core Execution Model for Cognitive Computing Workloads, M. Ahmad, C. Michael, O. Khan, ASPLOS 2016 International Workshop on Cognitive Architectures, (CogArch), April, 2016.
Locality-aware data replication in the last-level cache for large scale multicores, F. Hijaz, Q. Shi, G. Kurian, S. Devadas, O. Khan, The Journal of Supercomputing, (SUPE), vol. 72, iss. 2, pp. 718-752, February 2016.
Efficient Error-Detection and Recovery Mechanisms for Reliability and Resiliency of Multicores, S. Kundu, O. Khan, IEEE International Conference on VLSI Design, (VLSID), January, 2016.
Tradeoffs in Secure Accelerator Designs, M. Ahmad, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2016.
OGAPI Oblivious Graph Processing in Multicores, M. Ahmad, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2016.
A Case for Deploying Multicores in Cyberphysical Embedded Systems, Q. Shi, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2016.

2015

A Cross-Layer Multicore Architecture to Tradeoff Program Accuracy and Resilience Overheads, Q. Shi, H. Hoffmann, O. Khan, IEEE Computer Architecture Letters, (CAL), vol. 14, No. 2, pp. 85-89, July-Dec., 2015.
OSPREY: Implementation of Memory Consistency Models for Cache Coherence Protocols involving Invalidation-Free Data Access, G. Kurian, Q. Shi, S. Devadas, O. Khan, IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques, (PACT), October 2015.
CRONO: A Benchmark Suite for Multithreaded Graph Algorithms Executing on Futuristic Multicores, M. Ahmad, F. Hijaz, Q. Shi, O. Khan, IEEE International Symposium on Workload Characterization, (IISWC), October 2015. Best Paper Nominee
M-MAP: Multi-Factor Memory Authentication for Secure Embedded Processors, S. K. Haider, M. Ahmad, F. Hijaz, A. Patni, E. Johnson, M. Seita, O. Khan, M. van Dijk, IEEE International Conference on Computer Design, (ICCD), October 2015.
The Execution Migration Machine: Directoryless Shared-Memory Architecture, K. S. Shim, M. Lis, O. Khan, S. Devadas, IEEE Computer, (COMPUTER), Vol. 48, No. 9, pp. 50-59, September 2015.
Efficient Parallelization of Path Planning Workload on Single-chip Shared-memory Multicores, M. Ahmad, K. Lakshminarasimhan, O. Khan, IEEE High Performance Extreme Computing Conference, (HPEC), September 2015.
Efficient Parallel Packet Processing using a Shared Memory Many-core Processor with Hardware Support to Accelerate Communication, F. Hijaz, B. Kahne, P. Wilson, O. Khan, IEEE International Conference on Networking, Architecture, and Storage, (NAS), August 2015.
Exploring the Performance Implications of Memory Safety Primitives in Many-core Processors Executing Multi-threaded Workloads, M. Ahmad, S. K. Haider, F. Hijaz, M. van Dijk, O. Khan, ACM Workshop on Hardware and Architectural Support for Security and Privacy, (HASP), June 2015.
Many-core Architecture Characterization of the Path-Planning Workload, O. Khan, ASPLOS 2015 International Workshop on Cognitive Architectures, (CogArch), March, 2015.
Accelerating Communication in Single-chip Shared Memory Many-core Processors, F. Hijaz, B. Kahne, P. Wilson, O. Khan, Boston area ARChitecture Annual Workshop, (BARC), January, 2015.
HaTCh: Hardware Trojan Catcher, S. K. Haider, C. Jin, M. Ahmad, D. M. Shila, O. Khan, M. van Dijk, IACR Cryptology ePrint Archive 2014, January, 2015 (latest revision May, 2015).

2014

EXECUTION MIGRATION, S. Devadas, O. Khan, M. Lis, K. S. Shim, M. H. Cho, United States Patent No. 8904154, December, 2014.
NUCA-L1: A Non-Uniform Access Latency Level-1 Cache Architecture for Multicores Operating at Near-Threshold Voltages, F. Hijaz, O. Khan, ACM Transactions on Architecture and Code Optimization, (TACO), Vol. 11, No. 3, Article 29, July/August 2014.
Thread Migration Prediction for Distributed Shared Caches, K. S. Shim, M. Lis, O. Khan, S. Devadas, IEEE Computer Architecture Letters, (CAL), Vol. 13, No. 1, pp. 53-56, January-June, 2014.
Rethinking Last-Level Cache Management for Multicores Operating at Near-Threshold Voltages, F. Hijaz, O. Khan, ISCA-41 International Workshop on Near-threshold Computing, (WNTC), June, 2014.
Locality-Aware Data Replication in the Last-Level Cache, G. Kurian, S. Devadas, O. Khan, IEEE International Symposium on High Performance Computer Architecture, (HPCA), February 2014. [talk] Featured in MIT News and Ars Technica
Suppressing the Oblivious RAM Timing Channel While Making Information Leakage and Program Efficiency Trade-offs, C. W. Fletcher, L. Ren, X. Yu, M. van Dijk, O. Khan, S. Devadas, IEEE International Symposium on High Performance Computer Architecture, (HPCA), February 2014.

2013

Toward Holistic Soft-Error-Resilient Shared-Memory Multicores, Q. Shi, O. Khan, IEEE Computer, (COMPUTER), Vol. 46, No. 10, pp. 56-64, October 2013. [Multimedia video interview]
Towards Efficient Dynamic Data Placement in NoC-Based Multicores, Q. Shi, F. Hijaz, O. Khan, 31st IEEE International Conference on Computer Design, (ICCD), October, 2013.
A Private Level-1 Cache Architecture to Exploit the Latency and Capacity Tradeoffs in Multicores Operating at Near-Threshold Voltages, F. Hijaz, Q. Shi, O. Khan, 31st IEEE International Conference on Computer Design, (ICCD), October, 2013.
A Framework to Accelerate Sequential Programs on Homogeneous Multicores, C. W. Fletcher, R. Harding, O. Khan, S. Devadas, IEEE/IFIP International Conference on Very Large Scale Integration, (VLSI-SoC), October, 2013. [MIT CSG Memo]
EM2: A Scalable Shared Memory Architecture for Large-Scale Multicores, O. Khan, M. Lis, K. S. Shim, M. H. Cho, S. Devadas, Book Chapter in Multicore Technology: Architecture, Reconfiguration and Modeling, Edited by M. Y. Qadri and S. J. Sangwine, (CRC Press), July, 2013.
The Locality-Aware Adaptive Cache Coherence Protocol, G. Kurian, O. Khan, S. Devadas, ACM/IEEE International Symposium on Computer Architecture, (ISCA), Tel Aviv, Israel, June, 2013. [talk] Featured in MIT News and Ars Technica
MARTHA: Architecture for Control and Emulation of Power Electronics and Smart Grid Systems, M. Kinsy, I. Celanovic, O. Khan, S. Devadas, IEEE/ACM International Conference on Design, Automation and Test in Europe, (DATE), March, 2013.

2012

Low-Latency Mechanisms for Near-Threshold Operation of Private Caches in Shared Memory Multicores, F. Hijaz, Q. Shi, O. Khan, IEEE/ACM 45th International Symposium on Microarchitecture Workshops, (MICROW), December, 2012.
A Low-Overhead Dynamic Optimization Framework for Multicores, C. Fletcher, R. Harding, O. Khan, S. Devadas, IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques, (PACT), October, 2012.
An Empirical Model for Cooperative Resizing of Processor Structures to Exploit Power-Performance Efficiency at Runtime, O. Khan and S. Kundu, Journal of IET Circuits, Devices & Systems, (IET CDS), Vol. 6, No. 5, pp. 355-365, September 2012.
HORNET: A Cycle-level Multicore Simulator, P. Ren, M. Lis, M. H. Cho, K. S. Shim, C. Fletcher, O. Khan, N. Zheng, S. Devadas, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, (TCAD), Vol. 31, No. 6, pp. 890-903, June 2012.
A Multicore Architecture for Control and Emulation of Power Electronics and Smart Grid Systems Under Hard Real-Time Constraints, M. Kinsy, J. Poon, I. Celanovic, O. Khan, S. Devadas, Work-in-Progress Presentation at 49th Design Automation Conference, (DAC), June 2012.
A Case for Fine-Grain Adaptive Cache Coherence, G. Kurian, O. Khan, S. Devadas, MIT CSAIL Technical Report, MIT-CSAIL-TR-2012-012, (MIT Tech. Report), May, 2012.
A Case for Cache Line Locality-Based Adaptive Cache Coherence, G. Kurian, O. Khan, S. Devadas, ISCA International Workshop on Future Architectural Support for Parallel Programming, (FASPP), June, 2012.
Judicious Thread Migration When Accessing Distributed Shared Caches, K. S. Shim, M. Lis, O. Khan, S. Devadas, HiPEAC Workshop on Computer Architecture and Operating System Co-design, (CAOS), January, 2012.

2011

DCC: A Dependable Cache Coherence Multicore Architecture, O. Khan, M. Lis, Y. Sinangil, S. Devadas, IEEE Computer Architecture Letters, (CAL), pp. 12-15, January-June, 2011. Presented at "Best Papers from CAL Session" at HPCA 2012
Directoryless Shared Memory Coherence Using Execution Migration, M. Lis*, K. S. Shim, M. H. Cho, O. Khan*, S. Devadas, 23rd IASTED International Conference on Parallel and Distributed Computing, (PDCS), December, 2011. * O. Khan and M. Lis contributed equally to this work. Best Paper Award
Time-Predictable Computer Architecture for Cyber-Physical Systems: Digital Emulation of Power Electronics Systems, M. Kinsy, O. Khan, I. Celanovic, D. Majstorovic, N. Celanovic, S. Devadas, 32nd IEEE Real-Time Systems Symposium, (RTSS), December, 2011.
ARCc: A Case for an Architecturally Redundant Cache-coherence Architecture for Large Multicores, O. Khan, H. Hoffmann, M. Lis, F. Hijaz, A. Agarwal, S. Devadas, 29th IEEE International Conference on Computer Design, (ICCD), October, 2011.
Performance Per Watt Benefits of Dynamic Core Morphing in Asymmetric Multicores, R. Rodrigues, A. Annamalai, I. Koren, S. Kundu, O. Khan, IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques, (PACT), October, 2011.
Hardware/Software Co-design Architecture for Online Testing in Chip Multiprocessors, O. Khan, S. Kundu, IEEE Transactions on Dependable and Secure Computing, (TDSC), pp. 714-727, September/October, 2011. Cover Feature
Brief Announcement: Distributed Shared Memory based on Computation Migration, M. Lis, K. S. Shim, M. H. Cho, O. Khan, S. Devadas, ACM Symposium on Parallelism in Algorithms and Architectures, (SPAA), June, 2011.
Deadlock-Free Fine-Grained Thread Migration, M. H. Cho, K. S. Shim, M. Lis, O. Khan, S. Devadas, ACM/IEEE International Symposium on Networks-on-Chip, (NOCS), May, 2011. Best Paper Award
Library Cache Coherence, K. S. Shim, M. H. Cho, M. Lis, O. Khan, S. Devadas, MIT CSAIL Technical Report, MIT-CSAIL-TR-2011-027, (MIT Tech. Report), May, 2011.
Scalable, Accurate Multicore Simulation for 1000-core Era, M. Lis, P. Ren, M. H. Cho, K. S. Shim, C. W. Fletcher, O. Khan, S. Devadas, IEEE International Symposium on Performance Analysis of Systems and Software, (ISPASS), April, 2011.
Shared Memory via Execution Migration, M. Lis, K. S. Shim, O. Khan, S. Devadas, Ideas & Perspectives Session at International Conference on Architectural Support for Programming Languages and Operating Systems, (ASPLOS I&P), Newport Beach, California, March, 2011.
System-level Optimizations for Memory Access in the Execution Migration Machine (EM2), M. Lis, K. S. Shim, M. H. Cho, O. Khan, S. Devadas, HiPEAC Workshop on Computer Architecture and Operating System Co-design, (CAOS), January, 2011.
Microvisor: A Runtime Architecture for Thermal Management in Chip Multiprocessors, O. Khan, S. Kundu, LNCS Transactions on High-Performance Embedded Architectures and Compilers, (Tran. HiPEAC), Volume 4, LNCS 6760, pp. 84-110, 2011.

2010

DCC: A Dependable Cache Coherence Architecture for 1000-Core Processors, O. Khan, M. Lis, Y. Sinangil, S. Devadas, Micro-43 Workshop on Resilient Architectures, (WRA), December, 2010.
Shadow Checker (SC): A Low-Cost Hardware Scheme for Online Detection of Faults in Small Memory Structures of a Microprocessor, R. Rodrigues, S. Kundu, O. Khan, IEEE International Test Conference, (ITC), November, 2010.
Scalable directoryless shared memory coherence using execution migration, M. Lis, K. S. Shim, M. H. Cho, O. Khan, S. Devadas, MIT CSAIL Technical Report, MIT-CSAIL-TR-2010-053, (MIT Tech. Report), November, 2010.
DARSIM: A parallel cycle-level NoC Simulator, M. Lis, K. S. Shim, M. H. Cho, P. Ren, O. Khan, S. Devadas, ISCA International Workshop on Modeling, Benchmarking and Simulation, (MoBS), June, 2010.
EM2: A Scalable Shared-Memory Multicore Architecture, O. Khan, M. Lis, S. Devadas, MIT CSAIL Technical Report, MIT-CSAIL-TR-2010-030, (MIT Tech. Report), June, 2010.
Thread Relocation: A Runtime Architecture for Tolerating Hard Errors in Chip Multiprocessors, O. Khan, S. Kundu, IEEE Transactions on Computers, (TC), pp. 651-665, May, 2010.
A Model to Exploit Power-Performance Efficiency in Superscalar Processors via Structure Resizing, O. Khan, S. Kundu, ACM Great Lakes Symposium on VLSI, (GLSVLSI), Providence, May, 2010.
A Self-Adaptive Scheduler for Asymmetric Multi-cores, O. Khan, S. Kundu, ACM Great Lakes Symposium on VLSI, (GLSVLSI), Providence, May, 2010.
Instruction-Level Execution Migration, O. Khan, M. Lis, S. Devadas, MIT CSAIL Technical Report, MIT-CSAIL-TR-2010-019, (MIT Tech. Report), April, 2010.
Multithreaded Simulation to Increase Performance Modeling Throughput on Large Compute Grids, C. Beckmann, O. Khan, S. Parthasarathy, A. Klimkin, M. Gambhir, B. Slechta, K. Rangan, ASPLOS Exascale Evaluation and Research Techniques Workshop, (EXERT), Pittsburgh, March, 2010.

2009

A Hardware/Software Co-Design Architecture for Thermal, Power, and Reliability Management in Chip Multiprocessors, O. Khan, Open Access Dissertations, (PhD. Thesis).
Hardware/Software Co-design Architecture for Thermal Management of Chip Multiprocessors, O. Khan, S. Kundu, IEEE International Conference on Design, Automation and Test in Europe, (DATE), April, 2009.
A Self-Adaptive System Architecture to Address Transistor Aging, O. Khan, S. Kundu, IEEE International Conference on Design, Automation and Test in Europe, (DATE), April, 2009.
Improving Yield and Reliability of Chip Multiprocessors, A. Pan, O. Khan, S. Kundu, IEEE International Conference on Design, Automation and Test in Europe, (DATE), April, 2009.
Run-Time Reconfiguration for Performance and Power Optimizations in Asymmetric Chip Multiprocessors, O. Khan, S. Kundu, K. Rangan, HiPEAC Workshop on Reconfigurable Computing, (WRC), January, 2009.
Predictive Thermal Management for Chip Multiprocessors using Co-Designed Virtual Machines, O. Khan, S. Kundu, International Conference on High Performance Embedded Architectures and Compilers, (HiPEAC), January, 2009. Best Paper Candidate

2008

Automatic Adjustment of System Performance to Mitigate Device Aging via a Co-designed Virtual Machine, O. Khan, S. Kundu, Micro-41 Workshop on Dependable Architectures, (MICRO-41 WDA), November, 2008.
A Framework for Predictive Dynamic Temperature Management of Microprocessor Systems, O. Khan, S. Kundu, IEEE/ACM International Conference on Computer Aided Design (ICCAD), November, 2008.