Principles and Practices of Dependable Distributed Computing
NSF CAREER Award
This research will advance the theoretical foundations and explore practical implementations of dependable distributed system technology. A distributed system is dependable when it provides guarantees regarding its performance, fault tolerance, correctness, and compositionality. The research objectives will be achieved through synergy among three strands: research in distributed systems, with its focus on fault tolerance and correctness; research in parallel computing, with its focus on speed-up and efficiency; and the practical engineering considerations of specification, development, deployment, and performance of systems. This proposal encompasses three areas of investigation: (1) Robust Algorithmics: development of fault-tolerant and efficient distributed algorithms, and exploration of the limits on achieving robustness in distributed computing. (2) Building Blocks: definition and analysis of dependable distributed building blocks needed by applications that require precise guarantees, and design of specification frameworks for capturing designs and optimizing distributed system deployment. (3) Distributed Implementation: development of exploratory implementations of compositional building blocks and robust algorithms, and evaluation of their performance in realistic and simulated settings; these empirical evaluations will complement the analytically established efficiency characterizations. The educational component includes developing and delivering new courses in distributed computing in support of undergraduate and graduate programs in computer science, and building a research group that attracts graduate students and postdoctoral researchers.