Project description

Recent years have witnessed rapid growth in computationally intensive applications running on users' devices in military operations. In most of these application domains, much of the needed data is collected at edge devices (e.g., sensors, mobile platforms, and users' computers). In conventional approaches, this data is then channelled through the network to back-end servers for computation. However, a significant amount of computing capability is embedded along the entire path from the edge to the cloud. Exploiting these "dispersed computing capabilities" is not only beneficial for performance and power; it is also required in military operations, where uncertain bandwidth and decision-making latency are critical considerations. In addition to the traditional challenges of distributed computing over the cloud (e.g., resource allocation, task scheduling, and data distribution), dispersed computing faces three key challenges: limited and variable bandwidth, computing and network heterogeneity, and network dynamics. In this project, we combine theoretical advances with real system implementation to provide a comprehensive dispersed computing framework that tackles these three challenges with the following features:

We develop an innovative "decentralized task scheduling framework and pricing-based algorithms" for resource allocation in dispersed computing. Traditional centralized schedulers, as used in cloud and grid computing environments, must constantly collect fresh state information about computation and communication from all networked computation points (NCPs). The decentralized task scheduling framework that we propose reduces this coordination overhead, making the dispersed computing system not only more scalable but also more responsive to network failures or degradation under attack. Our pricing-based approach to resource allocation allows dynamic optimization of task assignments among competing jobs, with support for job prioritization and deadline satisfaction.
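To make the idea of pricing-based allocation concrete, here is a minimal Python sketch. It is not the project's algorithm: the `Node` and `Job` classes, the linear price rule, and the use of a per-job budget as a stand-in for priority are all assumptions introduced purely for illustration. Each node computes its own price from local load only, so no central scheduler needs a global view of the system.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical compute node that prices its own capacity."""
    name: str
    capacity: float          # work units the node is sized for
    load: float = 0.0        # work units currently assigned
    base_price: float = 1.0  # price per work unit when idle

    def price(self) -> float:
        # Assumed rule: price per unit of work grows linearly with utilization.
        return self.base_price * (1.0 + self.load / self.capacity)

    def accept(self, work: float) -> None:
        self.load += work


@dataclass
class Job:
    """Hypothetical job; a larger budget models a higher priority."""
    name: str
    work: float
    budget: float


def place(job, nodes):
    """Assign the job to the cheapest node if it can afford it, else return None."""
    best = min(nodes, key=lambda n: n.price())
    if best.price() * job.work > job.budget:
        return None
    best.accept(job.work)
    return best.name
```

In this sketch, a loaded node becomes more expensive, so later jobs drift toward idle nodes, and a job whose budget cannot cover the current cheapest price is deferred. Deadlines are not modeled here.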

We develop a novel "Coded Dispersed Computing (CDC)" framework, which provides a completely new architecture for leveraging the available or under-utilized computing power at various parts of the network to enable coding opportunities that can significantly reduce the bandwidth consumption and latency of dispersed computing. We also develop a novel type of coding, named maximum robustness codes, which provides robustness to the maximum number of failing or straggling nodes. Compared to today's uncoded computing systems, CDC significantly improves bandwidth utilization and resilience to failures and stragglers.
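The general idea of tolerating stragglers through coding can be illustrated with a small Python sketch. This is a textbook single-parity scheme for matrix-vector multiplication across three workers, not the maximum robustness codes or the CDC framework described above; the function names `matvec`, `encode`, and `decode` are hypothetical. The matrix is split into two halves, a third worker receives their sum, and the product can be recovered from any two of the three results.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(A):
    """Split A into two row blocks and add a parity block (their sum)."""
    if len(A) % 2 != 0:
        raise ValueError("this sketch assumes an even number of rows")
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return {0: A1, 1: A2, 2: parity}


def decode(results):
    """Recover A @ x from the outputs of any two of the three workers.

    `results` maps worker id (0, 1, or 2) to that worker's output vector.
    """
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        y1 = results[0]
        # (A1 + A2) x - A1 x = A2 x
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:
        y2 = results[1]
        # (A1 + A2) x - A2 x = A1 x
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2
```

Each worker would call `matvec` on its own block from `encode(A)`. The master calls `decode` as soon as any two results arrive, so a single slow or failed worker does not delay the computation, at the cost of 50% extra work in total.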

We design and deploy a "heterogeneous dispersed computing testbed at USC" that will emulate real-world scenarios where video data is collected across distributed sensor nodes. The testbed will be a unique fusion of ultra-low-power NCPs integrated with sophisticated video and audio collection sensors. On the networking front, the testbed will support heterogeneous communication protocols. We develop a full suite of software that exposes the capabilities of the underlying heterogeneous testbed in a unified manner, so that applications can exploit the testbed transparently. The in-house testbed will be complemented with a commercial dispersive computing testbed, albeit with limited heterogeneity, where we demonstrate our approaches at scale.

Publications

S. Li, M. A. Maddah-Ali and A. S. Avestimehr, "Coding for Distributed Fog Computing," IEEE Communications Magazine, April 2017 (available online at https://arxiv.org/abs/1702.06082).

A. Reisizadeh, S. Prakash, R. Pedarsani, and A. S. Avestimehr, "Coded Computation over Heterogeneous Clusters," in Proceedings of ISIT 2017 (available online at https://arxiv.org/abs/1701.05973).

S. Li, M. Maddah-Ali, and A. S. Avestimehr, "Communication-Aware Computing for Edge Processing," in Proceedings of ISIT 2017 (available online at https://arxiv.org/abs/1706.07523).

Q. Yu, M. Maddah-Ali, and A. S. Avestimehr, "Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication," preprint (available online at https://arxiv.org/abs/1705.10464).

Adaptive Pricing and Coding for Dispersed Computing
