Title: Advances in Distributed Learning: Experimental Design Networks and Cache-Enabled Federated Learning
Date: 2023-07-18 11:00-12:00
Location: CSIE R102
Speaker: Prof. Edmund Yeh, Northeastern University
Host: Prof. Ai-Chun Pang
Significant advances in edge computing capabilities enable learning to occur at geographically diverse locations. In general, the training data needed for these learning tasks are not only heterogeneous but also not fully generated locally. In the first part of the talk, we present an experimental design network paradigm, wherein learner nodes train possibly different Bayesian linear regression models by consuming data streams generated by data source nodes over a network. We formulate this as a social welfare optimization problem in which the global objective is the sum of the experimental design objectives of the individual learners, and the decision variables are the data transmission strategies, subject to network constraints. We first show that, assuming Poisson data streams, the global objective is a continuous DR-submodular function. We then propose a Frank-Wolfe type algorithm that outputs a solution within a factor of 1 - 1/e of the optimum. Numerical experiments show that the proposed algorithm outperforms several baseline algorithms both in maximizing the global objective and in the quality of the trained models.
In the second part of the talk, we focus on federated learning (FL), a distributed paradigm for collaboratively learning models without having clients disclose their private data. We propose a new approach to improving FL efficiency, measured by total wall-clock training time, through the use of caching. Specifically, instead of having all clients download the latest global model from a parameter server, we select a subset of clients to access a somewhat stale global model stored in caches, with less delay. We propose CacheFL, a cache-enabled variant of FedAvg, and provide theoretical convergence guarantees in the general setting where the local data is imbalanced and heterogeneous. We determine the caching strategies that minimize the total wall-clock training time needed to reach a given convergence threshold, for both stochastic and deterministic communication/computation delays. Through numerical experiments on real data traces, we show the advantage of our proposed scheme over several baselines on both synthetic and real-world datasets.
Joint work with Stratis Ioannidis, Carlee Joe-Wong, Yuanyuan Li, Yuezhou Liu, Lili Su, and Marie Siew.
Edmund Yeh received his B.S. in Electrical Engineering with Distinction and Phi Beta Kappa from Stanford University in 1994. He then studied at Cambridge University on the Winston Churchill Scholarship, obtaining his M.Phil. in Engineering in 1995. He received his Ph.D. in Electrical Engineering and Computer Science from MIT under Professor Robert Gallager in 2001. He is a Professor of Electrical and Computer Engineering at Northeastern University, with a courtesy appointment in the Khoury College of Computer Sciences. He was previously Assistant and Associate Professor of Electrical Engineering, Computer Science, and Statistics at Yale University. He is a Faculty Fellow of the Internet Society Project at Yale Law School.
Professor Yeh is an IEEE Communications Society Distinguished Lecturer. He serves as the inaugural Area Editor for Networking and Computation for IEEE Transactions on Information Theory. He has received three Best Paper Awards, including awards at the 2017 ACM Conference on Information-Centric Networking (ICN) and at the 2015 IEEE International Conference on Communications (ICC) Communication Theory Symposium. Professor Yeh is the recipient of the Alexander von Humboldt Research Fellowship, the Army Research Office Young Investigator Award, the Winston Churchill Scholarship, the National Science Foundation and Office of Naval Research Graduate Fellowships, the Barry M. Goldwater Scholarship, the Frederick Emmons Terman Engineering Scholastic Award, and the Stanford University President's Award for Academic Excellence. Professor Yeh served as TPC Co-Chair for ACM MobiHoc 2021 and as General Chair for ACM SIGMETRICS 2020. He has served as both Treasurer and Secretary of the Board of Governors of the IEEE Information Theory Society, as well as Associate Editor for IEEE Transactions on Networking, IEEE Transactions on Mobile Computing, and IEEE Transactions on Network Science and Engineering.