Abstract:
Today, large-scale cloud organizations are deploying datacenters and “edge” clusters globally to provide low-latency access to services. Running stream applications across geo-distributed sites is emerging as a daily requirement. However, existing efforts have predominantly centered on stateless stream processing, leaving another urgent trend, stateful stream processing, much less explored. A driving need is to store and update state during processing and, most importantly, to recover large distributed states successfully when faults and failures happen. Existing studies exhibit major limitations: (1) they mostly inherit MapReduce's “single master/many workers” architecture, where the central master can easily become a scalability bottleneck; (2) they offer state recovery mainly through three approaches (replication recovery, checkpointing recovery, and DStream-based lineage recovery), which are slow, resource-expensive, or unable to handle multiple failures; and (3) they are not adaptive to heterogeneous hardware settings. We present A-FP4S, a novel adaptive fragments-based parallel state recovery mechanism for stream processing systems. A-FP4S organizes stream operators into a peer-to-peer overlay based on a distributed hash table and divides each node's local state into many fragments. These fragments are periodically stored on multiple neighbors of the node, so that different sets of available fragments can reconstruct failed states in parallel. This mechanism scales well with the size of the lost state, significantly reduces failure recovery time, and can tolerate multiple node failures. A-FP4S adapts to heterogeneous hardware settings through automatic parameter tuning over phases. Compared to Apache Storm, A-FP4S achieves a 31.8% to 50.5% reduction in recovery latency. Large-scale experiments using real-world datasets demonstrate A-FP4S's attractive scalability and adaptivity properties.
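The fragment idea in the abstract can be illustrated with a minimal sketch. It is not the A-FP4S implementation: the abstract does not name the coding scheme, so this example assumes a single XOR parity fragment, which tolerates the loss of only one fragment. The function names (`split_with_parity`, `place_fragments`, `reconstruct`) and the round-robin placement are hypothetical, chosen only to show how different subsets of surviving fragments can rebuild a lost state.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(state: bytes, k: int) -> list[bytes]:
    """Split state into k equal-size data fragments plus one XOR parity fragment.

    Assumption: single parity stands in for whatever coding A-FP4S really uses.
    """
    size = -(-len(state) // k)                 # ceiling division
    padded = state.ljust(size * k, b"\0")      # zero-pad to a multiple of k
    data = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, data)
    return data + [parity]                     # index k holds the parity


def place_fragments(fragments: list[bytes], neighbors: list[str]) -> dict[int, str]:
    """Hypothetical placement: fragment i goes to neighbor i mod len(neighbors)."""
    return {i: neighbors[i % len(neighbors)] for i in range(len(fragments))}


def reconstruct(available: dict[int, bytes], k: int, state_len: int) -> bytes:
    """Rebuild the state from any k of the k + 1 fragments."""
    missing = [i for i in range(k) if i not in available]
    if len(missing) > 1 or (missing and k not in available):
        raise ValueError("too many fragments lost for a single-parity scheme")
    data = dict(available)
    if missing:
        # XOR of all surviving fragments (data and parity) yields the lost one.
        survivors = [available[i] for i in range(k + 1) if i != missing[0]]
        data[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(data[i] for i in range(k))[:state_len]
```

In this sketch a failed node's state is recovered by fetching the surviving fragments from the neighbors returned by `place_fragments` and calling `reconstruct`. Tolerating multiple simultaneous losses, as the abstract claims for A-FP4S, would need a stronger code than single parity.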
Published in: IEEE Transactions on Parallel and Distributed Systems (Volume: 34, Issue: 8, August 2023)
Department of Computer Engineering and Computer Science, California State University, Long Beach, CA, USA
Hailu Xu received the BS degree in computer science from North China Electric Power University, in 2014, the MS degree in computer science from the University of Toledo, USA, in 2016, and the PhD degree in computer science from Florida International University, in 2020. His current research interests include cloud computing, big data systems, and operating systems. He is currently an assistant professor with the Department of Computer Engineering & Computer Science, California State University, Long Beach.

School of Computing and Information Science, Florida International University, Miami, FL, USA
Pinchao Liu received the BSc (Hons) degree from the Tianjin University of Science and Technology (TUST), China, and the MSc degree from the Tianjin University of Science and Technology, China. He is currently working toward the PhD degree with the School of Computing and Information Science, Florida International University, Miami, USA. His research interests include systems virtualization, cloud computing, and operating systems.

Texas A&M University, College Station, TX, USA
Sarker Tanzir Ahmed received the BS degree in computer science from the Bangladesh University of Engineering and Technology, Bangladesh, and the PhD degree in computer science from Texas A&M University, College Station. Currently, he works as an instructional assistant professor with the Department of Computer Science and Engineering, Texas A&M University. His research interests include large-scale information processing, streaming frameworks with state-management, web crawling, and high-performance computing.

Texas A&M University, College Station, TX, USA
Dilma Da Silva received the PhD degree in computer science from Georgia Tech, in 1997. She is a professor and holder of the Ford Motor Company Design Professorship II with the Department of Computer Science and Engineering, Texas A&M University, USA. She is an ACM distinguished scientist. Her research in operating systems addresses the need for scalable and customizable system software. She is a member of the board of CRA-WP (Computing Research Association's Committee on Widening the Participation in Computing) and a co-founder of the Latinas in Computing group.

Department of Computer Science and Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
Liting Hu received the graduate degree in computer science from the Huazhong University of Science and Technology, China, in 2007, and the PhD degree in computer science from the Georgia Institute of Technology, USA, in 2016. She conducts experimental computer systems research in the areas of stream processing systems, cloud and edge computing, distributed systems, and systems virtualization. She is currently an assistant professor with the University of California, Santa Cruz.