Video Coding for Machines: A Paradigm of Collaborative Compression and Intelligent Analytics

Abstract:

Video coding, which aims to compress and reconstruct the whole frame, and feature compression, which preserves and transmits only the most critical information, stand at two ends of the scale. That is, the former offers full fidelity, catering to human perception, while the latter offers compactness and efficiency to serve machine vision. Recent endeavors in emerging trends of video compression, e.g., deep learning based coding tools and end-to-end image/video coding, and in MPEG-7 compact feature descriptor standards, i.e., Compact Descriptors for Visual Search and Compact Descriptors for Video Analysis, promote sustainable and fast development in their respective directions. In this article, thanks to booming AI technology, e.g., prediction and generation models, we explore a new area, Video Coding for Machines (VCM), arising from the emerging MPEG standardization efforts. Towards collaborative compression and intelligent analytics, VCM attempts to bridge the gap between feature coding for machine vision and video coding for human vision. Aligning with Digital Retina, a rising instance of Analyze then Compress, we first give the definition, formulation, and paradigm of VCM. Meanwhile, we systematically review state-of-the-art techniques in video compression and feature compression from the unique perspective of MPEG standardization, which provides the academic and industrial evidence for realizing the collaborative compression of video and feature streams in a broad range of AI applications. Finally, we propose potential VCM solutions, and preliminary results demonstrate performance and efficiency gains. Future directions are discussed as well.

Published in: IEEE Transactions on Image Processing (Volume: 29)

Page(s): 8680 - 8695

Date of Publication: 28 August 2020

PubMed ID: 32857694

DOI: 10.1109/TIP.2020.3016485

Peking University, Beijing, China

Lingyu Duan (Member, IEEE) received the Ph.D. degree in information technology from The University of Newcastle, Callaghan, NSW, Australia, in 2008. He is currently a Full Professor with the National Engineering Laboratory of Video Technology (NELVT), School of Electronics Engineering and Computer Science, Peking University (PKU), China. He has been the Associate Director of the Rapid-Rich Object Search Laboratory (ROSE), a joint lab between Nanyang Technological University (NTU), Singapore, and PKU, since 2012. He has also been with the Peng Cheng Laboratory, Shenzhen, China, since 2019. His research interests include multimedia indexing, search, and retrieval, mobile visual search, visual feature coding, video analytics, and so on. He has published about 200 research articles. He is a member of the MSA Technical Committee in IEEE-CAS Society. He received the IEEE ICME Best Paper Awards in 2020 and 2019, the IEEE VCIP Best Paper Award in 2019, the EURASIP Journal on Image and Video Processing Best Paper Award in 2015, the Ministry of Education Technology Invention Award (First Prize) in 2016, the National Technology Invention Award (Second Prize) in 2017, the China Patent Award for Excellence in 2017, and the National Information Technology Standardization Technical Committee Standardization Work Outstanding Person Award in 2015. He was a Co-Editor of the MPEG Compact Descriptor for Visual Search (CDVS) Standard (ISO/IEC 15938-13) and the MPEG Compact Descriptor for Video Analytics (CDVA) Standard (ISO/IEC 15938-15). He is an Associate Editor of the IEEE Transactions on Multimedia, ACM Transactions on Intelligent Systems and Technology, and ACM Transactions on Multimedia Computing, Communications, and Applications and serves as the Area Chair of ACM MM and IEEE ICME.

Peking University, Beijing, China

Jiaying Liu (Senior Member, IEEE) received the Ph.D. degree (Hons.) in computer science from Peking University, Beijing, China, in 2010.

She was a Visiting Scholar with the University of Southern California, Los Angeles, from 2007 to 2008. She was a Visiting Researcher with Microsoft Research Asia in 2015, supported by the Star Track Young Faculties Award. She is currently an Associate Professor with the Wangxuan Institute of Computer Technology, Peking University. She has authored over 100 technical articles in refereed journals and proceedings and holds 43 granted patents. Her current research interests include multimedia signal processing, compression, and computer vision. She is a Senior Member of CSIG and CCF. She has served as a member of the Membership Services Committee in the IEEE Signal Processing Society, the Multimedia Systems and Applications Technical Committee (MSA TC), the Visual Signal Processing and Communications Technical Committee (VSPC TC) in the IEEE Circuits and Systems Society, and the Image, Video, and Multimedia (IVM) Technical Committee in APSIPA. She received IEEE ICME 2020 Best Paper Awards and IEEE MMSP 2015 Top 10% Paper Awards. She has also served as an Associate Editor for the IEEE Transactions on Image Processing and Elsevier JVCI, the Technical Program Chair of the IEEE VCIP-2019/ACM ICMR-2021, the Publicity Chair of the IEEE ICME-2020/ICIP-2019, and the Area Chair of CVPR-2021/ECCV-2020/ICCV-2019. She was the APSIPA Distinguished Lecturer from 2016 to 2017.

Peking University, Beijing, China

Wenhan Yang (Member, IEEE) received the B.S. and Ph.D. (Hons.) degrees in computer science from Peking University, Beijing, China, in 2012 and 2018, respectively. He was a Visiting Scholar with the National University of Singapore, from September 2015 to September 2016 and from September 2018 to November 2018. He is currently a Postdoctoral Research Fellow with the Department of Computer Science, City University of Hong Kong. His current research interests include deep learning-based image processing, bad weather restoration, and related applications and theories.

Peking University, Beijing, China

Tiejun Huang (Senior Member, IEEE) received the bachelor’s and master’s degrees in computer science from the Wuhan University of Technology in 1992 and 1995, respectively, and the Ph.D. degree in pattern recognition and intelligent system from the Huazhong University of Science and Technology, China, in 1998. He is currently a Professor with the National Engineering Laboratory for Video Technology, Cooperative Medianet Innovation Center, School of Electronic Engineering and Computer Science, and the Head of the Department of Computer Science, Peking University. His research interests include video coding and image understanding, especially neural coding inspired information coding theory in recent years. He is a member of the Board of the Chinese Institute of Electronics and the Advisory Board of the IEEE Computing. He received the National Science Fund for Distinguished Young Scholars of China in 2014. He was awarded the Distinguished Professor of the Chang Jiang Scholars Program by the Ministry of Education in 2015.

Peking University, Beijing, China

Wen Gao (Fellow, IEEE) received the Ph.D. degree in electronics engineering from The University of Tokyo, Japan, in 1991. He is currently a Professor of computer science with Peking University, China. Before joining Peking University, he was a Professor of computer science with the Harbin Institute of Technology from 1991 to 1995 and a Professor with the Institute of Computing Technology, Chinese Academy of Sciences. He has published extensively including five books and over 600 technical papers in refereed journals and conference proceedings in the areas of image processing, video coding and communications, pattern recognition, multimedia information retrieval, multimodal interface, and bioinformatics. He has served or serves on the editorial board for several journals, such as the IEEE Transactions on Circuits and Systems for Video Technology, the IEEE Transactions on Multimedia, the IEEE Transactions on Image Processing, the IEEE Transactions on Autonomous Mental Development, the EURASIP Journal on Image and Video Processing, and the Journal of Visual Communication and Image Representation. He chaired a number of prestigious international conferences on multimedia and video signal processing, such as the IEEE ICME and ACM Multimedia and also served on the advisory and technical committees of numerous professional organizations.
