Abstract:
Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming towards real-world applications, we develop a more challenging yet practical HPT setting, termed Fine-grained Human Pose Transfer (FHPT), with a stronger focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the ideas of content synthesis and feature transfer in a mutually-guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, including semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed benchmark against state-of-the-art works, with 12%–14% gain on top-10 retrieval recall, 5% higher joint localization accuracy, and near 40% gain on face identity preservation. Our code, models and evaluation tools will be released at https://github.com/Lotayou/RATE
Published in: IEEE Transactions on Image Processing (Volume: 30)

Video Coding Laboratory, Institute of Digital Media, Peking University (PKU-IDM-VCL), Beijing, China
Lingbo Yang received the B.S. degree in mathematics and applied mathematics from Peking University in 2016. He is currently pursuing the Ph.D. degree with the Institute of Digital Media, Peking University. He has been interning at the DAMO Academy, Alibaba Group, since 2019. His research interests include deep generative models, image restoration and editing, and human pose transfer. In 2020, he authored six peer-reviewed journal and conference papers, yet his best paper is always the next one.

Video Coding Laboratory, Institute of Digital Media, Peking University (PKU-IDM-VCL), Beijing, China
Pan Wang received the B.S. degree in information and control engineering from the China University of Petroleum, Qingdao, China, in 2013, and the M.S. degree in computer science from the School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, China, in 2018. He is currently a Computer Vision Algorithm Engineer with the Alibaba DAMO Academy. His current research interests include deep generative models, image restoration, video inpainting and object tracking.

School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
Chang Liu received the B.S. degree from Jilin University, Jilin, China, in 2012. He is currently pursuing the Ph.D. degree with the School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing, China. His research interests include computer vision and machine learning, specifically neural architecture design and visual object detection. He has published more than ten papers in refereed conferences including ECCV, ICCV, and CVPR.

Video Coding Laboratory, Institute of Digital Media, Peking University (PKU-IDM-VCL), Beijing, China
Zhanning Gao received the B.S. degree in automatic control engineering from Xi’an Jiaotong University, Xi’an, China, in 2012. He is currently pursuing the Ph.D. degree with the Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University. He was a Research Intern with the Visual Computing Group, Microsoft Research Asia, from 2015 to 2017. His research interests include compact image/video representation, large-scale content-based multimedia retrieval, and complex event video analysis.

Video Coding Laboratory, Institute of Digital Media, Peking University (PKU-IDM-VCL), Beijing, China
Peiran Ren received the B.Sc. and Ph.D. degrees from Tsinghua University, China, in 2008 and 2014, respectively. He is currently a Senior Algorithm Engineer with the Alibaba DAMO Academy. His research interests include image and video enhancement and processing, computer-aided design, real-time rendering, and appearance acquisition.

School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
Xinfeng Zhang received the B.Sc. degree from the Hebei University of Technology, in 2007, and the Ph.D. degree from the Chinese Academy of Sciences, in 2014. He served as a research fellow/postdoc at Nanyang Technological University, University of Southern California, and City University of Hong Kong. He is currently an Assistant Professor with the Department of Computer Science, University of Chinese Academy of Sciences. He has authored 20 technical proposals to ISO/MPEG, ITU-T, and AVS standards and more than 100 refereed journal/conference papers. His research interests include video compression, image/video quality assessment, and image/video analysis. He received the Best Paper Award at the 2017 Pacific-Rim Conference on Multimedia and the Best Paper Award of IEEE Multimedia 2018, and is a coauthor of a paper that received the Best Student Paper Award at the IEEE International Conference on Image Processing 2018.

Institute of Digital Media, Peking University, Beijing, China
Shanshe Wang (Member, IEEE) received the B.S. degree from the Department of Mathematics, Heilongjiang University, Harbin, China, in 2004, the M.S. degree in computer software and theory from Northeast Petroleum University, Daqing, China, in 2010, and the Ph.D. degree in computer science from the Harbin Institute of Technology. He held a postdoctoral position with Peking University, from 2016 to 2018. He joined the School of Electronics Engineering and Computer Science, Institute of Digital Media, Peking University, Beijing, where he is currently a Research Assistant Professor. His current research interests include video compression and image and video quality assessment.

Institute of Digital Media, Peking University, Beijing, China
Siwei Ma (Member, IEEE) received the B.Sc. degree from Shandong Normal University, in 1999, and the Ph.D. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, in 2005. He worked as a postdoc with the University of Southern California, from 2005 to 2007. He joined the Institute of Digital Media, Peking University, where he is currently a Professor. He has authored over 200 technical articles in refereed journals and proceedings in image and video coding, video processing, video streaming, and transmission. He is an Associate Editor of the IEEE Transactions on Circuits and Systems for Video Technology and the Journal of Visual Communication and Image Representation.

Video Coding Laboratory, Institute of Digital Media, Peking University (PKU-IDM-VCL), Beijing, China
Xiansheng Hua (Fellow, IEEE) received the B.S. and Ph.D. degrees in applied mathematics from Peking University, Beijing, in 1996 and 2001, respectively. In 2001, he joined Microsoft Research Asia as a Researcher. He became a Researcher and the Senior Director of the Alibaba Group in 2015. He has authored or coauthored over 250 research articles and has filed over 90 patents. His research interests have been in the areas of multimedia search, advertising, understanding, and mining, and pattern recognition and machine learning. He was honored as one of the recipients of MIT35. He served as a Program Co-Chair for the IEEE ICME 2013, the ACM Multimedia 2012, and the IEEE ICME 2012, and on the Technical Directions Board of the IEEE Signal Processing Society. He is an ACM Distinguished Scientist.

Institute of Digital Media, Peking University, Beijing, China
Wen Gao (Fellow, IEEE) received the Ph.D. degree in electronic engineering from The University of Tokyo in 1991. He was a Professor of computer science with the Harbin Institute of Technology from 1991 to 1995 and with the Institute of Computing Technology, Chinese Academy of Sciences, from 1996 to 2006. He is currently a Professor of computer science with Peking University. He has authored extensively, including five books and more than 600 technical articles in refereed journals and conference proceedings in the areas of image processing, video coding and communication, pattern recognition, multimedia information retrieval, multimodal interface, and bioinformatics. He chaired a number of prestigious international conferences on multimedia and video signal processing, such as IEEE ISCAS, ICME, and the ACM Multimedia, and also served on the advisory and technical committees for numerous professional organizations. He served or serves on the Editorial Board for several journals, including TCSVT, TMM, TIP, and TAMD.