Abstract:
We propose word-region alignment-guided multimodal neural machine translation (MNMT), a novel MNMT model that captures the semantic correlation between the textual and visual modalities through word-region alignment (WRA). Existing MNMT studies have mainly focused on the effect of integrating the visual and textual modalities, but they do not leverage the semantic relevance between the two. We strengthen the semantic correlation between the textual and visual modalities in MNMT by incorporating WRA as a bridge. We implement the proposal on two mainstream neural machine translation (NMT) architectures: the recurrent neural network (RNN) and the transformer. Experiments on two public benchmarks (English–German and English–French translation on the Multi30k dataset, and English–Japanese translation on the Flickr30kEnt-JP dataset) show that our model significantly improves over competitive baselines across different evaluation metrics and outperforms most existing MNMT models. For example, BLEU improves by 1.0 points on the English–German task and by 1.1 points on the English–French task on the Multi30k test2016 set, and by 0.7 points on the English–Japanese task on the Flickr30kEnt-JP test set. Further analysis demonstrates that integrating WRA makes better use of visual information and yields better translation performance.
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing (Volume: 30)
Tokyo Metropolitan University, Hino, Japan
Yuting Zhao received the B.Eng. degree from Liaoning Technical University, Fuxin, China, in 2014, and the M.Eng. degree in 2020 from Tokyo Metropolitan University, Hachioji, Japan, where she is currently working toward the Ph.D. degree. Her research interests include natural language processing and multimodal machine learning.
Tokyo Metropolitan University, Hino, Japan
Mamoru Komachi received the M.Eng. and Ph.D. degrees from the Nara Institute of Science and Technology (NAIST), Ikoma, Japan, in 2007 and 2010, respectively. He is currently an Associate Professor with Tokyo Metropolitan University (TMU), Tokyo, Japan. He was an Assistant Professor with NAIST before joining TMU. His research interests include semantics, information extraction, and educational applications of natural language processing.
Ehime University, Matsuyama, Japan
Tomoyuki Kajiwara received the B.S. and M.S. degrees in engineering from the Nagaoka University of Technology, Nagaoka, Japan, in 2013 and 2015, respectively, and the Ph.D. degree in engineering from Tokyo Metropolitan University, Tokyo, Japan, in 2018. From 2018 to 2020, he was a Specially-Appointed Assistant Professor with Osaka University, Suita, Japan. He is currently an Assistant Professor with Ehime University, Matsuyama, Japan. His research interests include natural language processing, paraphrasing, and quality estimation.
Kyoto University, Kyoto, Japan
Chenhui Chu received the B.S. degree in software engineering from Chongqing University, Chongqing, China, in 2008, and the M.S. and Ph.D. degrees in informatics from Kyoto University, Kyoto, Japan, in 2012 and 2015, respectively. He is currently a program-specific Associate Professor with Kyoto University. His research interests include natural language processing, particularly machine translation, and multimodal machine learning.