Abstract:
Unsupervised acoustic model adaptation for large vocabulary speech recognition is typically accomplished by using an estimated transcription of the adaptation data. The effectiveness of the technique is limited by errors in the estimated transcription. Previous work has mitigated this negative effect by using only those sections of the adaptation data which are transcribed with relatively high confidence. In this work, phoneme correctness predictions are integrated into a discriminative unsupervised acoustic model adaptation procedure. Small but significant performance improvements (over the equivalent maximum likelihood adaptation technique) are observed when using unsupervised discriminative adaptation in combination with support vector machines to predict phoneme correctness.
Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume: 20, Issue: 10, December 2012)

Department of Computer Science, University of Sheffield, Sheffield, UK
Matthew Gibson received the B.Sc. (Hons) in 1994 from Glasgow University, the M.Sc. in 1996 from Oxford University, the M.Phil. in 2004 from Cambridge University and the Ph.D. in 2008 from Sheffield University. He is currently a research associate at the Department of Computer Science, Sheffield University. His main research interests are machine learning, automatic speech recognition and speech synthesis.

Department of Computer Science, University of Sheffield, Sheffield, UK
Thomas Hain (M'02) holds the degree Dipl.-Ing in Electrical Engineering from the University of Technology, Vienna, and a Ph.D. from Cambridge University. From 1994 he worked in the Speech Technology Group at Philips Speech Processing, which he left in a senior position in 1997. He moved to the Speech, Vision and Robotics Group at Cambridge University where he took up a lectureship in 2001. He is now a member of the Speech and Hearing Group at Sheffield University, where he holds a position as Reader. Thomas Hain has published more than 80 articles in international journals, books and conferences. Aside from work in technical committees at major conferences, he served on the IEEE Speech Technical Committee and the organizing committees of Interspeech 2009 and IEEE ASRU 2011. He is currently an editorial board member of Computer Speech & Language and associate editor for ACM Transactions on Speech and Language Processing. His research focuses on speech recognition of natural speech in realistic environments. Current projects include the Natural Speech Technology programme, with a focus on new models for speech and environments and enhanced adaptive structures for recognition.