Abstract:
This paper proposes a self-attention-based model for predicting punctuation marks in word sequences. The model is trained on word and speech embedding features obtained from pre-trained Word2Vec and Speech2Vec models, respectively, so it can draw on any kind of textual and speech data. Experiments are conducted on the English IWSLT2011 datasets. The results show that the self-attention-based model trained on word and speech embedding features outperforms the previous state-of-the-art single model by up to 7.8% absolute in overall F1-score, and improves on the previous best ensemble model by up to 4.7% absolute in overall F1-score.
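The idea in the abstract can be illustrated with a minimal sketch: each word position carries a word embedding and a speech embedding, the two are combined, and a self-attention layer produces a context-aware vector that is mapped to a punctuation label. Everything specific here is an assumption made for illustration and is not taken from the paper: combining the features by concatenation, using a single attention head, the label set `LABELS`, the dimension `d_model`, and the helper names `rand_matrix` and `predict_punctuation`. The weights are random and untrained, so the predicted labels are meaningless; only the data flow is shown.

```python
import math
import random

# Assumed label set for illustration; the abstract does not list the classes.
LABELS = ["O", "COMMA", "PERIOD", "QUESTION"]

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Multiply an (n x k) matrix by a (k x m) matrix, both as lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention over a sequence x.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    out = []
    for q_i in q:
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d_k) for k_j in k]
        weights = softmax(scores)
        out.append([sum(w * v_j[c] for w, v_j in zip(weights, v))
                    for c in range(len(v[0]))])
    return out

def rand_matrix(rows, cols, rng):
    # Hypothetical helper: random weights standing in for trained parameters.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def predict_punctuation(word_vecs, speech_vecs, seed=0, d_model=8):
    # Assumption: word and speech embeddings are concatenated per position.
    rng = random.Random(seed)
    x = [w + s for w, s in zip(word_vecs, speech_vecs)]
    d_in = len(x[0])
    w_q, w_k, w_v = (rand_matrix(d_in, d_model, rng) for _ in range(3))
    w_out = rand_matrix(d_model, len(LABELS), rng)
    hidden = self_attention(x, w_q, w_k, w_v)
    logits = matmul(hidden, w_out)
    # Pick the highest-scoring label for each word position.
    return [LABELS[max(range(len(LABELS)), key=row.__getitem__)] for row in logits]

# Toy input: 5 word positions, 4-dim word vectors, 3-dim speech vectors.
demo_rng = random.Random(1)
demo_words = rand_matrix(5, 4, demo_rng)
demo_speech = rand_matrix(5, 3, demo_rng)
print(predict_punctuation(demo_words, demo_speech))
```

In a real system the random vectors would be replaced by lookups into pre-trained Word2Vec and Speech2Vec tables, and the projection matrices would be learned from punctuated training text.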
Published in: ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 12-17 May 2019
Date Added to IEEE Xplore: 17 April 2019
- IEEE Keywords
- Index Terms
- Word Embedding
- Speech Embeddings
- Ensemble Model
- Kinds Of Data
- Word Features
- Speech Data
- Neural Network
- Training Set
- Learning Rate
- Long Short-term Memory
- Attention Mechanism
- Attention Function
- Plaintext
- Acoustic Features
- Residual Connection
- Machine Translation
- Conditional Random Field
- Self-attention Mechanism
- Linear Projection
- Positional Encoding
- Lexical Features
- Identical Layers
- Automatic Speech Recognition System
- English Dataset
- Multi-head Self-attention
- Audio Segments
- Combination Of Features