Evaluation of acoustic word embeddings

Abstract

Recently, researchers in speech recognition have started to reconsider using whole words, instead of phonetic units, as the basic modeling unit. These systems rely on a function that embeds a speech segment of arbitrary or fixed dimension into a vector in a fixed-dimensional space, called an acoustic word embedding. Thus, speech segments of words that sound similar are projected close to one another in a continuous space. This paper focuses on the evaluation of acoustic word embeddings. We propose two approaches to evaluate the intrinsic performance of acoustic word embeddings in comparison to orthographic representations, in order to assess whether they capture discriminative phonetic information. Since French is the language targeted in the experiments, particular attention is paid to homophones.

Publication
RepEval@ACL 2016: The 1st Workshop on Evaluating Vector-Space Representations for NLP