A Practitioner's Look at Speech-to-Text
Mark Dickison, Technical and Modeling Lead for Capital One’s internal speech recognition team – SpeakEasy
This talk covers the basics of Speech-to-Text using Hidden Markov Models and Gaussian Mixture Models (HMM-GMM). It includes a discussion of the basics of signal processing for speech recognition, acoustic and language models, and how the two models are jointly maximized to produce text.
As time and interest allow, neural networks and how they can be used in hybrid with HMM-GMM models will also be covered.
Mark Dickison started his career as a computational physicist specializing in network science, earning his Ph.D. from Boston University. This was followed by a post-doctoral fellowship at Pennsylvania State University in their USP program, which supports the US Defense Threat Reduction Agency. After leaving academia, he joined Booz Allen Hamilton as a data scientist, working with a variety of clients across health, finance, and energy. Mark is currently the Technical and Modeling Lead for Capital One’s internal speech recognition team – SpeakEasy.