Associate professor studying speech signals

Faculty member’s affinity for language and mathematics is the basis for further development of speech recognition technologies

Ernest Fokoue, associate professor of statistics in RIT’s John D. Hromi Center for Quality and Applied Statistics, is analyzing audio and speech signals to enable better voice recognition technology.

Can a person erase traces of his original language and dialect? Or is a voice as unique as a fingerprint?

Researchers at Rochester Institute of Technology, looking to answer those questions, have started analyzing audio and speech signals to enable better voice recognition technology. The research has potential applications in speech recognition software development, security recognition, text mining and computer-assisted language learning.

Ernest Fokoue, associate professor of statistics in RIT’s John D. Hromi Center for Quality and Applied Statistics, has begun acquiring multiple unique “voices” for his project, “Statistical Analysis of Audio and Speech Signals.” His work is part of the evolving field of speech processing and is expected to improve the technology behind language processing systems. That field combines linguistics and computational techniques so that individual voices can be recognized not only for context but also for specific characteristics such as cultural dialect.

“I’m going to use mathematics and statistical signal processing to emulate a linguist,” he says. “I want to be able to recognize, in the most refined way, the subtle differences between people. If I can characterize completely the signal or voice signature, it would be almost like having your voice as a fingerprint.”

He is currently analyzing data collected from more than 150 subjects at RIT who “voiced” five sentences of varying emotional content. The data for each subject currently amounts to more than 1.5 million pieces of information, including variations of individual voices.

The data collected in projects such as Fokoue’s are considered “big data,” a term for the complex, high-volume information analyzed for consumer market trends, business solutions, and government and military security analyses. He is building a data matrix of the distinguishing characteristics being measured, with the aim of technology that uses math to recognize voices and detect a person’s dialect with the highest possible accuracy.

“I’m hoping with mathematics I can go down to more subtle things than what the linguist is seeing,” says Fokoue. “Part of this for me is to better understand languages’ classifications, recognition. I want to be able to recognize in the most refined way the subtle differences between people. And if you recognize the components, can you actually learn to erase that?”

While he is not an ethnographer (a researcher of cultural phenomena), Fokoue has an affinity for languages and speaks seven fluently: English, French, Spanish, Italian, German, Russian and Fokoue, pronounced fol-kway, his native language from Cameroon.

Recognizing language characteristics is like distinguishing between the voices of siblings, he says. “I have six brothers and we sound similar but not alike. I want to isolate the commonalities, and go down to the differences; there must be a way to recognize that on a computer.”

Note: Fokoue will continue to collect voice samples through the remainder of the academic year. He can be contacted at epfeqa@rit.edu for more information about his project and to volunteer as part of the research data collection.
