The Cofactor Ora Blog

Cofactor My Name

Feb 20, 2019

Our names are very important to us. Each name has a story related to the person's cultural and familial background.

Pronunciation of names provokes constant uncertainty. For instance, how do you pronounce J. K. Rowling's surname? We know how she pronounces it — /ˈroʊlɪŋ/ — and that it rhymes with "bowling". However, her name is very often mispronounced as /ˈraʊlɪŋ/ so that it rhymes with "howling".

As we become a more multicultural society, names as simple as John will become less common.

The correct pronunciation of your name is whatever you decide. With Cofactor My Name, you can record the correct version of your name and present it to the world. My Name will then make it easy for others to register and remember the right pronunciation.

The IPA map of the world

Oct 18, 2018

Cofactor Ora seeks to record pronunciations of the names of all cities, towns, villages, streets and other political and geographical features.

Based on the English pronunciation data already present in Ora, we are able to build a phonetic map of the world showing states, sovereign territories, and other large subdivisions:

The following phonetic transcriptions were used to build this map:

Afghanistan /æfˈgænɪstæn/
Angola /æŋˈgoʊlə/
Albania /ælˈbeɪniə/
United Arab Emirates /juˈnaɪtɪd ˈærəb ˈɛmɪrəts/
Argentina /ˌɑːrdʒənˈtiːnə/
Armenia /ɑːrˈmiːniə/
Antarctica /æntˈɑːrktɪkə/
Australia /ɒˈstreɪliə/
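
As a rough illustration of how entries in this format could be turned into data for a map, the sketch below parses "Name /transcription/" lines into a dictionary. The function name and the regular expression are assumptions made for this example and are not part of Ora.

```python
import re

# Matches "<place name> /<IPA transcription>/", the format of the list above.
ENTRY = re.compile(r"^(?P<name>.+?)\s+/(?P<ipa>[^/]+)/$")

def parse_transcriptions(text):
    """Return a {place name: IPA transcription} dict from 'Name /ipa/' lines."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:  # blank or malformed lines are skipped
            entries[match.group("name")] = match.group("ipa")
    return entries

sample = """\
Angola /æŋˈgoʊlə/
United Arab Emirates /juˈnaɪtɪd ˈærəb ˈɛmɪrəts/
"""
names = parse_transcriptions(sample)
print(names["Angola"])
```

Keying the result by place name keeps multi-word names such as "United Arab Emirates" intact, since only the part between the slashes is treated as the transcription.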

You can view the whole map here.

This is just a beginning. To help build more IPA maps, you can add the IPA transcriptions for all these place names in your language.

Pronunciation in science: the case of biology

Oct 10, 2018
Pronunciation case by case

In biology and medicine, each species, genus, or family of organisms has a standard Latin name such as Escherichia coli, Saccharomyces cerevisiae, Phytophthora, and Rosaceae. As of 2018, about 1.8 million species have been described, including both extinct and extant species, and each has a unique scientific name.

Even though biological Latin is primarily a written language, these names do occur in speech. After all, language is about both written and spoken communication. Knowing how to pronounce these names is important for communicating our ideas clearly and effectively.

There are no hard and fast pronunciation rules for taxonomic names in English, the de facto language of science. Moreover, these names often derive from personal or geographical names. One such name is Scythris worcesterensis, a species of moth named after the Worcester area in South Africa. The pronunciation of "worcesterensis" will clearly be influenced by the pronunciation of the original place name, /ˈwʊstər/ in this case.

Cofactor Ora's goal is to collect pronunciations of all systematic names found in the Google Knowledge Graph. By encouraging scientists to share their preferred pronunciations for Latin names and other terms that they regularly use in their speech, Cofactor Ora seeks to become the first crowdsourced pronunciation guide to taxonomic names and medical terms.

Exploring Ora from Google

Oct 4, 2018

You are probably using Google on a daily basis. And you've probably noticed that the results often include entities relevant to your search displayed in the panels on the right-hand side of the page.

These entities come from the Google Knowledge Graph — an extensive collection of things ranging from sports teams, notable people, and rock bands to local businesses, streets, and bus stops.

Cofactor Ora is based on the very same Knowledge Graph. In fact, each page in the Cofactor Ora dictionary (Karangahape Road or Aotearoa for example) shows pronunciations for a single Knowledge Graph entity.

Ora's purpose is really to engage with the communities of native speakers to have the entire Knowledge Graph pronounced. Cofactor supports the infrastructure and provides the means for you to view and contribute pronunciations for those entities.

Cofactor Google Chrome plugin

The Google Knowledge Graph has over 1 billion entities. We've always understood that navigating through so many entities in Ora is hard and wanted to make the crowdsourced data more accessible to the end user who is looking for pronunciations.

The new Cofactor plugin allows you to view and listen to Ora's pronunciations without leaving Google Search or Google Maps. It simply integrates with the knowledge panels, allowing you to peek into Ora directly in place.

Here's how it works. Suppose you are searching Google for Aotearoa:

Then the suggested entity for your search will be displayed on the right-hand side of the page with the new speaker button:

which opens a preview of the dictionary page. You can view, listen to, and contribute pronunciations in the same window:

Named entities and speech technologies

Sep 29, 2018

Any real-world thing — a person, place, organization, work of art — is an example of a named entity. Named entities are found everywhere.

Many systems need to pronounce named entities correctly, for example applications like Google Maps that use text-to-speech to synthesize navigation instructions for drivers.

Pronunciation of named entities is one of the biggest challenges for speech technologies. Because there are so many of them, named entities are often excluded from pronunciation lexicons. When processing out-of-vocabulary named entities, grapheme-to-phoneme (G2P) engines will often output erroneous transcriptions. As a result, synthetic speech simply mispronounces names.

What makes named entity pronunciation difficult? Firstly, names can be of very diverse etymological origin and can surface in another language without having undergone the process of assimilation. Some street names are good examples of this: Karangahape Road, Tangihua Street, Ngaoho Place. Secondly, name pronunciation is known to be idiosyncratic; there are many pronunciations contradicting common phonological patterns. Consider English city names such as Leicester and Worcester. Thirdly, it's not uncommon for certain names to have different pronunciations when they refer to different things. A famous example of this is the pronunciation of Houston Street in NY vs. Houston, TX.

For most text-to-speech systems, no guess ensures the correct pronunciation better than a direct hit in a pronunciation dictionary. The Cofactor Ora pronunciation lexicon is based on the Google Knowledge Graph. This corpus provides far better coverage of names than any other dictionary.

In 2015, the Google Knowledge Graph had over 1 billion entities: sports teams, actors, directors, movies, artworks, museums, cities, countries, music albums, recording artists, planets, spacecraft, local businesses, pharmaceutical drugs, you name it.

By helping Cofactor Ora pronounce the Knowledge Graph, you directly contribute to the development of better, safer, higher-quality, and more reliable speech technologies.

What is a grapheme-to-phoneme converter?

Sep 28, 2018

Knowing how words are pronounced is a vital part of most speech recognition and speech synthesis systems. The pronunciation component forms the core of such systems, so their overall performance depends on the coverage and quality of the pronunciation model.

Automatic speech recognition and text-to-speech systems normally use handcrafted word-pronunciation dictionaries. The dictionary maps each word to one or more phonetic transcriptions and usually has a large but finite vocabulary.

Such a static list can never cover all possible words in a language and is usually accompanied by a grapheme-to-phoneme (G2P) engine that can automatically generate pronunciations for out-of-dictionary words.

A G2P converter turns an input word (a sequence of characters or graphemes) into a corresponding pronunciation (a string of phones). For example, given the word "computer", a G2P should output /kəmˈpjuːtər/.
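
The dictionary-first, G2P-second arrangement described above can be sketched in a few lines. This is a minimal illustration, not how any particular system is built: the tiny lexicon, the `LETTER_SOUNDS` table, and the `naive_g2p` and `pronounce` functions are all hypothetical, and the letter-by-letter guesser only stands in for a real G2P engine.

```python
# Toy lexicon: each word maps to one or more phonetic transcriptions.
LEXICON = {
    "computer": ["kəmˈpjuːtər"],
    "dove": ["ˈdʌv", "ˈdoʊv"],
}

# Hypothetical stand-in for a real G2P engine: a crude letter-to-sound table.
LETTER_SOUNDS = {"c": "k", "q": "k", "x": "ks", "y": "j"}

def naive_g2p(word):
    """Guess a transcription letter by letter (illustrative only)."""
    return "".join(LETTER_SOUNDS.get(ch, ch) for ch in word.lower())

def pronounce(word):
    """Return (transcriptions, source), preferring the dictionary."""
    entry = LEXICON.get(word.lower())
    if entry is not None:
        return entry, "lexicon"
    # Out-of-dictionary word: fall back to the automatic guess.
    return [naive_g2p(word)], "g2p"

print(pronounce("computer"))
print(pronounce("zyx"))
```

Returning the source alongside the transcription lets a caller treat guessed pronunciations with more caution than dictionary hits.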

There are different types of G2P algorithms. Unlike the less common rule-based G2Ps, data-driven G2P methods automatically learn from a set of word-pronunciation pairs (the ground truth). The underlying conversion rules are captured implicitly, which also makes the implementation language-independent. Various data-driven models use tree classifiers, hidden Markov models, and neural networks. Recurrent neural networks (RNNs) with long short-term memory (LSTM) cells show good accuracy while being very easy to use: they simply learn from the training data.
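
To make "learning from word-pronunciation pairs" concrete, here is a deliberately tiny data-driven model. It is not an RNN or any of the models named above: it only counts which phone each letter most often maps to. It also assumes the training pairs are already aligned one letter to one phone, which real data rarely is. The training words and function names are invented for the example.

```python
from collections import Counter, defaultdict

def train(pairs):
    """Learn the most frequent phone for each letter from aligned pairs."""
    counts = defaultdict(Counter)
    for word, phones in pairs:
        if len(word) != len(phones):
            raise ValueError(f"{word!r} is not aligned one letter per phone")
        for letter, phone in zip(word, phones):
            counts[letter][phone] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in counts.items()}

def predict(model, word):
    """Transcribe a word letter by letter; unseen letters become '?'."""
    return [model.get(letter, "?") for letter in word]

# Hypothetical training data: (spelling, aligned phone sequence).
training_pairs = [
    ("bat", ["b", "æ", "t"]),
    ("tab", ["t", "æ", "b"]),
    ("bad", ["b", "æ", "d"]),
    ("tub", ["t", "ʌ", "b"]),
]

model = train(training_pairs)
print(predict(model, "dab"))
```

No conversion rule is written by hand here. The mapping comes entirely from the examples, which is also why the same code would work for another language given different training pairs.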

G2P conversion can be viewed as a (neural) machine translation problem in which spelling (orthography) is translated into pronunciation (phonology). The performance and quality of G2Ps are usually judged by their phoneme error rate (PER), which is similar to the word error rate (WER) metric used in speech recognition and machine translation.
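
As a sketch of how PER can be computed for a single word, the code below counts the minimum number of substitutions, insertions, and deletions needed to turn the predicted phoneme sequence into the reference, then divides by the reference length. The example transcriptions and function names are assumptions for illustration. In practice the rate is usually aggregated over a whole test set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of an extra phoneme
                prev[j - 1] + cost,  # substitution (or match if cost is 0)
            ))
        prev = curr
    return prev[-1]

def phoneme_error_rate(ref, hyp):
    """Edit distance normalised by the length of the reference."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(ref, hyp) / len(ref)

# Reference for "computer" and a hypothetical faulty G2P output.
reference = ["k", "ə", "m", "p", "j", "uː", "t", "ə", "r"]
hypothesis = ["k", "ɒ", "m", "p", "uː", "t", "ə", "r"]

print(edit_distance(reference, hypothesis))
print(phoneme_error_rate(reference, hypothesis))
```

Here the hypothesis has one wrong vowel and one missing phoneme, so two edits against nine reference phonemes give a PER of 2/9.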

G2P algorithms generalize from their training data and typically mispronounce non-standard words or foreign names. For example, they might pronounce the Māori name "Onehunga" as /wʌnˈhʌŋə/ which is far from the correct local pronunciation /ˌɒnɪˈhʌŋə/.

The existence of homographs also complicates things. Unlike Spanish or German, where the pronunciation of a word can largely be inferred from its spelling, English is full of words that have the same spelling but are pronounced differently depending on meaning. Examples include words such as "dove", which can be pronounced /ˈdʌv/ or /ˈdoʊv/ depending on what you are talking about. A more complex example is the name "Houston", which is pronounced /ˈhjuːstən/ when it refers to the city in Texas and /ˈhaʊstən/ in the name of Houston Street in New York. Cases like this highlight the importance of the Cofactor Ora pronunciation knowledge base.
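
One way to picture why an entity-based knowledge base helps with homographs: if pronunciations are stored per entity rather than per spelling, the two Houstons no longer collide. The sketch below uses made-up entity labels and function names. It does not reflect Ora's actual data model or real Knowledge Graph identifiers.

```python
# Pronunciations keyed by (spelling, entity label). The labels are hypothetical.
PRONUNCIATIONS = {
    ("Houston", "city-in-texas"): "ˈhjuːstən",
    ("Houston", "street-in-new-york"): "ˈhaʊstən",
}

def lookup(spelling, entity):
    """Return the transcription for one specific entity, or None if unknown."""
    return PRONUNCIATIONS.get((spelling, entity))

def variants(spelling):
    """Return every known transcription of a spelling, sorted by entity label."""
    return [
        ipa
        for (name, entity), ipa in sorted(PRONUNCIATIONS.items())
        if name == spelling
    ]

print(lookup("Houston", "city-in-texas"))
print(variants("Houston"))
```

A spelling-only dictionary would have to pick one of these transcriptions or return both without saying which applies. Keying on the entity leaves that choice to whoever knows which Houston is meant.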

Since most G2P conversion algorithms require clean training data, G2P models are rarely available for under-resourced languages such as Māori. Building a manually annotated pronunciation dictionary is the most straightforward way to contribute to the efficient development of G2P converters for Māori. Cofactor Ora collects Māori pronunciations in a systematic and structured way, enabling the development of Māori speech technologies.

Pronunciation of your name
Record the pronunciation of your name.