Devin's DECtalk TTS Tutorial

Using TuneBaseAlpha

Shoutout to slegghetti on Steam for their awesome tutorial.

Normal Speech

For normal speech, simply enter the text you would like to be spoken, same as normal TTS.
If you want to test how something sounds, you can download the DECtalk client.

Voice Selection

DECtalk offers 9 voices to choose from. You can specify which voice you'd like by including an escape sequence before your message. For example:
[:np] Hello world!
would say "Hello world!" in the "Paul" voice. The available voices are:

[:np] - Paul
[:nb] - Betty
[:nh] - Harry
[:nf] - Frank
[:nd] - Dennis
[:nk] - Kid
[:nu] - Ursula
[:nr] - Rita
[:nw] - Wendy

You can even change voices mid-message! [:np]Hello, Betty. [:nb]Hello, Paul.
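
If you're generating messages from a script, the voice tags are easy to add automatically. Here's a minimal Python sketch; the names VOICES, with_voice and dialogue are made up for this example and aren't part of DECtalk.

```python
# Voice codes copied from the list above.
VOICES = {
    "Paul": "np", "Betty": "nb", "Harry": "nh",
    "Frank": "nf", "Dennis": "nd", "Kid": "nk",
    "Ursula": "nu", "Rita": "nr", "Wendy": "nw",
}

def with_voice(voice: str, text: str) -> str:
    """Prefix text with the escape sequence for the named voice."""
    return f"[:{VOICES[voice]}] {text}"

def dialogue(lines) -> str:
    """Join (voice, text) pairs into one message that switches voices."""
    return " ".join(with_voice(voice, text) for voice, text in lines)

print(dialogue([("Paul", "Hello, Betty."), ("Betty", "Hello, Paul.")]))
# [:np] Hello, Betty. [:nb] Hello, Paul.
```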

Singing

Basics

To make a song, your message needs to start with [:phone on], and each note needs to be written out in the form (phoneme)<(duration),(pitch)>.
For instance, if I wanted the TTS to sing "bee" on the note A3 for half a second, I would input [:phone on][biy<500,22>]. "b" is a consonant sound, "iy" is the phoneme for "ee" as in "bee", "500" is the duration in milliseconds, and "22" is the pitch code for A3 (counting up from C3 at 13).
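
The note syntax is regular enough to build with a couple of small functions. This is just a sketch of the string format described above; note and song are hypothetical helper names, not anything DECtalk provides.

```python
def note(phoneme: str, duration_ms: int, pitch: int) -> str:
    """Format one sung note as phoneme<duration,pitch>."""
    return f"{phoneme}<{duration_ms},{pitch}>"

def song(notes) -> str:
    """Wrap already-formatted notes in [:phone on][...]."""
    return "[:phone on][" + "".join(notes) + "]"

print(song([note("biy", 500, 22)]))
# [:phone on][biy<500,22>]
```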

Phonemes & Consonants

DECtalk normally reads text as written, but while singing it uses a list of pre-set phonemes. This is why we had to spell "bee" as "biy" in the above example.
Most of the phonemes are vowel sounds, and all vowel sounds are phonemes. DECtalk will interpret the last thing it sees before the <> as what it should "sing", so this should always be a phoneme.
For instance, if I want it to sing "beans", I should enter biy<400,22>ns as opposed to biyns<400,22>. The latter will try to sustain the "s", which it is not programmed to do, so there will just be silence for the rest of the note's duration.

Incorrect phonemes will cause the program to say "command error in phoneme" instead of trying to pronounce what you said, so it is imperative you use them correctly.
Here is a list of known phonemes:

aa (box, hot)
ah (bald, war)
ax (about, tuba)
ae (tad, apple)
ay (tie, rye)
ao ("owe uhh")
aw (loud, down)
ar (tar, bark)
ey (day, drain)
eh (wet, bent)
iy (bee, bean)
ih (tip, flint)
ix (tip, flint)
ir (ear, steer)
ow (boat, soul)
or (corn, fort)
uh (punt, run)
uw (root, rule)
yu (you, cute)

nx (bang, gong)
jh (*sound of television static*)
hx (*I don't know how to describe this but it's weird*)
zh (measure)
More Sounds

Some additional tips regarding consonants:

Always put trailing consonant sounds after the angle brackets unless you're going for a very staccato sound.
Sometimes DECtalk will not voice a consonant sound clearly. In these cases, doubling up the consonant can help.
- EX: If I want to sing "bet", I would write beh<400,22>tt to ensure the "t" sound is voiced.
DECtalk does NOT like singing the letter "c". Avoid using it in favor of "s" or "k" as appropriate.
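
Two of the rules so far (the text before the <> should end in a phoneme, and "c" is best avoided) can be checked before you send a message. The sketch below only knows the vowel phonemes listed in this tutorial, so treat it as a rough sanity check rather than a real DECtalk validator; check_sung_chunk is a made-up name.

```python
# Vowel phonemes copied from the list in this tutorial (not DECtalk's full set).
VOWELS = ("aa", "ah", "ax", "ae", "ay", "ao", "aw", "ar", "ey", "eh",
          "iy", "ih", "ix", "ir", "ow", "or", "uh", "uw", "yu")

def check_sung_chunk(chunk: str) -> list[str]:
    """Return warnings for the text placed right before '<' in a note."""
    warnings = []
    if not chunk.endswith(VOWELS):
        warnings.append("does not end in a listed vowel phoneme")
    if "c" in chunk:
        warnings.append('contains "c"; try "s" or "k"')
    return warnings

for chunk in ["biy", "biyns", "scey"]:
    print(chunk, check_sung_chunk(chunk))
```

Here "biy" passes, "biyns" is flagged because it ends in consonants, and "scey" is flagged for the "c".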

TuneBaseAlpha

TuneBaseAlpha is a Java applet I made to make DECtalk music composition easier. In it, you can select pitch by note name instead of memorizing what numbers correspond to which notes. You just select the note, and then input the desired duration and phoneme, and the program takes care of the syntax. You can download TuneBaseAlpha from my GitHub repository (direct download link).
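
To give a rough idea of the note-name lookup, here is a small Python sketch. It is not TuneBaseAlpha's actual code. It assumes that C3 is pitch code 13 (as in the example section) and that each code is one semitone apart, and it doesn't check whether the result is in the range DECtalk accepts. SEMITONES and pitch_code are names invented for this example.

```python
# Semitone offsets within an octave, using sharps only.
SEMITONES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
             "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def pitch_code(note_name: str) -> int:
    """Convert a name like 'C3' or 'F#3' to a pitch code.

    Assumption: C3 is code 13 and codes go up one per semitone.
    """
    name, octave = note_name[:-1], int(note_name[-1])
    return 13 + SEMITONES[name] + 12 * (octave - 3)

print(pitch_code("C3"))  # 13
print(pitch_code("E3"))  # 17
```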

Example

Let's take a basic example. Suppose I want to make the TTS sing "I am singing a scale" in a scale. I first need to convert the syllables into phonemes (I'd normally do this one note at a time, but for the sake of this example, I'll do the whole sentence at once).
I am singing a scale
ay aem siyngiyng ey skeyll
Then, I need to break it up so that each chunk ends on the vowel sound that will be sung, with any consonants after that vowel moved to the start of the next chunk. For this sentence, the phonemes I'd end up putting into TuneBaseAlpha would be:
ay ae msiy ngiy ngey skey
Plugged into TuneBaseAlpha with C3 as the starting note, the output is [:phone on][ay<400,13>ae<400,15>msiy<400,17>ngiy<400,18>ngey<400,20>skey<400,22>], to which I will add "ll" between the last ">" and the last "]", making my final TTS message:
[:phone on][ay<400,13>ae<400,15>msiy<400,17>ngiy<400,18>ngey<400,20>skey<400,22>ll]
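
If you'd rather script it than type it out, the same message can be assembled in a few lines of Python. This is only a sketch of the string-building step: the chunks are the ones listed above (ending in "skey"), the pitch codes are the ones from the example, and the note helper is a made-up name.

```python
def note(phoneme: str, duration_ms: int, pitch: int) -> str:
    """Format one sung note as phoneme<duration,pitch>."""
    return f"{phoneme}<{duration_ms},{pitch}>"

chunks = ["ay", "ae", "msiy", "ngiy", "ngey", "skey"]
pitches = [13, 15, 17, 18, 20, 22]  # the scale from the example, starting at C3

body = "".join(note(c, 400, p) for c, p in zip(chunks, pitches))
# The trailing "ll" goes after the last ">" and before the closing "]".
message = f"[:phone on][{body}ll]"
print(message)
```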
