

German Text-to-Speech

last update: 15th March 2023

Contents:

  1. Foreword
  2. Commercial systems
  3. Universities/research
  4. Other systems
  5. Service systems
  6. Further samples
  7. Licensed products
  8. Missing examples
  9. Unknown examples
  10. TTS classification chart
  11. Credits
  12. Change-log

Remarks

Terminology

I added a chart to make the concepts used for classification easier to understand. It is somewhat outdated, as non-uniform unit selection is not explicitly mentioned.

TTS always consists of two components (terminology following Dutoit's introduction): an NLP component and a DSP component.

The engines that synthesize the speech (DSP-component) are based mainly on five technologies:


The test sentences were:

sentence 1:

An den Wochenenden bin ich jetzt immer nach Hause gefahren und habe Agnes besucht. Dabei war eigentlich immer sehr schönes Wetter gewesen.

As I found this sentence a bit too simple, I thought up another test sentence which contains a collection of known problems for the NLP module (in some demos this sentence is truncated due to the provider's limit on the number of characters):

sentence 2:

Dr. A. Smithe von der NATO (und nicht vom CIA) versorgt z.B. - meines Wissens nach - die Heroin seit dem 15.3.00 tgl. mit 13,84 Gramm Heroin zu 1,04 DM das Gramm.
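To make the difficulties in sentence 2 concrete, here is a minimal Python sketch that tags the tokens an NLP module would have to expand before synthesis. The token classes, the abbreviation list and the regular expressions are my own illustrative assumptions, not taken from any of the systems listed on this page; a real front end would use far larger lexica and context rules.

```python
import re

# Sentence 2 from this page.
SENTENCE_2 = (
    "Dr. A. Smithe von der NATO (und nicht vom CIA) versorgt z.B. - meines "
    "Wissens nach - die Heroin seit dem 15.3.00 tgl. mit 13,84 Gramm Heroin "
    "zu 1,04 DM das Gramm."
)

# Hypothetical, deliberately tiny abbreviation lexicon (assumption).
KNOWN_ABBREVIATIONS = {"Dr.", "z.B.", "tgl."}


def classify(token):
    """Return a coarse class for one whitespace-separated token."""
    core = token.strip("()")  # drop surrounding parentheses only
    if core in KNOWN_ABBREVIATIONS:
        return "abbreviation"
    if re.fullmatch(r"[A-Z]\.", core):
        return "initial"
    if re.fullmatch(r"\d{1,2}\.\d{1,2}\.\d{2,4}", core):
        return "date"
    if re.fullmatch(r"\d+,\d+", core):
        return "decimal"
    if re.fullmatch(r"[A-Z]{2,}", core):
        return "acronym"
    return "word"


def non_standard_tokens(sentence):
    """List (token, class) pairs for everything that is not a plain word."""
    result = []
    for token in sentence.split():
        label = classify(token)
        if label != "word":
            result.append((token.strip("()"), label))
    return result


if __name__ == "__main__":
    for token, label in non_standard_tokens(SENTENCE_2):
        print(f"{token:10s} {label}")
```

Note that the sentence-final "Gramm." is treated as a plain word only because it is missing from the abbreviation list; telling a full stop from an abbreviation dot is one of the ambiguities this sentence is meant to probe. The sketch also cannot decide whether "Heroin" is the drug or the heroine, or whether NATO is spoken as a word while CIA is spelled out.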

Speaking now, 6 years after thinking up those sentences, more pressing problems for German speech synthesis used in services like email reading arise from the pronunciation of English terms. For example, the following sentence would not be pronounced correctly by most systems without tuning:

sentence 3:

Die Manpowerdiskussion wird gecancelt, du kannst das File vom Server downloaden.
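As a rough illustration of why sentence 3 needs tuning, the following Python sketch flags the words that contain an English stem, including those hidden inside German compounds or inflected forms. The stem list is a hypothetical hand-made exception lexicon (an assumption for this example only), and plain substring matching is used purely for brevity.

```python
# Sentence 3 from this page.
SENTENCE_3 = (
    "Die Manpowerdiskussion wird gecancelt, du kannst das File vom Server "
    "downloaden."
)

# Hypothetical list of English stems (assumption); a real system would
# rely on a full pronunciation lexicon or a language-identification step.
ENGLISH_STEMS = ("manpower", "cancel", "file", "server", "download")


def flag_english(sentence):
    """Return the words of the sentence that contain one of the stems."""
    flagged = []
    for raw in sentence.split():
        word = raw.strip(".,;:!?")  # remove punctuation at the word edges
        lowered = word.lower()
        if any(stem in lowered for stem in ENGLISH_STEMS):
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    print(flag_english(SENTENCE_3))
```

Substring matching would also flag unrelated words that merely contain a stem, so this only shows the shape of the problem: the English part must be found inside "Manpowerdiskussion" or "gecancelt" and pronounced by English rules, while the German prefix, suffix or compound partner keeps its German pronunciation.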

Commercial

company/link description/engine name technology languages voice name year (approx.) s1 s2 s3
Acapela

Acapela was formed in December 2003 from a combination of three European companies specializing in vocal technologies, Babel Technologies (Belgium), Infovox (Sweden) and Elan Speech (France).
Acapela HQ TTS non-uniform unit-selection DE, FR, NL, ES, SE, US, SA, CY, DN, FI, CA, GR, IE, NO, PL, PT, BR, RU, TR Claudia
2015 mp3 mp3 mp3
Claudia (Smile)
2017 mp3 mp3 mp3
Lea (Child)
2013 mp3 mp3 mp3
Jonas (child)
2013 mp3 mp3 mp3
Andreas
2011 mp3 mp3 mp3
Julia
2009 mp3 mp3 mp3
Klaus
2006 mp3 mp3 mp3
Sarah
2003 mp3 mp3 mp3
Custom Voice non-uniform unit-selection, ANN DE Felix DNN (page author):
Artificial neural net model adapted with 15 minutes data
2017 mp3 mp3 mp3
Felix (page author):
Non-uniform unit-selection from 2 hours data
2017 mp3 mp3 mp3
Greeting Bunny non-uniform unit-selection DE, US, FR, IT, ES, NL, SE, NO, DK, BE Bunny
2008 mp3 mp3 mp3
Aculab

Aculab diphone
diphone-concatenation with LPC coded units. LPC (linear predictive coding), originally a compression algorithm, is useful for synthesis because it is based on a source/filter model of speech.
DE, UK, US, FR, BR, IT, ES Julia
1998 mp3 mp3 -
Aristech

Formerly Speechconcept
Cerevoice non-uniform unit-selection
Developments from Aristech, CereProc and University of Edinburgh
DE, EN, FR, IT, ES, US, NL, JP Sophie, adult
Corporate Voice, courtesy of Aristech
2011 mp3 mp3 mp3
Leopold, Austrian adult
courtesy of Aristech
2013 mp3 mp3 mp3
Alex, adult
courtesy of Aristech
2016 mp3 mp3 mp3
Gudrun, adult
courtesy of Aristech
2013 mp3 mp3 mp3
Nick, youth
courtesy of Aristech
2011 mp3 mp3 mp3
Saskia, youth
courtesy of Aristech
2011 mp3 mp3 mp3
Atip

Proser diphone
NLP-component and voices from Atip, Mbrola Engine (diphone-concatenation) from Babeltech
DE, US Carla
2000 mp3 mp3 mp3
Erkan
Turkish accent
2004 mp3 mp3 mp3
Fifi
French accent
2004 mp3 mp3 mp3
Steffen
2000 mp3 mp3 mp3
Eva
2000 mp3 mp3 mp3
AT&T

Natural Voices non-uniform unit-selection DE, IT, US, UK, FR, MX* Klara
2001 mp3 mp3 mp3
Reiner
2002 mp3 mp3 mp3
Babeltech

Brightspeech non-uniform unit-selection
same as Acapela HQ TTS
Ingrid
2002 mp3 mp3 -
Babil diphone
diphone-concatenation based on commercial Mbrola-engine. MBROLA (Multi Band Resynthesis Overlap and Add), similar to PSOLA but the database is treated beforehand to adapt pitch, amplitude and spectral features.
DE, US, UK, ES, FR, NL, BE, BR, PT, IT, SE, NO, DK, FI, IS, TR, CZ, SA Eva
2000 mp3 mp3 mp3
Greta
2000 mp3 mp3 mp3
Steffen
1997 mp3 mp3 mp3
Helga
Same as Infovox 330
1998 - - -
Gerhard
Same as Infovox 330
1998 - - -
Bell Labs

diphone
LPC-diphone concatenation
DE, FR, ES, US, UK, IT, RU, RO, CN
mp3 mp3 -
Centigram

Acquired by Lernout & Hauspie, later Nuance
TruVoice formant DE, US, MX*, FR, IT
1996 mp3 mp3 -
Cepstral

Cepstral TTS non-uniform unit-selection
Associated with Alan Black, one of the pioneers of non-uniform unit-selection and lead scientist of Festival, an open source text-to-speech framework developed at Univ. of Edinburgh and the CMU.
DE, UK, US, ES, FR, EG, TH, AF Kathrin
2003 mp3 mp3 mp3
Matthias
2003 mp3 mp3 mp3
Deutsche Telekom

Berkom TTS formant
Research system by the former R&D department of German Telekom. Hybrid approach combining formant synthesis for voiced phonemes with concatenation of waveform coded units for unvoiced parts.
DE Felix
1998 mp3 mp3 mp3
SAMT hardware-based formant synthesis
(Sprach-Ausgabesystem in Multiplex-Technik): hardware-based formant synthesis of former Forschungsinstitut der Deutschen Bundespost.
DE

Other sample: mp3
1987 - - -
Digital Equipment Corporation

DecTalk formant
First commercial text-to-speech synthesizer. Rule-based formant synthesis: the legendary formant synthesizer, based on Klatt's MITalk.
DE, US, UK, ES, MX*, FR
1982 mp3 mp3 -
Elan

SaySo non-uniform unit-selection DE, US, FR, IT, ES Lea
2003 mp3 mp3 mp3
Tempo diphone
Pitch Synchronous Overlap and Add (PSOLA): famous algorithm to change pitch and time of speech that made diphone-synthesis a great success for many years.
DE, US, UK, FR, ES, IT, BR, PT, RU, PL Thomas
1998 mp3 mp3 mp3
Dagmar
1996 mp3 mp3 mp3
Eloquent Technologies

Acquired by Scansoft.
ETI Eloquence
rule-based formant-synthesis (Klatt-style). Later sold by Speechworks, also licensed to IBM (ViaVoice Outloud)
DE, UK, US, ES, MX, FR, CA(FR), IT, FI, BR, CN, JP, KR
1998 mp3 mp3 -
GData

Logox microsegment synthesis
Microsegment synthesis (concatenating subphonetic units), no longer developed. Originally based on research from Univ. of Saarbrücken.
DE, US, UK Default voice
2000 mp3 mp3 -
Bill
1998 mp3 mp3 mp3
Bill (Swabian accent)
2002 mp3 mp3 mp3
Bill (Hessian accent)
2002 mp3 mp3 mp3
Bill (Saxon accent)
2002 mp3 mp3 mp3
Bill (French accent)
2002 mp3 mp3 mp3
Google

WaveNet WaveNet: artificial neural nets end-to-end AF, AR, BG, BN, CA, CS, DA, DE, EL, EN, ES, FI, FIL, FR, GU, HI, HU, IN, IS, IT, JA, KN, KS, LV, ML, MS, NL, NO, PL, PT, RO, RU, SL, SR, SV, TA, TE, TH, TL, TR, UK, VI, ZH Wavenet A (female)
2018 mp3 mp3 mp3
Wavenet B (male)
2018 mp3 mp3 mp3
Wavenet C (female)
2018 mp3 mp3 mp3
Wavenet D (male)
2018 mp3 mp3 mp3
Wavenet E
2022 mp3 mp3 mp3
Wavenet F
2022 mp3 mp3 mp3
Google Basic so-called basic (non-uniform unit selection?) AF, AR, BG, BN, CA, CS, DA, DE, EL, EN, ES, FI, FIL, FR, GU, HI, HU, IN, IS, IT, JA, KN, KS, LV, ML, MS, NL, NO, PL, PT, RO, RU, SL, SR, SV, TA, TE, TH, TL, TR, UK, VI, ZH Standard A (female)
2018 mp3 mp3 mp3
Standard B (male)
2018 mp3 mp3 mp3
Basic C
2022 mp3 mp3 mp3
Basic D
2022 mp3 mp3 mp3
Basic E
2022 mp3 mp3 mp3
Basic F
2022 mp3 mp3 mp3
Google Translate non-uniform unit-selection Female
Samples were accessed via the translation service.
2013 mp3 mp3 mp3
IBM

Watson unknown CS, DE, EN, ES, FR, IT, JA, KS, NL, PT, SV, ZH Birgit
2022 mp3 mp3 mp3
Dieter
2022 mp3 mp3 mp3
Erika
2022 mp3 mp3 mp3
CTTS non-uniform unit-selection DE, US, UK, JP, KR, IT, ES, FR Male
Courtesy of IBM. Database speaker is Gilles Karolyi. Sentence 3 sample is 8 kHz.
2002 mp3 mp3 mp3
Female

Other sample: mp3
2004 - - -
Infovox

330/Infovox Desktop diphone-concatenation
Probably the same as Babeltech Babil. Infovox 310 is the Apple version.
DE, UK, DK, NL, FI, FR, IS, IT, NO, ES, SE Helga
8 kHz version
Other sample: mp3
1996 mp3 mp3 mp3
Gerhard
8 kHz version
Other sample: mp3
1996 mp3 mp3 mp3
210/230 formant-synthesis
Successor of KTH's OVE, originally Telia Promotor
DE, UK, DK, NL, FI, FR, IS, IT, NO, ES, SE
1994 mp3 mp3 -
Desktop PRO non-uniform unit-selection
same as Acapela HQ TTS

- - -
Innoetics

non-uniform unit-selection
Development system from unsupervised audiobook extraction
DE, US, UK, GR, BG Christian
Courtesy of Innoetics
2015 mp3 mp3 mp3
Claudia
Courtesy of Innoetics
2015 mp3 mp3 mp3
Jessi
Courtesy of Innoetics
2015 mp3 mp3 mp3
Karlsson
Courtesy of Innoetics
2015 mp3 mp3 mp3
Ivona

Owned by Amazon
Ivona TTS non-uniform unit-selection
Licensed by Lumenvox.
DE, US, UK, ES, RO, PL, MX Hans
2011 mp3 mp3 mp3
Marlene
2011 mp3 mp3 mp3
Lernout & Hauspie

Acquired by Scansoft in 2001 after bankruptcy
TTS3000 diphone DE, US, UK, NL, FR, RU, ES, MX, BR, CN, KR Stefan
1996 mp3 mp3 -
Anna

Other sample: mp3
1996 - - -
Loquendo

Acquired by Nuance in 2011
Loquendo TTS non-uniform unit-selection
Formerly called Actor
DE, IT, ES, FR, BR, PT, CN, UK, US, MX, GR, CL, AR, SE Katrin
Courtesy of Loquendo.
2003 mp3 mp3 mp3
Stefan
Courtesy of Loquendo.
2003 mp3 mp3 mp3
Ulrike
2001 mp3 mp3 mp3
Meridian

Orpheus formant
Formerly from Dolphin Oceanic Ltd. Specialized in fast speech as used by blind customers.
DE, UK, US, FR, BR, PT, IT, ES, Welsh, CN, MD, CR, DN, NL, FI, GR, HU, LT, MY, NO, PL, RO, MX, SE Orpheus
2009 mp3 mp3 mp3
Microsoft

Microsoft Azure TTS services deep neural nets DNN ES, DK, DE, AU, CA, GB, IN, US, MX, FI, CA, FR, IT, JP, KR, NO, NL, PL, BR, PT, RU, SE, HK, TW, CN Amala
2022 mp3 mp3 mp3
Bernd
2022 mp3 mp3 mp3
Christoph
2022 mp3 mp3 mp3
Conrad
2022 mp3 mp3 mp3
Elke
2022 mp3 mp3 mp3
Gisela
2022 mp3 mp3 mp3
Kasper
2022 mp3 mp3 mp3
Killian
2022 mp3 mp3 mp3
Klarissa
2022 mp3 mp3 mp3
Klaus
2022 mp3 mp3 mp3
Louisa
2022 mp3 mp3 mp3
Maja
2022 mp3 mp3 mp3
Ralf
2022 mp3 mp3 mp3
Tanja
2022 mp3 mp3 mp3
Katja (Neural)
2020 mp3 mp3 mp3
Microsoft Mobile Voices non-uniform unit-selection ES, DK, DE, AU, CA, GB, IN, US, MX, FI, CA, FR, IT, JP, KR, NO, NL, PL, BR, PT, RU, SE, HK, TW, CN Katja
2014 mp3 mp3 mp3
Stefan
2014 mp3 mp3 mp3
Microsoft Speech Platform - Runtime Languages (Version 11) non-uniform unit-selection ES, DK, DE, AU, CA, GB, IN, US, MX, FI, CA, FR, IT, JP, KR, NO, NL, PL, BR, PT, RU, SE, HK, TW, CN Hedda
2012 mp3 mp3 mp3
Neospeech

A Hoya company. As is ReadSpeaker.
non-uniform unit-selection DE, US, UK, MX, TW, TH, KR, IT, CN, CH, JP, CT, BR, PT, FR Lena
2018 mp3 mp3 mp3
Tim
2018 mp3 mp3 mp3
Nuance

Formerly Scansoft (originating from Kurzweil and Xerox), acquired European pioneers Lernout & Hauspie in 2001, took the name of a smaller company named Nuance which they acquired in 2005
Vocalizer DNN Artificial neural nets US Nuance Website Sample

Other sample: mp3
2018 - - -
Vocalizer non-uniform unit-selection
Formerly called RealSpeak (Vocalizer was the name of the original Nuance product), originally from Lernout & Hauspie, converged with RVoice (formerly Rhetorical). First commercial German unit-selection TTS.
DE, NL, PT, CA, CN, ES, DK, PT, FR, IT, JP, KR, MX, NO, PL, RU, SE, US, UK, AU, SA, ID, Basque, BE, CZ, FI, GR, IN, HU, TH, TR, ZA, RO Victor
2016 mp3 mp3 mp3
Anna
11 kHz, courtesy of Nuance
2010 mp3 mp3 mp3
Yannick
11 kHz, courtesy of Nuance
2006 mp3 mp3 mp3
Yannick 2
Yannick embedded version recorded from a cell phone
2009 mp3 mp3 mp3
Monika
and Beate (?) - same as RVoice F026
2005 mp3 mp3 mp3
Steffi
8 kHz
2004 mp3 mp3 mp3
Steffi 2
Newer version with enhanced voice quality and better pronunciation.
2005 mp3 mp3 mp3
Vera
8 kHz
1999 mp3 mp3 mp3
Nuance (until 2005)

Acquired by Scansoft in 2005
Vocalizer 4.05 non-uniform unit-selection DE, US, UK, AU, CA(FR), MX*, BR Anna Weber
2004 mp3 mp3 -
Vocalizer 1.0 non-uniform unit-selection
licensed Fonix engine
DE, US, UK, NL, FR, IT, NO, ES, SE
2001 mp3 mp3 -
ReadSpeaker

A Hoya company. As is NeoSpeech. Formerly called rSpeak
non-uniform unit-selection
using deep artificial neural networks
DE,GB,US,AU,ES,FR,NL,SE Max
courtesy of ReadSpeaker
2018 mp3 mp3 mp3
Rhetorical Systems

Was headquartered in Edinburgh, Scotland. Acquired by Scansoft / Nuance in 2004
RVoice non-uniform unit-selection DE, UK, US, GR, ES F026
2004 mp3 mp3 mp3
M027
2004 mp3 mp3 mp3
F018
mp3 mp3 mp3
Speechworks

Acquired by Scansoft / Nuance in 2003
Speechify non-uniform unit-selection DE, US, UK, AU, JP, MX*, FR, BR, CA(FR) Tessa
2002 mp3 mp3 mp3
Svox

Originally a spin-off from ETH Zurich. Acquired by Nuance in 2011
Svox Corporate non-uniform unit-selection DE, FR, IT, US, ES Petra
2005 mp3 mp3 mp3
Markus
2005 mp3 mp3 mp3
Marlene

Other sample: mp3
2003 - - -
diphone DE, FR, IT, US, ES Nicole
2000 mp3 mp3 -
thorstenvoice

VITS deep learning model: VITS (Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) DE Thorsten
2023 mp3 mp3 mp3
Tacotron 2 - DDC deep learning model: Double Decoder Consistency model architecture DE Thorsten
2023 mp3 mp3 mp3
tom weber software

Fahrgastansagen TTS non-uniform unit-selection DE Andreas
Samples courtesy of tom weber software
2015 mp3 mp3 mp3
Marianne
Samples courtesy of tom weber software
2015 mp3 mp3 mp3
VoiceINTERConnect

diphone
Commercial version of the DRESS synthesizer (University of Dresden).
female voice
2000 mp3 mp3 mp3
male voice
2000 mp3 mp3 mp3
Votrax

formant
Early hardware formant synthesizer. Samples taken from an Audiodata Braille reader.
DE
1974 mp3 mp3 mp3
Voxygen

Spin-off from French Orange Labs.
Hybrid non-uniform unit-selection / HMM synthesis DE, FR, EN, ES, IT, AR Sylvia
courtesy of Voxygen
2014 mp3 mp3 mp3
Matthias
courtesy of Voxygen
2014 mp3 mp3 mp3

Universities / Research

Institution System Remark Year (approx.) / remark s1 s2 s3
IKP Bonn

BOSS
non-uniform unit-selection
2001

mp3 mp3 mp3
Hadifix
mixed inventory concatenation
HADIFIX = HAlbsilben, DIphone und suFIXe
DE
1995

mp3 mp3 -
University of Budapest

Multivox 5 (Profivox)
diphone synthesis
2004
male speaker 1
mp3 mp3 -
2004
male speaker 2
mp3 mp3 -
Multivox 3
formant synthesis
DE, HU, FI, NL, ES, PT, SA, Esperanto
1994


Other sample: mp3
- - -
DFKI

Mary
non-uniform unit-selection
Mary = modular architecture for speech synthesis, open source. Also a great tool for teaching speech synthesis, because the input and output of the different processing modules can be viewed as text.
DE, EN, Tibetan
2011
Pavoque corpus
mp3 mp3 mp3
2007
Bits 1
for details see Schröder, M. & Hunecke, A. (2007). Creating German Unit Selection Voices for the MARY TTS Platform from the BITS Corpora. Proc. SSW6, Bonn, Germany.
mp3 mp3 mp3
2007
Bits 2
mp3 mp3 mp3
2007
Bits 3
mp3 mp3 mp3
2007
Bits 4
mp3 mp3 mp3
Mary/Mbrola
diphone
DE, EN
2000

mp3 mp3 mp3
Technical University of Dresden

DRESS
diphone synthesis
1996

mp3 mp3 mp3
Voice 1
concatenative formant-synthesizer
1993


Other sample: mp3
- - -
TUSY
hardware formant-synthesizer
1987


Other sample: mp3
- - -
ROSY
hardware formant-synthesizer
Robotron Synthesizer
1977


Other sample: mp3
- - -
Syni 2
punchcard controlled formant-synthesizer
Robotron Synthesizer
1975


Other sample: mp3
- - -
Syni 1
punchcard controlled formant-synthesizer
Robotron Synthesizer
1972


Other sample: mp3
- - -
Michael Pucher with the Austrian Academy of Sciences

hts-engine-world
HMM-based vocoder synthesis, for details see the article M. Pucher, D. Schabus, J. Yamagishi, F. Neubarth, V. Strom: Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis. Speech Communication, Volume 52, Issue 2, February 2010, Pages 164-179.
Specialized in Austrian dialects/sociolects. Based on open-source software: https://github.com/mipuc/hts-engine-world
2020
LEO
Austrian German male
mp3 - mp3
2020
HPO
Viennese dialect male
mp3 - mp3
2020
JOE
Viennese youth female
mp3 - mp3
2020
KEP
Austrian German male, adaptive voice
mp3 - mp3
2020
MPU
Austrian German male, adaptive voice
mp3 - mp3
Jonathan Duddington

eSpeak
formant-synthesis
Based on the 1995 Unix "speak" program. Open source.
2006

mp3 mp3 mp3
ETH Zürich

Svox
diphone-concatenation
Predecessor of the commercial version later acquired by Nuance.
1998

mp3 mp3 mp3
Gerhard Mercator University of Duisburg


formant-synthesis
1996

mp3 mp3 -
KTH Stockholm

Infovox
formant synthesis
Developed by Rolf Carlson, Björn Granström and Sheri Hunnicutt
1992

mp3 - -
Ove III
Hardware formant synthesis
Orator Verbis Electris (OVE). Developed by Gunnar Fant
1967


Other sample: mp3
- - -
University of Mons

Mbrola
diphone-synthesis
Mbrola: Multi-band Resynthesis Overlap and Add. The NLP (text phonemisation) component is Txt2Pho, the Hadifix NLP in combination with Mbrola synthesis. Available for free for noncommercial use. MBROLA-TTS is available for about 34 different languages.
1998
de8
Markus Binsteiner's work on a Bavarian dialect
Other sample: mp3
- - -
2000
de7
(by Marc Schröder, DFKI/Uni Saarland, female, 22 kHz), all diphones in three voice qualities (for emotional speech simulation).
mp3 mp3 mp3
2000
de6
(by Marc Schröder, DFKI/Uni Saarland, male, 22 kHz), all diphones in three voice qualities (for emotional speech simulation).
mp3 mp3 mp3
2000
de5
by Fred Englert (ATIP), female, 22 kHz
mp3 mp3 mp3
2000
de4
By IMS Stuttgart, male, 16 kHz, includes English and French diphones
mp3 mp3 mp3
2000
de3
by ATIP, female, first 22.05 kHz voice
mp3 mp3 mp3
1997
de2
By ATIP, male, 16 kHz
mp3 mp3 mp3
1996
de1
By ATIP, female, 16 kHz
mp3 mp3 mp3
ÖFAI (Austrian Research Institute for Artificial Intelligence)

VieCtoS
demisyllable-LPC-concatenation
Vienna Concept-to-Speech system. If the prosody sounds poor, it is due to my limited knowledge of ToBI labels.
1998

mp3 - -
OGI, Oregon Graduate Institute


LPC-diphone concatenation
Developed at the OGI, Center for Spoken Language Understanding during a workshop in 1998. TTS-Framework is Festival
1998

mp3 mp3 -
Ruhr-Universität Bochum

SyRUB, Version 4.1.1 1995

mp3 mp3 mp3
Simple4All

Tundra corpus
non-uniform unit-selection
EU FP7 Project "Simple4All" Tundra corpus, system features unsupervised learning.
2013

mp3 mp3 mp3
Espnet

Hokuspokus model
ANN: Tacotron2
Thanks to kan-bayashi
en, jp, de
2022
Hokuspokus
mp3 mp3 mp3
Hochschule Hof, Institut für Informationssysteme

VITS
deep learning Vits (VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) model
de
2023
Friedrich
mp3 mp3 mp3
2023
Eva
mp3 mp3 mp3
2023
Bernd
mp3 mp3 mp3
Tacotron2
deep learning Tacotron 2 model
de
2023
Hokuspokus
mp3 mp3 mp3

With the following systems it was not possible to synthesize my own sentences:

name/link description year (approx.) mpeg3
AEG Telefunken

SVS (SPRAUS Voll Synthese)

unknown
1975

mp3
Karlchen

unknown concatenation ("Parcor-Synthetisator")
Deutsche Bahn Auskunftssystem
1978

mp3
ATR



non-uniform unit selection
1997
male
mp3
1997
female
mp3
Bose

unknown

unknown
Recorded from a Bose Mini SoundLink II Bluetooth speaker, February 2018
2018

mp3
Univ. of Dresden, Peter Birkholz

Vocal Tract Lab

Articulatory synthesis


Hand-tweaked articulatory movements transformed into a mathematical model to generate sound waves
-
ELIS Lab

Eurovocs

diphone-synthesis
Technology from Lernout & Hauspie
1998

mp3
1996

mp3
First Byte



Product names: Monologue, ProVoice. Waveform-concatenation synthesis (?)
1998

mp3
HHI: Heinrich Hertz Institut



technology unknown
1978

mp3
Keller & Trauth.

SpeakEaZy

waveform-concatenation synthesis
1998

mp3
SlowSoft

SlangTTS

Non-uniform unit-selection synthesis
2020

mp3
Wolfgang von Kempelen's Speaking Machine



Hardware manual sound generator ("papa", "mama")
1769

mp3
University of Köln

Institut für Phonetik


articulatory-synthesis (actually not a TTS-system)
1996

mp3
Karl Küpfmüller / Bernhard Cramer



Hardware phoneme concatenation
1955

mp3
University of Lausanne (LAIP)



TTS system from the University of Lausanne (LAIP), uses the MBROLA engine. Includes a model to reduce/elaborate articulation according to speech rate.
1998

mp3
Mila (Machine learning laboratory at the University of Montreal)

Char2Wav

Deep neural artificial networks from University of Montreal: An end-to-end model for speech synthesis learned with Deeplearning4J. Char2Wav has two components: a reader and a neural vocoder. The reader is an encoder-decoder model with attention. The encoder is a bidirectional recurrent neural network that accepts text or phonemes as inputs, while the decoder is a recurrent neural network (RNN) with attention that produces vocoder acoustic features. For the German samples, the Pavoque database was used for training.
2017

mp3
Philips/IPO Eindhoven

Spengi

diphone-synthesis
1997

mp3
Unknown Russian TTS



unknown / formant?
1970

mp3
H.W. Strube, University of Göttingen



Articulatory synthesis.
1977

mp3
Texas Instruments Language Translator



LPC coded word-concatenation
1980
Male Voice
mp3
University of West Bohemia in Pilsen

ARTIC (ARtificial Talker In Czech)

concatenative synthesizer
A commercial version is available from SpeechTech under the name ERIS.
2002

mp3






Service products

The following table lists some products to enhance text-to-speech quality.

company product description date sample
ReadSpeaker, now commercializes its own engine under the name rSpeak, both a Hoya company. SagEs / SayIt Server-based website reader. Based on Acapela products. Sample reads a newspaper article (Tagesspiegel). Note the pronunciation of the word "playstation". 7/11/07 mp3
ETeX - Dictionaries. 1/7/05 mp3
Interlinx, acquired by Speech Concept emphasis / SpeechOptimizer Tuning tool for pronunciation and prosody modeling. 1/7/05 mp3

Further examples

Speech synthesis examples that did not fit elsewhere.

Description Example
Ultrafast speech synthesis as used by blind users, with 14 syllables per second, based on formant synthesis Eloquence mp3
RealSpeak British English, 31/5/05, "Flight LH312 from Frankfurt to Berlin." mp3
TTS of the Fiat "Blue & Me" Navigation Headunit with Microsoft CE. Voice Steffi of Nuance. mp3
Apple iPhone 2011, recorded with a PC microphone from an Apple iPhone 4.1; TTS is a faster, compact version of the voice Yannick from Nuance mp3, mp3, mp3

Licensed Systems

The following engines are based on systems with a different name:

Missing examples

For the following systems I have not yet obtained samples:


Unknown examples

For the following systems I have no information about the supplier: