tgies/klattsch

JavaScript

primitive parallel-formant speech synth in the browser

From the README

klattsch

A primitive parallel-formant speech synthesizer in the browser. Late-70s / early-80s tier (Votrax, SAM).

The name is a portmanteau of Klatt (Dennis Klatt, the formant-synth pioneer) and Klatsch (German for gossip / casual chat).

Live demo

Commercial support - integration consulting from the author

What it does

You type a phoneme string in ARPABET, with optional directives, and the computer says it.

HH AH L OW                        hello, default voice
b140 HH AH L OW                   higher voice
bA3 HH AH L OW                    higher voice (note name)
AY+15 D IH D                      "I did" with a rising contour
D IH D DH AE(+40) T               "did THAT" with a transient pitch ornament on AE
r200 bC#4 ( HH AH ) ( L OW )      sung syllables, one note per group

See the in-app syntax help panel for the full directive table.
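The note-name form of the pitch directive (`bA3`, `bC#4`) implies a mapping from note names to frequencies. A minimal sketch of one such mapping follows. It assumes twelve-tone equal temperament with A4 = 440 Hz and assumes the numeric form (`b140`) is a frequency in Hz. The function name `noteToHz` and its handling of flats are hypothetical, not taken from klattsch.

```javascript
// Hypothetical helper, not part of klattsch: convert a note name to Hz.
// Assumes equal temperament with A4 = 440 Hz (MIDI note 69).
const SEMITONES = { C: 0, D: 2, E: 4, F: 5, G: 7, A: 9, B: 11 };

function noteToHz(name) {
  const m = /^([A-G])([#b]?)(-?\d+)$/.exec(name);
  if (!m) throw new Error(`Unrecognized note name: ${name}`);
  const [, letter, accidental, octave] = m;
  let semitone = SEMITONES[letter];
  if (accidental === '#') semitone += 1;
  if (accidental === 'b') semitone -= 1; // flats are an assumption of this sketch
  const midi = (Number(octave) + 1) * 12 + semitone; // C4 maps to MIDI 60
  return 440 * Math.pow(2, (midi - 69) / 12);
}

noteToHz('A3');  // 220
noteToHz('C#4'); // about 277.18
```

Under these assumptions `bA3` would sit one octave below concert A, which is consistent with the README calling it a "higher voice" than the default only if the default base pitch is below 220 Hz.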

Installation

npm install klattsch

The same package works as a CLI, as an importable engine in Node, and as an embeddable engine + AudioWorklet in the browser. Zero runtime dependencies.

Usage

CLI

Render a phoneme string straight to a WAV file:

npx klattsch "HH AH L OW" hello.wav

Node / `OfflineAudioContext`

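import { writeFileSync } from 'node:fs';
import { compileString, FormantSynth, encodeWav } from 'klattsch';

const sampleRate = 48000;
// Compile the ARPABET string into a schedule plus its total duration in ms.
const { schedule, totalMs } = compileString('HH AH L OW');
const synth = new FormantSynth({ sampleRate, schedule });
// Allocate enough samples for the whole utterance, then render into the buffer.
const buf = new Float32Array(Math.ceil(totalMs * sampleRate / 1000));
synth.process(buf);

const { bytes } = encodeWav(buf, sampleRate);
// Write the encoded WAV; Buffer.from assumes bytes is a typed array or ArrayBuffer.
writeFileSync('hello.wav', Buffer.from(bytes));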
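For readers curious what the final encoding step involves, here is a minimal sketch of packing mono float samples into a 16-bit PCM WAV container. It is not klattsch's `encodeWav`. The name `encodeWavSketch`, the 16-bit depth, and the mono layout are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of klattsch: mono Float32 samples -> 16-bit PCM WAV bytes.
function encodeWavSketch(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte canonical header + data
  const view = new DataView(buffer);
  const writeTag = (offset, tag) => {
    for (let i = 0; i < tag.length; i++) view.setUint8(offset + i, tag.charCodeAt(i));
  };

  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataSize, true);                // file size minus 8 bytes
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);                          // fmt chunk size
  view.setUint16(20, 1, true);                           // format 1 = integer PCM
  view.setUint16(22, 1, true);                           // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] before scaling so loud peaks do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(s * 32767), true);
  }
  return new Uint8Array(buffer);
}
```

All multi-byte fields are little-endian, which is why every `DataView` write passes `true` as the final argument.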

Browser with a bundler (Vite, webpack, esbuild, Rollup)

import { compileString } from 'klattsch';
import workletUrl from 'klattsch/formant-worklet.js?url';

const ctx = new AudioContext();
await ctx.audioWorklet.addModule(workletUrl);
const node = new AudioWorkletNode(ctx, 'formant-processor');
node.connect(ctx.destination);

const { schedule } = compileString('HH AH L OW');
node.port.postMessage({ type: 'schedule', schedule });

Browser without a bundler (CDN)


  import { compileString } from '

  const ctx = new AudioContext();
  await ctx.audioWorklet.addModule(');
  const node = new AudioWorkletNode(ctx, 'formant-processor');
  node.connect(ctx.destination);

  const { schedule } = compileString('HH AH L OW');
  node.port.postMessage({ type: 'schedule', schedule });

How it works

Excitation: the voiced source is a Rosenberg-style glottal pulse with a tunable open / closed quotient (g / "effort"); the unvoiced source is xorshift noise. The two are crossfaded by each phoneme's voicing parameter, with optional aspiration noise mixed in.
Filtering: three parallel bandpass biquads for each formant.
Prosody: the sequencer compiles phoneme strings into a time-stamped schedule of formant targets.
Voice character: vibrato (depth + rate), aspiration / breathiness, s
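
The excitation and filtering stages described above can be sketched roughly as follows. This is an illustration under stated assumptions, not klattsch's code. The function names (`rosenbergPulse`, `makeXorshiftNoise`, `makeBandpass`, `renderVowel`), the speed quotient, and the bandwidths and gains are all hypothetical. The formant frequencies are textbook values for an "ah"-like vowel, the biquad coefficients follow the widely used RBJ cookbook bandpass form, and the sketch assumes one biquad per formant with the outputs summed.

```javascript
// Illustrative sketch only; names and parameter values are assumptions, not klattsch's API.

// Rosenberg-style glottal pulse: raised-cosine opening, quarter-cosine closing, then closed.
// phase is in [0, 1); openQuotient is the fraction of the period the glottis is open.
function rosenbergPulse(phase, openQuotient = 0.6, speedQuotient = 2) {
  const tp = openQuotient * speedQuotient / (1 + speedQuotient); // opening duration
  const tn = openQuotient - tp;                                   // closing duration
  if (phase < tp) return 0.5 * (1 - Math.cos(Math.PI * phase / tp));
  if (phase < tp + tn) return Math.cos(0.5 * Math.PI * (phase - tp) / tn);
  return 0;
}

// 32-bit xorshift generator (13, 17, 5 shifts) mapped to noise samples in [-1, 1).
function makeXorshiftNoise(seed = 0x12345678) {
  let s = seed >>> 0 || 1; // state must be nonzero
  return () => {
    s ^= s << 13;
    s ^= s >>> 17;
    s ^= s << 5;
    s >>>= 0; // back to unsigned 32-bit
    return s / 0x80000000 - 1;
  };
}

// Bandpass biquad (RBJ cookbook, constant 0 dB peak gain), Q = center / bandwidth.
function makeBandpass(centerHz, bandwidthHz, sampleRate) {
  const w0 = 2 * Math.PI * centerHz / sampleRate;
  const alpha = Math.sin(w0) / (2 * (centerHz / bandwidthHz));
  const a0 = 1 + alpha;
  const b0 = alpha / a0, b2 = -alpha / a0; // b1 is zero for this form
  const a1 = -2 * Math.cos(w0) / a0, a2 = (1 - alpha) / a0;
  let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
  return (x) => {
    const y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x; y2 = y1; y1 = y;
    return y;
  };
}

// Crossfade voiced and unvoiced sources, then sum three parallel formant filters.
// Each formant entry is [centerHz, bandwidthHz, gain].
function renderVowel({
  sampleRate = 48000, f0 = 120, voicing = 1, durationMs = 200,
  formants = [[730, 90, 1.0], [1090, 110, 0.5], [2440, 170, 0.25]],
} = {}) {
  const noise = makeXorshiftNoise();
  const filters = formants.map(([hz, bw]) => makeBandpass(hz, bw, sampleRate));
  const out = new Float32Array(Math.ceil(durationMs * sampleRate / 1000));
  let phase = 0;
  for (let i = 0; i < out.length; i++) {
    const source = voicing * rosenbergPulse(phase) + (1 - voicing) * noise();
    let sum = 0;
    for (let k = 0; k < filters.length; k++) sum += formants[k][2] * filters[k](source);
    out[i] = sum;
    phase += f0 / sampleRate;
    if (phase >= 1) phase -= 1;
  }
  return out;
}
```

Calling `renderVowel({ voicing: 0 })` in this sketch yields a whispered version of the same vowel, since only the noise source reaches the filters. Intermediate values blend the two sources, which mirrors the per-phoneme crossfade the README describes.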