Electronic Zoology - field notes from the garage

PAM8403 Audio Amplifier with ESP32 Dev Board

Amp: PAM8403 (3W stereo class-D)
Board: ESP32 Dev Board (38-pin)
Speaker: Any 4-8Ω speaker
✓ Confirmed Working
Commands appear on their own lines. Open a terminal, type or paste each one, and press Enter.

What You Need

Software

  • Arduino IDE with ESP32 board support
  • espeak-ng - generate voice clips
  • ffmpeg - convert audio files
  • Python 3 - WAV to C header

Install on Arch Linux:

sudo pacman -S espeak-ng ffmpeg python

How It Works

The ESP32 has two built-in DAC pins (GPIO25 and GPIO26) that output a true analogue voltage (0-3.3V). We use one of these to output audio samples, which the PAM8403 amplifies to drive the speaker.
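
To put numbers on that, here is a rough sketch of how an 8-bit sample value maps to pin voltage. It assumes an idealised, perfectly linear DAC with a 3.3V full scale; the helper name dac_voltage is made up for this illustration, and a real pin will deviate a little.

```python
VREF = 3.3  # assumed full-scale voltage of the DAC pin


def dac_voltage(value: int) -> float:
    """Idealised output voltage for an 8-bit DAC value (0-255)."""
    if not 0 <= value <= 255:
        raise ValueError("DAC value must be 0-255")
    return VREF * value / 255


# Bottom, midpoint and top of the range
for v in (0, 128, 255):
    print(v, round(dac_voltage(v), 2))
```

The midpoint value 128 lands at roughly half the supply, which is the resting level the audio swings around.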

The audio data is stored as a C byte array compiled directly into the sketch - no SD card or filesystem needed for short clips.

Wiring

From                   To
ESP32 GPIO25           10µF cap (+)
10µF cap (−)           PAM8403 L
ESP32 GND              PAM8403 G
ESP32 5V               PAM8403 VCC
ESP32 GND              PAM8403 GND
PAM8403 L OUT+         Speaker +
PAM8403 L OUT−         Speaker −

Also place a 100µF cap across PAM8403 VCC and GND (+ leg to VCC).

PAM8403 input pins: The input side has three pins - L (left), G (ground reference), and R (right). For mono, connect your signal to L and ground to G. Leave R unconnected.

Why the 10µF coupling cap? The ESP32 DAC output is centred at ~1.65V (not 0V). Without the cap, that DC offset passes straight into the amplifier and causes a loud thump on power-on and distorts the audio. The cap blocks DC and passes only the AC audio signal.
Why the 100µF decoupling cap? The PAM8403 draws sudden bursts of current when playing audio. The cap sits on the power rail and smooths those spikes, reducing hiss and noise in the output.
No volume pot on your module? Control volume in code - see the sketch below.
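
A coupling cap in series with an amplifier input forms a high-pass filter, so it is worth checking that 10µF does not cut into the audio band. The sketch below uses the standard RC corner formula. The 10kΩ input resistance is an assumption for illustration only, not a figure from this guide or a datasheet, so substitute the value for your module.

```python
import math


def highpass_cutoff_hz(r_ohms: float, c_farads: float) -> float:
    """-3 dB corner of a series capacitor feeding a resistive input."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


R_IN = 10_000  # ASSUMED amplifier input resistance in ohms
C = 10e-6      # the 10µF coupling cap from the wiring table

print(f"{highpass_cutoff_hz(R_IN, C):.2f} Hz")
```

With these assumed values the corner sits below 2Hz, far under anything a small speaker can reproduce, so the cap blocks the DC offset without thinning out the audio.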

Preparing an Audio Clip

Option 1 - Generate a voice clip with espeak-ng

espeak-ng -w /tmp/myclip.wav "your text here"

Option 2 - Extract from an existing audio/video file

Use ffmpeg to cut a section from any WAV, MP3, or video file:

ffmpeg -i /path/to/source.wav -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

What the flags mean:

  • -i /path/to/source.wav - input file (replace with your actual path)
  • -ss 00:00:05 - start cutting at 5 seconds
  • -to 00:00:08 - stop cutting at 8 seconds (3-second clip)
  • /tmp/myclip.wav - output path. /tmp is a temporary folder Linux provides - files survive until reboot
Finding your timestamps: Open the file in your video or audio editor of choice, note where your clip starts and ends, and use those values for -ss and -to.

Examples:

If the file is in your home folder:

ffmpeg -i ~/mysound.wav -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the file is in Downloads:

ffmpeg -i ~/Downloads/mysound.mp3 -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the file is a video - ffmpeg extracts the audio automatically:

ffmpeg -i ~/Videos/myvideo.mp4 -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the filename has spaces - wrap it in quotes:

ffmpeg -i "/home/j/My Audio File.wav" -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

Convert to ESP32-friendly format

The ESP32 DAC is 8-bit, so 8-bit unsigned samples map straight onto it. This guide uses 8kHz mono to keep clips small:

ffmpeg -y -i /tmp/myclip.wav -ar 8000 -ac 1 -acodec pcm_u8 /tmp/myclip_8bit.wav
  • -y - overwrite output without prompting
  • -ar 8000 - 8000 samples per second
  • -ac 1 - mono
  • -acodec pcm_u8 - 8-bit unsigned PCM (0-255, 128 = silence) - matches what dacWrite() expects
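Why 8kHz? It's the sample rate traditional phone calls use, so speech stays intelligible. At 8-bit mono, a 1-second clip is 8KB at 8kHz and about 44KB at 44.1kHz. A typical ESP32 dev board has ~4MB flash, which the sketch itself also has to fit into, so 8kHz is a sensible default.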
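
The size arithmetic is easy to check before you convert anything. A minimal sketch, assuming raw uncompressed PCM; the function name clip_bytes is just for this example:

```python
def clip_bytes(seconds: float, sample_rate: int = 8000,
               bytes_per_sample: int = 1, channels: int = 1) -> int:
    """Raw PCM size in bytes for a clip of the given length."""
    return int(seconds * sample_rate * bytes_per_sample * channels)


# Compare a 3-second clip at a few sample rates (8-bit mono)
for rate in (4000, 8000, 44100):
    print(f"{rate} Hz: 3 s clip = {clip_bytes(3, rate)} bytes")
```

The resulting array in the sketch is the same number of bytes, since each sample becomes one uint8_t.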

Convert WAV to C Header

Save the script below as wav_to_header.py, then run it to convert your audio into a header file for the sketch.

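import sys
import wave

input_file = sys.argv[1]   # e.g. /tmp/myclip_8bit.wav
output_file = sys.argv[2]  # e.g. myclip.h
var_name = sys.argv[3]     # e.g. myclip

# Read only the audio frames. ffmpeg often adds a metadata chunk, so the
# WAV header is not always 44 bytes and slicing at a fixed offset can
# leak header bytes into the audio as a click.
with wave.open(input_file, 'rb') as w:
    if w.getnchannels() != 1 or w.getsampwidth() != 1:
        sys.exit('Expected an 8-bit mono WAV - run the ffmpeg conversion step first')
    pcm = w.readframes(w.getnframes())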

lines = [
    '#pragma once',
    '#include <pgmspace.h>',
    f'const uint32_t {var_name}_len = {len(pcm)};',
    f'const uint8_t {var_name}_data[] PROGMEM = {{'
]

row = []
for b in pcm:
    row.append(f'0x{b:02X}')
    if len(row) == 16:
        lines.append('  ' + ', '.join(row) + ',')
        row = []
if row:
    lines.append('  ' + ', '.join(row))
lines.append('};')

with open(output_file, 'w') as f:
    f.write('\n'.join(lines) + '\n')

print(f"Done: {len(pcm)} bytes -> {output_file}")

Run the script:

python3 wav_to_header.py /tmp/myclip_8bit.wav myclip.h myclip

Place myclip.h in your sketch folder - the same directory as your .ino file.
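
Before converting, it can help to confirm the WAV really is in the format the sketch expects. The sketch below uses Python's standard wave module; the function names describe_wav and is_esp32_ready are invented for this example.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return the basic format fields of a PCM WAV file."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "bits": w.getsampwidth() * 8,
            "rate": rate,
            "seconds": frames / rate,
        }


def is_esp32_ready(info: dict) -> bool:
    """True for the 8kHz, 8-bit, mono format this guide uses."""
    return (info["channels"], info["bits"], info["rate"]) == (1, 8, 8000)


# Example use:
#   info = describe_wav("/tmp/myclip_8bit.wav")
#   print(info, "OK" if is_esp32_ready(info) else "needs converting")
```

If the check fails, rerun the ffmpeg conversion step rather than editing the header by hand.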

Sketch

/*
 * We stand on the shoulders of giants when we build
 * with knowledge gained from others' efforts.
 * That doesn't make us giants. Be humble.
 * Create with care. Open source is the way.
 *
 * PAM8403 Audio Amplifier with ESP32 - WAV Playback
 * --------------------------------------------------
 * Plays a WAV clip stored as a C byte array via the
 * ESP32 built-in DAC (GPIO25) into a PAM8403 class-D
 * amplifier module.
 *
 * Open source - MIT Licence
 * Built with reference to the Espressif ESP32 Arduino
 * core DAC and PROGMEM examples.
 *
 * Full guide, wiring diagram, and audio prep steps:
 * https://electroniczoology.com/projects/pam8403-esp32-audio
 *
 * Electronic Zoology - field notes from the garage
 */

#include "myclip.h"  // generated from wav_to_header.py

const int DAC_PIN     = 25;
const int SAMPLE_RATE = 8000;
const float VOLUME    = 0.8;  // 0.0 = silent, 1.0 = max

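void playAudio() {
  const uint32_t intervalUs = 1000000 / SAMPLE_RATE;
  uint32_t nextUs = micros();
  for (uint32_t i = 0; i < myclip_len; i++) {
    uint8_t sample = pgm_read_byte(&myclip_data[i]);
    // Scale around midpoint to control volume
    int val = 128 + (int)((sample - 128) * VOLUME);
    val = constrain(val, 0, 255);
    dacWrite(DAC_PIN, val);
    // Wait until the next sample is due. Timing against micros() keeps
    // the rate at SAMPLE_RATE however long the lines above take; a plain
    // delay would add that time to every sample and slow playback.
    nextUs += intervalUs;
    while ((int32_t)(micros() - nextUs) < 0) { }
  }
  dacWrite(DAC_PIN, 128); // return to midpoint (silence)
}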

void setup() {
  playAudio();
}

void loop() {
  // Nothing here - clip plays once on boot.
  // Reset the ESP32 to play again.
}
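PROGMEM: Marks the audio data for storage in flash instead of RAM. On the ESP32 Arduino core, const arrays are already placed in flash and PROGMEM is an empty macro, but keeping it (and pgm_read_byte()) means the same header also works on AVR boards, where a large array would otherwise be copied into RAM.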
Volume control: Samples are 8-bit unsigned (0-255) centred at 128. To reduce volume, scale the offset from that midpoint - e.g. at VOLUME=0.5, a sample of 200 becomes 128 + (200-128)x0.5 = 164. Start at 0.3 and work up.
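
You can try the volume maths on a desktop before flashing. This is a small Python mirror of the scaling line in the sketch; scale_sample is a name made up for this example.

```python
def scale_sample(sample: int, volume: float) -> int:
    """Scale an unsigned 8-bit sample around the 128 midpoint."""
    # int() truncates toward zero, the same as the (int) cast in the sketch
    val = 128 + int((sample - 128) * volume)
    # clamp to the DAC range, the same job constrain() does
    return max(0, min(255, val))


for s in (0, 128, 200, 255):
    print(s, "->", scale_sample(s, 0.5))
```

Silence (128) stays at 128 whatever the volume, which is why scaling around the midpoint does not shift the DC level.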

Troubleshooting

No sound at all

  • Check 5V and GND are connected to the PAM8403
  • Check speaker is on L OUT+ and L OUT−
  • If your module has an SD (shutdown) pin, tie it HIGH to 3.3V

Very quiet

  • Increase VOLUME in the sketch
  • Check the 10µF cap is the right way around (+ leg toward ESP32)

Hiss or noise

  • Add or increase the 100µF decoupling cap on VCC/GND
  • Make sure GND is shared between ESP32 and PAM8403

Wrong playback speed

  • SAMPLE_RATE in the sketch must match the ffmpeg -ar value (both should be 8000)
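
To see what a mismatch does, here is a quick sketch of the timing arithmetic. The function names are made up for this example, and the interval uses the same integer division as the sketch.

```python
def sample_interval_us(sample_rate: int) -> int:
    """Microseconds between samples, using the sketch's integer division."""
    return 1_000_000 // sample_rate


def speed_factor(file_rate: int, sketch_rate: int) -> float:
    """How much faster (>1) or slower (<1) than normal the clip plays."""
    return sketch_rate / file_rate


print(sample_interval_us(8000))    # 125
print(speed_factor(8000, 16000))   # 2.0
print(speed_factor(8000, 4000))    # 0.5
```

A factor of 2.0 means the clip plays at double speed and sounds an octave higher; 0.5 is half speed and an octave lower.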

Clip too large / flash usage too high

  • Trim the clip shorter with ffmpeg
  • Drop to 4kHz: convert with -ar 4000 and set SAMPLE_RATE to 4000 in the sketch - speech remains understandable at half the size, though it sounds more muffled