Electronic Zoology - field notes from the garage

How to play audio from ESP32 with PAM8403

Amp: PAM8403 (3W stereo class-D)
Board: ESP32 Dev Board (38-pin)
Speaker: Any 4-8Ω speaker
✓ Confirmed Working

What You Need

Software

Install on Arch Linux:

sudo pacman -S espeak-ng ffmpeg python

How It Works

The ESP32 has two built-in DAC pins (GPIO25 and GPIO26) that output a true analogue voltage (0-3.3V). We use one of these to output audio samples, which the PAM8403 amplifies to drive the speaker.
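To make the 0-3.3V range concrete, here is a small Python sketch of how an 8-bit DAC code maps to an output voltage. It assumes an ideal, linear DAC with a 3.3V full scale; the name dac_voltage is made up for illustration, and a real pin will deviate slightly from these figures.

```python
VREF = 3.3  # ASSUMPTION: ideal full-scale voltage of the DAC output


def dac_voltage(code: int) -> float:
    """Approximate output voltage for an 8-bit DAC code (0-255), assuming a linear DAC."""
    if not 0 <= code <= 255:
        raise ValueError("DAC code must be in 0..255")
    return VREF * code / 255


if __name__ == "__main__":
    # Show the bottom, midpoint and top of the range
    for code in (0, 128, 255):
        print(f"code {code:3d} -> {dac_voltage(code):.2f} V")
```

The midpoint code of 128 lands close to half the supply, which is why 8-bit silence sits at 128 rather than 0.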

The audio data is stored as a C byte array compiled directly into the sketch - no SD card or filesystem needed for short clips.

Wiring

From              To
ESP32 GPIO25      10µF cap (+)
10µF cap (−)      PAM8403 L
ESP32 GND         PAM8403 G
ESP32 5V          PAM8403 VCC
ESP32 GND         PAM8403 GND
PAM8403 L OUT+    Speaker +
PAM8403 L OUT−    Speaker −

Also place a 100µF cap across PAM8403 VCC and GND (+ leg to VCC).

PAM8403 input pins: The input side has three pins - L (left), G (ground reference), and R (right). For mono, connect your signal to L and ground to G. Leave R unconnected.

Why the 10µF coupling cap? The ESP32 DAC output is centred at ~1.65V, not 0V. Without the cap, that DC offset passes straight into the amplifier, causing a loud thump at power-on and distorting the audio. The cap blocks DC and passes only the AC audio signal.

Why the 100µF decoupling cap? The PAM8403 draws sudden bursts of current when playing audio. The cap sits across the power rail and smooths those spikes, reducing hiss and noise in the output.

No volume pot on your module? Control volume in code - see the sketch below.
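As a rough check on the 10µF value: a series coupling cap forms a high-pass filter with the amplifier's input resistance, with corner frequency f = 1/(2πRC). The sketch below assumes an input resistance of 10kΩ, which is an illustrative guess and not a figure from this guide; check your module or the PAM8403 datasheet for the real value.

```python
import math


def highpass_corner_hz(r_ohms: float, c_farads: float) -> float:
    """Corner (-3 dB) frequency of a first-order RC high-pass filter."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


R_IN = 10_000        # ASSUMPTION: amplifier input resistance in ohms (illustrative only)
C_COUPLING = 10e-6   # the 10µF coupling cap from the wiring table

if __name__ == "__main__":
    print(f"corner frequency: {highpass_corner_hz(R_IN, C_COUPLING):.2f} Hz")
```

With these assumed values the corner sits far below the audio band, so the cap removes the DC offset without thinning out speech.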

Preparing an Audio Clip

Option 1 - Generate a voice clip with espeak-ng

espeak-ng -w /tmp/myclip.wav "your text here"

Option 2 - Extract from an existing audio/video file

Use ffmpeg to cut a section from any WAV, MP3, or video file:

ffmpeg -i /path/to/source.wav -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

What the flags mean:

  • -i /path/to/source.wav - your input file. /path/to/source.wav is a placeholder - replace it with the actual location of your file on disk (e.g. ~/Downloads/mysound.mp3 or /home/j/Videos/clip.mp4). Not sure of the path? Drag the file into a terminal window and the full path will be pasted in automatically.
  • -ss 00:00:05 - start cutting at 5 seconds
  • -to 00:00:08 - stop cutting at 8 seconds (3-second clip)
  • /tmp/myclip.wav - output path. /tmp is a temporary folder Linux provides - files survive until reboot

Finding your timestamps: Open the file in your video or audio editor of choice, note where your clip starts and ends, and use those values for -ss and -to.

Examples:

If the file is in your home folder:

ffmpeg -i ~/mysound.wav -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the file is in Downloads:

ffmpeg -i ~/Downloads/mysound.mp3 -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the file is a video - ffmpeg extracts the audio automatically:

ffmpeg -i ~/Videos/myvideo.mp4 -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav

If the filename has spaces - wrap it in quotes:

ffmpeg -i "/home/j/My Audio File.wav" -ss 00:00:05 -to 00:00:08 /tmp/myclip.wav
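If you cut clips often, the same command can be assembled from a script. The following is a minimal sketch using only the flags shown above (-i, -ss, -to); the helper names to_timestamp and build_cut_command are hypothetical, and the script only builds and prints the command without running it.

```python
import shlex


def to_timestamp(seconds: int) -> str:
    """Format whole seconds as HH:MM:SS for ffmpeg's -ss and -to flags."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_cut_command(source: str, start_s: int, end_s: int,
                      output: str = "/tmp/myclip.wav") -> list:
    """Build the ffmpeg argument list for cutting start_s..end_s out of source."""
    if end_s <= start_s:
        raise ValueError("end must be after start")
    return ["ffmpeg", "-i", source,
            "-ss", to_timestamp(start_s),
            "-to", to_timestamp(end_s),
            output]


if __name__ == "__main__":
    cmd = build_cut_command("/home/j/My Audio File.wav", 5, 8)
    # shlex.join quotes arguments that contain spaces, so the line can be pasted into a shell
    print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) instead of printing it sidesteps quoting entirely, because each argument is handed to ffmpeg as-is.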

Convert to ESP32-friendly format

The ESP32 DAC is 8-bit, so convert the clip to 8-bit unsigned mono. A sample rate of 8kHz keeps the file small and is enough for speech:

ffmpeg -y -i /tmp/myclip.wav -ar 8000 -ac 1 -acodec pcm_u8 /tmp/myclip_8bit.wav

  • -y - overwrite output without prompting
  • -ar 8000 - 8000 samples per second
  • -ac 1 - mono
  • -acodec pcm_u8 - 8-bit unsigned PCM (0-255, 128 = silence) - matches what dacWrite() expects

Why 8kHz? It's what phone calls use, and speech stays intelligible. At 8-bit mono, one second of audio is about 8KB at 8kHz versus about 44KB at 44.1kHz. ESP32 dev boards typically have ~4MB of flash, and only part of that is available to the sketch, so 8kHz is a sensible default.
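The size arithmetic is simple enough to script: raw PCM size is sample rate × duration × bytes per sample × channels. Below is a small sketch; the function names and the 1,000,000-byte budget are hypothetical placeholders, not figures from your board's partition table.

```python
def clip_bytes(sample_rate_hz: int, seconds: float,
               bytes_per_sample: int = 1, channels: int = 1) -> int:
    """Size of raw PCM audio in bytes (defaults: 8-bit mono)."""
    return int(sample_rate_hz * seconds * bytes_per_sample * channels)


def max_seconds(budget_bytes: int, sample_rate_hz: int) -> float:
    """Longest 8-bit mono clip that fits in budget_bytes."""
    return budget_bytes / sample_rate_hz


if __name__ == "__main__":
    FLASH_BUDGET = 1_000_000  # ASSUMPTION: illustrative budget, check your own partition size
    for rate in (4000, 8000, 44100):
        print(f"{rate} Hz: {clip_bytes(rate, 3)} bytes for 3 s, "
              f"up to {max_seconds(FLASH_BUDGET, rate):.0f} s in the budget")
```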

Convert WAV to C Header

Save the script below as wav_to_header.py, then run it to convert your audio into a header file for the sketch.

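import sys
import wave

if len(sys.argv) != 4:
    sys.exit('usage: python3 wav_to_header.py <input.wav> <output.h> <var_name>')

input_file = sys.argv[1]   # e.g. /tmp/myclip_8bit.wav
output_file = sys.argv[2]  # e.g. myclip.h
var_name = sys.argv[3]     # e.g. myclip

# Read the samples with the wave module rather than skipping a fixed
# 44 bytes: ffmpeg can write extra chunks before the audio data,
# which would otherwise end up in the array as noise.
with wave.open(input_file, 'rb') as w:
    if w.getsampwidth() != 1 or w.getnchannels() != 1:
        sys.exit('expected 8-bit mono WAV - run the ffmpeg conversion step first')
    pcm = w.readframes(w.getnframes())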

lines = [
    '#pragma once',
    '#include <pgmspace.h>',
    f'const uint32_t {var_name}_len = {len(pcm)};',
    f'const uint8_t {var_name}_data[] PROGMEM = {{'
]

row = []
for b in pcm:
    row.append(f'0x{b:02X}')
    if len(row) == 16:
        lines.append('  ' + ', '.join(row) + ',')
        row = []
if row:
    lines.append('  ' + ', '.join(row))
lines.append('};')

with open(output_file, 'w') as f:
    f.write('\n'.join(lines) + '\n')

print(f"Done: {len(pcm)} bytes -> {output_file}")

python3 wav_to_header.py /tmp/myclip_8bit.wav myclip.h myclip

Place myclip.h in your sketch folder - the same directory as your .ino file.
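If playback later sounds wrong, it can help to confirm what is actually inside the WAV file. The following is a minimal sketch using Python's standard wave module; describe_wav and make_test_wav are hypothetical helper names, and the in-memory test file only stands in for /tmp/myclip_8bit.wav.

```python
import io
import wave


def describe_wav(source) -> dict:
    """Report the basic format of a WAV file. source is a path or a binary file object."""
    with wave.open(source, "rb") as w:
        rate = w.getframerate()
        frames = w.getnframes()
        return {
            "sample_rate": rate,
            "channels": w.getnchannels(),
            "bits": w.getsampwidth() * 8,
            "frames": frames,
            "seconds": frames / rate,
        }


def make_test_wav(rate: int = 8000, n_frames: int = 8000) -> io.BytesIO:
    """Build a silent 8-bit mono WAV in memory, as a stand-in for a real clip."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)      # 1 byte per sample = 8-bit
        w.setframerate(rate)
        w.writeframes(bytes([128]) * n_frames)  # 128 = silence in unsigned 8-bit PCM
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(describe_wav(make_test_wav()))
```

For a real clip, call describe_wav("/tmp/myclip_8bit.wav") and check that the reported sample rate matches SAMPLE_RATE in the sketch.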

Sketch

/*
 * We stand on the shoulders of giants when we build
 * with knowledge gained from others' efforts.
 * That doesn't make us giants. Be humble.
 * Create with care. Open source is the way.
 *
 * PAM8403 Audio Amplifier with ESP32 - WAV Playback
 * --------------------------------------------------
 * Plays a WAV clip stored as a C byte array via the
 * ESP32 built-in DAC (GPIO25) into a PAM8403 class-D
 * amplifier module.
 *
 * Open source - MIT Licence
 * Built with reference to the Espressif ESP32 Arduino
 * core DAC and PROGMEM examples.
 *
 * Electronic Zoology - field notes from the garage
 * https://electroniczoology.com
 */

#include "myclip.h"  // generated from wav_to_header.py

const int DAC_PIN     = 25;
const int SAMPLE_RATE = 8000;
const float VOLUME    = 0.8;  // 0.0 = silent, 1.0 = max

void playAudio() {
  uint32_t intervalUs = 1000000 / SAMPLE_RATE;
  for (uint32_t i = 0; i < myclip_len; i++) {
    uint8_t sample = pgm_read_byte(&myclip_data[i]);
    // Scale around midpoint to control volume
    int val = 128 + (int)((sample - 128) * VOLUME);
    val = constrain(val, 0, 255);
    dacWrite(DAC_PIN, val);
    delayMicroseconds(intervalUs);
  }
  dacWrite(DAC_PIN, 128); // return to midpoint (silence)
}

void setup() {
  playAudio();
}

void loop() {
  // Nothing here - clip plays once on boot.
  // Reset the ESP32 to play again.
}

PROGMEM: Marks the audio data for storage in flash rather than RAM, so a large clip does not eat into the ESP32's ~320KB of data RAM. On the ESP32 Arduino core, const arrays are already placed in flash, so the keyword mainly documents intent. pgm_read_byte() reads the data back one byte at a time during playback.

Volume control: Samples are 8-bit unsigned (0-255) centred at 128. To reduce volume, scale the offset from that midpoint - e.g. at VOLUME=0.5, a sample of 200 becomes 128 + (200-128)×0.5 = 164. Start at 0.3 and work up.
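The scaling rule is easy to try on a desktop before flashing. This Python sketch follows the same arithmetic as the loop in playAudio(); scale_sample is a made-up name, and the Python version uses double-precision floats, so borderline values could round differently than on the ESP32.

```python
def scale_sample(sample: int, volume: float) -> int:
    """Scale an unsigned 8-bit sample around the 128 midpoint, then clamp to 0-255."""
    val = 128 + int((sample - 128) * volume)  # int() truncates toward zero, like a C cast
    return max(0, min(255, val))


if __name__ == "__main__":
    for sample in (0, 56, 128, 200, 255):
        print(f"sample {sample:3d} at 0.5 volume -> {scale_sample(sample, 0.5)}")
```

Silence (128) stays at 128 for any volume, which is why scaling around the midpoint does not introduce a DC shift.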

Troubleshooting

No sound at all

  • Check 5V and GND are connected to the PAM8403
  • Check speaker is on L OUT+ and L OUT−
  • If your module has an SD (shutdown) pin, tie it HIGH to 3.3V

Very quiet

  • Increase VOLUME in the sketch
  • Check the 10µF cap is the right way around (+ leg toward ESP32)

Hiss or noise

  • Add or increase the 100µF decoupling cap on VCC/GND
  • Make sure GND is shared between ESP32 and PAM8403

Wrong playback speed

  • SAMPLE_RATE in the sketch must match the ffmpeg -ar value (both should be 8000)
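A quick way to reason about a speed mismatch: the sketch waits 1,000,000 / SAMPLE_RATE microseconds per sample, so the playback speed relative to the original is the sketch rate divided by the file rate. The sketch below is illustrative only, with hypothetical function names; real playback may run slightly slower than the ideal figure because each loop pass also spends time outside the delay.

```python
def sample_interval_us(sample_rate_hz: int) -> int:
    """Delay per sample in microseconds, using integer division as the sketch does."""
    return 1_000_000 // sample_rate_hz


def speed_ratio(sketch_rate_hz: int, file_rate_hz: int) -> float:
    """Playback speed relative to the original: 1.0 = correct, 0.5 = half speed."""
    return sketch_rate_hz / file_rate_hz


if __name__ == "__main__":
    print("interval at 8 kHz:", sample_interval_us(8000), "us")
    print("file at 8 kHz, sketch at 4 kHz:", speed_ratio(4000, 8000), "x speed")
```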

Clip too large / flash usage too high

  • Trim the clip shorter with ffmpeg
  • Drop to 4kHz: -ar 4000 halves the size and speech stays mostly intelligible. Set SAMPLE_RATE in the sketch to 4000 to match

Prefer digital I2S audio? The "How to wire the MAX98357A I2S amp with ESP32" guide covers the I2S alternative - no DAC, no capacitors, better audio quality.
