This is the first post in a series in which I describe my progress building a sound synthesizer from scratch in Go.
I have always been fascinated by how computers can generate and process sound, going from something purely digital to something physical. The motivation behind this project is to get more hands-on experience with Golang and, at the same time, learn something about DSP (digital signal processing).
Source code corresponding to this blog post is on my GitHub repository.
Goals
My goal for this is to create a Golang application with the following features:
- Oscillators for basic waveforms: sine, triangle, square
- ADSR envelope (attack, decay, sustain, release) processing
- Mixer for multiple signals and effects
- Being able to play full chords
- Audio effects e.g.: vibrato, tremolo, echo, distortion
- TUI - text-based user interface
- Display piano keys and map them to keyboard keys
- Visualize controls for all parts of the synthesizer
- Audio visualizations
- Saving (and loading) presets to a file
- Benchmark, identify, and optimize “hot spots”
- Experiment with the macOS Core Audio driver
Configuration
Go
In the root go.mod file, I define the minimum required version of Go and a single dependency, ebitengine/oto - more about Oto later.
| |
Makefile
A simple Makefile that I borrowed from one of my other projects. It defines a few useful targets to manage Go modules, run tests and benchmarks, and run the app itself. By default, it displays a help message describing the targets.
| |
ebitengine/oto - abstraction layer
Oto is a low-level library to play sound in Go, developed by the same people behind the Ebitengine game engine. It can be used as a part of Ebitengine or on its own - as I do here.
To play a sound, you need to interact with the hardware, and this is done through an operating system’s audio driver. As you can imagine, there are multiple operating systems and platforms (macOS, Windows, Linux, Android, WebAssembly, etc.) - each of them has its own audio driver.
macOS alone offers several audio frameworks: Core Audio, Audio Unit, and Audio Toolbox.
Interacting with the audio driver is a complex task and usually requires writing low-level code in a platform-specific language such as C++ or Swift (for iOS or macOS). Moreover, a multiplatform application needs a separate implementation for each platform you want to support.
This is where Oto comes into play: it abstracts away the platform-specific details so that you can focus on application logic instead. There is a single way of interacting with Oto, while the library takes care of audio driver specifics.
---
title: "Oto abstraction layer: operating systems and audio drivers"
---
graph TD
client["Client Application<br>(Go)"]
oto["ebitengine/oto<br>(Go lib)"]
subgraph os ["OS and Audio Drivers"]
direction TB
macOS["macOS"]
linux["Linux"]
windows["Windows"]
android["Android"]
other["..."]
end
client -- uses --> oto
oto -- abstracts --> os
Generating sine wave
Mathematical representation
Now, that is where the real implementation (and complexity) starts.
First, let’s take a look at the mathematical representation of a sine wave:
$$y(t) = A \sin(\omega t + \phi)$$
In practice this means that:
- $A$ - amplitude, which can be used to adjust the volume; its value ranges from 0.0 (mute) to 1.0 (full volume)
- $\omega$ - angular frequency, which is based on the frequency and so can be used to change the frequency of the wave
- $\phi$ - phase, which describes how far the wave is shifted from the origin
In trigonometry, calculations are performed in radians, the standard SI unit of plane angle. The sine function is periodic: it repeats every $360\degree$, that is, every $2 \pi$ radians. $$\sin(0) = \sin(2 \pi) = \sin(4 \pi) = 0$$
Angular frequency $\omega$ is directly related to frequency but is expressed in radians per second. A frequency of 1 Hz means that there is 1 period per second. $$\omega = 2 \pi f$$ $$f = 1 \text{ Hz}$$ $$\omega = 2 \pi \cdot 1 = 2 \pi$$ An angular frequency of $2 \pi$ radians per second means that the wave repeats itself every second (1 Hz).
Phase $\phi$ will come in very handy later for calculating the value of a wave at a specific point within its period, between $0$ and $2 \pi$ radians.
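To make the relationship between frequency, angular frequency, and time concrete, here is a minimal sketch that evaluates the amplitude times the sine of $\omega t + \phi$ directly. The function names `angularFrequency` and `sineAt` are chosen for illustration and are not taken from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// angularFrequency converts a frequency in Hz to radians per second
// (omega = 2 * pi * f).
func angularFrequency(frequency float64) float64 {
	return 2 * math.Pi * frequency
}

// sineAt evaluates amplitude * sin(omega*t + phase) for a time t in seconds.
// The name and signature are illustrative assumptions.
func sineAt(amplitude, frequency, phase, t float64) float64 {
	return amplitude * math.Sin(angularFrequency(frequency)*t+phase)
}

func main() {
	// A 1 Hz wave completes one period per second, so after a quarter
	// of a second it has reached its peak.
	fmt.Printf("omega for 1 Hz: %.4f rad/s\n", angularFrequency(1)) // 6.2832
	fmt.Printf("value at t = 0.25 s: %.4f\n", sineAt(1, 1, 0, 0.25)) // 1.0000
}
```

Evaluating the wave this way requires tracking the time $t$ for every sample, which is what the phase-based approach later in the post avoids.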
The code
First, I declare a struct representing a sine wave oscillator for a single frequency:
| |
To create a new SinOscillator, I need a constructor function:
| |
The constructor does the following:
- lines 2–3: validates that the amplitude is between 0 and 1
- line 6: calculates angular frequency as $2 \pi f$
- line 12: calculates `phaseStep` based on the audio driver `sampleRate` (44100 Hz)
- lines 7–13: returns a pointer to a new `SinOscillator` instance
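The steps above can be sketched roughly as follows. This is not the listing from the repository: the field names, the constructor name `NewSinOscillator`, and the error message are assumptions, so the line numbers in the list do not map onto this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// sampleRate is the number of samples per second used throughout the post.
const sampleRate = 44100

// SinOscillator holds the state of a single-frequency sine oscillator.
// The field names are assumptions made for this sketch.
type SinOscillator struct {
	amplitude float64 // volume, 0.0 (mute) to 1.0 (full volume)
	phase     float64 // current phase in radians
	phaseStep float64 // radians the phase advances per sample
}

// NewSinOscillator validates the amplitude and precomputes the phase step.
// The name and signature are hypothetical.
func NewSinOscillator(amplitude, frequency float64) (*SinOscillator, error) {
	if amplitude < 0 || amplitude > 1 {
		return nil, errors.New("amplitude must be between 0 and 1")
	}
	angFreq := 2 * math.Pi * frequency // radians per second
	return &SinOscillator{
		amplitude: amplitude,
		phaseStep: angFreq / float64(sampleRate),
	}, nil
}

func main() {
	osc, err := NewSinOscillator(0.2, 440)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("phase step for 440 Hz: %.5f rad\n", osc.phaseStep)
}
```

Returning an error instead of panicking is a design choice of this sketch; it lets the caller decide what to do with an invalid amplitude.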
Sampling rate - can be thought of as the “resolution” of the audio signal.
In the digital world, audio cannot be represented as a continuous signal; it is represented as a discrete sequence of samples.
Using a value of 44.1 kHz means that the signal is sampled 44100 times per second.
44.1 kHz is slightly more than twice the highest frequency that a human can hear (~20 kHz), which is enough to capture that whole range, and it is pretty much the standard for audio processing.
Phase step - having this allows removing the time variable $t$ from the equation (literally):
- `angFreq := angularFrequency(frequency)` - the number of radians covered in one second to produce a signal with the given frequency
- `const sampleRate = 44100` - the oscillator calculates the value of the sine function 44100 times per second
- `phaseStep: angFreq / float64(sampleRate)` - the number of radians covered in one sample
Based on that, we can shift the phase $\phi$ by angFreq / sampleRate radians each time a single sample is calculated.
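To check that accumulating the phase really is equivalent to plugging the time $t$ into the formula, the sketch below compares both approaches sample by sample. The helper names `phaseStepFor` and `maxDifference` are made up for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

const sampleRate = 44100

// phaseStepFor returns how many radians the phase advances per sample.
func phaseStepFor(frequency float64) float64 {
	return 2 * math.Pi * frequency / float64(sampleRate)
}

// maxDifference compares the time-based formula sin(omega*t) with the
// phase-accumulating version over the first n samples and returns the
// largest absolute difference between the two.
func maxDifference(frequency float64, n int) float64 {
	omega := 2 * math.Pi * frequency
	step := phaseStepFor(frequency)
	phase, worst := 0.0, 0.0
	for i := 0; i < n; i++ {
		t := float64(i) / float64(sampleRate) // time of sample i in seconds
		diff := math.Abs(math.Sin(omega*t) - math.Sin(phase))
		if diff > worst {
			worst = diff
		}
		phase += step // no t needed: just advance by one step per sample
	}
	return worst
}

func main() {
	fmt.Printf("phase step for 440 Hz: %.5f rad\n", phaseStepFor(440))
	fmt.Printf("max difference over 1000 samples: %.2e\n", maxDifference(440, 1000))
}
```

The difference is not exactly zero because floating-point additions accumulate tiny rounding errors, but it stays far below anything audible.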
Getting the value from the oscillator is pretty straightforward now:
| |
- Move the `phase` by `phaseStep` radians.
- If the phase is greater than $2 \pi$ radians, “move it back” by $2 \pi$ radians. The sine function is periodic, so $\sin(2 \pi + 1) = \sin(2 \pi + 1 - 2 \pi)$.
- Calculate the value of the sine function at the current phase, scaled by the amplitude.

Probably, we should first calculate the value and only then move the phase - I will fix that another time :)
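The three steps can be sketched as below, in the same order as listed (move, wrap, calculate). The `oscillator` type and its `next` method are simplified stand-ins written for this example, not the code from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// oscillator is a stripped-down, hypothetical stand-in for SinOscillator.
type oscillator struct {
	amplitude float64
	phase     float64
	phaseStep float64
}

// next advances the phase by one step, wraps it back into range,
// and returns the sample at the new phase.
func (o *oscillator) next() float64 {
	o.phase += o.phaseStep
	if o.phase > 2*math.Pi {
		// Sine is periodic, so subtracting a full period keeps the value
		// the same while stopping the phase from growing without bound.
		o.phase -= 2 * math.Pi
	}
	return o.amplitude * math.Sin(o.phase)
}

func main() {
	o := &oscillator{amplitude: 0.2, phaseStep: 2 * math.Pi * 440 / 44100}
	for i := 0; i < 5; i++ {
		fmt.Printf("%.4f\n", o.next())
	}
}
```

Because the phase is moved before the value is calculated, the very first sample is taken at `phaseStep` rather than at 0, which is the ordering issue mentioned in the note above.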
Playing sound through oto
At this point, I am able to generate a sine wave of any frequency, but I am still in the “digital domain.” The next step is to somehow push the samples into the audio driver to hear the sound.
Configuring Oto to play the sound using streaming (based on the official docs):
| |
- lines 2–3: `bufferSizeSamples` and `hardwareBufferSize` were adjusted by trial and error to avoid audio artifacts.
- lines 9–13: configuring basic audio options, e.g., the 44.1 kHz sampling rate.
- lines 15–20: creating a new `oto.Context` instance and waiting on `readyChan` until it is closed, which means that the context is ready.
- lines 22–24: creating a new `oto.Player`, initializing it with the `oscillator` instance, and starting playback.
- lines 26–32: preventing the program from exiting and printing any errors that occur during playback.
Audio format - bit depth
You might have noticed that I used ctxOptions.Format = oto.FormatSignedInt16LE in the configuration.
Oto is able to use three different audio formats:
- `FormatUnsignedInt8` - 8-bit unsigned integers (0 to 255), the lowest precision, “retro” sounding
- `FormatSignedInt16LE` - 16-bit signed integers, little-endian (-32768 to 32767), the standard, high-quality format
- `FormatFloat32LE` - 32-bit floating-point numbers, little-endian (-1 to 1), “studio” quality format
Little-endian means that the least significant byte is stored (sent) first.
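As a small illustration of little-endian byte order, the sketch below uses the standard library's `encoding/binary` package to split a 16-bit value into two bytes. The helper name `encodeLE` is made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeLE returns the two bytes of a 16-bit sample in little-endian order.
func encodeLE(sample int16) [2]byte {
	var b [2]byte
	binary.LittleEndian.PutUint16(b[:], uint16(sample))
	return b
}

func main() {
	b := encodeLE(0x1234)
	// The least significant byte (0x34) comes first, then 0x12.
	fmt.Printf("% x\n", b[:]) // prints "34 12"
}
```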
Since the output of the `SinOscillator` is a `float64`, I guess it would be easier to just convert it to `float32` and use the `FormatFloat32LE` format - I will fix that in the future too.
Bit depth together with the sampling rate acts like an audio resolution:
- sampling rate - how many samples per second, how dense is the audio signal
- bit depth - how many bits per single sample, how precise is the sample
The following function converts the value of a sample from float64 (-1 to 1) into int16 (-32768 to 32767):
| |
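One way such a conversion could look is sketched below. It is an illustration rather than the function from the repository: the name `floatToInt16`, the clamping, and the choice to scale by 32767 (so that -32768 is never produced) are all assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// floatToInt16 maps a sample in the range -1.0..1.0 to a signed 16-bit
// integer. Out-of-range input is clamped so the conversion cannot overflow.
func floatToInt16(sample float64) int16 {
	if sample > 1 {
		sample = 1
	} else if sample < -1 {
		sample = -1
	}
	// Scaling by MaxInt16 keeps the range symmetric: -32767..32767.
	return int16(sample * math.MaxInt16)
}

func main() {
	for _, s := range []float64{-1, 0, 0.5, 1} {
		fmt.Printf("%5.2f -> %d\n", s, floatToInt16(s))
	}
}
```

The conversion truncates toward zero, so 0.5 becomes 16383 rather than 16384; rounding would be an equally reasonable choice.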
Oto Player
To play the actual sound, the audio buffer must be populated with some samples.
I am doing that by telling otoCtx.NewPlayer(oscillator) to use my instance of SinOscillator as its source.
The code below creates an oscillator for a 440 Hz tone with a volume of 0.2. 440 Hz is the standard pitch for “A4” (the note A above middle C), which gives the frequency some musical context.
| |
That is only possible because the NewPlayer function expects an io.Reader (a standard Go interface) as a parameter.
See the docs: oto package - github.com/ebitengine/oto/v3 - Context.NewPlayer:
| |
Let’s also inspect io.Reader in the official Golang docs: https://pkg.go.dev/io#Reader — there is quite extensive documentation on how to implement it properly, here is a fragment:
| |
Requirements
In practice, this means that the func (s *SinOscillator) Read(p []byte) (n int, err error) must:
- use signed 16-bit integer format for representing samples
- send the samples in little-endian
- take the `p []byte` buffer and populate it with samples
- return `n int` with the number of bytes written to `p`
Implementation of io.Reader for s *SinOscillator:
| |
- lines 2–6: checking the length of the buffer and “traversing” it
- line 7: getting the next sample from the oscillator
- lines 8–9: “splitting” the 16-bit integer into two bytes, first storing the least significant byte
- lines 10–12: moving the counter; it is the same as the number of written bytes, so it can be returned
If we were to print the size of the buffer, `len(p)`, it should (though it does not have to) be equal to the `bufferSizeSamples` we declared earlier, so 4096.
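To tie the requirements together, here is a self-contained sketch of an `io.Reader` that produces signed 16-bit little-endian samples. The `toneReader` type is a hypothetical stand-in for the oscillator, and unlike the listing described above it calculates each sample before moving the phase.

```go
package main

import (
	"fmt"
	"io"
	"math"
)

// toneReader is a hypothetical stand-in for the SinOscillator in the post.
type toneReader struct {
	amplitude float64
	phase     float64
	phaseStep float64
}

// Read fills p with signed 16-bit little-endian samples and reports how
// many bytes were written. It never returns an error or io.EOF, so the
// stream is endless. A trailing odd byte in p is left untouched.
func (r *toneReader) Read(p []byte) (int, error) {
	n := 0
	for n+1 < len(p) { // each sample needs two bytes
		sample := int16(r.amplitude * math.Sin(r.phase) * math.MaxInt16)
		p[n] = byte(sample)        // least significant byte first
		p[n+1] = byte(sample >> 8) // most significant byte second
		n += 2
		r.phase += r.phaseStep
		if r.phase > 2*math.Pi {
			r.phase -= 2 * math.Pi
		}
	}
	return n, nil
}

// Compile-time check that toneReader satisfies io.Reader.
var _ io.Reader = (*toneReader)(nil)

func main() {
	r := &toneReader{amplitude: 0.2, phaseStep: 2 * math.Pi * 440 / 44100}
	buf := make([]byte, 8) // room for four samples
	n, err := r.Read(buf)
	fmt.Println(n, err, buf)
}
```

Since the sketch only depends on `io.Reader`, it can be exercised with a plain byte slice, without any audio hardware involved.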
That is it! This code is enough to play a 440 Hz tone through the speakers of your computer.
Run the code yourself
You can grab the source code corresponding to this post from my GitHub repository.
Running it on macOS is as simple as cloning the repo, installing Go and Make, and then using make:
| |
Summary
This is the end of my first blog post in this new series about building a sound synthesizer in Go.
I described my goals for this project. Together we learned about the mathematical basis of sound and sine waves: radians, angular frequency, and phase.
I showed how to generate sine waves programmatically and then how to use the Ebitengine Oto library to abstract the audio driver and the OS to play the sound.
That involved some math, knowledge of Go’s io.Reader interface, and how data is stored in binary format.
I am looking forward to your comments and feedback!