A short 2-in-1 post of two things I’ve been meaning to try out on macOS - first, to try the new macOS Container framework on macOS 15.5... and second, to spin up Kokoro Text-to-Speech (TTS) in a container.
About macOS Containers
At WWDC 2025 in June, Apple announced the Containerization framework for macOS, and its source is available on GitHub at apple/container.
On macOS 15.5 Sequoia there are limitations, mostly around networking; the full-featured release will arrive with macOS 26.
Over the years I’ve used Docker Desktop, Lima and Multipass to run Linux containers on macOS, but soon I will no longer need any third-party tool!
Install Apple/Container
First, download and install container-0.3.0-installer-signed.pkg from the GitHub releases page. Note that this should be considered beta software with a monthly release cadence, so look for the latest version.
Running the container command below for the first time will download the default Linux kernel from the Kata Containers project:
container system start
For me, I also had to follow the macOS 15 Network Configuration instructions to correct the local virtual subnet for containers. You may or may not need this - read the documentation!
defaults write com.apple.container.defaults network.subnet 192.168.66.1/24
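The subnet value is just a CIDR block. As a quick sanity check (a sketch using Python’s ipaddress module, with 192.168.66.2 as an example of an address a container might be assigned from this range):

```python
import ipaddress

# The subnet configured via `defaults write` above
subnet = ipaddress.ip_network("192.168.66.1/24", strict=False)

# An example address a container could be assigned from this range
print(ipaddress.ip_address("192.168.66.2") in subnet)  # True
print(subnet.num_addresses)  # 256
```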
About Kokoro
Kokoro is an 82-million-parameter open-weight text-to-speech (TTS) model. It is in the top 10 of the TTS Arena v2 leaderboard and is the leading open-weights (non-proprietary) model.
Kokoro supports eight languages (nine, if you count American and British English separately) with multiple voices each. For English, the top voices, ranked by overall grade, are:
- American English: af_heart (A) and af_bella (A-)
- British English: bf_emma (B-)
To create a more distinctive voice, Kokoro supports “blending”: a weighted mix of two or more voices (sometimes called voice combination). The screenshot below shows how easy this is with the Kokoro-FastAPI web UI.
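A blend is expressed as a single voice string. Here is a minimal sketch; the weighted `name(weight)+name(weight)` syntax is the one the Kokoro-FastAPI project documents, so check its README for the exact form:

```python
def blend(*pairs: tuple[str, float]) -> str:
    """Build a blended-voice string like 'af_bella(2)+af_heart(1)'."""
    return "+".join(f"{name}({weight})" for name, weight in pairs)

# Two parts af_bella to one part af_heart
print(blend(("af_bella", 2), ("af_heart", 1)))  # af_bella(2)+af_heart(1)
```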
Run Kokoro Text-to-Speech Container
I chose Kokoro-FastAPI simply because there is a pre-built multi-architecture container image available: no need to build anything, and the CPU image runs on Apple silicon Macs.
To download and run the Kokoro-FastAPI image interactively (-i --tty) so I can see the stdout log:
container run -m 2G -i --tty --name kokoro ghcr.io/remsky/kokoro-fastapi-cpu:latest
The -m 2G parameter allocates 2 GB of RAM to the container. Without it, the container stops with exit code 137, which means it was killed for running out of memory. The stdout log ends abruptly like this:
2025-08-18 07:00:58.841 | INFO | __main__:download_model:60 - Model files already exist and are valid
INFO: Started server process [6]
INFO: Waiting for application startup.
07:01:09 AM | INFO | main:57 | Loading TTS model and voice packs...
07:01:09 AM | INFO | model_manager:38 | Initializing Kokoro V1 on cpu
07:01:09 AM | DEBUG | paths:101 | Searching for model in path: /app/api/src/models
07:01:09 AM | INFO | kokoro_v1:46 | Loading Kokoro model on cpu
07:01:09 AM | INFO | kokoro_v1:47 | Config path: /app/api/src/models/v1_0/config.json
07:01:09 AM | INFO | kokoro_v1:48 | Model path: /app/api/src/models/v1_0/kokoro-v1_0.pth
WARNING: Defaulting repo_id to hexgrad/Kokoro-82M. Pass repo_id='hexgrad/Kokoro-82M' to suppress this warning.
/app/.venv/lib/python3.10/site-packages/torch/nn/modules/rnn.py:123: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
warnings.warn(
/app/.venv/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:143: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
WeightNorm.apply(module, name, dim)
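Exit code 137 follows the usual Unix convention of 128 plus the number of the fatal signal, and signal 9 is SIGKILL, which is what the out-of-memory killer sends:

```python
import signal

# Exit codes above 128 encode a fatal signal: code = 128 + signal number
print(137 - 128)            # 9
print(int(signal.SIGKILL))  # 9 on POSIX systems
```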
With more memory allocated, the container starts up successfully:
░░░░░░░░░░░░░░░░░░░░░░░░
╔═╗┌─┐┌─┐┌┬┐
╠╣ ├─┤└─┐ │
╚ ┴ ┴└─┘ ┴
╦╔═┌─┐┬┌─┌─┐
╠╩╗│ │├┴┐│ │
╩ ╩└─┘┴ ┴└─┘
░░░░░░░░░░░░░░░░░░░░░░░░
Model warmed up on cpu: kokoro_v1
Running on CPU
67 voice packs loaded
Beta Web Player: http://0.0.0.0:8880/web/
or http://localhost:8880/web/
░░░░░░░░░░░░░░░░░░░░░░░░
In another terminal, get the running container’s IP address via the container ls command:
ID IMAGE OS ARCH STATE ADDR
xxxx-xxx-xxxx ghcr.io/remsky/kokoro-fastapi-cpu:latest linux arm64 running 192.168.66.2
Finally, connect to http://192.168.66.2:8880/web/ in a browser, or use the OpenAI-compatible API endpoint at http://192.168.66.2:8880/v1 (port publishing to localhost is one of the networking limitations on macOS 15).
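Because the API is OpenAI-compatible, it can be scripted without any SDK. A minimal Python sketch, assuming the container IP reported above and the OpenAI-style /v1/audio/speech route with model/voice/response_format fields (check the Kokoro-FastAPI docs for the exact shape):

```python
import json
import urllib.request

API_BASE = "http://192.168.66.2:8880/v1"  # container IP from `container ls`

def speech_request(text: str, voice: str = "af_heart", fmt: str = "mp3") -> dict:
    """JSON body for the OpenAI-style /audio/speech endpoint."""
    return {"model": "kokoro", "input": text, "voice": voice, "response_format": fmt}

def synthesize(text: str, voice: str = "af_heart", out_path: str = "speech.mp3") -> None:
    """POST the request and save the returned audio bytes to a file."""
    req = urllib.request.Request(
        f"{API_BASE}/audio/speech",
        data=json.dumps(speech_request(text, voice)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())

# Usage (with the container running):
#   synthesize("Hello from a macOS container!", voice="af_bella")
```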
The easiest way to stop the container is to hit Ctrl+C; to re-start it later, run container start -i kokoro.
Conclusion
In this post I recorded the steps I took to install the new, native macOS container engine and easily run Kokoro TTS in a container, with no need to wrangle pip or Python.
Aside: for a more feature-packed CLI tool, try Kokoro-TTS. For that one you would have to build your own container image; I did not want to run it directly on my host macOS, because the Python version that ships with macOS is too old for it, despite what the documentation indicates.