# whisper-api

A minimal HTTP API for speech-to-text transcription, built with Python, ffmpeg, and whisper.cpp.

The service accepts audio uploads and returns a single transcription result using the Whisper model. It is designed to be minimal and easy to deploy, either locally or in Docker.
## How it works

The server works in three stages:

1. **Upload.** A client sends a `multipart/form-data` POST request containing an audio file.
2. **Conversion.** The server converts the audio to 16 kHz mono WAV using ffmpeg, which ensures compatibility and stable input for Whisper.
3. **Transcription.** The server calls the whisper.cpp CLI (`whisper-cli`) with the specified model and returns the result as JSON. If the transcript is empty, the server returns diagnostic information to help with debugging.
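The two external commands behind stages 2 and 3 can be sketched as follows (a minimal sketch: the helper names `build_ffmpeg_cmd` and `build_whisper_cmd` are illustrative, and server.py may pass different flags):

```python
# Sketch of the two external commands the server runs per request.
# Helper names and the exact flag set are illustrative, not taken from server.py.

def build_ffmpeg_cmd(src: str, dst: str) -> list[str]:
    # Convert any input audio to 16 kHz mono WAV, the input Whisper expects.
    return ["ffmpeg", "-y", "-i", src, "-ar", "16000", "-ac", "1", dst]

def build_whisper_cmd(whisper_bin: str, model: str, wav: str) -> list[str]:
    # Run the whisper.cpp CLI against the converted file.
    return [whisper_bin, "-m", model, "-f", wav]
```

Each list can be handed directly to `subprocess.run`, with stdout and stderr captured so they can be echoed back in the diagnostics response.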
## API

### GET /health

Returns server status and verifies that the whisper-cli binary and the model file are available.

Example response:

```json
{
  "ok": true,
  "problems": []
}
```
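A health check of this shape reduces to verifying that both files exist (a sketch: the response fields match the example above, but the helper name is illustrative):

```python
import os

def health(whisper_bin: str, model_path: str) -> dict:
    # Report ok only when both the CLI binary and the model file are present.
    problems = []
    if not os.path.isfile(whisper_bin):
        problems.append(f"missing whisper binary: {whisper_bin}")
    if not os.path.isfile(model_path):
        problems.append(f"missing model file: {model_path}")
    return {"ok": not problems, "problems": problems}
```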
### POST /transcribe

Accepts a `multipart/form-data` request with a single field named `file`:

```
file=@audio.wav
```

Any audio format that ffmpeg can decode is supported.

Success response:

```json
{
  "text": "Hello this is a transcription."
}
```

If the transcription is empty, the server returns diagnostics instead:

```json
{
  "text": "",
  "note": "empty transcript; returning diagnostics",
  "stdout": "...",
  "stderr": "...",
  "cmd": [...]
}
```
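From Python, the upload can be made with the standard library alone (a sketch: the `multipart_body` helper is illustrative, and the commented-out send assumes the server is reachable on localhost:5005):

```python
import uuid

def multipart_body(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    # Hand-build a multipart/form-data body for a single file field,
    # returning the body and the matching Content-Type header value.
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"

# Sending it (requires a running server):
# import urllib.request
# body, ctype = multipart_body("file", "test.wav", open("test.wav", "rb").read())
# req = urllib.request.Request("http://localhost:5005/transcribe", data=body,
#                              headers={"Content-Type": ctype})
# print(urllib.request.urlopen(req).read().decode())
```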
## Project structure

```
.
├── server.py
├── requirements.txt
├── Dockerfile
└── README.md
```

At runtime the container will also contain the whisper.cpp checkout, the compiled CLI, and the downloaded model (`ggml-small.bin`):

```
/app/whisper.cpp
/app/whisper.cpp/build/bin/whisper-cli
/app/whisper.cpp/models/ggml-small.bin
```

The Docker image builds whisper.cpp automatically.
## Configuration

The server reads two environment variables:

- `WHISPER_BIN`: path to the whisper-cli binary
- `MODEL_PATH`: path to the Whisper model file

Defaults inside the container:

```
WHISPER_BIN=/app/whisper.cpp/build/bin/whisper-cli
MODEL_PATH=/app/whisper.cpp/models/ggml-small.bin
```
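Resolving these with fallbacks is one line each (a sketch of how server.py presumably reads them; the actual code may differ):

```python
import os

# Resolve configuration from the environment, falling back to the
# in-container defaults listed above.
WHISPER_BIN = os.environ.get("WHISPER_BIN", "/app/whisper.cpp/build/bin/whisper-cli")
MODEL_PATH = os.environ.get("MODEL_PATH", "/app/whisper.cpp/models/ggml-small.bin")
```

To override them at runtime, pass `-e WHISPER_BIN=... -e MODEL_PATH=...` to `docker run`.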
## Running with Docker

From the project directory:

```shell
docker build -t whisper-api .
docker run -d -p 5005:5005 --name whisper-server whisper-api
```

The API will be available at `http://localhost:5005`.

Check health:

```shell
curl http://localhost:5005/health
```

Transcribe a file:

```shell
curl -X POST \
  -F "file=@test.wav" \
  http://localhost:5005/transcribe
```

Example response:

```json
{"text":"hello this is a test"}
```
## Running locally

Create and activate a Python virtual environment:

```shell
python -m venv venv
source venv/bin/activate
```

Install dependencies:

```shell
pip install -r requirements.txt
```

Run the server:

```shell
python server.py
```

Running outside Docker also requires ffmpeg and a compiled whisper-cli on the machine, with `WHISPER_BIN` and `MODEL_PATH` pointing at them.
## Models

The default Docker build downloads `ggml-small.bin`. You can switch models by modifying the Dockerfile:
| Model | Speed | Accuracy |
|---|---|---|
| tiny | very fast | low |
| base | fast | moderate |
| small | balanced | good |
| medium | slow | high |
| large | very slow | best |
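whisper.cpp ships a model download script, so the swap is typically a one-line change (a hypothetical excerpt; the actual Dockerfile lines in this project may differ):

```dockerfile
# Download a different Whisper model during the image build.
# Replace "medium" with tiny, base, small, or large.
RUN ./models/download-ggml-model.sh medium
```

Remember to point `MODEL_PATH` (or the Dockerfile default) at the matching `ggml-<name>.bin` file.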
## Performance

Transcription speed depends on CPU performance, thread count, model size, and audio length. Typical small-model performance on modern CPUs is roughly 0.5x to 2x realtime.
## Limitations

This server is intended as a minimal example and is not hardened for production use. For production deployments, consider adding authentication, rate limiting, and TLS.

## Acknowledgements

This project uses whisper.cpp, OpenAI's Whisper models, and ffmpeg. See their respective repositories for details.