Another Node binding of whisper.cpp, designed to keep its API as close to whisper.rn as possible.
- whisper.cpp: Automatic speech recognition with multi-platform support
- whisper.rn: React Native binding of whisper.cpp
- macOS
  - arm64: CPU and Metal GPU acceleration
  - x86_64: CPU only
- Windows (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA (x86_64)
- Linux (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA
npm install @fugood/whisper.node
import { initWhisper } from '@fugood/whisper.node'

// libVariant selects one of the lib variants: default, vulkan, or cuda
const context = await initWhisper({
  model: 'path/to/ggml-base.en.bin',
  useGpu: true,
}, libVariant)

// transcribeFile returns { stop, promise }
const { stop: stop1, promise: promise1 } = context.transcribeFile('audio1.wav', {
  language: 'en',
  temperature: 0.0,
  // ...
})
const result1 = await promise1

// transcribeData also returns { stop, promise }
let audioBuffer // ArrayBuffer of raw audio: 16-bit PCM, mono, 16 kHz
const { stop: stop2, promise: promise2 } = context.transcribeData(audioBuffer, {
  language: 'en',
  temperature: 0.0,
  // ...
})
const result2 = await promise2

// You can also cancel a transcription if needed
// await stop1() // Cancels the first transcription
// await stop2() // Cancels the second transcription

// Always release the context when done
await context.release()
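
Because the context should always be released, it can help to wrap the work in a `try`/`finally`. The sketch below is not part of the library: `withContext` is a hypothetical helper that accepts any async factory returning an object with a `release()` method, and it assumes `release()` is safe to call after a failed transcription.

```javascript
/**
 * Hypothetical helper (not provided by @fugood/whisper.node).
 * Creates a context, runs `work` with it, and releases the context
 * afterwards, even if `work` throws or rejects.
 */
async function withContext(createContext, work) {
  const context = await createContext()
  try {
    // Await here so the finally block runs only after the work settles
    return await work(context)
  } finally {
    await context.release()
  }
}
```

With the calls shown above, this could be used as `withContext(() => initWhisper({ model: 'path/to/ggml-base.en.bin' }, libVariant), (ctx) => ctx.transcribeFile('audio1.wav', { language: 'en' }).promise)`.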
import { initWhisperVad } from '@fugood/whisper.node'

// Context-based VAD (for multiple detections)
const vadContext = await initWhisperVad({
  model: 'path/to/ggml-vad.bin',
  useGpu: true,
  nThreads: 2,
}, libVariant)

// Detect speech in a file, or in raw audio data (audioBuffer: 16-bit PCM, mono, 16 kHz)
const result = await vadContext.detectSpeechFile('audio.wav')
const result2 = await vadContext.detectSpeechData(audioBuffer)

// Release the VAD context when done
await vadContext.release()
Note: Audio data should be in 16-bit PCM, mono, 16 kHz format. The library expects an ArrayBuffer containing raw audio data.
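
If your audio arrives as floating-point samples (for example from a decoder that outputs values in the range -1 to 1), it has to be converted before it matches the format in the note above. The following is a minimal sketch, not part of the library: `floatTo16BitPcm` is a hypothetical helper, and it assumes the input is already mono at 16 kHz (no resampling or channel mixing is done) and that the platform's native byte order is acceptable.

```javascript
/**
 * Hypothetical helper: convert Float32 samples in [-1, 1] to an
 * ArrayBuffer of signed 16-bit PCM samples (native byte order).
 * Assumes the samples are already mono and sampled at 16 kHz.
 */
function floatTo16BitPcm(samples) {
  const pcm = new Int16Array(samples.length)
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range values so they do not wrap around
    const s = Math.max(-1, Math.min(1, samples[i]))
    // Negative values scale to -32768, positive values to 32767
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff)
  }
  return pcm.buffer
}
```

The returned ArrayBuffer holds two bytes per sample and could then be passed as the `audioBuffer` argument shown earlier.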
- default: General usage; no GPU support except on macOS (Metal)
- vulkan: GPU support via Vulkan (Windows/Linux), but it may be unstable in some scenarios
- cuda: GPU support via CUDA (Windows/Linux), but only for limited capability
  - Linux: (x86_64: 8.9, arm64: 8.7)
  - Windows: x86_64 - 12.0
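
One way to choose a variant at runtime is to fall back to the default build wherever a GPU variant does not apply. This is only a sketch: `pickLibVariant` is a hypothetical helper, and passing the variant as the strings `'default'`, `'vulkan'`, or `'cuda'` is an assumption based on the names listed above. It also does not check the CUDA capability limits.

```javascript
/**
 * Hypothetical helper: pick a lib variant for a Node platform string
 * (as reported by process.platform) and a preferred GPU backend.
 * The string values are assumed from the variant names in this README.
 */
function pickLibVariant(platform, preferred) {
  // macOS gets Metal acceleration through the default build
  if (platform === 'darwin') return 'default'
  // The vulkan and cuda variants are listed for Windows and Linux only
  const gpuPlatform = platform === 'win32' || platform === 'linux'
  if (gpuPlatform && (preferred === 'vulkan' || preferred === 'cuda')) {
    return preferred
  }
  // Anything else falls back to the CPU-oriented default build
  return 'default'
}
```

For example, `const libVariant = pickLibVariant(process.platform, 'vulkan')` would yield a value that could be passed as the second argument to `initWhisper` or `initWhisperVad`.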
MIT
Built and maintained by BRICKS.