whisper.node

Another Node.js binding of whisper.cpp, designed to keep the API as close to whisper.rn as possible.

whisper.cpp: Automatic speech recognition with multi-platform support
whisper.rn: React Native binding of whisper.cpp

Platform Support

- macOS
  - arm64: CPU and Metal GPU acceleration
  - x86_64: CPU only
- Windows (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA (x86_64)
- Linux (x86_64 and arm64)
  - CPU
  - GPU acceleration via Vulkan
  - GPU acceleration via CUDA

Installation

npm install @fugood/whisper.node

Usage

Basic Transcription

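import { initWhisper } from '@fugood/whisper.node'

// One of the names listed under "Lib Variants": 'default', 'vulkan', or 'cuda'
const libVariant = 'default'

const context = await initWhisper({
  model: 'path/to/ggml-base.en.bin',
  useGpu: true,
}, libVariant)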

// transcribeFile returns { stop, promise }
const { stop: stop1, promise: promise1 } = context.transcribeFile('audio1.wav', {
  language: 'en',
  temperature: 0.0,
  // ...
})

const result1 = await promise1

// transcribeData also returns { stop, promise }
let audioBuffer // ArrayBuffer of raw audio: 16-bit PCM, mono, 16 kHz
const { stop: stop2, promise: promise2 } = context.transcribeData(audioBuffer, {
  language: 'en',
  temperature: 0.0,
  // ...
})

const result2 = await promise2

// You can also cancel transcription if needed
// await stop1() // Cancels the first transcription
// await stop2() // Cancels the second transcription

// Always release the context when done
await context.release()
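
The { stop, promise } pair returned by transcribeFile and transcribeData makes it straightforward to enforce a time limit. The following is a minimal sketch, not part of the library: withTimeout is a hypothetical helper name, and how the promise settles after stop() is called is not stated here, so the helper simply passes that outcome through to the caller.

```javascript
// Hypothetical helper (not part of @fugood/whisper.node): request cancellation
// of a job that runs longer than `ms` milliseconds. `job` is any
// { stop, promise } pair, such as the value returned by transcribeFile.
function withTimeout(job, ms) {
  const timer = setTimeout(() => {
    // Ask the job to stop; errors thrown by stop() are deliberately ignored.
    Promise.resolve()
      .then(() => job.stop())
      .catch(() => {})
  }, ms)
  // Clear the timer once the job settles, whatever the outcome.
  return job.promise.finally(() => clearTimeout(timer))
}

// Usage sketch, assuming `context` comes from initWhisper as shown above:
// const result = await withTimeout(
//   context.transcribeFile('audio1.wav', { language: 'en' }),
//   60000,
// )
```

Clearing the timer in finally keeps a finished job from holding the Node.js process open until the timeout would have fired.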

Voice Activity Detection (VAD)

import { initWhisperVad } from '@fugood/whisper.node'

// Context-based VAD (for multiple detections)
const vadContext = await initWhisperVad({
  model: 'path/to/ggml-vad.bin',
  useGpu: true,
  nThreads: 2
}, libVariant)

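// Detect speech in a file
const fileResult = await vadContext.detectSpeechFile('audio.wav')

// Detect speech in raw audio (ArrayBuffer: 16-bit PCM, mono, 16 kHz)
const dataResult = await vadContext.detectSpeechData(audioBuffer)
// Release the VAD context when done
await vadContext.release()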
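
Both the transcription and VAD snippets end by calling release(). If an error is thrown before that line, the context is never released. One way to guard against this is a try/finally wrapper. The sketch below uses a hypothetical helper name, withContext, and assumes only that the context exposes an async release() method as shown above.

```javascript
// Hypothetical helper: run `fn` with a context and release the context
// afterwards, even if `fn` throws. Works with any object that exposes
// an async release() method.
async function withContext(context, fn) {
  try {
    return await fn(context)
  } finally {
    await context.release()
  }
}

// Usage sketch, assuming initWhisperVad and libVariant as in the snippet above:
// const result = await withContext(
//   await initWhisperVad({ model: 'path/to/ggml-vad.bin', useGpu: true, nThreads: 2 }, libVariant),
//   (vad) => vad.detectSpeechFile('audio.wav'),
// )
```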

Note: Audio data must be 16-bit PCM, mono, sampled at 16 kHz. The library expects an ArrayBuffer containing the raw audio data.
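
Many audio sources produce Float32 samples rather than 16-bit PCM. The following is a minimal conversion sketch under stated assumptions: the input is already mono at 16 kHz (no resampling is done), and the function names are hypothetical. Int16Array uses the platform byte order, which is little-endian on common hardware; whether the library requires a specific byte order is an assumption not confirmed here.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM in an ArrayBuffer.
// Assumes the input is already mono at 16 kHz; no resampling is done here.
function floatTo16BitPcm(samples) {
  const pcm = new Int16Array(samples.length)
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range values before scaling.
    const s = Math.max(-1, Math.min(1, samples[i]))
    // Negative values scale to -32768, positive values to 32767.
    pcm[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff)
  }
  return pcm.buffer
}

// Duration in seconds of a 16-bit mono 16 kHz buffer (2 bytes per sample).
function pcmDurationSeconds(buffer) {
  return buffer.byteLength / 2 / 16000
}
```

The returned ArrayBuffer can then be passed wherever the snippets above use audioBuffer.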

Lib Variants

default: General use; no GPU support except on macOS (Metal)
vulkan: GPU support via Vulkan (Windows/Linux); may be unstable in some scenarios
cuda: GPU support via CUDA (Windows/Linux), limited to specific compute capabilities

Supported CUDA compute capabilities: Linux x86_64: 8.9; Linux arm64: 8.7; Windows x86_64: 12.0
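
The variant name is passed as the second argument to initWhisper and initWhisperVad. As a rough sketch, a variant can be chosen from the current platform. The helper name pickLibVariant and its options are hypothetical, the selection policy is an assumption rather than the library's recommendation, and detecting whether a usable CUDA device is present is left to the caller.

```javascript
// Hypothetical helper: choose a lib variant name for the current platform.
// The variant names come from the list above; the policy is an assumption.
function pickLibVariant({
  platform = process.platform,
  preferGpu = true,
  hasCuda = false, // supplied by the caller; no detection is done here
} = {}) {
  // macOS: the default variant already covers Metal.
  if (platform === 'darwin') return 'default'
  if (!preferGpu) return 'default'
  if (platform === 'win32' || platform === 'linux') {
    return hasCuda ? 'cuda' : 'vulkan'
  }
  // Unknown platform: fall back to the CPU build.
  return 'default'
}

// Usage sketch:
// const context = await initWhisper({ model: 'path/to/ggml-base.en.bin', useGpu: true }, pickLibVariant())
```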

License

MIT

Built and maintained by BRICKS.
