# AuralKit

A simple, lightweight Swift wrapper for speech-to-text transcription using Apple's Speech APIs.
## Features

- Simple async/await API for speech transcription
- AttributedString output with audio timing metadata
- Multi-language support with automatic model downloading
- Native Apple types - no custom wrappers
- Minimal footprint - single file implementation
- Privacy-focused - on-device processing
## Overview

AuralKit provides a clean, minimal API for adding speech transcription to your app using iOS 26's `SpeechTranscriber` and `SpeechAnalyzer` APIs.
## Quick Start

```swift
import AuralKit

// Create an instance with your preferred locale
let auralKit = AuralKit(locale: .current)

// Start transcribing
for try await text in auralKit.startTranscribing() {
    print(text) // AttributedString with timing metadata
}
```
## Installation

Add AuralKit to your project through Xcode:

- File → Add Package Dependencies
- Enter `https://github.com/rryam/AuralKit`
- Click Add Package
Or add it to your `Package.swift`:

```swift
dependencies: [
    .package(url: "https://github.com/rryam/AuralKit", from: "0.1.0")
]
```
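If you declare the dependency in `Package.swift`, also list the library in the target that uses it. A minimal sketch; the target name `MyApp` is hypothetical, and the product name is assumed to match the package name:

```swift
targets: [
    .target(
        name: "MyApp", // hypothetical target name
        dependencies: [
            // Assumes the package exposes a product named "AuralKit"
            .product(name: "AuralKit", package: "AuralKit")
        ]
    )
]
```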
## Usage

```swift
import AuralKit

// Create with the default locale
let auralKit = AuralKit()

// Or specify a locale:
// let auralKit = AuralKit(locale: Locale(identifier: "es-ES"))

// Start transcribing
for try await attributedText in auralKit.startTranscribing() {
    // Access the plain text
    let plainText = String(attributedText.characters)
    print(plainText)

    // Access timing metadata for each word/phrase
    for run in attributedText.runs {
        if let timeRange = run.audioTimeRange {
            let text = String(attributedText[run.range].characters)
            print("Text: \(text), Start: \(timeRange.start.seconds)s")
        }
    }
}

// Stop when needed
await auralKit.stopTranscribing()
```
## Demo App

Check out the included Aural demo app to see AuralKit in action! The demo showcases:

- Live Transcription: Real-time speech-to-text with visual feedback
- Language Selection: Switch between multiple locales
- History Tracking: View past transcriptions
- Export & Share: Share transcriptions via the standard iOS share sheet
To run it:

- Open `Aural.xcodeproj` in the `Aural` directory
- Build and run on your iOS 26+ device or simulator
- Grant microphone and speech recognition permissions
- Start transcribing!
## SwiftUI Example

```swift
import SwiftUI
import AuralKit

@available(iOS 26.0, *)
struct ContentView: View {
    @State private var auralKit = AuralKit()
    @State private var transcribedText = ""
    @State private var isTranscribing = false

    var body: some View {
        VStack(spacing: 20) {
            Text(transcribedText.isEmpty ? "Tap to start..." : transcribedText)
                .padding()
                .frame(maxWidth: .infinity, minHeight: 100)
                .background(Color.gray.opacity(0.1))
                .cornerRadius(10)

            Button(action: toggleTranscription) {
                Label(isTranscribing ? "Stop" : "Start",
                      systemImage: isTranscribing ? "stop.circle.fill" : "mic.circle.fill")
                    .padding()
                    .background(isTranscribing ? Color.red : Color.blue)
                    .foregroundColor(.white)
                    .cornerRadius(10)
            }
        }
        .padding()
    }

    func toggleTranscription() {
        if isTranscribing {
            Task {
                await auralKit.stopTranscribing()
                isTranscribing = false
            }
        } else {
            isTranscribing = true
            Task {
                do {
                    for try await attributedText in auralKit.startTranscribing() {
                        transcribedText = String(attributedText.characters)
                    }
                } catch {
                    print("Error: \(error)")
                }
                isTranscribing = false
            }
        }
    }
}
```
## API Reference

```swift
@available(iOS 26.0, macOS 26.0, *)
public final class AuralKit {
    // Initialize with a locale
    public init(locale: Locale = .current)

    // Start transcribing
    public func startTranscribing() -> AsyncThrowingStream<AttributedString, Error>

    // Stop transcribing
    public func stopTranscribing() async
}
```
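Because `startTranscribing()` returns an `AsyncThrowingStream`, you can also consume it from a `Task` you keep a handle to. A sketch; whether cancelling the task alone ends the audio session is an assumption, so `stopTranscribing()` is still called explicitly:

```swift
// Keep a handle so the consuming task can be cancelled later
let transcriptionTask = Task {
    for try await text in auralKit.startTranscribing() {
        print(text)
    }
}

// Later, e.g. when the user leaves the screen:
transcriptionTask.cancel()
await auralKit.stopTranscribing()
```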
## Working with Results

The transcription returns an `AttributedString` with rich metadata:
```swift
for try await attributedText in auralKit.startTranscribing() {
    // Get plain text
    let plainText = String(attributedText.characters)

    // Access timing information
    for run in attributedText.runs {
        if let audioRange = run.audioTimeRange {
            let startTime = audioRange.start.seconds
            let endTime = audioRange.end.seconds
            let text = String(attributedText[run.range].characters)
            print("\(text): \(startTime)s - \(endTime)s")
        }
    }
}
```
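Those runs can be flattened into plain value types for captioning or seeking. A minimal sketch; `TimedSegment` and `timedSegments(from:)` are hypothetical helpers, not part of AuralKit:

```swift
import Foundation
import CoreMedia
import AuralKit

// Hypothetical value type for one timed piece of transcription
struct TimedSegment {
    let text: String
    let start: TimeInterval
    let end: TimeInterval
}

// Flatten an AttributedString's runs into timed segments,
// skipping runs without audio timing metadata
func timedSegments(from attributedText: AttributedString) -> [TimedSegment] {
    attributedText.runs.compactMap { run in
        guard let range = run.audioTimeRange else { return nil }
        return TimedSegment(
            text: String(attributedText[run.range].characters),
            start: range.start.seconds,
            end: range.end.seconds
        )
    }
}
```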
## Supported Languages

AuralKit supports all locales available through `SpeechTranscriber.supportedLocales`. Common examples:

- `Locale(identifier: "en-US")` - English (United States)
- `Locale(identifier: "es-ES")` - Spanish (Spain)
- `Locale(identifier: "fr-FR")` - French (France)
- `Locale(identifier: "de-DE")` - German (Germany)
- `Locale(identifier: "it-IT")` - Italian (Italy)
- `Locale(identifier: "pt-BR")` - Portuguese (Brazil)
- `Locale(identifier: "zh-CN")` - Chinese (Simplified)
- `Locale(identifier: "ja-JP")` - Japanese (Japan)
- `Locale(identifier: "ko-KR")` - Korean (Korea)
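To check availability at runtime before creating an `AuralKit` instance, you can query the Speech framework directly. A sketch, assuming `supportedLocales` is the async static property referenced above:

```swift
import Speech

// List the locales SpeechTranscriber can transcribe on this device
// (supportedLocales is assumed to be an async property in iOS 26)
let locales = await SpeechTranscriber.supportedLocales
for locale in locales {
    print(locale.identifier)
}
```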
## Permissions

Add the following keys to your `Info.plist`:

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access to transcribe speech.</string>
<key>NSSpeechRecognitionUsageDescription</key>
<string>This app needs speech recognition to convert your speech to text.</string>
```
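AuralKit presumably triggers the system prompts when transcription starts, but you can also request both permissions up front using the standard system APIs. A sketch; the `requestPermissions()` helper is hypothetical:

```swift
import AVFoundation
import Speech

// Hypothetical helper that requests microphone and speech
// recognition access before starting a transcription session
func requestPermissions() async -> Bool {
    // Microphone (iOS 17+ API)
    let micGranted = await AVAudioApplication.requestRecordPermission()

    // Speech recognition
    let speechStatus = await withCheckedContinuation { continuation in
        SFSpeechRecognizer.requestAuthorization { status in
            continuation.resume(returning: status)
        }
    }
    return micGranted && speechStatus == .authorized
}
```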
## Requirements

- iOS 26.0+ / macOS 26.0+
- Swift 6.0+
- Microphone and speech recognition permissions
## Error Handling

Errors thrown by the transcription stream can be caught and inspected:

```swift
do {
    for try await text in auralKit.startTranscribing() {
        print(text)
    }
} catch {
    let nsError = error as NSError
    switch nsError.code {
    case -10: // Microphone permission denied
        print("Please grant microphone permission in Settings")
    case -11: // Speech recognition permission denied
        print("Please grant speech recognition permission in Settings")
    case -2: // Unsupported locale
        print("Selected locale is not supported on this device")
    default:
        print("Error: \(error.localizedDescription)")
    }
}
```
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## License

AuralKit is available under the MIT License. See the LICENSE file for more info.
Made by Rudrank Riyam