This repository was archived by the owner on Apr 11, 2025. It is now read-only.

Commit a99ad0a

V1.1 (#11)
1 parent 2dea5e9 commit a99ad0a

64 files changed, +903 -323 lines changed
.gitignore

Lines changed: 3 additions & 1 deletion
@@ -1,3 +1,5 @@
 .idea
 node_modules
-.DS_Store
+.DS_Store
+demo/c/picovoice_demo_file
+demo/c/picovoice_demo_mic

README.md

Lines changed: 51 additions & 48 deletions
@@ -26,28 +26,28 @@ spoken command:
 - **Private & Secure:** Everything is processed offline. Intrinsically private; HIPAA and GDPR compliant.
 - **Accurate:** Resilient to noise and reverberation. Outperforms cloud-based alternatives by wide margins.
 - **Cross-Platform:** Design once, deploy anywhere. Build using familiar languages and frameworks. Raspberry Pi, BeagleBone,
-Android, iOS, Linux (x86_64), macOS (x86_64), Windows (x86_64), and modern web browsers are supported. Enterprise customers
-can access ARM Cortex-M SDK.
+Android, iOS, Linux (x86_64), macOS (x86_64), Windows (x86_64), and modern web browsers are supported. Enterprise customers
+can access the ARM Cortex-M SDK.
 - **Self-Service:** Design, train, and test voice interfaces instantly in your browser, using [Picovoice Console](https://picovoice.ai/console/).
 - **Reliable:** Runs locally without needing continuous connectivity.
 - **Zero Latency:** Edge-first architecture eliminates unpredictable network delay.

 ## Build with Picovoice

 1. **Evaluate:** The Picovoice SDK is a cross-platform library for adding voice to anything. It includes some
-pre-trained speech models. The SDK is licensed under Apache 2.0 and available on GitHub to encourage independent
-benchmarking and integration testing. You are empowered to make a data-driven decision.
+pre-trained speech models. The SDK is licensed under Apache 2.0 and available on GitHub to encourage independent
+benchmarking and integration testing. You are empowered to make a data-driven decision.

 2. **Design:** [Picovoice Console](https://picovoice.ai/console/) is a cloud-based platform for designing voice
-interfaces and training speech models, all within your web browser. No machine learning skills are required. Simply
-describe what you need with text and export trained models.
+interfaces and training speech models, all within your web browser. No machine learning skills are required. Simply
+describe what you need with text and export trained models.

 3. **Develop:** Exported models can run on Picovoice SDK without requiring constant connectivity. The SDK runs on a wide
-range of platforms and supports a large number of frameworks. The Picovoice Console and Picovoice SDK enable you to
-design, build and iterate fast.
+range of platforms and supports a large number of frameworks. The Picovoice Console and Picovoice SDK enable you to
+design, build and iterate fast.

 4. **Deploy:** Deploy at scale without having to maintain complex cloud infrastructure. Avoid unbounded cloud fees,
-limitations, and control imposed by big tech.
+limitations, and control imposed by big tech.

 ## Platform Features

@@ -66,11 +66,11 @@ platform.

 ## License & Terms

-The Picovoice SDK is free and licensed under Apache 2.0 including the models released within. [Picovoice Console]((https://picovoice.ai/console/)) offers
+The Picovoice SDK is free and licensed under Apache 2.0 including the models released within. [Picovoice Console](https://picovoice.ai/console/) offers
 two types of subscriptions: Personal and Enterprise. Personal accounts can train custom speech models that run on the
 Picovoice SDK, subject to limitations and strictly for non-commercial purposes. Personal accounts empower researchers,
 hobbyists, and tinkerers to experiment. Enterprise accounts can unlock all capabilities of Picovoice Console, are
-permitted for use in commercial settings, and have a path to graduate to commercial distribution[<sup>*</sup>](https://picovoice.ai/pricing/).
+permitted for use in commercial settings, and have a path to graduate to commercial distribution[<sup>\*</sup>](https://picovoice.ai/pricing/).

 ## Table of Contents

@@ -304,7 +304,7 @@ both of which [run offline in the browser](https://picovoice.ai/blog/offline-voi

 ### Python

-Install the package
+Install the package:

 ```bash
 pip3 install picovoice
@@ -323,11 +323,9 @@ def wake_word_callback():
 context_path = ...

 def inference_callback(inference):
-    # `inference` exposes three immutable fields:
-    # (1) `is_understood`
-    # (2) `intent`
-    # (3) `slots`
-    pass
+    print(inference.is_understood)
+    print(inference.intent)
+    print(inference.slots)

 handle = Picovoice(
     keyword_path=keyword_path,
@@ -336,15 +334,15 @@ handle = Picovoice(
     inference_callback=inference_callback)
 ```

-`handle` is an instance of Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
+`handle` is an instance of the Picovoice runtime engine. It detects utterances of wake phrase defined in the file located at
 `keyword_path`. Upon detection of wake word it starts inferring user's intent from the follow-on voice command within
-the context defined by the file located at `context_path`. `keyword_path` is the absolute path to
-[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` suffix).
-`context_path` is the absolute path to [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
-(with `.rhn` suffix). `wake_word_callback` is invoked upon the detection of wake phrase and `inference_callback` is
+the context defined by the file located at `context_path`. `keyword_path` is the absolute path to the
+[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
+`context_path` is the absolute path to the [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
+(with `.rhn` extension). `wake_word_callback` is invoked upon the detection of wake phrase and `inference_callback` is
 invoked upon completion of follow-on voice command inference.

-When instantiated, valid sample rate can be obtained via `handle.sample_rate`. Expected number of audio samples per
+When instantiated, the required rate can be obtained via `handle.sample_rate`. Expected number of audio samples per
 frame is `handle.frame_length`. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio. The
 set of supported commands can be retrieved (in YAML format) via `handle.context_info`.

@@ -356,11 +354,7 @@ while True:
     handle.process(get_next_audio_frame())
 ```

-When done resources have to be released explicitly
-
-```python
-handle.delete()
-```
+When done, resources have to be released explicitly `handle.delete()`.

 ### NodeJS

@@ -376,8 +370,8 @@ yarn add @picovoice/picovoice-node
 npm install @picovoice/picovoice-node
 ```

-The SDK provides the `Picovoice` class. Create an instance of this class using a Porcupine keyword (with `.ppn` suffix)
-and Rhino context file (with `.rhn` suffix), as well as callback functions that will be invoked on wake word detection
+The SDK provides the `Picovoice` class. Create an instance of this class using a Porcupine keyword (with `.ppn` extension)
+and Rhino context file (with `.rhn` extension), as well as callback functions that will be invoked on wake word detection
 and command inference completion events, respectively:

 ```javascript
@@ -419,7 +413,7 @@ As the audio is processed through the Picovoice engines, the callbacks will fire
 ### .NET

 You can install the latest version of Picovoice by adding the latest
-[Picovoice Nuget package](https://www.nuget.org/packages/Picovoice/) in Visual Studio or using the .NET CLI.
+[Picovoice NuGet package](https://www.nuget.org/packages/Picovoice/) in Visual Studio or using the .NET CLI.

 ```bash
 dotnet add package Picovoice
@@ -455,13 +449,13 @@ Picovoice handle = new Picovoice(keywordPath,
 `handle` is an instance of Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
 `keywordPath`. Upon detection of wake word it starts inferring user's intent from the follow-on voice command within
 the context defined by the file located at `contextPath`. `keywordPath` is the absolute path to
-[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` suffix).
+[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
 `contextPath` is the absolute path to [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
-(with `.rhn` suffix). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
+(with `.rhn` extension). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
 invoked upon completion of follow-on voice command inference.

-When instantiated, valid sample rate can be obtained via `handle.SampleRate`. Expected number of audio samples per
-frame is `handle.FrameLength`. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
+When instantiated, the required sample rate can be obtained via `handle.SampleRate`. The expected number of audio samples per
+frame is `handle.FrameLength`. The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

 ```csharp
 short[] GetNextAudioFrame()
@@ -519,16 +513,16 @@ try{
 } catch (PicovoiceException e) { }
 ```

-`handle` is an instance of Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
-`keywordPath`. Upon detection of wake word it starts inferring user's intent from the follow-on voice command within
+`handle` is an instance of the Picovoice runtime engine that detects utterances of wake phrase defined in the file located at
+`keywordPath`. Upon detection of wake word it starts inferring the user's intent from the follow-on voice command within
 the context defined by the file located at `contextPath`. `keywordPath` is the absolute path to
-[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` suffix).
+[Porcupine wake word engine](https://github.com/Picovoice/porcupine) keyword file (with `.ppn` extension).
 `contextPath` is the absolute path to [Rhino Speech-to-Intent engine](https://github.com/Picovoice/rhino) context file
-(with `.rhn` suffix). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
+(with `.rhn` extension). `wakeWordCallback` is invoked upon the detection of wake phrase and `inferenceCallback` is
 invoked upon completion of follow-on voice command inference.

-When instantiated, valid sample rate can be obtained via `handle.getSampleRate()`. Expected number of audio samples per
-frame is `handle.getFrameLength()`. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
+When instantiated, the required sample rate can be obtained via `handle.getSampleRate()`. The expected number of audio samples per
+frame is `handle.getFrameLength()`. The Picovoice engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

 ```java
 short[] getNextAudioFrame()
@@ -558,7 +552,7 @@ There are two possibilities for integrating Picovoice into an Android applicatio
 [PicovoiceManager](/sdk/android/Picovoice/picovoice/src/main/java/ai/picovoice/picovoice/PicovoiceManager.java) provides
 a high-level API for integrating Picovoice into Android applications. It manages all activities related to creating an
 input audio stream, feeding it into Picovoice engine, and invoking user-defined callbacks upon wake word detection and
-inference completion. The class can be initialized as follow
+inference completion. The class can be initialized as follows:

 ```java
 import ai.picovoice.picovoice.PicovoiceManager;
@@ -592,22 +586,22 @@ PicovoiceManager manager = new PicovoiceManager(
 );
 ```

-Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating number within
+Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating point number within
 [0, 1]. A higher sensitivity reduces miss rate at cost of increased false alarm rate.

-When initialized, input audio can be processed using
+When initialized, input audio can be processed using:

 ```java
 manager.start();
 ```

-Stop the manager by
+Stop the manager with:

 ```java
 manager.stop();
 ```

-When done be sure to release resources using
+When done be sure to release resources:

 ```java
 manager.delete();
@@ -650,7 +644,7 @@ Picovoice picovoice = new Picovoice(
 );
 ```

-Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating number within
+Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating point number within
 [0, 1]. A higher sensitivity reduces miss rate at cost of increased false alarm rate.

 Once initialized, `picovoice` can be used to process incoming audio.
@@ -668,7 +662,7 @@ while (true) {
 ```

 Finally, be sure to explicitly release resources acquired as the binding class does not rely on the garbage collector
-for releasing native resources.
+for releasing native resources:

 ```java
 picovoice.delete();
@@ -708,6 +702,15 @@ when initialized input audio can be processed using `manager.start()`. The proce

 ## Releases

+### v1.1.0 - December 2nd, 2020
+
+- Improved accuracy.
+- Runtime optimizations.
+- .NET SDK.
+- Java SDK.
+- React Native SDK.
+- C SDK.
+
 ### v1.0.0 - October 22, 2020

 - Initial release.
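
Reviewer note on the Python section of the README diff above: the text says the engine expects `handle.frame_length` samples per frame of single-channel, 16-bit linearly-encoded PCM, but leaves `get_next_audio_frame()` undefined. The following is only a minimal sketch of one way to cut a raw PCM byte buffer into frames of that shape. It does not use the Picovoice package; `split_into_frames` and `FRAME_LENGTH` are hypothetical names, the value 512 is a placeholder rather than a value taken from the SDK, and little-endian byte order is an assumption.

```python
import struct


def split_into_frames(pcm_bytes, frame_length):
    """Split 16-bit mono PCM bytes (assumed little-endian) into sample frames.

    Returns a list of frames, each a list of `frame_length` signed integers.
    A trailing partial frame is dropped, since a fixed frame size is expected.
    """
    num_samples = len(pcm_bytes) // 2  # two bytes per 16-bit sample
    samples = struct.unpack("<%dh" % num_samples, pcm_bytes[: num_samples * 2])
    num_frames = num_samples // frame_length
    return [
        list(samples[i * frame_length : (i + 1) * frame_length])
        for i in range(num_frames)
    ]


# Placeholder frame size for illustration; real code would read handle.frame_length.
FRAME_LENGTH = 512

# Three full frames of silence plus 100 leftover samples that get dropped.
silence = bytes(2 * (3 * FRAME_LENGTH + 100))
frames = split_into_frames(silence, FRAME_LENGTH)
```

Under these assumptions, each element of `frames` could be passed to `handle.process(...)` in the loop the README shows.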
8.12 KB
Binary file not shown.
-6.41 KB
Binary file not shown.

demo/android/Activity/build.gradle

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ buildscript {
         jcenter()
     }
     dependencies {
-        classpath "com.android.tools.build:gradle:4.0.2"
+        classpath 'com.android.tools.build:gradle:4.1.1'

         // NOTE: Do not place your application dependencies here; they belong
         // in the individual module build.gradle files
Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
-#Thu Oct 08 18:22:43 PDT 2020
+#Wed Dec 02 11:26:17 PST 2020
 distributionBase=GRADLE_USER_HOME
 distributionPath=wrapper/dists
 zipStoreBase=GRADLE_USER_HOME
 zipStorePath=wrapper/dists
-distributionUrl=https\://services.gradle.org/distributions/gradle-6.1.1-all.zip
+distributionUrl=https\://services.gradle.org/distributions/gradle-6.5-bin.zip
3 Bytes
Binary file not shown.
4.23 KB
Binary file not shown.
4.71 KB
Binary file not shown.