Fix SAPI5 #18300

Merged
merged 35 commits into from
Jul 29, 2025

gexgd0419
Contributor

@gexgd0419 gexgd0419 commented Jun 21, 2025

Link to issue number:

Fixes #18298. Fixes #18301.

Summary of the issue:

When using SAPI 5 voices, NVDA may freeze at some point. It seems that a thread gets stuck calling WavePlayer.idle.

After selecting a SAPI5 Eloquence voice, NVDA would fail to load the same voice and fall back to OneCore voices on next launch.

Description of user facing changes:

The issues described above should be fixed.

Description of developer facing changes:

  • The following symbols in synthDrivers.sapi5 are deprecated with no replacement:
    • LP_c_ubyte
    • LP_c_ulong
    • LP__ULARGE_INTEGER
    • SynthDriver.isSpeaking

Description of development approach:

SpVoice.Speak() waits for SAPI5's audio thread, and if the audio thread is waiting for WavePlayer.idle(), SpVoice.Speak() will also block, causing deadlocks if the WavePlayer is paused.

To avoid this, a dedicated thread _speakThread is created to send speak requests to SAPI5 one by one.

  • A separate speech queue, _speakRequests, is maintained. Whenever the synth wants to speak, it places a speak request in the queue.
  • The speak thread handles the requests one by one. Therefore, SAPI5's own queue is bypassed.
  • When nothing is being spoken, it takes the next request and lets SAPI5 speak it, then waits for SAPI5 to finish this utterance.
  • When the queue becomes empty, the speak thread calls WavePlayer.idle. Neither the SAPI5 audio thread nor the main thread will be blocked.
  • As SAPI5 speak is never called in the main thread, it will not be blocked.
  • When cancel is called, WavePlayer.stop is called to break out of pausing or idling. The rest of the cancelling work is done in the thread.
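
The queue-and-thread arrangement described above can be sketched roughly as follows. This is a simplified illustration, not the code in sapi5.py: SpeakThread, FakePlayer, speak_func, and wait_until_done are hypothetical names, and FakePlayer only records calls where the real driver would use WavePlayer.

```python
import queue
import threading


class FakePlayer:
    """Hypothetical stand-in for WavePlayer; it records calls instead of playing audio."""

    def __init__(self):
        self.calls = []

    def idle(self):
        self.calls.append("idle")

    def stop(self):
        self.calls.append("stop")


class SpeakThread(threading.Thread):
    """Speaks queued requests one at a time on a background thread (illustrative only)."""

    def __init__(self, speak_func, player):
        super().__init__(daemon=True)
        self._requests = queue.Queue()
        self._speak = speak_func  # assumed to block until one utterance is finished
        self._player = player

    def speak(self, text):
        # Called from the main thread; it only enqueues, so it never blocks on the synth.
        self._requests.put(text)

    def cancel(self):
        # Purge everything that has not been spoken yet, then break out of pause/idle.
        try:
            while True:
                self._requests.get_nowait()
                self._requests.task_done()
        except queue.Empty:
            pass
        self._player.stop()

    def wait_until_done(self):
        # Helper for the demonstration: block until every queued request was handled.
        self._requests.join()

    def terminate(self):
        self._requests.put(None)  # sentinel that ends the loop in run()
        self.join()

    def run(self):
        while True:
            text = self._requests.get()
            if text is None:
                break
            self._speak(text)
            if self._requests.empty():
                # Nothing left to say: only now is the player told to go idle.
                self._player.idle()
            self._requests.task_done()
```

Because speak only enqueues, a stalled idle call can delay the background thread but not the caller.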

SynthDriverAudioStream now implements the ISpAudio interface, which can get the wave format of the voice directly during format negotiation, making it possible to get rid of the legacy audio system entirely.

Testing strategy:

A user reported that it ran hours without crashing.

Known issues with pull request:

Text to be spoken is stored in each speak request, but prosody arguments, such as rate and volume, are not. All speak requests will be spoken in the current prosody setting.

A user reported that this caused the NVDA key to stop working sometimes, and that NVDA did not start on the logon screen. Those two issues may be unrelated to this change.

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@coderabbitai summary

@gexgd0419
Contributor Author

From the logs:

Python stack for thread 4740 (nvwave.playWaveFile(error.wav)):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "nvwave.pyc", line 128, in play
  File "nvwave.pyc", line 326, in feed
  File "nvwave.pyc", line 303, in open
  File "nvwave.pyc", line 459, in _setVolumeFromConfig
  File "nvwave.pyc", line 428, in setVolume

Python stack for thread 11688 (Dummy-15):
  File "comtypes\_comobject.pyc", line 176, in call_without_this
  File "synthDrivers\sapi5.pyc", line 167, in ISpNotifySink_Notify
  File "synthDrivers\sapi5.pyc", line 204, in EndStream
  File "nvwave.pyc", line 371, in idle
  File "nvwave.pyc", line 367, in sync

Python stack for thread 2424 (watchdog.CancellableCallThread.execute(<_FuncPtr object at 0x0DB99B70>)):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "watchdog.pyc", line 428, in run
  File "threading.pyc", line 629, in wait
  File "threading.pyc", line 327, in wait

Python stack for thread 15200 (watchdog):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "watchdog.pyc", line 163, in _watcher
  File "watchdog.pyc", line 191, in waitForFreezeRecovery
  File "logHandler.pyc", line 64, in getFormattedStacksForAllThreads

Python stack for thread 6964 (Thread-14):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\browsernav\globalPlugins\browserNav\utils.py", line 94, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 22044 (Thread-13):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\browsernav\globalPlugins\browserNav\utils.py", line 94, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 21716 (Thread-12):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\browsernav\globalPlugins\browserNav\utils.py", line 94, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 18068 (Thread-11):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\browsernav\globalPlugins\browserNav\utils.py", line 94, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 7288 (Thread-10):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\browsernav\globalPlugins\browserNav\utils.py", line 94, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 3388 (Thread-9):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\phoneticPunctuation\globalPlugins\phoneticPunctuation\utils.py", line 70, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 23064 (Thread-8):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\phoneticPunctuation\globalPlugins\phoneticPunctuation\utils.py", line 70, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 24448 (Thread-7):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\phoneticPunctuation\globalPlugins\phoneticPunctuation\utils.py", line 70, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 23608 (Thread-6):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\phoneticPunctuation\globalPlugins\phoneticPunctuation\utils.py", line 70, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 23724 (Thread-5):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\phoneticPunctuation\globalPlugins\phoneticPunctuation\utils.py", line 70, in run
    func, args, kargs = self.tasks.get()
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 11920 (SynthDetector_0):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "concurrent\futures\thread.pyc", line 81, in _worker

Python stack for thread 15380 (globalPlugins.rdAccess.directoryChanges.DirectoryWatcher):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "hwIo\ioThread.pyc", line 264, in run

Python stack for thread 8704 (addons.rdAccess.lib.ioThreadEx.IoThreadEx):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "hwIo\ioThread.pyc", line 264, in run

Python stack for thread 21840 (Thread-4):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "C:\Users\konst\AppData\Roaming\nvda\addons\tonysEnhancements\globalPlugins\tonysEnhancements\__init__.py", line 751, in run
    tones.beep(150, 10, left=25, right=25)
  File "tones.pyc", line 90, in beep
  File "nvwave.pyc", line 358, in feed

Python stack for thread 12560 (winInputHook):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "winInputHook.pyc", line 108, in hookThreadFunc

Python stack for thread 26640 (UIAHandler.UIAHandler.MTAThread):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "UIAHandler\__init__.pyc", line 554, in MTAThreadFunc
  File "queue.pyc", line 171, in get
  File "threading.pyc", line 327, in wait

Python stack for thread 27308 (ThreadPoolExecutor-0_0):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "threading.pyc", line 982, in run
  File "concurrent\futures\thread.pyc", line 81, in _worker

Python stack for thread 10180 (hwIo.ioThread.IoThread):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "hwIo\ioThread.pyc", line 264, in run

Python stack for thread 24624 (ScheduleThread):
  File "threading.pyc", line 1002, in _bootstrap
  File "threading.pyc", line 1045, in _bootstrap_inner
  File "utils\schedule.pyc", line 90, in run

Python stack for thread 22924 (MainThread):
  File "nvda.pyw", line 309, in <module>
  File "core.pyc", line 1049, in main
  File "wx\core.pyc", line 2262, in MainLoop
  File "core.pyc", line 992, in Notify
  File "queueHandler.pyc", line 102, in pumpAll
  File "queueHandler.pyc", line 67, in flushQueue
  File "speech\manager.pyc", line 714, in _handleIndex
  File "speech\manager.pyc", line 435, in _pushNextSpeech
  File "synthDrivers\sapi5.pyc", line 553, in speak

It seems that a call to WavePlayer.sync was stuck, blocking the speak call and the main thread.

@jcsteh Do you have any idea about why this could happen?

Is there some way to inspect the C++ code via dump files, or do I have to insert log outputs in the C++ code?

@jcsteh
Contributor

jcsteh commented Jun 22, 2025

I'm not super familiar with how the new sapi5 code works. Generally, any synth calls on the main thread should never block; any call that could block should be handled on a background thread. That said, I'm guessing this doesn't normally block? Otherwise, we'd be seeing a lot more problems than this.

As to why sync would be hanging, I'm not sure. It suggests the stream hangs without being stopped and never reaches the number of frames we sent, but also never throws an error. I can't fathom how that could happen.

This does seem to be unique to SAPI5, so that leads me to wonder what is different about SAPI5. One thing it does that no other drivers currently do is call feed() with null data. The code supposedly supports this, but I wonder whether I didn't account for something. I'll think on that.

As for debugging, you can use WinDBG to break in and inspect the stack, etc. or save a minidump for later inspection. However, this can be really tricky for screen reader users, since your screen reader is obviously unusable in that state. I can sometimes manage it by firing up Narrator as well, but in-process injection can make things really unstable even if you try to use another screen reader. If you do manage to get a minidump, I can try to take a look.

@gexgd0419
Contributor Author

The problem is that I couldn't reproduce this, and this log is from a user's report.

I changed the SAPI5 code, so that the audio data is sent to NVDA's code, instead of directly to the speaker via WinMM.

SAPI5 automatically creates a new thread to run the voice engine (since it can speak asynchronously), so NVDA will receive the audio data on the new thread. The engine may also generate event notifications, which are usually routed to the main loop, but I chose to use ISpNotifySink to receive the notifications on the same thread as the notification sender.

So the code that feeds the audio to the WavePlayer, and the EndStream notification that calls WavePlayer.idle to finalize the audio, usually run on the thread created by SAPI5. But when, for example, the speech is being cancelled, WavePlayer.stop will be called on the main thread.

I'm not sure whether WavePlayer is thread-safe enough, or at least safe to call stop() on another thread. I tried adding some locks, but the issue wasn't fixed.

I also think that getting stuck in sync is not normal. In fact, I cannot prove that this is what happened; it is just my guess. The log shows that NVDA was frozen, and the call stack of one particular thread always shows this:

Python stack for thread 11688 (Dummy-15):
  File "comtypes\_comobject.pyc", line 176, in call_without_this
  File "synthDrivers\sapi5.pyc", line 167, in ISpNotifySink_Notify
  File "synthDrivers\sapi5.pyc", line 204, in EndStream
  File "nvwave.pyc", line 371, in idle
  File "nvwave.pyc", line 367, in sync

So I would assume that this thread was stuck in the sync call.

The code in EndStream looks like this:

def EndStream(self, streamNum: int, pos: int):
	synth = self.synthRef()
	# Flush the stream and get the remaining data.
	synth.sonicStream.flush()
	audioData = synth.sonicStream.readShort()
	synth.player.feed(audioData, len(audioData) * 2)
	synth.player.idle()
	# trigger all untriggered bookmarks
	if streamNum in synth._streamBookmarks:
		for bookmark in synth._streamBookmarks[streamNum]:
			synthIndexReached.notify(synth=synth, index=bookmark)
		del synth._streamBookmarks[streamNum]
	synth.isSpeaking = False
	synthDoneSpeaking.notify(synth=synth)

It calls feed before calling idle. But it is possible to feed zero bytes, in case there's no leftover data in sonicStream.

And if the call to idle hangs, the thread that SAPI5 created to run the voice engine hangs with it. The next time the main thread wants to speak something, SAPI5 makes it wait for the voice engine thread, even if the asynchronous flag is passed.

I'm not sure if it's possible to let the user get a dump file when NVDA freezes. It might help if NVDA, in addition to dumping the Python call stacks of all threads, could also generate a dump file that the user can submit.

@seanbudd seanbudd added the merge-early Merge Early in a developer cycle label Jun 23, 2025
@jcsteh
Contributor

jcsteh commented Jun 23, 2025

I'm not sure whether WavePlayer is thread-safe enough, or at least safe to call stop() on another thread.

It is designed to support calling feed on one thread and stop on another without the need for locks. That is necessary in order for stop to instantaneously interrupt audio and every synth driver relies on this. It is not safe to call feed on multiple threads at once, but nothing should be doing that.

I also think that getting stuck in sync is not normal.

If the user pauses audio (e.g. with the shift key), you will get stuck in sync. But that should resume as soon as the user stops audio (e.g. any other key press). Basically, sync will block until all sent audio has been played, unless it is stopped from another thread during the call. Obviously, something is going wrong there, but I don't know what, and it's going to be nearly impossible to figure out without being able to reproduce it.
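
The blocking contract described here (sync waits until all fed audio has played, unless stop is called from another thread) can be modelled with a toy class. This is only a sketch of the behaviour as described: ToyPlayer and mark_played are hypothetical, and it uses a lock and condition variable for simplicity, whereas the real WavePlayer is said above to work without locks.

```python
import threading


class ToyPlayer:
    """Toy model of the feed/sync/stop contract; not the real WavePlayer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0  # frames fed but not yet played
        self._stopped = False

    def feed(self, frames):
        with self._cond:
            self._stopped = False
            self._pending += frames

    def mark_played(self, frames):
        # Simulates the audio device consuming frames.
        with self._cond:
            self._pending = max(0, self._pending - frames)
            self._cond.notify_all()

    def stop(self):
        # May be called from a different thread than feed/sync.
        with self._cond:
            self._pending = 0
            self._stopped = True
            self._cond.notify_all()

    def sync(self, timeout=None):
        # Returns True once playback finished or stop() was called, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._pending == 0 or self._stopped, timeout
            )
```

In this model, a paused device corresponds to mark_played never being called: sync then waits until some other thread calls stop, which matches the pause-then-keypress behaviour described above.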

I'm not sure if it's possible to let the user get a dump file when NVDA freezes.

Maybe. We do have code to generate a dump when NVDA crashes. However, generating a dump within the same process can be risky. There can also be a lot of temporary freezes, so we'd want to do this with an advanced setting or something to avoid cluttering up the user's system (and potentially destabilising NVDA) with lots of dumps.

@gexgd0419 gexgd0419 changed the base branch from master to beta June 29, 2025 14:58
@gexgd0419 gexgd0419 changed the title from "Attempt to fix SAPI5" to "Fix SAPI5" Jun 30, 2025
@gexgd0419 gexgd0419 marked this pull request as ready for review June 30, 2025 11:26
@gexgd0419 gexgd0419 requested a review from a team as a code owner June 30, 2025 11:26
@gexgd0419 gexgd0419 requested a review from seanbudd June 30, 2025 11:26
@gexgd0419 gexgd0419 marked this pull request as draft June 30, 2025 12:14
@gexgd0419
Contributor Author

@jcsteh Since only one utterance can be spoken at a time, will cancel be called before every utterance? What should happen if speak is called multiple times without cancel in the middle? Should the utterances be queued and played one by one, or should only the last utterance be spoken?

Also please give me some idea about when exactly should WavePlayer.idle be used. If it should be used when no more audio data for the current utterance will be fed, what will happen when the next utterance arrives? I've noticed that without calling feed, sync or idle, the callback functions will not work, so I'll have to use sync or idle somewhere.

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

Since only one utterance can be spoken at a time, will cancel be called before every utterance?

NVDA will call cancel only if speech is to be interrupted. That could be because the user pressed a key or because of an app change or similar.

What should happen if speak is called multiple times without cancel in the middle? Should the utterances be queued and played one by one, or should only the last utterance be spoken?

Queued. When cancel is called, stop audio and purge the queue.

Also please give me some idea about when exactly should WavePlayer.idle be used. If it should be used when no more audio data for the current utterance will be fed,

Ideally, you would call idle when there is no more data for the current utterance and there are no more utterances in the queue. If you can't do that for whatever reason, calling it when there is no more data for the current utterance will be fine.

what will happen when the next utterance arrives?

idle will block until the last utterance is finished. At that point, the synth will be ready with the next utterance, so it will call feed, at which point audio will resume immediately.

I've noticed that without calling feed, sync or idle, the callback functions will not work, so I'll have to use sync or idle somewhere.

Correct. That's why we need idle (which calls sync): for when the last part of an utterance is spoken and there is no more data to speak yet.

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

Btw, when I say you need a queue, it's fine if SAPI provides its own queue. You just need a way to purge the queue (or at least ignore certain entries) if cancel is called.
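
One common way to "ignore certain entries" when a queue cannot be purged directly is to tag each entry with a generation number and bump that number on cancel. The sketch below shows the general idea only; CancelToken and its methods are hypothetical and are not taken from NVDA or SAPI.

```python
class CancelToken:
    """Tags queue entries so that entries queued before a cancel can be skipped."""

    def __init__(self):
        self._generation = 0

    def tag(self, text):
        # Remember which generation this entry was queued in.
        return (self._generation, text)

    def cancel(self):
        # Everything tagged before this call becomes stale.
        self._generation += 1

    def is_current(self, tagged):
        return tagged[0] == self._generation
```

A consumer would check is_current before speaking an entry and drop any entry for which it returns False.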

@gexgd0419
Contributor Author

It's just weird that idle could block indefinitely. I tried to address this issue by calling WavePlayer.stop before speak when idle is in progress, but now the end of the speech gets cut off.

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

As a test, can you try feeding a small chunk of silence (even just a few bytes) when you would normally call feed with None? That's obviously not practical in the real code, but I wonder if it'll fix the hang. If it does, that at least narrows it down to that case, which should make it easier to debug. That's the only difference I can think of between the way the sapi5 code uses nvwave vs everything else.
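
For the experiment suggested here, a chunk of silence is easy to build. The helper below is a minimal sketch assuming signed 16-bit PCM, where zero-valued samples are silent (this does not hold for 8-bit unsigned PCM); make_silence and its default parameters are hypothetical.

```python
def make_silence(duration_ms, sample_rate=22050, channels=1, bytes_per_sample=2):
    """Return zero-filled PCM bytes covering roughly duration_ms of audio."""
    frames = sample_rate * duration_ms // 1000
    # bytes(n) creates n zero bytes, which is silence for signed PCM samples.
    return bytes(frames * channels * bytes_per_sample)
```

The resulting bytes object could then be fed in place of None for the duration of the test.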

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

Ah, I think I was just able to reproduce this. I won't have time to look into this for a few days, but if I can find some time, I'll see if I can get a C++ debugger onto it to get a stack trace from WasapiPlayer.

@gexgd0419
Contributor Author

How did you reproduce that? I'm not a screen reader user actually, so please tell me something I could try.

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

Honestly, I was just tabbing around rapidly in NVDA's Speech settings dialog. I suspect it might be one of those tricky situations where I was able to reproduce it once and won't be able to easily reproduce it again, but I'll try when I can.

@gexgd0419
Contributor Author

What version did you use? The current stable release, or the latest snapshot in this pull request?

@jcsteh
Contributor

jcsteh commented Jul 4, 2025

It's an old alpha, so not an entirely valid test. Version: alpha-36837,8c9efe82 (2025.2.0.36837)

@gexgd0419
Contributor Author

What SAPI5 voice did you use?

I still could not easily reproduce the issue by tabbing around or mashing keyboard keys. I'm not sure whether this has to do with a specific voice, but the issue report says that it can happen with any SAPI5 voice.

I really hope that there is some easy way for users to capture a dump file for NVDA at any moment. I cannot imagine what a screen reader user can do when the screen reader becomes frozen, though. Theoretically, there are some tools that can create a dump file of a specific process, but when NVDA is frozen, further operation becomes difficult, as NVDA has mouse and keyboard hooks installed.

Even if there's a dump file, you need debug symbols to analyze it, so how can I get the debug symbols or PDB files for a particular version?

Or maybe, outputting some log lines inside the C++ code may be helpful. So how can I log things to the NVDA log inside C++ code?

@jcsteh
Contributor

jcsteh commented Jul 5, 2025

What SAPI5 voice did you use?

I was able to reproduce it twice: once with Microsoft David and once with a voice called BlastBay Richard. Now I can't reproduce it again at all despite trying for about 15 minutes. Unfortunately, although I did manage to get a debugger onto it at one point, I was running the wrong debugger (x64 instead of x86) and then I accidentally terminated the process instead of attaching the correct debugger.

I really hope that there is some easy way for users to capture a dump file for NVDA at any moment.

It's not easy to implement this, unfortunately.

Theoretically, there are some tools that can create a dump file of a specific process, but when NVDA is frozen, further operation becomes difficult, as NVDA has mouse and keyboard hooks installed.

What I do here is run the debugger as administrator so that NVDA can't hook it. Then I use Narrator to interact with the debugger. It's a very painful process and it only works some of the time, but if I can reproduce the problem enough times, I can usually get what I need.

Even if there's a dump file, you need debug symbols to analyze it, so how can I get the debug symbols or PDB files for a particular version?

There used to be a debug symbol server where symbols for all releases and snapshots were uploaded. However, this article hasn't been updated and I don't know where that symbol server is located now or whether it still exists. If you're debugging locally, your source copy will include pdb files and you don't need the symbol server.

Or maybe, outputting some log lines inside the C++ code may be helpful. So how can I log things to the NVDA log inside C++ code?

LOG_DEBUGWARNING, LOG_ERROR, etc.

@gexgd0419
Contributor Author

So now one user reported that the build at commit 77f6702 (Wait for cancellation to complete) works for them. Commits after that are mostly comments, docstrings, and small refactorings.

@cary-rowen
Contributor

cc @seanbudd
Does this fix need to be postponed until 2026.1? Is it possible to do it earlier? Which APIs are considered breaking changes?

@seanbudd
Member

@cary-rowen - the concern is not API-breaking changes but that we want sufficient testing, hence the merge-early label.

@seanbudd seanbudd requested a review from Copilot July 24, 2025 02:00
@seanbudd seanbudd marked this pull request as draft July 24, 2025 02:01
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes critical deadlock and startup issues in the SAPI5 speech synthesizer driver. The main problem was that SAPI5's audio thread would sometimes wait for WavePlayer.idle(), causing SpVoice.Speak() to block and freeze NVDA completely. Additionally, SAPI5 Eloquence voices would fail to load on subsequent NVDA launches.

Key changes include:

  • Implementing a dedicated speak thread to handle speech requests asynchronously and prevent deadlocks
  • Replacing the legacy audio streaming system with a modern ISpAudio implementation for better format negotiation
  • Refactoring bookmark management to use a simpler queue-based approach

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File | Description
user_docs/en/changes.md | Added changelog entries for the two fixed issues
source/synthDrivers/sapi5.py | Complete refactor of audio handling, threading, and event management to fix deadlocks and improve reliability
source/synthDrivers/sapi4.py | Added missing return type declaration for CoTaskMemAlloc function
Comments suppressed due to low confidence (1)

source/synthDrivers/sapi5.py:203

  • The class name 'SynthDriverAudioStream' is misleading since it now implements ISpAudio, ISpEventSource, and ISpEventSink interfaces, not just audio streaming. Consider renaming to 'SynthDriverAudio' to better reflect its expanded responsibilities.
class SynthDriverAudioStream(COMObject):

@gexgd0419 gexgd0419 marked this pull request as ready for review July 27, 2025 07:06
@seanbudd seanbudd modified the milestones: 2026.1, 2025.2, 2025.3 Jul 29, 2025
@seanbudd seanbudd enabled auto-merge (squash) July 29, 2025 03:45
@seanbudd seanbudd merged commit d2b34e4 into nvaccess:master Jul 29, 2025
22 checks passed
@gexgd0419 gexgd0419 deleted the sapi5-fix branch July 29, 2025 04:55
Labels
merge-early Merge Early in a developer cycle
Development

Successfully merging this pull request may close these issues.

nvda switches to windows one core voices after restart of program and or reboot.
SAPI5 voices cause NVDA to freeze in versions 2025.1->
6 participants