Fix segmentation fault in callbacks #34

wangfu91 · 2025-08-24T12:06:53Z

This PR fixes #33, it refactors how event callback closures are managed in both the SpeechRecognizer and SpeechSynthesizer structs.

Instead of storing callback closures directly in these structs, a new internal CallbackBag struct is introduced and boxed, ensuring that callbacks reside at a stable heap address.

This change improves memory safety and allows the main structs to be moved freely without invalidating callback pointers.

Checklist:

Fix the callbacks in SpeechRecognizer & add tests
Fix SpeechSynthesizer & add tests
Fix PullAudioInputStream & add tests
Fix PushAudioOutputStream & add tests

Moved all event callback closures in SpeechRecognizer and SpeechSynthesizer into dedicated internal CallbackBag structs, boxed on the heap. This ensures the main structs can be freely moved while callbacks remain at a stable memory address, improving safety and FFI compatibility.

adambezecny · 2025-08-25T20:33:01Z

wow, you are productive :) let's address PR31 first, there are two pending items before I can merge. can you have a look? 32 should be easy then. le't s address this one afterwards

wangfu91 · 2025-08-26T15:15:24Z

let's address PR31 first, there are two pending items before I can merge. can you have a look?

Hi @adambezecny ,
I had a quick look at PR #31, but I didn’t see any review comments from you yet—perhaps they weren’t submitted? Could you double-check when you get a chance?

Introduced CallbackBag struct to encapsulate callback management in PushAudioOutputStream. Updated callback registration and invocation logic to use CallbackBag, improving separation of concerns and error handling for undefined callbacks.

adambezecny · 2025-08-26T18:37:59Z

yeap, sorry I did forget to submit review for pr 31, did it know. can you have a look? thanks

Introduced CallbackBag struct to encapsulate callback management in PullAudioInputStream. Updated callback registration and FFI callback functions to use CallbackBag, improving separation of concerns and safety when accessing callbacks.

Refactored the field name 'callbacks' to 'callback_bag' in PushAudioOutputStream, SpeechRecognizer, and SpeechSynthesizer structs for clarity and consistency. Updated all references and usages accordingly.

Moved common test utilities to a new common.rs module and updated integration_test.rs to use it, reducing code duplication. Added new test files for pull input, push output, recognizer, and synthesizer segfault scenarios to improve coverage and isolate specific audio stream behaviors.

Copilot

Pull Request Overview

This PR fixes segmentation faults in callback handling by refactoring how event callback closures are managed across multiple speech SDK components. Instead of storing callback closures directly in the main structs, a new internal CallbackBag struct is introduced and boxed to ensure callbacks remain at stable heap addresses when the main structs are moved.

Introduces CallbackBag structs for SpeechRecognizer, SpeechSynthesizer, PullAudioInputStream, and PushAudioOutputStream
Updates callback registration to use stable heap-allocated callback storage
Adds comprehensive test coverage for segmentation fault scenarios across all affected components

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
`src/speech/speech_recognizer.rs`	Refactors callback storage using boxed `CallbackBag` to prevent segfaults
`src/speech/speech_synthesizer.rs`	Implements callback bag pattern for synthesizer event callbacks
`src/audio/pull_audio_input_stream.rs`	Updates callback handling with stable heap allocation for input streams
`src/audio/push_audio_output_stream.rs`	Applies callback bag pattern to output stream callbacks
`tests/recognizer_segfault_test.rs`	Adds test case for speech recognizer callback stability
`tests/synthesizer_segfault_test.rs`	Adds test case for speech synthesizer callback stability
`tests/pull_input_segfault_test.rs`	Adds test case for pull input stream callback stability
`tests/push_output_segfault_test.rs`	Adds test case for push output stream callback stability
`tests/common.rs`	Introduces shared test utilities for speech recognition and synthesis
`tests/integration_test.rs`	Refactors existing tests to use new common utilities
`Cargo.toml`	Removes unused `rust-embed` dependency and cleans up formatting

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

tests/integration_test.rs

adambezecny · 2025-09-06T15:45:58Z

hi @wangfu91 give me more time to review please, I am extremely busy lately, really no time for this. I will try to start reviewing in 1-2 weeks from now. sorry :(

wangfu91 · 2025-09-06T15:54:59Z

Hi @adambezecny ,
Thanks for the update!
Totally understand. I'm in no hurry for this PR to be merged, so feel free to prioritize other items as needed. :)

adambezecny · 2025-09-13T17:26:28Z

hi @wangfu91

probably will have not much time to review this week but I already started to be looking into it :)

one fundamental question so far: why to create separate boxed struct for callbacks rather then simply pinning callback attributes? please bear in mind my rust is very rusty, not actively using it for long time, so pardon me if the question is stupid. But I think by pinning the inner callback attributes you could save yourself from such massive refactoring. Also if I remember correctly boxing comes with cost since it forces data to be allocated on heap rather than stack.

I am also struggling to understand how boxing can prevent dangling pointer. Boxed data can still be moved right, leaving the C pointer not valid any more. Is this somehow enforcing pinning implicitly?

wangfu91 · 2025-09-17T14:07:09Z

Hi @adambezecny

Thank you for taking the time to look at this! These are excellent questions. Let me try to explain the reasoning behind this approach.

why to create separate boxed struct for callbacks rather then simply pinning callback attributes?

This is a great question. The main reason is that pinning in Rust doesn't work on individual fields of a movable struct. Pinning guarantees that the memory of the pinned object will not be moved. If we wanted to make the callback's address stable, we would have to pin the entire SpeechRecognizer struct, this would affect the library's public API, every user would be forced to put the SpeechRecognizer in a Box or Pin themselves. This leaks the implementation detail and makes the library much harder to use.

The CallbackBag approach is an internal implementation detail that achieves the same safety guarantee without forcing any complexity onto the user.

Also if I remember correctly boxing comes with cost since it forces data to be allocated on heap rather than stack.

Yes, Box forces a heap allocation, which has a small performance cost compared to the stack, however, in this context the cost is negligible and well worth the safety it provides.

Boxed data can still be moved right, leaving the C pointer not valid anymore. Is this somehow enforcing pinning implicitly?

You are right that a Box itself can be moved. However, it's crucial to distinguish between moving the Box<T> (which is a smart pointer) and moving the T(the data the pointer points to).

The Box<T> is a smart pointer that lives on the stack, it's just a single pointer-sized value.
The T is the actual data, which Box allocates on the heap.

When we move a Box<T>, we are only moving the smart pointer on the stack, the data T on the heap does not move, its memory address remains stable, so we can pass this stable heap address as the context pointer to the C library.

This way, the SpeechRecognizer struct itself can be moved by the library users freely, but the callbacks will remain at a fixed memory address on the heap.

adambezecny

lgtm

adambezecny · 2025-09-15T05:42:12Z

src/speech/speech_recognizer.rs

+/// By creating a separate struct, and then boxing this struct inside our SpeechRecognizer,
+/// we can ensure the SpeechRecognizer itself can be moved freely by end users,
+/// and the callbacks will remain at a fixed memory address on the heap.
+struct CallbackBag {


@wangfu91
one fundamental question so far: why to create separate boxed struct for callbacks rather then simply pinning callback attributes? please bear in mind my rust is very rusty, not actively using it for long time, so pardon me if the question is stupid. But I think by pinning the inner callback attributes you could save yourself from such massive refactoring. Also if I remember correctly boxing comes with cost since it forces data to be allocated on heap rather than stack.

I am also struggling to understand how boxing can prevent dangling pointer. Boxed data can still be moved right, leaving the C pointer not valid any more. Is this somehow enforcing pinning implicitly?

adambezecny · 2025-09-20T19:28:47Z

this is excellent work, I am kind of speechless. Thank you for all the effort you are putting into this. Thanks to all your contributions this library got so much better after I stopped actively contributing to it on regular basis. Really appreciated! thanks :-)

adambezecny · 2025-09-21T09:29:31Z

published v1.3.0 with latest code

wangfu91 · 2025-09-23T13:14:21Z

Hi Adam,
Thank you so much for the kind words! It means a lot coming from you. I’ve really enjoyed working on this library and building on the great foundation you created. I’m glad my contributions have been helpful, and I’m looking forward to continuing to improve it. :)

wangfu91 added 2 commits August 24, 2025 18:23

Add STT and TTS segmentation fault tests.

d80fb06

wangfu91 had a problem deploying to integration August 24, 2025 12:11 — with GitHub Actions Error

wangfu91 had a problem deploying to integration August 24, 2025 12:11 — with GitHub Actions Failure

wangfu91 had a problem deploying to integration August 24, 2025 12:11 — with GitHub Actions Error

wangfu91 temporarily deployed to integration August 27, 2025 14:05 — with GitHub Actions Inactive

wangfu91 had a problem deploying to integration August 27, 2025 14:05 — with GitHub Actions Failure

wangfu91 had a problem deploying to integration August 27, 2025 14:05 — with GitHub Actions Error

wangfu91 temporarily deployed to integration August 27, 2025 14:44 — with GitHub Actions Inactive

wangfu91 had a problem deploying to integration August 27, 2025 14:44 — with GitHub Actions Error

wangfu91 had a problem deploying to integration August 27, 2025 14:44 — with GitHub Actions Failure

wangfu91 added 3 commits August 27, 2025 22:46

Rename callbacks field to callback_bag

a9303d1

Refactored the field name 'callbacks' to 'callback_bag' in PushAudioOutputStream, SpeechRecognizer, and SpeechSynthesizer structs for clarity and consistency. Updated all references and usages accordingly.

Cleanup.

f8c8380

wangfu91 marked this pull request as ready for review August 27, 2025 15:28

Copilot AI review requested due to automatic review settings August 27, 2025 15:28

Copilot AI reviewed Aug 27, 2025

View reviewed changes

wangfu91 had a problem deploying to integration August 27, 2025 15:31 — with GitHub Actions Failure

wangfu91 had a problem deploying to integration August 27, 2025 15:31 — with GitHub Actions Error

Merge branch 'main' into fix-segment-fault-in-callbacks

ff09ba7

wangfu91 temporarily deployed to integration August 29, 2025 13:10 — with GitHub Actions Inactive

wangfu91 had a problem deploying to integration August 29, 2025 13:10 — with GitHub Actions Error

wangfu91 had a problem deploying to integration August 29, 2025 13:10 — with GitHub Actions Failure

wangfu91 commented Aug 29, 2025

View reviewed changes

tests/integration_test.rs Show resolved Hide resolved

adambezecny approved these changes Sep 20, 2025

View reviewed changes

adambezecny merged commit daf839d into jabber-tools:main Sep 20, 2025
8 of 10 checks passed

wangfu91 deleted the fix-segment-fault-in-callbacks branch September 23, 2025 13:14

Fix segmentation fault in callbacks #34

Fix segmentation fault in callbacks #34

Uh oh!

Conversation

wangfu91 commented Aug 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adambezecny commented Aug 25, 2025

Uh oh!

wangfu91 commented Aug 26, 2025

Uh oh!

adambezecny commented Aug 26, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

adambezecny commented Sep 6, 2025

Uh oh!

wangfu91 commented Sep 6, 2025

Uh oh!

adambezecny commented Sep 13, 2025

Uh oh!

wangfu91 commented Sep 17, 2025

Uh oh!

adambezecny left a comment

Choose a reason for hiding this comment

Uh oh!

adambezecny Sep 15, 2025

Choose a reason for hiding this comment

Uh oh!

adambezecny commented Sep 20, 2025

Uh oh!

Uh oh!

adambezecny commented Sep 21, 2025

Uh oh!

wangfu91 commented Sep 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

wangfu91 commented Aug 24, 2025 •

edited

Loading