Increase robustness of LwsApiCall implementation #2134
base: develop
38a25f5 to 9f7ebd9
These are good improvements.
Do you have a method or setup for testing these failure-recovery scenarios, and could we add a test for them? It seems like it would be a lot of additional effort.
src/source/Signaling/LwsApiCalls.c
Outdated
    retValue = (INT32) lws_write(wsi, pLwsCallInfo->sendBuffer + LWS_PRE, remainingSize, LWS_WRITE_TEXT);
    if (retValue < 0) {
        DLOGW("Write failed with %d", retValue);
        CHK(FALSE, !STATUS_SUCCESS);
An actual status code should go here; I believe !STATUS_SUCCESS would get converted to 1, which is STATUS_NULL_ARG.
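To illustrate the point about `!STATUS_SUCCESS`, here is a minimal, standalone sketch. It assumes the values the comment describes (success is 0 and STATUS_NULL_ARG is 1); the `STATUS` typedef is re-declared locally, and `STATUS_EXAMPLE_WRITE_FAILED` and both helper functions are hypothetical names invented for this example, not SDK constants or APIs.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t STATUS;

/* Assumed values, mirroring what the review comment describes. */
#define STATUS_SUCCESS  ((STATUS) 0x00000000)
#define STATUS_NULL_ARG ((STATUS) 0x00000001)

/* Hypothetical dedicated code for a failed write; not an SDK constant. */
#define STATUS_EXAMPLE_WRITE_FAILED ((STATUS) 0x7fff0001)

/* What the reviewed line effectively reports: logical NOT of 0 is 1. */
STATUS statusFromNegation(void)
{
    assert(STATUS_SUCCESS == 0);
    return (STATUS) !STATUS_SUCCESS;
}

/* Alternative: map a negative write result to an explicit status code. */
STATUS statusFromWriteResult(int retValue)
{
    return retValue < 0 ? STATUS_EXAMPLE_WRITE_FAILED : STATUS_SUCCESS;
}
```

Under these assumptions, the negated value is indistinguishable from a null-argument error, which is why a dedicated status makes the failure easier to diagnose.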
    offset = ATOMIC_LOAD(&pSignalingClient->pOngoingCallInfo->sendOffset);
    size = ATOMIC_LOAD(&pSignalingClient->pOngoingCallInfo->sendBufferSize);

    result = (SERVICE_CALL_RESULT) ATOMIC_LOAD(&pSignalingClient->messageResult);

    if (offset != size && result == SERVICE_CALL_RESULT_NOT_SET) {
        CHK_STATUS(CVAR_WAIT(pSignalingClient->sendCvar, pSignalingClient->sendLock, SIGNALING_SEND_TIMEOUT));
        retryCount++;
Could we add a debug log (DLOGD) when the retry count is incremented?
If CVAR_WAIT returned non-success (e.g. the operation timed out), CHK_STATUS would goto Cleanup and bypass the retry count. That doesn't seem intended, since there's a "// Check if we timed out" down below.
@sirknightj It looks a bit odd, but here is what's happening:
- If CVAR_WAIT returns on timeout, we exit anyway.
- Only if it returned early do we increment retryCount and wait again if offset != size; otherwise we stop iterating and continue with the rest of the code path.
So retryCount basically makes it retry if the wake-up came within the timeout.
Let me know if you have a better suggestion for handling sendCvar wake-ups without just giving up entirely.
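The behaviour described above can be sketched in isolation. This is not the PR's code: the condition-variable wait is replaced by a scripted stand-in so the control flow can be followed without threads, and `SendState`, `scriptedWait`, `awaitSendCompletion`, and the return codes are all hypothetical names. How the real code reacts once the retry limit is hit is an assumption here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RETRY_COUNT 3

typedef enum { WAIT_WOKEN = 0, WAIT_TIMED_OUT = 1 } WaitOutcome;

/* Hypothetical stand-in for the signaling client's send state. */
typedef struct {
    size_t sendOffset;          /* bytes sent so far */
    size_t sendBufferSize;      /* total bytes to send */
    const WaitOutcome *script;  /* scripted outcome of each wait */
    size_t bytesPerWake;        /* progress made before each early wake-up */
    size_t waits;               /* number of waits performed */
} SendState;

/* Stand-in for CVAR_WAIT: replays the script instead of blocking. */
static WaitOutcome scriptedWait(SendState *s)
{
    WaitOutcome outcome = s->script[s->waits++];
    if (outcome == WAIT_WOKEN) {
        s->sendOffset += s->bytesPerWake;
        if (s->sendOffset > s->sendBufferSize) {
            s->sendOffset = s->sendBufferSize;
        }
    }
    return outcome;
}

/* Returns 0 when the buffer is fully sent, -1 on timeout,
 * -2 when the retry budget is exhausted (assumed behaviour). */
int awaitSendCompletion(SendState *s)
{
    unsigned retryCount = 0;
    assert(s != NULL && s->script != NULL);
    while (s->sendOffset != s->sendBufferSize) {
        if (retryCount >= MAX_RETRY_COUNT) {
            return -2;
        }
        if (scriptedWait(s) == WAIT_TIMED_OUT) {
            return -1;          /* a timeout exits immediately */
        }
        retryCount++;           /* early wake-up: count it and re-check */
    }
    return 0;
}
```

A timeout ends the loop at once, while an early wake-up only consumes one retry and re-checks whether the offset has reached the buffer size.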
    connectInfo.port = SIGNALING_DEFAULT_SSL_PORT;
    connectInfo.alpn = "http/1.1";     // Force HTTP/1.1 only
    connectInfo.protocol = "http/1.1"; // Force HTTP/1.1 protocol
What is the purpose of the H2 flag if HTTP/1.1 is forced? Is it for latency/performance optimization?
It seems that HTTP/2 should work, judging from this quick test:
curl -sI https://m-xxxxxxxx.kinesisvideo.us-west-2.amazonaws.com -o/dev/null -w '%{http_version}\n'
2
curl -sI https://v-xxxxxxxx.kinesisvideo.us-west-2.amazonaws.com -o/dev/null -w '%{http_version}\n'
2
I added this to make the WebSocket connection work more reliably on ESP platforms. It reduces memory consumption a bit and works much more reliably. BTW, do we really gain much from HTTP/2 for our use case?
Now that I have a lot of optimisations in place, I can test HTTP/2 with ESP32 and revert the change.
@@ -1907,6 +1948,9 @@ STATUS writeLwsData(PSignalingClient pSignalingClient, BOOL awaitForResponse)
    SIZE_T offset, size;
    SERVICE_CALL_RESULT result;

    UINT32 retryCount = 0;
    const UINT32 MAX_RETRY_COUNT = 3;
I think this should be a #define (very, very minor memory reduction).
- Handle partial writes by sending data in multiple iterations
- Use retries when message send fails
- Track and handle message receive if in parts using `receiveMessage` var
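As a rough illustration of the first item, the sketch below sends a buffer in several iterations when the write call accepts fewer bytes than requested. It is not the PR's implementation: `MockSink`, `mockWrite`, and `sendAll` are hypothetical names, and the mock only imitates a write call that returns the number of bytes accepted or a negative value on failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical transport that accepts a limited number of bytes per call. */
typedef struct {
    char received[64];
    size_t receivedLen;
    size_t maxChunk;      /* max bytes accepted per call */
    int failAfterCalls;   /* fail once this many calls succeeded; negative = never */
    int calls;            /* successful calls so far */
} MockSink;

/* Stand-in for a socket-style write: returns bytes accepted, or -1 on failure. */
static int mockWrite(MockSink *sink, const char *data, size_t len)
{
    size_t n, room;
    if (sink->failAfterCalls >= 0 && sink->calls >= sink->failAfterCalls) {
        return -1;
    }
    sink->calls++;
    n = len < sink->maxChunk ? len : sink->maxChunk;
    room = sizeof(sink->received) - sink->receivedLen;
    if (n > room) {
        n = room;
    }
    memcpy(sink->received + sink->receivedLen, data, n);
    sink->receivedLen += n;
    return (int) n;
}

/* Sends buf[*pOffset .. size) in as many iterations as needed.
 * Returns 0 on success, -1 on failure; *pOffset records the progress,
 * so a caller could resume or retry from where the write stopped. */
int sendAll(MockSink *sink, const char *buf, size_t size, size_t *pOffset)
{
    assert(sink != NULL && buf != NULL && pOffset != NULL);
    while (*pOffset < size) {
        int written = mockWrite(sink, buf + *pOffset, size - *pOffset);
        if (written <= 0) {
            return -1;    /* failure or no progress: stop instead of spinning */
        }
        *pOffset += (size_t) written;
    }
    return 0;
}
```

Keeping the offset outside the loop is what lets a later retry continue from the last byte sent rather than resending the whole message.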
9f7ebd9 to 152ba5c
@sirknightj I can reproduce the re-attempts quite often on ESP platforms, hence the change. I will check whether it's straightforward to add a test.