Skip to content

Conversation

@hermanschaaf
Copy link
Member

@hermanschaaf hermanschaaf commented Apr 18, 2023

This fixes a memory leak detected by the file and s3 destination tests, and refactors the retain/release logic a little bit.

Quoting the Go Arrow docs:

If you send an object over a channel, you must call Retain before sending it as the receiver is assumed to own the object and will later call Release when it no longer needs the object.

I think we should call Retain as close as possible to where a record is sent over a channel.

@hermanschaaf
Copy link
Member Author

Ah wait a minute, I was using an outdated base branch 😮‍💨 Let me double-check this

pks[key] = struct{}{}
res = append(res, r)
continue
case reported:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reported variable was not being used

@github-actions
Copy link

github-actions bot commented Apr 18, 2023

⏱️ Benchmark results

Comparing with 591502f

  • DefaultConcurrencyDFS-2 resources/s: 10,807 ⬇️ 2.79% decrease vs. 591502f
  • DefaultConcurrencyRoundRobin-2 resources/s: 12,412 ⬆️ 0.31% increase vs. 591502f
  • Glob-2 ns/op: 169.9 ⬇️ 66.75% decrease vs. 591502f
  • TablesWithChildrenDFS-2 resources/s: 30,971 ⬆️ 22.99% increase vs. 591502f
  • TablesWithChildrenRoundRobin-2 resources/s: 28,202 ⬆️ 16.33% increase vs. 591502f
  • TablesWithRateLimitingDFS-2 resources/s: 28.5 ⬆️ 1.02% increase vs. 591502f
  • TablesWithRateLimitingRoundRobin-2 resources/s: 830.4 ⬆️ 4.11% increase vs. 591502f
  • BufferedScanner-2 ns/op: 9.282 ⬇️ 32.84% decrease vs. 591502f
  • LogReader-2 ns/op: 30.56 ⬇️ 28.60% decrease vs. 591502f

@codecov
Copy link

codecov bot commented Apr 18, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (591502f) 46.71% compared to head (1a667c9) 46.71%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #795   +/-   ##
=======================================
  Coverage   46.71%   46.71%           
=======================================
  Files          76       76           
  Lines        7832     7832           
=======================================
  Hits         3659     3659           
  Misses       3682     3682           
  Partials      491      491           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@hermanschaaf hermanschaaf changed the title fix: Call Retain in writeAll fix: Arrow Retain and Release fixes Apr 18, 2023
@github-actions github-actions bot added fix and removed fix labels Apr 18, 2023
var reported bool
for _, r := range resources {
if r.NumRows() > 1 {
panic(fmt.Sprintf("record with more than 1 row: %d", r.NumRows()))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also added this panic because this function's logic will break down if a record ever contains more than one row.

I wonder if we may want to add a special type that guarantees we're dealing with a arrow.Record with only a single row.

Copy link
Contributor

@yevgenypats yevgenypats left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had another counter comment :)

MaxRows: 1,
}
record1 := testdata.GenTestData(mem, table.ToArrowSchema(), opts)[0]
record1.Retain()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this is removed? I specifically called Retain here because unlike in the server we do use this after call to writeOne or writeAll for compare/testing purposes. This is also why I called defer record1.Release afterwards.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically just a refactor: no change in functionality. Rather than have many retains, I think it's better to have it only in writeAll (and by extension writeOne), where it's clear that the resource is being sent over a channel. Here, just by looking at the call to writeOne, you can't really know whether it's going to be sent over a channel or not.

The record is still available for comparison until the end of this function because the release is deferred.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is crucial for it to be here. I can explain :) maybe let's sync tmrw morning.

@hermanschaaf
Copy link
Member Author

@yevgenypats Thanks for catching up with me earlier; I've now updated it as we discussed.

The TL;DR is that we'll keep the extra Retain calls in the tests to make it clear that we're keeping the records around for later comparison; normally it is still the plugin's responsibility to call Release.

@hermanschaaf hermanschaaf merged commit a893db6 into main Apr 19, 2023
@hermanschaaf hermanschaaf deleted the fix-memory-leak branch April 19, 2023 09:54
kodiakhq bot pushed a commit that referenced this pull request Apr 19, 2023
🤖 I have created a release *beep* *boop*
---


## [2.3.2](v2.3.1...v2.3.2) (2023-04-19)


### Bug Fixes

* Arrow Retain and Release fixes ([#795](#795)) ([a893db6](a893db6))
* Disallow null character in strings per utf8 spec ([#797](#797)) ([591502f](591502f))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
kodiakhq bot pushed a commit to cloudquery/filetypes that referenced this pull request Apr 19, 2023
This is mainly to fix the memory leak fixed in the plugin-sdk update cloudquery/plugin-sdk#795
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants