Open
Labels
Type: bug (Something isn't working)
Description
DictionaryProvider leaks memory when dictionaries with a duplicate encoding ID are added. Is this expected? Should the provider release the memory of the existing dictionary vector when it accepts another dictionary with the same encoding ID?
Sample code:

    "dictionaryProvider" should "not leak memory while adding dictionaries with duplicate encoding" in {
      val allocator: RootAllocator = new RootAllocator()
      val vector: ListVector = ListVector.empty("vector", allocator)
      val dictionaryVector1: ListVector = ListVector.empty("dict1", allocator)
      val dictionaryVector2: ListVector = ListVector.empty("dict2", allocator)

      val writer1: UnionListWriter = vector.getWriter
      writer1.allocate
      writer1.setValueCount(1)

      val dictWriter1: UnionListWriter = dictionaryVector1.getWriter
      dictWriter1.allocate
      dictWriter1.setValueCount(1)

      val dictWriter2: UnionListWriter = dictionaryVector2.getWriter
      dictWriter2.allocate
      dictWriter2.setValueCount(1)

      val dictionary1: Dictionary = new Dictionary(dictionaryVector1, new DictionaryEncoding(1L, false, None.orNull))
      val dictionary2: Dictionary = new Dictionary(dictionaryVector2, new DictionaryEncoding(1L, false, None.orNull))

      val provider = new DictionaryProvider.MapDictionaryProvider
      provider.put(dictionary1)
      provider.put(dictionary2)

      vector.clear()
      provider.getDictionaryIds.asScala.map(id => provider.lookup(id).getVector.clear())
      allocator.getAllocatedMemory shouldBe 0
    }

Reporter: Vimal Varghese
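
To make the question concrete, here is a minimal, library-free sketch of the pattern the sample above appears to exercise: a map-backed registry where a second `put` under the same ID replaces the first entry, so a later "clear everything in the registry" pass never reaches the replaced buffer. `Ledger`, `TrackedBuffer`, `Registry`, and `putReleasingPrevious` are hypothetical names invented for illustration. This is a model under that assumption, not Arrow's actual implementation.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an allocator: only counts outstanding bytes.
final class Ledger { var allocated: Long = 0L }

// Hypothetical stand-in for a vector that holds allocator memory until cleared.
final class TrackedBuffer(bytes: Long, ledger: Ledger) {
  private var held = bytes
  ledger.allocated += bytes
  def clear(): Unit = { ledger.allocated -= held; held = 0L }
}

// Hypothetical map-backed registry keyed by an encoding-style ID.
final class Registry {
  private val entries = mutable.Map.empty[Long, TrackedBuffer]

  // Plain overwrite: the replaced buffer is no longer reachable via the registry.
  def put(id: Long, buf: TrackedBuffer): Unit = entries.update(id, buf)

  // Overwrite that first releases the buffer being replaced (if it is a different one).
  def putReleasingPrevious(id: Long, buf: TrackedBuffer): Unit = {
    entries.get(id).filter(_ ne buf).foreach(_.clear())
    entries.update(id, buf)
  }

  // Clears every buffer still registered.
  def clearAll(): Unit = entries.values.foreach(_.clear())
}

object ReplaceDemo {
  // Registers two buffers under the same ID, clears the registry,
  // and returns the bytes still outstanding in the ledger.
  def remainingAfter(release: Boolean): Long = {
    val ledger = new Ledger
    val registry = new Registry
    val first = new TrackedBuffer(64L, ledger)
    val second = new TrackedBuffer(64L, ledger)
    if (release) {
      registry.putReleasingPrevious(1L, first)
      registry.putReleasingPrevious(1L, second)
    } else {
      registry.put(1L, first)
      registry.put(1L, second)
    }
    registry.clearAll()
    ledger.allocated
  }
}
```

In this model, `ReplaceDemo.remainingAfter(release = false)` leaves the first buffer's bytes outstanding, while `release = true` ends at zero. Until the provider's intended behavior is confirmed, the equivalent caller-side workaround in the sample would be to `lookup` the ID and clear the returned dictionary's vector before calling `put` with the replacement, or simply to clear `dictionaryVector1` directly, since the caller still holds that reference.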
Note: This issue was originally created as ARROW-16920. Please see the migration documentation for further details.