[R] arrow_table: as.data.frame() sometimes returns a tbl and sometimes a data.frame #34775

@daattali

Description

Describe the bug, including details regarding any error messages, version, and platform.

The documentation for arrow_table() says that you can call as.data.frame() on an arrow table object to convert it to a dataframe. It seems like the behaviour is inconsistent. Example:

> class(as.data.frame(arrow::arrow_table(name = "1", mtcars)))
[1] "tbl_df"     "tbl"        "data.frame"
> class(as.data.frame(arrow::arrow_table(mtcars, name = "1")))
[1] "data.frame"

I created the same arrow_table in both cases, changing only the order of the arguments, yet as.data.frame() returned a different object type.

I would expect a base R data.frame, since as.data.frame() is a base R generic that is generally assumed to convert to actual data.frame objects, not to tibbles. Right now, if I want to guarantee a base R data.frame, I need to call as.data.frame(as.data.frame(table)). Having to convert with the same function twice is extremely awkward and unintuitive. I strongly believe arrow should not conflate data.frame with tbl_df; doing so can cause errors for users.
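
For reference, the double-conversion workaround can be wrapped in a small helper. This is only a sketch: `to_base_df` is a hypothetical name, not part of arrow, and it assumes the arrow package is installed and that the second as.data.frame() call strips the tibble classes as described above.

```r
# Hypothetical helper (not part of arrow): always return a plain base R data.frame.
to_base_df <- function(x) {
  # The inner call converts the Arrow Table to an R object, which may be a tibble.
  # The outer call drops the tibble classes if they are present and is a no-op
  # when the inner result is already a plain data.frame.
  as.data.frame(as.data.frame(x))
}

# Same two tables as in the report; only the argument order differs.
tbl_a <- arrow::arrow_table(name = "1", mtcars)
tbl_b <- arrow::arrow_table(mtcars, name = "1")

class(to_base_df(tbl_a))
class(to_base_df(tbl_b))
```

This only hides the inconsistency on the caller's side; the request is still for as.data.frame() itself to return the same class regardless of argument order.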

Component(s)

R
