[R] RecordBatchReaderHead from ExecPlan with UDF cannot be read #33299

@asfimport

Description

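  library(arrow)
  library(dplyr, warn.conflicts = FALSE)  # provides %>%

  # Register a scalar UDF that multiplies an int32 column by 32
  register_scalar_function(
    "times_32",
    function(context, x) x * 32.0,
    int32(),
    float64(),
    auto_convert = TRUE
  )

  # Evaluate the UDF lazily, convert to a reader, take the head, then collect
  record_batch(a = 1:1000) %>%
    dplyr::mutate(b = times_32(a)) %>%
    as_record_batch_reader() %>%
    head(11) %>%
    as_arrow_table()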

# Error: NotImplemented: Call to R (resolve scalar user-defined function output data type) from a non-R thread from an unsupported context
# /arrow/cpp/src/arrow/compute/exec.cc:649  kernel_->signature->out_type().Resolve(kernel_ctx_, args.inputs)
# /arrow/cpp/src/arrow/compute/exec/expression.cc:602  executor->Init(&kernel_context, {kernel, types, options})
# /arrow/cpp/src/arrow/compute/exec/project_node.cc:91  ExecuteScalarExpression(simplified_expr, target, plan()->exec_context())
# /arrow/cpp/src/arrow/record_batch.cc:336  ReadNext(&batch)
# /arrow/cpp/src/arrow/record_batch.cc:350  ToRecordBatches()

It works fine if you don't call as_record_batch_reader() in the middle. Oddly, it also works fine if you add as_adq() (aka collapse()) after head() and before converting to a Table; that is, if you run the head() result through an ExecPlan again, it doesn't error.
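For comparison, here is a sketch of the two variants described above as working. It assumes the same UDF as in the reproducer, and that dplyr::collapse() on the reader returned by head() is equivalent to the internal as_adq(), as the description suggests. The object names are only for illustration, and the behaviour is taken from this report rather than verified independently.

```r
library(arrow)
library(dplyr, warn.conflicts = FALSE)  # provides %>%

# Same UDF as in the reproducer
register_scalar_function(
  "times_32",
  function(context, x) x * 32.0,
  int32(),
  float64(),
  auto_convert = TRUE
)

# Variant 1: no as_record_batch_reader() in the middle
without_reader <- record_batch(a = 1:1000) %>%
  dplyr::mutate(b = times_32(a)) %>%
  head(11) %>%
  as_arrow_table()

# Variant 2: keep the reader, but collapse() after head() so the result
# goes through an ExecPlan again before being collected
with_collapse <- record_batch(a = 1:1000) %>%
  dplyr::mutate(b = times_32(a)) %>%
  as_record_batch_reader() %>%
  head(11) %>%
  dplyr::collapse() %>%
  as_arrow_table()
```

If both run as reported, the only path that fails is reading the head() reader directly.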

Reporter: Neal Richardson / @nealrichardson
Assignee: Dewey Dunnington / @paleolimbot

Note: This issue was originally created as ARROW-18101. Please see the migration documentation for further details.
