[CI][Python] Conda Python 3.10 jobs fail with UnicodeDecodeError due to gdb issue #46343

@raulcd

Description

Describe the bug, including details regarding any error messages, version, and platform.

The CI job AMD64 Conda Python 3.10 Without Pandas has started failing in test_gdb.py with several errors of the form UnicodeDecodeError: 'utf-8' codec can't decode byte ...: invalid continuation byte.
Examples of job failures from different PRs and main:

The log is quite long, so I am including only one of the test failures. There are several, so please check the output of the jobs:

______________________________ test_scalars_heap _______________________________

gdb_arrow = <pyarrow.tests.test_gdb.GdbSession object at 0x7f722c8add10>

    def test_scalars_heap(gdb_arrow):
>       check_heap_repr(gdb_arrow, "heap_null_scalar", "arrow::NullScalar")

opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_gdb.py:728: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_gdb.py:242: in check_heap_repr
    s = gdb.print_value(f"*{expr}")
opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_gdb.py:143: in print_value
    out = self.run_command(f"p {expr}")
opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_gdb.py:137: in run_command
    return self.wait_until_ready()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pyarrow.tests.test_gdb.GdbSession object at 0x7f722c8add10>

    def wait_until_ready(self):
        """
        Record output until the gdb prompt displays.  Return recorded output.
        """
        # TODO: add timeout?
        while (not self.last_stdout_line.startswith(b"(gdb) ") and
               self.proc.poll() is None):
            block = self.proc.stdout.read(4096)
            if self.verbose:
                sys.stdout.buffer.write(block)
                sys.stdout.buffer.flush()
            block, sep, last_line = block.rpartition(b"\n")
            if sep:
                self.last_stdout.append(self.last_stdout_line)
                self.last_stdout.append(block + sep)
                self.last_stdout_line = last_line
            else:
                assert block == b""
                self.last_stdout_line += last_line
    
        if self.proc.poll() is not None:
            raise IOError("gdb session terminated unexpectedly")
    
>       out = b"".join(self.last_stdout).decode('utf-8')
E       UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 46: invalid continuation byte

opt/conda/envs/arrow/lib/python3.10/site-packages/pyarrow/tests/test_gdb.py:122: UnicodeDecodeError
----------------------------- Captured stdout call -----------------------------
p *heap_null_scalar
_______________________________ test_array_data ________________________________
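
For reference, the decode error in the traceback above can be reproduced in isolation. The snippet below is only a sketch and is not taken from test_gdb.py: the sample bytes are made up, and `decode_gdb_output` is a hypothetical helper name. It shows that a lone 0xe9 byte (which would be "é" in Latin-1) followed by an ASCII byte is rejected by the strict UTF-8 decoder, and that decoding with the standard `backslashreplace` error handler keeps the offending byte visible instead of raising. Whether a lenient decode is the right fix here, as opposed to addressing the gdb side, is an open question.

```python
def decode_gdb_output(raw: bytes) -> str:
    """Decode captured subprocess output without raising on invalid UTF-8.

    Hypothetical helper for illustration only; invalid bytes are kept
    visible as \\xNN escapes rather than aborting the decode.
    """
    return raw.decode("utf-8", errors="backslashreplace")


# Made-up sample: 0xe9 is a UTF-8 lead byte, but the next byte (" ")
# is not a continuation byte, so strict decoding fails.
sample = b"caf\xe9 au lait\n"

try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    strict_error = exc
else:
    strict_error = None

# Lenient decoding succeeds and preserves the raw byte as an escape.
lenient = decode_gdb_output(sample)
print(repr(lenient))
```

With this kind of decoding the test output would contain an escaped byte rather than failing outright, which may also make it easier to see which part of the gdb output is not valid UTF-8.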

Component(s)

Continuous Integration, Python
