[bug] Null attributes not serialized to protobuf correctly, breaking Loki #6138

@johnnyggalt

Package

OpenTelemetry

Package Version

Package Name                                  Version
OpenTelemetry.Exporter.OpenTelemetryProtocol  1.11.1
OpenTelemetry.Exporter.Prometheus.AspNetCore  1.11.0-beta.1
OpenTelemetry.Extensions.Hosting              1.11.1
OpenTelemetry.Instrumentation.AspNetCore      1.11.0
OpenTelemetry.Instrumentation.Process         1.11.0-beta.1
OpenTelemetry.Instrumentation.Runtime         1.11.0

Runtime Version

net9.0

Description

As discovered via this issue, the TagWriter class in the OpenTelemetry.Exporter.OpenTelemetryProtocol package misbehaves when a LogRecord has an attribute with a null value. Instead of writing the attribute's key followed by an empty (missing) value, it skips the attribute altogether (see this code). I assume this produces an invalid serialized object, and it ends up breaking Loki when Loki receives such a log record.

To reproduce this, it was enough to log the following and forward it on to Loki:

logger.LogInformation("A null value: {Value}", (string?)null);
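
When reproducing this, how the null is passed matters. A bare `null` literal given to a `params` array parameter binds to the array itself, so the `{Value}` placeholder receives no argument at all. A typed null (a `string?` variable or a cast) is wrapped in a one-element array, which is what yields an attribute with a null value. The sketch below shows the difference. `ParamsDemo` and `Describe` are hypothetical stand-ins that only mimic the `params object?[]` shape of the `ILogger` extension methods; they are not part of any library.

```csharp
#nullable enable
using System;

public static class ParamsDemo
{
    // Hypothetical helper: reports what the params array looks like on arrival.
    // Declared as object?[]? so that passing a bare null compiles without warnings.
    public static string Describe(string message, params object?[]? args) =>
        args is null
            ? "args array is null"
            : $"args has {args.Length} element(s)";

    public static void Main()
    {
        string? value = null;

        // Bare literal: null converts directly to the array type (normal form).
        Console.WriteLine(Describe("A null value: {Value}", null));          // args array is null

        // Typed null: string is not an array, so it is wrapped (expanded form).
        Console.WriteLine(Describe("A null value: {Value}", value));         // args has 1 element(s)
        Console.WriteLine(Describe("A null value: {Value}", (string?)null)); // args has 1 element(s)
    }
}
```

This is why the Program.cs repro further down declares `string? value = null;` and passes the variable rather than the literal.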

Steps to Reproduce

  1. Stand up a Loki instance. I'm using grafana/otel-lgtm:latest and mapping a volume to a local directory so I can quickly and easily delete all its storage and start afresh. See my Docker Compose service below.
  2. dotnet new console
  3. dotnet add package Microsoft.Extensions.Hosting
  4. dotnet add package OpenTelemetry.Exporter.OpenTelemetryProtocol
  5. In Program.cs, copy/paste the below code
  6. Update the exporter endpoint URL as necessary
  7. Run the program and confirm log entries appear in Loki:
    1. Open Grafana dashboard
    2. Go to Explore
    3. Click the Label drop-down and choose service_name
    4. Click the Value drop-down and choose the value there
    5. Click the Run Query button as necessary
  8. Uncomment the line of code that logs a null value and re-execute the repro
  9. Go back to Grafana and attempt to refresh the logs. It should spin for a while before failing.

Docker Compose service

  monitoring:
    image: grafana/otel-lgtm:latest
    ports:
      - 3001:3000 # Grafana admin
      - 4317:4317 # OpenTelemetry gRPC ingestion
      - 9091:9090 # Prometheus admin
    volumes:
      - ./path/to/grafana_data/:/data

Program.cs:

using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Hosting;
using OpenTelemetry.Logs;

var hostBuilder =
    Host
        .CreateDefaultBuilder()
        .ConfigureServices(services =>
            services
                .AddLogging(logging =>
                    logging
                        .AddOpenTelemetry(openTelemetry =>
                            openTelemetry
                                .AddOtlpExporter(exporter =>
                                    exporter.Endpoint = new Uri("http://localhost:4317")
                                )
                            )
                    )
                .AddHostedService<HostedService>()
        );

hostBuilder.RunConsoleAsync().Wait();

public class HostedService(ILogger<HostedService> logger) : IHostedService
{
    public Task StartAsync(CancellationToken cancellationToken)
    {
        logger.LogInformation("Hosted Service is starting.");

        string? value = null;
        //logger.LogInformation("Here is a null {Value}", value);

        return Task.CompletedTask;
    }

    public Task StopAsync(CancellationToken cancellationToken)
    {
        return Task.CompletedTask;
    }
}

Expected Result

I should see the log entry in Grafana's log viewer.

Actual Result

Once the poisoned log entry is submitted to the gRPC endpoint, the Grafana log viewer spins for a while before displaying an error message:

[Screenshot: Grafana error message]

Error message as text for search purposes:

failed to parse series labels to categorize labels: 1:2: parse error: unexpected "=" in label set, expected identifier or "}"

Additional Context

No response

Labels

bug, needs-triage, pkg:OpenTelemetry