-
Notifications
You must be signed in to change notification settings - Fork 3k
Support AWS ELB Access logs #41185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support AWS ELB Access logs #41185
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Michalis! Good job. I took a very brief look, and noticed a couple of things.
|
||
// Read first line to determine format | ||
if !scanner.Scan() { | ||
return plog.Logs{}, fmt.Errorf("no log lines found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no test for this condition
if err != nil { | ||
return plog.Logs{}, fmt.Errorf("failed to parse log line: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no test for this condition as well
} | ||
|
||
// Check for control message | ||
if fields[0] == "Enable" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make this Enable
a constant?
return value, nil | ||
} | ||
|
||
func findLogSyntaxByField(field string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment on top of this function to explain this logic?
if isValidTimestamp(field) { | ||
return clbAccessLogs, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are trying to run time.Parse
for every log, time.Parse
can be expensive. I would move this placement in the function to the default
so that the other cheap conditions run first, like this:
func findLogSyntaxByField(field string) (string, error) {
switch field {
case "http", "https", "h2", "grpcs", "ws", "wss":
return albAccessLogs, nil
case "tls":
return nlbAccessLogs, nil
default:
if isValidTimestamp(field) {
return clbAccessLogs, nil
}
return "", fmt.Errorf("invalid type: %v", field)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good suggestion !
epochNanoseconds, err := convertToUnixEpoch(albRecord.Time) | ||
if err == nil { | ||
recordLog.SetTimestamp(pcommon.Timestamp(epochNanoseconds)) | ||
} else { | ||
f.logger.Debug("Timestamp cannot be converted to unix epoch nanoseconds", zap.Error(err)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the timestamp is not valid, is there a reason to still record this log, or should the error just be thrown?
The same comment for the other 2 logs doing this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't put much thought into this, but I think you are right. Because we are doing the same for the other fields in case they are not in the right format. For example in convertTextToNlbAccessLogRecord
when a field processing fails we return an error that leads to no log record being created.
} | ||
|
||
// ELB status code and backend status code can be - in case of TCP and SSL entry | ||
if fields[7] != "-" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a constant for -
so maybe use it here? And for all the others hardcoded "-"
scanner := bufio.NewScanner(reader) | ||
|
||
// Initialize scopeLogsByResource | ||
scopeLogsByResource := map[string]plog.ScopeLogs{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All logs coming from the same file should be part of the same resource. I don't think this map is needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true! When I first created this I thought they could be mixed but the file is per load balancer. I will update the code.
...sion/encoding/awslogsencodingextension/internal/unmarshaler/elb-access-log/benchmark_test.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. Please sign the CLA :)
const ( | ||
AttributeELBStatusCode = "aws.elb.status.code" // int | ||
AttributeELBBackendStatusCode = "aws.elb.backend.status.code" // int | ||
AttributeTlsListenerResourceID = "tls.listener.resource_id" // string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this have an aws.elb.
prefix? It seems fairly specific to AWS & ELB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it is specific to nlb
. I don't think this would be added in tls
specific semantics conventions.
} | ||
line = scanner.Text() | ||
|
||
fields, err := extractFields(line) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would iterate over the fields. We can optimise it later though.
|
||
#### ELB Access Log Fields | ||
|
||
ELB access log record fields are mapped this way in the resulting OpenTelemetry log: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are more fields in ELB access logs that are not mentioned below. Seems we're dropping them at the moment. Can you please add a list of the fields that are not translated, and mention that they're TODO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the documentation!
Co-authored-by: Andrew Wilkins <[email protected]>
…er/elb-access-log/benchmark_test.go Co-authored-by: Constança Manteigas <[email protected]>
…emetry-collector-contrib into elb-access-logs
I am waiting for the authorization! |
Please take a look at the CI failures |
// Copyright The OpenTelemetry Authors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
package elbaccesslogs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file has no unit tests. Can you add some?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file only defines three constants. Are tests for these really useful? I don't see tests for the s3-access-logs fields.go.
I could add some though, no problem.
Hi @atoulme , could you please approve the workflow another time to check if there are remaining CI failures? |
…emetry-collector-contrib into elb-access-logs
@atoulme or @andrzej-stencel since I already have an approval, could you give an approval for the CI to run and if it succeeds we can merge this one? |
@andrzej-stencel could I have another workflow approval. I think I got rid of the last linter issue. |
@evan-bradley whenever possible, could I get a final review and approval so we can merge this one? |
// forward until it finds the ending quote, and considers | ||
// that just 1 value. Otherwise, it returns the value as | ||
// it is. | ||
func scanField(logLine string) (string, string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you considered replacing this complex scanning logic with a csv reader, using a space as the delimiter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This scan field logic is used to optimize performance, csv reader wouldn't be good for our use case (there is a benchmark test to check this)
@atoulme @edmocosta @dehaansa Please, do you have time to merge this PR? Thanks |
extension/encoding/awslogsencodingextension/internal/unmarshaler/elb-access-log/elb.go
Outdated
Show resolved
Hide resolved
…n-telemetry#41185) <!--Ex. Fixing a bug - Describe the bug and how this fixes the issue. Ex. Adding a feature - Explain what this achieves.--> #### Description extension/encoding/awslogsencoding The ELB(elastic load balancer) logs can have the following syntax based on their type: - [Network Load Balancer Access Logs](https://docs.aws.amazon.com/elasticloadbalancing/latest//network/load-balancer-access-logs.html#access-log-entry-format) - [Classic Load Balancer Access Logs](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/access-log-collection.html#access-log-entry-format) - [Application Load Balancer Connection Logs](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-connection-logs.html#connection-log-entry-format) This PR adds support for them so we can get valuable insights into traffic patterns, security, and application performance. ``` awslogs_encoding: format: elb_access_log ``` ### Describe alternatives you've considered _No response_ ### Additional context _No response_ <!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. --> #### Link to tracking issue Fixes open-telemetry#40710 <!--Describe what testing was performed and which tests were added.--> #### Testing <!--Describe the documentation added.--> #### Documentation As taken from README #### ELB Access Log Fields ELB access log record fields are mapped this way in the resulting OpenTelemetry log: ##### Application Load Balancer (ALB) > AWS Fields are according to [documentation](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-access-logs.html). | **AWS Field** | **OpenTelemetry Field(s)** | |------------------------------|-----------------------------------------------------------------------| | type | `network.protocol.name` | | time | Log timestamp | | elb | `cloud.resource_id` | | client:port | `client.address`, `client.port` | | received_bytes | `http.request.size` | | sent_bytes | `http.response.size` | | "request" | `url.full`, `http.request.method`, `network.protocol.version` | | ssl_cipher | `tls.cipher` | | ssl_protocol | `tls.protocol.version` | | elb_status_code | `aws.elb.status.code` | | target:port | _Currently not supported_ | | request_processing_time | _Currently not supported_ | | target_processing_time | _Currently not supported_ | | response_processing_time | _Currently not supported_ | | target_status_code | _Currently not supported_ | | "user_agent" | _Currently not supported_ | | target_group_arn | _Currently not supported_ | | "trace_id" | _Currently not supported_ | | "domain_name" | _Currently not supported_ | | "chosen_cert_arn" | _Currently not supported_ | | matched_rule_priority | _Currently not supported_ | | request_creation_time | _Currently not supported_ | | "actions_executed" | _Currently not supported_ | | "redirect_url" | _Currently not supported_ | | "error_reason" | _Currently not supported_ | | "target:port_list" | _Currently not supported_ | | "target_status_code_list" | _Currently not supported_ | | "classification" | _Currently not supported_ | | "classification_reason" | _Currently not supported_ | | conn_trace_id | _Currently not supported_ | ##### Network Load Balancer (NLB) > AWS Fields are according to [documentation](https://docs.aws.amazon.com/elasticloadbalancing/latest//network/load-balancer-access-logs.html#access-log-entry-format). | **AWS Field** | **OpenTelemetry Field(s)** | |------------------------------|-------------------------------------------------------------| | type | `network.protocol.name` | | version | `network.protocol.version` | | time | Log timestamp | | elb | `cloud.resource_id` | | listener | `aws.elb.tls.listener.resource_id` | | client:port | `client.address`, `client.port` | | received_bytes | `http.request.size` | | sent_bytes | `http.response.size` | | tls_cipher | `tls.cipher` | | tls_protocol_version | `tls.protocol.version` | | destination:port | _Currently not supported_ | | connection_time | _Currently not supported_ | | tls_handshake_time | _Currently not supported_ | | incoming_tls_alert | _Currently not supported_ | | chosen_cert_arn | _Currently not supported_ | | chosen_cert_serial | _Currently not supported_ | | tls_named_group | _Currently not supported_ | | domain_name | _Currently not supported_ | | alpn_fe_protocol | _Currently not supported_ | | alpn_be_protocol | _Currently not supported_ | | alpn_client_preference_list | _Currently not supported_ | | tls_connection_creation_time | _Currently not supported_ | ##### Classic Load Balancer (CLB) > AWS Fields are according to [documentation](https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/access-log-collection.html) | **AWS Field** | **OpenTelemetry Field(s)** | |-----------------------|--------------------------------------------------------------------------------------------| | time | Log timestamp | | elb | `cloud.resource_id` | | client:port | `client.address`, `client.port` | | elb_status_code | `aws.elb.status.code` | | backend_status_code | `aws.elb.backend.status.code` | | received_bytes | `http.request.size` | | sent_bytes | `http.response.size` | | "request" | `url.full`, `http.request.method`, `network.protocol.name`, `network.protocol.version` | | ssl_cipher | `tls.cipher` | | ssl_protocol | `tls.protocol.version` | | backend:port | _Currently not supported_ | | request_processing_time | _Currently not supported_ | | backend_processing_time | _Currently not supported_ | | response_processing_time | _Currently not supported_ | | user_agent | _Currently not supported_ | <!--Please delete paragraphs that you did not use before submitting.--> --------- Co-authored-by: Andrew Wilkins <[email protected]> Co-authored-by: Constança Manteigas <[email protected]> Co-authored-by: Edmo Vamerlatti Costa <[email protected]>
Description
extension/encoding/awslogsencoding
The ELB(elastic load balancer) logs can have the following syntax based on their type:
This PR adds support for them so we can get valuable insights into traffic patterns, security, and application performance.
Describe alternatives you've considered
No response
Additional context
No response
Link to tracking issue
Fixes #40710
Testing
Documentation
As taken from README
ELB Access Log Fields
ELB access log record fields are mapped this way in the resulting OpenTelemetry log:
Application Load Balancer (ALB)
network.protocol.name
cloud.resource_id
client.address
,client.port
http.request.size
http.response.size
url.full
,http.request.method
,network.protocol.version
tls.cipher
tls.protocol.version
aws.elb.status.code
Network Load Balancer (NLB)
network.protocol.name
network.protocol.version
cloud.resource_id
aws.elb.tls.listener.resource_id
client.address
,client.port
http.request.size
http.response.size
tls.cipher
tls.protocol.version
Classic Load Balancer (CLB)
cloud.resource_id
client.address
,client.port
aws.elb.status.code
aws.elb.backend.status.code
http.request.size
http.response.size
url.full
,http.request.method
,network.protocol.name
,network.protocol.version
tls.cipher
tls.protocol.version