
[BUG] S3 Scan hits NPE when bucket key has a null value #3316

@graytaylor0

Describe the bug
An NPE can occur when the global state Map for S3 scan contains a key for the bucket but the value for that key is null. The lookup checks containsKey(bucket), which returns true even when the value is null, and then passes that null to Instant.parse:

Instant mostRecentLastModifiedTimestamp = globalStateMap.containsKey(bucket) ? Instant.parse((String) globalStateMap.get(bucket)) : null;

2023-09-08T15:42:56.313 [Thread-10] ERROR org.opensearch.dataprepper.plugins.source.ScanObjectWorker - Received an exception while processing S3 objects, backing off and retrying
java.lang.NullPointerException: text
	at java.util.Objects.requireNonNull(Objects.java:246) ~[?:?]
	at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1945) ~[?:?]
	at java.time.Instant.parse(Instant.java:395) ~[?:?]
	at org.opensearch.dataprepper.plugins.source.S3ScanPartitionCreationSupplier.listFilteredS3ObjectsForBucket(S3ScanPartitionCreationSupplier.java:104) ~[s3-source-2.4.0.jar:?]
	at org.opensearch.dataprepper.plugins.source.S3ScanPartitionCreationSupplier.apply(S3ScanPartitionCreationSupplier.java:87) ~[s3-source-2.4.0.jar:?]
	at org.opensearch.dataprepper.plugins.source.S3ScanPartitionCreationSupplier.apply(S3ScanPartitionCreationSupplier.java:32) ~[s3-source-2.4.0.jar:?]
	at org.opensearch.dataprepper.sourcecoordination.LeaseBasedSourceCoordinator.getNextPartition(LeaseBasedSourceCoordinator.java:153) ~[data-prepper-core-2.4.0.jar:?]
	at org.opensearch.dataprepper.plugins.source.ScanObjectWorker.startProcessingObject(ScanObjectWorker.java:128) ~[s3-source-2.4.0.jar:?]
	at org.opensearch.dataprepper.plugins.source.ScanObjectWorker.run(ScanObjectWorker.java:106) ~[s3-source-2.4.0.jar:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
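
The trace above can be reproduced in isolation. The sketch below is a minimal standalone example, not the plugin code: the class name `NullStateRepro` and the helper `lookup` are hypothetical, and only the ternary expression is taken from the line quoted in this issue. It assumes the map holds the key with a null value, which is the condition described above.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

public class NullStateRepro {
    // Same expression as the line quoted in the issue, wrapped in a hypothetical helper.
    // containsKey returns true for a key mapped to null, so Instant.parse receives null.
    static Instant lookup(Map<String, Object> globalStateMap, String bucket) {
        return globalStateMap.containsKey(bucket)
                ? Instant.parse((String) globalStateMap.get(bucket))
                : null;
    }

    public static void main(String[] args) {
        Map<String, Object> globalStateMap = new HashMap<>();
        globalStateMap.put("my-bucket", null); // key present, value null

        try {
            lookup(globalStateMap, "my-bucket");
        } catch (NullPointerException e) {
            // The message comes from the null check inside the JDK date-time parser.
            System.out.println("NPE: " + e.getMessage());
        }
    }
}
```

The exception is thrown by the null check on the parser's input, which is consistent with the `Objects.requireNonNull` frame at the top of the trace.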

Expected behavior
No NullPointerException
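
One way to get this behavior is to check the value instead of the key. The following is a sketch under assumptions, not the actual fix: `NullSafeScanState` and `mostRecentLastModified` are hypothetical names, and it assumes that a null value should be treated the same as a missing key (no previous timestamp for the bucket).

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

public class NullSafeScanState {
    // Hypothetical helper: a missing key and a key mapped to null both yield null,
    // so Instant.parse is only called with a non-null string.
    static Instant mostRecentLastModified(Map<String, Object> globalStateMap, String bucket) {
        Object value = globalStateMap.get(bucket);
        return value != null ? Instant.parse((String) value) : null;
    }

    public static void main(String[] args) {
        Map<String, Object> state = new HashMap<>();
        state.put("my-bucket", null);                        // the case from this issue
        state.put("other-bucket", "2023-09-08T15:42:56Z");   // illustrative timestamp

        System.out.println(mostRecentLastModified(state, "my-bucket"));
        System.out.println(mostRecentLastModified(state, "other-bucket"));
        System.out.println(mostRecentLastModified(state, "absent-bucket"));
    }
}
```

A single `get` followed by a null check also avoids the double lookup of `containsKey` plus `get`. Whether null should mean "no timestamp yet" is a design decision for the maintainers.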

Steps to reproduce
Configure an s3 scan pipeline that lists the same bucket twice:

- bucket:
    name: "my-bucket"
- bucket:
    name: "my-bucket"

