Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 11 additions & 2 deletions dmrpp_generator/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -98,9 +98,17 @@ def get_bucket(filename, files, buckets):
break
return buckets[bucket_type]

def _get_s3_extra(self):
"""Helper to build the extra dict for S3 operations."""
extra = {}
if self.config.get("requester_pay", False):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to parse the value as a boolean here?

Suggested change
if self.config.get("requester_pay", False):
if f'{self.config.get("requester_pay", False)}.lower() == 'true':

or is the value in self.config already a boolean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i just tested the value is already a boolean

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to update this config location to align with the other dmrpp flags like get_dmrpp_timeout i.e. the config.collection.meta.dmrpp (from collection) or config.dmrpp (from workflow)

The terraform variable is likely not needed for this PR, but it would be nice to keep these dmrpp specific variable flags consolidated to one location.

For this PR, it would just mean updating self.config.get to self.dmrpp_meta.get

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s technically possible, but I’m not sure it makes sense. This seems like it would apply on a per-collection basis, whereas what we actually need is for it to depend on where the DMRPP generator runs relative to the bucket location. Our goal is to enable requester pays specifically for cross-account S3 requests. If i were to do it on the collection config that means i would need to modify my collection configuration based on which env were running the dmrpp generator on.

I think i would prefer it on the config level where it is based on the step function workflow setup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh ok i see, so in the dmrpp override in the workflows, i can work that in.

extra["RequestPayer"] = "requester"
return extra

def upload_file_to_s3(self, filename, uri):
""" Upload a local file to s3 if collection payload provided """
return s3.upload(filename, uri, extra={})
extra = self._get_s3_extra()
return s3.upload(filename, uri, extra=extra)

def process(self):
if 'EBS_MNT' in os.environ:
Expand Down Expand Up @@ -267,7 +275,8 @@ def dmrpp_generate(self, input_file, local=False, dmrpp_meta=None, args=None):
"""
# Force dmrpp_meta to be an object
dmrpp_meta = dmrpp_meta if isinstance(dmrpp_meta, dict) else {}
file_full_path = input_file if local else s3.download(input_file, path=self.path)
extra = self._get_s3_extra()
file_full_path = input_file if local else s3.download(input_file, path=self.path, extra=extra)
cmd = self.get_dmrpp_command(dmrpp_meta, file_full_path)
if args:
cmd_split = cmd.split(' ', maxsplit=1)
Expand Down
2 changes: 1 addition & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ certifi==2024.7.4
charset-normalizer==3.3.2
cumulus-message-adapter==2.0.3
cumulus_message_adapter_python==2.2.0
cumulus_process==1.3.0
cumulus_process==1.5.0
decorator==5.1.1
dicttoxml==1.7.16
idna==3.7
Expand Down
Loading