Skip to content

Conversation

@achraf-mer
Copy link
Collaborator

No description provided.

@EshamAaqib
Copy link
Contributor

I don't think this will work because we need to pass the env vars from -e this reflects to env: or envFrom:on K8s deployments and Helm (Ex: https://github.com/h2oai/helium/blob/a639155f3939a5cf0f7a33f27da84c08c5feac26/helm/helium-chart/templates/deployment.yaml#L28 ). Without this it will be harder to pass the vars via Helm.

@EshamAaqib
Copy link
Contributor

EshamAaqib commented Jul 24, 2023

I would suggest changing the gen.py main function to something like this might work,

def main(
    load_8bit: bool = os.environ.get('LOAD_8BIT', 'False')
    load_4bit: bool = os.environ.get('LOAD_4BIT', 'False')
    load_half: bool = os.environ.get('LOAD_HALF', 'True')
    load_gptq: str = os.environ.get('LOAD_GPTQ', '')
    load_exllama: bool = os.environ.get('LOAD_EXLLAMA', 'False')
    use_safetensors: bool = os.environ.get('USE_SAFETENSORS', 'False')
.....
.....
.....
)

As per my knowledge this should directly fetch the env var set by Docker if not it will use the default value, Also we will not need to do any changes to Docker as well

@EshamAaqib EshamAaqib requested a review from ozahavi July 24, 2023 07:37
@ChathurindaRanasinghe
Copy link
Collaborator

I would suggest changing the gen.py main function to something like this might work,

def main(
    load_8bit: bool = os.environ.get('LOAD_8BIT', 'False')
    load_4bit: bool = os.environ.get('LOAD_4BIT', 'False')
    load_half: bool = os.environ.get('LOAD_HALF', 'True')
    load_gptq: str = os.environ.get('LOAD_GPTQ', '')
    load_exllama: bool = os.environ.get('LOAD_EXLLAMA', 'False')
    use_safetensors: bool = os.environ.get('USE_SAFETENSORS', 'False')
.....
.....
.....
)

As per my knowledge this should directly fetch the env var set by Docker if not it will use the default value, Also we will not need to do any changes to Docker as well

CC: @achraf-mer

@achraf-mer
Copy link
Collaborator Author

I would suggest changing the gen.py main function to something like this might work,

def main(
    load_8bit: bool = os.environ.get('LOAD_8BIT', 'False')
    load_4bit: bool = os.environ.get('LOAD_4BIT', 'False')
    load_half: bool = os.environ.get('LOAD_HALF', 'True')
    load_gptq: str = os.environ.get('LOAD_GPTQ', '')
    load_exllama: bool = os.environ.get('LOAD_EXLLAMA', 'False')
    use_safetensors: bool = os.environ.get('USE_SAFETENSORS', 'False')
.....
.....
.....
)

As per my knowledge this should directly fetch the env var set by Docker if not it will use the default value, Also we will not need to do any changes to Docker as well

let's use command with a set of args in helm chart, IMO best than to sprinkle env vars, that way it is sure only one way to set flags and not two (env vars and query args).

@pseudotensor pseudotensor self-requested a review August 1, 2023 17:39
Copy link
Collaborator

@pseudotensor pseudotensor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! @ChathurindaRanasinghe Please review post-merge.

@achraf-mer achraf-mer merged commit 0d588e5 into main Aug 1, 2023
@achraf-mer achraf-mer deleted the am/refactor branch August 1, 2023 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants