
Conversation


@averath commented Feb 22, 2024

Added

Vertex AI support (similar to OpenAI)

Enable/Disable provider option

@justinh-rahb
Collaborator

justinh-rahb commented Feb 22, 2024

Fantastic addition @averath! Could we possibly get some more details describing the changes, how it was implemented, whether it will impact any existing functionality or connections? Screenshots would be great too. Thanks for the PR! 🙏

@justinh-rahb
Collaborator

[Screenshot: 2024-02-22, 3:31 PM]

Figured it out. We should probably make sure we have instructions for getting an auth token from Google Cloud:

Here's how you can obtain an access token using the Google Cloud SDK:

Install the Google Cloud SDK by following the instructions here: https://cloud.google.com/sdk/install.

Open a terminal or command prompt and run the following command to authenticate using your Google Cloud credentials:

gcloud auth application-default login

Follow the prompts to log in to your Google Cloud account and grant permissions to the Google Cloud SDK.

Once you're authenticated, you can obtain an access token by running the following command:

gcloud auth application-default print-access-token

This will output an access token that you can use to authenticate your requests to the Vertex AI endpoint.
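The two commands above could also be wrapped in a small helper, sketched here in Python; the function names and header shape are illustrative assumptions, not part of the PR:

```python
import subprocess


def gcloud_access_token() -> str:
    """Shell out to gcloud for a short-lived ADC access token (typically ~1 hour)."""
    return subprocess.run(
        ["gcloud", "auth", "application-default", "print-access-token"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()


def auth_headers(token: str) -> dict:
    """Build the HTTP headers a Vertex AI request would carry the token in."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Note the token is short-lived, which is exactly the expiry problem discussed below in the thread.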

@justinh-rahb
Collaborator

I noticed also that this new module does not check a /models endpoint to populate the model list; it's been set statically in the code. This seems fine for testing, but we should probably do it properly so that new models can be handled automatically.

@jannikstdl
Contributor

> I noticed also that this new module does not check a /models endpoint to populate the model list, it's been set statically in the code. This seems fine and good for testing, but we probably should do it properly so that new models can be handled automatically.

Yes, especially with Google's naming switches, haha

@averath
Author

averath commented Feb 23, 2024

Thank you guys for your feedback! I'm happy to see it.

> I noticed also that this new module does not check a /models endpoint to populate the model list, it's been set statically in the code. This seems fine and good for testing, but we probably should do it properly so that new models can be handled automatically.

I was thinking about it; however, it's not as easy as it should be. They do expose an API for retrieving models, but it returns only models created by you. Gemini isn't in that list because it was created by Google. Reference: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.models/list

We can add instructions on how to obtain a key. What I was thinking is to let the user choose whether the external connection for OpenAI and the external connection for VertexAI should be enabled or not, because I'm using just the second one and see errors related to OpenAI.

@justinh-rahb
Collaborator

justinh-rahb commented Feb 23, 2024

Hmm, I've perhaps not set up my API credentials correctly, but if I use the method I posted above, the key only lasts for so long before it expires:

[Screenshot: 2024-02-23, 8:45 AM]

And indeed, running print-access-token again gives me a new key which works (for now). My suspicion is that the "correct" way to do this (according to Google) would be using OAuth2 to authenticate with your GCP project (this will require a callback endpoint) and exchanging that for an access token and refresh token. That, or setting up a service account might work too; I'm going to look into that avenue next.

@averath
Author

averath commented Feb 23, 2024

Setting up a service account should do the trick, because personal tokens only last a couple of minutes or hours, depending on settings.

@justinh-rahb
Collaborator

> Thank you guys for your feedback! I'm happy to see it
>
> I noticed also that this new module does not check a /models endpoint to populate the model list, it's been set statically in the code. This seems fine and good for testing, but we probably should do it properly so that new models can be handled automatically.
>
> I was thinking about it, however it's not that easy like it should be. They indeed expose API to receive models, but only models created by you. In this list there's no Gemini, because it's created by Google. Reference: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.models/list

That's fair enough; if there's no straightforward way to get their public models, then I guess hard-coding is the only other option. Is there any way it can be hidden unless the Vertex AI connection is actually set up? Right now, whether you use it or not, you end up with Gemini Pro in your models list.
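One possible shape for this, as a rough Python sketch; the names `VERTEX_AI_MODELS` and `get_vertex_models` are hypothetical illustrations, not the PR's actual code:

```python
# Hypothetical sketch: keep the hard-coded Vertex model list, but only
# surface it when the Vertex AI connection has actually been configured,
# so Gemini Pro doesn't appear in everyone's model list unconditionally.
VERTEX_AI_MODELS = ["gemini-pro"]  # static: Google-owned models don't appear in models.list


def get_vertex_models(vertex_configured: bool) -> list:
    """Return the static Vertex model list, or nothing if the provider is off."""
    return list(VERTEX_AI_MODELS) if vertex_configured else []
```

With a gate like this, disabling the Vertex AI connection would also hide its hard-coded models from the picker.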

@averath
Author

averath commented Feb 23, 2024

> Thank you guys for your feedback! I'm happy to see it
>
> I noticed also that this new module does not check a /models endpoint to populate the model list, it's been set statically in the code. This seems fine and good for testing, but we probably should do it properly so that new models can be handled automatically.
>
> I was thinking about it, however it's not that easy like it should be. They indeed expose API to receive models, but only models created by you. In this list there's no Gemini, because it's created by Google. Reference: https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.models/list
>
> That's fair enough, if there's no straight-forward way to get their public models then I guess hard-code is the only other option. Any way it can be hidden unless the Vertex API connection is actually setup? Right now whether you use it or not you end up with Gemini Pro in your models list.

I'm working on enabling/disabling the other providers as well, so you could configure whether you want to use Ollama, OpenAI, or VertexAI.

@justinh-rahb
Collaborator

@averath I've been butting my head against a wall for an hour now trying to sort out the API key... unless you've got access to MakerSuite, there is no straightforward way to get a permanent key; Google really wants you to do things the OAuth2 way. I think this feature is going to be a lot easier to use if authentication is refactored so that it exchanges OAuth2 credentials for short-lived access tokens and then handles refresh automatically afterwards.

@averath
Author

averath commented Feb 23, 2024

Oh really? So a service account won't work?

Do you think we can leave it as is for a first stage and align it later? We could mark it as an experimental feature.

@justinh-rahb
Collaborator

justinh-rahb commented Feb 24, 2024

> Oh really? So service account won't work?
>
> Do you think we can leave it like it is as a first stage and align it later? We could mark it as experimental feature.

@averath based on the various documentation I've reviewed, you'll need to refactor this to incorporate the Google Cloud modules. This is to utilize a credentials JSON file linked to a service account, which is necessary for generating the access and refresh tokens required by the API. I have a project that implements this for accessing a GCS bucket, but it works similarly for other Google APIs, including VertexAI. Here's how I've done it before: utils/gcs.py

Here are a few methods for managing the JSON credentials file downloaded from the console:

  • One option is to base64 encode the file, as I have, and deploy it as an environment variable, decoding it when needed.
  • Alternatively, you could bind-mount the file's location and directly supply the file to the function.
  • There's also the possibility of enabling the JSON file to be uploaded through the user interface, then storing it using any of the aforementioned techniques.

However, it's important to note that these methods may not be highly secure. Mishandling Google Cloud Platform credentials can lead to significant security risks.
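The first option (base64-encoding the credentials file into an environment variable and decoding it when needed) might look roughly like this in Python; the variable name `GCP_CREDENTIALS_B64` and the helper are illustrative assumptions, not code from either project:

```python
import base64
import json
import os


def load_gcp_credentials(env_var: str = "GCP_CREDENTIALS_B64") -> dict:
    """Decode a base64-encoded service-account JSON stored in an env var."""
    return json.loads(base64.b64decode(os.environ[env_var]))


# Round-trip with a dummy payload. A real credentials file also carries
# private_key, client_email, etc. -- never commit it or log its contents.
os.environ["GCP_CREDENTIALS_B64"] = base64.b64encode(
    json.dumps({"type": "service_account", "project_id": "demo-project"}).encode()
).decode()
creds = load_gcp_credentials()
```

The decoded dict could then be handed to the Google Cloud client libraries wherever they accept service-account info.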


Instead of relying on a full endpoint URL, we could streamline the process by accepting two specific parameters: location and project. This simplifies configuration, as these are the primary elements that vary across setups, effectively addressing customization needs without the complexity of handling full URLs.
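A rough sketch of that idea, assuming the standard Vertex AI URL layout; the default model name and the `:streamGenerateContent` method here are illustrative choices, not something fixed by the PR:

```python
def vertex_endpoint(project: str, location: str, model: str = "gemini-pro") -> str:
    """Derive a Vertex AI endpoint URL from just `project` and `location`."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1"
        f"/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:streamGenerateContent"
    )
```

Users would then enter only a project ID and a region, and the module would assemble the URL itself.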

@tjbck tjbck changed the base branch from main to dev February 25, 2024 02:14
tjbck and others added 10 commits February 25, 2024 16:05
fix: add semver to container builds
# Conflicts:
#	src/lib/components/chat/Settings/Connections.svelte
#	src/lib/components/chat/SettingsModal.svelte
#	src/routes/(app)/+layout.svelte
#	src/routes/(app)/+page.svelte
@justinh-rahb
Collaborator

It's great to see you're still engaged with this project, @averath. Since your pull request was initiated, there have been significant updates to the codebase. I recommend reviewing the Ollama modules that inspired your enhancements for VertexAI to ensure consistency with current coding standards and conventions. Additionally, integrating support for both MakerSuite API key and GCP OAuth2 authentication could enhance functionality. Incorporating the google-generativeai package would facilitate the use of the MakerSuite API key. Implementing a switch to select between these authentication methods is essential, given the differences in endpoints and authentication mechanisms.

@tjbck
Contributor

tjbck commented Feb 28, 2024

Thanks for the PR! I believe Vertex AI support is a nice addition, but providing options to enable/disable providers seems irrelevant to this PR, and it makes it hard for us to review the code. I'll close this PR for now, but feel free to create new atomic-level PRs. Thanks!

@tjbck tjbck closed this Feb 28, 2024