Conversation

@mclenhard

Adds a new e2e test that loads an MCP client, which in turn runs the server and processes the actual tool call. The test then grades the response for correctness.

note: I'm the package author
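
For readers skimming the thread, here is a minimal sketch of the flow described above (client makes a tool call, then a grader scores the response). Everything in it is an assumption for illustration: `ToolClient`, `Grader`, `runEval`, the `run_command` tool name, and the keyword-based grader are hypothetical stand-ins, not the API of the package added in this PR or of the server's real tools. A real eval would start the server through an MCP client and would likely use a model to grade.

```typescript
// Hypothetical shapes for illustration only; not taken from any real package.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface ToolClient {
  // Sends one tool call to the server and resolves with its text response.
  callTool(call: ToolCall): Promise<string>;
}

interface EvalResult {
  passed: boolean;
  score: number; // 0 or 1 in this sketch
  reason: string;
}

type Grader = (prompt: string, response: string) => Promise<EvalResult>;

// End-to-end step: make the tool call, then grade what came back.
async function runEval(
  client: ToolClient,
  call: ToolCall,
  prompt: string,
  grade: Grader
): Promise<EvalResult> {
  const response = await client.callTool(call);
  return grade(prompt, response);
}

// Stand-in client that echoes the call instead of talking to a real server.
const fakeClient: ToolClient = {
  async callTool(call: ToolCall): Promise<string> {
    return `ran ${call.name} with ${JSON.stringify(call.args)}`;
  },
};

// Stand-in grader: a plain keyword check in place of a model-based grader.
function keywordGrader(keyword: string): Grader {
  return async (_prompt: string, response: string): Promise<EvalResult> => {
    const passed = response.includes(keyword);
    return {
      passed,
      score: passed ? 1 : 0,
      reason: passed
        ? `response mentions "${keyword}"`
        : `response does not mention "${keyword}"`,
    };
  };
}
```

Swapping `fakeClient` for a client that launches the real server, and `keywordGrader` for a model-backed grader, would keep `runEval` unchanged, which is the main reason for putting both behind small interfaces in this sketch.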

@mclenhard
Author

Hey @ferrislucas, any feedback?

@ferrislucas
Owner

Hi @mclenhard, sorry I’ve been very busy. I really appreciate the contribution! I love this idea. I have two questions:

  1. Can this be made a dev dependency instead of requiring it as a dependency to run the server?
  2. It seems like this is intended to be used as a GitHub action, and I think it needs a YAML file to set up the action. Is that your intent?

Thanks again! It would be great to get this to run on every PR without requiring end users to have it in their build.

@mclenhard
Author

Yeah, I can make this a dev dependency. Good idea.

Regarding the GitHub action: it isn't strictly necessary to run this as a GitHub action. For some projects I just model the evals as tests that I run locally, so it's really up to you how you want to set it up. That said, I can add it as a GitHub action, but you'll need to provide an OpenAI API key, which you'd add as a GitHub Actions secret. Since this is an open-source project, though, if you enable data sharing in OpenAI, you can get 2.5 million tokens free, which should be more than enough.

Let me know if you want to go the GitHub action route and I can set up the YAML files.

@ferrislucas
Owner

@mclenhard awesome! I would love to get this configured as an action, and I’m happy to set up the necessary secret. Thank you so much for contributing! Much appreciated!

@mclenhard
Author

Sounds great, will work on this tomorrow!

@mclenhard
Author

Just added the GitHub action! Let me know if you have any feedback.

@ferrislucas
Owner

Thanks @mclenhard! I think the action needs to run on a macOS image in order for iTerm to be available and for the eval to pass. There probably also needs to be an init step that starts iTerm before the eval runs. Does that make sense?


2 participants