We have an integration test which calls a get_password function. This is not a very good test because the function takes no arguments, and I've seen in debug output LLMs passing args where they aren't needed. Also, the LLM can be sensitive to the name get_password: I've at least once seen output asking for confirmation because it treats such a command as dangerous:
{"role":"assistant","content":"You want me to retrieve your password to authenticate the current session. Just to confirm, I can only use that function with your consent and authentication. Would you like me to proceed?"}
I suggest we swap it out for something that goose would use and that is non-destructive. That way, the integration tests pass or fail for the right reasons (e.g. not because the model is worried about displaying a password). They also become realistic for the most prominent user of this library.
Here's an example, though we'd have to convert it to the basic exchange abstraction for the test.
https://github.com/square/goose/blob/6fd11e8e4569621a57157e27d94cbd94a692eddf/src/goose/toolkit/developer.py#L117-L127
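To make the idea concrete, here's a rough sketch of the kind of tool and check I have in mind. Everything named here (read_file, dispatch_tool_call, check_read_file_call, and the tool-call dict shape) is made up for illustration; it is not taken from the linked goose code and is not the exchange API, so it would still need adapting to the real abstraction.

```python
from pathlib import Path


def read_file(path: str) -> str:
    """Hypothetical non-destructive tool: return the text content of a file."""
    return Path(path).read_text()


def dispatch_tool_call(tool_call: dict, tools: dict) -> str:
    """Run a tool call shaped like {"name": ..., "arguments": {...}}.

    The dict shape is an assumption for this sketch, not the library's format.
    """
    func = tools[tool_call["name"]]
    return func(**tool_call["arguments"])


def check_read_file_call(tool_call: dict, expected_path: str) -> None:
    """Assert the model picked the right tool and passed exactly one argument."""
    assert tool_call["name"] == "read_file"
    assert tool_call["arguments"] == {"path": expected_path}
```

Unlike get_password, a tool like this takes an argument, so the test can verify what the model passed, and nothing about the name should make the model ask for confirmation.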