[opampsupervisor] configure agent healthcheck port #34643

Closed
dpaasman00 opened this issue Aug 13, 2024 · 7 comments

@dpaasman00
Contributor

dpaasman00 commented Aug 13, 2024

Component(s)

cmd/opampsupervisor

Is your feature request related to a problem? Please describe.

I'd like to programmatically determine whether the agent is healthy after it has been started by the supervisor, without using an OpAMP connection to the supervisor. I want to do this by checking the agent's healthcheck extension endpoint. This would be particularly useful when updating an existing agent to be run with the supervisor. Currently the port picked by the supervisor is non-deterministic, which makes it difficult to determine which port to target. I'd like to change this by making the healthcheck extension port configurable.

Describe the solution you'd like

The port assigned to the agent's healthcheck extension should be configurable in the supervisor config. This could be a new parameter in the agent configuration section, maybe even in agent.description.

Describe alternatives you've considered

No response

Additional context

The relevant code for picking a random port.

dpaasman00 added the enhancement (New feature or request) and needs triage (New item requiring triage) labels on Aug 13, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@dpaasman00
Contributor Author

I'd like to take this on if this is wanted

@tigrannajaryan
Member

> The port assigned to the agent's healthcheck extension should be configurable in the supervisor config.

Please add the motivation for this request to the description. It is not clear why this is needed. What problem is this solving? Why does the port need to be user selectable?

@dpaasman00
Contributor Author

@tigrannajaryan Updated the description, let me know if I need to add additional context.

Frapschen removed the needs triage (New item requiring triage) label on Aug 15, 2024
@BinaryFissionGames
Contributor

I think we want the port to be configurable regardless of whether it is actually checked from outside the supervisor. Grabbing a random port is error-prone in that:

  1. Between generating the port and binding to the port, a different process may bind to that port.
  2. The supervisor may choose a random port that conflicts with another application that starts after the supervisor.

These are both rare scenarios, but with enough time + users I could imagine these things happening more than once.

@tigrannajaryan
Member

> 1. Between generating the port and binding to the port, a different process may bind to that port

This indeed can happen. A possible fix: instead of the Supervisor choosing a port, the healthcheck extension chooses one and the opamp extension reports it to the Supervisor via effective config. It is not entirely clear how the opamp extension would learn the port number, though.

@tigrannajaryan
Member

> @tigrannajaryan Updated the description, let me know if I need to add additional context.

Thanks, looks good.

evan-bradley pushed a commit that referenced this issue Sep 5, 2024
**Description:**
Add a new configuration parameter to `agent` called `health_check_port`.
If it is set, the supervisor configures the agent's healthcheck
extension to use the given port. If it is unset, the supervisor picks a
random port, as before.

**Link to tracking Issue:** #34643

**Testing:**
- Updated config validation tests
- Verified that healthcheck extension is configured with the correct
port and works as expected
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this issue Sep 12, 2024
…elemetry#34704)
