Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WPB 3796] Fix background worker for federation policy allowAll #3526

Closed

Conversation

fisx
Copy link
Contributor

@fisx fisx commented Aug 22, 2023

https://wearezeta.atlassian.net/browse/WPB-3796

Checklist

  • Add a new entry in an appropriate subdirectory of changelog.d
  • Read and follow the PR guidelines

@zebot zebot added the ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist label Aug 22, 2023
@fisx fisx force-pushed the WPB-3796/fix-background-worker-for-allowAll-fed-policy branch from 189af78 to f524514 Compare August 22, 2023 13:52
@fisx fisx requested a review from akshaymankar August 22, 2023 13:52
@fisx fisx marked this pull request as ready for review August 22, 2023 13:52
(To pull remotes from rabbitMQ, allowing us to quit talking to brig
about this.)
(This was only needed to get federation remote configs, but we get
those from rabbitMQ now, and without bugs in allowAll.)
@fisx fisx force-pushed the WPB-3796/fix-background-worker-for-allowAll-fed-policy branch from f524514 to 06e2f3f Compare August 23, 2023 13:34
@fisx fisx changed the title [WPB 3796] Fix background worker for allow all fed policy [WPB 3796] Fix background worker for federation policy allowAll Aug 23, 2023
ensureConsumers consumersRef chan $ domain <$> chanRemotes.remotes
remoteDomains <- getRemoteDomains
ensureConsumers consumers chan remoteDomains
threadDelay (round $ 1_000_000 * fromMaybe 60 env.backendNotificationsConfig.remotesRefreshInterval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a Maybe, we can put default values in all the places, i.e. helm chart and integration test configs? IMO this is unnecessary complexity.

pushBackoffMaxWait :: Int
pushBackoffMaxWait :: Int,
-- | Number of seconds between two calls to `getRemoteDomains`.
remotesRefreshInterval :: Maybe Double
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using Double can we just take number of milliseconds? We won't need rounding like this. I think we can safely assume people running wire are capable of understanding different units of time than seconds.
This also makes it more in line with the config one line above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i get the last point, but i just think doubles are more ergonomic. anyway i'll align it with the rest.

@@ -0,0 +1,5 @@
New config value `background-worker.backendNotificationPusher.remotesRefreshIntervalMs`. If in doubt, set to 30000 ms. (This controls the delay between backend-to-backend notification queue update pulls from rabbitMQ to background-worker.)

Remote federation domains in the config file are no longer supported (deprecated since Charts...). Before deploying this release, make sure you have completed the migration steps from [the CHANGELOG of that release](...).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was never a release which supported this, it came and went away between releases. So, this line is not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorrect. we released this last week or so.


Remote federation domains in the config file are no longer supported (deprecated since Charts...). Before deploying this release, make sure you have completed the migration steps from [the CHANGELOG of that release](...).

If you have enabled federation and therefore rabbitMQ before this upgrade, you may need to remove all queues called `backend-notifications.delete-federation` manually, or this release won't start.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If you have enabled federation and therefore rabbitMQ before this upgrade, you may need to remove all queues called `backend-notifications.delete-federation` manually, or this release won't start.
If you have enabled federation and therefore rabbitMQ before this upgrade, you should remove the queue called `backend-notifications.delete-federation` manually.

I think the release still starts, just keeps printing warning logs, if not something is not right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the thing that fails is the line below the one where the queue is created: it chokes on the fact that the queue has an invalid domain name and gets trapped in a loop, i think. but you are right, my wording was inaccurate.

@@ -0,0 +1 @@
[background-worker] Get federation remote domains from rabbitMQ, not from brig (this fixes a bug in federation policy `allowAll`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no release made with this bug, so you can also just drop this changelog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

@@ -26,6 +26,7 @@ config:
backendNotificationPusher:
pushBackoffMinWait: 10000 # in microseconds, so 10ms
pushBackoffMaxWait: 300000000 # microseconds, so 300s
remotesRefreshIntervalMs: 30000
Copy link
Member

@akshaymankar akshaymankar Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding Ms here has two problems:

  1. Capital M means Mega not milli or micro.
  2. The things above do not have this suffix, so it is inconsistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sigh

i really don't like the way we do config files. very brittle and fragile at the same time.

i'll remove the Ms.

Comment on lines 92 to 93
-- | FUTUREWORK: This really doesn't belong here in a module about pushing backend-to-backend
-- notifications. Maybe "Wire.API.Federation.Defederate"? (Same for `ensureTLQueue`.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not create the module right now?

Comment on lines 98 to 106
ensureQueue :: Q.Channel -> Text -> IO ()
ensureQueue chan queue = do
ensureTLQueue :: Q.Channel -> Text -> IO ()
ensureTLQueue chan queue = do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does TL stand for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"top level". not great, yes.

do you have a suggestion what to use for a name?

i'll think of something...

fisx added a commit that referenced this pull request Sep 15, 2023
(for some reason, remotes listed in config files don't do it any more.
stealing this fix from #3526)
@@ -58,6 +58,8 @@ data:
cert: /etc/wire/integration-secrets/provider-publiccert.pem
botHost: https://brig-integration

originDomain: 612c6a18-4e3e-11ee-a205-97507f092421.example.com
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this hard-coded like this raises an eyebrow. Why is this not dynamic?

@@ -207,3 +207,7 @@ spec:
secretKeyRef:
name: brig
key: rabbitmqPassword
- name: TEST_INCLUDE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably remove this before merging? :)

@fisx
Copy link
Contributor Author

fisx commented Sep 27, 2023

@akshaymankar this can be closed, right? i'll close it! #3588

@fisx fisx closed this Sep 27, 2023
@echoes-hq echoes-hq bot added the echoes: technical-debt Changes intended at mitigating risks label Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
echoes: technical-debt Changes intended at mitigating risks ok-to-test Approved for running tests in CI, overrides not-ok-to-test if both labels exist
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants