Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase dbus client timeouts during CA install #1078

Closed
wants to merge 1 commit into from

Conversation

zultron
Copy link
Contributor

@zultron zultron commented Sep 13, 2017

In FreeIPA 4.5 and 4.6, the ipa-server-install program can fail during the "Configuring certificate server (pki-tomcatd)" stage on a heavily-loaded host.

The dogtag service is memory intensive, and on a memory-constrained system, can be unresponsive right after start-up, especially for expensive operations that generate new certificates.

These delays can cause failures when dbus clients time out while attempting to run
new certificate operations through certmonger. This patch changes dbus client timeouts for some such operations to 120 seconds (from the default 25 seconds).

@@ -595,7 +595,7 @@ def modify_ca_helper(ca_name, helper):
old_helper = ca_iface.Get('org.fedorahosted.certmonger.ca',
'external-helper')
ca_iface.Set('org.fedorahosted.certmonger.ca',
'external-helper', helper)
'external-helper', helper, timeout=120)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a code comment here, explaining briefly why we are setting the timeout to this value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -261,7 +261,7 @@ def configure_certmonger_renewal(self):
iface.add_known_ca(
name,
command,
dbus.Array([], dbus.Signature('s')))
dbus.Array([], dbus.Signature('s')), timeout=120)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have a code comment here, explaining briefly why we are setting the timeout to this value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the review.

@felipevolpone felipevolpone added the re-run Trigger a new run of PR-CI label Sep 14, 2017
@stlaz stlaz added re-run Trigger a new run of PR-CI and removed re-run Trigger a new run of PR-CI labels Sep 14, 2017
@freeipa-pr-ci freeipa-pr-ci removed the re-run Trigger a new run of PR-CI label Sep 14, 2017
@stlaz
Copy link
Contributor

stlaz commented Sep 15, 2017

I actually don't think this change will change much. The crash we're seeing is pretty deterministic, really. Do go ahead and try:

domain=IPA.TEST; docker run -t --name freeipa-popelnice -h $HOSTNAME \
                         --tmpfs /run --tmpfs /tmp --security-opt seccomp:unconfined \
                         -v /dev/urandom:/dev/random:ro -v /opt/ipa-data/master:/data \
                         -v /sys/fs/cgroup:/sys/fs/cgroup:ro --cap-add=SYS_TIME \
                         -e DEBUG_TRACE=1 -e DEBUG_NO_EXIT=1 freeipa/freeipa-server:fedora-26 \
                        --hostname $HOSTNAME --domain $domain --realm ${domain^^} \
                        -p milan_je_buh123 -a milan_je_buh123 --setup-dns --auto-forwarders --no-reverse -U

It should fail on pkispawn during a pki-tomcat start up. Dogtag seems to be trying to set up a kernel keyring but is denied.

I am currently trying to get some more info from the Dogtag guys on how not to use the keyring.

@zultron
Copy link
Contributor Author

zultron commented Sep 15, 2017

@stlaz, you're right, this PR doesn't address the problem you're pointing out. (I work around that problem for now with docker run --privileged.)

This PR addresses dbus timeouts breaking the install. This can happen on memory-constrained systems, where any swapping after dogtag startup can exceed the default 25 second dbus default timeout.

@rcritten
Copy link
Contributor

I don't see a downside in increasing the timeout whether it's for memory constrained systems, slow disks or whatever. Fast systems will never hit it so it's a moot point in that case. The 120 seconds seems arbitrary though, how did you come up with it? Should there be a define somewhere, DBUS_TIMEOUT, and use that instead? I think I'd want a comment about the default 25 seconds (or whatever it is) as well to leave a bread crumb for future devs.

@zultron
Copy link
Contributor Author

zultron commented Sep 20, 2017

@rcritten, the 120 seconds is arbitrary, you're correct. On a system that swaps pretty heavily during dogtag startup, the certmonger request was observed to take 61 seconds to complete, so I chose a value around 2x. Do you have another suggestion for determining the timeout?

I'm fine with adding the additional comments and plumbing in a DBUS_TIMEOUT parameter of some sort.

Thanks for the review and suggestions.

@stlaz
Copy link
Contributor

stlaz commented Sep 21, 2017

I agree with @rcritten that this should be a constant. Personally, I would probably have it configurable as well.

@rcritten
Copy link
Contributor

It can get tricky to make this sort of thing configurable because normally you'd put this sort of value into /etc/ipa/default.conf but since there is no config yet...

I'd be ok with hardcoding it. There is precedence for that elsewhere.

Normally I'd like to see some sort of "Waiting" message somewhere in the case of log timeouts so users don't think something is hung up but in this case I don't see an obvious place to put it. I think for the average user this is fine given AFAIK this timeout hasn't come up before. On underpowered h/w I'd hope that the user would expect things to take forever.

zultron added a commit to zultron/freeipa that referenced this pull request Sep 22, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
@zultron
Copy link
Contributor Author

zultron commented Sep 22, 2017

Please see the updated PR.

I admit to being a bit out of my depth here, and wasn't sure if ipalib.constants.DEFAULT_CONFIG['dbus_certmonger_timeout'] placed near other timeouts would be better than e.g. ipalib.constants.DBUS_CERTMONGER_TIMEOUT by itself.

@stlaz
Copy link
Contributor

stlaz commented Sep 22, 2017

The DEFAULT_CONFIG dictionary serves for initializing the api. You want to put your constant among the others outside the DEFAULT_CONFIG.

zultron added a commit to zultron/freeipa that referenced this pull request Sep 22, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
@zultron
Copy link
Contributor Author

zultron commented Sep 22, 2017

Thanks, @stlaz, PR updated.

@rcritten
Copy link
Contributor

LGTM, tested ok in master (after minor rebase). Will wait for add'l reviews before giving final ack.

@stlaz
Copy link
Contributor

stlaz commented Sep 25, 2017

If @rcritten tested this, it's ok with me. ACK.

@stlaz stlaz added the ack Pull Request approved, can be merged label Sep 25, 2017
@nicki-krizek nicki-krizek added the re-run Trigger a new run of PR-CI label Oct 17, 2017
@freeipa-pr-ci freeipa-pr-ci removed the re-run Trigger a new run of PR-CI label Oct 17, 2017
@nicki-krizek
Copy link
Contributor

@stlaz Please make sure all the tests executed and green before acking :) PR CI wasn't triggered before.

The commit message is also missing an upstream ticket -- do we want to create one or do we just push this to master only?

@zultron
Copy link
Contributor Author

zultron commented Oct 17, 2017

@tomaskrizek, I'd like this to go into a 4.6 release, if possible. I went ahead and created [issue #7213] on pagure (with the assumption that that's where "upstream" is), and added that URL to the commit log (it's pushed, but there seems to be some lag in updating this PR).

When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
zultron added a commit to zultron/freeipa that referenced this pull request Oct 17, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
zultron added a commit to zultron/freeipa that referenced this pull request Oct 17, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
@stlaz stlaz removed the ack Pull Request approved, can be merged label Oct 18, 2017
@stlaz
Copy link
Contributor

stlaz commented Oct 18, 2017

We'll add this to the master and backport to the ipa-4-6 branch. I did not notice tests were failing, sorry.

@stlaz stlaz added the re-run Trigger a new run of PR-CI label Oct 18, 2017
@freeipa-pr-ci freeipa-pr-ci removed the re-run Trigger a new run of PR-CI label Oct 18, 2017
@nicki-krizek
Copy link
Contributor

@zultron Thanks for creating the issue and linking it!

nicki-krizek pushed a commit to nicki-krizek/freeipa that referenced this pull request Oct 18, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
@nicki-krizek
Copy link
Contributor

I've re-based the commit against master and opened #1170 for it. We'll also backport it to ipa-4-6. I'm closing this PR since it's opened against ipa-4-5.

Thanks for your contribution, I apologize for the bureaucracy :)

nicki-krizek pushed a commit to nicki-krizek/freeipa that referenced this pull request Oct 18, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
nicki-krizek pushed a commit to nicki-krizek/freeipa that referenced this pull request Oct 18, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
felipevolpone pushed a commit to felipevolpone/freeipa that referenced this pull request Oct 18, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue #157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213

Reviewed-By: Stanislav Laznicka <slaznick@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
apophys pushed a commit to apophys/freeipa that referenced this pull request Oct 23, 2017
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213

Reviewed-By: Stanislav Laznicka <slaznick@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
zultron added a commit to zultron/freeipa that referenced this pull request Jun 27, 2018
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213
stanislavlevin pushed a commit to stanislavlevin/freeipa that referenced this pull request Jun 3, 2019
When running on memory-constrained systems, the `ipa-server-install`
program often fails during the "Configuring certificate server
(pki-tomcatd)" stage in FreeIPA 4.5 and 4.6.

The memory-intensive dogtag service causes swapping on low-memory
systems right after start-up, and especially new certificate
operations requested via certmonger can exceed the dbus client default
25 second timeout.

This patch changes dbus client timeouts for some such operations to
120 seconds (from the default 25 seconds, IIRC).

See more discussion in FreeIPA PR freeipa#1078 [1] and FreeIPA container
issue freeipa#157 [2].  Upstream ticket at [3].

[1]: freeipa#1078
[2]: freeipa/freeipa-container#157
[3]: https://pagure.io/freeipa/issue/7213

Reviewed-By: Stanislav Laznicka <slaznick@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants