-
Notifications
You must be signed in to change notification settings - Fork 645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track execute()
and enqueue()
tasks separately from scheduled tasks.
#2645
Track execute()
and enqueue()
tasks separately from scheduled tasks.
#2645
Conversation
The test failure is ... because the # of allocations for less than expected. By about 1k. That's a good thing.
And ... I see the fix just got merged: e4102ae#diff-660361849e6074a975e1787c5836a6138eef55096f2b80fb76d5da0af73b48d5R40 |
@weissi this is the change we discussed in person. |
@swift-server-bot test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment in the diff
// therefore the best course of action is to run them. | ||
for task in immediateTasksCopy { | ||
self.run(task) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a fairly substantial behavioural change. Previously these tasks got dropped. Normally I’d say the logic here is good but we should prepare ourselves for a few subtle bugs popping out of this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, true. But without this, the application would have their Task
-s/functions not executed at all. I don't think any behaviour can be worse than that. We haven't been noticing, because before Tasks
this wasn't frequently exercised. But yes, there could be some unintended consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed it's a substantial change but it's definitely a good one. I think this will actually make tearing down an EventLoop much easier than it was before this patch. Previously we just couldn't implement this sensibly, now we can.
But of course, it'll likely trigger a few subtle changes here or there but I can't see this changing anything that was reliably working fine for the worse.
@swift-nio-bot test perf please |
@swift-nio-bot perf test please |
@gmilos nice! asyncwriter_single_writes_1M_times | 1.464787299 | 1.596832264 | current | -8% 8% better in one of the perf tests that actually uses Weirdly, it claims that this one is 8% worse schedule_and_run_100k_tasks | 0.252803345 | 0.233980813 | previous | 8% |
@weissi how much signal is in the performance tests? Looks like there are a few that did regress, including:
|
And another:
There seems to be a pattern where the tests with large queues degraded in performance. I'll take a look. |
Motivation: `SelectableEventLoop` uses `ScheduledTask` to track tasks scheduled for some specific point in time, but also tasks that are supposed to be executed immediately. This casuse performance issues where the number of immediate tasks grows, as they need to be kept sorted in the `_scheduledTasks` `PriorityQueue`. Additionally, that datastructure is protected by lock(s) which casues lock contention from other threads. Finally, immediate tasks differ from future scheduled tasks in that they don't have a `failFn` to handle destruction at the end of `SelectableEventLoop.run()` execution. Modifications: * Modify `ScheduledTask` to only track closure tasks and require `failFn` to be provided. * Retain `UnderlyingTask` for tracking closure or `ErasedUnownedJob` for async-await tasks. * Add `SelectableEventLoop._immediateTasks` to track immediate tasks (collection of `UnderlyingTask`-s). * Split production / consumption of the tasks to match the datastructure changes described above. * Modify the `defer` block of `SelectableEventLoop.run()` to execute immediate tasks, instead of dropping them silently. Result: Performance of workloads relying on queuing lots of tasks gets better.
daf6e3d
to
1b7ad7d
Compare
@swift-nio-bot perf test please |
I reviewed the test responsible for:
This doesn't exercise my code change in any meaningful way. So the degradation must be due to an unrelated difference. I'm reviewing the test @weissi pointed out:
This may be down to me. |
I reviewed:
In the current version of the PR (which isn't the same as what was perf tested before), I see ~2x improvement. I had to raise the number of repeats by 100x to make it human scale. I also recorded sample. Which point to better arc performance with the simpler task type stored in |
@swift-server-bot test this please |
@swift-server-bot test perf please |
@swift-server-bot perf test please |
@swift-server-bot add to allowlist |
@Lukasa the perf tests are broken: Both of the results are from 3rd Jan (run id 156). Don't know why it still posts them as if they were current |
…bleEventLoop down.
e05c0c7
to
a51573a
Compare
I did a manual performance test run on my desktop. I run perf tests on main & my branch. They show major wins:
There are couple of regressions also:
but it looks like the lock tests are noisy (some of the others go faster, so it's probably just natural variability in the OS locking)
When I started looking into the problem, I realised the test is badly implemented. The test prewarms the heap with 100k tasks, runs them to completion, and hands over for the measured test run. But then, the test loop is run 10x, thus placing 10x as many tasks into the heap than what the prewarm did. We're therefore testing the efficiency of heap doubling and not just the scheduling. I was able to prove this is indeed responsible for the test instability by running it with much bigger set of tasks:
The runs range from 3.7 to 7.1. I'll try to fix this test separately from this PR. I'm also attaching the full manual perf test. With all of this, @Lukasa the PR is ready for re-review. |
@swift-server-bot test this please |
nice!!
yes, ignore the locks one for this. They depend on the contention reached.
awesome, thank you!! |
Co-authored-by: Johannes Weiss <johannesweiss@apple.com>
@weissi that's all the feedback handled. lmk if there is any more. |
@swift-server-bot test this please |
1 similar comment
@swift-server-bot test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! That looks good to me.
From a "strict semver" this is a patch change, but I'm going to take some editorial discretion and mark it "minor". We're making a subtle but observable behavioural change, and while it will probably be fine I want to make it possible for users to ensure that they don't have to deal with both variants by calling this a minor instead of a patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looking really nice. I have a few small refactoring suggestions, as we're in the space and you've already started with refactors ;)
} | ||
} | ||
|
||
assert(self.tasksCopy.count <= Self.taskCopyBatchSize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion seems like it is straightforward to break. We can add two tasks in each loop iterator above, but we only check the length once. I think the loop above needs an extra length check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Reorganising the loop to check the length of the taskCopy
array at each point we're mutating (adding to) it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, I tried to write a test for this, and realised the code before was fine, just. It's because:
- if
_scheduledTasks
and_immediateTasks
are non-empty, we'd be adding 2 items to thetasksCopy
each time round the loop. AndtaskCopyBatchSize
is even, so one test every 2 items is sufficient - if either
_scheduledTasks
or_immediateTasks
run out of tasks, we'd be testing the size oftasksCopy
once per each addition. This applies even if we just run out when we're terminating the batch.
This wasn't intentional micro-optimisation, so I'm adding the additional length check back to make the code safer against future refactoring (including making the size of the batch odd).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah my concern was very much about future refactors 😉
immediateTasksCopy.reserveCapacity(self._immediateTasks.count) | ||
while let immediate = self._immediateTasks.popFirst() { | ||
immediateTasksCopy.append(immediate) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of this while let
, can we use swap
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is removing tasks from self._immediateTasks
and putting them to a different array immediateTasksCopy
. So swap
isn't applicable, unless there is some other swap
I'm not aware of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this swap
. Adopted here: 8b818e6#diff-565b354476ebe678da3be159a85e33e0b7630b1e45b62b20efe506210c1d62faR632
…ing-optimisations' into gm-selectableeventloop-job-queueing-optimisations
…p-job-queueing-optimisations
0ab81f4
to
4fb9f15
Compare
Looks like |
Ah, |
|
||
|
||
// nextScheduledTaskDeadline is the overall next deadline, but iff there are no more immediate tasks left. | ||
return moreImmediateTasksToConsider ? now : nextScheduledTaskDeadline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will we ever hit now
? We can only get here if both moreImmediateTasksToConsider
and moreScheduledTasksToConsider
are both false, which should imply we never hit now
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. This changed because the batch size checks got moved to each of the if
-s, and not the loop overall. I think the statement is harmless, but can be simplified. Checking it carefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The statement was indeed harmless, because while it'd always return the deadline for next scheduled task, this deadline is only enacted iff there are no more tasks to run. In all other cases we have to recheck, because more tasks may get queued up. Still, this deserved a cleanup, with now
returned if there are any more immediate tasks left and an associated assert. Also, I added a test that tries to get immediate tasks stuck (which never reproduced a problem, but useful defence for the future).
Motivation:
SelectableEventLoop
usesScheduledTask
to track tasks scheduled for some specific point in time, but also tasks that are supposed to be executed immediately.This casuse performance issues where the number of immediate tasks grows, as they need to be kept sorted in the
_scheduledTasks
PriorityQueue
. Additionally, that datastructure is protected by lock(s) which casues lock contention from other threads.Finally, immediate tasks differ from future scheduled tasks in that they don't have a
failFn
to handle destruction at the end ofSelectableEventLoop.run()
execution.Modifications:
ScheduledTask
to only track closure tasks and requirefailFn
to be provided.UnderlyingTask
for tracking closure orErasedUnownedJob
for async-await tasks.SelectableEventLoop._immediateTasks
to track immediate tasks (collection ofUnderlyingTask
-s).defer
block ofSelectableEventLoop.run()
to execute immediate tasks, instead of dropping them silently.Result:
Performance of workloads relying on queuing lots of tasks gets better.