Remove unreliable `SchedulingBenchmark` #2650

gmilos · 2024-02-14T10:44:06Z

Motivation:

At the moment, SchedulingBenchmark preheats a single EL (EventLoop) to a specified number of tasks. However, the actual performance test runs a number of times, and the EL doesn't get drained between runs.

There are two problems with it:

* perf test will trigger task heap doublings, which the preheating aims to avoid
* each test run will be working with a proportionally deeper heap, which means the run times are going to grow with each run
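
To make the two problems concrete, here is a minimal toy model in Swift. It is not SwiftNIO's scheduler: `ToyTaskQueue`, `simulate`, and the assumption that the backing storage doubles when full are hypothetical and exist only for illustration. It tracks queue depth and capacity doublings across runs, with and without a drain between runs.

```swift
// Toy model of a task queue whose backing storage doubles when it is full.
// This is NOT SwiftNIO's implementation; every name here is hypothetical.
struct ToyTaskQueue {
    private(set) var capacity: Int
    private(set) var count = 0
    private(set) var doublings = 0

    // Assumption: preheating leaves room for exactly one run's worth of tasks.
    init(preheatedCapacity: Int) {
        self.capacity = preheatedCapacity
    }

    // Schedule one task that never runs during the test.
    mutating func schedule() {
        if count == capacity {
            capacity *= 2
            doublings += 1
        }
        count += 1
    }

    // What draining between runs would do.
    mutating func drain() {
        count = 0
    }
}

// Returns the queue depth at the end of each run and the total number of doublings.
func simulate(runs: Int, tasksPerRun: Int, drainBetweenRuns: Bool) -> (depths: [Int], doublings: Int) {
    var queue = ToyTaskQueue(preheatedCapacity: tasksPerRun)
    var depths: [Int] = []
    for _ in 0..<runs {
        for _ in 0..<tasksPerRun { queue.schedule() }
        depths.append(queue.count)
        if drainBetweenRuns { queue.drain() }
    }
    return (depths, queue.doublings)
}

let undrained = simulate(runs: 4, tasksPerRun: 100_000, drainBetweenRuns: false)
let drained = simulate(runs: 4, tasksPerRun: 100_000, drainBetweenRuns: true)
print(undrained.depths, undrained.doublings)
print(drained.depths, drained.doublings)
```

In this model the undrained queue gets one run deeper every time and doubles its storage during the measured runs, while the drained queue stays at a constant depth and never grows past the preheated capacity.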

I proposed a fix that made the benchmark work reliably, but @weissi and then others came to a consensus that we're better off without it.

Modifications:

SchedulingBenchmark removed along with the dispatch site.

Result:

No schedule-but-dont-run perf tests.

Motivation:

At the moment `SchedulingBenchmark` preheats a single EL to a specified number of tasks. However, the actual performance test runs a number of times, and the EL doesn't get drained in between the runs.

There are two problems with it:

* perf test will trigger task heap doublings, which the preheating aims to avoid
* each test run will be working with a proportionally deeper heap, which means the run times are going to grow with each run

Modifications:

I plumbed through the # of runs to the `Benchmark.setUp`, and prepare ELG with # of ELs that match the expected number of runs.

Result:

Performance test will be more reliable.

weissi · 2024-02-14T10:47:19Z

Sources/NIOPerformanceTester/SchedulingBenchmark.swift

-            for _ in 0..<self.numTasks {
-                self.loop.scheduleTask(in: .nanoseconds(0)) {
-                    counter &+= 1
+    func setUp(runs: Int) throws {


@gmilos instead of introducing the runs everywhere, can we not just use an ELG with just 1 thread? Then it should be fine, right?

wait, hang on, it's already tied to exactly one EventLoop, self.loop

This whole test is completely busted. We need to either fix it (wait for the scheduled tasks) or just delete this perf test, currently it doesn't add value.

CC @FranzBusch

yeah, here we can see that this is not useful

schedule_100k_tasks 0.063766844 0.105724529 0.07314636890000001 0.012951859063592227

min runtime: 0.063s
mean runtime: 0.073s
max runtime: 0.105s
std deviation: 0.012 s

that's way too high of a std dev to be useful
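
One way to put a number on "too high" is the relative standard deviation (std divided by mean). The sketch below uses the `schedule_100k_tasks` and `schedule_and_run_100k_tasks` rows from the performance report in this thread; the helper name `relativeStdDev` is made up for the example.

```swift
// Relative standard deviation (coefficient of variation) = std / mean.
// `relativeStdDev` is a hypothetical helper, not part of the perf tester.
func relativeStdDev(mean: Double, std: Double) -> Double {
    std / mean
}

// Mean and std values copied from the performance report rows.
let scheduleOnly = relativeStdDev(mean: 0.07314636890000001, std: 0.012951859063592227)
let scheduleAndRun = relativeStdDev(mean: 0.2608068538, std: 0.004253384282073607)

print("schedule_100k_tasks:", scheduleOnly * 100, "%")
print("schedule_and_run_100k_tasks:", scheduleAndRun * 100, "%")
```

For these figures the schedule-only benchmark comes out at roughly 18% relative spread, against under 2% for the schedule-and-run variant.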

@weissi where did you get these test results? #2650 (comment)

It looks like it's still the broken (3rd Jan re-copied) test results, right?

I'd strongly suggest to delete this test. If we don't delete this test, then the only thing we can/should do is to spawn 1 EventLoopThread in each test's setUp. Relying on round robin and messing with existing groups (by enqueuing 100k tasks that will never run) is not a good idea, especially if there's a possibility that other perf tests are running on the same loop.

But again: I'd say we should delete the test, I don't think it adds value.

Fine by me.

@FranzBusch do you have opinions (seems like you added it originally in #2009)? If you're happy to delete, we can do that.

Fine by me as well.

weissi · 2024-02-14T10:49:40Z

[...] runs number of times, and EL doesn't get drained in between the runs.

@gmilos that's the actual issue here. It should be completely drained after each run.

[...]
Modifications:

I plumbed through the # of runs to the Benchmark.setUp, and prepare ELG with # of ELs that match the expected number of runs.

That doesn't sound ideal as if it's actually the case that the others aren't drained, then this will now cause high CPU load and expects us to have #runs CPUs available etc.

weissi · 2024-02-14T10:53:13Z

@swift-nio-bot test perf please

swift-server-bot · 2024-02-14T10:53:24Z

performance report

build id: 156

timestamp: Wed Jan 3 13:45:50 UTC 2024

results

name	min	max	mean	std
write_http_headers	0.042907723	0.043165247	0.042973653	9.70556125379205e-05
http_headers_canonical_form	0.10455533	0.10730633	0.10511068750000001	0.000809335616920894
http_headers_canonical_form_trimming_whitespace	0.020678702	0.021188336	0.0207580329	0.00015479947182040933
http_headers_canonical_form_trimming_whitespace_from_short_string	0.018708244	0.01923846	0.0187952384	0.00015906405380027787
http_headers_canonical_form_trimming_whitespace_from_long_string	0.030301067	0.030804568	0.030385388800000003	0.00015088716639322867
bytebuffer_write_12MB_short_string_literals	0.143270983	0.14943897	0.1441855301	0.0018536697851552317
bytebuffer_write_12MB_short_calculated_strings	0.067587874	0.069486342	0.0687012671	0.0005389776086500247
bytebuffer_write_12MB_medium_string_literals	0.938363651	0.97485219	0.9508204281999999	0.013250425598645685
bytebuffer_write_12MB_medium_calculated_strings	0.086556923	0.089021016	0.0870135612	0.000731880891976859
bytebuffer_write_12MB_large_calculated_strings	0.163417139	0.164472042	0.1641449972	0.0003404484955280193
bytebuffer_lots_of_rw	0.044265314	0.044929763	0.044431870299999995	0.00023316710060290754
bytebuffer_write_http_response_ascii_only_as_string	0.029828004	0.030381939	0.0299376602	0.00016310420389622758
bytebuffer_write_http_response_ascii_only_as_staticstring	0.029231652	0.029859072	0.0294445389	0.00017518514336073792
bytebuffer_write_http_response_some_nonascii_as_string	0.028767805	0.029312969	0.0288888285	0.00021134165086169015
bytebuffer_write_http_response_some_nonascii_as_staticstring	0.028939677	0.030695064	0.029339388700000003	0.0005196977050649629
no-net_http1_1k_reqs_1_conn	0.011615747	0.012100875	0.0117296609	0.00013522579434613994
http1_1k_reqs_1_conn	0.060492661	0.061901803	0.0612277168	0.0004381693782502116
http1_1k_reqs_100_conns	0.090465821	0.090860192	0.0906602483	0.00011466376957197908
future_whenallsucceed_100k_immediately_succeeded_off_loop	0.080549118	0.082637506	0.081468365	0.0007871206764443573
future_whenallsucceed_100k_immediately_succeeded_on_loop	0.080940765	0.088269588	0.0824305449	0.002134804165856356
future_whenallsucceed_10k_deferred_off_loop	0.023354389	0.023773316	0.023462138	0.0001324403603740188
future_whenallsucceed_10k_deferred_on_loop	0.014468765	0.014600609	0.0145289954	4.971332905815584e-05
future_whenallcomplete_100k_immediately_succeeded_off_loop	0.040924739	0.041610577	0.041152195600000004	0.00023931122711101126
future_whenallcomplete_100k_immediately_succeeded_on_loop	0.041419036	0.041985228	0.0416712287	0.0001902419231779779
future_whenallcomplete_10k_deferred_off_loop	0.016106523	0.017959537	0.0167431273	0.0006612598717008233
future_whenallcomplete_100k_deferred_on_loop	0.084619949	0.087602916	0.08551536779999999	0.0008921542183957266
future_reduce_10k_futures	0.017307059	0.017845536	0.0175047041	0.00015765937100206426
future_reduce_into_10k_futures	0.015271552	0.015405853	0.0153215517	4.1640959098918226e-05
channel_pipeline_1m_events	0.099658043	0.099798237	0.09974338660000001	4.948307737039316e-05
websocket_encode_50b_space_at_front_100k_frames_cow	0.049749614	0.050189388	0.0498925438	0.00020159322906883591
websocket_encode_50b_space_at_front_1m_frames_cow_masking	0.657249144	0.660868877	0.6581938381000001	0.001124422093879371
websocket_encode_1kb_space_at_front_1m_frames_cow	0.526078369	0.526747823	0.5263413171	0.00018762750014942448
websocket_encode_50b_no_space_at_front_100k_frames_cow	0.050102496	0.05058825	0.0502541197	0.0002134421036789604
websocket_encode_1kb_no_space_at_front_100k_frames_cow	0.052487978	0.052930183	0.052637525500000004	0.0001995756449003469
websocket_encode_50b_space_at_front_100k_frames	0.073903666	0.074350969	0.0741008752	0.00020326936093480533
websocket_encode_50b_space_at_front_10k_frames_masking	0.00889408	0.008927912	0.008907087599999999	9.962841542451471e-06
websocket_encode_1kb_space_at_front_10k_frames	0.012442082	0.012877802	0.0125207999	0.00013005579675017288
websocket_encode_50b_no_space_at_front_100k_frames	0.071874749	0.072902787	0.0723683473	0.0003529541819519128
websocket_encode_1kb_no_space_at_front_10k_frames	0.011704753	0.011798532	0.011730439300000001	3.002173715104898e-05
websocket_decode_125b_10k_frames	0.012596187	0.013051204	0.012710906	0.0001339040473863448
websocket_decode_125b_with_a_masking_key_10k_frames	0.013026622	0.01625301	0.0137532196	0.0011449994208533418
websocket_decode_64kb_10k_frames	0.012870337	0.013384654	0.0130012155	0.00014205029683355097
websocket_decode_64kb_with_a_masking_key_10k_frames	0.013328684	0.013495727	0.0134062059	5.629989460509189e-05
websocket_decode_64kb_+1_10k_frames	0.012897385	0.016614399	0.013328305499999998	0.0011557375071808039
websocket_decode_64kb_+1_with_a_masking_key_10k_frames	0.013289503	0.013809198	0.0134011819	0.00014745835970232413
circular_buffer_into_byte_buffer_1kb	0.033002613	0.033536173	0.0331520546	0.00018518534877924055
circular_buffer_into_byte_buffer_1mb	0.064661982	0.065130012	0.0648244472	0.00020092638953595694
byte_buffer_view_iterator_1mb	0.01756013	0.018081564	0.0176232754	0.00016123574967605693
byte_buffer_view_contains_12mb	0.052910349	0.053560145	0.0531561787	0.00021187141882076117
byte_to_message_decoder_decode_many_small	0.041325565	0.041860639	0.0415019664	0.00023618235080039308
generate_10k_random_request_keys	0.091185533	0.091505915	0.09138744169999999	0.00010962978709684585
bytebuffer_rw_10_uint32s	0.04080077	0.041416125	0.0409719066	0.00021478161702229284
bytebuffer_multi_rw_10_uint32s	0.074633317	0.075221512	0.0748953122	0.00024639593300070653
lock_1_thread_10M_ops	0.151529459	0.152741502	0.1520439887	0.0003694800419021466
lock_2_threads_10M_ops	0.786501782	0.909838773	0.8525907718999999	0.03231466284401043
lock_4_threads_10M_ops	0.937752797	0.959532161	0.9473351966999999	0.007973824451028328
lock_8_threads_10M_ops	0.957658591	0.987632844	0.9778233794	0.008966224334225099
schedule_100k_tasks	0.063766844	0.105724529	0.07314636890000001	0.012951859063592227
schedule_and_run_100k_tasks	0.252803345	0.267169133	0.2608068538	0.004253384282073607
execute_100k_tasks	0.103045814	0.105475272	0.1042747825	0.0009121376774441264
bytebufferview_copy_to_array_100k_times_1kb	0.010984296	0.011033014	0.0109959675	1.4597997512977405e-05
circularbuffer_copy_to_array_10k_times_1kb	0.019746973	0.020199469	0.019804835	0.00013886138398657403
deadline_now_1M_times	0.024568465	0.024832095	0.0246682263	9.23638256450479e-05
asyncwriter_single_writes_1M_times	1.464787299	1.467467645	1.4662272632	0.0008228612221267796
asyncsequenceproducer_consume_1M_times	0.907417083	0.910416828	0.9089906522	0.0010595941502413693
udp_10k_writes	0.37901331	0.379875118	0.3793720076	0.0002815189041030453
udp_10k_vector_writes	0.205883308	0.206418052	0.20622898890000002	0.00016904891221474557
udp_10k_vector_reads	0.386684625	0.387768161	0.3872836356	0.00033373534307398853
udp_10k_vector_reads_and_writes	0.109082179	0.109593621	0.1093604517	0.00017203582618684093
tcp_100k_messages_throughput	0.75330207	0.787823236	0.7734669324000001	0.010813256483347567

comparison

name	current	previous	winner	diff
write_http_headers	0.042907723	0.042886202	previous	0%
http_headers_canonical_form	0.10455533	0.106193642	current	-1%
http_headers_canonical_form_trimming_whitespace	0.020678702	0.021160017	current	-2%
http_headers_canonical_form_trimming_whitespace_from_short_string	0.018708244	0.019237102	current	-2%
http_headers_canonical_form_trimming_whitespace_from_long_string	0.030301067	0.031139957	current	-2%
bytebuffer_write_12MB_short_string_literals	0.143270983	0.143459794	current	0%
bytebuffer_write_12MB_short_calculated_strings	0.067587874	0.07066772	current	-4%
bytebuffer_write_12MB_medium_string_literals	0.938363651	0.94105786	current	0%
bytebuffer_write_12MB_medium_calculated_strings	0.086556923	0.08698647	current	0%
bytebuffer_write_12MB_large_calculated_strings	0.163417139	0.165702724	current	-1%
bytebuffer_lots_of_rw	0.044265314	0.043246136	previous	2%
bytebuffer_write_http_response_ascii_only_as_string	0.029828004	0.028208719	previous	5%
bytebuffer_write_http_response_ascii_only_as_staticstring	0.029231652	0.028714732	previous	1%
bytebuffer_write_http_response_some_nonascii_as_string	0.028767805	0.027803065	previous	3%
bytebuffer_write_http_response_some_nonascii_as_staticstring	0.028939677	0.028839596	previous	0%
no-net_http1_1k_reqs_1_conn	0.011615747	0.011778778	current	-1%
http1_1k_reqs_1_conn	0.060492661	0.061404357	current	-1%
http1_1k_reqs_100_conns	0.090465821	0.09061921	current	0%
future_whenallsucceed_100k_immediately_succeeded_off_loop	0.080549118	0.080259785	previous	0%
future_whenallsucceed_100k_immediately_succeeded_on_loop	0.080940765	0.079877066	previous	1%
future_whenallsucceed_10k_deferred_off_loop	0.023354389	0.023212502	previous	0%
future_whenallsucceed_10k_deferred_on_loop	0.014468765	0.014316848	previous	1%
future_whenallcomplete_100k_immediately_succeeded_off_loop	0.040924739	0.040145402	previous	1%
future_whenallcomplete_100k_immediately_succeeded_on_loop	0.041419036	0.0405237	previous	2%
future_whenallcomplete_10k_deferred_off_loop	0.016106523	0.015676013	previous	2%
future_whenallcomplete_100k_deferred_on_loop	0.084619949	0.08085791	previous	4%
future_reduce_10k_futures	0.017307059	0.016911554	previous	2%
future_reduce_into_10k_futures	0.015271552	0.014511281	previous	5%
channel_pipeline_1m_events	0.099658043	0.101659459	current	-1%
websocket_encode_50b_space_at_front_100k_frames_cow	0.049749614	0.049812283	current	0%
websocket_encode_50b_space_at_front_1m_frames_cow_masking	0.657249144	0.668089258	current	-1%
websocket_encode_1kb_space_at_front_1m_frames_cow	0.526078369	0.523242559	previous	0%
websocket_encode_50b_no_space_at_front_100k_frames_cow	0.050102496	0.04962388	previous	0%
websocket_encode_1kb_no_space_at_front_100k_frames_cow	0.052487978	0.052218856	previous	0%
websocket_encode_50b_space_at_front_100k_frames	0.073903666	0.072742069	previous	1%
websocket_encode_50b_space_at_front_10k_frames_masking	0.00889408	0.008845607	previous	0%
websocket_encode_1kb_space_at_front_10k_frames	0.012442082	0.012337981	previous	0%
websocket_encode_50b_no_space_at_front_100k_frames	0.071874749	0.07207833	current	0%
websocket_encode_1kb_no_space_at_front_10k_frames	0.011704753	0.011690726	previous	0%
websocket_decode_125b_10k_frames	0.012596187	0.012334842	previous	2%
websocket_decode_125b_with_a_masking_key_10k_frames	0.013026622	0.01274516	previous	2%
websocket_decode_64kb_10k_frames	0.012870337	0.012671642	previous	1%
websocket_decode_64kb_with_a_masking_key_10k_frames	0.013328684	0.013136916	previous	1%
websocket_decode_64kb_+1_10k_frames	0.012897385	0.012642493	previous	2%
websocket_decode_64kb_+1_with_a_masking_key_10k_frames	0.013289503	0.013195296	previous	0%
circular_buffer_into_byte_buffer_1kb	0.033002613	0.033011484	current	0%
circular_buffer_into_byte_buffer_1mb	0.064661982	0.06466184	previous	0%
byte_buffer_view_iterator_1mb	0.01756013	0.017563643	current	0%
byte_buffer_view_contains_12mb	0.052910349	0.052952322	current	0%
byte_to_message_decoder_decode_many_small	0.041325565	0.041571445	current	0%
generate_10k_random_request_keys	0.091185533	0.090277131	previous	1%
bytebuffer_rw_10_uint32s	0.04080077	0.041266035	current	-1%
bytebuffer_multi_rw_10_uint32s	0.074633317	0.072410584	previous	3%
lock_1_thread_10M_ops	0.151529459	0.15131291	previous	0%
lock_2_threads_10M_ops	0.786501782	0.820194284	current	-4%
lock_4_threads_10M_ops	0.937752797	0.87456686	previous	7%
lock_8_threads_10M_ops	0.957658591	0.873560162	previous	9%
schedule_100k_tasks	0.063766844	0.062177021	previous	2%
schedule_and_run_100k_tasks	0.252803345	0.233980813	previous	8%
execute_100k_tasks	0.103045814	0.099815383	previous	3%
bytebufferview_copy_to_array_100k_times_1kb	0.010984296	0.010981564	previous	0%
circularbuffer_copy_to_array_10k_times_1kb	0.019746973	0.019756913	current	0%
deadline_now_1M_times	0.024568465	0.024640981	current	0%
asyncwriter_single_writes_1M_times	1.464787299	1.596832264	current	-8%
asyncsequenceproducer_consume_1M_times	0.907417083	0.885448468	previous	2%
udp_10k_writes	0.37901331	0.375730776	previous	0%
udp_10k_vector_writes	0.205883308	0.204086694	previous	0%
udp_10k_vector_reads	0.386684625	0.38397455	previous	0%
udp_10k_vector_reads_and_writes	0.109082179	0.10824488	previous	0%
tcp_100k_messages_throughput	0.75330207	0.778933674	current	-3%

significant differences found

weissi · 2024-02-14T10:53:24Z

@swift-nio-bot perf test please

gmilos · 2024-02-27T12:00:39Z

@weissi re #2650 (comment)

That doesn't sound ideal as if it's actually the case that the others aren't drained, then this will now cause high CPU load and expects us to have #runs CPUs available etc.

No, because the tasks never run. They are just scheduled for some future date (that never arrives during the test run). So the tasks scheduled in the past runs are effectively dormant.

gmilos requested review from weissi and FranzBusch February 14, 2024 10:44

gmilos mentioned this pull request Feb 14, 2024

Track execute() and enqueue() tasks separately from scheduled tasks. #2645

Merged

weissi reviewed Feb 14, 2024

gmilos added 3 commits February 28, 2024 20:48

Revert "Fix SchedulingBenchmark preheating logic."

f315d2a

This reverts commit 859ba72.

Remove SchedulingBenchmark. Deemed not valuable enough.

90967dc

Merge branch 'main' into gm-fixup-SchedulingBenchmark

458201c

gmilos changed the title ~~Fix SchedulingBenchmark preheating logic.~~ Remove unreliable SchedulingBenchmark Feb 28, 2024

gmilos enabled auto-merge (squash) February 29, 2024 17:15

Lukasa approved these changes Mar 1, 2024

gmilos merged commit 325f762 into apple:main Mar 1, 2024
9 of 10 checks passed

Lukasa added the semver/none (No version bump required.) label Mar 1, 2024
