
Reorg runtime tests (3rd generation); mostly to use descriptor files for tests not annotations in *Descriptor.java files (supports later versions of java) #3407

Merged
43 commits merged into antlr:master on Dec 23, 2021

Conversation

parrt
Member

@parrt parrt commented Dec 14, 2021

The goal is to remove dependence on tools.jar for annotation processing.
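For background, a rough sketch of the annotation-based *Descriptor.java style the title refers to. The Javadoc-comment mechanism and every name below are illustrative assumptions rather than the exact test-rig API, but they show the kind of annotation processing the description says depended on tools.jar:

import java.lang.annotation.*;

// Illustrative only: a marker for fields whose value the old rig derived
// from the field's Javadoc comment at build time.
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.FIELD)
@interface CommentHasStringValue {}

// Illustrative only: one runtime test described as an annotated Java class.
class SampleParserDescriptor {
    /**
     grammar T;
     s : 'a' 'b' EOF ;
     */
    @CommentHasStringValue
    public String grammar;

    /**
     (s a b <EOF>)
     */
    @CommentHasStringValue
    public String output;
}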

@parrt parrt added this to the 4.9.4 milestone Dec 14, 2021
@KvanTTT
Member

KvanTTT commented Dec 15, 2021

There is something weird with the Go runtime tests; they are hanging at the moment.

@parrt
Member Author

parrt commented Dec 15, 2021 via email

@KvanTTT
Member

KvanTTT commented Dec 18, 2021

.txt is the extension for generic text files, but these files are test descriptions. Maybe replace .txt with .desc or something else?

@parrt
Member Author

parrt commented Dec 18, 2021

Easier to open files, and .descr doesn't add much :)

I think I might have this kinda working! I also split the cpp tests so they finish w/o timing out.
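For readers skimming the diff, a minimal sketch of what one of these .txt descriptors could look like, assuming a simple bracketed-section layout; the actual section names and schema are whatever the new test rig defines:

[type]
Parser

[grammar]
grammar T;
s : 'a' 'b' EOF ;

[start]
s

[input]
ab

[output]
(s a b <EOF>)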

@parrt
Member Author

parrt commented Dec 22, 2021

I think it's also an unrelated change. Let's set up CI in another pull request.

I have broadened the scope of the PR to be a reorganization of the entire runtime test rig; changed the title.

Moreover, it looks like there are some problems with GitHub CI because of the following error:

Can't find any online and idle self-hosted runner in the current repository, account/organization that matches the required labels: 'self-hosted , macOS , x64'
Waiting for a self-hosted runner to pickup this job...

Yeah, I don't know what the issue is there, but I don't think it has anything to do with the configuration I changed. GitHub is likely just pissed that we have so many actions coming in.

@KvanTTT
Member

KvanTTT commented Dec 22, 2021

Yeah, I don't know what the issue is there, but I don't think it has anything to do with the configuration I changed. GitHub is likely just pissed that we have so many actions coming in.

There is something wrong with @ericvergnaud's machine, since GitHub CI uses a self-hosted configuration; see the #3425 thread. Changing runs-on: [self-hosted, macOS, x64] to runs-on: macos-latest resolved the problem, but probably not completely.
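For context, the change being described, sketched as it would appear in a workflow job definition (illustrative only; the actual job configuration lives in the repository's workflow files):

# before: requires an online, idle self-hosted macOS runner
runs-on: [self-hosted, macOS, x64]

# after: use a GitHub-hosted macOS runner instead
runs-on: macos-latest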

@parrt
Member Author

parrt commented Dec 22, 2021

Ok, I'm running some IntelliJ code analysis and cleaning up accordingly at the moment, then I think we can merge. Next step: maybe somebody smarter than me can figure out how to get this to compile with Java 11 but generate Java 7 (or 8) class files, haha.
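As an aside, one possible route is javac's --release flag, which compiles against the older platform API and emits older class files; a sketch, assuming the build uses maven-compiler-plugin 3.6+ on a JDK 11 toolchain:

mvn -Dmaven.compiler.release=8 clean test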

@KvanTTT
Member

KvanTTT commented Dec 22, 2021

Looks like self-hosted GitHub CI is working again: https://github.com/antlr/antlr4/actions

@parrt
Member Author

parrt commented Dec 22, 2021

Let's see if it works for this PR. :)

@parrt
Member Author

parrt commented Dec 23, 2021

Looks like all unit tests pass except for swift-recursion due to timeout (works on my machine). I think it's time to merge!!

@parrt parrt merged commit a03db24 into antlr:master Dec 23, 2021
@KvanTTT
Member

KvanTTT commented Dec 23, 2021

Looks like all unit tests pass except for swift-recursion due to timeout (works on my machine). I think it's time to merge!!

Maven says it's not a timeout but a test failure: https://github.com/antlr/antlr4/runs/4612727990?check_suite_focus=true#step:6:542

@parrt
Member Author

parrt commented Dec 23, 2021

Yes, but if you look at the Time elapsed: 1,173.605, it appears to be a timeout that simply didn't get any output, causing a failed test. That's in seconds, so almost exactly 20 minutes.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

That's the elapsed time for the entire test group, and it's not that much. For instance, here we have 27 min and green tests (also the swift RECURSION group): https://github.com/antlr/antlr4/runs/4609645043?check_suite_focus=true
I'll recheck on my machine.

@parrt
Member Author

parrt commented Dec 23, 2021

Hmm... Interesting. OK, well, it passes elsewhere so I'm not sure why it would fail over there. Let me see if it fails here locally on an identical box to @ericvergnaud's.

@parrt
Member Author

parrt commented Dec 23, 2021

All TestLeftRecursion tests pass on my M1 Mac, which is identical to the test box (13m8s). So the only difference would be the number of threads, as I keep it single-threaded here. That could mean there's an issue with Swift testing in parallel.

@ericvergnaud
Contributor

My experience is that GitHub runners do not behave consistently. The exact same commit will sometimes pass the tests in 5 minutes, other times in 10 minutes, and other times time out. This is not specific to Swift.
This also happens with other CIs, but more frequently with GitHub, possibly because we're using self-hosted runners, and the network conversation becomes less reliable.
Here's a ping trace from my box:
PING github.com (140.82.121.3): 56 data bytes
64 bytes from 140.82.121.3: icmp_seq=0 ttl=51 time=34.212 ms
64 bytes from 140.82.121.3: icmp_seq=1 ttl=51 time=52.130 ms
64 bytes from 140.82.121.3: icmp_seq=2 ttl=51 time=91.085 ms
64 bytes from 140.82.121.3: icmp_seq=3 ttl=51 time=71.296 ms
64 bytes from 140.82.121.3: icmp_seq=4 ttl=51 time=154.234 ms
64 bytes from 140.82.121.3: icmp_seq=5 ttl=51 time=119.809 ms
64 bytes from 140.82.121.3: icmp_seq=6 ttl=51 time=200.065 ms
64 bytes from 140.82.121.3: icmp_seq=7 ttl=51 time=184.354 ms
Request timeout for icmp_seq 8
64 bytes from 140.82.121.3: icmp_seq=9 ttl=51 time=148.027 ms
64 bytes from 140.82.121.3: icmp_seq=10 ttl=51 time=162.412 ms
64 bytes from 140.82.121.3: icmp_seq=11 ttl=51 time=67.909 ms
64 bytes from 140.82.121.3: icmp_seq=12 ttl=51 time=117.240 ms
64 bytes from 140.82.121.3: icmp_seq=13 ttl=51 time=136.301 ms
64 bytes from 140.82.121.3: icmp_seq=14 ttl=51 time=177.255 ms
64 bytes from 140.82.121.3: icmp_seq=15 ttl=51 time=108.688 ms
64 bytes from 140.82.121.3: icmp_seq=16 ttl=51 time=126.442 ms
Request timeout for icmp_seq 17
64 bytes from 140.82.121.3: icmp_seq=18 ttl=51 time=103.139 ms
64 bytes from 140.82.121.3: icmp_seq=19 ttl=51 time=151.846 ms
64 bytes from 140.82.121.3: icmp_seq=20 ttl=51 time=187.243 ms
^C
--- github.com ping statistics ---
21 packets transmitted, 19 packets received, 9.5% packet loss
round-trip min/avg/max/stddev = 34.212/125.984/200.065/46.485 ms

@parrt
Member Author

parrt commented Dec 23, 2021

Very interesting. So even if we bought a bigger Mac or hosted at Amazon, we could still have problems communicating with GitHub.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

My experience is that GitHub runners do not behave consistently. The exact same commit will sometimes pass the tests in 5 minutes, other times in 10 minutes, and other times time out. This is not specific to Swift.

Have you checked whether it depends on simultaneously running pipelines? For instance, there are 6 running pipelines right now, which is a lot: https://github.com/antlr/antlr4/actions
Maybe it would be much faster with a single pipeline. I'm not sure communication is the problem.

@ericvergnaud
Contributor

ericvergnaud commented Dec 23, 2021

These 6 pipelines are queued, not running. We have 4 runners, so we can only run 4 pipelines simultaneously. Initially I implemented 8, but that turned out to be disastrous, so I scaled down to 4.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

What if we scale down to 1 and try several runs?

@ericvergnaud
Contributor

Not sure what that means tbh

@parrt
Member Author

parrt commented Dec 23, 2021

Wow: swift-build, the driver, etc. are all running in Intel mode on my M1. I'm playing with threads. Ok, this creates lots of transient processes/threads, but it seems a bit out of hand:

mvn -Dparallel=classes -DthreadCount=4 -Dtest=swift.** test

[screenshot: Screen Shot 2021-12-23 at 1 52 41 PM]

I've seen 40 builds simultaneously.

Ok, it completed much faster than single threaded: real 7m7.051s.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

FYI, some runtime tests were lost during transformation, for instance SemPredEvalParser/PredFromAltTestedInLoopBack_1. But I've restored (and fixed) it in my rebased PR.

@parrt
Member Author

parrt commented Dec 23, 2021

Now I'm running with methods not classes:

mvn -Dparallel=methods -DthreadCount=4 -Dtest=swift.** test

Almost nothing is being done in parallel; not sure why.

@parrt
Member Author

parrt commented Dec 23, 2021

FYI, some runtime tests were lost during transformation, for instance SemPredEvalParser/PredFromAltTestedInLoopBack_1.

I was worried about that. I have a todo on my list to count them :) but thanks.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

Maybe it's related to some bug with test inheritance (PredFromAltTestedInLoopBack_1 and PredFromAltTestedInLoopBack_2 had a common base).

@parrt
Member Author

parrt commented Dec 23, 2021

Yeah, I saw a few that didn't get translated, but it was usually a missed ignored() result or something. Could be inheritance, yep.

@KvanTTT
Member

KvanTTT commented Dec 23, 2021

I've seen 40 builds simultaneously.

Yes, it looks weird; most of them should be killed.

@parrt
Member Author

parrt commented Dec 23, 2021

Ok, mvn -Dparallel=methods -DthreadCount=4 -Dtest=swift.** test does nothing in parallel that I can see:

[screenshot: Screen Shot 2021-12-23 at 2 20 44 PM]

and it is slow... it hasn't finished, so I'm not sure if it's faster than single-threaded, but it ain't doing methods in parallel, that's for sure. Ok, real 25m7.639s, so no speed improvement.
