Random ArrayIndexOutOfBounds exceptions coming from ParserATNSimulator #804
Since it's always possible for users to assign types to a token which fall outside the normal bounds defined by [ int index = t + 1;
if (index >= 0 && index < from.edges.length) {
// add the edge to the DFA
from.edges[index] = to;
}
I had not observed the issue myself because one of the memory optimizations I needed required the inclusion of this range check quite a while back. Which brings us to the actual cause....
When we added code to generate rule bypass transitions during ATN deserialization, we allowed the deserializer to define tokens which are greater than One final note - the use of |
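The range check quoted in this comment can be tried in isolation. The following is a minimal sketch, not ANTLR's code: `EdgeTable` and its methods are hypothetical stand-ins for a DFA state's edge array, and the sizing (`maxTokenType + 2`, so that token type -1 lands in slot 0) is an assumption based on the `t + 1` shift shown above.

```java
public class EdgeTable {
    // Hypothetical stand-in for a DFA state's outgoing-edge array.
    private final Object[] edges;

    public EdgeTable(int maxTokenType) {
        // Assumed sizing: one slot per token type from -1 (EOF) to maxTokenType.
        edges = new Object[maxTokenType + 2];
    }

    /** Stores the edge and returns true, or returns false if t is out of range. */
    public boolean addEdge(int t, Object target) {
        int index = t + 1; // shift so that -1 maps to slot 0
        if (index >= 0 && index < edges.length) {
            edges[index] = target;
            return true;
        }
        // Token types outside the table are simply not cached.
        return false;
    }

    /** Returns the cached target for t, or null if absent or out of range. */
    public Object getEdge(int t) {
        int index = t + 1;
        return (index >= 0 && index < edges.length) ? edges[index] : null;
    }
}
```

With this shape, a token type beyond the table (for example one introduced after the table was sized) is skipped instead of throwing.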
Wow! You found it. Excellent. I'll explore when I finish class prep. |
Not sure that’s the explanation:
|
@ericvergnaud makes a good point. I looked at a few other possibilities but haven't found any clear problems so far. The two areas of focus for me are:
|
Given the context in which it occurs, I don’t think the issue lies in the algorithm itself. I’m more suspicious of the two different allocation strategies for edges, depending on whether the DFA is a precedence DFA or not, especially given the below:
This clearly changes an existing DFA from a non-precedence DFA to a precedence DFA, which changes the allocation strategy. In a multi-threaded context, you can therefore simultaneously have one thread executing the below in addDFAEdge:
and another thread executing the below in dfa.setPrecedenceStartState:
s0.edges = Arrays.copyOf(s0.edges, precedence + 1);
If the latter occurs between the former's allocation and assignment, we could observe this behaviour. This could happen as a consequence of:
dfa.s0 = s0;
in adaptivePredict, which is done after checking whether the DFA is a precedence DFA, a property that another thread might have changed in between. In brief, I would delegate the s0 assignment to DFA so that it can be reliably protected, and see if that improves things.
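The suspected interleaving can be replayed deterministically on a single thread. This is only a sketch of the hypothesis in this comment: `EdgeRaceReplay`, `State`, and `replay` are hypothetical names, and the full-size allocation (`maxTokenType + 2`) is an assumption about what addDFAEdge does, since that snippet is not shown here.

```java
import java.util.Arrays;

public class EdgeRaceReplay {
    // Hypothetical stand-in for a DFA state with a shared edges field.
    static class State {
        Object[] edges;
    }

    /**
     * Replays the steps in the order: thread A allocates, thread B resizes,
     * thread A assigns. Returns true if A's write goes out of bounds.
     */
    static boolean replay(int maxTokenType, int precedence, int t) {
        State s0 = new State();

        // "Thread A", step 1: allocate the full-size table (assumed sizing).
        if (s0.edges == null) {
            s0.edges = new Object[maxTokenType + 2];
        }

        // "Thread B": resize the same field using the precedence-based length.
        s0.edges = Arrays.copyOf(s0.edges, precedence + 1);

        // "Thread A", step 2: re-read the field and write at index t + 1.
        try {
            s0.edges[t + 1] = "target";
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

For instance, `replay(10, 2, 5)` fails because the array has been shrunk to length 3 before the write to index 6, while `replay(10, 8, 5)` succeeds because the resized array is still long enough.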
|
Great job, guys! Thanks. Sounds suspicious, as you say. |
This is one of the more... "interesting" portions of the algorithm for sure. The above situation can only be a problem if You'll find that many of these sections are not synchronized, but rather rely on deterministic algorithms to produce usable results. In other words, in places where the algorithm could read a stale value and then write a new one over the top of it, the new value is equivalent to the old value for purposes of this algorithm. |
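The idea of relying on deterministic results rather than locks can be illustrated with a small sketch. It is not taken from ANTLR: `BenignRaceDemo`, `compute`, and `fill` are hypothetical, and the only property shown is that when every writer computes an equal value for a slot, a stale read followed by an overwrite does not change the outcome.

```java
public class BenignRaceDemo {
    // Deterministic: every thread computes an equal value for a given slot.
    static String compute(int slot) {
        return "state-" + slot;
    }

    /** Fills a shared cache from several unsynchronized threads. */
    static String[] fill(int slots, int threads) {
        String[] cache = new String[slots];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int s = 0; s < slots; s++) {
                    if (cache[s] == null) {      // may observe a stale null
                        cache[s] = compute(s);   // overwrite with an equal value
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // join makes the workers' writes visible to the caller
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return cache;
    }
}
```

The argument only holds while the shape of the shared structure stays fixed; it says nothing about a field that another thread may replace with a differently sized array.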
Note that while |
Am I correct in observing that adaptivePredict is only called during testing/profiling?
|
No, |
Unlike some systems where "warmup" is strictly before the "fast" path, ANTLR 4 is generally observed to continue adding DFA states as new input is received throughout the parsing process. However, the bulk of this occurs during the "first few files", after which point the majority of decisions end up going through the DFA.
¹ The customer was implementing search functionality and using over 1,000 threads for parsing on a large multi-processor machine. With this scale, a single |
Yeah, it’s all over the place in generated code, but my IDE of course only sees it in non-generated code.
|
FWIW, we're seeing this issue intermittently in our system, too:
|
Hi Martin. Wow, it's a party! Anything you can do to provide a (smallest possible) test that intermittently causes this issue would be much appreciated and would facilitate a quick fix. This bug concerns me greatly. Sam and Eric have already thought about this quite a bit. I will jump into it as soon as I catch up after this first week of school, hopefully. |
Thanks @martint. The specific index included in the exception message made me realize what the problem is. The failure sequence is:
I'll send a fix later today. |
Hooray! Crowd-sourcing at its finest! :) |
…ion from multiple threads Fixes antlr#804
Fixes #804 potential misuse of the DFA start state when initializing a decision
Thanks for fixing this! Any chance of getting a new release with this fix? Or if that's a big deal, getting a point release with just this fix? We hit this error quite frequently. |
Well, it’s kind of a major transaction cost to get out a new release. How about if I provide a jar? ;)
|
It looks like a transient problem; even in the environment where we saw the issue originally, it hasn't been reproducible. It only happens in our automation environment (Red Hat 5.7 running on OpenStack running on I dunno what) and only occasionally.
It happens in here:
so in some case atn.maxTokenType+1 > t must be true.
I checked and it looks like the field "edges" is always protected by its parent DFAState object, except in one suspicious place where it appears to be guarded by an instance of DFA, in DFA.java's setPrecedenceDfa method. (all in 4.3)
It would seem on the face of it that you need to synchronize inside setPrecedenceDfa on the DFAState rather than the DFA (or perhaps in addition to it?).
Maybe we should just add an assert there.
And perhaps add something about which threads are active and their stacks if it fails?
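One way such a diagnostic check might look is sketched below. `EdgeDiagnostics`, `describeThreads`, and `checkIndex` are hypothetical names and not part of ANTLR; the only library call relied on is the standard `Thread.getAllStackTraces()`, which returns a snapshot of the live threads and their stacks.

```java
import java.util.Map;

public class EdgeDiagnostics {
    /** Builds a text dump of every live thread and its current stack. */
    static String describeThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            sb.append(entry.getKey().getName()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Fails with thread diagnostics if t + 1 is not a valid index into edges. */
    static void checkIndex(Object[] edges, int t) {
        int index = t + 1;
        if (index < 0 || index >= edges.length) {
            throw new IllegalStateException(
                "edge index " + index + " out of range for length " + edges.length
                + "\nLive threads:\n" + describeThreads());
        }
    }
}
```

The dump is a snapshot taken after the failure is detected, so it shows what other threads are doing shortly afterwards, not necessarily at the moment the array was changed.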