-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support to resolve bindings to built-in functions, types and variables #1101
base: main
Are you sure you want to change the base?
Conversation
|
... and to silence clippy :)
@@ -75,4 +75,8 @@ impl SpannedLanguage { | |||
.flat_map(|section| §ion.topics) | |||
.flat_map(|topic| &topic.items); | |||
} | |||
|
|||
fn built_ins(&self) -> impl Iterator<Item = &SpannedBuiltIn> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I wonder if this is still needed? self.built_ins
is already a public iterable, and the single callsite can consume it the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not needed, I wanted to keep uniform wrt. items()
. But I agree it doesn't make much sense. I'll remove it.
pub return_type: Option<String>, | ||
pub parameters: Vec<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit suggestion: ordering:
pub return_type: Option<String>, | |
pub parameters: Vec<String>, | |
pub parameters: Vec<String>, | |
pub return_type: Option<String>, |
#[derive_spanned_type(Clone, Debug, ParseInputTokens, WriteOutputTokens)] | ||
pub struct BuiltInType { | ||
pub name: String, | ||
pub fields: Vec<BuiltInField>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we plan on adding functions to these types here?
pub fields: Vec<BuiltInField>, | |
pub fields: Vec<BuiltInField>, | |
pub functions: Vec<BuiltInFunction>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I was declaring them as fields with appropriate function types, as that's how they are generated in the built-in source file (since types are declared as struct
s). But for defining them in the language I like your idea better.
#[derive(Clone, Debug, Deserialize, Eq, PartialEq, Serialize)] | ||
#[derive_spanned_type(Clone, Debug, ParseInputTokens, WriteOutputTokens)] | ||
pub struct BuiltInField { | ||
pub def: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit suggestion: using the full name for clarity. there are many "definitions" used through out the codebase:
pub def: String, | |
pub definition: String, |
4 │ modifier noReentrancy() { | ||
│ ──────┬───── | ||
│ ╰─────── def: 3 | ||
5 │ require(!locked, "No reentrancy"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that we have built-ins, what do you think of using success
and failure
file suffixes to indicate the status of binding (instead of parsing), as that is what we are really checking here.
i.e. If a file has syntax errors, but we still bind everything else, it is a success. But for a test like this, it will change from 0.4.11-failure
to 0.5.0-success
because of the unresolved require
..
We can still include the parse errors for debuggability though.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't be the case that parsing fails and all bindings are resolved. At least I think that should never happen. If that could happen, I'd suspect that the test is not well written (as in, why does it contain syntax that's not important for testing the binding). I like the idea of tagging tests success
ful only when bindings are resolved. What do you think about using the success
suffix when both parsing and bindings succeed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parsing can still fail when we are explicitly testing binding in the precense of syntax errors, or when we are testing a syntax that is not available in all versions.
Binding here should run regardless and produce results. Right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'm referring to the meaning of success
/failure
mark only. There would be no change in how the tests are executed.
│ ╰─────── def: 3 | ||
5 │ require(!locked, "No reentrancy"); | ||
│ ───┬─── ───┬── | ||
│ ╰───────────── ambiguous: built-in, built-in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, we are returning multiple definitions in this case (overloads) .. should we print both instead of ambiguous
here? it would be a "sucessful" result still..
│ ╰───────────── ambiguous: built-in, built-in | |
│ ╰───────────── ref: built-in, built-in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both: remove the ambiguous since we're moving away complexity away from the bindings and multiple definitions should be expected more; and it should still be considered successful.
│ ╰───────────────────── def: 3 | ||
│ │ | ||
│ ╰──── def: 4 | ||
5 │ require(_addr != address(0), "Not valid address"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest removing the require()
from both of these tests, since they are testing modifiers, and require()
is just adding noise for the earlier versions ..
We can add a separate test for it to still cover it..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I think I have the same issue in several tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, these are testing something relevant: that the body of the modifier can access the arguments passed in (or the state variables in the simple
case). I replaced require()
by assert()
which exists in all supported versions.
|
||
use semver::Version; | ||
|
||
#[allow(unused_variables)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using built_ins.rs.jinja2
, I suggest generating separate files for each version:
slang_solidity/src/generated/bindings/builtins/0.4.11.sol
slang_solidity/src/generated/bindings/builtins/0.5.0.sol
- etc...
And include it here via include_str()!
:
#[allow(unused_variables)]
pub fn get_contents(version: &Version) -> &'static str {
if *version < Version::new(0, 5, 0) {
include_str("./builtins/0.4.11")
} else {
// etc...
}
}
This allows syntax highlighting, diffing the versions, and comparing changes during code review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I like it.
#[allow(unused_variables)] | ||
pub fn get_contents(version: &Version) -> &'static str { | ||
if *version < Version::new(0, 5, 0) { | ||
r####"contract $$ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion instead of using $$
, as this will be user visible.. would that work for our internal bindings?
r####"contract $$ { | |
r####"interface $BuiltIn$Functions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can use any name here. I had already changed the name in the complete built-ins branch. The important thing is keeping it in sync with the rules file where it's linked.
Using an interface wouldn't work, since the rules would not bind the declared state variables as globals, because state vars are not allowed in interfaces (even though they will parse successfully). We could make a special case in the rules file if needed though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because state vars are not allowed in interfaces
We could make a special case in the rules file
That makes sense. Let's stick to the language semantics, to make sure they are understood by users.
In that case, contract $BuiltIns$
sounds good to me (PascalCase).
;; All built-in symbols are defined in an internal contract named '$$' | ||
;; so we need to construct an equivalent import path to reach them. | ||
;; We should have access to both type members (eg. defined enums & structs) | ||
;; as well as functions and state variables (see special case below), hence | ||
;; why we're introducing a path through `@typeof`. | ||
node built_ins | ||
attr (built_ins) push_symbol = BUILT_INS_FILE_PATH | ||
|
||
node built_in_library | ||
attr (built_in_library) push_symbol = "$$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I suggest adding markers so that they stay in sync. And we can add that everywhere else its name is used/defined.
;; All built-in symbols are defined in an internal contract named '$$' | |
;; so we need to construct an equivalent import path to reach them. | |
;; We should have access to both type members (eg. defined enums & structs) | |
;; as well as functions and state variables (see special case below), hence | |
;; why we're introducing a path through `@typeof`. | |
node built_ins | |
attr (built_ins) push_symbol = BUILT_INS_FILE_PATH | |
node built_in_library | |
attr (built_in_library) push_symbol = "$$" | |
;; All built-in symbols are defined in an internal contract (see __SLANG_BUILT_INS_CONTRACT_NAME__) | |
;; so we need to construct an equivalent import path to reach them. | |
;; We should have access to both type members (eg. defined enums & structs) | |
;; as well as functions and state variables (see special case below), hence | |
;; why we're introducing a path through `@typeof`. | |
node built_ins | |
attr (built_ins) push_symbol = BUILT_INS_FILE_PATH | |
node built_in_library | |
attr (built_in_library) push_symbol = "$$" ;; __SLANG_BUILT_INS_CONTRACT_NAME__ (keep in sync) |
node built_in_member | ||
attr (built_in_member) push_symbol = "." | ||
|
||
edge @source_unit.lexical_scope -> built_in_member |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm missing something here:
- Why would we use push symbols (
.
,@typeof
, etc..) here? the users can never write something like$BuiltIns$.XXX
anyways. We are already exporting theContractMembers
below to the parent file scope. - What here prevents users from referencing the
$BuiltIns$
contract directly in their code? and if they can't, what is the significance of using dollar signs in the first place? why can't we just usecontract BuiltIns
andstruct Address
for example? - How is precedence calculated, in case the user defines their own
require()
method orAddress
type for example?
Happy to follow up offline to understand better. Not a blocker of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Since built-ins are inside the special
$BuiltIns$
contract, we need a way to "promote" them to the global namespace. That's why.
and@typeof
are pushed here, in connection to the built-ins path. This is as if the user had referenced global with the$BuiltIns$.
prefix.Having said that, I think we can find a better way to do this, by directly linking
the$BuiltIns$
members to the global namespace from the built-ins side. -
For now, nothing prevents users from referencing
$BuiltIns$
. I have some code in another branch that renames all identifiers in the built-ins file by replacing$
for%
. That way it will be impossible for users to explicitly reference the built-ins contract. Similarly with built-in types such asaddress
. -
That's a good point. For types it shouldn't be a problem because of the renaming of built-in identifiers. But we currently don't have any way to prefer a user's definition over the built-in. We could resolve it via ranking, but since we've discussed about removing that feature, we may need to fallback to a nano pass later.
@@ -349,6 +378,12 @@ inherit .enclosing_def | |||
)] | |||
]] { | |||
edge @contract.lexical_scope -> @member.def | |||
|
|||
;; Special case: for the built-ins file, export state variables in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will export all members (functions, structs, etc...), not just state variables, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was needed because normally contract state variables are not accessible externally, and we need that to be able to emulate globals (eg. the global tx
is connected in the stack graph as $BuiltIns$.tx
). Functions and structs are exposed in a different scope type_members
for this reason.
But, as I said in the previous reply, I think there's a better way that would be less confusing.
@@ -79,16 +103,16 @@ struct RuntimeModel { | |||
} | |||
|
|||
impl RuntimeModel { | |||
pub fn from_language(language: &Rc<Language>) -> Result<Self> { | |||
pub fn from_language(language: &LanguageModel) -> Result<Self> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused about the newly added model, that just unwraps to the existing model. Additionally, built-ins are only used/generated in a single runtime (solidity rust) and ignored everywhere else (stubs, testlang, other solidity outputs).
Instead of the changes in the generator crate here, WDYT of just rendering/writing the built ins in the solidity crate directly? as this is all very solidity specific, and cannot be applied to any other runtime. It can also make doing this easier, since the versions/sources can be generated/written to disk directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was an attempt at remaining generic for any language defined in Slang, so that all of them can potentially define built-ins. But since we've decided to take that apart and have the user explicitly add the built-ins all this can be re-worked.
I'll rework the built-ins generation code.
crates/metaslang/bindings/src/lib.rs
Outdated
type CursorID = usize; | ||
pub struct DefinitionHandle(GraphHandle); | ||
|
||
pub const BUILT_INS_FILE_PATH: &str = "@@built-ins@@"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it is possible to supply this value to Bindings::create()
, instead of statically defining it, so that the Compilation
API can make sure to use a name that doesn't conflict with anything the user might use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do that, but wouldn't it make the API awkward to use? Like, why would the user need to decide how to name an internal file? Hmm, I'm thinking the underlying issue here is whether we want to have the "built-ins" as a concept that the Bindings API needs to know about, or is it something that's completely handled outside.
If we want to extract that concept, then having a way to configure the binding rules execution (as in setting a couple of graph global variables) would be enough. I'm not opposed to that idea, but the counterpart is that now users will need to implement this handling themselves.
WDYT?
crates/metaslang/bindings/src/lib.rs
Outdated
_ = self.add_file_internal(file, tree_cursor); | ||
self.built_ins_file = Some(file); | ||
} | ||
|
||
pub fn add_file(&mut self, file_path: &str, tree_cursor: Cursor<KT>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that BUILT_INS_FILE_PATH
is known at this point, I wonder why the additional overload add_built_ins()
? Can't add_file()
just check if file_path == BUILT_INS_FILE_PATH
and do the right thing?
It eliminates the error of misusing the original function to add built ins by mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the opposite thing: add_file
should check that the user is not trying to add a file as the built-ins, or resolve that internally in some other way so that there's no possibility of clashing. But as I said in the previous comment, I guess it depends on who should handle the built-ins concept.
pub use definition::SolidityDefinition; | ||
|
||
pub fn render_built_ins(built_ins: &[BuiltIn]) -> String { | ||
let mut lines: Vec<String> = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion if it is more ergonomic for you: write!()
and writeln!()
can be used directly in strings, instead of creating a Vec<String>
and joining it at the end.
Example in crates/codegen/spec/src/generators/grammar_ebnf.rs
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also suggest moving this to a mod bindings
to maintain the structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I didn't know you could write!
or writeln!
directly to a string. I'll change this.
|
||
pub type Bindings = metaslang_bindings::Bindings<KindTypes>; | ||
pub type Definition<'a> = metaslang_bindings::Definition<'a, KindTypes>; | ||
pub type Reference<'a> = metaslang_bindings::Reference<'a, KindTypes>; | ||
|
||
pub fn create_with_resolver( | ||
version: Version, | ||
parser: &Parser, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand the change here, given that Bindings::add_built_ins()
(and Bindings::add_file()
) exist in the public API. The Bindings
owner is responsible for loading/parsing the built ins file, and passing its cursor, similar to everything else.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this was before we decided to move loading the built-ins responsibility away from the Bindings API. I was going to make that change in another PR, but may as well do it here, along with the other issues we've discussed so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Left a few questions/suggestions.
From now on, `success` means that both parsing was successful and all references were resolved.
The user must parse and add them explicitly.
e51c286
to
f08836a
Compare
Instead of a single built-ins file, the user can now add any number of system files and the API will ensure there's no collision with user provided files. For Solidity, any system files are linked to the ROOT_NODE as a special name "@@built-ins@@" that cannot collide with user files (always prefixed by "user:").
This PR adds the infrastructure to the Bindings implementation to be able to:
Bindings
object for a specific language versionSo far, this contains just a few built-ins to test the infrastructure and verify that all functions, types and variables can be resolved.