Accessing caller function / memory from Wasmtime implementation #130

Closed
radu-matei opened this issue Jan 15, 2022 · 17 comments

Comments

@radu-matei
Member

There doesn't appear to be a way in the wit-bindgen-wasmtime implementation for accessing a caller's export function or memory from a host implementation.
Is this a valid scenario for this crate, or should something that needs low-level access use something else?

Thanks!

@alexcrichton
Member

The theory is that this isn't necessary because the interface types conversion should theoretically convey all the module was willing to convey, and consumers don't get anything further beyond the interface types boundary (in the same sense that you can't get un-exported functions from a wasm module through the wasmtime crate).

That being said there's nothing stopping one-off functions from having access to internals like this. It shouldn't be too hard to thread things along if there's a use case.

@radu-matei
Member Author

I'm trying to build an implementation for wasi-parallel using WIT, and I need to either get a function reference from a table, or get an exported function, both of which would require accessing some sort of caller object, similar to the one from the wasmtime crate.

@alexcrichton
Member

Hm ok for this it might be good to do two things in parallel, one is to bring this up with interface-types/wasi-parallel in that the interface currently desired by wasi-parallel isn't implementable with interface-types vanilla, and two would be adding a form of escape hatch to wit-bindgen to forward along the store if configured to do so (or something like that)

@tschneidereit
Member

> the interface currently desired by wasi-parallel isn't implementable with interface-types vanilla

I think that that's something we should treat as either blocking wasi-parallel or Interface Types: we want all WASI interfaces to be virtualizable without breaking encapsulation, so while we can work around limitations like this in wit-bindgen, that's not workable more generally, I think

@abrown

abrown commented Jan 19, 2022

The wasi-parallel interface was designed under the assumption that there would be some way to express (e.g., interface types, component model, etc.) that the API must receive a "function" to execute. This did not seem like a crazy assumption several months ago: one could imagine how a WASI threading API would require a function to fork or how some API would require a function to handle callbacks. Does this assumption no longer hold?

> we want all WASI interfaces to be virtualizable without breaking encapsulation

@tschneidereit, surely passing a "function" does not necessarily mean that it will break encapsulation: sure, a funcref will contain store state, but do you think no other options are possible?

We purposely left this area vague in the design but since then have explored several possibilities: function indices, funcrefs, strings, fat binaries, components (WebAssembly/wasi-parallel#3). None seem satisfactory yet in their current form so more work is needed. But I am at a loss as to how to proceed: who else is interested in enabling this and how would something like this be standardized in a way acceptable to wit-bindgen? @lukewagner?

@penzn

penzn commented Jan 20, 2022

On another note: what is the relationship between wit-bindgen and WASI? Are WASI interfaces expected to be available via interface types? Maybe I am missing something obvious, but wouldn't WASI be an interface between Wasm code and the host, while interface types is a much higher-level interop (between Wasm and a different runtime, or RPC, for example)?

@lukewagner

If we're talking about shared-memory parallelism for wasi-parallel (my suggestion in wasi-parallel/#3 was thinking more about non-shared-memory parallelism), I think a good way to express this in a way that is virtualizable and compatible with the overall component model goals is as a core module import where the wasi-parallel module imports memory from its client component (and thus there is never a need to ask for the "caller's memory").

E.g., a component importing wasi-parallel might look like:

(component
  (import "wasi-parallel" (module $Parallel
    (import "libc" "memory" (memory 1))
    (export "fork" (func (param funcref)))
  ))
  (module $Libc
    (memory (export "memory") 1)
    ...
  )
  (instance $libc (instantiate $Libc))
  (instance $parallel (instantiate $Parallel (import "libc" (instance $libc))))
  (module $Main
    (import "libc" "memory" (memory 1))
    (import "parallel" "fork" (func (param funcref)))
    ...
  )
  (instance $m (instantiate $Main (import "libc" (instance $libc)) (import "parallel" (instance $parallel))))
)

Thus, this client component explicitly wires up wasi-parallel to the memory it should use via instantiate. Because wasi-parallel is imported as a core module, it can use all core types, including funcref.

I think the remaining question is how to specify the interface of wasi-parallel. Normally wit is used to describe component instance interfaces (using interface types), but what we want here is a core module interface (using core types). Because wasi-parallel isn't using interface types, it doesn't need the normal wit-bindgen glue-code generation, so I wonder if the answer is that, unlike most other WASI proposals, wasi-parallel wouldn't use wit but, rather, just a plain core module type, written in whatever syntax we want. The simplest option would just be to use the moduletype s-expression defined by module-linking, in which case, for my above example, the interface for wasi-parallel would simply be:

(module
  (import "libc" "memory" (memory 1))
  (export "fork" (func (param funcref)))
)

Does that make sense?
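
As a rough analogy only, the wiring above can be modelled in plain Rust. Nothing here is Wasmtime or module-linking API: `LinearMemory`, `Kernel`, `Parallel`, `instantiate`, `fork`, and `set_first_byte` are hypothetical names, and an `Rc<RefCell<Vec<u8>>>` stands in for the memory exported by `$Libc`. The sketch assumes the kernel runs once and sequentially; the point is only that the parallel instance is handed its memory at instantiation, so `fork` never needs to ask for the "caller's memory".

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for the linear memory exported by `$Libc` (hypothetical model).
type LinearMemory = Rc<RefCell<Vec<u8>>>;

/// Stand-in for a `funcref`: a kernel that operates on the shared memory.
type Kernel = fn(&mut [u8]);

/// Stand-in for an instance of the imported `$Parallel` core module.
struct Parallel {
    memory: LinearMemory,
}

impl Parallel {
    /// Models `(instantiate $Parallel (import "libc" ...))`: the client
    /// decides which memory this instance will use.
    fn instantiate(memory: LinearMemory) -> Self {
        Parallel { memory }
    }

    /// Models `fork`: runs the kernel against the memory wired in at
    /// instantiation. It runs once, sequentially, to keep the sketch small.
    fn fork(&self, kernel: Kernel) {
        let mut mem = self.memory.borrow_mut();
        kernel(mem.as_mut_slice());
    }
}

/// Example kernel (hypothetical): writes a marker into the first byte.
fn set_first_byte(mem: &mut [u8]) {
    mem[0] = 42;
}
```

Creating the memory once, cloning the `Rc` into `Parallel::instantiate`, and then calling `fork(set_first_byte)` leaves the write visible through the original handle, which is the shared-memory behaviour the component above sets up.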

@abrown

abrown commented Jan 20, 2022

I think I understand the module-linking part of this ("thread the memory through to the instantiated module, etc.") and I would be fine with that approach if I felt the second question was resolved.

> so I wonder if the answer is that, unlike most other WASI proposals, wasi-parallel wouldn't use wit but, rather, just a plain core module type, written in whatever syntax we want

The issue with moduletype s-expressions is that it is quite unclear to me how users will consume this API. If there is no way to express the API in WIT, then presumably wit-bindgen can't do much for wasi-parallel, and users will have to manually find ways to use the API. E.g., a future Python user attempting to make a wasi-parallel call would not be able to use auto-generated wasi-parallel bindings but would have to build these by hand.

Should there be no way to express this type of thing in WIT? Or, specifically for @lukewagner's example: what would it take to express interfaces that use module-linking in WIT?

[edit: tweaked question since clearly there is no way currently to express this type of thing in WIT]

@lukewagner

I might be misunderstanding, but it seems like if wasi-parallel offers shared-memory-multi-threading, wit-bindgen doesn't have anything to offer since you don't want to copy data between linear memories (there is only 1 and you want to pass around pointers into it). Thus, to use wasi-parallel, I imagine all you'd need is a C header file containing the exports of the wasi-parallel module (which could be published alongside the moduletype). This does require non-C languages to perform some degree of manual integration, but that feels like the inherent tradeoff involved with shared-memory interfaces (and not different than if wasi-parallel was published today as a native .so+.h).

@penzn

penzn commented Jan 21, 2022

@lukewagner, is that a temporary limitation? Will memory sharing be coming to wit-bindgen at some point? I recall that in the presentation about the future of interface types you said that the component model would support both share-nothing and shared-memory scenarios.

On a side note, I feel that there are two different use cases here: for an external interface it might not be necessary to support core features that the other language or middleware can't express, while it might be necessary for a host API like WASI.

@lukewagner

Well, my hypothesis is that when doing memory sharing, you don't need/want wit-bindgen; you're passing around i32 pointers through interfaces and you want it that way. So I don't understand this as a limitation so much as the nature of shared-everything interfaces. To be clear: a component can import a core-module (and then share the component's internal linear memory with a privately-created instance of that imported core-module) -- this doesn't break the shared-nothing architecture we're striving for with the component model, since the imported core-module is immutable code.
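
To make "passing around i32 pointers through interfaces" concrete, here is a minimal sketch in plain Rust. It is an illustration under assumptions, not any real wasi-parallel or Wasmtime signature: a byte slice stands in for the single shared linear memory, and `sum_bytes` is a hypothetical callee that receives only an offset and a length and reads the caller's bytes in place, with no copy and no generated glue.

```rust
use std::convert::TryFrom;

/// Hypothetical callee in a shared-everything interface: it receives only
/// an i32 offset and an i32 length, and reads the bytes in place from the
/// one linear memory both sides share. Returns `None` if the range is
/// negative or out of bounds.
fn sum_bytes(memory: &[u8], ptr: i32, len: i32) -> Option<u32> {
    let start = usize::try_from(ptr).ok()?;
    let len = usize::try_from(len).ok()?;
    let end = start.checked_add(len)?;
    let bytes = memory.get(start..end)?;
    Some(bytes.iter().map(|&b| u32::from(b)).sum())
}
```

Because the interface carries nothing but integers, there is nothing for a bindings generator to lift or lower; the bounds checks are the only work at the boundary in this model.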

@abrown

abrown commented Jan 21, 2022

For me, that the tooling might not work for wasi-parallel seems like a major issue. The interface already has create_buffer, read_buffer, and write_buffer calls that would allow the use of wasi-parallel to be shared-nothing. So if the shared-memory part is the blocker for having tools support, it need not be.
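
For readers who have not seen the buffer calls, a shared-nothing shape for them might look roughly like the following plain-Rust sketch. The signatures are assumptions for illustration (the thread does not spell them out): `BufferStore` is a hypothetical host-side table, the guest only ever holds opaque `u32` handles, and every transfer is an explicit copy.

```rust
use std::collections::HashMap;

/// Hypothetical host-side store; the guest never sees these bytes directly.
#[derive(Default)]
struct BufferStore {
    next_handle: u32,
    buffers: HashMap<u32, Vec<u8>>,
}

impl BufferStore {
    /// Allocates a zeroed host-owned buffer and returns an opaque handle.
    fn create_buffer(&mut self, size: usize) -> u32 {
        let handle = self.next_handle;
        self.next_handle += 1;
        self.buffers.insert(handle, vec![0; size]);
        handle
    }

    /// Copies guest data into the host buffer. Fails on an unknown handle
    /// or when the data length differs from the buffer size.
    fn write_buffer(&mut self, handle: u32, data: &[u8]) -> Result<(), String> {
        let buf = self
            .buffers
            .get_mut(&handle)
            .ok_or_else(|| "unknown buffer handle".to_string())?;
        if data.len() != buf.len() {
            return Err("size mismatch".to_string());
        }
        buf.copy_from_slice(data);
        Ok(())
    }

    /// Copies the host buffer contents back out to the guest.
    fn read_buffer(&self, handle: u32) -> Result<Vec<u8>, String> {
        self.buffers
            .get(&handle)
            .cloned()
            .ok_or_else(|| "unknown buffer handle".to_string())
    }
}
```

Since only handles and copied byte lists cross the boundary in this model, no linear memory is shared between caller and callee.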

@lukewagner

Oh, sorry, maybe I'm misunderstanding the fundamental interface here. When I heard "caller's memory", I assumed that was because the interfaces were passing pointers (offsets) relative to a linear memory shared between the caller and callee (and closed over by all the funcrefs) -- but it sounds like maybe that isn't the case? Could you sketch out the interfaces of those 3 buffer functions a bit so we could see how the data flows between the orchestrator and the kernel functions?

@penzn

penzn commented Jan 24, 2022

> To be clear: a component can import a core-module (and then share the component's internal linear memory with a privately-created instance of that imported core-module) -- this doesn't break the shared-nothing architecture we're striving for with component model since the imported core-module is immutable code.

In wasi-parallel, only the callback would modify the memory of the module; the core module only controls execution of the callback. There is functionality to declare what memory the callback would need, because in the GPU case it needs to be copied to and/or from the device, but the core module would only be there to transfer data, rather than accessing it for its own purposes. This callback is what we need the funcref for.
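
One way to picture "declare what memory the callback would need" is the plain-Rust sketch below. It is a model under assumptions, not the wasi-parallel API: `Access`, `Region`, `run_kernel`, and `double_all` are hypothetical names, a `Vec<Vec<u8>>` stands in for device memory, and the sketch assumes the kernel keeps each region's length unchanged and that declared ranges are in bounds (it panics otherwise).

```rust
/// How the kernel intends to use a declared region (hypothetical).
#[derive(Clone, Copy, PartialEq)]
enum Access {
    Read,
    ReadWrite,
}

/// A region of guest memory that the kernel declares up front (hypothetical).
struct Region {
    offset: usize,
    len: usize,
    access: Access,
}

/// Copies each declared region into a separate "device" allocation, runs the
/// kernel on the copies, then copies back only the regions declared writable.
fn run_kernel(memory: &mut [u8], regions: &[Region], kernel: fn(&mut [Vec<u8>])) {
    // Copy in: the device never touches guest memory directly.
    let mut device: Vec<Vec<u8>> = regions
        .iter()
        .map(|r| memory[r.offset..r.offset + r.len].to_vec())
        .collect();

    kernel(&mut device);

    // Copy out: read-only regions are left untouched in guest memory.
    for (r, data) in regions.iter().zip(&device) {
        if r.access == Access::ReadWrite {
            memory[r.offset..r.offset + r.len].copy_from_slice(data);
        }
    }
}

/// Example kernel (hypothetical): doubles every byte of every region.
fn double_all(regions: &mut [Vec<u8>]) {
    for region in regions.iter_mut() {
        for byte in region.iter_mut() {
            *byte = byte.wrapping_mul(2);
        }
    }
}
```

In this model the orchestrating code only moves bytes between guest memory and the device; the kernel is the only code that changes them, which matches the division of labour described above.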

@lukewagner

Ah, that helps, thanks. Could you describe in a bit more detail how you're imagining to declare the memory that the callback needs and how it gets passed to/from the callback in the GPU and non-GPU cases?

@abrown

abrown commented Jan 31, 2022

Ok, since the last comment on this thread, @penzn, @lukewagner, @egalli, and I met to discuss the issue in more detail. Here is my summary of what we discussed (others can supplement if they want):

  1. the wasi-parallel API is flexible enough to either use shared memory (e.g., by passing a funcref) or not (e.g., using the buffer API here); by "flexible" we mean that we could mold the API to whichever paradigm and mechanism is easiest for users, though we would prefer to not share state to enable parallelism for non-CPU devices
  2. we discussed both of Luke's module-linking suggestions (shared memory, shared-nothing); my takeaway was that both of these involve significant work on the toolchains that compile the Wasm modules to use wasi-parallel (e.g., the compiler must understand module linking, the compiler must be able to generate wasi-parallel code that uses module-linking, there is no way for the user to currently specify what code would be the parallel kernel, etc.)
  3. if these suggestions must be implemented in the toolchain, then WIT and wit-bindgen cannot help developers use wasi-parallel; WIT has no way to express the wasi-parallel API (it would be a core module) and, indeed, it would be outside WIT's scope to understand that wasi-parallel code must be generated by a special toolchain (i.e., to create the module-linking modules, etc.)
  4. we iterated on some alternatives, some way to dynamically pass a "function pointer" without shared state ("use this function without shared memory", first-class instances, etc.), but my takeaway was that Wasm and WASI have no way to express this and it would involve a long, uphill battle before this would be possible

@alexcrichton
Member

I'm going to close this issue in an effort to help clean up the wit-bindgen issue tracker for now. The component model support in Wasmtime will not provide access to the caller's function or memory, by design. I think that a number of other features might be necessary for wasi-parallel, but that discussion is probably best held in the component-model repository rather than here in wit-bindgen, since nowadays wit-bindgen is trying to pivot to being more of an implementation of the component model rather than a testbed for features to go into the spec.
