-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop synchronous counterparts? #128
Comments
That does make using the library synchronously annoying, but if it gets rid of a lot of code duplication - I'd say it's worth it. |
I would support having async only as well, but one thing I forsee issues with is the current way Buffer is set up. More so, since Buffer is subclassing bytearray, it would be a bit interesting working with it where the reading and writing is asynchronous, but everything built in in bytearray is synchronous. |
Hmm, that's actually a great point which I haven't considered, thanks for bringing it up @CoolCat467! Indeed the Buffer class should probably not become asynchronous as that is pretty misleading on what it's duing. An asnyc function should pretty much always indicate that there will be some I/O operation, during which we can actually switch to another task in the event loop while we're waiting on this operation to finish. Clearly, this is not what buffers do, and I'm not sure how I feel about making those async. That said, the synchronous code is a burden to maintain here, so I'm not quite sure how to resolve this. We could drop async support only in some places, most notably the connection classes, leaving the rest as-is, however that wouldn't actually be all that helpful in removing this burden, as most of the code duplication is in the I will say that I feel pretty strongly against bringing in synchronous code into #129, handling HTTP requests and communications with various APIs. |
I have an idea, but I don't like some of the consequences of it. My idea is that we could write a class decorator or something that would find async functions and provide a sync counterpart that would just be a wrapper for starting an event loop and executing the async code. It would work, but it would introduce a lot of overhead and it would also lead to compatibility issues for working with different asychronous libraries like Trio. Edit: An even better idea would be doing something like writing a macro that would take a file with async functions and remove the |
This actually already exists in mcproto, but is only utilized in tests, to make it eaiser to test both async and sync implementation of the readers and writers. There is the The reason why this isn't used outside of tests is because it completely breaks static analysis due to how dynamic it is. And indeed, it also comes with a processing time overhead. One of the goals of mcproto is to be fully statically typed, and this solution would break that goal. We could probably write stub files that would contain the actual typing information, but at that point, we might as well just go with the current code duplication approach, as with stubs, we'd just end up manually writing the signatures for all of the functions that would be dynamically generated, cancelling the benefits of that dynamic generation.
This is an interesting idea, but it doesn't really solve everything, and would probably be way too complicated to actually implement. Consider for example the connection classes, where asyncio is utilized to create the underlying connection ( |
I agree about the create_connection part, but if the intent is to only have synchronous stuff for Buffer, then it shouldn't be too bad. There are some great modules for modifying python AST, and with structural pattern matching it might not be super hard |
Indeed, if we do this purely to make the buffer work, it should be alright. So now the question is whether doing something like this would actually be a good idea, as opposed to purely whether it would be viable. There is a few things that I don't like about introducing this kind of system: Yet another thing to learnI've recently been trying to reduce the burden of contributing to this repo, as it was (and still is) using quite a lot of various tools already, which require the contributors to understand these (i.e. the tools used for static analysis, formatting, documentation creation (sphinx) and changelog keeping tool (towncrier)). Therefore I'm not sure I like the idea of bringing yet another complex system like this. This system would mean that every contributor would need to be made aware of it's existence, and would require people to run this code generation script any time a change to the async base io classes was made. This is not trivial to explain and it would be pretty difficult to automate (I talk about why that is below), so it would certainly decrease the "ease of contributing" even more. Another thing to consider is that this makes it impossible to use any async calls in the base io async classes, other than when referencing internal async methods, as code generation like this woudln't be able to account for an async call made to an external library (such as the discussed Automation?First of all, we should establish that the sync file can't simply be generated only during a release. That is because pyright would be failing anywhere where the sync classes would be used (so, mainly in the buffer class). This means we will need to keep the generated file around constantly, as if it was just another regular python file in the code-base. However this immediately means that contributors will need to be aware that this file is generated, and that they shouldn't be making any direct changes to it, and instead leaving it to the (theoretically automated) code generation system. So there's already a reason why this can't be completely seamless with contributors not having to learn anything new, even when this system would be automated. Automation approaches, and their issuesOK, so let's consider how this could actually be automated now. With the way we currently have things set up, there are 2 ways automation could be done:
Fundamental automaton issueRegardless of the specific issues these automation approaches, there is a fundamental issue, comming from the fact that we need the generated file to be constantly present and up-to-date for static analysis. That is the simple fact, that any kind of automation will only run so often. In both of the cases above, the code generation would only run after or during commiting. To explain why this is an issue, consider this scenario: You've just made a change to the With pre-commit, you will actually fail to create the commit here, assuming pyright ran after the code generation, which at least forces you to address the changes first, which is certainly better than with the GitHub CI option, but it's still not ideal and it slows the development down. Also, again, users can skip pre-commit, meaning we would need GitHub CI to at least verify that the codegen script was ran, and the sync file is "up to date" with the async one. ConclusionIt's clear that automation would work poorly at best, and would still require the contributors to have some knowledge that there is code-generation going on, forcing them to adjust their workflow to respect that. This will therefore slow down development yet again, and make it even harder to contribute than it already is. I'm therefore not really convinced that this is something worth doing, as it might prove to be even more annoying to deal with than the code duplication caused by sync code. However I don't really have any good alternatives, other than simply keeping both sync and async versions around, or to make the very weird choice of making buffer interactions asynchronous, which I'm really not a big fan of either. Edit: A bit about why making buffer async is a bad ideaTo provide a better explanation, other than "I'm not a big fan of" for why buffer should not be made asynchronous. First of all, as I've mentioned before, an async function implies that there is a blocking I/O operation that will need to be waited for, during which we can switch to something else in the event loop. Clearly though, this is not what's going on in the Additionally though, |
Alright, what about leaving the current synchronous alternatives as they are, but leave some features, especially when it comes to HTTP communication (minecraft APIs) purely async, and just mentioning this in the docs. This would make most people who would want to actually make a bot capable of reaching play state (logging in) have to use async at least for these calls, likely making them use async connections as well, though not forcing them to, and projects that would not need these API calls could remain fully synchronous (only useful for offline (aka "warez") servers, or for things that don't need to go through the login state, i.e. a lib that only obtains the server's status, such as mcstatus) As the sync support is already in place, while it may be a bit hard to orient in, it's unlikely to actually need that many updates, as it's based on how the core communication happens in the minecraft protocol. Minecraft isn't likely to change this between protocol versions, as it would be a completely breaking change, making it impossible for people on this version to potentially even obtain server status from older minecraft version. Because of that, even though it is a significant source of code repetition, it probably won't be a huge burden to maintain. I don't love the idea of this split, where there are some sync alternatives, but not for everything. However as I said, the base io and connection classes are unlikely to need much maintaining, and the newer things that don't necessarily need the support (that don't need to utilize the conneciton classes for example) can remain async only. |
That sounds like a pretty good plan. Slightly unrelated, but will there be support for different asynchronous libraries like Trio? It's not a huge issue, because you can run an asyncio event loop on top of Trio, but it would be nice to have native support for Trio. |
That might require a rework of the Connection classes, as they use |
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Following issue #128, this explains the lack of synchronous alternatives to the new API calls. This issue established, that while the low level interactions will provide support for both sync and async, any non-related functionalities that potentially could use an async version will only have an async version, lowering the burden of maintaining these.
Currently, mcproto provides both asynchronous and synchronous versions for pretty much everything that needs it.
Necessity of synchronous code
In general, people that already have a synchronous code-base won't actually be that disadvantaged if sync support is dropped, moving to purely async model. This is because they can always just use
asyncio.run
in their synchronous code, which will block until the function is complete, essentially acting synchronously, and only write async functions for the code working with mcproto.The only reason synchronous code might be beneficial is to make it easier to use the library for people who don't have that much experience with async code. Indeed, a purely async library might discourage some less experienced people from even trying it out, however, I'd say this library is already relatively complex and will require users to have a certain skill level, at which async likely won't be that "mysterious" and hard to understand. Nevertheless, it might discourage some users from using mcproto.
But it's also important to consider the other side, dropping synchronous versions could be beneficial even to the less experienced people, as they won't accidentally end up using the wrong (synchronous) versions, blocking the rest of their async code. This is actually somewhat common with discord.py users (from what I've seen in people that use mcproto), and simply being purely async would avoid such cases completely.
Code duplication
This effort to support both versions already causes some major code duplication in the codebase, with for example the
protocol/base_io.py
file holding essentially exact duplicates between theBaseAsyncWriter
andBaseSyncWriter
classes, and also betweenBaseAsyncReader
andBaseSyncReader
, which then carries over to duplication inconnection.py
, with the base classesSyncConnection
andAsyncConnection
, and then all subclasses moving from those (TCPAsyncConnection
andTCPSyncConnection
, and similarly for UDP). Another place affected ispackets/interactions.py
, which needs alternatives for both of these approaches in the reader/writer functions for packets (sync_write_packet
andasync_write_packet
, similarly for readers). Eventually, this will also end up affecting any HTTP requests we might end up making (such as for interactions with mojang API, for yggdrasil authentication, but also potentially for OAuth2 based Microsoft authentication).In almost all of these places, the alternative code is just the same rewritten code, with an extra
await
and a change in a function name or type-hint. That means any changes made to one of those will need to be copied to the other, including docstring changes.This code duplication is then also affecting the tests, which currently either have the same full code duplication as in the codebase, or use some complex mechanics to "synchronize" the code, allowing us to only write tests for the synchronous classes, with the async ones getting wrapped to act as the same, synchronous counterparts, with the help of
asyncio.run
. This can make many people confused about how to write new tests, and even how to understand and edit the existing ones.The text was updated successfully, but these errors were encountered: