
[gym.vector] Add BatchedVectorEnv, (chunking + flexible n_envs) #2072

Closed
wants to merge 13 commits

Conversation

lebrice commented Oct 18, 2020

Adds the following features, compared to using the vectorized Async and Sync
VectorEnvs:

  • Chunking: Running more than one environment per worker. This is done by
    passing SyncVectorEnvs as the env_fns to the AsyncVectorEnv.

  • Flexible batch size: Supports any number of environments, irrespective
    of the number of workers or of CPUs. The number of environments will be
    spread out as equally as possible between the workers.

    For example, if you want to have a batch_size of 17, and n_workers is 6,
    then the number of environments per worker will be: [3, 3, 3, 3, 3, 2].

    Internally, this works by creating up to two AsyncVectorEnvs, env_a and
    env_b. If the number of envs (batch_size) isn't a multiple of the number
    of workers, then we create the second AsyncVectorEnv (env_b).

    In the first AsyncVectorEnv (env_a), each worker runs
    ceil(n_envs / n_workers) environments. If env_b is needed, each of its
    workers runs floor(n_envs / n_workers) environments (see the sketch after
    this list).

    The observations/actions/rewards are reshaped to be (n_envs, *shape), i.e.
    they don't have an extra 'chunk' dimension.
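
A minimal sketch of the per-worker split described above (the helper name split_across_workers is hypothetical, not the PR's actual distribute implementation):

from typing import List

def split_across_workers(n_envs: int, n_workers: int) -> List[int]:
    # The first (n_envs % n_workers) workers get ceil(n_envs / n_workers) envs,
    # the remaining workers get floor(n_envs / n_workers).
    base, remainder = divmod(n_envs, n_workers)
    return [base + 1 if i < remainder else base for i in range(n_workers)]

# Matches the example above: a batch_size of 17 spread over 6 workers.
assert split_across_workers(17, 6) == [3, 3, 3, 3, 3, 2]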

Signed-off-by: Fabrice Normandin <fabrice.normandin@gmail.com>

lebrice (Author) commented Oct 18, 2020

Resolves #1795

lebrice changed the title from "Add BatchedVectorEnv, (chunking + flexible n_envs)" to "[gym.vector] Add BatchedVectorEnv, (chunking + flexible n_envs)" on Oct 18, 2020
jkterry1 (Collaborator) commented Nov 2, 2020

@lebrice can you please get this to pass the tests? This is of particular interest to me.

jkterry1 (Collaborator) commented Nov 3, 2020

@lebrice this still appears to be failing the tests?

lebrice (Author) commented Nov 4, 2020

Hey @justinkterry, yes, the tests for Python versions below 3.6 are failing; I think it might have to do with the type hints.

I'll take a quick look, but if it comes down to the type hints, I'm not currently planning on removing them, to be honest. This PR probably wouldn't get accepted even if I did, given that the people at OpenAI are probably working on other things.

jkterry1 (Collaborator) commented Nov 7, 2020

@lebrice Python 3.5 is removed now (#2084), try again?

lebrice (Author) commented Nov 17, 2020

Hey @justinkterry, @pzhokhov, @tristandeleu, the issues above have been fixed. Would you mind taking a look over the code and letting me know what you think?

tristandeleu (Contributor) left a comment

The main strength of BatchedVectorEnv is to allow the batch size to not be a multiple of the number of workers. In my opinion this is a somewhat niche use case. Even the chunking option in SubprocVecEnv, which prompted #1795, only handles cases where the overall number of environments is a multiple of the number of workers.

I would rather see a fix to AsyncVectorEnv to allow it to run SyncVectorEnv environments first, possibly with a VectorEnvWrapper to flatten the first two batch dimensions (similar to your unroll function), before going for a new VectorEnv subclass. And if there is major interest in having instances where the batch size is not a multiple of the number of workers, then we can think about adding BatchedVectorEnv.
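
A rough sketch of that alternative, i.e. chunking by passing SyncVectorEnv factories as the env_fns of an AsyncVectorEnv and flattening the leading (worker, chunk) dimensions afterwards; it assumes AsyncVectorEnv can host nested vector envs, which is precisely the fix being suggested:

import gym
import numpy as np
from gym.vector import AsyncVectorEnv, SyncVectorEnv

n_workers, envs_per_worker = 4, 2

def make_env():
    return gym.make("CartPole-v1")

def make_chunk():
    # Each worker process hosts a SyncVectorEnv running several envs in-process.
    return SyncVectorEnv([make_env for _ in range(envs_per_worker)])

env = AsyncVectorEnv([make_chunk for _ in range(n_workers)])
obs = env.reset()
# Observations come back chunked as (n_workers, envs_per_worker, *obs_shape);
# a wrapper along the lines of the suggested VectorEnvWrapper would do this
# reshape on every reset/step:
flat_obs = np.reshape(obs, (n_workers * envs_per_worker,) + np.shape(obs)[2:])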

That being said, @pzhokhov and @christopherhesse have the final say on it, and this is a good PR!

from gym.vector.utils.spaces import _BaseGymSpaces


@singledispatch
Review comment (Contributor):

That's really cool! It could be worth updating some of the utility functions in gym.vector with singledispatch (in another PR of course; this would be outside the scope of this one).

EDIT: I just saw #2093, which keeps track of that. Thanks!
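
For context, a minimal sketch of the singledispatch pattern applied to a space utility (the function name create_empty_sample is made up for illustration and is not an existing gym.vector helper):

from functools import singledispatch

import numpy as np
from gym.spaces import Box, Discrete, Space, Tuple

@singledispatch
def create_empty_sample(space: Space):
    # Fallback for space types with no registered implementation.
    raise NotImplementedError(f"Unsupported space type: {type(space)}")

@create_empty_sample.register(Box)
def _(space: Box):
    return np.zeros(space.shape, dtype=space.dtype)

@create_empty_sample.register(Discrete)
def _(space: Discrete):
    return 0

@create_empty_sample.register(Tuple)
def _(space: Tuple):
    # Recurse into sub-spaces, dispatching on each sub-space's type.
    return tuple(create_empty_sample(sub) for sub in space.spaces)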

gym/vector/vector_env.py (outdated; thread resolved)
# return self.viewer.isopen


def distribute(values: Sequence[T], n_groups: int) -> List[Sequence[T]]:
Review comment (Contributor):

These utility functions could possibly be moved somewhere under gym.vector.utils if you think that makes sense.

They also require tests (for distribute, chunk, unroll, fuse_and_batch, n_consecutive, and zip_dicts).
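
A hedged sketch of one such test, based only on distribute's signature above and the batch_size-17-over-6-workers example from the PR description (the import path and the exact ordering of group sizes are assumptions):

from gym.vector.batched_vector_env import distribute  # hypothetical location

def test_distribute_spreads_items_evenly():
    values = list(range(17))
    groups = distribute(values, n_groups=6)
    # Every input item appears exactly once across the groups...
    assert sorted(item for group in groups for item in group) == values
    # ...and group sizes differ by at most one (17 over 6 -> five 3s and one 2).
    assert sorted(len(group) for group in groups) == [2, 3, 3, 3, 3, 3]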

env = BatchedVectorEnv(env_fns, n_workers=n_workers)
env.seed(123)

assert env.single_observation_space[0].shape == (4,)
Review comment (Contributor):

It would be more explicit to compare env.single_observation_space to its Tuple space directly, instead of checking individual elements. Maybe something like

assert env.single_observation_space == Tuple((Box(-high, high, dtype=np.float32), Discrete(1)))

And the same for env.observation_space below. I understand that the observation space for CartPole is a bit verbose in its definition of high, which might not be relevant in this test, but maybe this calls for a simpler environment (e.g. one of gym.envs.unittest).
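
A self-contained sketch of that comparison style; the bounds below are placeholders rather than CartPole's actual limits, and gym.vector.utils.batch_space is just one way to build the batched counterpart mentioned in the reply below:

import numpy as np
from gym.spaces import Box, Discrete, Tuple
from gym.vector.utils import batch_space

high = np.ones(4, dtype=np.float32)  # placeholder bounds standing in for `high`
single_space = Tuple((Box(-high, high, dtype=np.float32), Discrete(1)))

# Comparing whole spaces states the expectation in a single assertion:
assert single_space == Tuple((Box(-high, high, dtype=np.float32), Discrete(1)))

# The expected batched observation space for n environments can be derived too:
expected_batched = batch_space(single_space, n=4)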

lebrice (Author) replied:

I removed the hard-coded "CartPole" environment reference and now compare against the single observation spaces and their batched counterparts.

return groups


def unroll(chunks: Sequence[Sequence[T]], item_space: Space = None) -> List[T]:
Review comment (Contributor):

This breaks if chunks is a dict (if the space is a dict space). Here is a snippet:

import gym
import numpy as np

from collections import OrderedDict
from gym.spaces import Dict, Box, Discrete
from gym.vector import BatchedVectorEnv

class DictEnv(gym.Env):
    def __init__(self):
        super(DictEnv, self).__init__()
        self.observation_space = Dict(OrderedDict([
            ('position', Box(-2., 2., shape=(2,), dtype=np.float32)),
            ('velocity', Box(0., 1., shape=(2,), dtype=np.float32))
        ]))
        self.action_space = Discrete(2)

    def reset(self):
        return self.observation_space.sample()

    def step(self, action):
        observation = self.observation_space.sample()
        reward, done = 0., False
        return (observation, reward, done, {})

def make_env(seed):
    def _make_env():
        return DictEnv()
    return _make_env

env = BatchedVectorEnv([make_env(i) for i in range(10)], n_workers=4)
observations = env.reset()

Here obs_a (after the call to unroll) in reset is ['p', 'o', 's', 'i', 't', 'i', 'o', 'n', 'v', 'e', 'l', 'o', 'c', 'i', 't', 'y'].

EDIT: For reference, with env = AsyncVectorEnv([make_env(i) for i in range(4)]), observations is a dict with two arrays of shape (4, 2).
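
For what it's worth, one plausible way to get exactly that output is flattening the dict as if it were a sequence of chunks: iterating a dict yields its keys, and chaining those key strings yields their characters (a guess at the symptom's cause, not taken from the PR's unroll code):

from itertools import chain

chunk = {'position': [0.0, 0.0], 'velocity': [0.0, 0.0]}
# Unpacking the dict iterates over its keys ('position', 'velocity'), and
# chaining the key strings yields the individual characters seen above:
flattened = list(chain(*chunk))
assert flattened == ['p', 'o', 's', 'i', 't', 'i', 'o', 'n', 'v', 'e', 'l', 'o', 'c', 'i', 't', 'y']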

lebrice (Author) replied Nov 18, 2020:

Good catch! Thanks for pointing this out. I was indeed missing a test case for this, so I reused your snippet directly as a test, and this is now behaving correctly.
I'll probably also add other tests to cover cases where the actions are Tuple/Dict/other nested spaces as well.

gym/vector/batched_vector_env.py (outdated; thread resolved)
jkterry1 (Collaborator) commented:

@lebrice I'm a maintainer now, so I can actually merge this. If you add detailed automated testing for this, I'd be happy to merge it, pending secondary approval from @benblack769.

jkterry1 (Collaborator) commented:

Reviewer: @benblack769

In addition to the features described above, this PR also adds the following change to the AsyncVectorEnv and SyncVectorEnv classes:

-   When some environments have `done=True` while stepping, those
    environments are reset, as was done previously. Additionally, the final
    observation for those environments is placed in the info dict at key
    FINAL_STATE_KEY (currently 'final_state').
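
A hedged usage sketch of that change, assuming the tuple-of-dicts infos returned by gym's vector envs at the time and the 'final_state' key named above (behaviour that only exists with this PR applied):

import gym
import numpy as np
from gym.vector import AsyncVectorEnv

env = AsyncVectorEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])
obs = env.reset()
actions = np.array([env.single_action_space.sample() for _ in range(4)])
obs, rewards, dones, infos = env.step(actions)
for i, done in enumerate(dones):
    if done:
        # obs[i] is already the first observation of the auto-reset episode;
        # the last observation of the finished episode is kept under 'final_state'.
        final_obs = infos[i]["final_state"]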

pseudo-rnd-thoughts (Contributor) commented:

@lebrice Do you have any plans for this PR?

lebrice (Author) commented Apr 25, 2022

Hey @pseudo-rnd-thoughts, yes I do, but I'm waiting on #2104 before I work more on this.

jkterry1 closed this May 23, 2022