Releases: Eclectic-Sheep/sheeprl

v0.5.7

29 May 09:40
62b3da0

v0.5.7 Release Notes

  • Fix policy steps computation for on-policy algorithms in #293

v0.5.6

28 May 09:23
4d09234

v0.5.6 Release Notes

  • Fix the buffer checkpoint and add the possibility to specify the pre-fill steps upon resuming; update the how-tos accordingly in #280
  • Updated how-tos in #281
  • Fix division by zero when computing sps-train in #283
  • Better code naming in #284
  • Fix MineDojo actions stacking (and, more generally, multi-discrete actions) and missing keys in #286
  • Fix computation of prefill steps as policy steps in #287
  • Fix the Dreamer-V3 imagination notebook in #290
  • Add the ActionsAsObservationWrapper to let the user include the played actions as observations in #291 (a sketch of the idea follows below)
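
As a rough illustration of the wrapper idea, here is a minimal, hypothetical sketch built on gymnasium's Wrapper API; the class name, observation keys, and noop handling below are assumptions, not SheepRL's actual implementation:

```python
import gymnasium as gym
import numpy as np


class ActionsAsObservation(gym.Wrapper):
    """Hypothetical sketch: wrap the observation in a dict and expose the
    last played action under a dedicated key."""

    def __init__(self, env: gym.Env):
        super().__init__(env)
        self.observation_space = gym.spaces.Dict(
            {"obs": env.observation_space, "last_action": env.action_space}
        )

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        # No action has been played yet: start from a zeroed (noop) action
        noop = np.zeros_like(np.asarray(self.env.action_space.sample()))
        return {"obs": obs, "last_action": noop}, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        return {"obs": obs, "last_action": action}, reward, terminated, truncated, info
```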

v0.5.5

22 Apr 09:33
2bae379

v0.5.5 Release Notes

  • Added parallel stochastic in dv3: #225
  • Updated dependencies and the Python version: #230, #262, #263
  • Added dv3 notebook for imagination and obs reconstruction: #232
  • Created citation.cff: #233
  • Added the replay ratio for off-policy algorithms: #247 (see the sketch after this list)
  • Single strategy for the player (it is now instantiated in the build_agent() function): #244, #250, #258
  • Proper terminated and truncated signals management: #251, #252, #253
  • Added the possibility to choose whether or not to learn the initial recurrent state: #256
  • Added A2C benchmarks: #266
  • Added prepare_obs() function to all the algorithms: #267
  • Improved code readability: #248, #265
  • Bug fixes: #220, #222, #224, #231, #243, #255, #257
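
To give an idea of what the replay ratio from #247 controls, here is a minimal, runnable sketch of the bookkeeping; all names and numbers are illustrative, not SheepRL's actual training loop:

```python
# Hypothetical sketch: in an off-policy loop, the replay ratio ties the number
# of gradient updates to the number of policy (environment) steps taken so far.
replay_ratio = 0.5  # target gradient updates per policy step

policy_steps = 0
gradient_steps = 0

for iteration in range(10):
    steps_collected = 64  # e.g. num_envs * rollout_length for this iteration
    policy_steps += steps_collected
    # Catch up until gradient_steps ~ replay_ratio * policy_steps
    while gradient_steps < int(replay_ratio * policy_steps):
        gradient_steps += 1  # placeholder for one optimizer update

print(policy_steps, gradient_steps)  # 640 320
```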

v0.5.4

26 Feb 11:20
0639e16

v0.5.4 Release Notes

  • Added configs for the different Dreamer V3 sizes (#208).
  • Update the torch version: 2.2.1 or any release in [2.0.*, 2.1.*] (#212).
  • Fix observation normalization in dreamer v3 and p2e_dv3 (#214).
  • Update README (#215).
  • Fix installation and agent evaluation: new commands are made available for agent evaluation, model registration, and listing the available agents (#216).

v0.5.3

12 Feb 09:34
f688ab4

v0.5.3 Release Notes

  • Added benchmarks (#185)
  • Added the possibility to use a user-defined evaluation file (#199)
  • Let the user choose the num_threads and the matmul precision (#203); see the sketch after this list
  • Added Super Mario Bros Environment (#204)
  • Fix bugs (#183, #186, #193, #195, #200, #201, #202, #205)
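
The two knobs from #203 map onto standard PyTorch settings; a minimal sketch, assuming the values come from the experiment config (the variable names are illustrative):

```python
import torch

# Illustrative values: in SheepRL these would come from the experiment config.
num_threads = 4
matmul_precision = "high"  # one of "highest", "high", "medium"

torch.set_num_threads(num_threads)                    # CPU intra-op parallelism
torch.set_float32_matmul_precision(matmul_precision)  # float32 matmul speed/accuracy trade-off
```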

v0.5.2

12 Jan 08:39
2c9c0b3

v0.5.2 Release Notes

  • Added A2C algorithm (#33).
  • Added a new how-to on adding an external algorithm (no need to clone sheeprl locally) (#175).
  • Added optimizations (#177):
    • Metrics are instantiated only when needed.
    • Removed the torch.cat() operation between empty and dense tensors in the MultiEncoder class.
    • Added the possibility to skip testing the agent after training.
  • Fixed GitHub actions workflow (#180).
  • Fixed bugs (#181, #183).
  • Added benchmarks against StableBaselines3 (#185).
  • Added the BernoulliSafeMode distribution, a Bernoulli distribution whose mode is computed safely, i.e. it returns self.probs > 0.5 without setting any NaN (#186).
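
For context, PyTorch's stock Bernoulli.mode returns NaN where probs == 0.5; here is a minimal sketch of the safe variant described above (not necessarily SheepRL's exact implementation):

```python
import torch
from torch.distributions import Bernoulli


class BernoulliSafeMode(Bernoulli):
    # Resolve the p == 0.5 tie to 0 instead of NaN via a strict comparison
    @property
    def mode(self):
        return (self.probs > 0.5).to(self.probs)


d = BernoulliSafeMode(probs=torch.tensor([0.2, 0.5, 0.9]))
print(d.mode)  # tensor([0., 0., 1.]) -- no NaN at p == 0.5
```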

v0.5.1

19 Dec 14:11
9218948

v0.5.1 Release Notes

v0.5.0

19 Dec 13:22
0453284

v0.5.0 Release Notes

  • Added Numpy buffers (#169):
    • The user can now decide whether to use the torch.as_tensor function or the torch.from_numpy one to convert the Numpy buffer into tensors when sampling (#172); see the sketch after this list.
  • Added optimizations to reduce training time (#168).
  • Added the possibility to keep only the last n checkpoints in an experiment to avoid filling up the disk (#171).
  • Fixed bugs (#167).
  • Updated the documentation.
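
The practical difference between the two conversion functions is memory sharing; a short, self-contained sketch:

```python
import numpy as np
import torch

buffer = np.zeros((8, 4), dtype=np.float32)

# torch.from_numpy is always zero-copy: the tensor shares the array's memory.
shared = torch.from_numpy(buffer)

# torch.as_tensor shares memory when it can, but copies when a conversion is
# required (e.g. a different dtype or a non-CPU device).
same = torch.as_tensor(buffer)                          # zero-copy here
copied = torch.as_tensor(buffer, dtype=torch.float64)   # forces a copy

buffer[0, 0] = 1.0
print(shared[0, 0].item(), same[0, 0].item(), copied[0, 0].item())  # 1.0 1.0 0.0
```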

v0.4.9

01 Dec 14:49
52f49be

v0.4.9 Release Notes

  • Added torch>=2.0 as a dependency in #161
  • Made mlflow an optional package, i.e. the user can install it directly with pip install sheeprl[mlflow] in #164
  • Fix the resume_from_checkpoint in #163. In particular:
    • Added save_configs function to save the configs of the experiment in the <log_dir>/config.yaml file.
    • Fixed resuming from checkpoint for all the algorithms (restart from the correct policy step + fix for the decoupled versions).
    • Gave more flexibility to the p2e fine-tuning scripts regarding the fabric configs.
    • MineDojo Wrapper: avoid modifying the kwargs (to always save consistent configs in the <log_dir>/config.yaml file).
    • TensorBoard logger creation: updated the logger configs to always save consistent configs in the <log_dir>/config.yaml file.
    • Added an as_dict() method to the dotdict class to get a primitive Python dictionary from a dotdict object.
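
A minimal sketch of the as_dict() idea, assuming the usual attribute-access dict pattern (SheepRL's actual dotdict class may differ):

```python
class dotdict(dict):
    """Toy dict with attribute access; illustrative, not SheepRL's class."""

    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

    def as_dict(self) -> dict:
        # Recursively convert nested dotdicts back to plain dicts
        return {k: v.as_dict() if isinstance(v, dotdict) else v for k, v in self.items()}


cfg = dotdict({"algo": dotdict({"name": "dreamer_v3"})})
print(type(cfg.as_dict()["algo"]))  # <class 'dict'>
```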

v0.4.8

28 Nov 15:07
ad65aad

v0.4.8 Release Notes

  • The following config keys have been moved in #158:
    • cnn_keys, mlp_keys, per_rank_batch_size, per_rank_sequence_length, per_rank_num_batches and total_steps have been moved to the specific algo config
  • Added the integration of the MLflowLogger in #159. This comes with new documentation and notebooks under the example folder showing how to use it.