a3c-rl

Uses a slightly different algorithm to train a Pong agent. It reaches a mean reward of about 12 in 2e7 steps with a learning rate of 1e-4.

Combines the CPU and GPU. Because I use a Monte-Carlo-style method, i.e. n-step returns, for such an episodic game, I put the backward pass on the GPU, which speeds up training a lot.
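As a rough illustration of the n-step idea mentioned above, here is a minimal sketch of how discounted n-step returns can be computed for one rollout segment. It is not taken from a3c.py or nStepQ.py; the function name `n_step_returns`, the `bootstrap_value` argument and the default `gamma` are assumptions made for this example.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Return the discounted return for every step of one rollout segment.

    rewards:         rewards collected over the last n steps (hypothetical input)
    bootstrap_value: value estimate of the state after the segment,
                     or 0.0 if the episode ended there
    gamma:           discount factor (assumed default, not from the repo)
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each step reuses the return of the step after it:
    # R_t = r_t + gamma * R_{t+1}
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # A point scored on the last of three steps, episode ended (bootstrap 0).
    print(n_step_returns([0.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5))
```

In a setup like the one described, the worker threads on the CPU would presumably collect the rewards and compute targets of this kind, while the gradient step on those targets runs on the GPU.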

Repository files (3 commits):
.gitignore
README.md
README.md~
a3c.py
agents.py
atari.py
nStepQ.py
oneStepQ.py

wuwuwuxxx/a3c-rl
