wuwuwuxxx/a3c-rl

a3c-rl

Uses a slightly modified variant of the A3C algorithm to train a Pong agent. It reaches a mean reward of about 12 in 2e7 steps with a learning rate of 1e-4.

Training combines the CPU and the GPU. Because Pong is an episodic game, I use a Monte-Carlo-style method, i.e. n-step returns. I put the backward pass on the GPU, which speeds up training a lot.
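
As a rough illustration of the n-step idea mentioned above, the sketch below computes discounted returns over a short rollout. It is not taken from this repository: the function name `discounted_returns`, the `bootstrap_value` argument, and the default discount `gamma=0.99` are all assumptions chosen for the example.

```python
from typing import List


def discounted_returns(
    rewards: List[float],
    bootstrap_value: float = 0.0,  # assumed: value estimate for the state after the last step
    gamma: float = 0.99,           # assumed discount factor, not stated in the README
) -> List[float]:
    """Return the discounted return for every step of an n-step rollout."""
    returns = [0.0] * len(rewards)
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the return of the next one.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

If the rollout ends at the end of an episode, `bootstrap_value` would be 0.0 and the result is a plain Monte-Carlo return. If the rollout is cut off after n steps, a critic's value estimate could be passed instead.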
