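A roundup of some recent advances in RL.
Reference: "In Depth | Beyond DQN and A3C: An Overview of Recent Advances in Deep Reinforcement Learning"
Original blog: https://towardsdatascience.com/advanced-reinforcement-learning-6d769f529eb3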
DQN
\[
Q\left(s_{t}, a_{t} ; \theta\right) \leftarrow Q\left(s_{t}, a_{t} ; \theta\right)+\alpha[\underbrace{\underbrace{(r_{t}+\gamma \max _{a} \hat{Q}\left(s_{t+1}, a ; \theta^{\prime}\right))}_{\text{target}}-Q\left(s_{t}, a_{t} ; \theta\right)}_{\text{TD-error}}]
\]
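The update above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a dictionary stands in for the network `Q(s, a; θ)`, a frozen copy of it stands in for the target network `Q̂(s, a; θ')`, and the names `td_update`, `q`, `q_target`, and `actions` are hypothetical. `gamma` is the usual discount factor (set to 1.0 for an undiscounted target).

```python
from copy import deepcopy


def td_update(q, q_target, s, a, r, s_next, actions, alpha=0.1, gamma=1.0):
    """Apply one TD update to q[(s, a)] and return the TD error.

    q        -- dict mapping (state, action) to a value; stands in for Q(.; theta)
    q_target -- frozen copy of q; stands in for the target network Q_hat(.; theta')
    """
    # target = r_t + gamma * max_a Q_hat(s_{t+1}, a; theta')
    target = r + gamma * max(q_target.get((s_next, b), 0.0) for b in actions)
    # TD error = target - Q(s_t, a_t; theta)
    td_error = target - q.get((s, a), 0.0)
    # Move the current estimate a fraction alpha toward the target
    q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error
    return td_error


actions = [0, 1]
q = {("s0", 0): 0.5, ("s1", 0): 1.0, ("s1", 1): 2.0}
q_target = deepcopy(q)  # frozen parameters theta'

err = td_update(q, q_target, "s0", 0, r=1.0, s_next="s1", actions=actions)
print(err, q[("s0", 0)])
```

Keeping `q_target` fixed while `q` changes is the point of the separate `θ'`: the target does not shift with every update. In a real DQN the copy would be refreshed periodically and the update would be a gradient step on a neural network rather than a table write.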
Actor-Critic
\[
d \theta_{v} \leftarrow d \theta_{v}+\partial{\underbrace{\left(R-V\left(s_{i} ; \theta_{v}\right)\right)}_{\text{advantage}}}^{2} / \partial \theta_{v}
\]
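As a rough sketch of how this critic gradient is accumulated, the example below assumes a linear value function `V(s; θ_v) = θ_v · φ(s)` and computes the return `R` backward over a short rollout via `R ← r_i + γR`. The linear form, the backward return computation, and the names `value`, `accumulate_critic_grad`, `rollout`, and `bootstrap` are assumptions made for illustration and are not taken from the formula itself.

```python
def value(theta_v, features):
    """Linear value estimate V(s; theta_v) = theta_v . phi(s)."""
    return sum(w * x for w, x in zip(theta_v, features))


def accumulate_critic_grad(theta_v, rollout, gamma=0.99, bootstrap=0.0):
    """Accumulate d(R - V(s_i; theta_v))^2 / d(theta_v) over a rollout.

    rollout   -- list of (features, reward) pairs in time order
    bootstrap -- value used for R after the last step (0.0 for a terminal state)
    """
    grad = [0.0] * len(theta_v)
    R = bootstrap
    # Walk the rollout backward so R is the discounted return from step i
    for features, reward in reversed(rollout):
        R = reward + gamma * R
        advantage = R - value(theta_v, features)
        # d/dtheta (R - V)^2 = -2 * (R - V) * dV/dtheta; dV/dtheta = features here
        for j, x in enumerate(features):
            grad[j] += -2.0 * advantage * x
    return grad


theta_v = [0.0, 0.0]
rollout = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
grad = accumulate_critic_grad(theta_v, rollout, gamma=1.0)
print(grad)
```

A gradient descent step `theta_v[j] -= lr * grad[j]` then moves `V` toward the observed returns. The same quantity `R - V(s_i; θ_v)` also serves as the advantage that weights the actor's policy gradient.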
Modern Deep Reinforcement Learning Algorithms
The original paper is a large file and slow to open, so here is a mirrored copy: https://daiwk.github.io/assets/Modern%20Deep%20Reinforcement%20Learning%20Algorithms.pdf
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Blog: https://openai.com/blog/evolution-strategies/
Code: https://github.com/openai/evolution-strategies-starter
Virtual-Taobao: Virtualizing Real-world Online Retail Environment for Reinforcement Learning
Simulating User Feedback for Reinforcement Learning Based Recommendations
https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch