OpenAI Universe

AlphaGo has fired our imagination about what AI might one day do: translation, writing, driving, even writing highly efficient code. Coders could lose their jobs if companies hire AI robots to write code for them.

Let’s get back to reality. Monte Carlo Tree Search, combined with reinforcement learning, plays an important role in AlphaGo. Reinforcement learning is powerful in interactive environments, especially in settings such as driving and video games. For those who want to apply RL algorithms to video games, the crucial first task is to create a game simulator, which is hard work and a poor use of time for researchers who aren’t familiar with game development.


A glimpse of the Markov property

The Markov property is the core property of a Markov process, and understanding it will give you a broader view of reinforcement learning. Put simply, a Markov process doesn’t care about the past. Of course, it is the past that determines the present: the present is the outcome of the past. Nevertheless, once the present is known, the future depends only on it, so the present is all we need to focus on, and it will in turn become the past.
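
Stated as a formula, the idea that the process "doesn’t care about the past" might be written as follows. The notation is an assumption borrowed from common textbook usage, not something fixed by the text above: S_t stands for the state at time t and s' for a possible next state.

```latex
\Pr\{S_{t+1} = s' \mid S_t\} \;=\; \Pr\{S_{t+1} = s' \mid S_1, S_2, \dots, S_t\}
```

In words: conditioning on the whole history tells us nothing more about the next state than conditioning on the present state alone.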

So what should we take into consideration? Remembering the entire past is not an ideal method; we should summarize it instead. From Sutton’s book: “What we would like, ideally, is a state signal that summarizes past sensations compactly, yet in such a way that all relevant information is retained. … A state signal that succeeds in retaining all relevant information is said to be Markov, or to have the Markov property.”
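
To make "summarizing the past" concrete, here is a minimal sketch in Python. Everything in it is a hypothetical toy, not taken from Sutton’s book: an object moves along a line at constant velocity, the history is the list of positions observed so far, and the helper names (`position_only`, `position_and_velocity`, `is_markov`) are made up for illustration. The check only covers the sample histories given, so it can refute a summary but not prove it Markov in general.

```python
def true_next_position(positions):
    """Ground truth from the full history: constant-velocity motion."""
    velocity = positions[-1] - positions[-2]
    return positions[-1] + velocity


def position_only(positions):
    """Candidate summary 1: keep just the latest position."""
    return positions[-1]


def position_and_velocity(positions):
    """Candidate summary 2: keep the latest position and the latest velocity."""
    return (positions[-1], positions[-1] - positions[-2])


def is_markov(summarize, histories):
    """On the given histories, equal summaries must imply equal futures."""
    seen = {}
    for history in histories:
        state = summarize(history)
        nxt = true_next_position(history)
        # setdefault stores the first future seen for this state and returns it.
        if seen.setdefault(state, nxt) != nxt:
            return False
    return True


histories = [
    [0, 1, 2],       # moving right, now at 2
    [4, 3, 2],       # moving left, now at 2
    [-1, 0, 1, 2],   # moving right with a longer past, now at 2
]

print(is_markov(position_only, histories))          # same position, different futures
print(is_markov(position_and_velocity, histories))  # summary keeps what matters
```

The position alone throws away relevant information (the direction of travel), so two different pasts that share it lead to different futures. Adding the velocity gives a summary that is still compact, yet retains everything needed to predict the next step in this toy.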