Comments on: DeepMind scientists: Reinforcement learning is enough for general AI https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-artificial-intelligence-reward-maximization Technology solving problems... and creating new ones Sat, 19 Jun 2021 09:26:29 +0000 hourly 1
By: Tru Do https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/comment-page-1/#comment-19676 Sat, 19 Jun 2021 01:21:14 +0000 https://bdtechtalks.com/?p=10542#comment-19676 I’ve just read the research paper. This Reward Is Enough theory is similar to Darwin’s fitness theory, but RIE generalises to entities that don’t necessarily need to reproduce. As an explanatory framework, the theory makes sense.

Darwin knew variation would arise but didn’t know about genes; likewise, the authors will need to show how the modeling, predicting, perceiving, imagining, and so on will arise. Jeff Hawkins’ A Thousand Brains theory gives an answer for the first three.

By: raul https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/comment-page-1/#comment-19635 Thu, 17 Jun 2021 13:49:25 +0000 https://bdtechtalks.com/?p=10542#comment-19635 As the quoted text itself says:
“The natural world faced by animals and humans, and presumably also the environments faced in the future by artificial agents, are inherently so complex that they require sophisticated abilities in order to succeed (for example, to survive) within those environments.”
Are we really going to give machines all that autonomy, only to have them compete with us humans in the future? Are these guys really that dumb?
Artificial agents, my foot, bunch of imbeciles.

By: Randy Crawford https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/comment-page-1/#comment-19307 Wed, 09 Jun 2021 17:21:40 +0000 https://bdtechtalks.com/?p=10542#comment-19307 I agree with Roitblat. RL may be sufficient to solve all AI problems, but it just shifts the difficulty from “cognition” to defining the reward function. I don’t know much about RL, but I know a little about planning and Markov Decision Processes, so I’ll venture an opinion on whether RL is a sufficient means of guiding a computer through the Real World, as strong AI must do.
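
To make concrete how fully the task specification migrates into the reward function, here is a minimal sketch in Python, assuming a hypothetical 4x4 gridworld (the GridWorld class, its step method, and the reward values are all illustrative, not taken from the paper or from any RL library): the agent-facing interface is completely generic, and everything that defines “success” sits in one line of reward logic.

from dataclasses import dataclass

@dataclass
class GridWorld:
    size: int = 4
    goal: tuple = (3, 3)

    def step(self, state, action):
        # Deterministic transitions on a small grid; moves off the edge are clipped.
        moves = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
        dx, dy = moves[action]
        x = min(max(state[0] + dx, 0), self.size - 1)
        y = min(max(state[1] + dy, 0), self.size - 1)
        next_state = (x, y)
        # The entire task specification lives here: change this line and you
        # change what "success" means, without touching the agent at all.
        reward = 1.0 if next_state == self.goal else -0.04
        return next_state, reward, next_state == self.goal

Rewriting the reward (say, to penalize visiting certain cells) redefines the problem for any agent consuming this interface, which is exactly the sense in which the hard part becomes reward design rather than “cognition.”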

Do humans succeed at complex tasks because we excel at specifying exactly what we want before we execute a systematic plan? Do we not recognize the value of learning new or better essential skills before we plan and embark on our next quest? Board games in closed-information domains are amenable to efficient solution using RL because the rules don’t change and new players don’t suddenly appear. I’m not convinced that effectiveness translates to the Real World. Dynamic, incompletely known constraints complicate planning *outside* games to such a degree that, yes, you could solve every Real World decision problem using RL, but the amount of rule discovery, refinement, and replanning required may be so enormous and laborious that the heat death of the universe is likely to arrive before your RL model can figure out how to read a novel and summarize its plot.

Thus, unless RL can learn new skills and employ them in future plan design and execution, it will have to search its way through each new plan by refining an ever-more-complicated reward function as constraints change. That approach doesn’t seem especially clever: it sounds a lot like naive search, albeit done efficiently.
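
To make the “naive search, done efficiently” point concrete, here is a minimal tabular Q-learning sketch over the hypothetical GridWorld above (the hyperparameters and names are illustrative assumptions, not anyone’s published setup): the agent discovers the rules purely by trial and error, one scalar reward at a time.

import random
from collections import defaultdict

def q_learning(env, episodes=5000, alpha=0.1, gamma=0.99, epsilon=0.1):
    actions = ["up", "down", "left", "right"]
    Q = defaultdict(float)  # maps (state, action) to an estimated return
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(state, action)
            # One-step temporal-difference update toward the bootstrapped target.
            best_next = max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q

Q = q_learning(GridWorld())  # thousands of blind episodes buy one tiny competence

On a 4x4 grid this should converge quickly; the worry in the comment is that the same trial-and-error loop, applied to a world whose rules shift and whose state space is astronomically larger, may never finish its search.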
