reinforcement learning - TechTalks https://bdtechtalks.com Technology solving problems... and creating new ones Mon, 21 Apr 2025 12:50:16 +0000 en-US hourly 1 https://i0.wp.com/bdtechtalks.com/wp-content/uploads/2018/02/cropped-TechTalks-logo.jpg?fit=32%2C32&ssl=1 reinforcement learning - TechTalks https://bdtechtalks.com 32 32 99082954 Are we at the cusp of a new era for artificial intelligence? https://bdtechtalks.com/2025/04/21/are-we-at-the-cusp-of-a-new-era-for-artificial-intelligence/?utm_source=rss&utm_medium=rss&utm_campaign=are-we-at-the-cusp-of-a-new-era-for-artificial-intelligence Mon, 21 Apr 2025 12:50:14 +0000 https://bdtechtalks.com/?p=24412 The "Era of Experience" envisions AI's evolution beyond human data, emphasizing self-learning from real-world interactions. But challenges loom for this vision.

The post Are we at the cusp of a new era for artificial intelligence? first appeared on TechTalks.

24412
Under the hood: The Innovations powering DeepSeek’s AI breakthrough https://bdtechtalks.com/2025/04/07/deepseek-innovations/?utm_source=rss&utm_medium=rss&utm_campaign=deepseek-innovations https://bdtechtalks.com/2025/04/07/deepseek-innovations/#respond Mon, 07 Apr 2025 13:00:00 +0000 https://bdtechtalks.com/?p=24275 Here is how DeepSeek models disrupted AI norms and revealed that outstanding performance and efficiency don’t require secrecy

https://bdtechtalks.com/2025/04/07/deepseek-innovations/feed/ 0 24275
How UC Berkeley is making humanoid robotic research fast and affordable https://bdtechtalks.com/2024/08/19/berkeley-humanoid-robot/?utm_source=rss&utm_medium=rss&utm_campaign=berkeley-humanoid-robot https://bdtechtalks.com/2024/08/19/berkeley-humanoid-robot/#respond Mon, 19 Aug 2024 12:36:12 +0000 https://bdtechtalks.com/?p=22141 Researchers at UC Berkeley have released a mid-sized humanoid robot that is safe, affordable, and has a thin sim-to-real gap.

https://bdtechtalks.com/2024/08/19/berkeley-humanoid-robot/feed/ 0 22141
Diffusion models are now turbocharging reinforcement learning systems https://bdtechtalks.com/2024/03/04/diffusion-world-model/?utm_source=rss&utm_medium=rss&utm_campaign=diffusion-world-model https://bdtechtalks.com/2024/03/04/diffusion-world-model/#respond Mon, 04 Mar 2024 14:00:00 +0000 https://bdtechtalks.com/?p=20996 Diffusion models are best known for their image-generation abilities. Now, they are being used to learn world models for reinforcement learning systems.

https://bdtechtalks.com/2024/03/04/diffusion-world-model/feed/ 0 20996
New AI technique uses language to learn world models https://bdtechtalks.com/2023/08/07/dynalang-language-world-models/?utm_source=rss&utm_medium=rss&utm_campaign=dynalang-language-world-models https://bdtechtalks.com/2023/08/07/dynalang-language-world-models/#comments Mon, 07 Aug 2023 13:00:00 +0000 https://bdtechtalks.com/?p=17020 A new paper by UC Berkeley presents a new technique for reinforcement learning agents to learn better world models through language

https://bdtechtalks.com/2023/08/07/dynalang-language-world-models/feed/ 2 17020
What is reinforcement learning from human feedback (RLHF)? https://bdtechtalks.com/2023/01/16/what-is-rlhf/?utm_source=rss&utm_medium=rss&utm_campaign=what-is-rlhf https://bdtechtalks.com/2023/01/16/what-is-rlhf/#comments Mon, 16 Jan 2023 14:00:00 +0000 https://bdtechtalks.com/?p=15625 Reinforcement learning from human feedback (RLHF) is the technique that has made ChatGPT very impressive. But there is more to RLHF than large language models (LLMs).

https://bdtechtalks.com/2023/01/16/what-is-rlhf/feed/ 1 15625
Is reinforcement learning overhyped? https://bdtechtalks.com/2022/10/20/is-reinforcement-learning-overhyped/?utm_source=rss&utm_medium=rss&utm_campaign=is-reinforcement-learning-overhyped https://bdtechtalks.com/2022/10/20/is-reinforcement-learning-overhyped/#comments Thu, 20 Oct 2022 13:00:00 +0000 https://bdtechtalks.com/?p=14987 By Aleksandras Šulženko Imagine you are about to sit down to play a game with a friend. But this isn’t just any friend – it’s a computer program that doesn’t know the rules of the game. It does, however, understand that it has a goal, and that goal is to win. Because this friend doesn’t […]

https://bdtechtalks.com/2022/10/20/is-reinforcement-learning-overhyped/feed/ 1 14987
DeepMind AlphaTensor: The delicate balance between human and artificial intelligence https://bdtechtalks.com/2022/10/10/deepmind-alphatensor/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-alphatensor https://bdtechtalks.com/2022/10/10/deepmind-alphatensor/#comments Mon, 10 Oct 2022 13:00:00 +0000 https://bdtechtalks.com/?p=14894 DeepMind AlphaTensor shows how the right combination of human and artificial intelligence can find solutions to complicated problems.

https://bdtechtalks.com/2022/10/10/deepmind-alphatensor/feed/ 2 14894
Reinforcement learning models are prone to membership inference attacks https://bdtechtalks.com/2022/08/15/reinforcement-learning-membership-inference-attacks/?utm_source=rss&utm_medium=rss&utm_campaign=reinforcement-learning-membership-inference-attacks https://bdtechtalks.com/2022/08/15/reinforcement-learning-membership-inference-attacks/#respond Mon, 15 Aug 2022 13:00:00 +0000 https://bdtechtalks.com/?p=14436 A new study by researchers at McGill University, Mila, and the University of Waterloo highlights the privacy threats of deep reinforcement learning algorithms.

https://bdtechtalks.com/2022/08/15/reinforcement-learning-membership-inference-attacks/feed/ 0 14436
A gentle introduction to model-free and model-based reinforcement learning https://bdtechtalks.com/2022/06/13/model-free-and-model-based-rl/?utm_source=rss&utm_medium=rss&utm_campaign=model-free-and-model-based-rl https://bdtechtalks.com/2022/06/13/model-free-and-model-based-rl/#respond Mon, 13 Jun 2022 13:00:00 +0000 https://bdtechtalks.com/?p=13981 Neuroscientist Daeyeol Lee discusses different modes of reinforcement learning in humans and animals, AI and natural intelligence, and future directions of research.

https://bdtechtalks.com/2022/06/13/model-free-and-model-based-rl/feed/ 0 13981
This deep learning technique solves one of the tough challenges of robotics https://bdtechtalks.com/2022/05/09/diffskill-robotics-deformable-object-manipulation/?utm_source=rss&utm_medium=rss&utm_campaign=diffskill-robotics-deformable-object-manipulation https://bdtechtalks.com/2022/05/09/diffskill-robotics-deformable-object-manipulation/#respond Mon, 09 May 2022 13:00:00 +0000 https://bdtechtalks.com/?p=13684 DiffSkill is a deep learning technique that makes robots much more stable at handling deformable objects.

https://bdtechtalks.com/2022/05/09/diffskill-robotics-deformable-object-manipulation/feed/ 0 13684
New RL technique achieves superior performance in control tasks https://bdtechtalks.com/2022/04/04/reinforcement-learning-td-mpc/?utm_source=rss&utm_medium=rss&utm_campaign=reinforcement-learning-td-mpc https://bdtechtalks.com/2022/04/04/reinforcement-learning-td-mpc/#respond Mon, 04 Apr 2022 13:04:02 +0000 https://bdtechtalks.com/?p=13354 Researchers at UCSD show that combining model-free and model-based reinforcement learning improves performance on control tasks.

https://bdtechtalks.com/2022/04/04/reinforcement-learning-td-mpc/feed/ 0 13354
Reinforcement learning for the real world https://bdtechtalks.com/2022/01/06/real-world-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=real-world-reinforcement-learning https://bdtechtalks.com/2022/01/06/real-world-reinforcement-learning/#respond Thu, 06 Jan 2022 14:00:00 +0000 https://bdtechtalks.com/?p=12595 UC Berkeley's Sergey Levine discusses “self-supervised offline reinforcement learning” as a promising direction of research for AI.

https://bdtechtalks.com/2022/01/06/real-world-reinforcement-learning/feed/ 0 12595
DeepMind RL method promises better co-op between AI and humans https://bdtechtalks.com/2021/11/22/deepmind-reinforcement-learning-fictitious-coplay/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-reinforcement-learning-fictitious-coplay https://bdtechtalks.com/2021/11/22/deepmind-reinforcement-learning-fictitious-coplay/#respond Mon, 22 Nov 2021 14:00:00 +0000 https://bdtechtalks.com/?p=12181 AI researchers at DeepMind present a new technique to improve the capacity of reinforcement learning agents to cooperate with humans at different skill levels.

https://bdtechtalks.com/2021/11/22/deepmind-reinforcement-learning-fictitious-coplay/feed/ 0 12181
Reinforcement learning frustrates humans in teamplay, MIT study finds https://bdtechtalks.com/2021/11/01/reinforcement-learning-hanabi/?utm_source=rss&utm_medium=rss&utm_campaign=reinforcement-learning-hanabi https://bdtechtalks.com/2021/11/01/reinforcement-learning-hanabi/#comments Mon, 01 Nov 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11981 A new study by MIT Lincoln Laboratory shows Hanabi players are frustrated when teamed up with top-performing reinforcement learning systems.

https://bdtechtalks.com/2021/11/01/reinforcement-learning-hanabi/feed/ 1 11981
Stanford reinforcement learning system simulates evolution https://bdtechtalks.com/2021/10/25/stanford-deep-evolutionary-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=stanford-deep-evolutionary-reinforcement-learning https://bdtechtalks.com/2021/10/25/stanford-deep-evolutionary-reinforcement-learning/#respond Mon, 25 Oct 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11905 A new reinforcement learning framework developed by AI researchers at Stanford simulates evolution.

https://bdtechtalks.com/2021/10/25/stanford-deep-evolutionary-reinforcement-learning/feed/ 0 11905
Reinforcement learning improves game testing, EA’s AI team finds https://bdtechtalks.com/2021/10/04/ea-reinforcement-learning-game-testing/?utm_source=rss&utm_medium=rss&utm_campaign=ea-reinforcement-learning-game-testing https://bdtechtalks.com/2021/10/04/ea-reinforcement-learning-game-testing/#respond Mon, 04 Oct 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11716 Adversarial reinforcement learning can help automate large parts of testing game environments for bugs and playability issues, EA's AI research team finds.

https://bdtechtalks.com/2021/10/04/ea-reinforcement-learning-game-testing/feed/ 0 11716
Demystifying deep reinforcement learning https://bdtechtalks.com/2021/09/02/deep-reinforcement-learning-explainer/?utm_source=rss&utm_medium=rss&utm_campaign=deep-reinforcement-learning-explainer https://bdtechtalks.com/2021/09/02/deep-reinforcement-learning-explainer/#respond Thu, 02 Sep 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11410 Deep reinforcement learning is one of the most interesting branches of AI, responsible for achievements such as mastering complex games, self-driving cars, and robotics.

https://bdtechtalks.com/2021/09/02/deep-reinforcement-learning-explainer/feed/ 0 11410
Is DeepMind’s new reinforcement learning system a step toward general AI? https://bdtechtalks.com/2021/08/02/deepmind-xland-deep-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-xland-deep-reinforcement-learning https://bdtechtalks.com/2021/08/02/deepmind-xland-deep-reinforcement-learning/#comments Mon, 02 Aug 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11106 DeepMind has released a new paper that shows impressive advances in reinforcement learning. How far does it bring us toward general AI?

https://bdtechtalks.com/2021/08/02/deepmind-xland-deep-reinforcement-learning/feed/ 2 11106
Deep reinforcement learning helps us master complexity https://bdtechtalks.com/2021/07/22/deep-reinforcement-learning-complexity/?utm_source=rss&utm_medium=rss&utm_campaign=deep-reinforcement-learning-complexity https://bdtechtalks.com/2021/07/22/deep-reinforcement-learning-complexity/#respond Thu, 22 Jul 2021 13:00:00 +0000 https://bdtechtalks.com/?p=11019 By Chris Nicholson Deep reinforcement learning—where machines learn by testing the consequences of their actions—is one of the most promising and impactful areas of artificial intelligence. It combines deep neural networks with reinforcement learning, which together can be trained to achieve goals over many steps. It’s a crucial part of self-driving vehicles and industrial robots, […]

https://bdtechtalks.com/2021/07/22/deep-reinforcement-learning-complexity/feed/ 0 11019
Evolution, rewards, and artificial intelligence https://bdtechtalks.com/2021/06/17/evolution-rewards-artificial-intelligence/?utm_source=rss&utm_medium=rss&utm_campaign=evolution-rewards-artificial-intelligence https://bdtechtalks.com/2021/06/17/evolution-rewards-artificial-intelligence/#comments Thu, 17 Jun 2021 13:00:00 +0000 https://bdtechtalks.com/?p=10646 A paper by DeepMind scientists triggered much debate about the path to artificial intelligence. Here, we'll try to draw the line between theory and practice.

https://bdtechtalks.com/2021/06/17/evolution-rewards-artificial-intelligence/feed/ 1 10646
What Google’s AI-designed chip tells us about the nature of intelligence https://bdtechtalks.com/2021/06/14/google-reinforcement-learning-ai-chip-design/?utm_source=rss&utm_medium=rss&utm_campaign=google-reinforcement-learning-ai-chip-design https://bdtechtalks.com/2021/06/14/google-reinforcement-learning-ai-chip-design/#comments Mon, 14 Jun 2021 13:00:00 +0000 https://bdtechtalks.com/?p=10609 Google's scientists developed a reinforcement learning system that can design floorplans for AI chips. The achievement shows how humans and AI can collaborate.

https://bdtechtalks.com/2021/06/14/google-reinforcement-learning-ai-chip-design/feed/ 1 10609
DeepMind scientists: Reinforcement learning is enough for general AI https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-artificial-intelligence-reward-maximization https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/#comments Mon, 07 Jun 2021 12:30:00 +0000 https://bdtechtalks.com/?p=10542 In a new paper, scientists at DeepMind suggest that reward maximization and reinforcement learning are enough to develop artificial general intelligence.

https://bdtechtalks.com/2021/06/07/deepmind-artificial-intelligence-reward-maximization/feed/ 4 10542
Reinforcement learning challenge to push boundaries of embodied AI https://bdtechtalks.com/2021/04/26/reinforcement-learning-embodied-ai/?utm_source=rss&utm_medium=rss&utm_campaign=reinforcement-learning-embodied-ai https://bdtechtalks.com/2021/04/26/reinforcement-learning-embodied-ai/#respond Mon, 26 Apr 2021 13:00:00 +0000 https://bdtechtalks.com/?p=10174 The ThreeDWorld Transport Challenge will test the limits of reinforcement learning in solving task and motion planning problems.

https://bdtechtalks.com/2021/04/26/reinforcement-learning-embodied-ai/feed/ 0 10174
How reinforcement learning chooses the ads you see https://bdtechtalks.com/2021/02/22/reinforcement-learning-ad-optimization/?utm_source=rss&utm_medium=rss&utm_campaign=reinforcement-learning-ad-optimization https://bdtechtalks.com/2021/02/22/reinforcement-learning-ad-optimization/#comments Mon, 22 Feb 2021 14:00:00 +0000 https://bdtechtalks.com/?p=9626 Reinforcement learning enables ad agencies to make billions of dollars by optimizing ads for viewers and maximizing click-through rates.

https://bdtechtalks.com/2021/02/22/reinforcement-learning-ad-optimization/feed/ 1 9626
DeepMind AlphaStar: AI breakthrough or pushing the limits of reinforcement learning? https://bdtechtalks.com/2019/11/04/deepmind-ai-starcraft-2-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-ai-starcraft-2-reinforcement-learning https://bdtechtalks.com/2019/11/04/deepmind-ai-starcraft-2-reinforcement-learning/#respond Mon, 04 Nov 2019 14:00:58 +0000 https://bdtechtalks.com/?p=5838 DeepMind's AlphaStar proves we can still push the limits of AI. But it also reminds us of the challenges we must overcome to replicate the human brain.

https://bdtechtalks.com/2019/11/04/deepmind-ai-starcraft-2-reinforcement-learning/feed/ 0 5838
Reinforcement learning helped robots solve Rubik’s Cube—does it matter? https://bdtechtalks.com/2019/10/21/openai-rubiks-cube-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=openai-rubiks-cube-reinforcement-learning https://bdtechtalks.com/2019/10/21/openai-rubiks-cube-reinforcement-learning/#respond Mon, 21 Oct 2019 13:00:25 +0000 https://bdtechtalks.com/?p=5745 OpenAI created a robotic hand that solves the Rubik's Cube. It's an interesting feat, but not a breakthrough in artificial intelligence.

https://bdtechtalks.com/2019/10/21/openai-rubiks-cube-reinforcement-learning/feed/ 0 5745
What happens when AI plays hide-and-seek 500 million times https://bdtechtalks.com/2019/09/23/openai-hide-and-seek-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=openai-hide-and-seek-reinforcement-learning https://bdtechtalks.com/2019/09/23/openai-hide-and-seek-reinforcement-learning/#comments Mon, 23 Sep 2019 13:00:12 +0000 https://bdtechtalks.com/?p=5528 OpenAI created an artificial intelligence system that develops behavior by playing hide-and-seek millions of times. Here's what we learned.

https://bdtechtalks.com/2019/09/23/openai-hide-and-seek-reinforcement-learning/feed/ 1 5528
How AI can enhance drone flight https://bdtechtalks.com/2019/06/17/neuroflight-neural-networks-drone-controller/?utm_source=rss&utm_medium=rss&utm_campaign=neuroflight-neural-networks-drone-controller https://bdtechtalks.com/2019/06/17/neuroflight-neural-networks-drone-controller/#respond Mon, 17 Jun 2019 13:00:22 +0000 https://bdtechtalks.com/?p=5002 Using neural networks and machine learning, researchers at Boston University have developed a state-of-the-art controller system for drones.

https://bdtechtalks.com/2019/06/17/neuroflight-neural-networks-drone-controller/feed/ 0 5002
The interesting facts behind DeepMind’s Quake-playing AI https://bdtechtalks.com/2019/06/03/deepmind-ai-quake-iii-arena-ctf/?utm_source=rss&utm_medium=rss&utm_campaign=deepmind-ai-quake-iii-arena-ctf https://bdtechtalks.com/2019/06/03/deepmind-ai-quake-iii-arena-ctf/#respond Mon, 03 Jun 2019 13:00:19 +0000 https://bdtechtalks.com/?p=4967 DeepMind's latest paper introduced an AI that has mastered the game Quake III. While Quake isn't the most complicated game AI has mastered, it still presents some interesting challenges.

https://bdtechtalks.com/2019/06/03/deepmind-ai-quake-iii-arena-ctf/feed/ 0 4967
What is reinforcement learning? https://bdtechtalks.com/2019/05/28/what-is-reinforcement-learning/?utm_source=rss&utm_medium=rss&utm_campaign=what-is-reinforcement-learning https://bdtechtalks.com/2019/05/28/what-is-reinforcement-learning/#respond Tue, 28 May 2019 13:00:34 +0000 https://bdtechtalks.com/?p=4930 From game-playing bots to robotic hands that dexterously handle objects, reinforcement learning creates AI models that require little training data.

https://bdtechtalks.com/2019/05/28/what-is-reinforcement-learning/feed/ 0 4930
AI defeated human champions at Dota 2. Here’s what we learned. https://bdtechtalks.com/2019/04/17/openai-five-neural-networks-dota-2/?utm_source=rss&utm_medium=rss&utm_campaign=openai-five-neural-networks-dota-2 https://bdtechtalks.com/2019/04/17/openai-five-neural-networks-dota-2/#respond Wed, 17 Apr 2019 13:00:10 +0000 https://bdtechtalks.com/?p=4712 Nine months after losing to humans at a Dota 2 e-sports event, OpenAI's team of neural networks came back and made history by defeating the world champions of the online strategy game.

https://bdtechtalks.com/2019/04/17/openai-five-neural-networks-dota-2/feed/ 0 4712