AI research papers - TechTalks
https://bdtechtalks.com
Technology solving problems... and creating new ones

Under the hood: The innovations powering DeepSeek’s AI breakthrough (April 7, 2025)
https://bdtechtalks.com/2025/04/07/deepseek-innovations/
Here is how DeepSeek models disrupted AI norms and revealed that outstanding performance and efficiency don't require secrecy.

How Open-Sora 2.0 cuts the costs of AI video generation without sacrificing quality (March 24, 2025)
https://bdtechtalks.com/2025/03/24/open-sora-2/
Open-Sora 2.0 cuts the cost of creating a bleeding-edge text-to-video AI model by using the right data, architecture, and training regime.

Claude 3.5 Sonnet outperforms GPT-4o and o1 in software engineering, OpenAI study shows (February 24, 2025)
https://bdtechtalks.com/2025/02/24/claude-3-5-sonnet-outperforms-gpt-4o-and-o1-in-software-engineering-openai-study-shows/
A new OpenAI study reveals Claude 3.5 Sonnet outperforms GPT-4o and o1 on SWE-Lancer, a new benchmark simulating real-world software engineering tasks.

How multiagent fine-tuning overcomes the data bottleneck of LLMs (January 27, 2025)
https://bdtechtalks.com/2025/01/27/llm-multiagent-fine-tuning/
Multiagent debate and fine-tuning can enable LLMs to create high-quality training data to improve themselves across different tasks.

New training paradigm prevents machine learning models from learning spurious correlations (January 20, 2025)
https://bdtechtalks.com/2025/01/20/memorization-aware-training-machine-learning/
Meta researchers show how memorization-aware training can help machine learning models avoid developing dangerous biases.

GEAR turbo-charges LLMs with advanced graph-based RAG capabilities (January 13, 2025)
https://bdtechtalks.com/2025/01/13/gear-graph-based-llm-rag/
GEAR enhances RAG by automatically extracting triples and using beam search to create and iterate over graph representations from retrieved documents.

Augmentation-based jailbreaking reveals critical flaws in AI models (December 30, 2024)
https://bdtechtalks.com/2024/12/30/best-of-n-jailbreaking/
Best-of-N jailbreaking is a black-box attack that can circumvent the safeguards of frontier LLMs, including Claude, GPT-4o, and Gemini.

Encoders make a strong comeback with ModernBERT (December 27, 2024)
https://bdtechtalks.com/2024/12/27/modernbert-llm-encoder/
ModernBERT combines the strengths of encoder-based models with the latest techniques for making transformers more efficient.

Tokenformer is a Transformer model that scales more efficiently (December 16, 2024)
https://bdtechtalks.com/2024/12/16/tokenformer-model-transformer-alternative/
Tokenformer uses the attention mechanism exclusively to create a transformer architecture that can be scaled without training from scratch.

LLMs don’t need all the attention layers, study shows (December 9, 2024)
https://bdtechtalks.com/2024/12/09/llm-attention-layer-pruning/
LLMs can shed a substantial portion of their attention layers without hurting their performance.

Nvidia’s Hymba is an efficient SLM that combines state-space models and transformers (December 2, 2024)
https://bdtechtalks.com/2024/12/02/nvidia-hymba-slm/
Hymba integrates transformers and state-space models to reduce costs and increase speed while maintaining accuracy.

How treating LLMs as “actors” can produce better results (November 25, 2024)
https://bdtechtalks.com/2024/11/25/llm-method-actors/
Think of LLMs as actors, prompts as scripts, and LLM outputs as performances.

Self-Evolving Reward Learning aligns LLMs with less human feedback (November 18, 2024)
https://bdtechtalks.com/2024/11/18/self-evolving-reward-learning-aligns-llms-with-less-human-feedback/
Large language models (LLMs) have internal world models that they can use to review their own answers and automatically label data to train reward models.

Adversarial pop-ups trick AI agents into clicking malicious links (November 10, 2024)
https://bdtechtalks.com/2024/11/10/adversarial-popups-ai-agents/
AI agents click on malicious pop-ups that human users would easily avoid.

New technique teaches LLMs to optimize their “thought” process (November 4, 2024)
https://bdtechtalks.com/2024/11/04/thinking-llms/
Thought Preference Optimization (TPO) teaches LLMs to generate logical thoughts before responding to queries.

Minimized RNNs offer a fast and efficient alternative to Transformers (October 28, 2024)
https://bdtechtalks.com/2024/10/28/minimized-rnn-vs-transformer/
With a few changes, RNNs can be optimized for parallel training, making them competitive with Transformers while keeping them efficient.

Would you play an AI-generated game? (October 25, 2024)
https://bdtechtalks.com/2024/10/25/unbounded-ai-generated-game/
Unbounded is a game engine that creates interactive experiences on the fly using LLMs and image generation models.

The (not so) hidden costs of AI’s “Bigger is Better” paradigm (October 20, 2024)
https://bdtechtalks.com/2024/10/20/costs-ai-bigger-is-better/
The arms race for scaling AI models comes at the cost of less efficient solutions, narrow research directions, and centralization of power.

Simulating millions of LLM agents with AgentTorch (October 2, 2024)
https://bdtechtalks.com/2024/10/02/agenttorch-llm-agents/
AgentTorch is a framework that allows you to simulate large populations through LLM agents and archetypes.

Promptriever trains LLMs for information retrieval and instruction following (September 23, 2024)
https://bdtechtalks.com/2024/09/23/promptriever-llm-information-retrieval/
Information retrieval should not come at the cost of instruction-following capabilities.

Can AI make scientific discoveries? (September 16, 2024)
https://bdtechtalks.com/2024/09/16/can-ai-make-scientific-discoveries/
Current AI algorithms can solve the "easy problem" of scientific research, but the "hard problem" of coming up with the research question itself remains a human job.

How LLMs can automatically design agentic systems (September 9, 2024)
https://bdtechtalks.com/2024/09/09/adas-automated-agent-design/
Why not let LLMs design agentic systems themselves? This is what ADAS proposes.

What to know about GameNGen, Google’s DOOM simulator (September 2, 2024)
https://bdtechtalks.com/2024/09/02/google-gamengen-doom-simulator/
Google Research's GameNGen is a diffusion model that can imagine DOOM video frames. Why would we need such a thing?

How UC Berkeley is making humanoid robotic research fast and affordable (August 19, 2024)
https://bdtechtalks.com/2024/08/19/berkeley-humanoid-robot/
Researchers at UC Berkeley have released a mid-sized humanoid robot that is safe and affordable and has a thin sim-to-real gap.

Thinking in graphs improves LLMs’ planning abilities, but challenges remain (August 12, 2024)
https://bdtechtalks.com/2024/08/12/thinking-in-graphs-improves-llms-planning-abilities-but-challenges-remain/
LLMs perform very poorly at planning asynchronous tasks, but formulating the task as a graph can help improve their performance.

Why accuracy is a misleading metric when evaluating compressed LLMs (August 6, 2024)
https://bdtechtalks.com/2024/08/06/why-accuracy-is-a-misleading-metric-when-evaluating-compressed-llms/
Compressed LLMs maintain their accuracy compared to the baseline models, but other metrics show that their behavior changes dramatically.

Meta SAM 2 is the most impressive object segmentation model (August 5, 2024)
https://bdtechtalks.com/2024/08/05/meta-sam-2-object-segmentation-model/
Meta's new object segmentation model, SAM 2, provides near-real-time inference on a wide variety of objects and environments.

Why vision-language models fail on simple visual tests (August 1, 2024)
https://bdtechtalks.com/2024/08/01/vlms-visual-test-failures/
Vision-language models (VLMs) score high on competitive multi-modal benchmarks but fail on basic visual acuity tests, according to a new study.

How to turbocharge LLMs for spreadsheet tasks (July 29, 2024)
https://bdtechtalks.com/2024/07/29/microsoft-spreadsheetllm/
Large language models are not designed for spreadsheets. Microsoft's SpreadsheetLLM makes spreadsheets digestible by LLMs.

PAS finds the best prompting technique for your LLM (July 22, 2024)
https://bdtechtalks.com/2024/07/22/pas-automatic-prompt-engineering-llms/
PAS is an automated prompt engineering (APE) system that chooses the best prompting technique for each input to an LLM.

How AI agents can self-improve with symbolic learning (July 8, 2024)
https://bdtechtalks.com/2024/07/08/ai-agent-symbolic-learning/
Researchers at AIWaves have released a symbolic learning framework that allows LLM-based AI agents to self-improve their components based on new data.

DeepMind releases benchmark for evaluating long-context LLMs (July 1, 2024)
https://bdtechtalks.com/2024/07/01/deepmind-loft-long-context-llm/
Google DeepMind has released Long-Context Frontiers (LOFT), a benchmark for LLMs that can process hundreds of thousands or millions of tokens in one prompt.

Energy-Based World Models bring human-like cognition to AI (June 24, 2024)
https://bdtechtalks.com/2024/06/24/energy-based-world-models/
Energy-based world models (EBWM) enable AI systems to reflect on their predictions and achieve human-like cognitive abilities missing in autoregressive models.

HippoRAG takes cues from the brain to improve LLM retrieval (June 17, 2024)
https://bdtechtalks.com/2024/06/17/hipporag-llm-retrieval/
HippoRAG is a technique inspired by the interactions between the cortex and hippocampus to improve knowledge retrieval for large language models (LLMs).

How to boost language models with graph neural networks (June 10, 2024)
https://bdtechtalks.com/2024/06/10/gnn-rag/
GNN-RAG brings together the knowledge graph–processing abilities of graph neural networks and the language abilities of LLMs to unlock new applications.

DeepSeek-Prover uses synthetic data to boost theorem proving in LLMs (June 3, 2024)
https://bdtechtalks.com/2024/06/03/deepseek-prover/
DeepSeek has created an algorithm that enables an LLM to bootstrap itself by starting with a small dataset of labeled theorem proofs and creating increasingly higher-quality examples to fine-tune itself.

How to optimize ChatGPT and other LLMs for software engineering (May 27, 2024)
https://bdtechtalks.com/2024/05/27/chatgpt-software-engineering/
A new study shows the strengths and pain points of using ChatGPT in software engineering. The findings can help organizations turbocharge their developers with LLMs.

Boost LLM application development with many-shot learning (May 20, 2024)
https://bdtechtalks.com/2024/05/20/long-context-llm-applications/
A study by Carnegie Mellon University and Tel Aviv University shows that many-shot learning with long-context LLMs matches retrieval-augmented generation (RAG) and fine-tuning.

How far can you trust chain-of-thought prompting? (May 13, 2024)
https://bdtechtalks.com/2024/05/13/chain-of-thought-planning/
A new study shows that chain-of-thought (CoT) prompts only improve large language models (LLMs) on very narrow planning tasks and don't generalize broadly.

Train your LLMs to choose between RAG and internal memory automatically (May 6, 2024)
https://bdtechtalks.com/2024/05/06/adapt-llm/
Adapt-LLM is a technique that enables language models to choose between their parametric memory and an information retrieval (RAG) system.

What OpenELM language models say about Apple’s generative AI strategy (April 29, 2024)
https://bdtechtalks.com/2024/04/29/apple-openelm/
Apple has released the full code, weights, checkpoints, and more for OpenELM, its latest language models. Here is what it means for its generative AI strategy.

How to turn any LLM into an embedding model (April 22, 2024)
https://bdtechtalks.com/2024/04/22/llm2vec/
Researchers at Quebec AI Institute (Mila) have released LLM2Vec, a technique that can turn any decoder-only LLM into a universal embedding model.

Stanford’s ReFT fine-tunes LLMs at a fraction of the cost (April 15, 2024)
https://bdtechtalks.com/2024/04/15/reft-llm-fine-tuning/
Representation Fine-Tuning (ReFT) is a technique that fine-tunes LLMs for specific tasks by modifying only a small fraction of their representations.

Compress GPT-4 and Claude prompts with LLMLingua-2 (April 1, 2024)
https://bdtechtalks.com/2024/04/01/llmlingua-2-prompt-compression/
LLMLingua-2 is a prompt compression technique by Microsoft that can reduce the size of prompts by up to five times.

How to fine-tune LLMs for better RAG performance (March 25, 2024)
https://bdtechtalks.com/2024/03/25/raft-llm-fine-tuning-for-rag/
Retrieval Augmented Fine Tuning (RAFT) combines supervised fine-tuning with RAG to improve LLM domain knowledge and the ability to use in-context documents.

Netflix study shows limits of cosine similarity in embedding models (March 21, 2024)
https://bdtechtalks.com/2024/03/21/netflix-cosine-similarity-embedding-models/
Blindly using cosine similarity in embedding models can yield arbitrary and therefore meaningless similarities, research by Netflix shows.

How to customize LLMs for low-frequency topics (March 18, 2024)
https://bdtechtalks.com/2024/03/18/llm-rag-vs-fine-tuning/
A new study provides insights into the effectiveness of RAG and fine-tuning for topics that are not included in the model's training data.

How to improve the throughput of LLM application servers (March 12, 2024)
https://bdtechtalks.com/2024/03/12/llm-relay-attention/
RelayAttention is a technique that increases the throughput of LLM servers by reducing memory access to the KV values of system prompts.

Diffusion models are now turbocharging reinforcement learning systems (March 4, 2024)
https://bdtechtalks.com/2024/03/04/diffusion-world-model/
Diffusion models are best known for their image-generation abilities. Now, they are being used to learn world models for reinforcement learning systems.

How language models can teach themselves to follow instructions (January 29, 2024)
https://bdtechtalks.com/2024/01/29/self-rewarding-language-models/
Meta and NYU have released "self-rewarding language models," a technique that enables LLMs to self-improve for instruction-following.
