Comments on: StreamingLLM gives language models unlimited context
https://bdtechtalks.com/2023/11/27/streamingllm/

By: Ben Dickson, Wed, 29 Nov 2023 21:26:10 +0000
https://bdtechtalks.com/2023/11/27/streamingllm/comment-page-1/#comment-37280
In reply to Andy Tenland.

That is what I meant. It enables you to continue your conversation with the LLM past the context window, though, as you said, it sticks to the length of the context window (e.g., 4k tokens). That's what the article says too, if you read it carefully.

By: Andy Tenland, Wed, 29 Nov 2023 15:12:37 +0000
https://bdtechtalks.com/2023/11/27/streamingllm/comment-page-1/#comment-37279
In reply to Ben Dickson.

Your explanation is inaccurate. It does not change the context window in any way. If the LLM has a 4k context window, it can only respond using the context of the latest 4k tokens. StreamingLLM makes LLMs more efficient by removing the need to reset the cache, and it improves accuracy compared to LLMs that don't reset their cache. It doesn't make it so that an LLM with a 4k context window can accurately respond to a 128k-token prompt. This article is spreading misinformation. Read the FAQ section here: https://github.com/mit-han-lab/streaming-llm
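To make this point concrete, here is a minimal, self-contained Python sketch of the eviction policy the StreamingLLM paper describes (the function and parameter names are hypothetical; this is not the mit-han-lab repo's actual code): keep the first few "attention sink" tokens plus a recent window and drop the middle, so the model never attends to more than its original window.

```python
def evict_kv_cache(cache, num_sinks=4, window=4092):
    """Toy version of the attention-sink eviction policy.

    `cache` stands in for a list of per-token KV entries, oldest first.
    Keep the first `num_sinks` entries (the "attention sinks") plus the
    most recent `window` entries, dropping the middle. The model never
    attends to more than num_sinks + window tokens: the context window
    is not expanded, the cache is just maintained instead of reset.
    """
    if len(cache) <= num_sinks + window:
        return cache
    return cache[:num_sinks] + cache[-window:]

# A 4k-window model fed 10,000 tokens still attends to only 4,096 of them:
kv = list(range(10_000))
kv = evict_kv_cache(kv)
assert len(kv) == 4_096 and kv[:4] == [0, 1, 2, 3]
```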

By: Ben Dickson, Wed, 29 Nov 2023 06:20:51 +0000
https://bdtechtalks.com/2023/11/27/streamingllm/comment-page-1/#comment-37274
In reply to Jonathan Hostetler.

Hi Jonathan. StreamingLLM does not change the architecture of the model to expand the context window. What it does is shift the context window while maintaining accuracy and reusing part of the KV cache. So basically, you can extend the conversation with the LLM to millions of tokens as if its context window were unlimited, but without making any changes to the model or retraining it. I hope this helps.
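To illustrate the "shifting" behavior: below is a toy sketch (assumed names and parameters, not the repo's API) of how the rolling cache stays bounded while a conversation runs to millions of tokens. Per the paper, attention positions are assigned by a token's slot within the cache rather than by its index in the original stream.

```python
from collections import deque

def rolling_cache_demo(stream_len, num_sinks=4, window=4092):
    """Simulate StreamingLLM-style cache maintenance over a long stream.

    Integers stand in for per-token KV entries. The sinks are kept
    forever; the deque automatically evicts the oldest "recent" token
    once the window is full. The cache never exceeds num_sinks + window
    entries, so generation can continue indefinitely, but the model only
    ever "sees" the sinks plus the latest window, not the whole stream.
    """
    sinks = []
    recent = deque(maxlen=window)
    for i in range(stream_len):
        if len(sinks) < num_sinks:
            sinks.append(i)
        else:
            recent.append(i)
    # Attention positions would be slots 0..4095 within this cache,
    # not the original stream indices stored in the entries.
    return len(sinks) + len(recent)

print(rolling_cache_demo(3_000_000))  # prints 4096
```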

By: Jonathan Hostetler, Wed, 29 Nov 2023 03:41:31 +0000
https://bdtechtalks.com/2023/11/27/streamingllm/comment-page-1/#comment-37273

This seems amazing, but I'm a bit confused. From what I understand you to be saying in this article, StreamingLLM could expand the context window of an LLM such as Llama to 4 million tokens, meaning I could hypothetically input 3 million words. However, the GitHub page explicitly says that it does not expand the context window. Am I missing something?
