
Anthropic Announces Claude 2.1 LLM with Wider Context Window and Support for AI Tools

According to Anthropic, the newest version of Claude delivers many “advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use.” Anthropic also announced reduced pricing to improve cost efficiency for its customers across models.

The enhanced context window is a standout feature of Claude 2.1: at 200,000 tokens, it surpasses the 128,000-token window of OpenAI's GPT-4 Turbo. Anthropic states the new model is also less likely to produce false statements than its predecessor. Claude 2.1 attempts to acknowledge uncertainty, often opting to demur rather than provide incorrect information. Anthropic says the model demonstrates a 30% reduction in incorrect answers and a substantially lower rate of mistakenly affirming unsupported claims.

Another notable addition is Claude 2.1's ability to use tools and interact with APIs. This feature enables the model to utilize external resources such as calculators and databases, or even perform web searches, to respond more effectively to queries. It can also be integrated into users' tech stacks, allowing for more versatile applications across various fields.
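In tool-use schemes like this, the developer describes each tool to the model as a JSON-schema-style definition; the model then requests a tool call, the application executes it, and the result is fed back. The sketch below builds such a definition in Python. The field names and the calculator tool are illustrative conventions, not necessarily the exact shape of Anthropic's beta tool-use interface.

```python
# Illustrative sketch: building a tool definition in the JSON-schema style
# commonly used by LLM tool-use APIs. Field names here are assumptions for
# demonstration, not a verbatim copy of Anthropic's beta interface.

def make_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Build a tool definition the model can be offered alongside a prompt."""
    return {
        "name": name,
        "description": description,
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

calculator = make_tool(
    name="calculator",
    description="Evaluate a basic arithmetic expression and return the result.",
    properties={"expression": {"type": "string", "description": "e.g. '23 * 19'"}},
    required=["expression"],
)
```

A good description matters as much as the schema: the model decides whether and how to call the tool purely from this definition.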

Furthermore, Claude 2.1 introduces system prompts, enabling users to set specific contexts for their requests. This feature allows for more structured and consistent responses from the model. The pricing structure is set at $8 per million tokens processed in input prompts and $24 per million tokens in the model's output, making it accessible to a wide range of users, including developers and businesses.
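At those announced rates, per-request cost is a simple linear function of input and output token counts; the snippet below sketches the arithmetic:

```python
# Back-of-the-envelope cost estimate at Claude 2.1's announced rates:
# $8 per million input tokens, $24 per million output tokens.

INPUT_PER_MTOK = 8.00
OUTPUT_PER_MTOK = 24.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single request at the announced rates."""
    return (input_tokens * INPUT_PER_MTOK + output_tokens * OUTPUT_PER_MTOK) / 1_000_000

# Filling the full 200K-token window with a 1K-token answer:
# 200_000 * 8/1e6 + 1_000 * 24/1e6 = 1.60 + 0.024 = $1.624 per request
cost = estimate_cost(200_000, 1_000)
```

Note that a maximally filled 200K context is relatively expensive per call, which is worth factoring in before defaulting to long-context prompts over retrieval.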

Users expressed mixed reviews of the new model. On the positive side, some found Claude 2.1 enjoyable for tasks like chatting and commended its advancements and capabilities, particularly in summarization. Others, however, expressed frustration with the model's refusals and perceived heavy censorship, which they feel makes the tool less practical and autonomous. There are also concerns about Claude's limitations in handling certain content, such as academic or research materials, due to strict safety protocols and content guidelines.

Greg Kamradt, who ran a long-context recall test against the model, summarized his findings:

* At 200K tokens (nearly 470 pages), Claude 2.1 was able to recall facts at some document depths
* Facts at the very top and very bottom of the document were recalled with nearly 100% accuracy
* Facts positioned toward the top of the document were recalled less reliably than those at the bottom (similar to GPT-4)
* Starting at ~90K tokens, recall of facts at the bottom of the document became increasingly worse
* Performance at low context lengths was not guaranteed
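The setup behind findings like these is often called a "needle in a haystack" test: a known fact is planted at a chosen depth in a long filler document, and the model is asked to retrieve it. A minimal sketch of the document-construction step (function and variable names here are illustrative, not Kamradt's actual harness):

```python
# Sketch of the "needle in a haystack" setup: plant a known fact ("needle")
# at a fractional depth in a long filler document ("haystack"), then query
# the model and score whether the answer contains the planted fact.

def insert_needle(haystack: str, needle: str, depth: float) -> str:
    """Insert `needle` at a fractional depth (0.0 = top, 1.0 = bottom)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    pos = int(len(haystack) * depth)
    return haystack[:pos] + needle + haystack[pos:]

doc = insert_needle("lorem ipsum " * 1000, "The magic number is 42. ", depth=0.9)
# A full harness would send `doc` plus a question such as "What is the magic
# number?" to the model at many (context length, depth) pairs and record
# retrieval accuracy for each cell.
```

Sweeping depth from 0.0 to 1.0 across a range of context lengths produces the accuracy-by-position grid that the findings above describe.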

Anthropic's timely launch of Claude 2.1 coincides with a period of internal strife at OpenAI, marked by the temporary suspension of new ChatGPT Plus subscriptions and the recent situation surrounding CEO Sam Altman. Despite this, Devin Coldewey writes, "GPT-4 is still the gold standard on code generation, for instance, and Claude will handle requests differently than its competitors, some better, some worse."

Users who want to learn more about Claude 2.1 may refer to the model card on Anthropic's website. Anthropic has also made an example repository demonstrating how to work with tools.
