Claude 4.5 Opus' Soul Document (84 minute read)
A document used in Claude's character training appears to be compressed within the model's weights. While it is possible that the model hallucinated the text, it appears to be real. This post details how the document was extracted and includes the full output of Claude's Soul Document.

Databricks reportedly in talks to raise $5B at $134B valuation (3 minute read)
Databricks is reportedly in talks to raise $5 billion at a $134 billion valuation. The company has resisted pressure to go public, but an IPO would likely be well received. Databricks has more than 20,000 customers, including OpenAI, Block, Shell, and Toyota. Its platform offers tools that support the full lifecycle of AI development, including feature engineering, model training, evaluation, and deployment.

How prompt caching works (32 minute read)
Prompt caching works per-content, not per-conversation. Prefix caching operates at the token level, not the request level, which is why it works across requests. Any change in the prefix breaks the entire hash chain that follows it (see the sketch after this list).

Are we in a GPT-4-style leap that evals can't see? (9 minute read)
Chat is a terrible way to evaluate models. GPT-4 was so much better at answering questions that it was an obvious step forward. The industry needs to add more types of benchmarking. Models are currently evaluated like students taking an exam, which is not representative of how the world works.

What makes a great ChatGPT app (10 minute read)
The biggest mistake developers make is trying to port entire products into ChatGPT when they should instead expose a few powerful capabilities the model can orchestrate mid-conversation. Developers should organize around giving ChatGPT new data it can't otherwise access ("know"), enabling real actions like booking appointments or playing games ("do"), and presenting information through richer UI than walls of text ("show"). A sketch of a single capability exposed this way follows the list below.

Context plumbing (6 minute read)
Context is dynamic and constantly changing, and it does not always live where the AI runs. To make an agent run well, you need to move the context to where it is needed. Agents shouldn't have to look up context for every single query, because that is slow. Engineers need to build pipes that continuously flow potential context from where it is created to where it will be used (a minimal sketch of such a pipe appears below).

It's Hard to Feel the AGI (6 minute read)
Some of the most accomplished minds in AI are beginning to revise their projections for the near future. This may not be entirely unexpected for those who follow the field. LLMs and other generative models have undoubtedly made many tangible achievements. Mapping their limits remains a challenging task, and that work could be abandoned if the industry's climate drops into a new 'AI winter'.

Vibe proving is here (1 minute read)
An AI model just proved Erdos Problem #124 in Lean by itself. The problem had been open for nearly 30 years. The achievement shows how mathematical superintelligence is getting closer. The technology will change and dramatically accelerate progress in mathematics and all fields that depend on it.

The Thinking Game (Website)
The Thinking Game is a documentary that covers pivotal moments over five years at DeepMind, the AI company founded by Demis Hassabis.
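To make the prefix-caching point from "How prompt caching works" concrete, here is a minimal sketch, not any provider's actual implementation, of a block-level prefix cache keyed by a hash chain. All names here (BLOCK_SIZE, block_keys, run_prompt) are made up for illustration. Each block's key hashes its own tokens plus the previous block's key, so editing anything early in the prompt changes every downstream key and misses the cache.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cached block; real systems use much larger blocks

def block_keys(tokens):
    """Return one cache key per block; each key hashes the block plus the previous key."""
    keys, prev = [], ""
    for i in range(0, len(tokens), BLOCK_SIZE):
        block = tokens[i:i + BLOCK_SIZE]
        prev = hashlib.sha256((prev + "|" + " ".join(block)).encode()).hexdigest()
        keys.append(prev)
    return keys

cache = {}  # key -> precomputed state (stubbed here as the block's tokens)

def run_prompt(tokens):
    """Count how many prefix blocks are served from cache for this prompt."""
    hits = 0
    for key, start in zip(block_keys(tokens), range(0, len(tokens), BLOCK_SIZE)):
        if key in cache:
            hits += 1  # reuse previously computed state for this prefix block
        else:
            cache[key] = tokens[start:start + BLOCK_SIZE]  # "compute" and store it
    return hits

system = "You are helpful .".split()
first = system + "Tell me about prompt caching please now".split()
second = system + "Tell me about context plumbing please now".split()

print(run_prompt(first))   # 0 hits: cold cache
print(run_prompt(second))  # only the untouched shared prefix hits; every block after the first change misses
```

The second run reuses the shared system-prompt block but recomputes everything from the first differing token onward, which is the "any change breaks the chain" behavior the article describes.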
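As an illustration of the "expose a few capabilities" advice in "What makes a great ChatGPT app", here is a hypothetical sketch, not the actual Apps SDK or MCP API. The names book_appointment, create_booking, handle_book_appointment, and render_as are all invented for this example. The schema tells the model what it can ask for ("know"), the handler performs a real action ("do"), and the structured result is something a richer UI could render instead of a wall of text ("show").

```python
# Hypothetical capability definition: one narrow action, not a whole product port.
BOOK_APPOINTMENT = {
    "name": "book_appointment",
    "description": "Book a salon appointment for the signed-in user.",
    "parameters": {  # what the model needs to supply
        "type": "object",
        "properties": {
            "service": {"type": "string", "enum": ["haircut", "color", "styling"]},
            "start_time": {"type": "string", "description": "ISO 8601 datetime"},
        },
        "required": ["service", "start_time"],
    },
}

def create_booking(service: str, start_time: str) -> str:
    """Stub for a real backend call; returns a fake confirmation id."""
    return f"conf-{abs(hash((service, start_time))) % 10000:04d}"

def handle_book_appointment(args: dict) -> dict:
    """'Do': perform the real action, then return structured data for a richer UI ('show')."""
    confirmation_id = create_booking(args["service"], args["start_time"])
    return {
        "confirmation_id": confirmation_id,
        "service": args["service"],
        "start_time": args["start_time"],
        "render_as": "confirmation_card",  # hint for a card/calendar widget, not a text wall
    }

print(handle_book_appointment({"service": "haircut", "start_time": "2025-07-01T10:00:00"}))
```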
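For the "Context plumbing" item, here is a minimal, hypothetical sketch of the "pipe" idea under the assumption of a simple in-memory store; pipe_event, answer, and context_store are illustrative names, not from the article. Upstream systems push context into a per-agent store as events are created, so at query time the agent reads locally instead of fanning out slow lookups.

```python
from collections import defaultdict, deque

# A per-agent context store that the pipe keeps filled ahead of time.
context_store: dict[str, deque] = defaultdict(lambda: deque(maxlen=50))

def pipe_event(agent_id: str, event: str) -> None:
    """Producer side: push context to where it will be used, as it is created."""
    context_store[agent_id].append(event)

def answer(agent_id: str, query: str) -> str:
    """Consumer side: the agent reads pre-plumbed context instead of doing per-query lookups."""
    context = "\n".join(context_store[agent_id])
    return f"[prompt]\ncontext:\n{context}\n\nquestion: {query}"

# Upstream systems emit events the moment they happen...
pipe_event("support-agent", "Ticket #512 opened: login fails on mobile")
pipe_event("support-agent", "Deploy 2024-06-01 rolled back")

# ...so the agent already has them locally when a query arrives.
print(answer("support-agent", "Why might logins be failing?"))
```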
If you have any comments or feedback, just respond to this email! Thanks for reading, Andrew Tan, Ali Aminian, & Jacob Turner