j4orz

annual letter 2025: thinking whales

thinking

deepseek's r1 is an impressive model, particularly around what they're able to deliver for the price.

we will obviously deliver much better models and also it's legit invigorating to have a new competitor! we will pull up some releases.

— Sam Altman (@sama) January 28, 2025

Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1.

— Mark Chen (@markchen90) January 28, 2025

toolcalling

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

https://www.anthropic.com/engineering

https://sankalp.bearblog.dev/my-claude-code-experience-after-2-weeks-of-usage/

https://sankalp.bearblog.dev/my-experience-with-claude-code-20-and-how-to-get-better-at-using-coding-agents/#conclusion

https://simonwillison.net/2025/Dec/31/the-year-in-llms/#the-year-of-agents

https://sourcegraph.com/blog/revenge-of-the-junior-developer

https://steipete.me/posts/2025/shipping-at-inference-speed

https://blog.ezyang.com/2026/01/the-gap-between-a-helpful-assistant-and-a-senior-engineer/

today we launch deep research, our next agent.

this is like a superpower; experts on demand!

it can go use the internet, do complex research and reasoning, and give you back a report.

it is really good, and can do tasks that would take hours/days and cost hundreds of dollars.

— Sam Altman (@sama) February 3, 2025

GPT-5-Codex is here: a version of GPT-5 better at agentic coding.

It is faster, smarter, and has new capabilities. Let us know what you think!

The team has been absolutely cooking, very fun to watch.

— Sam Altman (@sama) September 15, 2025

the vibes on codex feel like the first few months of chatgpt.

fun energy!

— Sam Altman (@sama) September 16, 2025

the jump from gpt4 to 5-codex is just massive for those who can see it. codex is an alien juggernaut just itching to become superhuman. feeling the long awaited takeoff. there’s very little doubt that the datacenter capex will not go to waste

— roon (@tszzl) September 21, 2025

right now is the time where the takeoff looks the most rapid to insiders (we don’t program anymore we just yell at codex agents) but may look slow to everyone else as the general chatbot medium saturates

— roon (@tszzl) September 16, 2025

I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and between. I have a sense that I could be 10X more powerful if I just properly string together what has become…

— Andrej Karpathy (@karpathy) December 26, 2025

I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit.

My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to…

— Boris Cherny (@bcherny) January 2, 2026

age of research?