Ryan Greenblatt from Redwood Research recently published "Getting 50% (SoTA) on ARC-AGI with GPT-4o," where he used GPT-4o to reach state-of-the-art accuracy on Francois Chollet's ARC Challenge by generating many Python programs.
Sponsor:
Sign up to Kalshi here: [ Link ]. The first 500 traders who deposit $100 will get a free $20 credit! Important disclaimer, in case it's not obvious: this is basically gambling and a *high risk* activity. Only trade what you can afford to lose.
We discuss:
- Ryan's approach to the ARC Challenge: having GPT-4o generate many candidate Python programs.
- The strengths and weaknesses of current AI models.
- How AI and humans differ in learning and reasoning.
- Combining various techniques to create smarter AI systems.
- The potential risks and future advancements in AI, including the idea of agentic AI.
[ Link ]
[ Link ]
TOC
00:00:00 Intro
00:01:38 Prelude on goals in LLMs
00:02:42 Ryan intro
00:03:11 Ryan's ARC Challenge Approach
00:38:15 Language models, reasoning and agency
01:14:14 Timelines on superintelligence
01:27:05 Growth of superintelligence
02:06:41 Reflections on ARC
02:11:49 Why wouldn't AI knowledge be subjective
Host: Dr. Tim Scarfe
Pod: [ Link ]
Refs:
Getting 50% (SoTA) on ARC-AGI with GPT-4o [Ryan Greenblatt]
[ Link ]
On the Measure of Intelligence [Chollet]
[ Link ]
Connectionism and Cognitive Architecture: A Critical Analysis [Jerry A. Fodor and Zenon W. Pylyshyn]
[ Link ]
Software 2.0 [Andrej Karpathy]
[ Link ]
Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth Stanley]
[ Link ]
Biographical account of Terence Tao’s mathematical development [M.A. (Ken) Clements]
[ Link ]
Model Evaluation and Threat Research (METR)
[ Link ]
Why Tool AIs Want to Be Agent AIs
[ Link ]
Simulators [Janus]
[ Link ]
AI Control: Improving Safety Despite Intentional Subversion
[ Link ]
[ Link ]
What a Compute-Centric Framework Says About Takeoff Speeds
[ Link ]
Global GDP over the long run
[ Link ]
Safety Cases: How to Justify the Safety of Advanced AI Systems
[ Link ]
The Danger of a “Safety Case”
[ Link ]
The Future Of Work Looks Like A UPS Truck (~02:15:50)
[ Link ]
SWE-bench
[ Link ]
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
[ Link ]
Algorithmic Progress in Language Models
[ Link ]