Solving Chollet's ARC-AGI with GPT-4o

Ryan Greenblatt from Redwood Research recently published "Getting 50% (SoTA) on ARC-AGI with GPT-4o," in which he used GPT-4o to reach state-of-the-art accuracy on Francois Chollet's ARC Challenge by generating many Python programs.
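
As a rough illustration of what "generating many Python programs" can look like in practice, here is a minimal sketch. It assumes that each candidate program defines a function called transform, and that a candidate is kept only if it reproduces every training example exactly. Those details, the helper names, and the hand-written candidate strings (stand-ins for model output) are assumptions made for illustration, not Ryan's actual pipeline.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]
Example = Tuple[Grid, Grid]


def run_candidate(source: str, grid: Grid) -> Optional[Grid]:
    """Run candidate source that should define transform(grid); None on any failure."""
    namespace: dict = {}
    try:
        # Candidate code is untrusted; a real system would sandbox this call.
        exec(source, namespace)
        return namespace["transform"]([row[:] for row in grid])
    except Exception:
        return None


def select_program(candidates: List[str], train: List[Example]) -> Optional[str]:
    """Return the first candidate that reproduces every training output exactly."""
    for source in candidates:
        if all(run_candidate(source, inp) == out for inp, out in train):
            return source
    return None


# Hypothetical stand-ins for programs sampled from a language model.
candidates = [
    "def transform(grid):\n    return grid",
    "def transform(grid):\n    return [row[::-1] for row in grid]",
    "def transform(grid):\n    return grid[0][5]",  # fails with IndexError
]

# Toy task: mirror each row left to right.
train = [
    ([[1, 2], [3, 4]], [[2, 1], [4, 3]]),
    ([[5, 0, 0]], [[0, 0, 5]]),
]

best = select_program(candidates, train)
prediction = run_candidate(best, [[7, 8, 9]]) if best else None
print(prediction)
```

The point of the sketch is that the training examples act as a cheap verifier: most sampled programs can be wrong as long as at least one passes the check.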

Sponsor:
Sign up to Kalshi here [ Link ]: the first 500 traders who deposit $100 will get a free $20 credit! Important disclaimer, in case it's not obvious: this is basically gambling and a *high risk* activity. Only trade what you can afford to lose.

We discuss:
- Ryan's unique approach to solving the ARC Challenge and achieving impressive results.
- The strengths and weaknesses of current AI models.
- How AI and humans differ in learning and reasoning.
- Combining various techniques to create smarter AI systems.
- The potential risks and future advancements in AI, including the idea of agentic AI.

[ Link ]
[ Link ]

TOC
00:00:00 Intro
00:01:38 Prelude on goals in LLMs
00:02:42 Ryan intro
00:03:11 Ryan's ARC Challenge Approach
00:38:15 Language models, reasoning and agency
01:14:14 Timelines on superintelligence
01:27:05 Growth of superintelligence
02:06:41 Reflections on ARC
02:11:49 Why wouldn't AI knowledge be subjective?

Host: Dr. Tim Scarfe

Pod: [ Link ]

Refs:
Getting 50% (SoTA) on ARC-AGI with GPT-4o [Ryan Greenblatt]
[ Link ]

On the Measure of Intelligence [Chollet]
[ Link ]

Connectionism and Cognitive Architecture: A Critical Analysis [Jerry A. Fodor and Zenon W. Pylyshyn]
[ Link ]

Software 2.0 [Andrej Karpathy]
[ Link ]

Why Greatness Cannot Be Planned: The Myth of the Objective [Kenneth Stanley]
[ Link ]

Biographical account of Terence Tao's mathematical development [M. A. (Ken) Clements]
[ Link ]

Model Evaluation and Threat Research (METR)
[ Link ]

Why Tool AIs Want to Be Agent AIs
[ Link ]

Simulators [Janus]
[ Link ]

AI Control: Improving Safety Despite Intentional Subversion
[ Link ]
[ Link ]

What a Compute-Centric Framework Says About Takeoff Speeds
[ Link ]

Global GDP over the long run
[ Link ]

Safety Cases: How to Justify the Safety of Advanced AI Systems
[ Link ]

The Danger of a "Safety Case"
[ Link ]

The Future Of Work Looks Like A UPS Truck (~02:15:50)
[ Link ]

SWE-bench
[ Link ]

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
[ Link ]

Algorithmic Progress in Language Models
[ Link ]
