"Eliciting Latent Knowledge" is a research problem introduced by Paul Christiano, Ajeya Cotra and Mark Xu as part of work conducted at the Alignment Research Center.
Timestamps:
00:00 - Eliciting Latent Knowledge
00:46 - The Alignment Research Center (ARC)
02:25 - The Eliciting Latent Knowledge Problem
04:53 - Ontology identification
07:26 - Toy Scenario: The SmartVault
08:07 - How the SmartVault AI works
10:38 - What could go wrong?
12:57 - Addressing the problem by asking questions
16:18 - A counterexample
20:20 - The direct translator
22:25 - The human simulator
23:51 - Will gradient descent learn the direct translator or the human simulator?
25:14 - Closing thoughts
Link to Google Doc: [ Link ]
Topics: #safety #ai #alignment #ELK
For related content:
- Twitter: [ Link ]
- Research lab: [ Link ]
- Personal webpage: [ Link ]
- YouTube: [ Link ]
- TikTok: [ Link ]
- Instagram: [ Link ]
- LinkedIn: [ Link ]
- Discord server for filtir: [ Link ]
(Optional) If you'd like to support the channel:
- [ Link ]
- [ Link ]