
Francois Chollet, Mike Knoop - LLMs won’t lead to AGI - $1,000,000 Prize to find true solution

Dwarkesh Podcast

Tue Jun 11 2024



Francois Chollet and Mike Knoop - ARC Benchmark and LLMs:

  • The ARC benchmark is designed as an IQ test for machine intelligence, emphasizing core knowledge over memorization.
  • Large Language Models (LLMs) struggle with the ARC benchmark due to the requirement of synthesizing new solution programs for each unique task, a challenge they have not effectively addressed.
  • Francois Chollet emphasizes that true general intelligence involves efficiently adapting to novelty, a critical aspect lacking in current AI models like LLMs.
  • Despite some success in specific tasks through memorization, LLMs still fall short when faced with genuinely novel challenges like those presented by the ARC benchmark.

Skill vs Intelligence in AI Models:

  • Skill is distinct from intelligence: skill relies on memorization and pattern matching, while intelligence requires adapting to new scenarios without prior exposure.
  • While LLMs excel at scaling their capabilities through increased data and compute power, they lack the genuine adaptive intelligence required for comprehensive problem-solving.

Human Generalization vs. Machine Learning:

  • Humans possess a unique ability to adeptly handle novel situations without solely relying on memory or pre-existing solutions.
  • Machines such as LLMs can perform well on tasks based on memorized patterns but struggle with truly novel challenges demanding real-time program synthesis abilities.

Creativity and Interpolation in AI Models:

  • Creativity cannot be simplified as interpolation in higher dimensions; it entails generating innovative ideas by combining existing concepts uniquely.
  • Larger AI models exhibit enhanced skills across various tasks due to expanded training data but still lack the creative capacity inherent in human thought processes.

Pattern Matching vs. True Reasoning in Intelligence:

  • Intelligence encompasses a spectrum ranging from pattern matching to true reasoning, with individuals typically utilizing a combination of both methods.
  • Tasks that appear reasoning-heavy also involve intuition and guided pattern matching alongside actual reasoning.
  • The ability to reason effectively is crucial when faced with novelty, uncertainty, and change, highlighting the significance of generalization in intelligence.

Importance of Generalization for Intelligence:

  • Francois emphasizes that intelligence becomes essential when dealing with novel situations or uncertainties.
  • Humans rely on generalization to navigate through scenarios they have not encountered before, showcasing the necessity of adaptable thinking.
  • Generalization allows individuals to apply past knowledge to new contexts efficiently, enabling effective problem-solving in diverse situations.

Challenges in AI Progress Towards AGI:

  • Two trends, the closing of frontier research and the focus on large language models (LLMs), have impeded progress towards artificial general intelligence (AGI).
  • OpenAI's shift towards closed research practices has limited innovation by withholding technical details from the public domain.
  • The dominance of LLMs has diverted resources away from exploring varied directions in AI research, potentially hindering breakthroughs in AGI development.

ARC Benchmark Prize Details:

  • A $1 million prize pool is allocated annually for solving the challenging ARC benchmark.
  • Participants aim to achieve an 85% benchmark score, with a $500,000 reward for the first team reaching this target.
  • Additional prizes include a $100,000 progress award, split between the top scores and the best paper explaining conceptually how its score was achieved.

Implications of Public Domain Sharing Requirement:

  • Requiring solutions or papers to be shared publicly aims to foster open collaboration and knowledge dissemination within the AI community.
  • Top scorers contributing their approaches will advance collective understanding in tackling complex benchmarks like ARC.

ARC-AGI Prize Competition Details:

  • The ARC-AGI Prize competition draws on a $1,000,000 prize pool, with the progress prizes to be awarded by the end of November.
  • Between December and February, knowledge from top scores and approaches used by participants will be shared to align the community with advancements.
  • The contest is intended to run annually until a solution reaches the ambitious 85% target score.

Approaches in AI Progress - Deep Learning vs. Program Synthesis:

  • Benchmarks such as MMLU saturate at different rates, with models quickly reaching the ceiling on some of them.
  • Jack Cole's innovative approach involves active inference and test-time fine-tuning in LLMs for program synthesis.
  • The two approaches sit at opposite extremes: discrete program search performs deep recombination over a small set of primitive programs, whereas LLMs perform shallow recombination over millions of memorized building blocks.
  • Balancing memorization and deep search is crucial for maximizing compute cycles effectively in advancing AI capabilities.
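
The "deep recombination over a small set of primitives" side of this contrast can be illustrated with a minimal sketch of discrete program search. Everything in it is an assumption made for illustration: the three grid primitives (flip_h, flip_v, transpose), the helper names, and the toy task are hypothetical and are not taken from the episode or from any actual ARC solution.

```python
from itertools import product

# Hypothetical three-primitive DSL over grids (tuples of row tuples).
def flip_h(grid):
    """Mirror each row left to right."""
    return tuple(tuple(reversed(row)) for row in grid)

def flip_v(grid):
    """Reverse the order of the rows."""
    return tuple(reversed(grid))

def transpose(grid):
    """Swap rows and columns."""
    return tuple(zip(*grid))

PRIMITIVES = {"flip_h": flip_h, "flip_v": flip_v, "transpose": transpose}

def run(program, grid):
    """Apply a sequence of primitive names to a grid, left to right."""
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid

def search(examples, max_depth=3):
    """Enumerate programs shortest first and return the first one that
    reproduces every (input, output) example, or None if none is found."""
    for depth in range(1, max_depth + 1):
        for program in product(PRIMITIVES, repeat=depth):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None

# Toy task: each output is the input rotated 90 degrees clockwise.
EXAMPLES = [
    (((1, 2), (3, 4)), ((3, 1), (4, 2))),
    (((1, 2, 3), (4, 5, 6)), ((4, 1), (5, 2), (6, 3))),
]

if __name__ == "__main__":
    print(search(EXAMPLES))
```

No single primitive solves the toy task, so the search has to compose two of them. The number of candidate programs grows exponentially with depth, which is the cost that intuition or memorized patterns would be needed to prune.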

Challenges and Evolution of ARC Benchmark:

  • Francois Chollet acknowledges flaws in the current ARC benchmark due to redundancy and lack of novelty in tasks.
  • Plans are underway to release an updated version (ARC2) later this year with improvements based on past learnings.
  • Consideration is given to making the old private test set available through a query-based API server to prevent unintentional training on data.

Core Knowledge Acquisition and Intelligence:

  • Core knowledge can be learned through experience but is also partially innate, acquired mainly during early childhood years.
  • Humans possess core knowledge that includes basic physics principles like trajectories or bouncing patterns observed throughout life.

Future Prospects and Transparency in AGI Development:

  • The ARC competition aims to accelerate progress towards AGI by encouraging public sharing of meaningful advancements.
  • Separating the bet on open-source solutions from the bet on the scaling hypothesis helps clarify how much compute AGI development actually requires.
  • Continuous evolution of the competition rules based on feedback ensures alignment with community needs.

Accessing ARC Prize Information:

  • Interested individuals can visit ArcPrize.org for more details about participating in the competition offering a one million dollar prize.