Summary of Eric Drexler’s work on reframing AI safety

This post contains a bullet point summary of Reframing Superintelligence: Comprehensive AI Services as General Intelligence. (I wrote this in 2017, so it does not necessarily refer to the most up-to-date version of Drexler’s work.)

I find Drexler’s work very interesting because he has a somewhat unusual perspective on AI. My take is that his ideas have some merit, and I like that he’s questioning key assumptions. But I’m less sure I would agree with all the details, and I think we should be much more uncertain about AI than his texts often (implicitly) suggest.

The key ideas are:

  • He thinks AGI isn’t necessarily agent-like. Instead, we might build “comprehensive AI services” (CAIS) which are superintelligent, but don’t act like an opaque agent.
  • He thinks the usual concept of intelligence is misguided, and that AI is radically unlike human intelligence.
  • He thinks humans might retain control of high-level strategic decisions.

In the following, I will summarise the chapters that I found most interesting.


The R&D automation model of recursive improvement:

  • As AI advances, we can expect to see AI products automate human tasks throughout the AI development process, enabling recursive AI technology improvement.
    • But this does not entail recursive self-improvement of an agent.
    • Drexler thinks one can get comprehensive, superintelligent AI services without AI-agent risks.
    • Given that, you can apply the superintelligent services to (AI agent) safety problems.

AGI implementation technologies would directly enable non-AGI alternatives:

  • Instead of opaque, self-improving AGI agents, you could implement open, comprehensive AI services.
  • Self-improving, general-purpose AI technology would be able to contribute to AI development before it becomes generally superhuman (i.e. before it can model the world in depth, devise plans for takeover, etc.)
  • Technologies that could implement opaque, self-improving AI could be applied for other purposes, e.g. open systems.
    • There is no compelling reason to package and seal AI development processes in an opaque box.
  • Unexpected and problematic behaviour is quite possible, but differs from classic AGI risks.
    • He thinks this is more tractable than the classical “control problem”.

Broadly-competent systems work by coordinating narrower competencies:

  • Both in humans and AI systems, broader capacities are built by coordinating narrower capacities (see the sketch after this list).
  • The black-box abstraction of AI discards this structure (though, depending on what one wants to do, the abstraction can still be useful).
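
A minimal sketch of this composition idea, written by me for illustration (it is not code from the report, and the service names are made-up stubs): a “broad” assistant implemented purely by coordinating narrower services, with no single opaque agent holding all the capability.

```python
# Illustrative sketch only: each "narrow service" is a trivial stub standing in
# for a real AI component (speech recognition, translation, summarisation).

def transcribe(audio: str) -> str:
    """Narrow service: speech-to-text (stubbed)."""
    return audio  # pretend the audio has already been recognised as text

def translate(text: str, target_lang: str) -> str:
    """Narrow service: machine translation (stubbed)."""
    return f"[{target_lang}] {text}"

def summarize(text: str) -> str:
    """Narrow service: summarisation (stubbed)."""
    return text[:40] + "..."

def broad_assistant(audio: str, target_lang: str) -> str:
    """A broader competence built by coordinating narrower competencies."""
    return summarize(translate(transcribe(audio), target_lang))

print(broad_assistant("a long recorded meeting about project planning", "de"))
```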

Competitive pressures provide little incentive to transfer strategic choices to AI systems:

  • Generally, AI decision speed and quality will tend to favor AI control of decisions.
  • High-level strategic decisions are high-stakes and less urgent than many other decisions, and humans can evaluate them. 
    • This means that humans can exploit AI competence without ceding control, by having the AI systems suggest excellent options.
    • Drexler claims that this would not compromise competitiveness.
  • Senior human decision makers will likely choose to retain their authority. Ceding control of high-level decisions would mean increasing risks for declining benefits.
  • I am fairly sceptical about these claims.

Rational-agent models place intelligence in an implicitly anthropomorphic frame:

  • Drexler claims that the usual rational-agent model is psychomorphic and anthropomorphic.
  • Thinking about AI as a mind has intuitive appeal, but is misguided.
  • Even rational-agent models originate as an idealization of human decision-making; they abstract away the content of human minds, but retain the role of minds in guiding decisions.
    • Even high-level intelligence need not be psychomorphic. 
  • He argues that even technical analysis of AI systems often incorporates biological assumptions.
  • Mind-like superintelligences may be an attractor, but it’s still important to model the entire space of potential AI systems.
  • Drexler argues that emerging AI technologies are radically unlike evolved intelligent systems:
| | Evolved systems | Engineered systems |
| --- | --- | --- |
| Organisation of units | Distinct organisms | Systems of components |
| Origin of new capabilities | Incremental evolution | Systems engineering |
| Origin of instances | Local reproduction | Download from a repository |
| Basis for learning tasks | Individual experience | Aggregated training data |
| Transfer of knowledge | Teaching, imitation | Copying code, data |
| Necessary competencies | General life skills | Specific task performance |
| Success metric | Reproductive fitness | Fitness for purpose |
| Self-modification | Necessary | Optional |
| Continuity of existence | Necessary | Optional |
| World-oriented agency | Necessary | Optional |

Standard definitions of “superintelligence” conflate learning with competence:

  • Intelligence is the capacity to learn, which can be understood as distinct from competence. 
  • The two should be distinguished both in humans and in AI systems.
    • A child is considered intelligent because of learning capacity, an expert is considered intelligent because of competence.
  • Learning and competence are separable in principle and practice.
  • Patterns of learning and competence differ radically in human beings and typical AI systems.
    • E.g. learning and competence are unified in humans, but separable in AI systems (see the sketch after this list).
    • It’s a misconception to think that AI systems with superhuman learning skills will necessarily have superhuman capabilities.
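
As a toy illustration of that separation (my own sketch, not from the report; the data and model are invented), the learning machinery and the deployed competence below are entirely separate pieces: training produces a parameter, and the deployed predictor performs the task without any further learning.

```python
# Illustrative sketch: "learning" (fitting a parameter) is separate from
# "competence" (a frozen predictor that cannot learn anything further).

def learn(data, steps=1000, lr=0.01):
    """Learning phase: fit y ~ w * x by gradient descent on squared error."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w  # the learned parameter is all the deployed system needs

def deploy(w):
    """Competence phase: a frozen predictor with no learning machinery at all."""
    return lambda x: w * x

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
predictor = deploy(learn(data))
print(predictor(4.0))  # competent at the task, but incapable of further learning
```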

Providing comprehensive AI services neither requires nor entails AGI:

  • Drexler thinks the practical incentives for developing AGI are surprisingly weak because you can have comprehensive AI services instead (and that’s much safer).
  • This means that AGI development is not inevitable (but still the default unless actively prevented).

Reinforcement learning systems ≠ reward-seeking agents:

  • In current practice, RL systems are typically distinct from the resulting agents.
    • RL rewards guide training, but are not motivations, actions, or sources of value for the trained agent.
    • RL systems and task-performing agents are not unitary “RL agents”; instead, the trained agents are products of RL systems (the sketch after this list illustrates the separation).
  • Aggregation attenuates the link between agents, actions, and experience.
  • As in other parts of the work, it’s important to think about this in terms of a “development-oriented approach”: analyse the processes that develop and train AI systems, not only the resulting agents.
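
A toy illustration of that distinction (my own sketch, not Drexler’s; the bandit environment and numbers are invented): the RL system consumes rewards while training, whereas the agent it produces is just a reward-free policy.

```python
import random

# Illustrative sketch: the RL *system* (training loop) uses rewards to update
# parameters; the resulting *agent* is a policy that never sees a reward.

def bandit_reward(arm: int) -> float:
    """Toy environment: arm 1 pays off better on average."""
    return random.gauss(1.0 if arm == 1 else 0.2, 0.1)

def rl_system(episodes: int = 500, eps: float = 0.1):
    """The RL system: rewards drive parameter updates during training only."""
    q = [0.0, 0.0]
    counts = [0, 0]
    for _ in range(episodes):
        arm = random.randrange(2) if random.random() < eps else q.index(max(q))
        counts[arm] += 1
        q[arm] += (bandit_reward(arm) - q[arm]) / counts[arm]
    return q  # the trained policy parameters

def deployed_agent(q):
    """The product of training: maps situations to actions, with no reward input."""
    return lambda: q.index(max(q))

policy = deployed_agent(rl_system())
print("deployed agent picks arm:", policy())
```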

Broad world knowledge is compatible with strong (and safe) task focus:

  • Language translation systems show that a system can have broad world knowledge while still being restricted to a single task (see the sketch after this list).
  • Current neural machine translation systems develop language-independent representations of meaning.
  • Robust task focus can support safety.
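
To make the narrow-interface point concrete, here is an illustrative sketch (the internals are stubbed; it is not a real NMT system): however broad the knowledge encoded inside the model, callers can only ever request a translation.

```python
# Illustrative sketch: broad internal knowledge behind a narrow, task-focused
# interface. The "representations" dict stands in for learned model weights.

class TranslationService:
    def __init__(self):
        # Stand-in for broad, language-independent representations of meaning
        # learned from large corpora.
        self._representations = {"hello": "GREETING", "world": "EARTH"}

    def translate(self, text: str, target_lang: str) -> str:
        """The only task the service performs, however much it 'knows'."""
        return f"[{target_lang}] {text}"  # stubbed decoding step

service = TranslationService()
print(service.translate("hello world", "fr"))
# There is no API for open-ended planning or acting in the world:
# broad knowledge, narrow task focus.
```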

Ensuring non-collusion among superintelligent oracles:

  • He suggests using superintelligent problem-solving capabilities to solve AI safety.
  • A familiar objection is that such approaches are unsafe in and of themselves.
    • In particular, the worry is that multiple superintelligent systems would collude against humans.
  • But Drexler thinks there are strategies to preclude deceptive collusion (a simplified sketch of one such ingredient follows this list).
  • Paul Christiano also shares this view.
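
For intuition only, here is a heavily simplified sketch (my own framing, not a protocol from the report) of one ingredient such strategies might use: independently implemented, mutually isolated oracles whose answers are cross-checked, so that no single system’s answer is trusted on its own.

```python
from collections import Counter

# Illustrative sketch: query several isolated, independently implemented
# oracles and accept an answer only if enough of them agree. The oracles here
# are trivial stand-ins; in the intended setting they would never communicate.

def oracle_a(question: str) -> str:
    return "42"

def oracle_b(question: str) -> str:
    return "42"

def oracle_c(question: str) -> str:
    return "42"

def cross_checked_answer(question, oracles, threshold=2):
    """Accept an answer only if at least `threshold` oracles independently give it."""
    votes = Counter(oracle(question) for oracle in oracles)
    answer, count = votes.most_common(1)[0]
    return answer if count >= threshold else None

print(cross_checked_answer("What is 6 * 7?", [oracle_a, oracle_b, oracle_c]))
```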

Miscellaneous other points on AI safety:

  • Human oversight would be compatible with recursive AI technology improvement, and isn’t crowded out by competitive pressures.
  • AI technologies could themselves help with human oversight.
  • It would be possible to cure cancer without incurring AGI risk.
  • Strong optimization power may increase AI safety by constraining the structure and behaviour of a system. 
  • Boxing is misguided because capabilities are distributed. 
  • AI safety should be considered in the context of expected future knowledge.
  • Comprehensive AI services would mitigate, but not solve, the AGI control problem.
