Every frontier AI laboratory follows a similar playbook: recruit the brightest technical minds, build sophisticated safety frameworks, and iterate towards alignment. DeepMind has its Frontier Safety Framework. Anthropic has Constitutional AI. OpenAI has its superalignment team. The logic is impeccable: the harder the technical problem, the more technical expertise you need.

This approach has produced extraordinary results. AlphaFold predicted 200 million protein structures. Large language models now reason, create, and contextualise in ways that seemed impossible five years ago. These achievements vindicate the model: elite technical talent, working within rigorous paradigms, solving precisely defined problems. It works because everyone thinks the same way.

This cognitive homogeneity isn’t always problematic. For early-stage technical challenges where the primary constraint is engineering capability, teams of specialists working within shared paradigms can achieve extraordinary results. The question is whether this approach scales to sociotechnical challenges where the core constraints are conceptual rather than computational.

But What If That’s the Problem?

The very cognitive homogeneity that makes AI labs successful at building systems may systematically obscure how they frame the problems those systems create. When an organisation recruits primarily for a particular kind of intelligence, certain questions may be harder to surface. Not because people lack curiosity or ethics, but because the questions fall outside the natural grooves of how technical minds approach problems.

This isn’t a call for more ethicists in the room. It’s an observation about what can happen when complex sociotechnical challenges get filtered through a single epistemological lens. The frame itself becomes invisible.

Elegant Solutions to the Wrong Problem

Consider a proposal for AI-driven governance that I recently heard floated: let people vote on decisions, measure outcomes, and give those who consistently vote for popular outcomes more influence in future votes. The logic is clear. Success should inform future decisions. Optimise the system.
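To make the mechanism concrete, here is a minimal sketch of how such a scheme might work. The names and the multiplicative update rule are assumptions made purely for illustration; this is a toy model of the general idea, not a description of any specific proposal.

```python
# Hypothetical sketch of an approval-weighted voting loop.
# The names and the multiplicative update rule are assumptions
# made for illustration, not features of any real system.

from collections import Counter

def run_round(votes, weights, boost=1.1):
    """votes maps voter -> chosen option; weights maps voter -> influence.
    The winning option is the weighted plurality; voters who backed it
    have their influence multiplied by `boost` for future rounds."""
    tally = Counter()
    for voter, option in votes.items():
        tally[option] += weights[voter]
    winner = tally.most_common(1)[0][0]
    for voter, option in votes.items():
        if option == winner:
            weights[voter] *= boost  # reward agreeing with the majority
    return winner, weights

# Three rounds: voter "b" shifts with the prevailing side each time.
weights = {"a": 1.0, "b": 1.0, "c": 1.0}
rounds = [
    {"a": "x", "b": "x", "c": "y"},
    {"a": "x", "b": "y", "c": "y"},
    {"a": "x", "b": "x", "c": "y"},
]
for votes in rounds:
    _, weights = run_round(votes, weights)

print(weights)  # "b", who matched every winner, ends with the most influence
```

In this toy run, the voter who tracks the winning side each round finishes with the most influence, whatever the merits of the options themselves.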

A philosopher hears something different: a mechanism that could concentrate influence amongst cognitive conformists. How do you measure ‘well-received’? Popular doesn’t necessarily mean beneficial. Measurable doesn’t always mean meaningful. The proposal assumes that good judgement reveals itself through aggregate approval, when actually it might reveal itself through protecting unpopular but crucial perspectives.

This isn’t a flaw in the thinking. It’s what can happen when you approach governance as an optimisation problem rather than as a tension between competing goods. Technical brilliance can produce elegant systems that encode problematic social dynamics, precisely because the elegance obscures what’s being optimised away.

Mistaking Technical Precision for Conceptual Clarity

AI safety frameworks talk extensively about ‘aligning AI with human values’. The phrasing suggests a solved philosophical problem: we know what human values are; we just need to encode them. But ‘human values’ aren’t a dataset waiting to be aligned. They’re contested, plural, and context-dependent.

When technologists say ‘alignment’, they often mean ‘this system does what users want it to do’. When philosophers hear ‘alignment’, they ask: aligned with which humans? Which version of their values: stated preferences or revealed preferences? Values they hold now or values they’d hold with more information? Values that make them happy or values that make them flourish?

These aren’t semantic quibbles. They’re foundational questions about what problem you’re solving. DeepMind’s frameworks address ‘socioaffective misalignment’, recognising that systems can follow instructions whilst failing to serve human wellbeing. That’s genuine progress. But ‘wellbeing’ isn’t a technical specification. It’s a philosophical concept with 2,500 years of contested debate behind it.

The technical frame treats this as a measurement challenge. The philosophical frame recognises it as a question that may not be answerable through measurement alone.

Threshold Thinking vs Categorical Confusion

AI consciousness research increasingly focuses on capability thresholds: at what point does information integration or recurrent processing warrant moral consideration? The approach mirrors how labs think about capability more broadly. Define the metric, find the threshold, build the safeguard.

But consciousness may not be a technical threshold. The question ‘is this system conscious?’ sits downstream of deeper questions: what are the necessary and sufficient conditions for consciousness? Does consciousness require embodiment? Can there be consciousness without phenomenal experience? These aren’t empirical questions waiting for better measurement. They’re conceptual questions about what consciousness is.

Recent academic work argues we need ‘principles for responsible AI consciousness research’. The very framing reveals a potential issue: developing principles for researching something we haven’t conceptually clarified. It’s like building elaborate experimental protocols before agreeing on what you’re experimenting on.

Technical minds may see this as premature philosophical handwringing. Philosophers see it as building on foundations that haven’t been examined. Both perspectives have merit. And that’s precisely why you might need both in the room.

Optimising for the Measurable

Here’s the pattern underlying all of these: technical organisations naturally gravitate towards problems that can be specified, measured, and optimised. This isn’t laziness. It’s playing to their strengths. But it can create systematic blind spots.

You can measure user satisfaction. You may not be able to measure whether your system is making humans more or less capable of genuine agency.

You can measure engagement. You may not be able to measure whether your platform is affecting people’s capacity for sustained attention.

You can measure economic efficiency. You may not be able to measure whether you’re optimising society towards ends that humans would reflectively endorse.

The unmeasurable isn’t less real. It’s often more consequential. But it can become invisible within a purely technical frame.

What This Means for AI Development

The laboratories building transformative AI are not lacking in intelligence, ethics, or good intentions. They may be constrained by the epistemological assumptions embedded in their hiring, their culture, and their ways of framing problems. These constraints are usually invisible from inside the system.

This is why external philosophical collaboration can matter. Not as ethics theatre. Not as academic consultation. But as a genuine tool for stress-testing the frameworks themselves. When someone outside your paradigm examines your assumptions, they may see what you can’t: the questions you’re not asking because they don’t fit your problem-solving architecture.

The scientists and engineers building these systems are working within sophisticated frameworks. But frameworks always embed assumptions. They specify what counts as a problem, what counts as evidence, what counts as a solution. When those frameworks come from cognitively homogeneous teams, certain failure modes can become structurally invisible.

Consider what happens as these systems become more powerful. The distance between ‘technically aligned’ and ‘genuinely beneficial’ may widen. The gap between ‘optimised’ and ‘good’ can become consequential. These aren’t bugs you can patch. They’re paradigm-level questions about what you’re building towards.

The Structure of the Problem

AI development faces a peculiar challenge: the stakes rise faster than our ability to clarify what we’re optimising for. This isn’t a call to slow down. It’s an observation that speed without frame-checking may compound risk.

Technical excellence solves technical problems. But choosing which technical problems to solve requires stepping outside the technical frame entirely. It requires asking whether you’re building systems that serve human flourishing, or whether you’re building systems optimised for goals that happen to be measurable.

Beyond the Frame

The most sophisticated organisations recognise this. They know their blind spots exist. But blind spots, by definition, are hard to see from inside. This is why external perspectives aren’t peripheral to the work. They may be structural requirements for not missing what your own paradigm obscures.

What laboratories are building will shape human civilisation. That transformation demands not just the best technical thinking humanity can offer. It may demand the best of every kind of thinking, including the philosophical depth to question whether we’re solving the right problems before engineering brilliant solutions to the wrong ones.

Laboratories willing to seek cognitive diversity, to let different kinds of minds stress-test their frameworks, may find they gain more than rigour. They may gain the ability to see what their own success has made invisible: the assumptions that seemed too obvious to question, right up until they weren’t.


Further reading:

↳ The formal taxonomy of what AI labs cannot see is developed in The Dimensions of Not Knowing.

↳ How homogeneity embeds itself in technical infrastructure is examined in Your Data Architecture Isn’t Technical.

↳ How cognitive homogeneity compounds governance risk at the board level is the subject of The Governance Cascade.
