In just two years, demand for AI fluency in job postings has grown nearly sevenfold, faster than any other skill category, signaling that AI is moving from experimentation into everyday work.1 Models are more capable, tools are more accessible and organizational conviction is rising. But this acceleration surfaces a core operational problem: many of the assumptions we inherited about “human oversight” were designed for an earlier phase of AI, when systems assisted with discrete tasks rather than coordinating workflows or acting on behalf of people. As AI’s role expands, the question is not whether humans should remain involved (that much is obvious) but what effective human oversight actually looks like, and why it remains a decisive constraint on whether advanced AI systems can be trusted, governed and safely scaled in the real world.

Humans imbue AI systems with contextual meaning

In context-rich domains like finance, most AI failures are not blatantly wrong answers. They are plausible answers to poorly specified questions. As Brian Christian argues in The Alignment Problem, many breakdowns stem from specification and intent failures rather than model error: systems optimize exactly what they are told, not what organizations mean.2 The result is a semantic mismatch rooted in organizational context, not hallucination.

For example, a term that appears straightforward in finance, like “performance,” can refer to time-weighted return, money-weighted return, net-of-fees performance, benchmark-relative performance or a firm-specific adjusted metric, depending on context. AI can recognize the word, but not your firm’s definition of it. Within Addepar, humans establish and encode their specific organization’s context through custom attributes, views and groupings. Oversight, in this sense, is not solely about checking outputs; it is about ensuring AI systems reflect how the firm actually operates. Even in 2025, AI does not understand your business until humans teach it how to understand it.
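To make the ambiguity concrete, here is a minimal, hypothetical sketch in Python (the account values, cash flows and helper functions are illustrative only, not Addepar functionality): the same account history yields two materially different “performance” figures depending on whether the firm means time-weighted or money-weighted return.

```python
# Hypothetical illustration: one account, two common definitions of "performance".
# Year: start with $100, grow 10%, contribute another $100 mid-year, then lose 10%.

def time_weighted_return(subperiods):
    """Chain-link growth across sub-periods; each item is (begin_value, end_value),
    with the account valued at each external cash flow."""
    growth = 1.0
    for begin_value, end_value in subperiods:
        growth *= end_value / begin_value
    return growth - 1.0

def money_weighted_return(cash_flows, horizon_value):
    """Internal rate of return found by bisection; cash_flows is a list of
    (time_in_years, amount) with contributions negative, horizon_value at t = 1."""
    def npv(rate):
        flows = sum(amount / (1 + rate) ** t for t, amount in cash_flows)
        return flows + horizon_value / (1 + rate)
    low, high = -0.99, 10.0
    for _ in range(200):
        mid = (low + high) / 2
        if npv(mid) > 0:
            low = mid   # npv falls as the rate rises, so the root lies above mid
        else:
            high = mid
    return (low + high) / 2

twr = time_weighted_return([(100.0, 110.0), (210.0, 189.0)])
mwr = money_weighted_return([(0.0, -100.0), (0.5, -100.0)], horizon_value=189.0)
print(f"time-weighted return:  {twr:+.1%}")   # roughly -1.0%
print(f"money-weighted return: {mwr:+.1%}")   # roughly -7.3%
```

Both figures are correct under their own definition; which one the firm means when it asks how a portfolio performed is organizational context, and that is exactly the kind of definition humans have to encode before an AI system can answer the question the way the firm intends.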


Brian Cantwell Smith’s distinction between reckoning and judgment explains why this issue persists even as models seemingly improve.3 Modern AI systems excel at reckoning — computation, optimization, classification and prediction. Judgment, by contrast, is context-aware and norm-laden. It involves understanding what is at stake and acting appropriately in underdefined situations. In context-rich domains such as wealth management, this distinction is decisive. Investment decisions, client communications, compliance interpretation and exception handling all require human judgment that cannot be automated.

Trust remains a major constraint on scale

Accenture’s 2025 technology research shows that 77% of executives say AI’s value can only be unlocked when systems are built on a foundation of trust.4 They define trust in terms of accuracy, consistency, predictability and traceability. The latter two properties are inherently human oversight requirements: no matter how capable and accurate the systems are, humans still need to be able to explain why something happened and whether it should have. 


Evidence from financial decision-making underscores the same point. A field experiment on human-AI collaboration in investment contexts shows that people trust advice more when it is presented as coming from a human, even when the content is AI-generated.5 This is not irrational bias but a reflection of industry expectations. In highly regulated industries like finance, regulatory burden, reputational risk and fiduciary responsibility mean that human audit and review are not optional layers. Trust and governance, not model capability, are the true bottlenecks at scale.

Humans are the accountability layer

Philosopher John Haugeland famously observes that the problem is not that AI systems are unintelligent; it is that they are unaccountable, or in other words, that AI systems “do not give a damn.” They have no stake in outcomes, no exposure to consequences, and no moral, legal or social responsibility if they are wrong. Similarly, Smith argues that judgment — unlike reckoning — requires accountability, responsibility for outcomes, awareness that decisions matter and some form of situated consequence. This distinction maps directly to enterprise AI reality. Agents can draft investment summaries, propose reallocations and surface risks, but they cannot bear fiduciary responsibility or answer to regulators or clients.

The continued importance of human accountability helps explain where AI is actually delivering value today. McKinsey’s State of AI research shows that enterprise-wide EBIT impact remains limited, but organizations consistently report gains in innovation, customer satisfaction and competitive differentiation.6 These are not domains where AI replaces humans outright. They are domains where automated outputs intersect most deeply with human judgment — shaping decisions, informing creativity and improving client experience. In other words, AI delivers value where humans remain accountable for outcomes.

In 2025, the question is no longer whether AI can think. It is whether it can operate responsibly inside complex human systems. Humans remain in the loop not as a failsafe, but as designers, governors and accountable decision-makers.

References:

  1. Agents, robots and us: Skill partnerships in the age of AI, McKinsey & Company, 2025.

  2. The alignment problem: Machine learning and human values, Brian Christian, W.W. Norton & Company, 2021.

  3. The promise of artificial intelligence: Reckoning and judgment, Brian Cantwell Smith, MIT Press, 2019.

  4. AI: A declaration of autonomy — Is trust the limit of AI’s limitless possibilities? Technology Vision 2025, Accenture, January 2025.

  5. My advisor, her AI and me: Evidence from a field experiment on human-AI collaboration and investment decisions, Management Science, 2025. 

  6. The state of AI: How organizations are rewiring to capture value, McKinsey & Company, 2025.