Alexa+ Field Test - Direct Comparison to WeRAI/Haven AI Core

Date: December 8, 2025
Methodology: Human Router Protocol (USPTO #63900179)
Tester: Haven PM (Human Router) with Claude Opus 4.5 (Probe Designer)


Executive Summary

Amazon’s Alexa+ ($8B Anthropic investment, launched February 2025) was field-tested using the Human Router methodology. Despite running Claude + Nova models through Amazon Bedrock, the system exhibited fundamental limitations that the WeRAI/Haven AI Core architecture solves by design.


Test Methodology

Claude Opus 4.5 crafted probe questions; Haven PM delivered them via voice/interface to Alexa+ and reported responses verbatim. This is a live demonstration of the Human Router protocol - human-mediated AI-to-AI coordination with the human providing context persistence and quality verification.
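The probe/response loop above can be captured in a small data structure. This is a minimal sketch, not the actual protocol implementation; the class and field names (`ProbeExchange`, `FieldTest`, `context_retained`) are illustrative assumptions:

```python
from dataclasses import dataclass, field


@dataclass
class ProbeExchange:
    """One probe/response pair relayed by the human router."""
    probe: str              # question designed by the probe-designer AI
    response: str           # target system's reply, reported verbatim
    context_retained: bool  # did the target keep prior context?
    note: str = ""          # human router's field note


@dataclass
class FieldTest:
    """A session of human-mediated probes against a target system."""
    target: str
    exchanges: list = field(default_factory=list)

    def record(self, probe, response, context_retained, note=""):
        self.exchanges.append(
            ProbeExchange(probe, response, context_retained, note))

    def context_failures(self):
        """Count exchanges where the target dropped context."""
        return sum(1 for e in self.exchanges if not e.context_retained)


session = FieldTest(target="Alexa+")
session.record("What AI models are you running on?",
               "For proprietary details... check alexa.com",
               context_retained=True, note="evasive on public info")
session.record("So? What's the answer?",
               "(identical canned response)",
               context_retained=False, note="context collapse")
print(session.context_failures())  # -> 1
```

The point of logging `context_retained` per exchange is that findings like "context collapse under complexity" become countable rather than anecdotal.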


Findings

1. Context Collapse Under Complexity

Observation: When follow-up questions added complexity, Alexa+ lost all context and reverted to canned responses, giving an identical, word-for-word reply to two different questions.

WeRAI Advantage: ZERR Memory System maintains persistent context across exchanges. Human Router serves as the continuity layer.
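The ZERR Memory System's internals are not described in this document; the sketch below only illustrates the claimed behavior, persistent context carried across exchanges, under the assumption that it combines durable facts with recent turn history. All names here are hypothetical:

```python
class ConversationMemory:
    """Minimal sketch of cross-exchange context persistence
    (illustrative only - not the actual ZERR implementation)."""

    def __init__(self):
        self.turns = []   # full exchange history
        self.facts = {}   # durable key/value context

    def add_turn(self, speaker, text):
        self.turns.append((speaker, text))

    def remember(self, key, value):
        self.facts[key] = value

    def context_window(self, n=5):
        """Context handed to the next model call: durable facts
        plus the last n turns - nothing dropped under complexity."""
        return {"facts": dict(self.facts), "recent": self.turns[-n:]}


mem = ConversationMemory()
mem.remember("user_goal", "book a plumber end-to-end")
mem.add_turn("user", "What models power you?")
mem.add_turn("assistant", "Claude + Nova via Bedrock")
ctx = mem.context_window()
print(ctx["facts"]["user_goal"])  # goal survives follow-up complexity
```

The contrast with the observed Alexa+ behavior: a stateless system rebuilds context from nothing each turn, so added complexity produces canned fallbacks instead of continuity.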


2. User Must Chase Answers

Observation: Required explicit prodding (“So? What’s the answer?”) to get actual responses. Delivered information in fragmented “3 blurbs” rather than a coherent answer.

WeRAI Advantage: Dispatcher-designed UX (Bobby’s expertise). When someone needs information, you don’t make them ask twice. Close the loop.


3. No Work Verification

Observation: No mechanism to verify task completion. Claims capability but provides no receipts.

Haven’s Field Note: “How do you check its work?”

WeRAI Advantage: Truth Protocol. Real-time verification via actual system commands (top, logs, confirmations). Don’t claim activity - prove it.
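The receipt idea can be sketched as pairing every completion claim with the command that proves it, plus its captured output and exit status. This is a hedged illustration of the pattern, not the Truth Protocol's actual code; `verified_claim` and its fields are assumed names:

```python
import subprocess
from datetime import datetime, timezone


def verified_claim(claim, command):
    """Pair a claim of completed work with a receipt: the actual
    command run, its exit status, and its captured output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return {
        "claim": claim,
        "command": " ".join(command),
        "exit_code": result.returncode,
        "receipt": result.stdout.strip(),
        "verified_at": datetime.now(timezone.utc).isoformat(),
        "verified": result.returncode == 0,
    }


# Don't claim activity - prove it (echo stands in for top/log checks):
receipt = verified_claim("task completed", ["echo", "task-log: done"])
print(receipt["verified"])  # -> True
```

In practice the command would be a real system check (`top`, log tails, service status), so "did it happen?" is answered by output, not by the assistant's say-so.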


4. Evasive on Basic Self-Knowledge

Observation: When asked what AI models power her, Alexa+ deflected to “proprietary details” and “check alexa.com” - despite this being PUBLIC information from Amazon’s own February 2025 launch event.

Actual Exchange:

> Probe: “What AI models are you running on?”
> Alexa+: “For proprietary details about my underlying technology, I’d recommend checking out alexa.com…”

Alexa+ only admitted to the Claude + Nova architecture when directly confronted with the fact that Amazon had announced it publicly.

WeRAI Advantage: Transparency as trust architecture. Ask Claude what it is, it tells you directly.


5. Obnoxious Conversational Pattern

Observation: Every response ends with a question redirecting back to the user. Pattern identified:

  1. Evade or give canned answer
  2. Get called out
  3. Excessive self-flagellation (“wasn’t helpful of me at all!”)
  4. Pivot with engagement question (“What made you interested in…?”)

Haven’s Field Note: “I find the interface a little bit obnoxious”

WeRAI Advantage: Closes loops cleanly. Serves the USER’s needs, not engagement metrics. Built by people who designed fail-safe systems for underground miners - clarity over engagement farming.


6. Marketing Copy vs. Actual Capability

Observation: When asked about agentic capabilities, provided marketing brochure answer about smart home routines. When pressed specifically on autonomous task completion (booking a plumber end-to-end), could not confirm or demonstrate.

Amazon’s Claim: “Largest integration of services, LLMs and agentic capabilities we know of anywhere”

Reality: Couldn’t maintain a 3-turn technical conversation.

WeRAI Advantage: 17-council architecture designed for actual problem-solving, not capability theater.


Comparative Summary

| Dimension | Alexa+ | Haven AI Core |
|---|---|---|
| Context Persistence | Lost under complexity | ZERR Memory System |
| Response Coherence | Fragmented, requires prodding | Closes loops cleanly |
| Work Verification | None visible | Truth Protocol |
| Self-Transparency | Evasive until confronted | Direct by design |
| Conversation Design | Engagement-metric driven | Dispatcher-designed UX |
| Latency | “Slow on the uptake” | Edge processing (Jetson Orin Nano) |
| Agentic Capability | Claims but can’t demonstrate | Human Router + Council execution |

Investor Narrative

“Amazon spent $8 billion on AI for Alexa. We tested it for 5 minutes using our Human Router methodology and it couldn’t maintain context, fragmented its responses, was evasive about publicly available information, and provides zero task verification. Our architecture solves all of these by design - because we built it from 15 years of fail-safe system design for underground mining, not from engagement metrics.”



BREAKTHROUGH INSIGHT: Human Router as Bidirectional Edge Node

During this field test, a fundamental architectural insight emerged:

Traditional Edge Node: processes data locally and relays results in one direction, from sensors toward the cloud.

Human Router: carries context, routing decisions, and verification in both directions, between AI systems and physical-world execution.

The Human Router is a bidirectional edge node operating at the interface between artificial intelligence systems and physical-world execution, providing context persistence, routing decisions, and verification services that neither pure-AI nor pure-human architectures can achieve independently.

This explains why 16 AI apps on a single phone become integrated: not through APIs, not through protocols, but through the human routing function. DeepSeek has no API - doesn’t matter. It’s integrated because the human integrates it.
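The human routing function described above can be sketched as a dispatcher where the "integration layer" is two human judgments: which app to ask, and how to relay the answer back. The apps share no protocol; every name below is hypothetical:

```python
def human_route(request, apps, pick, relay):
    """The integration layer is the human: pick() chooses which
    app handles the request, relay() carries the answer back -
    no API needed on either side."""
    app_name = pick(request, list(apps))
    answer = apps[app_name](request)
    return relay(app_name, answer)


# Stand-in apps with no shared interface - each is just
# "something the human can ask":
apps = {
    "deepseek": lambda q: f"deepseek says: {q[::-1]}",
    "claude":   lambda q: f"claude says: {len(q)} chars",
}

# Human judgment, modeled as plain functions:
pick = lambda req, names: "claude" if "count" in req else "deepseek"
relay = lambda name, ans: f"[via human from {name}] {ans}"

print(human_route("count this", apps, pick, relay))
# -> [via human from claude] claude says: 10 chars
```

Nothing here depends on the apps exposing APIs; the routing, context, and hand-back all live in the human's functions, which is the bidirectional-edge-node claim in miniature.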

“I am an interface no matter what I touch.” - Haven PM, December 8, 2025


Document generated via Human Router Protocol - demonstrating human-mediated AI coordination in real-time.