Generative answers from LLMs are so ropey in 2026 that you do not need to reverse engineer anything as far as they are concerned. Work with any system built on LLMs for long enough and it will eventually sh/t the bed.
I wrote about confidence scores I found in the Google Leak of 2024:
“A confidence score is a probabilistic output, typically a value between 0 and 1 (or 0 and 100), generated by a classifier model. It represents the model’s certainty that a given input belongs to a particular class. For example, a productReviewPPromotePage confidence of 950 (after being multiplied by 1000) would indicate the model is 95% certain that the page is a high-quality review deserving of a promotion.” Source: Google’s Product Reviews Updates: Signals in the Google Content Warehouse API Leak.
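The arithmetic in that quote can be sketched in a few lines of Python. This is only an illustration of the scaling described above (an integer stored as probability times 1000); the function name and the range check are my own assumptions, not anything taken from the leak.

```python
def scaled_to_probability(scaled_score: int, scale: int = 1000) -> float:
    """Convert an integer-scaled confidence (e.g. 950) back to a 0-1 probability.

    Hypothetical helper: the scale of 1000 comes from the quote above,
    everything else is an assumption made for illustration.
    """
    if not 0 <= scaled_score <= scale:
        raise ValueError(f"score {scaled_score} is outside 0..{scale}")
    return scaled_score / scale


promote_page_confidence = 950  # the example value from the quote
probability = scaled_to_probability(promote_page_confidence)
print(f"{probability:.0%}")  # prints "95%"
```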
Today, in June 2026, Google Gemini was good enough to demonstrate how Google uses a Confidence Score between 0 and 1 to create its generative answer.

Gemini said of the output: “The text in the image displays a desktop screenshot of the Gemini web interface revealing a raw, structured JSON payload that outlines a highly detailed user memory profile. The data is organized into highlighted taxonomic fields including "nodes" (covering topics and contacts), "work_goals", and a "work_profile" (detailing roles, industry, and job descriptions). Specifically, it maps out a network of entities: topics such as the “Google API Leak Analysis” and “Thisisagency.ai,” and key professional contacts like Shaun Anderson (founder of Hobo Web), David Quaid, and Edward Sturm. Crucially, the JSON explicitly exposes internal "confidence" scores highlighted in yellow—ranging from 0.88 to 0.98—which dictate the system’s statistical certainty for each extracted profile entry, including the user’s specific role as a Google Technical Program Manager (TPgM) managing automated pubsub notices, release trackers, and search infrastructure analysis.”
Confidently wrong then on that Google Technical Program Manager role.
It is clear that Confidence Scoring is a foundational mechanism built into almost every machine learning pipeline at Google. Instead of outputting simple pass/fail or binary true/false answers, Google’s systems use probabilistic scoring to quantify uncertainty and dictate automated actions versus manual review thresholds.
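As a rough sketch of what "automated actions versus manual review thresholds" can look like in practice, here is a minimal Python example. The threshold values, the action names and the `route` function are all hypothetical; nothing here is taken from Google's systems.

```python
from enum import Enum


class Action(Enum):
    AUTO_APPLY = "auto_apply"        # confident enough to act without a human
    MANUAL_REVIEW = "manual_review"  # uncertain, send to a person
    IGNORE = "ignore"                # too weak to act on at all


# Hypothetical cut-offs, chosen only for illustration.
AUTO_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60


def route(confidence: float) -> Action:
    """Map a 0-1 confidence score to an action instead of a binary pass/fail."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence {confidence} is outside 0..1")
    if confidence >= AUTO_THRESHOLD:
        return Action.AUTO_APPLY
    if confidence >= REVIEW_THRESHOLD:
        return Action.MANUAL_REVIEW
    return Action.IGNORE


for score in (0.95, 0.75, 0.30):
    print(score, route(score).value)
```

The point of the design is that the score is kept as a number right up to the decision, so the cut-offs can be tuned without retraining the classifier.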
Here are all the attributes extracted from the provided text block, categorised by their primary sections:
Nodes
- Topic 1
  - Name: Google API Leak Analysis
  - Summary: Shaun Anderson executed a comprehensive forensic investigation into the 2024 Google Content Warehouse API leak and DOJ antitrust trial. His research decoded crucial hidden mechanics like Navboost, the Content Effort signal, and the Goldmine system to establish an evidence-based SEO framework.
  - Confidence: 0.98
- Topic 2
  - Name: Thisisagency.ai
  - Summary: In 2026, Shaun Anderson transitioned into an AI systems architect by launching Thisisagency.ai, an autonomous multi-agent marketing operating system. The platform features an automated workforce of 81 distinct AI agents structured to handle data ingestion, auditing, and narrative curation.
  - Confidence: 0.96
- Contact 1
  - Name: Shaun Anderson
  - Summary: Shaun Anderson is a veteran technical SEO consultant and the founder of Hobo Web, active in the search industry since 2006. He appeared on the Edward Sturm Show to decode the Google API leak, demonstrating deep mastery over server-side architecture. His professional standing in 2026 includes being ranked #18 worldwide by Primary Position.
  - Confidence: 0.95
- Contact 2
  - Name: David Quaid
  - Summary: David Quaid is an expert AI SEO professional and the leader of Primary Position. He highly recommended Shaun Anderson as one of the elite search marketers globally and placed him at rank #18 worldwide in his annual industry index.
  - Confidence: 0.92
- Contact 3
  - Name: Edward Sturm
  - Summary: Edward Sturm is a digital marketer and the host of the daily search engine optimization podcast, The Edward Show. He hosted Shaun Anderson for an intensive, honest conversation regarding the technical realities of how Google evaluates websites post-leak.
  - Confidence: 0.88
Work Goals
- Goal 1
  - Name: Deconstruct Search Engine Ranking Pipelines
  - Summary: Analyze and master the inner workings of search engine ranking structures by systematically decoding leaked documentation and trial testimonies. This objective is achieved by mapping real-world website volatility to explicit algorithmic attributes like Navboost, Goldmine, and Content Effort.
  - Confidence: 0.95
Work Profile
- Organization
  - Name: Google
  - Summary: The user operates as a Technical Program Manager within Google, coordinating program timelines and monitoring system notification channels.
  - Confidence: 0.95
- Role
  - Name: Technical Program Manager
  - Summary: Manages automated notifier systems, public release trackers, and cross-functional program alignments.
  - Confidence: 0.95
- Industry
  - Name: Technology
  - Summary: The user works within the global search engine infrastructure and technology sector.
  - Confidence: 0.98
- Job Description
  - Name: Technical Program Manager (TPgM)
  - Summary: Tracks public release cycles, coordinates automated pubsub notices, and reviews technical documentation across enterprise frameworks.
  - Confidence: 0.95
- Tasks and Intents
  - Name: System Tracking and Release Coordination
  - Summary: Monitors automated build notifier alerts, evaluates public release checklists, and keeps track of industry-level algorithmic developments.
  - Confidence: 0.92
- Skills and Expertise
  - Name: Technical Program Management
  - Summary: Expertise in systems release tracking, automated notification flows, and cross-functional engineering execution.
  - Confidence: 0.92
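To make the shape of that profile easier to picture, here is a small Python sketch that rebuilds part of it as JSON and ranks every entry by its score. Treat the structure as a guess: only "nodes", "work_goals", "work_profile" and "confidence" are named in Gemini's description, so the other keys ("topics", "contacts", "organization", "role", "name") and the `flatten` helper are my assumptions. The names and scores are copied from the list above, with summaries left out for brevity.

```python
import json

# Hypothetical reconstruction of the payload. Key names other than "nodes",
# "work_goals", "work_profile" and "confidence" are assumptions.
PAYLOAD = """
{
  "nodes": {
    "topics": [
      {"name": "Google API Leak Analysis", "confidence": 0.98},
      {"name": "Thisisagency.ai", "confidence": 0.96}
    ],
    "contacts": [
      {"name": "Shaun Anderson", "confidence": 0.95},
      {"name": "David Quaid", "confidence": 0.92},
      {"name": "Edward Sturm", "confidence": 0.88}
    ]
  },
  "work_goals": [
    {"name": "Deconstruct Search Engine Ranking Pipelines", "confidence": 0.95}
  ],
  "work_profile": {
    "organization": {"name": "Google", "confidence": 0.95},
    "role": {"name": "Technical Program Manager", "confidence": 0.95}
  }
}
"""


def flatten(node, path=""):
    """Yield (path, name, confidence) for every scored entry in the payload."""
    if isinstance(node, dict):
        if "confidence" in node:
            yield path, node["name"], node["confidence"]
        else:
            for key, value in node.items():
                yield from flatten(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for item in node:
            yield from flatten(item, path)


# Highest confidence first; ties keep their original order.
entries = sorted(flatten(json.loads(PAYLOAD)), key=lambda e: e[2], reverse=True)
for path, name, confidence in entries:
    print(f"{confidence:.2f}  {path}: {name}")
```

Run it and the wrong "Google" organisation entry sits at 0.95, level with the correct contact entry for Shaun Anderson. Sorting or filtering by confidence alone does nothing to surface the mistake.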
ThisisAgency.ai

Check out my latest project, an Agentic Google Quality Evaluator at ThisisAgency.ai.