Synthetic Validation Recursion
Concerns & Risks
Provisional
Concern that organizations are validating AI output with other AI (synthetic personas, synthetic user testing, AI-run QA). Because there is no human-grounded reference point anywhere in the loop, this compounds the blind spots in upstream models rather than catching them.
Evidence
“Being ground up actually works really well with AI, because what it allows you to do is think, okay, well I'm going to create the end-result page, let's say, and then I can evolve toward the warehouse and the functionality that gets the user to some version of that page. In the process of doing so, you're actually creating data for whatever you're creating. His company says, "We're doing this by like June 1st." I was like, "Wow, that's really ambitious." And he's like, "Yeah." I was like, "Are you concerned, you know, not for your job, but about the quality of the output?" He says, "Yeah, of course we're concerned." But the interesting thing was, it's like, "We're using synthetic personas to validate it. We're doing synthetic user testing in order to test it. We're running QA programs against that." I'm like, "Well, you're doing it all within an AI."”
“"Isn't there a concern of being completely disassociated from the actual reality of the user, and the fact that AI is checking its own work? Any blind spots that are built into, let's say, the synthetic personas, are going to persist throughout that process." He says, "It's a very real problem and we're aware of it, and we're going to try to tune it as we go through, but the theoretical cost savings, and time savings more than the cost savings, give us that 80/20 rule." I was like, "Well, good luck." It made me think that we're one big screw-up away from having a big expensive reality check in terms of what people are outputting, whether that's going to be security related, or user experience related, or transactional related in some way. So that was one perspective. In terms of what we're hearing from our clients, much more cautious in terms of how they're implementing it.”