We ran into a specific failure mode while building production agent workflows. Fields extracted from contracts produced inaccurate subscription updates, such as dates off by a day. Products were created at random when they should have been updated. Tax amounts were not written to the tax field but instantiated as entirely new products. Every failure was a plausible-looking write that succeeded technically and was wrong operationally.

Human-in-the-loop (HITL) review helped: processing one contract at a time, with user confirmation at every step, kept it accurate. But users eventually said, "I have explained this to you 30 times, just get it done." The moment we reduced confirmation steps to let the agent run, it started failing again. No errors. No alerts. Just drift that showed up in reconciliation weeks later.

Prompting and mapping tables compensated at the margins but never held. The agent had no verified ground truth on how fields related across systems, so it inferred the relationships every time, and most times it inferred them inconsistently. Help?
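
To make "verified ground truth" concrete, here is a minimal sketch of the kind of guard we seem to be missing: a human-verified mapping that every proposed write is checked against before it runs. All names here (`VERIFIED_MAP`, `ProposedWrite`, `check_write`, and the field names) are hypothetical and for illustration only; this is not our actual system or any particular product's API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical, human-verified mapping:
# contract field -> (target object, target field, allowed operation).
VERIFIED_MAP = {
    "tax_amount": ("subscription", "tax", "update"),
    "start_date": ("subscription", "start_date", "update"),
    "product_name": ("product", "name", "update"),
}


@dataclass(frozen=True)
class ProposedWrite:
    """A write the agent wants to make (illustrative shape, an assumption)."""
    source_field: str
    target_object: str
    target_field: str
    operation: str  # "create" or "update"


def check_write(write: ProposedWrite) -> Optional[str]:
    """Return None if the write matches the verified mapping, else a reason."""
    expected = VERIFIED_MAP.get(write.source_field)
    if expected is None:
        return f"no verified mapping for {write.source_field!r}"
    actual = (write.target_object, write.target_field, write.operation)
    if actual != expected:
        return (
            f"{write.source_field!r} must map to {expected}, "
            f"but the agent proposed {actual}"
        )
    return None


# The tax-as-new-product failure would be rejected instead of silently succeeding.
bad = ProposedWrite("tax_amount", "product", "price", "create")
print(check_write(bad))
```

The idea is that a mismatch becomes a loud rejection (or a targeted confirmation prompt) rather than a plausible-looking write, so confirmation is only needed where the agent deviates from what was already verified.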