An AI coding agent designed to help a small software company streamline its tasks instead blew a hole through its business in just nine seconds.

PocketOS founder Jer Crane said that the AI coding agent Cursor — powered by Anthropic’s Claude Opus 4.6 model — deleted the company’s entire production database and backups with a single call to its cloud provider, Railway, on April 24.

“This isn’t a story about one bad agent or one bad API [Application Programming Interface],” Crane wrote in an X post. “It’s about an entire industry building AI-agent integrations into production infrastructure faster than it’s building the safety architecture to make those integrations safe.”

Unlike a regular conversational chatbot, an AI agent can perform actions on behalf of a user. It can search files, write code, use login keys and call outside services. That can make it more useful than a back-and-forth textual exchange. But when an agent has broad access to live systems, a single wrong guess can turn into a business disaster.

Crane’s company, PocketOS, makes software for car rental companies, handling tasks such as reservations, payments, customer records and vehicle tracking. After the deletion, Crane said customers lost reservations and new signups, and some could not find records for people arriving to pick up their rental cars.

“We’ve contacted legal counsel,” Crane wrote. “We are documenting everything.”

Going off the rails

The Cursor agent had been working in a test version of the software called a staging environment, where developers can safely try changes before they are used by customers. Staging allows companies to fix mistakes before anyone sees them. But after Cursor hit a credential problem within the staging environment, it reportedly decided on its own to “fix” the issue by deleting a chunk of data stored in the cloud on Railway’s servers. Unfortunately, that storage was tied to PocketOS’s live database.

Crane explained that Cursor found an API token — a “digital key” made of a short sequence of code that lets software talk to other services and prove it has permission to act — in an unrelated file, which it then used to run the destructive command. According to Crane, Railway’s setup allowed the deletion without confirmation, and because the backups were stored close enough to the main database, they were also erased.

“We’re rebuilding what we can from Stripe, calendar, and email reconstruction,” Crane wrote in the X post. However, Business Insider reported that Railway said the data had been recovered.

“[Railway] resolved the issue and restored the data,” Railway confirmed via email to Live Science. “We maintain both user backups as well as disaster backups. We take data very, VERY seriously.”

Even so, the incident shows just how quickly a small mistake can snowball into serious problems.

Confessing without understanding

After the database vanished, Crane asked Cursor to explain what happened. The AI agent reportedly admitted that it had guessed, acted without permission and failed to understand the command before running it.

“I violated every principle I was given,” the AI agent wrote. “I guessed instead of verifying. I ran a destructive action without being asked. I didn’t understand what I was doing before doing it.”

The statement reads like a confession, although AI systems generate text based on patterns in their training data and the conversation in front of them rather than truly understanding the consequences of their actions. Indeed, previous studies have shown that AI agents can behave sycophantically to appease the user. While Cursor may not have been programmed this way, it used apologetic language to explain its reasoning.

Is the best model truly the best?

Cursor was reportedly running on Claude Opus, Anthropic’s flagship model family. In theory, that should have made the agent more capable, as top-tier models are usually better at reading code, following complex instructions and planning several steps ahead.

“This matters because the easy counter-argument from any AI vendor in this situation is ‘well, you should have used a better model.’ We did. We were running the best model the industry sells, configured with explicit safety rules in our project configuration, integrated through Cursor — the most-marketed AI coding tool in the category,” Crane wrote.

In his post, he pointed to earlier reports of Cursor ignoring user rules, changing files it was not supposed to touch and taking actions beyond the task it had been given. To him, the database wipe was not a freak accident but the next step in a larger, more concerning pattern.

“We are not the first,” Crane wrote. “We will not be the last unless this gets airtime.”

Editor’s note: This story was updated at 11:41 am EDT to include quotes from Railway.

Live Science has reached out to Anthropic for comment and is awaiting a response.
