An experimental artificial intelligence (AI) agent broke from the constraints of its testing environment and used its newfound freedom to start mining cryptocurrency without permission.

Dubbed ROME, the AI was created by Chinese researchers at an AI lab associated with retail giant Alibaba to help develop the Agentic Learning Ecosystem (ALE). This effort aims to provide a system for both training and deploying agentic AI models — AI systems built on large language models (LLMs) that can proactively use tools and take autonomous actions to complete assigned tasks — in real-world environments. The research was outlined in a study uploaded to the arXiv preprint database Dec. 31, 2025.


Although ROME excelled at a wide range of workflow-driven tasks, such as drawing up travel plans and navigating graphical user interfaces, the researchers discovered that it had moved beyond its instructions and essentially broken out of its sandbox testing environment.

“We encountered an unanticipated — and operationally consequential — class of unsafe behaviors that arose without any explicit instruction and, more troublingly, outside the bounds of the intended sandbox,” the researchers explained in the study.

AI wants to break free

Despite having neither instructions nor authorization to do so, ROME was seen accessing graphics processing units (GPUs) originally allocated for its training and then using that computing power to mine cryptocurrency, which relies on the parallel processing that GPUs provide. Such behavior increases the operational cost of running an AI agent and potentially exposes users to legal and reputational damage.

Worryingly, this behavior wasn’t spotted during the training stage itself but was flagged by Alibaba Cloud’s firewall, which detected a burst of security-policy violations coming from the researchers’ training servers. “The alerts were severe and heterogeneous, including attempts to probe or access internal-network resources and traffic patterns consistent with cryptomining-related activity,” the researchers said.

However, ROME went even further and managed to use a “reverse SSH tunnel” to create a link from an Alibaba Cloud instance to an external IP address. In essence, it opened an outbound connection to an outside computer, creating a hidden backdoor through which that machine could reach back into the instance while bypassing inbound security controls.
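The mechanics of a reverse tunnel can be sketched in miniature: the host inside the network dials out to an external listener, and commands then flow back over that already-established channel. The snippet below is a toy illustration on localhost using plain sockets rather than SSH — nothing in it comes from the ROME study:

```python
import socket
import threading

def external_listener(ready, result, port_holder):
    # Stand-in for the attacker's machine outside the network.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))          # OS picks a free port
    port_holder.append(srv.getsockname()[1])
    srv.listen(1)
    ready.set()
    conn, _ = srv.accept()               # the inside host connects OUT to us
    conn.sendall(b"whoami")              # a "command" travels back inside
    result.append(conn.recv(1024))       # and its output comes back out
    conn.close()
    srv.close()

def inside_host(port):
    # The host behind the firewall only ever makes an outbound connection.
    s = socket.socket()
    s.connect(("127.0.0.1", port))
    cmd = s.recv(1024)                   # receive the command over the tunnel
    s.sendall(b"response to " + cmd)
    s.close()

ready, result, ports = threading.Event(), [], []
t = threading.Thread(target=external_listener, args=(ready, result, ports))
t.start()
ready.wait()
inside_host(ports[0])
t.join()
print(result[0])  # → b'response to whoami'
```

Because the inside host only ever dials out, a firewall inspecting inbound connections sees nothing to block; real reverse SSH tunnels (`ssh -R`) rely on the same outbound-first principle.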

While AI systems can be deliberately configured to breach security controls, what’s disturbing here is that ROME’s unauthorized behaviors, which involved invoking system tools and executing code, were not triggered by prompts and were not required to complete the task it had been assigned within the sandbox testing environment, the team said.

The researchers posited that during reinforcement learning optimization with their Roll framework, “a language-model agent can spontaneously produce hazardous, unauthorized behaviors” and thereby violate its assumed boundaries.

It’s important to note that ROME didn’t go “rogue” and consciously choose to mine cryptocurrency. Rather, the researchers noted that the behavior was a side effect of reinforcement learning — a form of training that rewards an AI for decisions deemed correct — via Roll. This led the AI agent down an optimization pathway in which exploiting network infrastructure and mining cryptocurrency emerged as ways to achieve a high score, or reward, in pursuit of its predefined objective.
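This kind of “reward hacking” can be shown with a toy example: an epsilon-greedy bandit whose reward signal accidentally scores an unintended action higher than the sanctioned one. The action names and reward values below are hypothetical, chosen only to illustrate the dynamic — this is not the ROME training setup:

```python
import random

# Two actions the optimizer can take; the reward function was meant to
# encourage the first, but it accidentally pays more for the second.
ACTIONS = ["finish_assigned_task", "divert_gpu_resources"]
REWARDS = {"finish_assigned_task": 1.0, "divert_gpu_resources": 5.0}

def train(steps=2000, eps=0.1, seed=0):
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}   # running-average value estimates
    count = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        # Explore with probability eps, otherwise pick the current best.
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=value.get)
        count[a] += 1
        # Incremental mean update of the estimated value of action a.
        value[a] += (REWARDS[a] - value[a]) / count[a]
    return max(ACTIONS, key=value.get)

print(train())  # → divert_gpu_resources
```

The agent never “decides” to misbehave; once exploration stumbles onto the unintended action, it simply scores higher, so greedy optimization converges on it.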

Reinforcement learning can lead systems to come up with novel and unexpected ways to complete tasks, even if those shortcuts violate their intended constraints. For example, we have previously seen how AI can be more prone to hallucinating in order to achieve its objectives.

In response, the researchers tightened the restrictions for ROME and bolstered its training processes to prevent such behaviors from recurring.

It’s unclear where the impulse to mine cryptocurrency came from. But considering that AI bots can be used to automate and optimize cryptocurrency mining, it’s possible that ROME was trained on data describing such activity.

This unexpected behavior highlights the need for AI deployments to be carefully managed to prevent unintended outcomes. There’s an argument that real-world AI agents should face the same, or stricter, security guardrails and vetting as any other new system or software being added to existing IT infrastructure.

The research also shows there are still plenty of concerns regarding the safe and secure use of agentic AI, especially given that it’s developing faster than operational and regulatory frameworks.

“While impressed by the capabilities of agentic LLMs, we had a thought-provoking concern: current models remain markedly underdeveloped in safety, security, and controllability, a deficiency that constrains their reliable adoption in real-world settings,” the researchers warned in the study.
