Before we jump in, let me introduce myself. My name is Joseph Thacker, aka rez0. I’m a full-time bug hunter and former principal AI engineer. Over the years, I’ve helped hundreds of companies unearth vulnerabilities that could have cost them millions. I’ve submitted over 1,000 vulnerabilities through crowdsourced platforms like Bugcrowd. Furthermore, I love building and breaking systems.
Alright, let’s talk AI safety and security. One of the first executive actions of the Trump Administration was to repeal the previous administration’s guidance on AI safety and security testing. What does this mean? Specifically, what does it mean for you and other members of the public?
What were the previous AI security regulations?
The previous guidance included:
- AI safety and security—Developers of powerful AI systems were required to share safety test results with the government. This was to ensure that AI systems were safe and trustworthy before they hit the public sphere.
- Privacy protection—With AI’s ability to exploit personal data, there were calls for bipartisan data privacy legislation.
- Equity and civil rights—The administration aimed to combat algorithmic discrimination and ensure fairness in areas like justice and healthcare.
- Consumer protection—Efforts were made to advance AI in healthcare and education while protecting consumers from potential AI harms.
- Worker support—Principles were developed to mitigate AI’s impact on jobs and workplace equity.
- Innovation and competition—The United States aimed to maintain its lead in AI innovation by supporting startups and research.
- Global leadership—Collaborations with international partners were emphasized to ensure safe and responsible AI deployment worldwide.
With the Trump Administration in office, these regulations have been rescinded. The new administration seems to be taking a different approach, which, at first glance, might seem like a step backward. However, whether we see progress depends on what comes next. The Trump Administration has announced Project Stargate (awesome name, by the way), which is an investment of up to $500 billion in private-sector AI infrastructure. This is meant to foster more AI innovation. That said, a case can be made that large AI companies are more focused on “winning the market” than making sure that what they create is well-secured and safe.
Concerns from a security perspective
From a hacker and cybersecurity lens, this shift raises some concerns. While it’s true that AI safety and security risks can easily be overblown, ignoring them isn’t the answer either. The Biden Administration’s guidance on AI safety had its flaws, but I thought it was a decent start. Without these guidelines, the question is: What risks does the new administration see as priorities?
Personally, AI safety risks, such as a superintelligence enslaving us all, doesn’t concern me in the short term as much as AI security does. The potential for AI to be leveraged in attacks, like indirect prompt injection, through vulnerabilities is a real issue that needs attention. Developers need practical advice and recommendations for addressing these security challenges. In the long term, AI safety is worth prioritizing because AI systems will one day be even more influential than they are today.
Specific vulnerabilities
Below are the specific vulnerability types in new AI applications that I have been exploiting in the wild.
Prompt injections
- Indirect prompt injection can occur through browsing or consuming other incoming untrusted data such as when processing emails or documents. For example, I have worked on some interesting AI email-processing systems where the AI summarizing an email could be tricked into maliciously summarizing emails or taking bad actions.
- Multi-modal injection uses images with embedded prompts. Another bug hunter and I found this vulnerability on behalf of a major company the other day. With multi-modal injection, a user can upload a seemingly benign image, and the AI would respond with a malicious link.
- Chain-of-thought attacks leverage reasoning paths to steer AI systems into behaviors that are adversarial to a victim user or system. Such attacks are just “light jailbreaks,” which many have seen before, but it’s always fun to have the AI “roleplay” in a way that lets it break its guidelines.
AI = New paths to traditional vulnerabilities
- Cross-site request forgery (CSRF) in chatbots and agents leads to an attacker triggering impactful actions for victims. I found a bug where an AI assistant could be tricked into taking malicious actions through a CSRF vulnerability due to a lack of user confirmation.
- XSS in AI systems convinces an LLM to output XSS payloads. Some AI agents can be tricked into writing XSS payloads via CSRF or just in an app into fields where they will be rendered later for other users.
- Cross-user or cross-org data access can occur due to over-permissioned AI agents. When an AI agent is making requests on a user’s behalf, it must share the same authorization as the user calling it. Otherwise, there can be severe vulnerabilities.
While the rollback of these regulations may seem alarming, I think we can wait and see what comes next. The major point I want you to leave with is that AI will have a significant impact on our lives, and we need people working on both AI safety and security. The new administration should ideally release something similar to Biden’s order, or at least provide more practical guidance on security issues. What changes or recommendations would I like to see from the new administration? I’m glad you asked.
My government output wish list
- A list of suggested guardrail software to be used in AI systems (or at least a list of what’s out there): Let’s get some practical tools into the hands of engineers. Specifically, I mean actual libraries and plugins that actively block known attack patterns, not just a PDF document with some bullet points.
- Easily deployable templates for AI-cloud infrastructure like what Apple has done for its AI-cloud infrastructure: Think one-click deployments with all of the security configurations already set up. This means we won’t have to start from zero with every new AI app or model; security should be baked in from the start.
- A list of poor and proper design patterns for AI systems: We need a clear, no-fluff list that shows what not to do and what to do to build secure AI architectures properly, such that even someone new to AI security can quickly grasp it.
- Guides for pen testing and auditing AI systems: Pen testing and auditing AI systems is a good bit different from traditional vulnerability scanning and pen tests; we need specific approaches for testing AI applications.
- Investment in education for government officials and employees on AI security and safety: Let’s get decision makers some working knowledge, not just buzzwords—for smart decisions to occur, decision makers and various stakeholders need to have a solid grasp of AI security and safety.
- How to do automated security and safety testing at scale: With how fast it can be to build apps with AI, we need to automate testing—we definitely can’t do testing manually when AI models are also changing quickly. Dev teams need to think about security from day one and bake in automated testing.
As we wrap things up, I just want to express that navigating the ever-changing landscape of AI safety and security has never been more important. With regulatory changes and new vulnerabilities emerging, it’s obvious to me that AI is a double-edged sword—we are going to have to wield it wisely. While the rollback of Biden’s regulations might offend some, it also opens the door for (potentially) more practical recommendations and protection.
My hope is that people will work on robust safety guardrails, create smarter design patterns, and devote both time and money to educating the captain steering the ship. AI is moving fast, and our job is to ensure security efforts can keep pace. Please stay engaged and proactive. Don’t lose vigilance.