Uncover Urgent AI Risks that Threaten Your Business
Updated: Jun 11
Alec Crawford Founder & CEO of Artificial Intelligence Risk, Inc.
Artificial Intelligence and LLMs are so new that there are two completely different sets of risks: the risks that come from adopting AI and the risks that come from not adopting AI. Not using AI, or trying too late, will eventually lead to obsolescence and business failure for many companies. Using AI requires a significant risk management effort.
In this article, we suggest using a modified version of NASA’s “box” approach that looks at critical-high-medium-low risks for business that include not just the severity and likelihood, but also a new dimension, time. Most businesses adopting AI do not have the luxury of worrying about risks a decade or more down the road (i.e. super smart AI that takes over the world.) We believe companies should focus primarily on today’s AI risks that can also have a big impact on the company.
Identify Today’s Critical AI Risks First…
In our framework for GRCC for LLMs, we classify LLM-related risks into a four-tier system: critical, high, medium, and low. This stratification recognizes not only the severity of impact but also the temporal proximity of these risks. We discern between present challenges, such as data privacy breaches, imminent threats like regulatory non-compliance in the wake of evolving laws, and eventual risks tied to future capabilities or use cases, or something like AGI. Note that while we may talk about risks at different stages (e.g. building a training data set versus live customer interaction with a production model), for the purposes of risk tiering and timing, we place them without regard to when you need to analyze and mitigate them during the overall development and deployment process.
Trustworthy AI Is: 1. Ethical and Fair: AI systems must adhere to ethical standards and make unbiased decisions, respecting human rights and dignity. 2. Transparent and Explainable: Users should be able to understand AI decision-making processes and the reasoning behind outcomes. 3. Accountable and Responsible: Clear accountability for AI decisions must be established, with mechanisms to address any negative impacts. 4. Private and Secure: AI must protect personal data, ensuring privacy and secure data management practices. 5. Safe and Beneficial: AI should reliably operate with safety protocols in place and contribute positively to society and the environment. |
Identifying and Mitigating Risks to Create Trustworthy AI
We divide risk into several categories with the overall goal of creating a trustworthy AI system. By trustworthy, we mean transparent, explainable, accurate, ethical, and one that can correct mistakes going forward (see the box). You need to get the following pieces right to risk manage AI: training data, filtering user inputs and model outputs, and overall model trustworthiness.
Here are some examples and you can dive deeper in our white paper on the topic by request:
Training data: Risks of large data sets, data poisoning, data bias, discrimination, confidential data, PII, copyrighted or restricted data
User inputs: Confidentiality, cybersecurity, manipulation, data access governance security, controversial topics, blocked/illegal topics
Model outputs: Confidential information, toxic language, evidence of hacking, lack of continuity, lack of complete answer, hallucination, controversial topics, biased answer, unethical answer, discrimination, malicious application potential (e.g. hacking information)
Model trustworthiness: Transparency, explainability, upholding ethical principles, learning from mistakes over time.
User Inputs: The Gatekeeping Dilemma
While most user prompts are legitimate, one must be vigilant about protecting confidential data and personally identifiable information offered up or requested by the user. In addition, as hacking and manipulation of LLMs grows, it will be critical to defend against these malicious actions. As these are new, unique risks, this demonstrates the need for a new category of software: AI platform governance, risk, compliance, and cybersecurity (AI GRCC).
As part of gatekeeping, one controls user access to use cases, models, data, and documents and filters the user prompt for hacking attempts, toxic language, and worse.
Ensuring Confidentiality: LLMs must respect the confidentiality of user inputs, which may contain sensitive information. AI GRCC software should be able to block or allow specific types of personally identifiable information (PII) based on the user and use case. It should also be able to detect information confidential to the user and block or allow it based on the use case. Of course, the default is to block information.
Cybersecurity, Hacking, and Jailbreaking: As gatekeepers of information, LLMs are targets for hacking and jailbreaking attempts. Once penetrated, LLMs or “Copilots” can identify critical information, such as client information or emails to the CEO, in moments. Robust cybersecurity measures, including intrusion detection systems and filters to detect jailbreaking, prompt injections, do anything now (DAN) style attacks, etc. are needed to fend off such risks.
Preventing Manipulation: The malleability of AI in the face of user inputs is a potential vulnerability. To prevent manipulation, LLMs must incorporate checks to disallow inputs that aim to coax incorrect, toxic, biased, discriminatory, or unethical outputs.
Governing Data Access: Effective data governance ensures that users and LLMs only access permissible data sources, safeguarding against unauthorized dissemination of restricted content.
Tackling Controversial and Blocked Topics: LLMs may be prompted to discuss controversial or illegal topics. A finely tuned filtering system must be in place to detect and block inappropriate content proactively before it gets to the LLM, ensuring compliance with societal norms and laws. This goes beyond keyword searches and requires third-party AI detection tools as part of AI GRCC.
Exhibit: AI Requires New Governance, Risk Management, Compliance, and Cybersecurity
Model Outputs: Ensuring Integrity and Ethical Standards
As much as we must be concerned about user prompts, we must also closely examine model completions. Many third-party models include a “policy layer” that blocks what the model provider considers problem completions. This protection is a small part of what a corporate AI platform requires. In addition, any retraining of a model or introduction of your own or third-party documents and data requires enhanced screening of the output for problems, such as unauthorized release of confidential information. Failure to detect and stop inappropriate model outputs could result in a public relations disaster, up to and including a major data breach if a nefarious actor gains access to your AI system.
Guarding Confidential Information: Permissions for accessing confidential information should be in place for model outputs and inputs. Sophisticated data loss prevention tools and content filters help prevent the model from inadvertently disclosing sensitive information.
Contending with Toxic Language and Unethical Outputs: LLMs may generate toxic or unethical content on their own. A system for continuous monitoring and post-processing filtering are critical to sanitize outputs, aligning them with ethical standards and promoting positive user experiences. Software should allow blocking of controversial topics or other inappropriate material, such as pornography.
Addressing Hallucinations and Continuity Errors: LLMs are prone to “hallucinate” misinformation, provide incorrect answers or provide outputs lacking continuity. Providing relevant data to the model through research augmented generation (RAG) is the first step in reducing hallucinations. Employing screeners such as context validation can help, and corrective feedback mechanisms can significantly reduce such occurrences.
Mitigating the Risks of Malicious Applications: The potential for LLMs to be used in crafting sophisticated phishing attacks or other malicious or illegal activities is alarming. Rigorous output vetting procedures alongside ethical testing datasets can help curb such abuses.
Exhibit: Filter AI Model Outputs with Your Customized GRCC Platform to Avoid Problems
Conclusions
We have shown just a handful of examples of risks from using AI. There are obviously many different risks based on exactly what you use AI for, including some idiosyncratic risks based on your industry and business. Identifying the critical AI risks for today and focusing on those will help avoid process failure and headline or PR risk in the near future.
Copyright © 2024 by Artificial Intelligence Risk, Inc. All rights reserved
This paper may not be copied or redistributed without the express written permission of the authors.
Kommentare