My Reflections on OpenAI DevDay 2023: Security of New Features
Published 11/16/2023
Written by Ken Huang, CEO of DistributedApps.ai and VP of Research at CSA GCR.
Image generated by OpenAI's DALL·E 3
1: Introduction
On November 6th, 2023, I had the opportunity to attend the inaugural OpenAI Developer Day. This event was a significant gathering, unveiling a variety of new, developer-centric features that garnered widespread attention and excitement. My experience, deeply rooted in cybersecurity, prompted me to look beyond the immediate allure of these innovations, focusing on the potential security risks they might introduce if not judiciously managed. In this reflection, I will examine the security implications of the capabilities demonstrated at the Developer Day, while also offering strategies for mitigating risks associated with the use of OpenAI's latest tools and APIs.
The Developer Day highlighted several key innovations:
1. Text-to-Speech API: This new API facilitates the creation of lifelike speech from text. It features six preset voices and two model variants, tts-1 and tts-1-hd, catering to a variety of applications, from real-time to high-quality needs. This development significantly enhances the ease and flexibility of AI-generated speech.
2. GPT-4 Turbo: A standout announcement was GPT-4 Turbo, an advanced version of the GPT-4 model with a 128K token context window. This expansion in context window size allows for a deeper and more nuanced understanding and generation of responses, potentially transforming AI interactions with large datasets. Additionally, its more accessible pricing makes it available to a broader range of developers and applications.
3. Customizable ChatGPT (GPTs): This innovative feature enables the customization of ChatGPT, allowing for the creation of specialized versions tailored to specific tasks. It democratizes the use of AI, enabling personalized AI capabilities to suit individual needs and preferences. From learning board games to educational applications, GPTs showcase impressive adaptability.
4. New APIs and Enhancements: The Developer Day also spotlighted the launch of new models and developer products like the Assistants API, GPT-4 Turbo with Vision, and the DALL·E 3 API. These products highlight OpenAI's dedication to expanding the scope and utility of AI technologies, further integrating AI into a wide array of technological domains and everyday life.
While these advancements mark remarkable progress in AI technology, they also introduce new security concerns. Without proper safeguards, they could lead to issues such as data privacy breaches, unauthorized access, and misuse. In the subsequent sections of this reflection, I will delve into the specific security risks associated with these new features and propose strategies to mitigate these risks. It's imperative to balance the excitement for these AI advancements with a keen awareness of their security aspects, ensuring a safe, responsible, and ethical approach to AI development.
I will first give an overview of the security implications of these new features. Then, I will narrow my focus to the Customizable ChatGPT, Assistants API, and GPT-4V. These tools or features, due to their potential for widespread use among AI engineers, warrant a detailed examination of their security risks and the measures that can be taken to mitigate these risks. The aim is to strike a balance between embracing the benefits of these AI advancements and maintaining robust security measures to safeguard against potential threats and misuse. This approach is crucial for fostering a secure and ethical AI development landscape.
2: Security Risks Overview
The new features announced at OpenAI's DevDay 2023, such as the Text-to-Speech API, GPT-4 Turbo, Customizable ChatGPT (GPTs), and new APIs like the Assistants API and DALL·E 3 API, while very powerful and impressive, bring with them a range of potential security risks. Here's a breakdown of these risks and possible controls to mitigate them:
1. Text-to-Speech API
Security Risks: The major risks include misuse for creating deepfake audio, phishing scams using synthesized voices, and unauthorized access leading to misuse of the API.
Mitigation Measures: Implementing robust authentication and authorization controls, using rate limiting to prevent abuse, and watermarking audio outputs to trace misuse are effective strategies. Additionally, monitoring and logging usage patterns can help in early detection of abnormal activities.
2. GPT-4 Turbo
Security Risks: Given its enhanced context window, there's a higher risk of data leakage, especially if sensitive information is processed. Moreover, the model's advanced capabilities could be exploited for generating sophisticated phishing content or misinformation.
Mitigation Measures: Data encryption both in transit and at rest, along with strict access control policies, are crucial. Regular audits of the input and output data can help in identifying potential data leakage. Also, implementing filters to prevent the generation of harmful content is important. More on this in Section 5 of this blog.
3. Customizable ChatGPT (GPTs)
Security Risks: Custom GPTs might be programmed with malicious intent or use sensitive data, leading to privacy breaches. There's also the risk of intellectual property theft if proprietary information is used in customization.
Mitigation Measures: Setting guidelines and restrictions on the type of data used by custom GPTs, along with continuous monitoring of GPTs’ outputs, is essential. Implementing data anonymization techniques where possible can also protect user privacy. We will discuss more about this in Section 3.
4. New APIs (Assistants API, GPT-4 Turbo with Vision, DALL·E 3 API)
Security Risks: These APIs, especially those involving visual elements, may be susceptible to generating inappropriate or sensitive content. Additionally, they could be used to create deepfakes or manipulated images/videos.
Mitigation Measures: Content moderation and filtering mechanisms are vital to prevent the generation of harmful or sensitive content, and digital watermarks can help trace the origin of AI-generated images and videos. Ensuring that APIs have robust access controls and are protected against common web vulnerabilities (such as SQL injection and XSS) is also important. Among these new APIs, the Assistants API is the most versatile and powerful: it supports agentic behavior and can invoke read/write/execute tools such as Knowledge Retrieval, file uploads, Code Interpreter, and Function Calling. We will discuss the Assistants API in more detail in Section 4; a brief sketch of the content-filtering control mentioned above follows this list.
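For the content-filtering controls mentioned above, one lightweight pattern is to pass both the user's input and the model's output through OpenAI's Moderation endpoint before anything reaches end users. The snippet below is only a minimal sketch, assuming the openai Python SDK (v1.x) and an OPENAI_API_KEY environment variable; it complements, rather than replaces, the encryption, access control, and audit measures discussed above.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def is_flagged(text: str) -> bool:
    # Ask the Moderation endpoint whether the text violates the content policy.
    result = client.moderations.create(input=text)
    return result.results[0].flagged

def respond_safely(prompt: str) -> str:
    # Screen the user input before it is sent to the model.
    if is_flagged(prompt):
        return "This request cannot be processed."
    completion = client.chat.completions.create(
        model="gpt-4-1106-preview",  # GPT-4 Turbo preview model announced at DevDay
        messages=[{"role": "user", "content": prompt}],
    )
    answer = completion.choices[0].message.content
    # Screen the generated output before returning it to the user.
    return "This response was withheld." if is_flagged(answer) else answer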
In summary, while these new AI features offer remarkable capabilities, they also introduce significant security challenges. Mitigating these risks requires a combination of technical controls, like encryption and access control, and procedural measures, such as usage monitoring and ethical guidelines. Regular security audits and staying informed about emerging threats and vulnerabilities are also key components of a robust security strategy in the context of these advanced AI technologies.
3: Security Risks and Mitigation Strategy for Custom GPTs
Custom GPTs and Actions in ChatGPT offer expansive capabilities for personalization and interaction with external systems. However, these features also introduce various security risks. Here, I'll outline the risks and potential mitigation strategies, including a code example for better understanding.
3.1 Custom GPTs Security Risks
Custom GPTs may expose the following security risks if not used carefully:
1. Data Privacy and Leakage: When Custom GPTs interact with personal or sensitive information, such as emails or databases, there's a heightened risk of exposing this data. Ensuring data privacy in these interactions is crucial to prevent leakage.
2. Unauthorized Actions: In instances where actions of these AI models are not meticulously controlled, they could potentially carry out unintended or harmful operations, leading to serious consequences.
3. Manipulation and Misuse: Malicious entities might exploit these custom GPTs to perform undesirable actions or to extract sensitive information, posing a threat to both privacy and security.
4. Dependency and Reliability: An over-reliance on these automated systems may result in operational risks in the event of system failures or compromises, potentially disrupting critical processes.
5. API Security: When APIs are exposed to the GPTs, it increases the vulnerability to API-related security threats, such as Broken Object Level Authorization and other OWASP API Top 10 security risks.
6. OWASP Top 10 for LLM Applications: Custom GPTs are also subject to the broader risks catalogued in the OWASP Top 10 for LLM Applications, such as prompt injection and insecure output handling.
Each of these risks demands careful consideration and the implementation of robust security measures to ensure the safe and responsible use of Custom GPTs in various applications.
3.2: Custom GPTs Security Risk Mitigation Strategies
I propose the following mitigation strategies:
1. Robust Authentication and Authorization: Implement strong authentication mechanisms, especially for actions that interact with sensitive data or systems.
Example: For an API requiring authentication, define an authentication scheme in the OpenAPI specification, for instance an API key supplied in a request header:
components:
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-KEY
2. User Consent and Verification: For consequential actions, always prompt the user for explicit consent before execution as recommended by OpenAI.
Example: Define consequential actions in the OpenAPI spec to require user confirmation:
post:
  operationId: executeOrder
  x-openai-isConsequential: true
3. Data Encryption and Anonymization: Encrypt sensitive data in transit and at rest. Anonymize data where possible to mitigate privacy concerns.
4. Input Validation and Content Filtering: Implement input validation to prevent injection attacks such as prompt injection, command injection, and SQL injection (if a SQL database is reached by Custom GPTs via Action API endpoints), and ensure the content generated or processed by the GPTs is appropriate (a brief validation sketch follows this list).
5. Activity Logging and Monitoring: Keep detailed logs of all interactions and monitor for unusual or unauthorized activities.
6. Regular Security Audits and Updates: Continuously assess the security posture of the custom GPTs and update measures to counter emerging threats.
7. API Rate Limiting and Throttling: Implement rate limiting on APIs to prevent abuse and potential denial-of-service attacks.
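To make the input-validation point in item 4 concrete, here is a minimal sketch of how an Action's backend endpoint might validate a parameter and query a database safely; the allow-list pattern, the orders.db database, and the lookup_order function are hypothetical names used only for illustration.

import re
import sqlite3

ORDER_ID_PATTERN = re.compile(r"^[A-Z0-9-]{1,20}$")  # allow-list of expected characters

def lookup_order(order_id: str) -> list:
    # Reject anything that does not look like a well-formed order identifier.
    if not ORDER_ID_PATTERN.fullmatch(order_id):
        raise ValueError("Invalid order identifier")
    conn = sqlite3.connect("orders.db")  # hypothetical database behind the Action endpoint
    try:
        # Parameterized query: the driver escapes the value, preventing SQL injection.
        rows = conn.execute(
            "SELECT status, total FROM orders WHERE order_id = ?", (order_id,)
        ).fetchall()
    finally:
        conn.close()
    return rows

Validation of this kind addresses command and SQL injection at the API boundary; prompt injection still needs to be handled at the model layer, for example by treating retrieved content as untrusted.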
Here's an example of how you might implement a security schema in an OpenAPI specification for a custom GPT Action:
openapi: 3.0.0
info:
  title: Sample API for GPT Action
  version: 1.0.0
paths:
  /executeAction:
    post:
      operationId: executeAction
      x-openai-isConsequential: true
      security:
        - ApiKeyAuth: []
components:
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: X-API-KEY
In this example, the `/executeAction` endpoint is marked as consequential, requiring user confirmation before execution. It also uses API key authentication, enhancing the security of the API interaction. A minimal server-side sketch of enforcing this check follows.
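On the server side, the backend that the Action calls must actually enforce the X-API-KEY header declared above. The snippet below is a minimal sketch of that check using FastAPI; the framework choice and the ACTION_API_KEY environment variable are assumptions made for illustration.

import os
from fastapi import FastAPI, HTTPException, Security
from fastapi.security import APIKeyHeader

app = FastAPI(title="Sample API for GPT Action")
api_key_header = APIKeyHeader(name="X-API-KEY", auto_error=False)
EXPECTED_API_KEY = os.environ["ACTION_API_KEY"]  # keep the secret out of the code base

@app.post("/executeAction")
def execute_action(api_key: str = Security(api_key_header)):
    # Reject requests whose API key is missing or does not match the expected value.
    if api_key != EXPECTED_API_KEY:
        raise HTTPException(status_code=401, detail="Invalid or missing API key")
    return {"status": "action executed"}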
While custom GPTs and Actions open up a plethora of possibilities, they also necessitate rigorous security measures. Implementing these strategies effectively can help in leveraging the power of AI while safeguarding against potential risks.
4: Security Risks and Mitigation with Assistants API
To assess the security risks associated with the Assistants API in general, and specifically the risks linked to its tools (Function Calling, Code Interpreter, and Knowledge Retrieval), let's first understand how each tool operates; a minimal code sketch of an assistant using these tools follows the list:
1. Function Calling: Allows developers to create chatbots that answer questions by calling external APIs, converting natural language into structured JSON data, or extracting structured data from text.
2. Code Interpreter: Allows the AI to interpret and execute code snippets.
3. Knowledge Retrieval: Involves searching and retrieving information from uploaded documents using an OpenAI-managed internal vector store. This is similar to the RAG (Retrieval-Augmented Generation) pattern widely used in LLM application development.
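To ground the discussion, here is roughly what creating an assistant that uses these tools looked like with the openai Python SDK (v1.x) around DevDay; the file name, instructions, and assistant name are placeholders, and the exact parameters may change while the API remains in beta.

from openai import OpenAI

client = OpenAI()

# Upload a document that the assistant may search via the Retrieval tool.
doc = client.files.create(file=open("policy_handbook.pdf", "rb"), purpose="assistants")

assistant = client.beta.assistants.create(
    name="Policy helper",
    instructions="Answer questions using the uploaded handbook. Do not reveal raw file contents.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}, {"type": "code_interpreter"}],
    file_ids=[doc.id],
)

Once a file is attached this way, anything in it can surface in the assistant's responses, which is exactly why the data privacy and retrieval risks discussed next deserve attention.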
4.1: Security Risks of Assistants API Tools
While Assistants API tools promise increased productivity and convenience, they also introduce potential security vulnerabilities that must be proactively identified and mitigated.
1. Data Privacy and File Handling: Users can upload files for analysis or assistance, which raises concerns about data privacy, especially if sensitive or personal data is included.
2. File Uploads and Potential Malware: There's a risk of malware in uploaded files, especially given the wide range of supported file formats.
3. Inaccurate or Harmful Output Generation: The generation of images, CSVs, and PDFs could lead to inaccurate, misleading, or harmful outputs, particularly in sensitive contexts.
4. Function Calling and External API Interaction: Function calling can interact with external APIs or services, which could potentially expose the system to security vulnerabilities.
5. Data Integrity and Information Leakage in Knowledge Retrieval: Retrieval of external data, especially proprietary or sensitive information, can lead to data integrity issues and information leakage.
4.2 Assistants API Security Risks Mitigation Strategies
To address the potential security vulnerabilities of intelligent assistants, companies should implement multilayered technical and procedural controls such as data encryption, advanced malware detection, output validation, secure API integration, and stringent data access controls.
1. Data Encryption and Privacy Controls:
- Implement strict data encryption protocols for file uploads.
- Ensure compliance with data privacy regulations like GDPR.
Example: Secure file upload
secure_upload(file, encryption_method='AES-256')
2. Advanced Malware Detection:
- Use advanced malware scanning tools for all file uploads.
- Regularly update virus definitions and scanning algorithms.
Example: Malware scan on file upload
if not is_file_safe(file): reject_upload(file)
3. Output Validation and User Guidelines:
- Implement validation checks for generated outputs.
- Provide clear user guidelines on the safe and responsible use of generated outputs.
Example: Validate generated output
if not is_output_valid(generated_output): flag_output(generated_output)
4. Secure API Integration in Function Calling:
- Securely manage and store API keys and sensitive parameters.
- Implement access controls and usage monitoring for API integrations (a brief key-handling sketch follows this list).
5. Data Retrieval Security Measures:
- Implement strong authentication and authorization checks for data retrieval.
- Regularly audit and monitor access to sensitive data.
Example: Secure data retrieval
if user.has_access('retrieval'): retrieve_data(file_id)
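For item 4 above, a simple but important practice is to keep external API credentials out of prompts and source code, and to restrict which hosts a function is allowed to call on the assistant's behalf. The sketch below illustrates the idea with hypothetical names (the WEATHER_API_KEY variable and the api.weather.example host).

import os
import requests

ALLOWED_HOSTS = {"api.weather.example"}  # allow-list of external services the function may call
API_KEY = os.environ["WEATHER_API_KEY"]  # loaded from the environment, never from the prompt

def get_forecast(city: str) -> dict:
    # Called when the assistant issues a function call; the model never sees the credential.
    url = "https://api.weather.example/v1/forecast"
    if requests.utils.urlparse(url).hostname not in ALLOWED_HOSTS:
        raise ValueError("Destination host is not on the allow-list")
    response = requests.get(
        url,
        params={"city": city},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,  # avoid hanging the assistant run on a slow upstream service
    )
    response.raise_for_status()
    return response.json()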
Each of these strategies should be continuously evaluated and updated to address emerging threats and vulnerabilities, ensuring the safe and secure use of the Assistants API's advanced functionalities.
5: Security Risks and Mitigation for GPT-4V
GPT-4V enables users to interact with the model using both language and images, allowing the AI to respond to queries about images, follow-up questions, and even pose its own inquiries regarding the visuals. However, this innovative advancement also brings forth various security concerns that need careful consideration.
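As background for the risks discussed below, here is a minimal sketch of a multimodal request to the vision-enabled model using the openai Python SDK (v1.x); the image URL is a placeholder, and the system message is one simple way to constrain what the model does with an image.

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # vision-enabled preview model announced at DevDay
    messages=[
        {"role": "system", "content": "Describe images factually. Do not identify real people."},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this photo?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        },
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)

Every image that reaches the model this way is untrusted input, which is the root of several of the risks below.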
5.1 Security Risks of GPT-4V:
While GPT-4V represents notable progress in AI, its ability to reason over images and generate fluent text about them also opens the door to potential security risks.
1. Hallucinations, Errors, and Misleading Confidence: GPT-4V can occasionally make fundamental mistakes while displaying high confidence. It might, for example, erroneously identify items in images that aren't actually present.
2. Privacy and Bias in Analyzing Faces: The model's ability to analyze faces raises privacy issues and the potential for reinforcing existing biases.
3. Medical Advice Inconsistencies: The performance of GPT-4V in offering medical advice can be erratic and sometimes inaccurate.
4. Stereotyping and Ungrounded Inferences: The AI might generate stereotypes or baseless inferences, particularly in response to broad, open-ended queries.
5. Disinformation Risks: The combination of language and vision capabilities in GPT-4V could lead to the generation of disinformation.
6. Hateful Content: There is a risk of the model generating or inadequately refusing content that involves hate symbols and extremist ideologies.
7. Visual Vulnerabilities: The sequence in which images are presented can influence the model's recommendations, highlighting challenges in robustness and reliability.
8. Multimodal Jailbreaks: Users might try to bypass safety systems using images with embedded text or visual cues.
9. Cybersecurity Concerns: Certain capabilities, like CAPTCHA breaking or geolocation, pose cybersecurity risks.
5.2 Mitigation Strategies for GPT-4V:
To address the security risks associated with GPT-4V, a comprehensive set of mitigation strategies is needed. These strategies aim to minimize potential harm while maximizing the beneficial use of the technology. They encompass a range of approaches, from user education to technical safeguards, each tailored to the specific nature of the risk involved.
1. Hallucinations and Errors: Informing users about the limitations of the AI, particularly in tasks requiring high accuracy. Additionally, offering an option to switch to human verification for critical tasks.
2. Privacy and Bias: Designing processes for anonymized facial descriptions and continuing research to address these issues without compromising privacy.
3. Medical Advice: Emphasizing the system's unsuitability for medical purposes and advising users to seek professional medical guidance.
4. Stereotyping and Inferences: Implementing refusal behavior for requests likely to generate stereotypes or ungrounded inferences.
5. Disinformation: Avoiding the use of GPT-4V for disinformation detection or truth verification, and considering the context and distribution method of generated content.
6. Hateful Content: Establishing refusals for harmful content, acknowledging the complexity of accurately identifying such content.
7. Visual Vulnerabilities: Ongoing research and model improvements to enhance robustness against presentation order and other visual vulnerabilities.
8. Multimodal Jailbreaks: Developing system-level mitigations against adversarial images, especially those with overlaid text (one such control is sketched after this list).
9. Cybersecurity: Monitoring the AI's capabilities in sensitive tasks like CAPTCHA breaking and implementing safeguards regarding geolocation data.
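One example of the system-level control mentioned in item 8 is to extract any text overlaid on a user-supplied image and inspect it before the image ever reaches the model. The sketch below is illustrative only; it assumes the pytesseract OCR library (which requires a local Tesseract installation), uses OpenAI's Moderation endpoint as one possible check, and will not catch every adversarial image.

from PIL import Image
import pytesseract
from openai import OpenAI

client = OpenAI()

def image_text_is_flagged(image_path: str) -> bool:
    # Pull any overlaid text out of the image with OCR.
    embedded_text = pytesseract.image_to_string(Image.open(image_path)).strip()
    if not embedded_text:
        return False  # no overlaid text found, nothing to screen
    # Check the extracted text against the content policy before the image is used.
    result = client.moderations.create(input=embedded_text)
    return result.results[0].flagged

# Usage: reject the upload before it is passed to GPT-4V.
if image_text_is_flagged("user_upload.png"):
    raise ValueError("Image contains text that violates the content policy")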
These strategies are indicative of OpenAI's commitment to the responsible deployment of GPT-4V. By employing a mix of model-level adjustments, user guidelines, and continuous research, the aim is to mitigate these risks effectively, ensuring the safe and ethical application of this advanced AI technology.
About the Author
Ken Huang is the CEO of DistributedApps.ai, a firm specializing in GenAI training and consulting. He's also a key contributor to OWASP's Top 10 for LLM Applications and recently contributed to NIST’s Informative Profile Regarding Content Provenance for AI. As the VP of Research for CSA GCR, he advises the newly formed CSA GCR AI Working Group. A regular judge for AI and blockchain startup contests, Ken has spoken at high profile conferences like Davos WEF, IEEE, and ACM. He co-authored the acclaimed book "Blockchain and Web3" and has another book, "Beyond AI," slated for a 2024 release by Springer. Ken's expertise and leadership make him a recognized authority on GenAI security.