OpenAI has published the GPT-4o System Card, a detailed research document outlining the safety protocols and risk evaluations conducted prior to the model’s public release in May. The document sheds light on OpenAI’s efforts to mitigate potential risks associated with its latest multimodal AI model.
Prior to launch, OpenAI followed its standard practice of engaging external red teamers, security experts tasked with identifying vulnerabilities in a system. These experts explored potential risks associated with GPT-4o, such as unauthorised voice cloning, generation of inappropriate content, and copyright infringement.
“This system card includes preparedness evaluations created by an internal team, alongside external testers listed on OpenAI’s website as Model Evaluation and Threat Research (METR) and Apollo Research, both of which build evaluations for AI systems,” explained OpenAI spokesperson Lindsay McCallum Rémy.
This release follows similar system card publications for previous models like GPT-4, GPT-4 with vision, and DALL-E 3, demonstrating OpenAI’s commitment to transparency and external collaboration in evaluating its AI systems.
Based on OpenAI’s internal framework, the researchers categorised GPT-4o as having a “medium” risk level. This overall rating was derived from the highest individual risk score across four key categories: cybersecurity, biological threats, persuasion, and model autonomy. All categories were deemed low risk except for persuasion, where certain GPT-4o-generated text samples proved more persuasive than comparable human-written text.
The release of a highly capable multimodal model like GPT-4o so close to the US presidential election raises concerns about the potential for misinformation and malicious exploitation. OpenAI’s system card aims to address these concerns by highlighting the company’s proactive efforts to mitigate such risks through real-world scenario testing.
The timing of this release is particularly significant, as OpenAI faces ongoing criticism regarding its safety practices. Concerns have been raised by both internal employees and external stakeholders, including a recent open letter from Senator Elizabeth Warren and Representative Lori Trahan demanding greater accountability and transparency in OpenAI’s safety review processes.
Despite OpenAI’s efforts, calls for greater transparency and external oversight persist. The scrutiny extends beyond training data to encompass the entirety of the safety testing process. In California, lawmakers are advancing legislation to regulate large language models, including provisions that would hold companies accountable for harm caused by their AI systems.