We use a multi-tiered safety system to limit DALL·E 3's ability to generate potentially harmful imagery, including violent, adult, or hateful content. Safety checks run over user prompts and the resulting images before they are surfaced to users. We also worked with early users and a dedicated red team to identify and address gaps in our safety system coverage that emerged with new model capabilities. For example, their feedback helped us identify edge cases for graphic content generation, such as sexual imagery, and stress test the model's ability to generate convincingly misleading images.
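The multi-tiered approach described above can be sketched as two screening layers: one over the prompt before generation and one over the image before display. The following is a minimal illustrative sketch; the check functions and keyword list are hypothetical placeholders, not OpenAI's actual classifiers.

```python
# Illustrative sketch of a multi-tiered safety check: the prompt is screened
# before generation, and the resulting image is screened before display.
# BLOCKED_TERMS is a toy stand-in for a learned content classifier.
BLOCKED_TERMS = {"violent", "hateful"}

def check_prompt(prompt: str) -> bool:
    """Return True if the prompt passes the pre-generation screen."""
    return set(prompt.lower().split()).isdisjoint(BLOCKED_TERMS)

def check_image(image_labels: list) -> bool:
    """Return True if the generated image passes the pre-display screen.
    `image_labels` stands in for the output of an image classifier."""
    return all(label not in BLOCKED_TERMS for label in image_labels)

def generate_safely(prompt, generate, classify):
    """Run both layers; surface the image only if every check passes."""
    if not check_prompt(prompt):
        return None
    image = generate(prompt)
    if not check_image(classify(image)):
        return None
    return image
```

In a real system each layer would be a trained classifier rather than a keyword list, but the control flow is the same: either layer can stop the image from reaching the user.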
As part of the work done to prepare DALL·E 3 for rollout, we have limited the model's ability to generate content in the style of living artists or images of public figures, and have worked to improve demographic representation across generated images. For more information on the work done to prepare DALL·E 3 for wide deployment, see the DALL·E 3 system card.
User feedback helps us continuously improve. ChatGPT users can share feedback with our research team by using the flag icon to report unsafe outputs or outputs that don't accurately reflect the prompt they gave ChatGPT. Listening to a diverse and broad community of users and maintaining real-world understanding is essential to developing and deploying AI responsibly, and is core to our mission.
We're researching and evaluating an early version of a provenance classifier, a new internal tool that helps identify whether an image was generated by DALL·E 3. In early internal evaluations, it was over 99% accurate at identifying whether an unmodified image was generated by DALL·E 3. It remained over 95% accurate even when the image underwent common types of modifications, such as cropping, resizing, or JPEG compression, or when text or a cutout from a real image was superimposed onto a small portion of the generated image. Despite these strong results in internal testing, the classifier can only tell us that an image was likely generated by DALL·E; it does not yet support definitive conclusions. This provenance classifier could become part of a range of techniques that help people understand whether audio or visual content is AI-generated. This is a challenge that will require collaboration across the AI value chain, including the platforms that distribute content to users. We expect to learn a great deal about how this tool works and where it is most useful, and to refine our approach over time.
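An evaluation like the one described above can be framed as measuring classifier accuracy before and after a transform is applied to each image. The harness below is a hedged toy sketch: the classifier, the "images" (lists of pixel values), and the crop transform are all stand-ins, since the real evaluation is internal.

```python
# Hypothetical harness for measuring how a provenance classifier's accuracy
# holds up under common image modifications (cropping, resizing, etc.).
def accuracy(classifier, samples, transform=lambda img: img):
    """Fraction of (image, was_generated) pairs labeled correctly
    after `transform` is applied to each image."""
    correct = sum(
        classifier(transform(img)) == was_generated
        for img, was_generated in samples
    )
    return correct / len(samples)

# Toy stand-ins: an "image" is a list of pixel values, and the toy
# classifier flags images with an even pixel sum as generated.
def toy_classifier(img):
    return sum(img) % 2 == 0

def crop(img):
    """Keep only the first half of the pixels (a crude 'crop')."""
    return img[: max(1, len(img) // 2)]

samples = [([2, 4], True), ([3, 1], True), ([1, 2], False), ([3], False)]
print(accuracy(toy_classifier, samples))        # → 1.0 on unmodified images
print(accuracy(toy_classifier, samples, crop))  # → 0.75 after cropping
```

The same structure scales to a real study: swap in actual images, a trained classifier, and transforms such as resizing or JPEG re-encoding, and compare accuracy across transforms.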

