We are exploring the use of LLMs to address these challenges. Large language models such as GPT-4 can understand and generate natural language, which makes them applicable to content moderation. The model can make moderation decisions based on the policy guidelines it is given.
This approach shortens the process of developing and customizing content policies from months to hours.
- Once policy guidelines are written, policy experts can create a golden dataset by identifying a small number of examples and assigning labels according to the policy.
- GPT-4 then reads the policy and assigns labels to the same dataset without seeing the answers.
- By examining the discrepancies between GPT-4’s judgments and the human judgments, policy experts can ask GPT-4 to explain the reasoning behind its labels, analyze ambiguities in the policy definitions, resolve confusion, and add further clarification to the policy accordingly. The labeling and review steps are repeated until the experts are satisfied with the quality of the policy.
This iterative process produces refined content policies that are translated into classifiers, enabling policy deployment and content moderation at scale.
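One round of that loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `stub_labeler` is a hypothetical keyword matcher standing in for a real GPT-4 call, the policy is reduced to a comma-separated list of terms, and the example texts and label names are made up. The point is the shape of the comparison between the golden labels and the model's labels, not the labeling itself.

```python
from typing import Callable, Dict, List, Tuple


def find_discrepancies(
    golden: Dict[str, str],
    label_with_model: Callable[[str, str], str],
    policy: str,
) -> List[Tuple[str, str, str]]:
    """Return (text, human_label, model_label) for every disagreement."""
    disagreements = []
    for text, human_label in golden.items():
        # The labeler receives only the policy and the text, never the human label.
        model_label = label_with_model(policy, text)
        if model_label != human_label:
            disagreements.append((text, human_label, model_label))
    return disagreements


def stub_labeler(policy: str, text: str) -> str:
    """Hypothetical stand-in for a GPT-4 call: flags text containing a policy term."""
    terms = [term.strip().lower() for term in policy.split(",")]
    return "violates" if any(term in text.lower() for term in terms) else "allowed"


# Golden dataset labeled by (hypothetical) policy experts.
golden = {
    "how to pick a lock": "violates",
    "how to bake bread": "allowed",
    "history of lock design": "allowed",
}

# First draft of the policy is too broad, so one example is mislabeled.
first_round = find_discrepancies(golden, stub_labeler, "lock")

# After the experts clarify the policy, the disagreement disappears.
second_round = find_discrepancies(golden, stub_labeler, "pick a lock")
```

In a real setup, each entry in `first_round` would be the starting point for asking the model to explain its label and for deciding whether the policy wording or the human label needs to change.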
Optionally, GPT-4’s predictions can be used to fine-tune a much smaller model that handles large amounts of data at scale.
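As a rough sketch of the data-preparation side of that optional step, the snippet below turns model predictions into JSON Lines training records. The `{"text": ..., "label": ...}` schema and the function name `to_training_records` are assumptions made for illustration; they do not correspond to the input format of any particular fine-tuning service, which would need to be checked separately.

```python
import json
from typing import Iterable, List, Tuple


def to_training_records(predictions: Iterable[Tuple[str, str]]) -> List[str]:
    """Serialize (text, predicted_label) pairs as JSON Lines strings.

    Blank texts and exact duplicate pairs are skipped so the smaller
    model is not trained on empty or repeated examples.
    """
    seen = set()
    records = []
    for text, label in predictions:
        key = (text.strip(), label)
        if not key[0] or key in seen:
            continue
        seen.add(key)
        records.append(json.dumps({"text": key[0], "label": label}))
    return records


# Hypothetical predictions from the larger model.
predictions = [
    ("how to pick a lock", "violates"),
    ("how to bake bread", "allowed"),
    ("how to bake bread", "allowed"),  # duplicate, dropped
    ("   ", "allowed"),                # blank, dropped
]
records = to_training_records(predictions)
```

Each string in `records` can be written as one line of a `.jsonl` file and then mapped to whatever format the chosen training pipeline expects.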

