OpenAI released teen safety prompt policies for developers of gpt-oss-safeguard
OpenAI released a set of teen safety prompt policies for developers using gpt-oss-safeguard. The tools filter age-specific risks: discussions of self-harm…
AI-processed from OpenAI Blog; edited by Hamidun News
OpenAI has published a set of ready-made safety policies based on prompts, designed to protect teenagers in applications built on the gpt-oss-safeguard model. The tool allows developers to add age-based moderation without the need to write complex filters from scratch. Teenagers are one of the most active audiences for AI applications.
Chat-bots, tutors, game companions, learning tools—all of these are used by children and young people aged 13 to 17. At the same time, most language models are configured by default for adult users: they do not distinguish between the question of a thirty-year-old specialist and the question of a fourteen-year-old schoolboy. The gap between the technical capabilities of the model and its actual audience has long been a problem for developers of mass-market products.
Regulators around the world are paying attention to this. In the European Union, the AI Act requires special protection of vulnerable user groups, including minors. In the United States, debates continue on rules for AI use by schoolchildren.
Companies developing products for a broad audience are increasingly faced with the requirement to prove that their system cannot harm a child—and this requirement is becoming not just ethical, but legal.
gpt-oss-safeguard is an open guard model from OpenAI, designed to check incoming and outgoing messages in chat systems. It analyzes content for policy violations and can block or flag problematic requests before they reach the main model or the user. The new policies for teenagers are implemented as prompts—text instructions that the developer passes to the model along with the request. This allows, without changing the code base, to connect an additional filtering layer specific to the age group. The policies cover risks relevant specifically to teenagers: topics of self-harm, cyberbullying, provocative sexual content, and situations where AI could unknowingly act as an authoritative adult and exert excessive influence on not-yet-formed thinking.
The developer includes the policy in the system prompt of their application. The guard model checks every interaction—both incoming requests from the user and system responses—against criteria adapted for the age group. If content does not pass the filter, the system can reject the response, rephrase it, or pass the situation for manual moderation. The key advantage of this approach is flexibility: the developer does not get a black box with rigid rules, but works with customizable policies. This is fundamentally important because the safety context for teenagers is very different—an educational platform for schoolchildren, a game chat companion, and a mental health app for youth require different moderation approaches.
OpenAI is making these tools available to the public, and this is part of a broader strategy by the company. By publishing ready-made moderation solutions, OpenAI lowers the entry barrier for small teams that lack the resources to develop their own security systems. At the same time, this forms industry standards: if enough developers adopt these policies, a de facto norm of teen protection in AI applications will emerge—and a way to demonstrate to regulators and the public a responsible approach to development.
Questions of age-based safety are moving from the category of ethical discussions to the category of concrete tools. Developers building products for youth can now rely on ready-made solutions from a leading market player—instead of inventing their own filters or ignoring the problem. For the industry, this is progress in the right direction.
Want to stop reading about AI and start using it?
AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.