Anthropic

Anthropic is hiring: Policy Design Manager, User Well-being in San Francisco

Anthropic, San Francisco, California, United States

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role: As a Safeguards Policy Design Manager, you will be responsible for developing usage policies, clarifying enforcement guidelines, and advising on safety interventions for our products and services. Your core focus will be on mitigating potential risks related to user well-being, including concerns regarding mental health, sycophancy, delusions, and emotional attachment.

Important context for this role: In this position, you may be exposed to and engage with explicit content spanning a range of topics, including those of a sexual, violent, or psychologically disturbing nature.

Responsibilities

Serve as an internal subject matter expert, leveraging deep expertise in mental health and well-being to draft new policies that help govern the responsible use of our models for emerging capabilities and use cases
Design evaluation frameworks for testing model performance in areas of expertise
Conduct regular reviews and testing of existing policies to identify and address gaps and ambiguities
Review flagged content to drive enforcement and policy improvements
Update our usage policies based on feedback collected from external experts, our enforcement team, and edge cases that you will review
Work with safeguards product teams to identify and mitigate concerns, and collaborate on designing appropriate interventions
Educate and align internal stakeholders around our policies and our approach to safety in your focus area(s)
Keep up to date with new and existing AI policy norms and standards, and use these to inform our decision-making on policy areas

Qualifications

Experience as a researcher, subject matter expert, clinician, or trust & safety professional working in one or more of the following focus areas: psychology, mental health, developmental science, or human-AI interaction
Advanced degree in clinical psychology, counseling psychology, psychiatry, social work, or a related field is preferred
Experience drafting or updating product and/or user policies, with the ability to effectively bridge technical and policy discussions

What We Offer

Competitive compensation and benefits
Optional equity donation matching
Generous vacation and parental leave
Flexible working hours
A lovely office space in which to collaborate with colleagues

We encourage you to apply even if you do not believe you meet every single qualification.
