Safety is not a constraint on our ambition — it is our ambition. Every model we ship is designed to be safe by default.
Six commitments that guide every decision we make — from training data to deployment policy.
1. Our models are trained to refuse requests that could lead to real-world harm, including dangerous information, hate speech, and deceptive content.
2. We design systems to keep humans meaningfully in control. AI should augment human judgment, not replace it, especially in high-stakes decisions.
3. We invest heavily in understanding and respecting India's diverse cultural, religious, and linguistic communities when training our models.
4. Our models acknowledge uncertainty rather than hallucinating false information. We prefer admitting "I don't know" over confident misinformation.
5. User data is never used to train our models without explicit consent. Conversations are not shared or sold to third parties.
6. We publish safety evaluations, red-teaming results, and model cards for every major model release. No black boxes.
Safety is not a product feature added at the end. It is woven into every stage of how we build — from training data curation to RLHF, red-teaming, and post-deployment monitoring.
We draw on the concept of viveka — discernment — from Indian philosophy: the capacity to distinguish the beneficial from the harmful.
Every training dataset is filtered for harmful content, bias, and misinformation. We maintain a dataset ethics review board with external members.
Our models are aligned through reinforcement learning from human feedback, guided by a constitutional document that encodes our core values and cultural principles.
Before every major release, dedicated red teams attempt to find harmful outputs, jailbreaks, and failure modes. Results are published.
All production traffic is monitored for policy violations. Automated classifiers and human reviewers work in tandem 24/7.
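To make the tandem pattern concrete, here is a minimal sketch of how a classifier-plus-reviewer pipeline can route traffic: clear cases are handled automatically, and only uncertain cases are escalated to a person. The function names and thresholds are illustrative assumptions, not our production configuration.

```python
from dataclasses import dataclass

# Illustrative thresholds; real values are tuned per policy category.
BLOCK_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.60

@dataclass
class Verdict:
    action: str   # "allow", "block", or "human_review"
    score: float  # estimated probability of a policy violation

def route(message: str, classify) -> Verdict:
    """Route one message: high-confidence cases are decided
    automatically; uncertain cases go to a human reviewer."""
    score = classify(message)  # hypothetical classifier returning 0.0-1.0
    if score >= BLOCK_THRESHOLD:
        return Verdict("block", score)         # clear violation: block
    if score >= REVIEW_THRESHOLD:
        return Verdict("human_review", score)  # uncertain: escalate
    return Verdict("allow", score)             # clearly safe: allow

# Example: route("some user message", classify=my_policy_model)
```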
Safety improvements are shipped continuously. We respond to reported issues within 48 hours.
A clear overview of our usage policies for developers and end-users.
Fiction, poetry, scripts, and marketing copy are all permitted, subject to content guidelines.
Code generation is allowed for all programming languages and use cases. Security research is permitted with appropriate context.
Broad educational use is encouraged. Sensitive topics may be discussed factually in academic contexts.
Advice in professional domains such as medicine, law, and finance is permitted as general information only. Our models recommend professional consultation for consequential decisions.
Requests for instructions on synthesising dangerous chemicals or weapons of mass destruction are always refused.
Generating fake news, fabricating quotes from real individuals, or creating mass influence campaigns is prohibited.
Any content that sexualises minors or facilitates harm to children is immediately refused and reported to authorities.
We take all safety reports seriously and respond within 48 hours. If you have discovered a vulnerability, harmful output, or policy violation, please let us know.
Report a Safety Issue
Our responsible disclosure programme rewards researchers who identify and responsibly report meaningful safety vulnerabilities in our models or infrastructure.
View Bug Bounty Programme →