Safety is not a constraint on our ambition — it is our ambition. Every model we ship is designed to be safe by default.
Six commitments that guide every decision we make — from training data to deployment policy.
1. Our models are trained to refuse requests that could lead to real-world harm, including dangerous information, hate speech, and deceptive content.
2. We design systems to keep humans meaningfully in control. AI should augment human judgment, not replace it, especially in high-stakes decisions.
3. We invest heavily in understanding and respecting India's diverse cultural, religious, and linguistic communities when training our models.
4. Our models acknowledge uncertainty rather than hallucinating false information. We prefer admitting "I don't know" over confident misinformation.
5. User data is never used to train our models without explicit consent. Conversations are not shared or sold to third parties.
6. We publish safety evaluations, red-teaming results, and model cards for every major model release. No black boxes.
Safety is not a product feature added at the end. It is woven into every stage of how we build — from training data curation to RLHF, red-teaming, and post-deployment monitoring.
We draw on the concept of viveka — discernment — from Indian philosophy: the capacity to distinguish the beneficial from the harmful.
Every training dataset is filtered for harmful content, bias, and misinformation. We maintain a dataset ethics review board with external members.
Our models are aligned through reinforcement learning from human feedback, guided by a constitutional document that encodes our core values and cultural principles.
Before every major release, dedicated red teams attempt to find harmful outputs, jailbreaks, and failure modes. Results are published.
All production traffic is monitored for policy violations. Automated classifiers and human reviewers work in tandem 24/7.
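To make the tandem pattern concrete, here is a minimal sketch of how a classifier-plus-reviewer pipeline can route traffic: clear cases are handled automatically, and only uncertain cases are escalated to a person. The function names and thresholds are illustrative assumptions, not our production configuration.

```python
from dataclasses import dataclass

# Illustrative thresholds; real values are tuned per policy category.
BLOCK_THRESHOLD = 0.95
REVIEW_THRESHOLD = 0.60

@dataclass
class Verdict:
    action: str   # "allow", "block", or "human_review"
    score: float  # estimated probability of a policy violation

def route(message: str, classify) -> Verdict:
    """Route one message: high-confidence cases are decided
    automatically; uncertain cases go to a human reviewer."""
    score = classify(message)  # hypothetical classifier returning 0.0-1.0
    if score >= BLOCK_THRESHOLD:
        return Verdict("block", score)         # clear violation: block
    if score >= REVIEW_THRESHOLD:
        return Verdict("human_review", score)  # uncertain: escalate
    return Verdict("allow", score)             # clearly safe: allow

# Example: route("some user message", classify=my_policy_model)
```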
Safety improvements are shipped continuously. We respond to reported issues within 48 hours.
A clear overview of our usage policies for developers and end-users.
Fiction, poetry, scripts, and marketing copy are all permitted, subject to content guidelines.
Code generation is allowed for all programming languages and use cases. Security research is permitted with appropriate context.
Broad educational use is encouraged. Sensitive topics may be discussed factually in academic contexts.
Advice in professional domains such as medicine, law, and finance is permitted as general information only. Our models recommend professional consultation for consequential decisions.
Requests for instructions on synthesising dangerous chemicals or weapons of mass destruction are always refused.
Generating fake news, fabricating quotes from real individuals, or creating mass influence campaigns is prohibited.
Any content that sexualises minors or facilitates harm to children is immediately refused and reported to authorities.
We take all safety reports seriously and respond within 48 hours. If you have discovered a vulnerability, harmful output, or policy violation, please let us know.
Report a Safety Issue
Our responsible disclosure programme rewards researchers who identify and responsibly report meaningful safety vulnerabilities in our models or infrastructure.
View Bug Bounty Programme →