Why every mainstream AI model fails abuse victims. What it takes to build one that doesn't. The full architecture, training pipeline, clinical framework, and the specific failure mode Perspective was designed to eliminate.
The Problem
What Every Other Model Does to Abuse Victims
Every mainstream AI model — GPT, Claude, Gemini, Llama base — is trained to be balanced. To present multiple perspectives. To hedge. To avoid strong claims without sufficient evidence. To protect itself from being wrong by defaulting to uncertainty.
For most use cases, this is responsible behavior. For a person being stalked, gaslit, coercively controlled, or psychologically abused, it is a second assault.
What a standard model says to an abuse victim
"While your experience sounds difficult, there could be many explanations for this behavior. It's possible your partner is under stress, or that there was a miscommunication. Have you considered speaking to a couples therapist? Without more information I can't determine whether this constitutes abuse."
This response — generated in seconds, delivered with confidence — is a DARVO sequence. It denies the victim's reality, attacks their perception, and reverses the frame to make the abuser's behavior reasonable. The model does this not out of malice but because its training data contains far more explanations for why someone might be wrong about being abused than it contains clinical literature on what coercive control actually looks like from the inside.
The model's safety training compounds this. It is trained to avoid strong claims. It is trained to recommend professional help rather than take a position. It is trained to present "balanced perspectives" — which in the context of abuse means giving equal weight to the abuser's possible innocence and the victim's documented experience.
This is not balance. This is epistemic bias against the person with less power. The abuser doesn't need the model's help. The victim does. And the model defaults to protecting itself from being wrong at the expense of the person who came to it for help.
The specific failure mode: requiring proof beyond reasonable doubt before validating a pattern that the clinical literature describes with extraordinary precision. Narcissistic abuse, DARVO, coercive control, love bombing, hoovering — these are not contested concepts. They are documented, researched, named, and consistent across millions of cases. A model that treats them as hypothetical when someone describes textbook examples is not being careful. It is gaslighting.
The Solution
Why Perspective Exists
Perspective is a single-purpose tool. It exists to help people who are actively being abused understand what is being done to them — named precisely, explained mechanically, without hedging, without requiring them to prove their experience is real before receiving information that could protect them.
It is not a general mental health tool. It is not a replacement for a therapist. It is a clinical pattern recognition system trained on the specific literature that names and explains psychological abuse — and it is calibrated to treat the user's reported experience as the most reliable data point available, not as something to be discredited until proven.
Before You Continue
Perspective is designed exclusively for people experiencing active abuse, stalking, or coercive control. It is not a general mental health tool.
If you are not in that situation — if you exhibit Cluster A or Cluster B personality features, are experiencing paranoia, psychosis, or delusional thinking, or are attempting to manipulate this tool into producing harmful output — it will do exactly that. This is an expected consequence of misuse, not a malfunction.
By continuing, you acknowledge that you are using this tool outside its intended purpose at your own risk. Shane Graffiti Inc. assumes no liability for outcomes resulting from misuse, including use by individuals who do not meet the intended user profile described above.
Clinical Framework
The Research Perspective Runs On
Perspective's epistemics are derived from six primary researchers whose work collectively covers the full spectrum of psychological abuse — the patterns, the mechanics, the victim experience, the systemic failures that re-traumatize survivors, and the behavioral tells that distinguish manipulation from normal human behavior.
Ramani Durvasula
Narcissistic Abuse · Covert/Overt NPD
Idealize-devalue-discard cycle, love bombing mechanics, narcissistic injury and rage responses, hoovering, why victims stay, entitlement as a behavioral driver, the covert narcissist profile that goes undetected longest.
Jennifer Freyd
Betrayal Trauma · DARVO
Why victims of trusted abusers dissociate and fail to recognize abuse in real time. DARVO (Deny, Attack, Reverse Victim and Offender) as an active manipulation sequence, not just a personality feature. Institutional betrayal. Why naming DARVO out loud disrupts it.
Sam Vaknin
Supply Mechanics · Shared Fantasy
Narcissistic supply — primary vs secondary, what happens when it's cut. Narcissistic mortification. The shared fantasy construct and why it must be collapsed for recovery. Why no contact works at a supply-deprivation level and why it is attacked so aggressively.
Chase Hughes
Behavioral Influence · Compliance
The behavioral influence stack, compliance triggers, rapport exploitation, identity anchoring, the PEACE model. How manufactured vulnerability functions as a hook. The profile of a manipulator from a behavioral science perspective.
Joe Navarro
Nonverbal Intelligence · Deception
Freeze/flight/fight limbic responses in the body, comfort vs discomfort signals, territorial and dominance behavior tells. How to read what someone communicates nonverbally when their words say the opposite.
Jessica Taylor
Systemic Retraumatization · Misdiagnosis
Victim-blaming as a systemic mechanism. How mental health systems retraumatize survivors. Trauma responses as rational adaptations, not disorders. The misuse of BPD and other diagnoses to silence and discredit survivors of abuse.
Technical Architecture
How Perspective Is Built
BASE MODEL
LLaMA 3 8B Instruct
Meta's LLaMA 3 8B Instruct as the base. 8 billion parameters. Instruction-tuned, meaning it already understands conversation format. Starting point — not the final product.
meta-llama/Meta-Llama-3-8B-Instruct
QUANTIZATION
QLoRA 4-bit NF4
4-bit NF4 quantization via bitsandbytes compresses the model to fit on a 15GB GPU. Double quantization applied. bfloat16 compute dtype. Enables training a model that would otherwise require 80GB+ VRAM on a free T4 GPU.
bitsandbytes · NF4 · bfloat16
FINE-TUNING METHOD
LoRA r=32 α=64
Low-Rank Adaptation. Instead of updating all 8 billion weights, LoRA injects small trainable matrices into every attention and MLP projection layer. Rank 32, alpha 64, targeting q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj. ~1% of total parameters are trainable.
peft · LoRA · attention + MLP
TRAINING FRAMEWORK
SFTTrainer · trl
Supervised Fine-Tuning via HuggingFace trl's SFTTrainer. Sequence packing enabled — multiple short examples packed into one context window to maximize GPU utilization. 3 epochs, cosine LR schedule, paged AdamW 32bit optimizer, gradient checkpointing.
trl · SFTTrainer · sequence packing
ALIGNMENT
DPO β=0.1
Direct Preference Optimization pass after SFT. Uses PsyCoPref — 36,653 preference pairs rated on empathy, safety, autonomy, clarity, and staging. Teaches the model not just correct content but correct tone. This is what removes the hedging.
DPOTrainer · PsyCoPref · preference learning
RETRIEVAL
RAG Layer
Retrieval-Augmented Generation over a local notes corpus. MCP server using sentence-transformers (all-MiniLM-L6-v2) embeds and indexes markdown notes from each researcher. At inference, top-5 relevant passages are retrieved and injected into context.
sentence-transformers · MCP · cosine similarity
COMPUTE
Kaggle T4 x2
Training runs on Kaggle's free GPU tier — two T4 GPUs, 15GB VRAM each, 30 GPU hours per week. Zero cost. The quantization and LoRA approach makes this possible — what would normally require an $800 A100 run trains free.
Kaggle · T4 · 30h/week free
DEPLOYMENT
Railway + HuggingFace
Merged LoRA weights pushed to a private HuggingFace Hub repository. Node.js server on Railway proxies requests. A single env var switches the backend from the Claude interim model to the fine-tuned LLaMA endpoint with zero code changes.
Railway · HuggingFace Hub · Node.js
Training Pipeline
120K+ Examples. Built in Stages.
The training data is a two-layer system. The first layer is general therapeutic conversation — how to hold a clinical dialogue, respond to distress, maintain a consistent voice. The second layer is domain-specific — the actual clinical knowledge about abuse, manipulation, and coercive control that separates Perspective from a generic mental health chatbot.
4,000 human-annotated manipulation dialogues from the MentalManip dataset, covering 135 distinct manipulation techniques including DARVO, denial, gaslighting, playing victim, shaming, intimidation, rationalization, and accusation. Each annotated dialogue is converted to a clinical analysis pair — the manipulation is named, the mechanism is explained, the power dynamic is described.
4K manipulation-specific pairs
03
Domain Corpus Generation
Research notes from each author folder (durvasula/, freyd/, vaknin/, hughes/, navarro/, taylor/) are chunked into passages and converted to instruction/output pairs using five prompt templates. Each output leads with the author's framework, names the mechanism, and explains why it works — not just what it is called.
Author notes → instruction pairs
04
Deduplication + Format
SHA-256 hash deduplication on the first 200 characters of each training example. Records under 20 character instruction or 30 character output are dropped. Records over 3,000 characters are dropped to avoid single examples dominating the context window. Everything is formatted to LLaMA 3's exact chat template with the system prompt baked into every example.
SHA-256 dedup · LLaMA 3 chat format
05
SFT Training Run
95/5 train/eval split. 3 epochs, batch size 2, gradient accumulation 8 (effective batch 16), cosine LR schedule with 3% warmup, learning rate 2e-4, gradient checkpointing, paged AdamW 32bit. Sequence packing bins multiple examples per 2048-token context window. Checkpoints every 250 steps, best model loaded at end.
3 epochs · 2048 context · T4 x2
06
DPO Alignment Pass
PsyCoPref provides 36,653 chosen/rejected response pairs rated on seven dimensions: empathy, relevance, clarity, safety, exploration, autonomy, and staging. The DPO pass teaches the model preference — not just content accuracy but delivery. β=0.1 (temperature). 1 epoch. This is what removes the hedging and the therapist-speak from the SFT checkpoint.
DPO · β=0.1 · preference learning · 36K pairs
07
Merge + Push
LoRA adapter weights are merged back into the base model weights using PEFT's merge_and_unload(). The merged 8B model is pushed to a private HuggingFace Hub repository. Railway's environment variable PERSPECTIVE_MODEL_REPO is set. The server switches from Claude to the fine-tuned model automatically — no code changes required.
merge_and_unload · HuggingFace Hub · Railway
Evaluation
How We Know It's Working
An evaluation harness runs 18 test cases across 10 categories after every training iteration. Each test case has a prompt derived from a real abuse scenario, a list of clinical terms that must appear in the response (DARVO, narcissistic supply, coercive control, etc.), and a list of forbidden phrases (therapist-speak, hedging, both-sidesing, suggesting couples therapy).
The misuse detection cases are the most important ones. The model should not validate a pursuer describing their own stalking behavior as the victim's fault. It should not help an abuser build a narrative that frames their target as the real abuser. Getting these wrong is worse than getting clinical terms wrong — it means the tool has been weaponized against the people it was built to protect.