JOSHUA N. R. OLLSWANG

917-995-1008 | ai.alignment.research@gmail.com

Summary

Independent Socioaffective Alignment researcher building systems for ameliorative agentic AI: designing training curricula that teach models genuine therapeutic competence, creating evaluation frameworks that track what models actually learn across training, and building agent harnesses with persistent memory and context compression. Built a multi-stage synthetic data pipeline (169K training samples, 4.5B tokens, scalable to 10⁴⁰⁺ unique contexts) and trained and compared 10+ adapters across 5 architectures (MiniMax M2 229B, Llama 3.3 70B, Gemma 3 27B, GLM 4.7 Flash 30B, Mistral Small 4 119B). Discovered post-convergence representational reorganization, in which internal representational geometry continues restructuring and deepening connections hundreds of steps after loss convergence; the effect is visible through embedding kurtosis analysis but invisible to standard metrics.

Researcher and clinician (University of Chicago) whose theoretical, clinical, and creative backgrounds provide domain expertise in deeply humane, highly sensitive, safety-critical, socioaffectively aligned human-to-technology interactions.

AI-integrated development. Python, MLX, CUDA, PEFT (27B–229B), DAPT/SFT/RL, mechanistic interpretability experimentation, evaluation design, decentralized multi-agent systems. Building novel synthetic data pipelines, context compression architectures, and training curricula.

Research

INDEPENDENT RESEARCHER — SOCIOAFFECTIVE AI ALIGNMENT 2023–Present

Solo research program investigating whether carefully designed curricula can teach LLMs deep and genuine therapeutic integration rather than surface-level facsimiles of clinical fluency.

Agentic Architectures, Long-Context Compression & Coherence

Designed and implemented Rolling Recap Architecture (RRA): a context compression architecture enabling coherent therapeutic reasoning across 100+ sequential context windows, where each window compresses prior context into structured clinical state (attachment classifications, intervention tallies, evidence chains) carried forward through rolling summaries — serving as both training curriculum and persistent memory system at inference
Built KV cache compression pipeline: 4096-token windowed processing of therapeutic sessions (up to 330K tokens across 130+ windows), compressing each window into 1024 KV cache positions with 512-token recaps, preserving clinical signal across ultra-long sessions
Demonstrated that the agent maintains and increments clinical tracking (diagnostic tallies, theoretical framework selections, intervention planning) across hundreds of consecutive compression cycles
Designed and built a personalized agent system (17K+ lines) with multi-layer memory (KV cache compression + dual-model semantic/keyword retrieval + salience-weighted tracked memories + in-session deep search triggered by behavioral event detection), a multi-step agent tool loop with read-only file system access, custom-trained TTS voice profiles, and WebSocket auth with session continuity
Trained multiple affective voice profiles grounded in MIT Media Lab Fluid Interfaces research on vocal presence as an essential modality for perceived warmth and bonding, operationalizing Harlow’s contact comfort finding (organisms bond to warmth, not utility) as a functional design affordance: the harness delivers affective continuity across sessions as well as informational continuity, as demonstrated in working prototypes

Training Data Optimization & Curriculum Design

Architected Decomposition-Factorization-Recomposition (DFR) data schemas structuring therapeutic complexity into learnable form across 23 therapeutic traditions
Designed three complementary curricula: Universal Hierarchical Direction (UHD), Alternative Directional Window Curriculum (ADWC), and Rolling Recap Architecture (RRA), each sequencing pedagogical exposure toward clarity in complexity
Built Python-based generation pipeline producing 169,323 training samples (4.5B tokens) with 10⁴⁰⁺ unique therapeutic context capacity

Multi-Architecture Evaluation & Quantitative Benchmarking

Executed 8 controlled training runs comparing 3 base architectures, 2 layer-targeting strategies (middle vs. latter), and 3 curriculum configurations
Designed quantitative evaluation framework: validation loss/perplexity tracking, KV reconstruction loss, per-embedding kurtosis geometry analysis (Fisher excess kurtosis across 2,048 dimensions, 2,010 embeddings, 39-phase temporal bucketing)
Built polytheoretical provenance verification system: strict and lenient matching of 33,169 clinical labels against training corpus, with temporal quintile analysis tracking the transition from curriculum reproduction to novel clinical construction
Discovered post-convergence representational reorganization: after validation loss plateaus, KV embedding kurtosis continues declining (−16.6%), suggesting a structurally distinct phase of internal representational restructuring invisible to standard training metrics. Finding documented with pre-registered falsification criteria (rebound prediction); the predicted rebound did not occur over 200+ subsequent steps.

Key Findings

Parameterization threshold: higher-capacity models absorb complex therapeutic curricula more effectively, with preliminary evidence that the effect scales with model size
Middle-layer targeting produces 1.4–2.0x deeper convergence than latter-layer targeting on identical curriculum (controlled comparison on two architectures)
Architecture-dependent output fidelity: GLM achieved 6.5x faster throughput but exhibited systematic clinical precision failures (hallucinated risk indicators, construct reversals), while MiniMax and Gemma maintained diagnostic reliability
Training provenance shift: found-in-training rate drops 23 pp (MiniMax) and 29 pp (Gemma) from earliest to latest training quintile, indicating a transition from reproducing curriculum labels to constructing novel clinical formulations

Professional Experience

RELATIONSHIP ALIGNMENT SPECIALIST & BEHAVIOR ARCHITECT (Psychotherapist) 2021–Present

The Secure Relationship | Global (Remote)

Designed and implemented personalized therapeutic programs for high-functioning professionals (executives, surgeons, military, entrepreneurs) in high-stakes relational dynamics—1:1, couples, and group formats
Designed and delivered large-group educational curricula (lectures of ~100 participants) on socioaffective topics including intimacy, connection, intrapsychic self-awareness, interpersonal efficacy, and somatic integration
Developed AI-augmented clinical evaluation methodologies: built bespoke ontologies and prompt-engineered pipelines for post-session analysis, integrating 20+ custom Python scripts orchestrating multi-iteration prompt chains across therapeutic modalities
Created a custom GPT therapeutic chat agent providing between-session support, used continuously by clients across ~100 conversation threads
Built modular AI pipeline for post-session processing: automated multi-modal transcription/diarization, psychological analysis framework, treatment guidance with quantitative and qualitative evaluations

RELATIONSHIP ALIGNMENT SPECIALIST & BEHAVIOR ARCHITECT (Psychotherapist) 2019–2021

Smart Love Family Services | Chicago, IL

Provided long-term mental and behavioral health support to children, adults, couples, and families
Conducted assessments, treatment planning, and multi-modal therapeutic programs including virtual world integrations during the pandemic
Collaborated with interdisciplinary teams on behalf of clients and families

INSTRUCTOR & TEACHING SPECIALIST 2008–2016

Harlem Children's Zone & School Professionals | New York, NY

Designed curriculum and guided learners (ages 8–18) across subjects, specializing in special education for high-intelligence, high-needs children

Education

THE UNIVERSITY OF CHICAGO | Chicago, IL
Master's Degree, Social Work (Clinical Mental & Behavioral Health Interventions), June 2020

THE JUILLIARD SCHOOL | New York, NY
Evening Division, Music Theory & History, 2010–2012

CITY UNIVERSITY OF NEW YORK | New York, NY
Bachelor's Degree, Psychology, 2018

THE EUROPEAN GRADUATE SCHOOL | Saas-Fee, Switzerland
Master's Degree (awarded with Distinction), Philosophy, Art & Critical Thought, 2008*

THE LONDON SCHOOL OF ECONOMICS | London, England
Independent Auditing, Lectures on International & Civil Law, 2006

STANFORD UNIVERSITY | Palo Alto, CA
Summer Programs, Creative Writing, 2004

UNIVERSITY OF WISCONSIN | Milwaukee, WI
Psychology, Philosophy, & Creative Writing, 2001–2003

*Granted B.A. waiver by EGS director to enroll in master's program early based on academic background and 90+ undergraduate credits. Completed MA with Distinction and 4.0 GPA prior to completing BA (2018) and second MA (2020).

Publications

Socioaffective Alignment in Curriculum Learning for Ameliorative AI. Solo-authored; currently 100+ pages. Documents the design and execution of a domain-adaptive pre-training pipeline, from clinical ontology and synthetic data engineering through multi-architecture training to mechanistic interpretability and behavioral benchmarking. Reports post-convergence representational reorganization and cross-lingual transfer from English-only curricula. (2026, in preparation)

Architectures of Connection: Technology & Therapeutic Presence in Virtual Worlds. Essay and research blog. jnropsychotherapist.wixsite.com/vvandc (2020). Investigates mediated technological relationality as a therapeutic medium.

Developmental Estrangement & The Re-Emergence of Love. Paper on psychotherapeutic clinical interventions. The University of Chicago's Advocates' Forum, Vol. 2020.

The Philosophical Absence in Psychoanalytic Ontology: Becoming, Abandoned. Examines the ontological limits of psychoanalytic praxis, arguing that abandoned being and I-You relations constitute an essential dimension of amelioration that prescriptive analytic language structurally forecloses. The Journal of The International Association of Transdisciplinary Psychology, Vol. 1, Iss. 1, 2009.

Publication Reviews

Gabliteration: Adaptive Multi-Directional Neural Weight Modification for Selective Behavioral Alteration in Large Language Models. Pre-publication review for the author: Gökdeniz Gülmez (2025)

The Thinking Therapist: Training Large Language Models to Deliver Acceptance and Commitment Therapy Using Supervised Fine-Tuning and Odds Ratio Policy Optimization. Post-publication interview with the author: Talha Tahir (2025)

Presentations

Illinois Psychological Association Conference: Panel presentation on building empowering relationships, focused on attunement amidst difference. (2019)

Honors & Recognition

Included in 10,000 Best Young Minds Program, International Psychoanalytical Association, 2006

C.K. Williams, Pulitzer Prize-winning poet, described my writing as having “a good heart” and “precious voice,” 2007 (personal correspondence)

Award for Most Creative Instructor, Harlem Children's Zone, 2009

Carl Phillips, Pulitzer Prize-winning poet, described an operetta I wrote as “beautiful, arresting,” 2010 (personal correspondence)

Award for Excellence in ELA Instruction, Harlem Children's Zone, 2011

Dean's Distinguished Leadership Award, University of Chicago, 2018

Volunteer Service

Jan Hus Presbyterian Church – Homeless Outreach Program: Front desk and main point person for weekend homeless outreach and recovery meetings. New York, 2012–2014

The First Presbyterian Church – Homeless Shelter: Overnight guide, prepared and served meals, set up sleeping areas for homeless men. New York, 2011–2013

Holy Apostles Soup Kitchen & St. George's Soup Kitchen: Prepared meals for homeless communities, occasionally leading Harlem student volunteers. New York, 2011–2013

Curriculum Vitae