JOSHUA N. R. OLLSWANG
Summary
Independent alignment researcher building personalized AI systems that learn over time, adapt to individuals, and support human flourishing. Work spans RLHF and post-training pipelines (DAPT/SFT/RL across 5 architectures at up to 229B parameters), evaluation frameworks that track what models actually learn beyond surface metrics, and agent harnesses with persistent memory, user modeling, and long-horizon context compression. Built a multi-stage synthetic data pipeline (169K training samples, 4.5B tokens, scalable to 1040+ unique contexts) encoding human preference structures from 23 therapeutic traditions, and discovered post-convergence representational reorganization: internal representational geometry continues restructuring hundreds of steps after loss convergence, visible through embedding kurtosis analysis but invisible to standard metrics.
Licensed clinician and researcher (University of Chicago) with deep domain expertise in human preference modeling, behavioral adaptation, and the relational dynamics that determine whether personalized AI systems build genuine trust or erode it.
Skills
Python, MLX, CUDA, PEFT/LoRA (27B–229B), RLHF/reward modeling, DAPT/SFT/RL pipelines, mechanistic interpretability, evaluation framework design, preference-learning systems, multi-agent architectures, long-context memory systems.
Research
RLHF, Post-Training & Evaluation Frameworks
- Designed and executed multi-stage post-training pipeline (DAPT → SFT → RL) across 10+ adapter runs on 5 architectures (MiniMax M2 229B, Llama 3.3 70B, Gemma 3 27B, GLM 4.7 Flash 30B, Mistral Small 4 119B), with controlled comparisons of layer-targeting strategies, curriculum configurations, and quantization precision
- Built quantitative evaluation framework capturing what models actually learn: validation loss/perplexity tracking, KV reconstruction loss, per-embedding kurtosis geometry analysis (Fisher excess kurtosis across 2048 dimensions, 2,010 embeddings, 39-phase temporal bucketing) — revealing training dynamics invisible to standard metrics
- Built preference-learning verification system: strict and lenient matching of 33,169 clinical labels against training corpus, with temporal quintile analysis tracking the transition from reproducing preference patterns to constructing novel preference-aligned formulations
- Discovered post-convergence representational reorganization: after validation loss plateaus, embedding kurtosis continues declining (−16.6%), revealing a structurally distinct phase of internal representational restructuring — a long-horizon training dynamic invisible to standard metrics. Documented the finding with pre-registered falsification criteria, which were subsequently disconfirmed over 200+ additional steps
- Designed RL reward modeling through Teaching by Negation: reinforcement learning from contrastive human preference signals, training models to distinguish aligned from misaligned behavioral patterns across nuanced relational contexts
Multimodal Post-Training & SME RLHF (In Progress)
- Architecting multimodal post-training pipeline grounded in 450 longitudinal psychotherapy sessions (580+ hours, 2.47 TB) — extracting aligned audio, video, face, and text embeddings to train models that learn from richer interaction signals beyond text alone
- Designed two-phase training: (1) self-supervised cross-modal contrastive learning (InfoNCE across 4 modalities + temporal forecasting via CPC) to build shared representational space, (2) supervised fine-tuning on gold-annotated segments with continuous Valence/Arousal/Dominance/Salience predictions per modality
- Built face recognition and affect tracking pipeline: RetinaFace detection → SAM2 temporal tracking → ArcFace identity embeddings (512d) + HSEmotion affect embeddings (256d), with cross-session identity resolution and adversarial gradient reversal (λ=0.1) to prevent affect representations from collapsing into identity
- Designed SME (Subject Matter Expert) RLHF: bus-conditioned LLM receives real-time multimodal affect events and generates candidate policy distributions over actions, scored by therapeutic reward model — clinician SMEs provide preference feedback on action alignment, appropriateness, and predicted outcomes
- Built audio emotional salience pipeline: WavLM-Large (1024d) + emotion2vec (768d) + 7-dimensional prosody features, capturing tone, rhythm, breath, tremor, and paralinguistic signals that carry preference and emotional-state information invisible to text-only systems
Training Data & Preference-Learning Pipelines
- Architected Decomposition-Factorization-Recomposition (DFR) data schemas encoding human preference structures into learnable form across 23 behavioral and relational traditions — each tradition representing a distinct lens on human needs, values, and well-being
- Designed three complementary curricula — Universal Hierarchical Direction (UHD), Alternative Directional Window Curriculum (ADWC), and Rolling Recap Architecture (RRA) — sequencing preference-learning exposure to build progressively richer user models
- Built Python-based generation pipeline producing 169,323 training samples (4.5B tokens) with 1040+ unique context capacity, encoding rubrics and evaluation frameworks that capture long-term human values
Personalization, Long-Term Memory & User Modeling
- Designed and implemented Rolling Recap Architecture (RRA): a context-compression scheme enabling coherent personalized reasoning across 100+ sequential context windows, where each window compresses prior context into a structured user state carried forward through rolling summaries — serving as both training curriculum and persistent memory system at inference
- Built KV cache compression pipeline: 4096-token windowed processing of long-horizon interactions (up to 330K tokens across 130+ windows), compressing each window into 1024 KV cache positions with 512-token recaps, preserving preference signal across ultra-long sessions
- Designed and built a personalized agent system (17K+ lines) with multi-layer memory (KV cache compression + dual-model semantic/keyword retrieval + salience-weighted tracked memories + in-session deep search triggered by behavioral event detection), multi-step agent tool loop with read-only file system access, trained custom TTS voice profiles, WebSocket auth with session continuity
- Trained multiple affective voice profiles grounded in MIT Media Lab Fluid Interfaces research on vocal presence as an essential modality for perceived warmth and bonding — operationalizing Harlow's contact-comfort finding (attachment forms around warmth, not utility) as a functional design affordance for personalized AI
Key Findings
- Parameterization threshold: higher-capacity models absorb complex preference-alignment curricula more effectively, with preliminary evidence that this advantage grows with scale
- Middle-layer targeting produces 1.4–2.0x deeper loss convergence than late-layer targeting on an identical curriculum (controlled comparison across two architectures), suggesting alignment is best embedded in representational composition layers
- Architecture-dependent output fidelity: GLM achieved 6.5x faster throughput but exhibited systematic preference-alignment failures (hallucinated signals, construct reversals), while MiniMax and Gemma maintained reliability — demonstrating that speed and alignment quality can diverge
- Preference provenance shift: found-in-training rate drops 23pp (MiniMax) and 29pp (Gemma) from earliest to latest training quintile, indicating a transition from reproducing training labels to constructing novel preference-aligned formulations
Professional Experience
- Designed and implemented personalized intervention programs for high-functioning professionals (executives, surgeons, military, entrepreneurs) — modeling individual user preferences, adapting strategies based on explicit and implicit feedback signals, and tracking long-horizon outcomes across 1:1, couples, and group formats
- Designed and delivered large-group educational curricula (lectures for ~100 participants) on human relational dynamics, including attunement, trust-building, intrapsychic self-awareness, and interpersonal efficacy
- Developed AI-augmented evaluation methodologies: built bespoke preference ontologies and multi-iteration prompt-engineered pipelines for post-session analysis, integrating 20+ custom Python scripts orchestrating evaluation chains across behavioral modalities
- Created custom GPT agent providing personalized between-session support, used continuously by clients across ~100 conversation threads — an early prototype of adaptive AI that learns from ongoing interaction
- Built modular AI pipeline for post-session processing: automated multi-modal transcription/diarization, behavioral analysis framework, and personalized guidance with quantitative and qualitative evaluations
- Provided long-term behavioral health support to children, adults, couples, and families — modeling individual preferences and adapting intervention strategies based on ongoing feedback
- Conducted assessments, treatment planning, and multi-modal personalized programs including virtual world integrations during the pandemic
- Collaborated with interdisciplinary teams on behalf of clients and families
- Designed personalized curriculum and guided learners (ages 8–18) across subjects, specializing in adaptive instruction for high-intelligence, high-needs children
Education
Master's Degree, Social Work (Clinical Mental & Behavioral Health Interventions), June 2020
Evening Division, Music Theory & History, 2010–2012
Bachelor's Degree, Psychology, 2018
Master's Degree (awarded with Distinction), Philosophy, Art & Critical Thought, 2008*
Independent Auditing, Lectures on International & Civil Law, 2006
Summer Programs, Creative Writing, 2004
Psychology, Philosophy, & Creative Writing, 2001–2003
*Granted a BA waiver by the EGS director to enroll in the master's program early based on academic background and 90+ undergraduate credits. Completed the MA with Distinction and a 4.0 GPA prior to completing the BA (2018) and a second MA (2020).