This paper has been superseded.
The content of this paper and of Entanglement-Optimal Fine-Tuning (AI-28) has been consolidated into a single paper:
Entanglement Under Fine-Tuning: Architecture-Dependent Collapse, Scale Thresholds, and Entanglement-Optimal Adaptation
DOI: 10.5281/zenodo.19430945
Original Executive Summary
Structural entanglement, the geometric property that every informative direction in a transformer's activation space carries all concept dimensions simultaneously, was established as generic in prior work. This paper shows that natural-language fine-tuning selectively destroys this property in Qwen-2.5-Coder-32B, driving entanglement intensity (EI) from 0.667 to 0.000 across 8 independent seeds while preserving domain classification accuracy (0.90–0.98). The collapse is a sharp phase transition occurring between training steps 2000 and 3500.
The same B3 protocol (code primary + natural-language secondary) applied to four other architectures at ~7B scale produces qualitatively different responses: CodeLlama increases EI (+27–39%), DeepSeek decreases it modestly (-14%), Mistral is essentially unchanged (+0.4%), and Qwen-7B shows a partial decrease (-32%). A scale ladder within the Qwen family reveals graded susceptibility, with a phase boundary between 14B and 32B.
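As a purely illustrative aid (not the metric defined in AI-26 or AI-28, which is not reproduced here), the sketch below shows one way an EI-like score could be computed: treat each informative direction as a unit vector in activation space, represent each concept by an orthonormal subspace basis, and average the fraction of concept subspaces onto which each direction projects non-negligibly. The function name, tolerance, and averaging scheme are assumptions made for this example only.

```python
import numpy as np

def ei_sketch(directions, concept_bases, tol=1e-3):
    """Toy entanglement-intensity-style score (illustration only).

    directions    : list of 1-D activation-space vectors (informative directions)
    concept_bases : list of (dim, k) arrays, each an orthonormal basis
                    for one concept subspace
    Returns the mean fraction of concept subspaces each direction projects
    onto with norm above `tol` (1.0 = fully entangled, 0.0 = concept-pure).
    """
    fractions = []
    for d in directions:
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)          # unit vector in activation space
        carried = sum(
            np.linalg.norm(B.T @ d) > tol  # does d carry this concept subspace?
            for B in concept_bases
        )
        fractions.append(carried / len(concept_bases))
    return float(np.mean(fractions))
```

Under this toy definition, directions that remain spread across every concept subspace score near 1.0, while directions that become concept-pure after fine-tuning score near 0.0, matching the direction of the collapse described above; the reported 0.667 and 0.000 values come from the paper's actual metric, not this sketch.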
Key References
- McEntire (2026) — Structural Entanglement (AI-26): establishes the geometric phenomenon across four architectures
- McEntire (2026) — The Entanglement Theorem (AI-27): formal proof that entanglement is generic under a non-degeneracy condition
- McEntire (2026) — Entanglement-Optimal Fine-Tuning (AI-28): discovers the architecture-specific collapse under B3
- McEntire (2026) — Entangled Directions (AI-25): discovers the discrimination-activation dissociation