# Classical vs. Quantum Justifications

A rigorous, thesis-grade justification for every architectural decision — why each component is classical, why one is quantum, and what it would concretely take to make the classical parts quantum.
## Decision Framework
Every component was assigned to classical or quantum execution based on three criteria:
| Criterion | Gate Question |
|---|---|
| Algorithmic fit | Does a known quantum algorithm provide a structural advantage here? |
| Hardware feasibility | Can this execute on a 5-qubit, depth-11 NISQ circuit within coherence time? |
| Training loop frequency | How many times per episode does this execute? Can NISQ latency absorb it? |
Components that fail any criterion default to classical. The quantum budget is spent on a single, high-value component (the policy network) rather than spread thin.
## Component-by-Component Analysis
### Feature Extraction — CLASSICAL
What: Converts raw trajectory coordinates into a 5D/8D state vector.
Why classical:

1. No quantum subroutine exists. Features require sequential geometric computation along an ordered trajectory — curvature estimation, running averages, MDL cost ratios.
2. Executes ~50,000 times per epoch. Even simulated quantum at 100 ms/circuit would add ~83 minutes. Classical: <1 ms total.
3. Trajectory structure is inherently sequential. You cannot "look backward along a trajectory" in superposition.
What it would take to make this quantum:

- qRAM to hold trajectory points in superposition — theoretically exists, not built
- ~15 qubits just for data (100 points × 2D), before computation qubits
- Zero algorithmic gain for conditional geometric queries
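For concreteness, here is what the sequential dependence looks like for one such feature — a turning-angle (curvature-like) estimate at each interior point. The function name is illustrative, not the thesis implementation; the point is that each value depends on a point's neighbours along the ordered trajectory.

```python
import math

def turning_angles(points):
    """Turning angle at each interior trajectory point, from consecutive
    direction vectors. Inherently sequential: each value depends on the
    point's predecessor and successor along the ordered trajectory."""
    angles = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a = math.atan2(y1 - y0, x1 - x0)   # heading into the point
        b = math.atan2(y2 - y1, x2 - x1)   # heading out of the point
        d = abs(b - a)
        angles.append(min(d, 2 * math.pi - d))  # wrap to [0, pi]
    return angles
```

A straight segment yields an angle of 0; a right-angle turn yields π/2. Running averages and MDL cost ratios chain onto these values in the same ordered fashion.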
### Policy Network (Q-Value Estimation) — QUANTUM
What: Maps a 5D state → 2 Q-values.
Why quantum is the research choice:

1. Parameter efficiency — 20 params vs. ~450 classical (a 22.5× reduction). A 5-qubit circuit operates in a 2⁵ = 32-dimensional Hilbert space, which is the source of the empirically observed parameter efficiency.
2. Natural input mapping — 5 features → 5 qubits, one-to-one via angle encoding. No padding, no truncation.
3. NISQ-feasible — 5 qubits, depth ~11, 8 CNOTs total. Fits within Eagle (~100 μs) and Heron (~200 μs) coherence times.
4. Empirical support for low-dimensional approximation — low-dimensional continuous function approximation is the regime where VQCs have shown empirical advantages (Schuld et al., 2021).
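The exact gate layout of the policy circuit is not spelled out above, so the toy statevector simulation below assumes one plausible arrangement matching the stated budget: angle encoding on 5 qubits, four variational RY layers of 5 parameters each (20 parameters), two linear CNOT chains (8 CNOTs), and Q-values read out as ⟨Z⟩ on the first two qubits. All names (`vqc_qvalues`, the helpers) are illustrative.

```python
import numpy as np

def _ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def _apply_1q(state, gate, q, n):
    # Apply a single-qubit gate on qubit q of an n-qubit statevector.
    state = state.reshape([2] * n)
    state = np.tensordot(gate, state, axes=([1], [q]))
    return np.moveaxis(state, 0, q).reshape(-1)

def _apply_cnot(state, ctrl, tgt, n):
    # Flip the target qubit within the control=1 subspace.
    state = state.reshape([2] * n).copy()
    idx = [slice(None)] * n
    idx[ctrl] = 1
    sub = state[tuple(idx)]
    state[tuple(idx)] = np.flip(sub, axis=tgt if tgt < ctrl else tgt - 1)
    return state.reshape(-1)

def vqc_qvalues(features, params):
    """Angle-encode 5 features, apply 4 RY layers (20 params) with two
    entangling CNOT chains (8 CNOTs), return <Z> on qubits 0 and 1."""
    n = 5
    assert params.shape == (4, 5)              # 20 trainable parameters
    state = np.zeros(2 ** n); state[0] = 1.0
    for q, x in enumerate(features):           # angle encoding: 5 -> 5
        state = _apply_1q(state, _ry(x), q, n)
    for layer in range(4):
        for q in range(n):
            state = _apply_1q(state, _ry(params[layer, q]), q, n)
        if layer % 2 == 0:                     # 2 chains x 4 CNOTs = 8
            for q in range(n - 1):
                state = _apply_cnot(state, q, q + 1, n)
    probs = np.abs(state) ** 2
    z = []
    for q in range(2):                         # <Z> expectation per output
        bits = (np.arange(2 ** n) >> (n - 1 - q)) & 1
        z.append(np.sum(probs * np.where(bits == 0, 1.0, -1.0)))
    return np.array(z)
```

With all angles zero the circuit is the identity, so both Q-values are ⟨Z⟩ = 1; in general both outputs lie in [−1, 1] and would be rescaled for Q-learning targets.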
Classical control experiments:

- Control A: 5→4→2 MLP (30 params — parameter-matched)
- Control B: 5→64→2 MLP (~450 params — architecture-matched)
- Control C: 5→2 linear (12 params — linearity test)

Critical: all controls must be trained with SPSA, not backprop, so that optimizer noise is matched and any performance gap reflects the architecture rather than the training method.
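SPSA estimates the full gradient from only two loss evaluations per step, regardless of parameter count, which is why it is the standard choice when each evaluation means running a quantum circuit. A minimal sketch (function name and hyperparameter `c` are illustrative):

```python
import numpy as np

def spsa_gradient(loss, theta, c=0.1, rng=None):
    """One SPSA gradient estimate: perturb ALL parameters simultaneously
    with a random +/-1 (Rademacher) vector, so only two loss evaluations
    are needed no matter how many parameters there are."""
    rng = np.random.default_rng(rng)
    delta = rng.choice([-1.0, 1.0], size=theta.shape)
    diff = loss(theta + c * delta) - loss(theta - c * delta)
    return diff / (2 * c) * (1.0 / delta)   # elementwise; 1/delta = delta here
```

A single estimate is noisy, but it is unbiased up to O(c²): averaged over many draws it converges to the true gradient, which is what the descent loop relies on.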
### Distance Computation (IED) — CLASSICAL
What: Integrated Euclidean Distance between sub-trajectories and centroids.
Why classical:

1. Incremental O(1) updates. No quantum analog exists for incrementally updating a running computation.
2. Markov-safe. RL requires a reward that depends only on the current state/action. A quantum swap test would require full re-encoding per step.
3. Geometric, not algebraic. IED measures the "area between trajectories" — coordinate arithmetic, not inner products.
Quantum cost analysis:

- 100-point trajectory × 10 clusters = 1,000 circuit evaluations per episode
- At 100 ms each: 100 seconds per episode vs. <1 ms classical
- A 100,000× slowdown for identical output
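The incremental update that makes the classical IED O(1) per step can be sketched as follows, under the simplifying assumption that the centroid polyline is sampled at the same indices as the growing sub-trajectory. Class and method names are illustrative:

```python
import math

class IncrementalIED:
    """Running Integrated Euclidean Distance between a growing
    sub-trajectory and a fixed centroid polyline. Appending a point
    costs O(1): the accumulated history is never re-encoded, which
    is exactly the property a quantum swap test cannot reproduce."""

    def __init__(self, centroid):
        self.centroid = centroid   # list of (x, y) centroid samples
        self.n = 0                 # points consumed so far
        self.total = 0.0           # running IED

    def append(self, point):
        self.total += math.dist(point, self.centroid[self.n])  # O(1) step
        self.n += 1
        return self.total
```

Each RL step reads `total` directly; a swap-test version would instead re-encode all `n` points into a fresh circuit at every step.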
### Clustering (K-Means) — CLASSICAL
What: Groups sub-trajectory segments. Runs at episode-end only.
Why classical:

1. No quantum centroid update. The k-means update is an arithmetic mean, and there is no quantum advantage for computing means.
2. Runs infrequently. A single call per epoch vs. ~50,000 VQ-DQN evaluations — <0.01% of runtime.
3. Debugging requires determinism. Cluster assignments must be inspectable and reproducible.
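The centroid update at issue is plain averaging. For reference, one deterministic Lloyd iteration — assign, then mean — in a few lines (the function name is illustrative):

```python
import numpy as np

def kmeans_step(points, centroids):
    """One Lloyd iteration: assign each segment-feature vector to its
    nearest centroid, then update each centroid as the arithmetic mean
    of its members. Fully deterministic given the inputs."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    new_centroids = np.array([
        points[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
        for k in range(len(centroids))
    ])
    return labels, new_centroids
```

Every intermediate (distances, labels, means) is a plain array that can be logged and inspected — the reproducibility property the debugging argument depends on.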
### Reward Computation — CLASSICAL
What: R_t = α·ΔOD + β·sharpness − penalty
Pure floating-point arithmetic. Making this quantum: a 10⁶× slowdown for identical output.
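For concreteness, the reward really is a handful of floating-point operations. The α and β defaults below are illustrative placeholders, not the thesis values:

```python
def reward(delta_od, sharpness, penalty, alpha=1.0, beta=0.5):
    """R_t = alpha * delta_OD + beta * sharpness - penalty.
    Two multiplications, one addition, one subtraction."""
    return alpha * delta_od + beta * sharpness - penalty
```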
### Boundary Sharpness — CLASSICAL
What: arccos(v₁ · v₂ / (|v₁| × |v₂|)) / π — approximately 10 FP operations.
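A direct implementation of this formula for 2D direction vectors — the clamp is a standard guard against floating-point drift pushing the cosine outside [−1, 1]:

```python
import math

def boundary_sharpness(v1, v2):
    """Normalised angle between consecutive direction vectors:
    arccos(v1 . v2 / (|v1| |v2|)) / pi, in [0, 1]."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    cos = max(-1.0, min(1.0, dot / norm))   # clamp against FP drift
    return math.acos(cos) / math.pi
```

A straight continuation scores 0, a right-angle turn 0.5, a full reversal 1.0.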
Quantum version (swap test): ~100 circuit evaluations for less accuracy — roughly 10,000× slower.
Thesis significance: Boundary sharpness is the key geometric signal. Its classical implementation preserves trajectory semantics in Euclidean space where they belong. The quantum component handles the decision (should I cut?), not the evidence (how sharp is this turn?).
## Summary Table
| Component | Assignment | One-Line Justification |
|---|---|---|
| Feature extraction | Classical | Sequential geometry on ordered trajectory; no quantum speedup exists |
| Policy network | Quantum | 22× parameter reduction with empirically competitive performance; clean 5→5 feature-to-qubit mapping; NISQ-feasible |
| Distance (IED) | Classical | Incremental O(1) updates have no quantum analog; re-encoding costs 100,000× |
| K-means clustering | Classical | No quantum centroid update; runs once per epoch |
| Reward computation | Classical | A few FP operations; quantum 10⁶× slower |
| Boundary sharpness | Classical | 10 FP ops vs. 100+ circuits for same result |
| Swap test (optional) | Quantum | Verification probe only; does not affect training loop |
## The Argument in One Paragraph
Q-RLSTC uses quantum computation for exactly one component: the policy function approximator. This is the only component where quantum structure provides an empirically observed benefit (parameter efficiency), where input dimensions naturally map to qubits (5 features → 5 qubits), and where NISQ constraints are satisfied (depth 11, 8 CNOTs). Every other component is either inherently sequential (trajectories), requires incremental updates (IED), or is trivially cheap (reward subtraction). Making these quantum would introduce orders-of-magnitude overhead for zero algorithmic improvement.