API Reference


Agent Configuration — rl/vqdqn_agent.py

AgentConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `version` | str | "A" | "A" (5q standard), "B" (8q multi-observable), "D" (5q standard), "E" (5q + input scaling + anti-BP) |
| `n_qubits` | int | 5 | Qubits in VQ-DQN (auto-set to 8 for Version B) |
| `n_layers` | int | 2 | Variational layers |
| `gamma` | float | 0.90 | Discount factor |
| `epsilon_start` | float | 1.0 | Initial exploration rate |
| `epsilon_min` | float | 0.1 | Minimum exploration rate |
| `epsilon_decay` | float | 0.99 | Per-episode decay |
| `shots` | int | 512 | Measurement shots |
| `use_double_dqn` | bool | True | Enable Double DQN |
| `target_update_freq` | int | 10 | Episodes between target sync |
| `entanglement` | str | "linear" | "linear", "circular", "full", "none" |
| `exploration_mode` | str | "epsilon_greedy" | "epsilon_greedy" or "boltzmann" |
| `q_clip_range` | float | 50.0 | Symmetric Q-value clipping bound |
| `optimistic_cut_bias` | float | 0.0 | Extra initial bias for CUT action |
| `use_input_scaling` | bool | False | Learnable per-feature scale+shift |
| `anti_barren_plateau` | bool | False | Near-zero circuit param init |
| `use_soft_targets` | bool | False | Entropy-regularized targets (soft-DQN) |
| `soft_alpha` | float | 0.1 | Entropy temperature for soft targets |
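
With the defaults above, exploration anneals multiplicatively each episode and is floored at epsilon_min. A minimal sketch of the implied schedule (the actual update lives in VQDQNAgent.decay_epsilon; `epsilon_at` is a hypothetical stand-alone form):

```python
def epsilon_at(episode, start=1.0, decay=0.99, minimum=0.1):
    """Exploration rate after `episode` multiplicative decays,
    floored at `minimum` (defaults match AgentConfig)."""
    return max(minimum, start * decay ** episode)
```

With decay 0.99 the floor of 0.1 is reached after roughly 230 episodes.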

ClassicalAgentConfig — rl/spsa_classical_agent.py

| Field | Type | Default | Description |
|---|---|---|---|
| `hidden_sizes` | List[int] | [64] | Hidden layer sizes (empty = linear) |
| `feature_transform` | str | "none" | "none" (standard MLP) or "rbf" (Random Fourier Features) |
| `rbf_dim` | int | 10 | Number of RBF random features |
| `gamma` | float | 0.90 | Discount factor |
| `epsilon_start` | float | 1.0 | Initial exploration rate |
| `epsilon_min` | float | 0.1 | Minimum exploration rate |
| `epsilon_decay` | float | 0.99 | Per-episode decay |
| `use_double_dqn` | bool | True | Enable Double DQN |
| `target_update_freq` | int | 10 | Episodes between target sync |

AdamAgentConfig — rl/adam_classical_agent.py

| Field | Type | Default | Description |
|---|---|---|---|
| `hidden_sizes` | List[int] | [64] | Hidden layer sizes |
| `gamma` | float | 0.90 | Discount factor |
| `lr` | float | 1e-3 | Learning rate |
| `beta1` | float | 0.9 | Adam β₁ |
| `beta2` | float | 0.999 | Adam β₂ |
| `max_grad_norm` | float | 10.0 | Gradient clipping norm |

VQ-DQN Agent — rl/vqdqn_agent.py

VQDQNAgent

| Method | Signature | Description |
|---|---|---|
| `__init__` | (config: AgentConfig, backend: AerSimulator, seed: int) | Initialise circuit, params, target params |
| `get_q_values` | (state, use_target=False) → ndarray[2] | Run circuit, return Q-values |
| `act` | (state, greedy=False) → int | ε-greedy or Boltzmann action selection |
| `update` | (states, actions, rewards, next_states, dones) → float | Batched SPSA gradient step on Huber TD loss |
| `compute_targets_batch` | (rewards, next_states, dones) → ndarray | Compute TD targets for batch |
| `update_target_network` | () | Copy online params → target params |
| `decay_epsilon` | () | Decay ε and Boltzmann temp; periodic target sync |
| `get_circuit_info` | () → CircuitInfo | Circuit structure summary |
| `save_checkpoint` | (path: str) | Save agent state to .npz file |
| `load_checkpoint` | (path: str) | Load agent state from .npz file |
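
With use_double_dqn=True, compute_targets_batch follows the standard Double-DQN rule: the online network selects the next action, the target network evaluates it. A sketch under that assumption (`q_online`/`q_target` are hypothetical stand-ins for calls through the two parameter sets; the real method may additionally clip to q_clip_range):

```python
import numpy as np

def double_dqn_targets(rewards, next_states, dones, gamma, q_online, q_target):
    """Double-DQN TD targets. q_online/q_target map a state to an
    ndarray of per-action Q-values (shape [2] for KEEP/CUT here)."""
    targets = np.empty(len(rewards))
    for i, (r, s_next, d) in enumerate(zip(rewards, next_states, dones)):
        if d:
            # Terminal transition: no bootstrap term.
            targets[i] = r
        else:
            # Online net picks the action, target net scores it.
            a_star = int(np.argmax(q_online(s_next)))
            targets[i] = r + gamma * q_target(s_next)[a_star]
    return targets
```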

Quantum Circuit — quantum/vqdqn_circuit.py

VQDQNCircuitBuilder

| Method | Signature | Returns |
|---|---|---|
| `__init__` | (n_qubits, n_layers, use_data_reuploading, entanglement) | |
| `build_circuit` | (state, params, add_measurements) → QuantumCircuit | Parameterised circuit |
| `get_circuit_info` | (params) → CircuitInfo | Circuit metrics |

Module-Level Functions

| Function | Signature | Description |
|---|---|---|
| `angle_encode` | (features, scaling='arctan') → ndarray | Encode features as rotation angles |
| `build_vqdqn_circuit` | (state, params, n_qubits, n_layers, use_data_reuploading, add_measurements) → QuantumCircuit | Convenience wrapper |
| `evaluate_q_values` | (state, params, backend, shots, ...) → ndarray[2] | Run circuit and return Q-values |
| `q_values_batch` | (states, params, n_qubits, n_layers, ...) → ndarray[B,2] | Fast batched Q-value computation (pure numpy) |
| `compute_expectation_from_counts` | (counts, shots, qubit_idx, n_qubits) → float | ⟨Zᵢ⟩ ∈ [-1, 1] |
| `compute_parity_expectation` | (counts, shots, qubit_a, qubit_b, n_qubits) → float | ⟨ZₐZ_b⟩ ∈ [-1, 1] |
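
compute_expectation_from_counts reduces a measurement counts dictionary to ⟨Zᵢ⟩. A sketch, assuming Qiskit's little-endian bitstring convention (qubit i is the (i+1)-th character from the right of each counts key):

```python
def z_expectation(counts, shots, qubit_idx, n_qubits):
    """<Z_i> from a Qiskit-style counts dict: +1 for outcome '0',
    -1 for outcome '1' on the chosen qubit, averaged over shots."""
    total = 0.0
    for bitstring, count in counts.items():
        bit = bitstring[n_qubits - 1 - qubit_idx]  # little-endian index
        total += count if bit == '0' else -count
    return total / shots
```

compute_parity_expectation works the same way with the product of the two qubits' ±1 eigenvalues per bitstring.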

SPSA Optimizer — rl/spsa.py

SPSAConfig

| Field | Type | Default | Description |
|---|---|---|---|
| `A` | int | 20 | Stability constant |
| `a` | float | 0.12 | Initial learning rate scale |
| `c` | float | 0.08 | Initial perturbation magnitude |
| `alpha` | float | 0.602 | Learning rate decay exponent |
| `gamma` | float | 0.101 | Perturbation decay exponent |
| `max_iter` | int | 100 | Maximum iterations |
| `seed` | int | 42 | Random seed |
| `use_momentum` | bool | True | Enable momentum-SPSA |
| `momentum` | float | 0.9 | Momentum coefficient (β) |
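
These fields map onto Spall's standard SPSA gain sequences, aₖ = a / (k + 1 + A)^α and cₖ = c / (k + 1)^γ; whether rl/spsa.py uses exactly this parameterisation is an assumption:

```python
def spsa_gains(k, A=20, a=0.12, c=0.08, alpha=0.602, gamma=0.101):
    """Learning rate a_k and perturbation size c_k at iteration k,
    using the SPSAConfig defaults above."""
    a_k = a / (k + 1 + A) ** alpha
    c_k = c / (k + 1) ** gamma
    return a_k, c_k
```

The exponents 0.602 and 0.101 are the values Spall recommends for finite-sample performance.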

SPSAOptimizer

| Method | Signature | Description |
|---|---|---|
| `__init__` | (A, a, c, alpha, gamma, max_grad_norm, seed, use_momentum, momentum, n_perturbations, use_crn, crn_base_seed, param_scales) | Initialise decay schedules |
| `step` | (loss_fn, params) → (params, grad_norm) | One SPSA update step |
| `compute_gradient` | (loss_fn, params) → ndarray | Estimate gradient (optionally momentum-averaged) |
| `optimize` | (loss_fn, initial_params, max_iter, tolerance, callback) → (params, loss) | Full optimization loop |
| `reset` | () | Reset iteration counter and momentum buffer |
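
A minimal end-to-end sketch of the SPSA update (single Rademacher perturbation, no momentum, CRN, or gradient clipping), driving a toy quadratic loss toward zero with the SPSAConfig default schedules:

```python
import numpy as np

def spsa_step(loss_fn, params, a_k, c_k, rng):
    """One SPSA update: two loss evaluations at params ± c_k·delta
    give a gradient estimate along a random ±1 direction."""
    delta = rng.choice([-1.0, 1.0], size=params.shape)
    diff = loss_fn(params + c_k * delta) - loss_fn(params - c_k * delta)
    # delta_i = ±1, so multiplying by delta equals dividing by it.
    g_hat = diff / (2 * c_k) * delta
    return params - a_k * g_hat

rng = np.random.default_rng(42)
params = np.array([3.0, -2.0])
loss = lambda p: float(np.sum(p ** 2))
for k in range(200):
    a_k = 0.12 / (k + 1 + 20) ** 0.602  # a / (k + 1 + A)^alpha
    c_k = 0.08 / (k + 1) ** 0.101       # c / (k + 1)^gamma
    params = spsa_step(loss, params, a_k, c_k, rng)
```

Only two loss evaluations per step, regardless of dimension, is what makes SPSA attractive when each evaluation means running a shot-based quantum circuit.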

Replay Buffer — rl/replay_buffer.py

Experience

class Experience(NamedTuple):
    state: np.ndarray
    action: int
    reward: float
    next_state: np.ndarray
    done: bool

ReplayBuffer

| Method | Signature | Description |
|---|---|---|
| `__init__` | (max_size: int = 5000, seed: int = 42) | Initialise circular buffer |
| `add` | (state, action, reward, next_state, done) | Add experience |
| `sample` | (batch_size: int) → list[Experience] | Uniform random sample |
| `sample_batch` | (batch_size: int) → tuple[ndarray×5] | Sample as numpy arrays (states, actions, rewards, next_states, dones) |
| `sample_batch_stratified` | (batch_size, min_cut_quota=0.3) → tuple[ndarray×5] | Sample with minimum CUT action quota |
| `is_ready` | (min_size: int) → bool | len(buffer) >= min_size |
| `clear` | () | Clear all experiences |
| `__len__` | () → int | Current buffer size |
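
A minimal buffer matching this interface, assuming FIFO eviction via collections.deque; the real ReplayBuffer additionally offers numpy batching and stratified sampling:

```python
import random
from collections import deque, namedtuple

Experience = namedtuple("Experience", "state action reward next_state done")

class MiniReplayBuffer:
    """Circular experience buffer: oldest entries are evicted first."""
    def __init__(self, max_size=5000, seed=42):
        self.buffer = deque(maxlen=max_size)  # deque handles eviction
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append(Experience(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sample without replacement.
        return self.rng.sample(list(self.buffer), batch_size)

    def is_ready(self, min_size):
        return len(self.buffer) >= min_size

    def __len__(self):
        return len(self.buffer)
```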

Trajectory Distance — clustering/trajdistance.py

Core IED Functions

| Function | Signature | Description |
|---|---|---|
| `traj2traj_ied` | (pts1: List[Point], pts2: List[Point]) → float | Full IED between two trajectories |
| `incremental_ied` | (traj1, traj2, k_dict, k, i, sp_i) → dict | Incremental IED update (O(1) per step) |
| `incremental_mindist` | (traj_pts, start, curr, k_dict, cluster_dict) → (dist, id) | Nearest cluster via incremental IED |
| `line2line_ied` | (p1s, p1e, p2s, p2e) → float | Segment-pair distance |
| `get_static_ied` | (points, x, y, t1, t2) → float | Static point-to-trajectory IED |
| `timed_traj` | (points, ts, te) → Optional[Trajectory] | Time-windowed sub-trajectory extraction |

MDL Cost

| Function | Signature | Description |
|---|---|---|
| `traj_mdl_comp` | (points, start_index, curr_index, mode) → float | MDL cost ("simp" or "orign" mode) |

Distance Classes

| Class | Method | Description |
|---|---|---|
| `FrechetDistance` | compute(traj_c, traj_q) → float | Discrete Fréchet distance |
| `DtwDistance` | compute(traj_c, traj_q) → float | Dynamic Time Warping distance |
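
FrechetDistance.compute can be sketched with the classic O(n·m) dynamic program; this version takes raw (x, y) point lists rather than Trajectory objects:

```python
import numpy as np

def discrete_frechet(traj_c, traj_q):
    """Discrete Fréchet distance between two point sequences:
    the minimal 'leash length' over all monotone couplings."""
    P, Q = np.asarray(traj_c, float), np.asarray(traj_q, float)
    n, m = len(P), len(Q)
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    ca = np.full((n, m), np.inf)
    ca[0, 0] = d[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], d[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], d[0, j])
    for i in range(1, n):
        for j in range(1, m):
            ca[i, j] = max(min(ca[i - 1, j], ca[i, j - 1], ca[i - 1, j - 1]),
                           d[i, j])
    return float(ca[-1, -1])
```

DtwDistance follows the same DP skeleton with sum in place of max.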

Pickle Data Loader — clustering/pickle_loader.py

| Function | Signature | Description |
|---|---|---|
| `load_trajectories` | (path, limit=None) → List[Trajectory] | Load pre-processed trajectories |
| `load_raw_trajectories` | (path, limit=None) → list | Load as raw RLSTCcode Traj objects |
| `load_cluster_centers` | (path) → (Dict, float) | Load cluster centers (Q-RLSTC format) |
| `load_cluster_centers_raw` | (path) → (Dict, float) | Load in MDP.py's native dict format |
| `load_subtrajectories` | (path) → List[Trajectory] | Load TRACLUS sub-trajectories |
| `load_test_set` | (path) → List[Trajectory] | Load held-out test/validation sets |
| `list_available_datasets` | () → Dict[str, List[str]] | List available pickle files in data dir |

Preprocessing — data/preprocessing.py

| Function | Signature | Description |
|---|---|---|
| `simplify_trajectory` | (trajectory: Trajectory) → Trajectory | Greedy MDL-based simplification |
| `simplify_all` | (trajectories) → List[Trajectory] | Simplify all trajectories |
| `preprocess_tdrive` | (raw, max_len, min_len, simplify) → List[Trajectory] | Full pipeline |
| `filter_by_coordinates` | (trajs, lon_range, lat_range) → list | Geographic bounding box filter |
| `normalize_locations` | (trajs) → list | Z-score normalize spatial coords |
| `normalize_time` | (trajs) → list | Z-score normalize timestamps |
| `arrays_to_trajectories` | (data) → List[Trajectory] | Convert [lon,lat,time] → Trajectory |
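
normalize_locations and normalize_time both apply z-score normalisation; the core transform, sketched over a raw array rather than Trajectory objects:

```python
import numpy as np

def zscore(values):
    """Z-score normalisation: subtract the mean, divide by the
    standard deviation (guarding against constant input)."""
    arr = np.asarray(values, float)
    std = arr.std()
    return (arr - arr.mean()) / (std if std > 0 else 1.0)
```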

Clustering — clustering/

ClassicalKMeans — classical_kmeans.py

| Method | Signature | Description |
|---|---|---|
| `__init__` | (n_clusters, max_iter, convergence_threshold, seed) | Initialize k-means |
| `fit` | (data: ndarray) → KMeansResult | Run k-means++ |
| `predict` | (data: ndarray) → ndarray | Assign clusters |

Metrics — metrics.py

| Function | Signature | Description |
|---|---|---|
| `overall_distance` | (data, centroids, labels) → float | Root-mean-square distance |
| `silhouette_score` | (data, labels) → float | Cluster quality ∈ [-1, 1] |
| `segmentation_f1` | (predicted, true, tolerance) → (precision, recall, f1) | Boundary detection F1 (returns tuple) |
| `incremental_od_update` | (current_od, n_segments, new_segment_cost) → float | Efficient reward-time OD update |
| `od_improvement_reward` | (od_before, od_after, scale) → float | Reward from OD improvement |
| `weighted_valcr` | (per_segment_ods, per_segment_lengths, basesim, epsilon) → float | Length-weighted ValCR |
| `random_policy_advantage` | (agent_valcr, random_valcr) → float | Δ_rand advantage metric |
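
overall_distance can be read as the root-mean-square distance from each point to its assigned centroid; a sketch (the exact normalisation in metrics.py may differ):

```python
import numpy as np

def overall_distance(data, centroids, labels):
    """RMS point-to-centroid distance: sqrt(mean over points of the
    squared Euclidean distance to the centroid given by labels)."""
    diffs = data - centroids[labels]              # fancy-index per point
    return float(np.sqrt(np.mean(np.sum(diffs ** 2, axis=1))))
```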

Random Frontier — clustering/random_frontier.py

| Method | Signature | Description |
|---|---|---|
| `__init__` | (fold_basesim, epsilon) | Initialize (call finalize() to build) |
| `add_point` | (cut_pct, val_cr) | Add raw observation |
| `finalize` | (n_bins, smoothing) | Bin raw points into frontier curve |
| `interpolate` | (cut_pct) → float | Get frontier ValCR at CUT budget |
| `advantage` | (agent_valcr, agent_cut_pct) → float | Budget-matched Δ_rand |
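
interpolate can be sketched as a piecewise-linear lookup over the binned frontier via np.interp (`frontier_interpolate` is a hypothetical stand-alone form of it; the real class holds the binned curve internally):

```python
import numpy as np

def frontier_interpolate(cut_pcts, valcrs, query_pct):
    """Frontier ValCR at an agent's CUT budget, by piecewise-linear
    interpolation over (cut_pct, ValCR) pairs from finalize()."""
    order = np.argsort(cut_pcts)  # np.interp needs ascending x
    return float(np.interp(query_pct,
                           np.asarray(cut_pcts, float)[order],
                           np.asarray(valcrs, float)[order]))
```

advantage then compares the agent's ValCR against this interpolated baseline at the same CUT budget.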

Backends — quantum/backends.py

BackendFactory

| Method | Signature | Description |
|---|---|---|
| `get_ideal_backend` | () → AerSimulator | Noiseless statevector backend |
| `get_simple_noise_model` | (single_qubit_error, two_qubit_error, readout_error) → NoiseModel | Depolarizing noise |
| `get_thermal_noise_model` | (t1, t2, gate_time_1q, gate_time_2q, ...) → NoiseModel | Thermal + depolarizing noise |
| `get_ibm_eagle_noise_model` | () → NoiseModel | IBM Eagle r3 approximation |
| `get_ibm_heron_noise_model` | () → NoiseModel | IBM Heron approximation |
| `get_noisy_backend` | (noise_model) → AerSimulator | Noisy simulator |
| `get_noise_model_by_name` | (name) → Optional[NoiseModel] | Named lookup: "ideal", "simple", "thermal", "eagle", "heron" |

Module-Level Function

| Function | Signature | Description |
|---|---|---|
| `get_backend` | (mode, noise_model_name, device_name) → AerSimulator | Backend factory: "ideal", "noisy_sim", "ibm_runtime" |

Supporting Modules

Adaptive Shots — rl/adaptive_shots.py

| Method | Signature | Description |
|---|---|---|
| `get_shots` | (q_margin: float) → int | Determine shot count from Q-value margin |
| `get_stats` | () → dict | Shot allocation statistics |
| `reset` | () | Clear history for new episode |

DROP Action — rl/drop_action.py

| Property/Method | Signature | Description |
|---|---|---|
| `n_actions` | → int | 3 when enabled, 2 when disabled |
| `is_drop_allowed` | (consecutive_drops) → bool | Check if DROP is allowed |
| `get_drop_penalty` | (consecutive_drops) → float | Escalating penalty |
| `check_retention` | (n_total, n_dropped) → bool | Validate retention constraint |

Soft Targets — rl/soft_targets.py

| Function | Signature | Description |
|---|---|---|
| `soft_value` | (q_values, alpha=0.1) → ndarray | Entropy-regularized soft value V(s) = α·log(Σ exp(Q/α)) |
| `soft_policy` | (q_values, alpha=0.1) → ndarray | Boltzmann policy π(a\|s) ∝ exp(Q(s,a)/α) |
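
Both functions follow the standard max-entropy formulas; a numerically stable sketch (the max-shift trick is an implementation choice, not necessarily what soft_targets.py does):

```python
import numpy as np

def soft_value(q_values, alpha=0.1):
    """Entropy-regularised value V(s) = alpha * log(sum_a exp(Q(s,a)/alpha)).
    Shifting by max(Q) avoids overflow in exp for small alpha."""
    q = np.asarray(q_values, float)
    m = q.max()
    return float(m + alpha * np.log(np.sum(np.exp((q - m) / alpha))))

def soft_policy(q_values, alpha=0.1):
    """Boltzmann policy pi(a|s) proportional to exp(Q(s,a)/alpha)."""
    q = np.asarray(q_values, float)
    z = np.exp((q - q.max()) / alpha)
    return z / z.sum()
```

As α → 0, soft_value approaches max(Q) and soft_policy approaches the greedy policy; larger α smooths toward uniform.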

Statistics — utils/stats.py

| Function | Signature | Description |
|---|---|---|
| `bootstrap_ci` | (data, n_bootstrap, ci, seed) → (mean, ci_low, ci_high) | Bootstrap confidence interval |
| `paired_bootstrap_test` | (a, b, n_bootstrap, seed) → float | Two-sided paired significance p-value |
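
bootstrap_ci can be sketched as a percentile bootstrap over the sample mean; utils/stats.py may use a different bootstrap variant:

```python
import numpy as np

def bootstrap_ci(data, n_bootstrap=2000, ci=0.95, seed=42):
    """Percentile bootstrap CI for the mean: resample with replacement,
    take the (1-ci)/2 and (1+ci)/2 percentiles of the resampled means."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, float)
    means = rng.choice(data, size=(n_bootstrap, len(data)),
                       replace=True).mean(axis=1)
    lo, hi = np.percentile(means, [(1 - ci) / 2 * 100, (1 + ci) / 2 * 100])
    return float(data.mean()), float(lo), float(hi)
```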

Experiments — experiments/

run_thesis_experiments.py

# Run all thesis experiments
python experiments/run_thesis_experiments.py

# Specific experiments
python experiments/run_thesis_experiments.py --experiments D1,E1 --amount 100 --epochs 3

run_cross_comparison.py

# Run 4-agent comparison on same data
python experiments/run_cross_comparison.py --amount 500 --run all