Questions

Section III: What differentiates the Artificial Neural Network used to calculate future return from the target network? It seems like the target network focuses on learning the immediate reward, whereas the original neural network calculates the expected return from the full trajectory.
Section IV.A: How is it determined that the third excited Bloch state provides the best trade-off between large momentum splitting and high-frequency components?

Time evolution

The Hamiltonian is $H = \frac{\hat{P}^{2}}{2 m} - \frac{V_{0}}{2} \cos (2 k_{L} \hat{X} + θ (t))$ , with time-dependent phase $θ (t)$ .
The state evolves according to the Schrödinger equation $i ℏ \frac{\partial}{\partial t} ψ = H ψ = \frac{\hat{P}^{2}}{2 m} ψ - \frac{V_{0}}{2} \cos (2 k_{L} \hat{X} + θ (t)) ψ$ .
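Letting $\tilde{ψ} = F ψ$ be the Fourier transform $\tilde{ψ} (p) = \int_{R} ψ (x) e^{- i p x / ℏ} d x$ of $ψ$ , and writing the cosine as $\frac{1}{2} (e^{i (2 k_{L} x + θ (t))} + e^{- i (2 k_{L} x + θ (t))})$ , we convert the original differential equation into $i ℏ \frac{\partial}{\partial t} \tilde{ψ} (p) = \frac{p^{2}}{2 m} \tilde{ψ} (p) - \frac{V_{0}}{4} (e^{i θ (t)} \tilde{ψ} (p - 2 ℏ k_{L}) + e^{- i θ (t)} \tilde{ψ} (p + 2 ℏ k_{L}))$ . Multiplication by the cosine in position space becomes a convolution with delta functions at $\pm 2 ℏ k_{L}$ in momentum space, i.e. a shift of $\tilde{ψ}$ by $\pm 2 ℏ k_{L}$ .
Writing it as $i ℏ \frac{\partial}{\partial t} \tilde{ψ} (p) - \frac{p^{2}}{2 m} \tilde{ψ} (p) = - \frac{V_{0}}{4} (e^{i θ (t)} \tilde{ψ} (p - 2 ℏ k_{L}) + e^{- i θ (t)} \tilde{ψ} (p + 2 ℏ k_{L}))$ , we use the integrating factor $e^{i p^{2} t / (2 m ℏ)}$ to absorb the kinetic term. The remaining equation couples only the momenta $p + 2 n ℏ k_{L}$ (integer $n$) and is then integrated in time.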
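
As a rough numerical sketch of this time evolution (not necessarily the scheme used in the paper), one can expand $ψ$ in plane waves with momenta $ℏ (q + 2 n) k_{L}$ , where the cosine only couples neighbouring components $n$ and $n \pm 1$ . The assumptions here are mine: energies are in units of the recoil energy $E_{R} = ℏ^{2} k_{L}^{2} / (2 m)$ , time is in units of $ℏ / E_{R}$ , the basis is truncated at `n_max`, and $θ (t)$ is treated as piecewise constant over each step `dt`. The function names `lattice_hamiltonian` and `propagate` are illustrative only.

```python
import numpy as np

def lattice_hamiltonian(v0, theta, q=0.0, n_max=10):
    """Momentum-basis matrix of H = P^2/2m - (V0/2) cos(2 k_L X + theta).

    Assumed units: energies in E_R, momenta in hbar*k_L.
    Basis states are plane waves with momentum (q + 2n), n = -n_max..n_max.
    """
    n = np.arange(-n_max, n_max + 1)
    h = np.diag((q + 2.0 * n) ** 2).astype(complex)  # kinetic energy
    coupling = -0.25 * v0 * np.exp(1j * theta)
    idx = np.arange(len(n) - 1)
    h[idx + 1, idx] = coupling            # e^{+i theta}: momentum raised by 2
    h[idx, idx + 1] = np.conj(coupling)   # e^{-i theta}: momentum lowered by 2
    return h

def propagate(c0, v0, thetas, dt, q=0.0, n_max=10):
    """Evolve coefficients c0, holding theta constant during each step dt."""
    c = np.asarray(c0, dtype=complex)
    for theta in thetas:
        energies, vecs = np.linalg.eigh(lattice_hamiltonian(v0, theta, q, n_max))
        # Apply exp(-i H dt) through the eigendecomposition of H.
        c = vecs @ (np.exp(-1j * energies * dt) * (vecs.conj().T @ c))
    return c

# Example: start in the ground Bloch state of the static lattice (theta = 0)
# and apply an arbitrary sinusoidal phase modulation (illustrative values).
n_max, v0, dt = 10, 10.0, 0.01
_, static_vecs = np.linalg.eigh(lattice_hamiltonian(v0, 0.0, n_max=n_max))
ground = static_vecs[:, 0]
times = np.arange(200) * dt
thetas = 0.5 * np.sin(5.0 * times)
final = propagate(ground, v0, thetas, dt, n_max=n_max)
norm = np.vdot(final, final).real  # should stay close to 1 (unitary steps)
```

Each step is unitary, so the norm of the coefficient vector is preserved up to rounding. The accuracy of the piecewise-constant approximation depends on `dt` being small compared with the time scale on which $θ (t)$ varies.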

The effectiveness of the method is evaluated by the fidelity between the obtained state and the desired state.
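
For concreteness, here is a minimal sketch of such a fidelity, assuming both states are pure and given as coefficient vectors in the same basis, so that $F = | \langle ψ_{target} | ψ_{obtained} \rangle |^{2}$ . The notes do not state which definition the paper uses, and the function name `fidelity` is illustrative.

```python
import numpy as np

def fidelity(psi_obtained, psi_target):
    """Pure-state fidelity |<target|obtained>|^2, normalising both inputs."""
    a = np.asarray(psi_obtained, dtype=complex)
    b = np.asarray(psi_target, dtype=complex)
    overlap = np.vdot(b, a)  # vdot conjugates its first argument
    norm = np.vdot(a, a).real * np.vdot(b, b).real
    return float(abs(overlap) ** 2 / norm)

# Example: an equal superposition compared with one of its components.
f_half = fidelity([1.0, 1.0], [1.0, 0.0])
```

This quantity equals 1 when the two states agree up to a global phase and 0 when they are orthogonal.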