We optimize the deployment of an aerial reconfigurable intelligent surface (ARIS) that assists high-altitude platform (HAP) downlink transmission when the direct link is blocked. Specifically, we maximize the received signal-to-noise ratio (SNR) of the ground users by jointly optimizing the trajectory and the phase shifts of the ARIS, while accounting for the unknown movement of the HAP caused by changes in stratospheric wind and air density. Because the formulated problem is non-convex, we decouple it into two subproblems and propose an alternating two-stage optimization. After proving that the movement of the HAP follows a finite-state Markov process, we alternate between learning the ARIS trajectory via model-free reinforcement learning and adjusting the ARIS phase shifts for that trajectory. We then analyze the convergence of the proposed algorithm based on an upper bound that we derive on the accumulated reward. Numerical results show a substantial performance gain for HAP communications with the optimized ARIS assistance.
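As a rough illustration of the phase-shift stage only, the sketch below applies the standard co-phasing rule for a single-user cascaded link, in which each element's phase shift cancels the combined phase of its HAP-to-ARIS and ARIS-to-user channel coefficients. This is not necessarily the procedure used in this work. The channel model (unit-magnitude coefficients with random phases), the element count, and all function names are assumptions made for the example, and the trajectory-learning stage is not modeled.

```python
import cmath
import math
import random


def align_phases(h, g):
    """Return phase shifts that co-phase each cascaded path h[n] * g[n]."""
    return [-(cmath.phase(hn) + cmath.phase(gn)) for hn, gn in zip(h, g)]


def received_snr(h, g, theta, tx_power=1.0, noise_power=1.0):
    """SNR of the cascaded link: P * |sum_n h_n e^{j theta_n} g_n|^2 / sigma^2."""
    combined = sum(hn * cmath.exp(1j * tn) * gn for hn, tn, gn in zip(h, theta, g))
    return tx_power * abs(combined) ** 2 / noise_power


def random_channel(n, rng):
    """Hypothetical channel: unit-magnitude coefficients with uniform random phases."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(n)]


rng = random.Random(0)
n_elements = 16                      # assumed number of ARIS elements
h = random_channel(n_elements, rng)  # HAP -> ARIS coefficients (assumed)
g = random_channel(n_elements, rng)  # ARIS -> ground user coefficients (assumed)

theta_aligned = align_phases(h, g)
theta_random = [rng.uniform(-math.pi, math.pi) for _ in range(n_elements)]

snr_aligned = received_snr(h, g, theta_aligned)
snr_random = received_snr(h, g, theta_random)
print(f"aligned SNR: {snr_aligned:.2f}, random-phase SNR: {snr_random:.2f}")
```

With unit-magnitude coefficients and unit transmit and noise power, co-phasing makes all reflected paths add coherently, so the aligned SNR equals the square of the number of elements, whereas random phases generally give a much smaller value.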