This document proposes a Q-learning approach for optimizing handover parameters in 4G (LTE) networks. It begins by reviewing common handover problems: a handover triggered too late (or too early) can cause a radio link failure and a dropped connection, while overly sensitive trigger settings cause ping-pong handovers between neighboring cells. The problem is then cast as a reinforcement learning task: each state is a combination of time-to-trigger (TTT) and hysteresis values, the actions increase or decrease those values, and the reward function penalizes drops and ping-pongs. Simulation results show that the Q-learning agent converges to well-performing handover parameter settings under this reward. Future work could jointly optimize interference management alongside handover.
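To make the formulation concrete, here is a minimal Q-learning sketch in Python. It is not the paper's implementation: the candidate TTT and hysteresis grids, the learning hyperparameters, the reward weights, and the `simulate_kpis` function (a toy stand-in for the paper's network simulator) are all illustrative assumptions.

```python
import random

# Assumed discretized parameter grids (the paper's exact sets are not given).
# TTT values follow common LTE configuration steps (ms); hysteresis is in dB.
TTT_VALUES = [40, 80, 160, 320, 640, 1280]
HYS_VALUES = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

# Actions: adjust one parameter by one grid step at a time.
ACTIONS = ["ttt_up", "ttt_down", "hys_up", "hys_down"]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # assumed learning hyperparameters

# Q-table over (ttt_index, hys_index) states.
Q = {(i, j): {a: 0.0 for a in ACTIONS}
     for i in range(len(TTT_VALUES)) for j in range(len(HYS_VALUES))}

def apply_action(state, action):
    """Move one step in the parameter grid, clamped to valid indices."""
    i, j = state
    if action == "ttt_up":
        i = min(i + 1, len(TTT_VALUES) - 1)
    elif action == "ttt_down":
        i = max(i - 1, 0)
    elif action == "hys_up":
        j = min(j + 1, len(HYS_VALUES) - 1)
    else:  # "hys_down"
        j = max(j - 1, 0)
    return (i, j)

def simulate_kpis(ttt_ms, hys_db):
    """Hypothetical stand-in for the network simulator: returns observed
    (drop_rate, pingpong_rate). In this toy model, long TTT and high
    hysteresis raise drops; short TTT and low hysteresis raise ping-pongs."""
    drops = 0.02 * (ttt_ms / 1280) + 0.02 * (hys_db / 5) + random.uniform(0, 0.01)
    pingpongs = (0.05 * (1 - ttt_ms / 1280) + 0.05 * (1 - hys_db / 5)
                 + random.uniform(0, 0.01))
    return drops, pingpongs

def reward(state):
    """Penalize both drops and ping-pongs; the weights here are assumed."""
    drops, pingpongs = simulate_kpis(TTT_VALUES[state[0]], HYS_VALUES[state[1]])
    return -(1.0 * drops + 0.5 * pingpongs)

state = (len(TTT_VALUES) // 2, len(HYS_VALUES) // 2)  # start mid-grid
for episode in range(5000):
    # Epsilon-greedy action selection.
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)
    else:
        action = max(Q[state], key=Q[state].get)
    next_state = apply_action(state, action)
    r = reward(next_state)
    # Standard Q-learning update.
    Q[state][action] += ALPHA * (r + GAMMA * max(Q[next_state].values())
                                 - Q[state][action])
    state = next_state

# Read out the state the agent values most as the learned operating point.
best = max(Q, key=lambda s: max(Q[s].values()))
print(f"Learned operating point: TTT={TTT_VALUES[best[0]]} ms, "
      f"hysteresis={HYS_VALUES[best[1]]} dB")
```

Clamping at the grid edges keeps the state space finite, matching the discrete state formulation described above; in practice the reward would come from drop and ping-pong counters measured over a reporting window in the simulator or live network rather than from a closed-form model.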