The document describes Echo State Fitted-Q Iteration (ESFQ), an algorithm for learning to control systems with delays. ESFQ is a batch reinforcement learning method that uses echo state networks as function approximators for Q-values and preserves the Markov property by retaining a history of past states. Experimental results on simulated benchmarks show that ESFQ outperforms standard tapped delay-line algorithms and that nonlinear readout layers learn complex dynamics better than linear ones. The goal is an effective and efficient reinforcement learning approach for delayed control systems that requires no prior knowledge of their dynamics.
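To make the idea concrete, the following is a minimal sketch of how an echo state reservoir could feed a fitted-Q iteration loop. It is not the authors' implementation. The function names (`make_reservoir`, `run_reservoir`, `fitted_q_iteration`), the choice of feeding the observation plus a one-hot previous action into the reservoir, the ridge-regression linear readout, all hyperparameters, and the random toy batch are assumptions made for illustration; terminal states are ignored for brevity.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Random fixed reservoir (hypothetical helper), rescaled to a target spectral radius."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with a sequence (T, n_in) and return its states (T, n_res)."""
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        # The recurrent state summarizes past inputs, standing in for an explicit history.
        x = np.tanh(w_in @ u + w @ x)
        states.append(x)
    return np.array(states)

def fitted_q_iteration(feats, actions, rewards, next_feats, n_actions,
                       gamma=0.9, n_iters=20, ridge=1e-3):
    """Batch fitted-Q iteration with one ridge-regression readout per action."""
    n_feat = feats.shape[1]
    weights = np.zeros((n_actions, n_feat))
    for _ in range(n_iters):
        # Bellman targets from the current Q estimate (no terminal handling in this sketch).
        targets = rewards + gamma * np.max(next_feats @ weights.T, axis=1)
        for a in range(n_actions):
            mask = actions == a
            X, y = feats[mask], targets[mask]
            weights[a] = np.linalg.solve(X.T @ X + ridge * np.eye(n_feat), X.T @ y)
    return weights

# Toy batch of random transitions (placeholder data, not a benchmark).
rng = np.random.default_rng(1)
T, n_actions = 200, 2
obs = rng.normal(size=(T + 1, 1))
actions = rng.integers(0, n_actions, size=T)
rewards = rng.normal(size=T)

# Assumed reservoir input at step t: current observation plus one-hot previous action.
prev_a = np.zeros((T + 1, n_actions))
prev_a[np.arange(1, T + 1), actions] = 1.0
inputs = np.hstack([obs, prev_a])

w_in, w = make_reservoir(inputs.shape[1], 30)
states = run_reservoir(w_in, w, inputs)
feats = np.hstack([states, np.ones((T + 1, 1))])  # append a bias feature

weights = fitted_q_iteration(feats[:-1], actions, rewards, feats[1:], n_actions)
q_values = feats[:-1] @ weights.T  # Q estimate for every step and action
```

Only the readout weights are trained here; the reservoir stays fixed, which is what keeps each fitted-Q step a cheap regression. Since the summary reports that nonlinear readouts learn complex dynamics better, the per-action ridge regression could be swapped for a nonlinear regressor on the same reservoir features.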