The document presents an integrated algorithm that combines model-based and model-free updates for trajectory-centric reinforcement learning, retaining the sample efficiency of model-based methods while remaining applicable to systems with unknown dynamics. It details a two-stage approach that integrates the strengths of two existing methods, PI2 (Policy Improvement with Path Integrals, a model-free update) and LQR-FLM (LQR with fitted linear models, a model-based update), to optimize policy performance across a range of robotic tasks. Experiments in both simulation and real-world settings demonstrate significant performance improvements, indicating the effectiveness of the proposed approach.
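The two-stage structure can be illustrated with a toy sketch. This is not the paper's implementation: a finite-difference gradient step with a line search stands in for the LQR-FLM model-based update, and reward-weighted averaging of sampled control perturbations stands in for the PI2 model-free correction. The 1-D linear system, cost function, and all parameters below are illustrative assumptions.

```python
import numpy as np

# Toy setup (illustrative only): 1-D linear system x' = x + u with
# quadratic cost, optimized over an open-loop control sequence u.
rng = np.random.default_rng(0)
T = 10  # horizon

def rollout_cost(u, x0=1.0):
    """Total cost of rolling out controls u from x0."""
    x, c = x0, 0.0
    for t in range(T):
        c += x**2 + 0.01 * u[t]**2
        x = x + u[t]          # known linear dynamics, for the sketch
    return c + x**2           # terminal cost

u = np.zeros(T)               # initial controls
cost0 = rollout_cost(u)

# Stage 1 (model-based stand-in): finite-difference gradient step with a
# simple backtracking line search, playing the role of the LQR-FLM update.
eps = 1e-4
grad = np.array([(rollout_cost(u + eps * np.eye(T)[t]) - cost0) / eps
                 for t in range(T)])
for step in (0.2, 0.1, 0.05, 0.02, 0.01, 0.005):
    if rollout_cost(u - step * grad) < rollout_cost(u):
        u = u - step * grad
        break

# Stage 2 (model-free stand-in): PI2-style update -- sample noisy control
# sequences, weight them by exp(-cost / lambda), and average, accepting
# the result only if it improves the rollout cost.
K, lam = 64, 1.0
noise = rng.normal(scale=0.3, size=(K, T))
costs = np.array([rollout_cost(u + noise[k]) for k in range(K)])
w = np.exp(-(costs - costs.min()) / lam)
w /= w.sum()
u_new = u + w @ noise         # reward-weighted average of perturbations
if rollout_cost(u_new) < rollout_cost(u):
    u = u_new

cost1 = rollout_cost(u)
print(cost0, cost1)           # combined update lowers the rollout cost
```

The design choice mirrored here is the one the summary describes: a model-based step exploits (approximate) dynamics knowledge for fast improvement, and a sampling-based model-free step corrects for whatever the model misses.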