The document presents a preference-learning method for guiding tree search in partially observable Markov decision processes (POMDPs), applied to robotic object manipulation in unstructured environments. It introduces the preference-guided POMCPOW (PGP) approach, which uses rankings among previously tried actions to improve decision-making in high-dimensional spaces, and which relies on preference-based learning to improve data efficiency. Experimental results show that PGP outperforms traditional methods in both success rate and trajectory optimality across a range of simulated and real-world scenarios.
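
To make the idea of learning from rankings among past actions concrete, the following is a minimal sketch of a pairwise preference loss. It assumes a Bradley-Terry preference model, which the summary does not confirm as the one used by PGP. The function names (`pairs_from_returns`, `preference_probability`, `pairwise_preference_loss`) and the use of scalar returns to rank actions are hypothetical choices made for illustration only.

```python
import math


def pairs_from_returns(returns):
    """Build (better, worse) index pairs from the returns of past actions.

    Assumption: an action with a higher return is preferred over one with
    a lower return. Ties produce no pair.
    """
    pairs = []
    for i in range(len(returns)):
        for j in range(len(returns)):
            if returns[i] > returns[j]:
                pairs.append((i, j))
    return pairs


def preference_probability(score_better, score_worse):
    """Bradley-Terry probability that the first action is preferred."""
    return 1.0 / (1.0 + math.exp(-(score_better - score_worse)))


def pairwise_preference_loss(scores, ranked_pairs):
    """Mean negative log-likelihood of the observed preferences.

    `scores` holds one scalar per action, e.g. the output of a learned
    scoring model (hypothetical here). Lower loss means the scores agree
    better with the observed ranking.
    """
    if not ranked_pairs:
        raise ValueError("at least one preference pair is required")
    total = 0.0
    for better, worse in ranked_pairs:
        total -= math.log(preference_probability(scores[better], scores[worse]))
    return total / len(ranked_pairs)


if __name__ == "__main__":
    # Illustrative returns for three past actions (made-up numbers).
    returns = [1.0, 0.2, 0.5]
    pairs = pairs_from_returns(returns)

    agreeing_scores = [2.0, 0.0, 1.0]   # same ordering as the returns
    reversed_scores = [0.0, 2.0, 1.0]   # ordering disagrees with the returns

    print(pairwise_preference_loss(agreeing_scores, pairs))
    print(pairwise_preference_loss(reversed_scores, pairs))
```

Because the loss depends only on which action in each pair was better, not on the exact return values, one set of rollouts yields many training pairs. This is one plausible reading of the data-efficiency argument, not a statement of the paper's actual training objective.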