Darius Burschka discusses the challenges of using deep learning for perception in robotics. Mapping raw sensor input such as images directly to control commands is computationally challenging because of the input's high dimensionality. Instead, deep learning can be used to extract features from images, which then support tasks such as segmentation, labeling, and identification. However, these learned features may not provide the clear metric information that robot control needs. Burschka suggests exploring alternative ways of coupling perception and control that do not require explicit metric mappings.
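
To give a rough sense of the dimensionality gap described above, the following minimal sketch compares the number of scalar values in one raw image frame with the size of a learned feature vector. The 640x480 RGB resolution and the 128-element feature vector are assumptions chosen purely for illustration; they are not figures from Burschka, and the function name `raw_input_dim` is hypothetical.

```python
def raw_input_dim(width: int, height: int, channels: int) -> int:
    """Return the number of scalar values in one uncompressed image frame."""
    return width * height * channels


# Assumed, illustrative numbers (not taken from the source):
image_dim = raw_input_dim(640, 480, 3)  # one 640x480 RGB frame
feature_dim = 128                       # assumed size of a learned feature vector

# How many times smaller the feature vector is than the raw frame.
reduction = image_dim / feature_dim

print(f"raw image values per frame: {image_dim}")
print(f"feature vector length:      {feature_dim}")
print(f"reduction factor:           {reduction:.0f}x")
```

A controller that consumes the compact feature vector has far fewer inputs to handle per frame, but, as noted above, nothing in such a vector is guaranteed to carry metric meaning such as distances or angles.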