Describe: Improving intelligent perception and decision optimization of pedestrian crossing scenarios in autonomous driving environments through large visual language models