Describe: Multi-HM: A Chinese Multimodal Dataset and Fusion Framework for Emotion Recognition in Human–Machine Dialogue Systems