Describe: Robust skeletal motion tracking using temporal and spatial synchronization of two video streams