Abstract: Handcrafted audio descriptors and learned deep representations each bring distinct strengths and inherent limitations to speech emotion recognition (SER). Traditional handcrafted features ...