allowing Spirit LM to perform cross-modal tasks like speech-to-text and text-to-speech, while maintaining the natural expressiveness of speech in its outputs. In line with Meta’s commitment to ...