Sahaj Garg, co-founder and CTO of Wispr, a voice-to-text AI that turns speech into polished writing, talks with host Amey Ambade about designing systems for the ambiguity that's inherent in human input (text, voice, multimodal). Sahaj focuses on concrete architectural and training strategies for building robust AI systems. This episode examines the problem of ambiguity, where it shows up, building robust systems, personalization, communicating uncertainty, and evaluation. The conversation starts by exploring the difference between inherent and reducible ambiguity, the major categories of ambiguity (lexical, syntactic, and pragmatic), and the additional sources of ambiguity in voice, such as homophones and accents. Garg details how to build systems through model training, including providing additional context and constructing datasets for good annotation. They discuss personalization with a focus on "revealed preferences" (learning from user behavior without explicit feedback) and on fighting the problem of AI writing that "regresses to the mean." Finally, they consider how to communicate uncertainty to users without degrading the experience, as well as techniques for evaluating ambiguity resolution through offline and online signals.
Brought to you by IEEE Computer Society and IEEE Software magazine.


