Audio Engineering, 2022, Vol. 46, No. 10, pp. 17-19    Column: Voice Technology
Citation: MENG F Y. Research on speaker recognition based on angular margin loss[J]. Audio Engineering, 2022, 46(10): 17-19.
CLC number: TN912.3    Document code: A    DOI: 10.16311/j.audioe.2022.10.005

Research on Speaker Recognition Based on Angular Margin Loss

MENG Feiyu
(Criminal Investigation Police University of China, Shenyang 110854, China)

Abstract: To address the cumbersome pipeline and weak generalization ability of the traditional voiceprint recognition model that combines identity vectors (i-vectors) with probabilistic linear discriminant analysis (PLDA), three text-independent closed-set voiceprint recognition models, namely Res-SD, Res-SA, and Rep-SA, are designed on a self-built Mandarin speech database of sung revolutionary songs ("red songs"). The Res-SD model is trained with the conventional cross-entropy loss function, while the Res-SA and Rep-SA models are trained with an additive angular margin loss function that maximizes the classification boundaries in the angular space of the feature embeddings. Experimental results verify that the three proposed models are effective for the text-independent closed-set recognition task. In terms of parameter count and accuracy, the Rep-SA model is better suited to learning class-discriminative singer features on the red-song database.

Keywords: voiceprint recognition; closed set; Res-SD; Res-SA; Rep-SA

0 Introduction

Voiceprint recognition, also known as (automatic) speaker recognition, identifies a speaker by means of computers or other recognition technologies, ...
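The additive angular margin loss mentioned in the abstract can be sketched as follows. This is a minimal NumPy illustration of the general technique (adding a margin m to the target-class angle before a scaled softmax cross-entropy, in the style of ArcFace/AAM-softmax), not the paper's exact implementation; the function name, scale s, and margin m are illustrative assumptions.

```python
import numpy as np

def additive_angular_margin_loss(embeddings, weights, labels, s=30.0, m=0.5):
    """Illustrative AAM-softmax: margin m is added to the target-class angle.

    embeddings : (batch, dim) speaker embeddings
    weights    : (dim, num_classes) classifier weight matrix
    labels     : (batch,) integer class labels
    s, m       : scale and angular margin hyperparameters (assumed values)
    """
    # L2-normalize embeddings and class weights so logits become cosines
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = np.clip(x @ w, -1.0, 1.0)          # (batch, num_classes), cos(theta)
    theta = np.arccos(cos)

    # Add the margin m only to the angle of each sample's true class
    target = np.zeros_like(cos)
    target[np.arange(len(labels)), labels] = 1.0
    logits = s * np.cos(theta + m * target)

    # Standard cross-entropy over the margin-adjusted, scaled logits
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

Because the margin shrinks the target-class cosine, the loss with m > 0 is larger than plain softmax cross-entropy for the same inputs, which is what forces the network to pull embeddings of the same class into a tighter angular region.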