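Implementation and Optimization of a High-Precision Dot Product Algorithm Based on the Sunway 1621 (SW1621)

XU Fang-Jie, WANG Lei, WANG Yi-Zhuo, ZHANG Ya-Guang
(Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 450007)
Corresponding author: XU Fang-Jie, E-mail: xfj921084101@163.com

Abstract: The dot product function is a Level 1 basic function in the BLAS library and is widely called in scientific computing and other fields. Because floating-point computation introduces rounding errors, the double-precision dot product function in existing BLAS libraries cannot meet the accuracy requirements of some application domains, so high-precision algorithms are needed for more accurate and reliable computation. In this paper, targeting the domestic SW1621 platform, we add an interface for a high-precision dot product function on top of the existing BLAS library to meet the high-precision needs of applications. We also apply optimization strategies such as loop unrolling, memory access optimization, and instruction reordering to the high-precision dot product algorithm, achieving manual assembly-level optimization. Experimental results show that the precision of the results computed by the high-precision dot product algorithm is approximately twice that of the double-precision dot product, which effectively improves on the accuracy of the original algorithm. Moreover, while preserving this accuracy gain, the optimized high-precision dot product function achieves an average speedup of 1.61 over the unoptimized version.

Keywords: SW1621; dot product; high precision; BLAS library interface; performance optimization

Citation format: XU Fang-Jie, WANG Lei, WANG Yi-Zhuo, ZHANG Ya-Guang. Implementation and optimization of high-precision dot product algorithm based on SW1621 processor. Computer Systems & Applications, 2023, 32(2): 400–405. http://www.c-s-a.org.cn/1003-3254/8932.html

Implementation and Optimization of High-precision Dot Product Algorithm Based on SW1621 Processor

XU Fang-Jie, WANG Lei, WANG Yi-Zhuo, ZHANG Ya-Guang
(Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 450007, China)

Abstract: The dot product function is a Level 1 basic function in the BLAS library, which is widely called in scientific computing and other fields. As floating-point calculation introduces rounding errors, the double-precision dot product is unable to meet the accuracy requirements of some application fields, and thus high-precision algorithms are needed to achieve more accurate and reliable calculations. In this study, on the basis of the existing BLAS library, the interface of the high-precision dot product function is added to meet the high-precision requirements of applications on the domestic SW1621 platform. At the same time, the high-precision dot product algorithm uses such optimization strategies as loop unrolling, memory access optimization, and instruction reordering to realize assembly-level manual optimization. The experimental results indicate that the high-precision dot product algorithm has an accuracy approximately twice that of the do...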