• A spherical perception-based panoramic image saliency prediction algorithm with longitude weighting

    • Abstract: Panoramic saliency prediction algorithms estimate the distribution of human visual attention and discard redundant image information, which makes them important for the compression, transmission, and quality assessment of panoramic images. To address the false detections and inaccurate predictions that arise in panoramic image saliency prediction, a spherical perception-based saliency prediction network with longitude saliency weighting, SPLWU-Net, is proposed on the basis of the encoder-decoder architecture. First, to handle the distortion introduced by the equirectangular projection (ERP) of panoramic images, a Spherical Encoder based on Spatial Attention (SASE) is designed to model spherical information and mitigate the impact of distortion during feature extraction. Next, to counter the feature loss caused by repeated pooling in the encoder-decoder architecture, a Self-Attention Feature Selection module (SAFS) is introduced; it suppresses background information and highlights key features, enabling effective interaction between the encoder and the decoder. Finally, SPLWU-Net applies a post-processing method, Longitude Saliency Weighting (LSW), to further improve prediction accuracy. Experimental results show that the proposed method improves on mainstream algorithms: on the Salient360!2017 dataset, CC and AUCj improve by 0.9% and 1.4%, respectively; on the Saliency-in-VR dataset, CC, KLD, NSS, and AUCj improve by 5.6%, 8.6%, 10.1%, and 1.2%, respectively. The proposed algorithm therefore delivers strong predictive performance, applies to a variety of scenarios, and has practical value.
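The CC, KLD, and NSS scores quoted in the abstract have standard definitions in the saliency literature. The following is a minimal NumPy sketch of those definitions, not the evaluation code used by the authors. The function names, the `EPS` guard value, and the array shapes are assumptions made for illustration, and the sketch ignores any sphere-aware weighting that a panoramic benchmark might apply to ERP images.

```python
import numpy as np

EPS = 1e-12  # small guard against division by zero (assumed value)


def cc(pred, gt):
    """Pearson linear correlation coefficient between two saliency maps."""
    p = (pred - pred.mean()) / (pred.std() + EPS)
    g = (gt - gt.mean()) / (gt.std() + EPS)
    return float((p * g).mean())


def kld(pred, gt):
    """KL divergence of the prediction from the ground truth (lower is better)."""
    p = pred / (pred.sum() + EPS)  # normalise both maps to sum to 1
    g = gt / (gt.sum() + EPS)
    return float(np.sum(g * np.log(EPS + g / (p + EPS))))


def nss(pred, fixations):
    """Normalized scanpath saliency: mean z-scored prediction at fixated pixels."""
    z = (pred - pred.mean()) / (pred.std() + EPS)
    return float(z[fixations.astype(bool)].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.random((64, 128))   # hypothetical ground-truth saliency map (ERP layout)
    fix = gt > 0.95              # hypothetical binary fixation map
    print(cc(gt, gt), kld(gt, gt), nss(gt, fix))
```

Higher CC and NSS indicate better agreement with human attention, while lower KLD is better, so an "improvement" in KLD corresponds to a decrease in its value.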

       
