Abstract:
Panoramic saliency prediction estimates the distribution of human visual attention and helps eliminate redundant information in panoramic images, which makes it important for image compression, transmission, and quality assessment. To address the geometric distortion inherent in panoramic images, this paper proposes SPLWU-Net, a spherical perception-based saliency prediction network combined with longitude saliency weighting. First, a Spherical Encoder Based on Spatial Attention (SASE) is designed to model spherical information and mitigate the impact of distortion during feature extraction. Next, a Self-Attention Feature Selection Module (SAFS) is introduced to address the feature loss caused by repeated pooling operations in the encoder-decoder architecture; it suppresses background information, highlights key features, and facilitates effective interaction between the encoder and decoder. Finally, SPLWU-Net is augmented with a post-processing method called Longitude Saliency Weighting (LSW) to further improve prediction accuracy. Experimental results show that the proposed method outperforms mainstream algorithms. On the Salient360!2017 dataset, the CC and AUC-J metrics improve by 0.9% and 1.4%, respectively, while on the Saliency-in-VR dataset, the CC, KLD, NSS, and AUC-J metrics improve by 5.6%, 8.6%, 10.1%, and 1.2%, respectively. The proposed panoramic saliency prediction algorithm delivers strong predictive performance, applies to a variety of scenarios, and has practical value.
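For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of CC, KLD, and NSS using their commonly used definitions. It is not the authors' evaluation code: the function names, the `eps` smoothing constant, and the use of plain 2D arrays are assumptions made for illustration, and any spherical weighting that an evaluation on equirectangular images might apply is left out. AUC-J is omitted for brevity.

```python
import numpy as np

EPS = 1e-12  # assumed smoothing constant to avoid division by zero and log(0)


def cc(pred, gt):
    """Pearson linear correlation between two saliency maps (higher is better)."""
    p = (pred - pred.mean()) / (pred.std() + EPS)
    g = (gt - gt.mean()) / (gt.std() + EPS)
    return float((p * g).mean())


def kld(pred, gt):
    """KL divergence of the predicted distribution from the ground truth
    (lower is better). Both maps are first normalized to sum to 1."""
    p = pred / (pred.sum() + EPS)
    g = gt / (gt.sum() + EPS)
    return float(np.sum(g * np.log(EPS + g / (p + EPS))))


def nss(pred, fixations):
    """Normalized scanpath saliency: mean of the z-scored prediction
    at fixated pixels (higher is better). `fixations` is a binary map."""
    z = (pred - pred.mean()) / (pred.std() + EPS)
    return float(z[fixations.astype(bool)].mean())
```

In this sketch, a prediction identical to the ground truth gives a CC close to 1 and a KLD close to 0, and NSS is positive when the fixated pixels fall on above-average predicted saliency.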