TY - GEN
T1 - Entropy transformer networks: a learning approach via tangent bundle data manifold
AU - Shamsolmoali, Pourya
AU - Zareapoor, Masoumeh
PY - 2023/8/2
Y1 - 2023/8/2
N2 - This paper focuses on an accurate and fast interpolation approach for image transformation employed in the design of CNN architectures. Standard Spatial Transformer Networks (STNs) use bilinear or linear interpolation as their interpolation, with unrealistic assumptions about the underlying data distributions, which leads to poor performance under scale variations. Moreover, STNs do not preserve the norm of gradients in propagation due to their dependency on sparse neighboring pixels. To address this problem, a novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions. In particular, random samples are generated for each pixel in association with the tangent space of the data manifold, and construct a linear approximation of their intensity values with an entropy regularizer to compute the transformer parameters. A simple yet effective technique is also proposed to normalize the non-zero values of the convolution operation, to fine-tune the layers for gradients' norm-regularization during training. Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks, including image reconstruction, and classification, while reducing the computational cost.
AB - This paper focuses on an accurate and fast interpolation approach for image transformation employed in the design of CNN architectures. Standard Spatial Transformer Networks (STNs) use bilinear or linear interpolation as their interpolation, with unrealistic assumptions about the underlying data distributions, which leads to poor performance under scale variations. Moreover, STNs do not preserve the norm of gradients in propagation due to their dependency on sparse neighboring pixels. To address this problem, a novel Entropy STN (ESTN) is proposed that interpolates on the data manifold distributions. In particular, random samples are generated for each pixel in association with the tangent space of the data manifold, and construct a linear approximation of their intensity values with an entropy regularizer to compute the transformer parameters. A simple yet effective technique is also proposed to normalize the non-zero values of the convolution operation, to fine-tune the layers for gradients' norm-regularization during training. Experiments on challenging benchmarks show that the proposed ESTN can improve predictive accuracy over a range of computer vision tasks, including image reconstruction, and classification, while reducing the computational cost.
KW - Transformer
KW - data manifold
KW - image reconstruction
U2 - 10.1109/IJCNN54540.2023.10191125
DO - 10.1109/IJCNN54540.2023.10191125
M3 - Conference contribution
SN - 9781665488679
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN)
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 International Joint Conference on Neural Networks
Y2 - 18 June 2023 through 23 June 2023
ER -