Abstract
Object recognition is a challenging problem in high-level vision. Models that perform well in the outdoor domain perform poorly in the indoor domain, and vice versa. This is due to dramatic discrepancies in the global properties of each environment, such as backgrounds and lighting conditions. Here, we show that inferring the environment before or during the recognition process can dramatically enhance recognition performance. We used a combination of deep and shallow models for object and scene recognition, respectively. We also used three novel topologies that provide a trade-off between classification accuracy and decision sensitivity. We achieved a classification accuracy of 97.91%, outperforming a single GoogLeNet by 13%. In another experiment, we achieved an accuracy of 95% in categorising indoor and outdoor scenes by inference.
Original language | English |
---|---|
Publication status | Published - 25 Apr 2019 |