volume 04 issue 05

DEEP LEARNING FOR 3D RECOGNITION

Abstract

Estimating the three-dimensional location of an object is one of the most important problems in computer vision. Building automated solutions that detect and recognize objects from photographs requires new models and algorithms that perform exceptionally well. Estimating the 3D position of an object from a single 2D image is difficult because a single image lacks information that is critical to the task, most notably depth. This investigation focuses on one particular task: computing the three-dimensional location of a soccer ball. This thesis outlines a strategy for the problem based on two deep learning models, a ball net and a temporal net. The former uses a deep convolutional neural network to extract meaningful features from images, while the latter uses temporal information to arrive at more accurate predictions. Compared to other existing computer vision algorithms, our approach achieves a lower mean absolute error across a variety of conditions and setups. A new data-driven pipeline has been developed to process the videos and extract three-dimensional information about an object.
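The depth ambiguity and the error metric mentioned in the abstract can be illustrated with a minimal sketch. It assumes an idealized pinhole camera with a hypothetical focal length of 1000 pixels and points expressed in the camera frame; the function names are illustrative and are not taken from the thesis pipeline.

```python
import numpy as np

def project(point_3d, focal_length=1000.0):
    """Pinhole projection of a camera-frame point (X, Y, Z) to image offsets (u, v).

    focal_length is a hypothetical value in pixels, chosen only for illustration.
    """
    x, y, z = point_3d
    return np.array([focal_length * x / z, focal_length * y / z])

def mean_absolute_error(predicted, actual):
    """Mean absolute error averaged over every coordinate of every position."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(predicted - actual)))

# Two different 3D positions that lie on the same viewing ray.
near = np.array([1.0, 0.5, 10.0])
far = 2.0 * near

# Both positions project to the same image location, so a single
# image cannot tell them apart without additional cues.
same_pixel = np.allclose(project(near), project(far))

# The 3D error between the two candidates is nevertheless large.
error = mean_absolute_error([far], [near])
```

Here `same_pixel` is true even though the two candidate positions are far apart in 3D, which is why additional cues, such as the temporal information the abstract mentions, are needed to resolve depth.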

Keywords
  • Deep Learning
  • 3D
  • 2D
  • computer vision algorithms
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Manish Kumar Verma. (2021). DEEP LEARNING FOR 3D RECOGNITION. International Journal of Multidisciplinary Research and Studies, 4(05), 01–11. Retrieved from https://ijmras.com/index.php/ijmras/article/view/206
