Patch-based 3D reconstruction of deforming objects from monocular grey-scale videos
Mostafa, Abdelrahman (2020-06-23)
Mostafa, Abdelrahman
A. Mostafa
23.06.2020
© 2020 Abdelrahman Mostafa. Tämä Kohde on tekijänoikeuden ja/tai lähioikeuksien suojaama. Voit käyttää Kohdetta käyttöösi sovellettavan tekijänoikeutta ja lähioikeuksia koskevan lainsäädännön sallimilla tavoilla. Muunlaista käyttöä varten tarvitset oikeudenhaltijoiden luvan.
Julkaisun pysyvä osoite on
https://urn.fi/URN:NBN:fi:oulu-202006242661
https://urn.fi/URN:NBN:fi:oulu-202006242661
Tiivistelmä
The ability to reconstruct the spatio-temporal depth map of a non-rigid object surface deforming over time has many applications in many different domains. However, it is a challenging problem in Computer Vision. The reconstruction is ambiguous and not unique as many structures can have the same projection in the camera sensor.
Given the recent advances and success of Deep Learning, it seems promising to use and train a Deep Convolutional Neural Network to recover the spatio-temporal depth map of deforming objects. However, training such networks requires a large-scale dataset. This problem can be tackled by artificially generating a dataset and using it in training the network.
In this thesis, a network architecture is proposed to estimate the spatio-temporal structure of the deforming object from small local patches of a video sequence. An algorithm is presented to combine the spatio-temporal structure of these small patches into a global reconstruction of the scene. We artificially generated a database and used it to train the network. The performance of our proposed solution was tested on both synthetic and real Kinect data. Our method outperformed other conventional non-rigid structure-from-motion methods.
Given the recent advances and success of Deep Learning, it seems promising to use and train a Deep Convolutional Neural Network to recover the spatio-temporal depth map of deforming objects. However, training such networks requires a large-scale dataset. This problem can be tackled by artificially generating a dataset and using it in training the network.
In this thesis, a network architecture is proposed to estimate the spatio-temporal structure of the deforming object from small local patches of a video sequence. An algorithm is presented to combine the spatio-temporal structure of these small patches into a global reconstruction of the scene. We artificially generated a database and used it to train the network. The performance of our proposed solution was tested on both synthetic and real Kinect data. Our method outperformed other conventional non-rigid structure-from-motion methods.
Kokoelmat
- Avoin saatavuus [31941]