National Repository of Grey Literature: 4 records found
6D pose estimation of objects in images
Cífka, Martin ; Šivic, Josef (advisor) ; Šikudová, Elena (referee)
6D pose estimation is an important computer vision task with applications in robotics, e.g. for manipulation or grasping, but also in computer graphics and augmented reality. Given an image, the task is to estimate the 3D rotation and 3D translation of a known object with respect to the camera. The task is even more challenging in an uncontrolled environment, e.g. when proper camera calibration is not available. In that case, the focal length also needs to be estimated together with the 6D pose. In this work, we address the issues of methods that work in such uncontrolled environments. First, we focus on FocalPose, a state-of-the-art method for joint estimation of object 6D pose and camera focal length. We review the method and propose several improvements. These include (i) re-deriving and improving the 6D pose and focal length update rule, (ii) replacing the model retrieval method, and (iii) changing the distribution of 6D poses and focal lengths used for synthetic training data rendering. These changes lead to improved results compared to the state-of-the-art FocalPose method. Second, to avoid the often costly retraining of models for 6D pose estimation, it is beneficial to consider methods with the ability to generalize to novel objects that have not been seen during training. These methods require a 2D...
Autoregressive action-conditioned 3D human motion synthesis using latent discrete codes
Waltl, Jan ; Šivic, Josef (advisor) ; Mirbauer, Martin (referee)
In this thesis, we present a new method for synthesizing 3D human motion animation conditioned on a fixed set of actions that define the motion, for example "running" or "bending forward". Inspired by the success of text-to-image generation methods based on discrete latent representations, we successfully applied these methods to motion generation, in contrast to previous approaches that use continuous latent variables. Compared with ACTOR, the previous best method, our method is not limited in the length of the generated sequences and can smoothly continue from an input starting sequence. Autoregressive generation is bounded by the context length, which ensures a reasonable generation speed. Furthermore, thanks to training in two stages, future models can easily be pre-trained on larger datasets without category labels and then fine-tuned on a specific task. We evaluated our method on the UESTC dataset, where it outperforms the previous ACTOR method in the metrics and generates animations comparable to those in the dataset.
Learning to solve geometric construction problems from images
Macke, Jaroslav ; Šivic, Josef (advisor) ; Šikudová, Elena (referee)
Geometric construction problems using ruler and compass have been solved for thousands of years. Humans are capable of solving these problems without explicit knowledge of the analytical models of the geometric primitives present in the scene. On the other hand, most methods for solving these problems on a computer require an analytical model. In this thesis, we introduce a method for solving geometric construction problems with access only to an image of the given construction. The method utilizes Mask R-CNN, a convolutional neural network for detection and segmentation of objects in images and videos. The outputs of Mask R-CNN are masks and bounding boxes with class labels of the objects detected in the input image. In this work, we employ and adapt the Mask R-CNN architecture to solve geometric construction problems from image input. We create a process for computing geometric construction steps from the masks obtained from Mask R-CNN and describe how to train the Mask R-CNN model to solve geometric construction problems. However, solving geometric problems this way is challenging, as we have to deal with object detection and construction ambiguity. There may be infinitely many ways to solve a geometric construction problem. Furthermore, the method should be able to solve problems not seen during the...
