Kim, V. G., Lipman, Y., and Funkhouser, T. (2011).
Blended intrinsic maps. ACM Transactions on
Graphics (TOG), 30(4):1–12.
Krull, A., Brachmann, E., Michel, F., Yang, M. Y.,
Gumhold, S., and Rother, C. (2015). Learning
analysis-by-synthesis for 6D pose estimation in RGB-
D images. In IEEE International Conference on Com-
puter Vision (ICCV), pages 954–962.
Kuehne, B., True, T., Commike, A., and Shreiner, D.
(2005). Performance OpenGL: Platform independent
techniques. In ACM SIGGRAPH 2005 Courses.
Kundu, A., Li, Y., and Rehg, J. M. (2018). 3D-RCNN:
Instance-level 3D object reconstruction via render-
and-compare. In IEEE Conference on Computer Vi-
sion and Pattern Recognition (CVPR), pages 3559–
3568.
Labbé, Y., Carpentier, J., Aubry, M., and Sivic, J. (2020).
CosyPose: Consistent multi-view multi-object 6D
pose estimation. In European Conference on Com-
puter Vision (ECCV).
Li, Y., Wang, G., Ji, X., Xiang, Y., and Fox, D. (2018).
DeepIM: Deep iterative matching for 6D pose esti-
mation. In European Conference on Computer Vision
(ECCV), pages 683–698.
Liu, S., Li, T., Chen, W., and Li, H. (2019). Soft Rasterizer:
A differentiable renderer for image-based 3D reason-
ing. In IEEE International Conference on Computer
Vision (ICCV), pages 7708–7717.
Loper, M. M. and Black, M. J. (2014). OpenDR: An ap-
proximate differentiable renderer. In European Con-
ference on Computer Vision (ECCV), pages 154–169.
Mantiuk, R., Kim, K. J., Rempel, A. G., and Heidrich, W.
(2011). HDR-VDP-2: A calibrated visual metric
for visibility and quality predictions in all luminance
conditions. ACM Transactions on Graphics (TOG),
30(4):1–14.
Merry, B. (2012). Performance tuning for tile-based archi-
tectures. OpenGL Insights, page 323.
Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S.,
and Geiger, A. (2019). Occupancy networks: Learn-
ing 3D reconstruction in function space. In IEEE Con-
ference on Computer Vision and Pattern Recognition
(CVPR), pages 4460–4470.
Moreno, P., Williams, C. K., Nash, C., and Kohli, P. (2016).
Overcoming occlusion with inverse graphics. In Euro-
pean Conference on Computer Vision (ECCV), pages
170–185.
Myronenko, A. and Song, X. (2010). Point set registra-
tion: Coherent point drift. IEEE Transactions on
Pattern Analysis and Machine Intelligence (TPAMI),
32(12):2262–2275.
Pan, J., Han, X., Chen, W., Tang, J., and Jia, K. (2019).
Deep mesh reconstruction from single RGB images
via topology modification networks. In IEEE Interna-
tional Conference on Computer Vision (ICCV), pages
9964–9973.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., Desmaison, A., Kopf, A., Yang, E., De-
Vito, Z., Raison, M., Tejani, A., Chilamkurthy, S.,
Steiner, B., Fang, L., Bai, J., and Chintala, S. (2019).
PyTorch: An imperative style, high-performance deep
learning library. In Advances in Neural Information
Processing Systems (NeurIPS), pages 8024–8035.
Pavlakos, G., Zhu, L., Zhou, X., and Daniilidis, K. (2018).
Learning to estimate 3D human pose and shape from a
single color image. In IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), pages 459–
468.
Peng, S., Liu, Y., Huang, Q., Zhou, X., and Bao, H. (2019).
PVNet: Pixel-wise voting network for 6DOF pose es-
timation. In IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), pages 4561–4570.
Periyasamy, A. S., Schwarz, M., and Behnke, S. (2019).
Refining 6D object pose predictions using abstract
render-and-compare. In IEEE-RAS 19th International
Conference on Humanoid Robots (Humanoids), pages
739–746.
Pharr, M., Jakob, W., and Humphreys, G. (2016). Phys-
ically based rendering: From theory to implementa-
tion. Morgan Kaufmann.
Ravi, N., Reizenstein, J., Novotny, D., Gordon, T., Lo, W.-
Y., Johnson, J., and Gkioxari, G. (2020). Accelerating
3D deep learning with PyTorch3D. In European Con-
ference on Computer Vision (ECCV).
Rodriguez, D., Cogswell, C., Koo, S., and Behnke, S.
(2018). Transferring grasping skills to novel instances
by latent space non-rigid registration. In IEEE In-
ternational Conference on Robotics and Automation
(ICRA), pages 1–8.
Rodriguez, D., Huber, F., and Behnke, S. (2020). Category-
level 3D non-rigid registration from single-view RGB
images. In IEEE/RSJ International Conference on Intel-
ligent Robots and Systems (IROS).
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-
Net: Convolutional networks for biomedical image
segmentation. In International Conference on Medi-
cal Image Computing and Computer-Assisted Interven-
tion (MICCAI), pages 234–241.
Schwarz, M. and Behnke, S. (2020). Stillleben: Realistic
scene synthesis for deep learning in robotics. In IEEE
International Conference on Robotics and Automation
(ICRA).
Spitzer, J. (2003). OpenGL performance tuning. In NVIDIA
Corporation, Game Developers Conference.
Tappen, M. F., Freeman, W. T., and Adelson, E. H. (2005).
Recovering intrinsic images from a single image.
IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI), 27(9):1459–1472.
Wang, N., Zhang, Y., Li, Z., Fu, Y., Liu, W., and Jiang, Y.-
G. (2018). Pixel2Mesh: Generating 3D mesh models
from single RGB images. In European Conference on
Computer Vision (ECCV), pages 52–67.
Wang, Z., Bovik, A., Sheikh, H., and Simoncelli, E. (2004).
Image quality assessment: From error visibility to
structural similarity. IEEE Transactions on Image
Processing, 13(4):600–612.
Wang, Z., Simoncelli, E. P., and Bovik, A. C. (2003). Mul-
tiscale structural similarity for image quality assess-