using anisotropic total generalized variation. In Proceedings of the IEEE International Conference on Computer Vision, pages 993–1000.
Foundry (2021). Keylight. Foundry Learn website.
Fusiello, A. and Irsara, L. (2008). Quasi-Euclidean uncalibrated epipolar rectification. In 2008 19th International Conference on Pattern Recognition, pages 1–4. IEEE.
FXGuide (2014). The art of deep compositing. Website.
Gvili, R., Kaplan, A., Ofek, E., and Yahav, G. (2003). Depth keying. In Woods, A. J., Merritt, J. O., Benton, S. A., and Bolas, M. T., editors, Stereoscopic Displays and Virtual Reality Systems X, volume 5006, pages 564–574. SPIE.
Heber, S., Ranftl, R., and Pock, T. (2013). Variational shape from light field. In International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition, pages 66–79. Springer.
Honegger, D., Sattler, T., and Pollefeys, M. (2017). Embedded real-time multi-baseline stereo. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pages 5245–5250. IEEE.
Hosni, A., Rhemann, C., Bleyer, M., Rother, C., and Gelautz, M. (2012). Fast cost-volume filtering for visual correspondence and beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(2):504–511.
Ihrke, I., Restrepo, J., and Mignard-Debise, L. (2016). Principles of light field imaging: Briefly revisiting 25 years of research. IEEE Signal Processing Magazine, 33(5):59–69.
Jeon, H.-G., Park, J., Choe, G., Park, J., Bok, Y., Tai, Y.-W., and Kweon, I. S. (2015). Accurate depth map estimation from a lenslet light field camera. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1547–1555.
Kanade, T., Yoshida, A., Oda, K., Kano, H., and Tanaka, M. (1996). A stereo machine for video-rate dense depth mapping and its new applications. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), pages 196–202.
Kang, Y.-S., Lee, C., and Ho, Y.-S. (2008). An efficient rectification algorithm for multi-view images in parallel camera array. In 2008 3DTV Conference: The True Vision-Capture, Transmission and Display of 3D Video, pages 61–64. IEEE.
Klodt, M. and Vedaldi, A. (2018). Supervising the new with the old: Learning SfM from SfM. In Proceedings of the European Conference on Computer Vision (ECCV), pages 698–713.
Kopf, J., Matzen, K., Alsisan, S., Quigley, O., Ge, F., Chong, Y., Patterson, J., Frahm, J.-M., Wu, S., Yu, M., et al. (2020). One shot 3d photography. ACM Transactions on Graphics (TOG), 39(4), Article 76.
Levin, A., Lischinski, D., and Weiss, Y. (2008). A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):228–242.
Li, W., Viola, F., Starck, J., Brostow, G. J., and Campbell, N. D. (2016). Roto++: Accelerating professional rotoscoping using shape manifolds. ACM Transactions on Graphics (TOG), 35(4):1–15.
Li, Y., Sun, J., and Shum, H.-Y. (2005). Video object cut and paste. In ACM SIGGRAPH 2005 Papers, pages 595–600.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110.
Lu, T. and Li, S. (2012). Image matting with color and depth information. In Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012), pages 3787–3790. IEEE.
Luo, X., Huang, J.-B., Szeliski, R., Matzen, K., and Kopf, J. (2020). Consistent video depth estimation. ACM Transactions on Graphics (TOG), 39(4), Article 71.
Ma, F., Cavalheiro, G. V., and Karaman, S. (2019). Self-supervised sparse-to-dense: Self-supervised depth completion from lidar and monocular camera. In 2019 International Conference on Robotics and Automation (ICRA), pages 3288–3295. IEEE.
Manakov, A., Restrepo, J., Klehm, O., Hegedus, R., Eisemann, E., Seidel, H.-P., and Ihrke, I. (2013). A reconfigurable camera add-on for high dynamic range, multispectral, polarization, and light-field imaging. ACM Transactions on Graphics, 32(4), Article 47.
Ming, Y., Meng, X., Fan, C., and Yu, H. (2021). Deep learning for monocular depth estimation: A review. Neurocomputing.
Nair, R., Ruhl, K., Lenzen, F., Meister, S., Schäfer, H., Garbe, C. S., Eisemann, M., Magnor, M., and Kondermann, D. (2013). A survey on time-of-flight stereo fusion. In Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications, pages 105–127. Springer.
Nozick, V. (2011). Multiple view image rectification. In 2011 1st International Symposium on Access Spaces (ISAS), pages 277–282. IEEE.
Ota, M., Fukushima, N., Yendo, T., Tanimoto, M., and Fujii, T. (2009). Rectification of pure translation 2d camera array. In Proceedings of the Korean Society of Broadcast Engineers Conference, pages 659–663. The Korean Institute of Broadcast and Media Engineers.
Park, J., Kim, H., Tai, Y.-W., Brown, M. S., and Kweon, I. (2011). High quality depth map upsampling for 3d-tof cameras. In 2011 International Conference on Computer Vision, pages 1623–1630. IEEE.
Park, K., Kim, S., and Sohn, K. (2018). High-precision depth estimation with the 3d lidar and stereo fusion. In 2018 IEEE International Conference on Robotics and Automation (ICRA), pages 2156–2163. IEEE.
Sara, U., Akter, M., and Uddin, M. S. (2019). Image quality assessment through FSIM, SSIM, MSE and PSNR—a comparative study. Journal of Computer and Communications, 7(3):8–18.
Scharstein, D., Hirschmüller, H., Kitajima, Y., Krathwohl, G., Nešić, N., Wang, X., and Westling, P. (2014). High-resolution stereo datasets with subpixel-accurate ground truth. In German Conference on Pattern Recognition, pages 31–42. Springer.