Trifocal tensor

In computer vision, the trifocal tensor (also tritensor) is a 3×3×3 array of numbers (i.e., a tensor) that incorporates all projective geometric relationships among three views. It relates the coordinates of corresponding points or lines in three views, is independent of the scene structure, and depends only on the relative motion (i.e., pose) among the three views and their intrinsic calibration parameters. Hence, the trifocal tensor can be considered the generalization of the fundamental matrix to three views. Although the tensor has 27 elements, only 18 of them are independent.
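
The figure of 18 independent parameters follows from a standard counting argument, sketched here for convenience. Each of the three $3 \times 4$ camera matrices is defined up to scale and so has 11 degrees of freedom, while the choice of projective world frame (a $4 \times 4$ homography defined up to scale) removes 15:

$3 \times 11 - 15 = 18$

Since the 27 entries are themselves only defined up to a common scale, this leaves $27 - 1 - 18 = 8$ independent algebraic constraints on the entries of a valid trifocal tensor.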

Correlation slices

The tensor can also be seen as a collection of three rank-two 3×3 matrices ${\mathbf T}_1, \; {\mathbf T}_2, \; {\mathbf T}_3$ known as its correlation slices. Assuming that the projection matrices of the three views are ${\mathbf P}=[ {\mathbf I} \; | \; {\mathbf 0} ]$, ${\mathbf P}^{'}=[ {\mathbf A} \; | \; {\mathbf a}_4 ]$ and ${\mathbf P}^{''}=[{\mathbf B} \; | \; {\mathbf b}_4 ]$, the correlation slices of the corresponding tensor can be expressed in closed form as ${\mathbf T}_i={\mathbf a}_i {\mathbf b}_4^t - {\mathbf a}_4 {\mathbf b}_i^t, \; i=1 \ldots 3$, where ${\mathbf a}_i$ and ${\mathbf b}_i$ are the $i$-th columns of ${\mathbf P}^{'}$ and ${\mathbf P}^{''}$ respectively. In practice, however, the tensor is estimated from point and line matches across the three views.
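
The closed-form expression for the correlation slices can be checked numerically. The following is a minimal NumPy sketch rather than a reference implementation: the helper name `trifocal_from_cameras`, the zero-based storage order (`T[i]` holds ${\mathbf T}_{i+1}$) and the random test cameras are assumptions made purely for illustration.

```python
import numpy as np

def trifocal_from_cameras(P2, P3):
    """Build the 3x3x3 tensor from the second and third camera matrices.

    Assumes the first camera is [I | 0]. T[i] stores the slice
    a_i b_4^T - a_4 b_i^T (zero-based i, hypothetical storage layout).
    """
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, i], b4) - np.outer(a4, B[:, i])
                     for i in range(3)])

# Random cameras stand in for real ones (illustrative data only).
rng = np.random.default_rng(0)
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))

T = trifocal_from_cameras(P2, P3)

# Each slice is a difference of two rank-one matrices, so its rank is at most two.
ranks = [int(np.linalg.matrix_rank(T[i], tol=1e-9)) for i in range(3)]
print(T.shape, ranks)
```

For generic cameras every slice has rank exactly two, which matches the description of the correlation slices given above.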

Trilinear constraints

One of the most important properties of the trifocal tensor is that it gives rise to relationships between lines and points in three images that are linear in the entries of the tensor. More specifically, for triplets of corresponding points ${\mathbf x} \; \leftrightarrow \; {\mathbf x}^{'} \; \leftrightarrow \;{\mathbf x}^{''}$ and any corresponding lines ${\mathbf l} \; \leftrightarrow \; {\mathbf l}^{'} \; \leftrightarrow \;{\mathbf l}^{''}$ through them, the following trilinear constraints hold:

$({\mathbf l}^{'t} \left[{\mathbf T}_1, \; {\mathbf T}_2, \; {\mathbf T}_3 \right] {\mathbf l}^{''}) [{\mathbf l}]_{\times} = {\mathbf 0}^t$

${\mathbf l}^{'t} \left( \sum_i x_i {\mathbf T}_i \right) {\mathbf l}^{''} = 0$

${\mathbf l}^{'t} \left( \sum_i x_i {\mathbf T}_i \right) [{\mathbf x}^{''}]_{\times} = {\mathbf 0}^t$

$[{\mathbf x}^{'}]_{\times} \left( \sum_i x_i {\mathbf T}_i \right) {\mathbf l}^{''} = {\mathbf 0}$

$[{\mathbf x}^{'}]_{\times} \left( \sum_i x_i {\mathbf T}_i \right) [{\mathbf x}^{''}]_{\times} = {\mathbf 0}_{3 \times 3}$

where $[\cdot]_{\times}$ denotes the skew-symmetric cross product matrix.
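
The five constraints can be verified on synthetic, noise-free data. The sketch below is only an illustration under stated assumptions: the cameras and scene points are random, the tensor is built from the closed-form slices ${\mathbf a}_i {\mathbf b}_4^t - {\mathbf a}_4 {\mathbf b}_i^t$, and the helper names `skew` and `trifocal_from_cameras` are hypothetical. Corresponding lines are obtained by imaging a 3D line spanned by two scene points.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix with skew(v) @ w equal to np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def trifocal_from_cameras(P2, P3):
    """Slices T[i] = a_i b_4^T - a_4 b_i^T, assuming the first camera is [I | 0]."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, i], b4) - np.outer(a4, B[:, i])
                     for i in range(3)])

rng = np.random.default_rng(1)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_from_cameras(P2, P3)

# Two homogeneous scene points span a 3D line; X also serves as the matched point.
X, Y = rng.standard_normal(4), rng.standard_normal(4)
x, x2, x3 = P1 @ X, P2 @ X, P3 @ X
l = np.cross(P1 @ X, P1 @ Y)    # image of the 3D line in view 1
l2 = np.cross(P2 @ X, P2 @ Y)   # ... in view 2
l3 = np.cross(P3 @ X, P3 @ Y)   # ... in view 3

M = np.einsum('i,ijk->jk', x, T)  # sum_i x_i T_i

residuals = {
    "line-line-line": np.array([l2 @ T[i] @ l3 for i in range(3)]) @ skew(l),
    "point-line-line": l2 @ M @ l3,
    "point-line-point": l2 @ M @ skew(x3),
    "point-point-line": skew(x2) @ M @ l3,
    "point-point-point": skew(x2) @ M @ skew(x3),
}
max_residual = max(float(np.max(np.abs(r))) for r in residuals.values())
print(max_residual)
```

All residuals should vanish up to floating-point round-off. With measured (noisy) correspondences they would not be exactly zero, which is why the constraints are typically used as equations to be satisfied in a least-squares sense.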

Transfer

Given the trifocal tensor of three views and a pair of matched points in two views, it is possible to determine the location of the point in the third view without any further information. This is known as point transfer and a similar result holds for lines.
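
Point transfer can be sketched from the trilinear constraints: for any line ${\mathbf l}^{'}$ through ${\mathbf x}^{'}$ that is not the epipolar line of ${\mathbf x}$, the vector $\left( \sum_i x_i {\mathbf T}_i \right)^t {\mathbf l}^{'}$ is proportional to ${\mathbf x}^{''}$. The NumPy example below is a minimal illustration for noise-free matches, not a definitive algorithm. The function names are hypothetical, the cameras are random, and the rule for choosing the line (take the rows of $[{\mathbf x}^{'}]_{\times}$ as candidate lines and keep the one giving the largest result) is a simplification chosen here to avoid the degenerate line.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix with skew(v) @ w equal to np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def trifocal_from_cameras(P2, P3):
    """Slices T[i] = a_i b_4^T - a_4 b_i^T, assuming the first camera is [I | 0]."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, i], b4) - np.outer(a4, B[:, i])
                     for i in range(3)])

def transfer_point(T, x, x2):
    """Transfer the match x <-> x2 (views 1 and 2) to a homogeneous point in view 3."""
    M = np.einsum('i,ijk->jk', x, T)   # sum_i x_i T_i
    # Each row of skew(x2) is a line through x2, so each row of the product
    # below is proportional to the transferred point (or zero if degenerate).
    candidates = skew(x2) @ M
    return candidates[np.argmax(np.linalg.norm(candidates, axis=1))]

rng = np.random.default_rng(2)
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_from_cameras(P2, P3)

X = rng.standard_normal(4)             # homogeneous scene point
x, x2, x3_true = X[:3], P2 @ X, P3 @ X  # view 1 uses P = [I | 0]

x3 = transfer_point(T, x, x2)

# Homogeneous points agree if they are parallel: compare unit vectors.
u = x3 / np.linalg.norm(x3)
v = x3_true / np.linalg.norm(x3_true)
parallel_error = float(np.linalg.norm(np.cross(u, v)))
print(parallel_error)
```

With noisy matches the candidate rows no longer agree exactly, so a practical implementation would need a more careful choice of the line through ${\mathbf x}^{'}$ than the heuristic used in this sketch.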

This article is issued from Wikipedia - version of the 3/18/2013. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.