In our project we used a Convolutional Neural Network to produce a matching cost that could then be aggregated and refined to compute pixel-wise disparities. To make training computationally efficient, it was necessary to use a fully connected network; to make testing computationally efficient, it was necessary to transform that fully connected network into an equivalent convolutional network.
Once matching costs were computed, we made use of a context-aware aggregation scheme called Cross-Based Cost Aggregation. We then estimated disparities using a “winner-takes-all” minimization approach, and refined the computed pixel-wise disparities using occlusion interpolation. We found that the CNN-based approach leads to disparity maps that are smoother than those obtained with a naive approach. Low-texture regions, which are traditionally considered difficult to produce disparity values for, are modelled rather well by this approach. Anaglyphs were generated from the depth maps to evaluate the results subjectively. The next step for us is to rigorously compare our performance against ground truth for our data-set. Owing to computing constraints we have not yet been able to run exhaustive evaluations, but we are in the process of obtaining these resources, and plan to produce quantitative results against the data-set in the coming days.
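The winner-takes-all step described above can be sketched in a few lines. This is a toy numpy example under assumed shapes (the disparity range, image size, and random cost volume are all hypothetical stand-ins for our aggregated costs): given a cost volume C[d, y, x], each pixel is assigned the disparity d with the minimum cost, and a simple left-fill heuristic stands in for occlusion interpolation.

```python
import numpy as np

rng = np.random.default_rng(1)
D, H, W = 5, 4, 6                    # disparity range and image size (assumptions)
cost = rng.random((D, H, W))         # stands in for the aggregated matching costs

# Winner-takes-all: pick the minimum-cost disparity at every pixel.
disparity = np.argmin(cost, axis=0)

# Toy occlusion handling (assumption: a boolean occlusion mask is available;
# here a crude cost threshold stands in for a real consistency check).
# Each occluded pixel is filled with the nearest valid disparity to its left,
# a common interpolation heuristic for occluded regions.
occluded = cost.min(axis=0) > 0.9
filled = disparity.astype(float)
for y in range(H):
    last_valid = 0.0
    for x in range(W):
        if occluded[y, x]:
            filled[y, x] = last_valid
        else:
            last_valid = filled[y, x]
```

In practice the argmin runs over the aggregated cost volume rather than raw per-pixel costs, which is what gives the smoother maps noted above.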