Foreign-literature translation: Detecting ground shadows in everyday outdoor photographs

Foreign-language source text

Detecting ground shadows in outdoor consumer photographs

Jean-Francois Lalonde, Alexei A. Efros, and Srinivasa G. Narasimhan

School of Computer Science, Carnegie Mellon University

Project webpage: http://graphics.cs.cmu.edu/projects/shadows

Abstract. Detecting shadows in images can significantly improve the performance of several vision tasks such as object detection and tracking. Recent approaches have mainly used illumination invariants, which can fail severely when image quality is poor, as is the case for most consumer-grade photographs, like those on Google or Flickr. We present a practical algorithm to automatically detect shadows cast by objects onto the ground, from a single consumer photograph. Our key hypothesis is that the types of materials constituting the ground in outdoor scenes are relatively limited, most commonly including asphalt, brick, stone, mud, grass, concrete, etc. As a result, the appearances of shadows on the ground are not as widely varying as general shadows and thus can be learned from a labelled set of images. Our detector consists of a three-tier process including (a) training a decision tree classifier on a set of shadow-sensitive features computed around each image […]

1 Introduction

Shadows are everywhere! Yet the human visual system is so adept at filtering them out that we never give shadows a second thought; that is, until we need to deal with them in our algorithms. Since the very beginning of computer vision, the presence of shadows has been responsible for wreaking havoc on a wide variety of applications, including segmentation, object detection, scene analysis, stereo, tracking, etc. On the other hand, shadows play a crucial role in determining the type of illuminat[…]

[…] is because the appearances and shapes of shadows outdoors depend on several hidden factors such as the color, direction and size of the illuminants (sun, sky, clouds), the geometry of the objects that are casting the shadows, and the shape and material properties of the objects onto which the shadows are cast.

Most works for detecting shadows from a single image are based on computing illumination invariants that are physically based and are functions of individual pixel values [10–14] or the values in a local image neighborhood [15]. Unfortunately, reliable computation of these invariants requires high-quality images with wide dynamic range and high intensity resolution, in which the camera radiometry and color transformations are accurately measured and compensated for. Even slight perturbations (imper[…]

Our goal is to build a reliable shadow detector for consumer photographs of outdoor scenes. While detecting all shadows is expected to remain hard, we explicitly focus on the shadows cast by objects onto the ground plane. Fortunately, the types of materials constituting the ground in typical outdoor scenes are (relatively) limited, most commonly including concrete, asphalt, grass, mud, stone, brick, etc. Given this observation, our key hypothesis is that the appearances of shadows on the ground […]

Overview

Our approach consists of three stages, depending on the information in the image used. In the first stage, we exploit local information around edges in the image. For this, we compute a set of shadow-sensitive features that include the ratios of brightness and color filter responses at different scales and orientations on both sides of the edge. These features are then used with a trained decision tree classifier to detect whether an edge is a shadow or not. The idea is that while any single feature may not be useful for detecting all ground shadows, the classifier is p[…]

In the second stage, we enforce a grouping of the shadow edges using a Conditional Random Field (CRF) to create longer contours. This is similar in spirit to the classical constrained label propagation used in mid-level vision tasks [17]. This procedure connects likely shadow edges, discourages T-junctions, which are highly unlikely on shadow boundaries, and removes isolated weak edges. But how do we detect the ground in an image? For this, in the third stage, we incorporate a global scene layout […]

We demonstrate successful shadow detection on several images of natural scenes that include beaches, meadows and forest trails, as well as urban scenes that include numerous pedestrians, vehicles, trees, roads and buildings, captured under a variety of illumination conditions (sunny, partly cloudy, overcast). Similarly to the approach of Zhu et al. [19], our method relies on learning the appearance of shadows based on image features, but does so by using full color information. We found that usi[…]

2 Learning local cues for shadow detection

Our approach relies on a classifier which is trained to recognize ground shadow edges by using features computed over a local neighborhood around the edge. We show that it is indeed possible to obtain good classification accuracy by relying on local cues, and that the classifier can be used as a building block for subsequent steps. In this section, we describe how to build, train, and evaluate such a classifier.

2.1 From pixels to boundaries

We first describe the underlying representation on which we compute features. Since working with individual pixels is prone to noise and computationally expensive, we propose to instead reason about boundaries, or groups of pixels along an edge in the image. To obtain these boundaries, we first smooth the image with a bilateral filter [20], compute gradient magnitudes on the filtered image, and then apply the watershed segmentation algorithm on the gradient map. Fig. 1(b) shows a close-up exampl[…]

(a) Input image (b) Boundaries (c) Strong boundaries (d) Output

Fig. 1. Processing stages for the local classifier. The input image (a) is over-segmented into thousands of regions to obtain boundaries (b). Weak boundaries are filtered out by a Canny edge detector (c), and the classifier is applied on the remainder. (d) shows the boundaries i for which P(y_i = 1 | x) > 0.5. Note the correct classification of occlusion contours around the person’s legs and the reflectance edges in the white square between the person’s feet.

An undesirable consequence of the watershed segmentation is that it generates boundaries in smooth regions of the image (Fig. 1(b)). To compensate for this, we retain only those boundaries which align with the strong edges in the image. For this, we use the Canny edge detector at 4 scales to account for blurry shadow edges (σ² = {1, 2, 4, 8}), with a high threshold empirically set to t = 0.3. Under these conditions, we verified that the initial set of boundaries contains more than 97% of the tru[…]

2.2 Local shadow features

We now describe the features computed over each boundary in the image. A useful feature to describe a shadow edge is the ratio of color intensities on both sides of the edge (e.g. min divided by max) [21]. The intuition is that shadows should have a specific ratio that is more or less the same across an image, since it is primarily due to the differences in natural lighting inside and outside the shadow. Since it is hard to manually determine the best color space [22] or best scale to compute fea[…]

For a pixel along a boundary, we compute the intensity on one side of the edge (say, the left) by evaluating a weighted sum of pixels on the left of the edge. But which pixels to choose? We could use the watershed segments, but they do not typically extend very far. Instead, we use an oriented Gaussian derivative filter of variance σ², but keep only its values which are greater than zero. We align the filter with the boundary orientation such that its positive weights lie on the left of the bo[…]

We also employ two features suggested in [19] which capture the texture and intensity distribution differences on both sides of a boundary. The first feature computes a histogram of textons at 4 different scales, and compares them using the χ²-distance. The texton dictionary was computed on a non-overlapping set of images. The second feature computes the difference in skewness of pixel intensities, again at the same 4 scales.

Finally, we concatenate the absolute value of the minimum filter response computed over the intensity channel to obtain the final, overcomplete, 48-dimensional feature vector at every pixel. The feature vector of a boundary is obtained by averaging the features of all pixels that belong to it.

2.3 Classifier

Having computed the feature vector x_i at each strong boundary in the image, we can now use these vectors to train a classifier to learn the probability P(y_i | x_i) that boundary i is due to a shadow (which we denote with label y_i). We estimate that distribution using a logistic regression version of AdaBoost [24], with twenty 16-node decision trees as weak learners. This classification method provides good feature selection and outputs probabilities, and has been successfully used in a variety of other […]

To train the classifier, we selected 170 images from LabelMe [16], Flickr, and the dataset introduced in [19], with the only conditions being that the ground must be visible and that there must be shadows. The positive training set contains manually labelled shadow boundaries, while the negative training set is populated with an equal number of strong non-shadow boundaries on the ground (e.g. street markings) and occlusion boundaries.
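To make the min/max ratio feature of Section 2.2 concrete, here is a minimal one-dimensional sketch in Python. It is an illustration under stated assumptions and not the authors' implementation: the pixel samples on each side are assumed to be already extracted along the normal to the boundary and ordered by distance from it, the weights are the positive lobe of a 1-D Gaussian derivative (the paper uses an oriented 2-D filter), and the names `half_gaussian_weights`, `edge_ratio` and `multiscale_ratios` are hypothetical.

```python
import math


def half_gaussian_weights(n, sigma2):
    """Positive lobe of a 1-D Gaussian derivative, sampled at offsets 1..n.

    Assumption: a 1-D stand-in for the oriented 2-D filter in the text.
    The weights are normalised to sum to 1.
    """
    if n <= 0:
        raise ValueError("need at least one sample per side")
    raw = [(t / sigma2) * math.exp(-t * t / (2.0 * sigma2)) for t in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def edge_ratio(left, right, sigma2=4.0):
    """Min-over-max ratio of the weighted intensities on both sides of an edge."""
    if len(left) != len(right):
        raise ValueError("both sides must have the same number of samples")
    weights = half_gaussian_weights(len(left), sigma2)
    a = sum(v * w for v, w in zip(left, weights))
    b = sum(v * w for v, w in zip(right, weights))
    lo, hi = min(a, b), max(a, b)
    # Two completely dark sides carry no contrast, so report a neutral ratio.
    return lo / hi if hi > 0 else 1.0


def multiscale_ratios(left, right, scales=(1.0, 2.0, 4.0, 8.0)):
    """Evaluate the ratio at several filter variances (the text uses 4 scales)."""
    return [edge_ratio(left, right, s) for s in scales]


if __name__ == "__main__":
    shadow_side = [0.2] * 5   # darker samples, hypothetical values
    lit_side = [0.8] * 5      # brighter samples, hypothetical values
    print(multiscale_ratios(shadow_side, lit_side))
```

Taking min over max makes the feature independent of which side of the boundary is in shadow, so the classifier does not need to know the edge polarity.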

We obtain a per-boundary classification accuracy of 79.7% (chance is 50%; see Fig. 5 for a breakdown per class). See Fig. 1(d) for an example. This result supports our hypothesis: while the appearance of shadows on any type of material in any condition might be impossible to learn, the space of shadow appearances on the ground in outdoor scenes may not be that large after all!

3 Creating shadow contours

Despite encouraging results, our classifier is limited by its locality since it treats each boundary independently of the next. However, the color ratios of a shadow boundary should be consistent with those of its neighbors, since the sources illuminating nearby scene points should also be similar. Thus, we can exploit higher-order dependencies across local boundaries to create longer shadow contours as well as remove isolated or spurious ones.

Fig. 2. Creating shadow contours by enforcing local consistency. Our CRF formulation may help to (a) bridge the gap across X-junctions where the local shadow classifier might be uncertain, and (b) remove spurious T-junctions which should not be caused by shadows.

To model these dependencies, we construct a graph with individual boundaries as nodes (such as those in Fig. 1(b)), drawing an edge between boundaries which meet at a junction point. We then define a CRF on that graph, which expresses the negative log-likelihood of a particular labeling y (i.e. an assignment of shadow/non-shadow to each boundary) given observed data x as a sum of unary potentials φ_i(y_i) and pairwise potentials ψ_{i,j}(y_i, y_j):

$$-\log P(\mathbf{y} \mid \mathbf{x}; \lambda, \beta) = \sum_{i \in B} \phi_i(y_i) + \lambda \sum_{(i,j) \in E} \psi_{i,j}(y_i, y_j) + \log Z_{\lambda,\beta} \qquad (1)$$

where B is the set of boundaries, E the set of edges between them, and λ and β are model parameters. In particular, λ is a weight controlling the relative importance of the two terms. Z_{λ,β} is the partition function, which depends on the parameters λ and β but not on the labeling y itself. Intuitively, we would like the unary potentials to penalize the assignment of the “shadow” label to boundaries which are not likely to be shadows according to our local classifier. This can be modeled using

$$\phi_i(y_i) = -\log P(y_i \mid x_i) \qquad (2)$$

We would also like the pairwise potentials to penalize the assignment of different labels to neighboring boundaries that have similar features, which can be written as

$$\psi_{i,j}(y_i, y_j) = \mathbf{1}(y_i \neq y_j) \exp\left(-\beta \lVert x_i - x_j \rVert_2^2\right), \qquad (3)$$

where 1(·) is the indicator function, and β is a contrast-normalization constant as suggested in [26]. In other words, we encourage neighboring boundaries which have similar features and strong local probabilities to be labelled as shadows.

The negative log-likelihood in (1) can be efficiently minimized using graph cuts [27–29]. The free parameters were assigned the values λ = 0.5 and β = 16, obtained by 2-fold cross-validation on a non-overlapping set of images.
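The labeling energy described above can be illustrated on a toy graph. The following Python sketch is a hedged illustration, not the authors' code: it assumes the unary term is −log P(y_i | x_i), assumes a contrast-sensitive pairwise term exp(−β‖x_i − x_j‖²) that is charged only when neighboring labels differ, and replaces graph cuts with exhaustive search, which is feasible only for a handful of boundaries. All function names and the example numbers are hypothetical.

```python
import itertools
import math


def unary(p_shadow, y):
    """-log P(y | x), given the local classifier's shadow probability."""
    p = p_shadow if y == 1 else 1.0 - p_shadow
    return -math.log(max(p, 1e-12))  # clamp to avoid log(0)


def pairwise(xi, xj, yi, yj, beta):
    """Contrast-sensitive penalty, charged only when the two labels differ."""
    if yi == yj:
        return 0.0
    dist2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-beta * dist2)


def energy(labels, probs, feats, edges, lam, beta):
    """Sum of unary terms plus lam times the sum of pairwise terms."""
    e = sum(unary(p, y) for p, y in zip(probs, labels))
    e += lam * sum(
        pairwise(feats[i], feats[j], labels[i], labels[j], beta) for i, j in edges
    )
    return e


def minimize_brute_force(probs, feats, edges, lam=0.5, beta=16.0):
    """Exhaustive search over all labelings (a stand-in for graph cuts)."""
    candidates = itertools.product((0, 1), repeat=len(probs))
    return min(candidates, key=lambda ys: energy(ys, probs, feats, edges, lam, beta))


if __name__ == "__main__":
    # Three boundaries in a chain; the middle one is locally ambiguous.
    probs = [0.9, 0.45, 0.9]
    feats = [[0.3], [0.3], [0.3]]   # identical features, so a strong smoothness pull
    edges = [(0, 1), (1, 2)]
    print(minimize_brute_force(probs, feats, edges))
```

In this toy chain the middle boundary would be labelled non-shadow on its own (probability 0.45), but the pairwise term pulls it to the label of its two confident, similar-looking neighbors. This is the gap-bridging behavior the section describes.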

Applying the CRF on our test images results in an improvement of roughly 1% in total classification accuracy, for a combined score of 80.5% (see Fig. 5(b)). But more importantly, in practice, the way the CRF is set up encourages continuity, crossing through X-junctions, and discourages T-junctions, as shown in Fig. 2. Since shadows are usually signaled by the presence of X-junctions and the absence of T-junctions [30], this reduces the number of false positives.

(a) Input (b) Local classifier (c) Shadow contours (d) Ground likelihood [18] (e) Combining (c) and (d)

Fig. 3. Incorporating scene layout for detecting cast shadows on the ground. Applying our shadow detector on a complex input image (a) yields false detections in the vertical structures because of complex effects like occlusion boundaries, self-shadowing, etc. (b) and (c). Recent work in scene layout extraction from single images [18] can be used to estimate the location of the ground pixels (d). We show how we can combine scene layout information with our shadow contour classifier to automatically detect cast shadows on the ground (e).

4 Incorporating scene layout

Until now, we have been considering the problem of detecting cast shadow boundaries on the ground with a classifier trained on local features and a CRF formulation which defines pairwise constraints across neighboring boundaries. While both approaches provide good classification accuracy, we show in Fig. 3 that applying them on the entire image generates false positives in the vertical structures of the scene. Reflections, transparency, occlusion boundaries, self-shadowing, and complex geometry […]

The advent of recent approaches which estimate a qualitative layout of the scene from a single image (e.g. splitting an image into three main geometric classes: the sky, vertical surfaces, and ground [18]) may provide explicit knowledge of where the ground is. Since such a scene layout estimator is specifically trained on general features of the scene and not the shadows, combining its output with our shadow detector should reduce the number of false positive (non-shadow) detections outside the […]

4.1 Combining scene layout with local shadow cues

To combine the scene layout probabilities with our local shadow classifier, we can marginalize the probability of shadows over the three geometric classes: sky S, ground G, and vertical surfaces V:

$$P(y_i \mid x_i) = \sum_{c_i \in \{G, V, S\}} P(y_i \mid c_i, x_i)\, P(c_i \mid x_i) \qquad (4)$$

where c_i is the geometric class label of boundary i, P(y_i | c_i, x_i) is given by our local shadow classifier, and P(c_i | x_i) by the scene layout classifier (we use the geometric context algorithm [18]). Unfortunately, this approach does not actually improve classification results: while it gets rid of false positives in the vertical structures, it also loses true positives on the ground along the way. This is because shadow likelihoods get down-weighted by low-confidence gr[…]

4.2 Combining scene layout with shadow contours

Intuitively, we would like to penalize an assignment to the shadow class when the probability of being on the ground is low. When it is high, however, we should let the shadow classifier decide. We can encode this behavior simply by modifying the unary potentials φ_i(y_i) from (2) in our CRF formulation:

(5)

Here, λ = 0.5 and β = 16 were found by cross-validation. They yield a good compromise between local evidence and smoothness constraints. This approach effectively combines local and mid-level shadow cues with high-level scene interpretation results, and yields an overall classification accuracy of 84.8% on our test set (see Fig. 5) without adding to the complexity of training our model. Observe how the results are significantly improved in Fig. 3(e) as compared to the other scenarios in Fig. 3(b) […]

References

1. Lalonde, J.F., Efros, A.A., Narasimhan, S.G.: Illumination estimation from a single outdoor image. In: IEEE International Conference on Computer Vision. (2009)
2. Sato, I., Sato, Y., Ikeuchi, K.: Illumination from shadows. IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (2003)
3. Matsushita, Y., Nishino, K., Ikeuchi, K., Sakauchi, M.: Illumination normalization with time-dependent intrinsic images for video surveillance. IEEE Transactions on Pattern Analysis and Machine Intelligence 26 (2004)
4. Finlayson, G.D., Fredembach, C., Drew, M.S.: Detecting illumination in images. In: IEEE International Conference on Computer Vision. (2007)
5. Weiss, Y.: Deriving intrinsic images from image sequences. In: IEEE International Conference on Computer Vision. (2001)
6. Huerta, I., Holte, M., Moeslund, T., Gonzàlez, J.: Detection and removal of chromatic moving shadows in surveillance scenarios. In: IEEE International Conference on Computer Vision. (2009)
7. Wu, T.P., Tang, C.K.: A Bayesian approach for shadow extraction from a single image. In: IEEE International Conference on Computer Vision. (2005)
8. Bousseau, A., Paris, S., Durand, F.: User-assisted intrinsic images. ACM Transactions on Graphics (SIGGRAPH Asia 2009) 28 (2009)
9. Shor, Y., Lischinski, D.: The shadow meets the mask: pyramid-based shadow removal. Computer Graphics Forum Journal (Eurographics 2008) 27 (2008)
10. Finlayson, G.D., Hordley, S.D., Drew, M.S.: Removing shadows from images. In: European Conference on Computer Vision. (2002)
11. Finlayson, G.D., Drew, M.S., Lu, C.: Intrinsic images by entropy minimization. In: European Conference on Computer Vision. (2004)
12. Finlayson, G.D., Drew, M.S., Lu, C.: Entropy minimization for shadow removal. International Journal of Computer Vision 85 (2009)
13. Maxwell, B.A., Friedhoff, R.M., Smith, C.A.: A bi-illuminant dichromatic reflection model for understanding images. In: IEEE Conference on Computer Vision and Pattern Recognition. (2008)
14. Tian, J., Sun, J., Tang, Y.: Tricolor attenuation model for shadow detection. IEEE Transactions on Image Processing 18 (2009)
15. Narasimhan, S.G., Ramesh, V., Nayar, S.K.: A class of photometric invariants: Separating material from shape and illumination. In: IEEE International Conference on Computer Vision. (2005)
16. Russell, B.C., Torralba, A., Murphy, K.P., Freeman, W.T.: LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision 77 (2008)
17. Freeman, W.T., Pasztor, E.C., Carmichael, O.T.: Learning low-level vision. International Journal of Computer Vision 40 (2000)
18. Hoiem, D., Efros, A.A., Hebert, M.: Recovering surface layout from an image. International Journal of Computer Vision 75 (2007)
19. Zhu, J., Samuel, K.G.G., Masood, S.Z., Tappen, M.F.: Learning to recognize shadows in monochromatic natural images. In: IEEE Conference on Computer Vision and Pattern Recognition. (2010)
20. Tomasi, C., Manduchi, R.: Bilateral filtering for gray and color images. In: Proceedings of the 6th International Conference on Computer Vision. (1998)
21. Barnard, K., Finlayson, G.D.: Shadow identification using colour ratios. In: Proc. IS&T/SID 8th Color Imaging Conf. Color Science, Systems and Applications. (2000)
22. Khan, E.A., Reinhard, E.: Evaluation of color spaces for edge classification in outdoor scenes. In: IEEE International Conference on Image Processing. (2005)
23. Chong, H.Y., Gortler, S.J., Zickler, T.: A perception-based color space for illumination-invariant image processing. ACM Transactions on Graphics (SIGGRAPH 2008) (2008)
