3D Shape and Indirect Appearance by Structured Light Transport

Abstract

We present an imaging technique that manipulates direct and indirect light in scenes to enhance visual analysis. Our key observation is that direct light always obeys the epipolar geometry of a projector-camera pair, while indirect light overwhelmingly does not. Exploiting this principle gives our method three capabilities: generating indirect-only video streams, making structured-light shape recovery algorithms more robust to indirect light, and capturing dynamic 3D shape in a single shot that is robust to global illumination effects.

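Paragraph function: Overview of the whole paper. A single core observation (epipolar geometry distinguishes direct from indirect light) ties together three technical contributions.

Logical role: The abstract does double duty as a statement of physical insight and a preview of applications. It first points out the asymmetry in the geometric constraint, then presents the three practical capabilities that follow from it.

Argumentative technique / potential weakness: The wording that indirect light "overwhelmingly" does not obey epipolar geometry leaves "overwhelmingly" vague. The authors need to quantify this proportion in the method section and state which scenes may violate the assumption.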

1. Introduction

Structured light scanning is one of the most widely used techniques for acquiring 3D shape. A projector casts known patterns onto a scene, and a camera observes the deformed patterns to establish projector-camera correspondences that yield depth through triangulation. Despite its maturity, structured light remains susceptible to global illumination effects — interreflections, subsurface scattering, and volumetric scattering — which corrupt the observed patterns and cause systematic errors in reconstructed geometry.

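Paragraph function: Establishes the research area by introducing the basic principle of structured light scanning and its known weaknesses.

Logical role: The starting point of the argument chain. It first acknowledges how widely structured light is used, then exposes global illumination as the core pain point, creating the need for the solution proposed later.

Argumentative technique / potential weakness: Naming three concrete global illumination phenomena (interreflections, subsurface scattering, volumetric scattering) makes the problem tangible and lets the reader grasp how varied the challenge is. This effectively implies that a unified framework is needed rather than a case-by-case fix.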
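To make the triangulation step in the paragraph above concrete, here is a minimal sketch. It assumes a rectified projector-camera pair (parallel optical axes, projector displaced along +x from the camera, shared focal length), which the text does not specify. The function name and all numeric values are hypothetical.

```python
def depth_from_correspondence(cam_col, proj_col, focal_px, baseline_m):
    """Depth (in metres) of a scene point for a rectified projector-camera pair.

    Assumption: the camera sits at the origin and the projector at
    x = +baseline_m, both with focal length `focal_px` in pixels, so
    depth follows the usual stereo relation z = f * b / disparity.
    """
    disparity = cam_col - proj_col
    if disparity <= 0:
        raise ValueError("non-positive disparity: invalid correspondence")
    return focal_px * baseline_m / disparity


# Hypothetical setup: 1000 px focal length, 0.2 m baseline.
z_correct = depth_from_correspondence(520.0, 420.0, 1000.0, 0.2)

# The same pixel with a correspondence that is off by four projector
# columns, as might happen when indirect light corrupts the pattern.
z_corrupted = depth_from_correspondence(520.0, 416.0, 1000.0, 0.2)

print(z_correct, z_corrupted)
```

Under these assumed numbers, a four-column correspondence error shifts the recovered depth by several centimetres, which illustrates why corrupted patterns lead to systematic geometry errors.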

Previous approaches to handling indirect light in structured light systems fall into two categories: designing illumination patterns that are inherently robust to global light transport, or explicitly separating direct and indirect components using high-frequency patterns. The former sacrifices pattern coding efficiency, while the latter requires many additional images that preclude real-time or single-shot acquisition. Our approach is fundamentally different: we exploit the epipolar geometry inherent in any projector-camera system to achieve separation without sacrificing coding efficiency or requiring extra captures.

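Paragraph function: Critiques existing methods by pointing out the fundamental limitations of the two classes of existing solutions.

Logical role: A classic dichotomy critique. It sorts existing methods into two classes, names the fatal flaw of each, and then introduces its own approach with the phrase "fundamentally different".

Argumentative technique / potential weakness: The phrase "fundamentally different" hints at a paradigm shift, but exploiting epipolar geometry is not a new idea in computer vision. The authors' real innovation lies in applying this geometric constraint to the specific problem of light separation.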

Direct-indirect separation was pioneered by Nayar et al., who showed that high-frequency illumination patterns can isolate direct light from global illumination. Their method requires a minimum of three images per separation, making it impractical for dynamic scenes. Subsequent work by Gupta et al. explored logical coding strategies that combine separation with structured light decoding, but still requires multiple shots. In contrast, epipolar-based methods from stereo vision have long exploited geometric constraints for correspondence, yet their application to light transport separation in projector-camera systems has not been explored.

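Paragraph function: Literature review tracing the academic lineage of direct-indirect separation.

Logical role: A chronological narrative (Nayar -> Gupta -> this paper) establishes how the methods evolved, while pointing out the gap in the literature: applying epipolar geometry to light separation.

Argumentative technique / potential weakness: Transplanting mature epipolar methods from stereo vision into a new setting is a typical cross-domain transfer argument. The authors need to explain why this transfer is not obvious, or readers may question its novelty.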
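For comparison, the high-frequency separation attributed to Nayar et al. above can be sketched as a per-pixel max/min over shifted patterns. The sketch assumes an idealized setting that the text does not spell out: half of the projector pixels are lit in each pattern, unlit projector pixels emit no light, and every scene point is seen both lit and unlit across the shifts. The function name and sample values are hypothetical.

```python
def separate_max_min(frames):
    """Per-pixel direct/global estimate from shifted high-frequency patterns.

    `frames` is a list of images, each a flat list of pixel intensities,
    captured under shifted patterns with half the projector pixels on.
    Idealized model: a lit point returns direct + global / 2, an unlit
    point returns global / 2.
    """
    direct, global_ = [], []
    for samples in zip(*frames):          # all observations of one pixel
        l_max, l_min = max(samples), min(samples)
        direct.append(l_max - l_min)      # direct = max - min
        global_.append(2.0 * l_min)       # global = 2 * min
    return direct, global_


# Two pixels observed under three shifted patterns (made-up values).
frames = [
    [0.9, 0.1],
    [0.3, 0.1],
    [0.3, 0.5],
]
direct, global_ = separate_max_min(frames)
print(direct, global_)
```

The need for several captures per separation is visible in the input: each pixel needs at least one lit and one unlit observation, which is the cost the epipolar approach aims to avoid.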

3. Method

3.1 Epipolar Constraint for Light Separation

Consider a projector-camera pair with known calibration. When the projector illuminates a single column, taken here to coincide with an epipolar line (as in a rectified setup), direct light from a scene point arrives at the camera on the corresponding epipolar line. This is a fundamental consequence of epipolar geometry: the illuminated projector pixel, the scene point, and the camera pixel that observes it must lie in one epipolar plane, which also contains both optical centers. Indirect light, however, undergoes one or more additional bounces in the scene before reaching the camera, which generally breaks this coplanarity constraint. Therefore, by analyzing the spatial distribution of received light relative to epipolar lines, we can classify each light contribution as direct or indirect.

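Paragraph function: First step of the method derivation, establishing the theoretical basis that links the epipolar constraint to light classification.

Logical role: The mathematical core of the paper's argument. Starting from the basic definition of epipolar geometry, it derives the key separation criterion: direct light satisfies the coplanarity constraint and indirect light does not.

Argumentative technique / potential weakness: The derivation is clear and rigorous, going straight from geometric axioms to the separation criterion. However, the claim that indirect light breaks coplanarity rests only on the path containing at least one extra bounce. With specular interreflection, indirect light can still land on the epipolar line in certain special geometric configurations.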
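The coplanarity test described in 3.1 can be illustrated with a scalar triple product. This is a minimal sketch under assumed geometry: the optical centers and scene points below are made-up coordinates, and `epipolar_residual` is a hypothetical helper, not a function from the paper.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def epipolar_residual(proj_center, cam_center, lit_point, seen_point):
    """Scalar triple product of the baseline, projector ray and camera ray.

    It is zero exactly when the projector ray toward `lit_point` and the
    camera ray toward `seen_point` lie in one plane through both centers.
    """
    baseline = sub(proj_center, cam_center)
    proj_ray = sub(lit_point, proj_center)
    cam_ray = sub(seen_point, cam_center)
    return dot(baseline, cross(proj_ray, cam_ray))


P = (0.2, 0.0, 0.0)      # projector center (assumed)
C = (0.0, 0.0, 0.0)      # camera center (assumed)
X = (0.1, 0.05, 1.0)     # scene point lit by the projector

# Direct path: the camera observes the lit point itself.
direct = epipolar_residual(P, C, X, X)

# Indirect path: light bounces from X to Y and the camera observes Y.
Y = (0.3, 0.25, 1.2)
indirect = epipolar_residual(P, C, X, Y)

# Degenerate indirect path: the bounce target stays in the epipolar plane.
Y_in_plane = (0.4, 0.06, 1.2)
degenerate = epipolar_residual(P, C, X, Y_in_plane)

print(direct, indirect, degenerate)
```

The third case shows the weakness noted in the commentary: an indirect path whose bounce stays inside the epipolar plane passes the same test as direct light.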

3.2 Epipolar Separation Algorithm

Our separation algorithm operates in two steps. First, we project a set of line patterns aligned with the projector's epipolar lines. For each camera pixel, the direct component produces a sharp peak in the epipolar direction, while the indirect component forms a diffuse spread across non-epipolar directions. Second, we apply a 1D filter along each epipolar line to extract the peak (direct) and subtract it from the total to obtain the indirect component. The filter is designed to be robust to noise and calibration imperfections.

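Paragraph function: Algorithm details, turning the theory into an executable two-step algorithm.

Logical role: The bridge from the geometric principle of 3.1 to a concrete implementation. It turns the epipolar constraint into a 1D filtering operation along epipolar lines.

Argumentative technique / potential weakness: The contrast between a sharp peak and a diffuse spread makes the separation intuitive. However, the filter's specific design parameters (bandwidth, cutoff frequency) directly affect separation quality, and scenes with very smooth, glossy materials may be challenging.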
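The second step (extract the peak, subtract it from the total) can be sketched on a single 1D intensity profile. The text does not specify the filter, so a running median is used here purely as a stand-in for the background estimate; the function name, window size and sample profile are all hypothetical.

```python
from statistics import median

def split_profile(profile, half_window=2):
    """Split a 1D intensity profile into (direct, indirect) estimates.

    Stand-in filter (an assumption, not the paper's design): a running
    median over up to 2 * half_window + 1 samples tracks the smooth
    background and ignores a narrow peak. Whatever rises above that
    background is treated as direct light; the rest is indirect.
    """
    n = len(profile)
    direct, indirect = [], []
    for i, value in enumerate(profile):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        background = median(profile[lo:hi])
        d = max(value - background, 0.0)
        direct.append(d)
        indirect.append(value - d)      # total minus direct
    return direct, indirect


# Made-up profile: a diffuse bump with one sharp peak at index 3.
profile = [0.10, 0.12, 0.14, 0.90, 0.14, 0.12, 0.10]
direct, indirect = split_profile(profile)
print(direct)
print(indirect)
```

The window size plays the role of the bandwidth parameter mentioned in the commentary: too narrow and the peak leaks into the background estimate, too wide and a broad glossy lobe is misread as direct light.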

3.3 Robust Shape Recovery

With the direct-indirect separation in place, we integrate it into structured light shape recovery. Standard structured light algorithms decode binary or Gray code patterns to find projector-camera correspondences. When indirect light corrupts these patterns, decoding errors propagate to the reconstructed 3D geometry. Our approach first removes the indirect component from each captured frame, then applies standard decoding algorithms to the cleaned direct-only images. Furthermore, we demonstrate a single-shot variant that embeds the separation process within a single projected pattern using epipolar-aware design, enabling dynamic 3D capture of moving objects.

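Paragraph function: Application integration, showing how the separation technique improves existing structured light methods.

Logical role: This paragraph turns the theoretical contribution (light separation) into practical value (robust 3D shape recovery and dynamic capture), closing the argument loop from why it matters to how to use it.

Argumentative technique / potential weakness: The single-shot variant is the contribution with the greatest engineering impact, but its trade-off between speed and accuracy needs to be quantified experimentally. Whether the epipolar-aware design works under every scene geometry also deserves discussion.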
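To show how a decoding error propagates to geometry, here is a minimal Gray code decoding sketch. It assumes each camera pixel has already been thresholded into one bit per projected pattern, most significant bit first; the function names and the 4-bit example are hypothetical.

```python
def gray_to_binary(gray):
    """Convert a reflected Gray code integer to its plain binary value."""
    value = gray
    shift = gray >> 1
    while shift:
        value ^= shift
        shift >>= 1
    return value

def decode_pixel(bits):
    """Decode one pixel's bit sequence (MSB first) to a projector column."""
    gray = 0
    for bit in bits:
        gray = (gray << 1) | bit
    return gray_to_binary(gray)


# Bits observed at one camera pixel across four patterns (made up).
clean_bits = [1, 1, 0, 1]
# Same pixel with the first pattern's bit flipped, as could happen
# when indirect light brightens a pixel that should be dark or vice versa.
corrupted_bits = [0, 1, 0, 1]

clean_column = decode_pixel(clean_bits)
corrupted_column = decode_pixel(corrupted_bits)
print(clean_column, corrupted_column)
```

A single flipped bit moves the decoded column from 9 to 6 in this example, and the triangulated depth moves with it. Decoding direct-only images is meant to prevent such flips.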

4. Experiments

We evaluate our method on scenes exhibiting strong interreflections (concave metallic objects), subsurface scattering (translucent materials like wax and marble), and volumetric scattering (scenes with participating media). For shape recovery, we compare against standard Gray code decoding and the high-frequency separation method of Nayar et al. Our epipolar-based separation achieves comparable or superior 3D reconstruction accuracy while requiring significantly fewer input images. The single-shot variant successfully captures dynamic scenes at video frame rates with shape accuracy within 1-2 mm of the multi-shot baseline. Additionally, the indirect-only video streams reveal material properties invisible in standard imaging, such as subsurface scattering profiles.

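Paragraph function: Provides comprehensive empirical evidence, validating performance across several global illumination scenarios and against baseline methods.

Logical role: The empirical pillar covers three dimensions: (1) breadth across types of global illumination; (2) accuracy compared with existing methods; (3) the time efficiency of single-shot dynamic capture.

Argumentative technique / potential weakness: The 1-2 mm accuracy figure adds quantitative support. However, all experimental scenes are controlled laboratory setups; performance in large-scale outdoor scenes or under complex occlusion remains to be verified.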

5. Conclusion

We have presented a novel approach to structured light 3D scanning that exploits epipolar geometry to separate direct from indirect light transport. This geometric insight enables robust shape recovery in the presence of interreflections, subsurface scattering, and volumetric scattering, while also providing access to indirect-only appearance information. Our single-shot variant demonstrates the practical applicability of the approach for dynamic 3D capture. Future work includes extending the method to multi-projector systems and exploring the rich information contained in the separated indirect light transport for material classification and relighting applications.

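Paragraph function: Summarizes the paper, restating the core contributions and looking ahead to future directions.

Logical role: The conclusion mirrors the structure of the abstract, moving from the geometric principle back to application value to close the argument loop. The proposed future directions (multi-projector systems, material classification) show that the method can be extended.

Argumentative technique / potential weakness: The conclusion is suitably modest and forward-looking, but it does not adequately discuss the method's limitations, such as the effect of calibration accuracy on separation quality or the limits imposed by projector resolution. For an Honorable Mention paper, reflecting on its own limitations would make the argument more complete.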

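Argument structure overview

Problem: Global illumination effects corrupt structured-light 3D scanning

→ Claim: Epipolar geometry can distinguish direct from indirect light

→ Evidence: Robustness validated across multiple scenes; single-shot accuracy within 1-2 mm of the multi-shot baseline

→ Rebuttal: Needs fewer images than existing methods without sacrificing accuracy

→ Conclusion: The epipolar constraint is a unified geometric framework for light separation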

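Authors' core claim (in one sentence)

By exploiting the epipolar geometry inherent in a projector-camera system, direct and indirect light in a scene can be separated without additional captures, enabling robust 3D shape recovery and analysis of indirect appearance.

Strongest point of the argument

Generality of the geometric principle: the epipolar constraint is an intrinsic property of any projector-camera system and does not depend on specific hardware or material assumptions. Because the method starts from basic projective geometry, it is in theory robust to all types of global illumination effects (interreflections, subsurface scattering, volumetric scattering), and the single-shot variant demonstrates practical engineering value.

Weakest point of the argument

Degenerate cases in special geometric configurations: in certain symmetric or specular geometries, the reflection path of indirect light may happen to lie in the epipolar plane, causing separation to fail. In addition, the experiments were conducted only in controlled laboratory environments; the method's performance in large scenes, outdoor conditions, or with poor calibration has not been verified.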