BSP-Net: Generating Compact Meshes via Binary Space Partitioning

Abstract — 摘要

We present BSP-Net, a network that generates compact and structured 3D meshes via Binary Space Partitioning (BSP). The key idea is to learn to recursively partition space using a set of hyperplanes, decomposing a shape into a collection of convex components. BSP-Net produces meshes that are compact (low polygon count), watertight, and directly usable without post-processing, in contrast to implicit methods that require expensive iso-surface extraction (e.g., Marching Cubes). Our network achieves comparable or better reconstruction quality with significantly fewer polygons.

本文提出 BSP-Net，一個透過二元空間分割（BSP）生成緊湊且結構化三維網格的網路。核心構想是學習以一組超平面遞迴地分割空間，將形狀分解為凸面組件的集合。BSP-Net 產生的網格具有緊湊性（低多邊形數量）、水密性，且可直接使用而無需後處理，這與需要昂貴等值面提取（如 Marching Cubes）的隱式方法形成對比。我們的網路以顯著更少的多邊形數量達到了可比或更優的重建品質。

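Paragraph function: Overview of the whole paper. It states BSP-Net's core mechanism (space partitioning) and its three main advantages (compact, watertight, no post-processing).

Logical role: The abstract advances in a four-part structure of problem, method, advantages, and results, with every sentence carrying a clear argumentative function. The contrast with implicit methods immediately establishes a differentiated position.

Argumentation technique / potential weaknesses: "Compact and watertight" is a strong selling point, since low polygon counts and watertightness are key requirements in practical applications such as games and 3D printing. However, the abstract does not mention how well the method handles complex topological structures such as thin walls and cavities.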

1. Introduction — 緒論

3D shape generation is a core task in computer graphics and vision. Current deep learning approaches predominantly use voxel grids, point clouds, or implicit functions as 3D representations. However, voxel grids are memory-intensive and resolution-limited, point clouds lack surface connectivity, and implicit methods require post-hoc iso-surface extraction that produces dense, unstructured meshes. The ideal output should be a compact polygon mesh directly usable in downstream applications.

三維形狀生成是電腦圖學與視覺的核心任務。當前深度學習方法主要使用體素網格、點雲或隱式函數作為三維表徵。然而，體素網格消耗大量記憶體且解析度有限，點雲缺乏表面連接性，而隱式方法則需要事後的等值面提取，產出稠密且無結構的網格。理想的輸出應該是一個可直接用於下游應用的緊湊多邊形網格。

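Paragraph function: Establishes the research field by listing the shortcomings of existing 3D representations and stating the need for an "ideal output".

Logical role: A classic gap analysis. By systematically criticizing the three mainstream representations (voxels, point clouds, implicit functions), it creates the need for BSP-Net's positioning as direct mesh generation.

Argumentation technique / potential weaknesses: Each of the three approaches is faulted along a different dimension, implying that an ideal solution must overcome all of these problems at once. However, existing direct mesh generation methods (such as AtlasNet and Pixel2Mesh) are not mentioned, which leaves the problem statement somewhat incomplete.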

We propose to leverage Binary Space Partitioning (BSP), a classical computational geometry technique that recursively divides space using hyperplanes into convex regions. BSP-Net learns to predict a set of hyperplanes and their grouping into convex components, which are then combined via Boolean union to form the final shape. This produces compact meshes with guaranteed watertightness by construction.

我們提議利用二元空間分割（BSP），這是一種經典的計算幾何技術，透過超平面遞迴地將空間劃分為凸面區域。BSP-Net 學習預測一組超平面及其分組方式，形成凸面組件，再透過布林聯集運算組合成最終形狀。這種構造方式天然保證了產出網格的緊湊性與水密性。

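Paragraph function: Proposes the solution by introducing BSP as a new paradigm for 3D generation.

Logical role: The turn from problem to solution. Combining a classical computational geometry technique (BSP) with deep learning is the paper's most important methodological innovation. Watertightness guaranteed by construction directly answers the post-processing problem of implicit methods.

Argumentation technique / potential weaknesses: Borrowing classical theory to give a deep learning method a theoretical guarantee (watertightness) is a highly persuasive strategy. However, BSP inherently composes shapes from convex components, so objects rich in curved surface detail (such as the human body) may require a large number of planes to approximate.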

2. Related Work — 相關工作

Implicit shape representations such as Occupancy Networks and DeepSDF learn continuous signed distance or occupancy functions, enabling arbitrary resolution output. However, they require Marching Cubes for mesh extraction, producing meshes with hundreds of thousands of faces. Template deformation methods like Pixel2Mesh deform a sphere mesh to match the target shape, but are limited to genus-0 topologies. Primitive-based methods approximate shapes with geometric primitives, but produce non-watertight results due to primitive overlaps.

隱式形狀表徵（如 Occupancy Networks 和 DeepSDF）學習連續的符號距離或佔據函數，能夠以任意解析度輸出。然而，它們需要 Marching Cubes 進行網格提取，產生的網格動輒包含數十萬個面。模板變形方法（如 Pixel2Mesh）將球體網格變形以匹配目標形狀，但受限於虧格為零的拓撲結構。基元組合方法以幾何基元近似形狀，但因基元重疊而產生非水密的結果。

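Paragraph function: Literature review that systematically analyzes three families of shape generation methods and their inherent limitations.

Logical role: Establishes a technical niche for BSP-Net: implicit methods produce overly dense output, deformation methods are topologically restricted, and primitive-based methods are not watertight. BSP-Net's design avoids all three problems.

Argumentation technique / potential weaknesses: Identifying one core flaw per family keeps the discussion concise and efficient. However, concurrent work that is also based on convex decomposition, such as CvxNet, receives little attention, which may understate the contemporaneous competition.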

3. Method — 方法

BSP-Net consists of three stages: (1) a BSP-plane generator that predicts N hyperplane parameters from a shape encoding, (2) a convex selector that groups planes into K convex components via a binary selection matrix, and (3) a shape assembler that combines convex components through Boolean union. The network is trained with a reconstruction loss comparing the predicted occupancy with ground truth occupancy. The entire pipeline is differentiable, enabling end-to-end training.

BSP-Net 由三個階段組成：(1) BSP 平面生成器，從形狀編碼預測 N 個超平面參數；(2) 凸面選擇器，透過二元選擇矩陣將平面分組為 K 個凸面組件；(3) 形狀組裝器，透過布林聯集合併凸面組件。網路以預測佔據值與真實佔據值的重建損失進行訓練。整條管線皆為可微分的，支援端對端訓練。

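Paragraph function: Overview of the method architecture, describing BSP-Net's complete pipeline as a three-stage process.

Logical role: Breaks a complex network architecture into three easily understood modules, each with clear inputs and outputs. End-to-end differentiability is a key technical requirement for deep learning methods.

Argumentation technique / potential weaknesses: The three-stage decomposition lets the reader build a mental model step by step. However, a binary selection matrix is inherently discrete, so making it differentiable is a technical challenge, and the authors need an additional relaxation strategy.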
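
The three-stage pipeline described in this section can be sketched for the hard (inference-time) case as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `assemble_occupancy`, the sign convention (a point counts as inside a half-space when `a*x + b*y + c*z + d <= 0`), and the hand-written planes standing in for the plane generator's output are all hypothetical choices made for the example.

```python
import numpy as np


def assemble_occupancy(points, planes, selection):
    """Hard inside/outside test for a union of convex components.

    points:    (M, 3) query points
    planes:    (N, 4) rows (a, b, c, d); a point is treated as inside a
               half-space when a*x + b*y + c*z + d <= 0 (assumed convention)
    selection: (N, K) binary matrix; selection[i, j] == 1 means plane i
               bounds convex j
    returns:   (M,) boolean occupancy of the assembled shape
    """
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (M, 4)
    signed = homog @ planes.T                                   # (M, N)
    inside_plane = signed <= 0.0                                # (M, N)
    sel = selection.astype(bool)                                # (N, K)
    # Intersection: a point lies in convex j if it is inside every plane
    # selected for j. Unselected planes are ignored via the OR with ~sel.
    inside_convex = np.all(inside_plane[:, :, None] | ~sel[None, :, :], axis=1)
    # Union: the shape contains a point if any convex contains it.
    return inside_convex.any(axis=1)


# Hand-written stand-in for stage (1): eight axis-aligned planes.
planes = np.array([
    [-1.0, 0.0, 0.0, 0.0],   # x >= 0
    [1.0, 0.0, 0.0, -1.0],   # x <= 1
    [-1.0, 0.0, 0.0, 2.0],   # x >= 2
    [1.0, 0.0, 0.0, -3.0],   # x <= 3
    [0.0, -1.0, 0.0, 0.0],   # y >= 0
    [0.0, 1.0, 0.0, -1.0],   # y <= 1
    [0.0, 0.0, -1.0, 0.0],   # z >= 0
    [0.0, 0.0, 1.0, -1.0],   # z <= 1
])

# Stand-in for stage (2): two convexes (two boxes) sharing the y and z planes.
selection = np.zeros((8, 2), dtype=int)
selection[[0, 1, 4, 5, 6, 7], 0] = 1   # box with x in [0, 1]
selection[[2, 3, 4, 5, 6, 7], 1] = 1   # box with x in [2, 3]

points = np.array([
    [0.5, 0.5, 0.5],   # inside the first box
    [2.5, 0.5, 0.5],   # inside the second box
    [1.5, 0.5, 0.5],   # in the gap between the boxes
    [0.5, 2.0, 0.5],   # above both boxes
])

# Stage (3): Boolean union of the convexes.
occupied = assemble_occupancy(points, planes, selection)
print(occupied.tolist())  # [True, True, False, False]
```

Note that a convex with no selected planes would contain every point under this formulation, so a real system would need to guard against empty columns in the selection matrix.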

3.1 Convex Decomposition — 凸面分解

Each convex component is defined as the intersection of half-spaces determined by a subset of hyperplanes. The inside/outside classification of a point with respect to a convex is computed as the product of its classifications with respect to each constituent plane. To enable gradient flow, we replace the hard indicator functions with sigmoid approximations during training. The final shape is the union of all convex components, computed as 1 minus the product of (1 minus each convex's occupancy).

每個凸面組件被定義為由一組超平面所決定的半空間的交集。一個點相對於某凸面的內外分類，是透過計算該點相對於各組成平面分類的乘積得出。為了使梯度能夠流通，我們在訓練期間以sigmoid 近似取代硬性指標函數。最終形狀為所有凸面組件的聯集，計算方式為1 減去各凸面佔據值的（1 減佔據值）之乘積。

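Paragraph function: Technical detail, explaining the mathematical definition of the convex components and the strategy for making them differentiable.

Logical role: This paragraph resolves the technical question left open by the previous one: how to propagate gradients through discrete Boolean operations (intersection and union). Sigmoid relaxation is a standard technique for making such operations differentiable.

Argumentation technique / potential weaknesses: Unifying the computation of intersection and union through products is mathematically concise and elegant. However, the sigmoid approximation must revert to hard decisions at inference, and this mismatch between training and inference may cause boundary artifacts.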
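
The relaxation described in this subsection can be illustrated numerically. The sketch below follows the product-based wording of the text (sigmoid per plane, product for intersection, one minus the product of complements for union). It is an assumption-laden illustration, not the published implementation, whose exact operators may differ. The name `soft_occupancy`, the `sharpness` parameter, and the sign convention (negative signed value means inside) are hypothetical. NumPy does not compute gradients, so actual training would express the same operations in an autodiff framework.

```python
import numpy as np


def soft_occupancy(points, planes, selection, sharpness=50.0):
    """Smooth occupancy in [0, 1] for a union of convex components.

    points: (M, 3), planes: (N, 4), selection: (N, K) with 0/1 entries.
    A point is treated as inside a half-space when a*x+b*y+c*z+d <= 0.
    """
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (M, 4)
    signed = homog @ planes.T                                   # (M, N)
    # Sigmoid replaces the hard indicator: ~1 inside, ~0 outside.
    z = np.clip(sharpness * signed, -60.0, 60.0)  # avoid overflow in exp
    soft_plane = 1.0 / (1.0 + np.exp(z))                        # (M, N)
    # Intersection as a product over planes. An unselected plane
    # contributes a factor of exactly 1 and so has no effect.
    factors = 1.0 - selection[None, :, :] * (1.0 - soft_plane[:, :, None])
    soft_convex = factors.prod(axis=1)                          # (M, K)
    # Union as 1 minus the product of (1 minus each convex's occupancy).
    return 1.0 - (1.0 - soft_convex).prod(axis=1)               # (M,)


# Two boxes, x in [0, 1] and x in [2, 3], sharing the y and z planes.
planes = np.array([
    [-1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, -1.0],   # x >= 0, x <= 1
    [-1.0, 0.0, 0.0, 2.0], [1.0, 0.0, 0.0, -3.0],   # x >= 2, x <= 3
    [0.0, -1.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0],   # y >= 0, y <= 1
    [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 1.0, -1.0],   # z >= 0, z <= 1
])
selection = np.zeros((8, 2))
selection[[0, 1, 4, 5, 6, 7], 0] = 1.0
selection[[2, 3, 4, 5, 6, 7], 1] = 1.0

points = np.array([
    [0.5, 0.5, 0.5],   # deep inside the first box
    [1.5, 0.5, 0.5],   # in the gap between the boxes
    [1.0, 0.5, 0.5],   # exactly on the x = 1 face of the first box
])
soft = soft_occupancy(points, planes, selection)
print(np.round(soft, 3))
```

Points well inside a convex receive values near 1, points well outside receive values near 0, and a point exactly on a face receives a value near 0.5. Thresholding at 0.5 recovers a hard decision, which is where the training and inference mismatch noted above can arise.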

4. Experiments — 實驗

We evaluate BSP-Net on ShapeNet across 13 categories for 3D shape autoencoding and single-image reconstruction. Compared to Occupancy Networks, BSP-Net achieves comparable IoU (Intersection over Union) while producing meshes with 50x to 100x fewer polygons. For example, on the chair category, BSP-Net generates meshes with ~600 polygons vs. ~100,000 from Marching Cubes. The output meshes are guaranteed watertight by construction and exhibit meaningful structural decomposition into semantic parts.

我們在 ShapeNet 的 13 個類別上評估 BSP-Net，涵蓋三維形狀自動編碼與單影像重建任務。相較於 Occupancy Networks，BSP-Net 達到可比的 IoU（交併比），同時產生的網格多邊形數量減少 50 至 100 倍。例如，在椅子類別上，BSP-Net 生成約 600 個多邊形的網格，而 Marching Cubes 則產出約 100,000 個。輸出網格在構造上保證水密性，且呈現出具語意意義的結構分解。

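Paragraph function: Provides the core quantitative evidence, demonstrating BSP-Net's efficiency and quality across multiple categories.

Logical role: This paragraph is the empirical backbone of the paper. The 50x to 100x polygon reduction is an overwhelming advantage that directly supports the abstract's core claim of compactness. Semantic decomposition, as a side finding, adds to the method's practical value.

Argumentation technique / potential weaknesses: Concrete numbers (600 vs. 100,000) make the argument more persuasive. However, IoU may mask differences in surface detail: a low-polygon mesh can be correct in overall shape while inevitably losing local detail, and no visual quality metric is offered here to supplement it.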
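
For readers unfamiliar with the metric, IoU over sampled occupancy values can be computed as in the sketch below. The evaluation protocol used in the paper (sampling scheme, resolution, thresholds) is not given in this section, so the function name `occupancy_iou`, the 0.5 threshold, and the empty-union convention are assumptions made for illustration.

```python
import numpy as np


def occupancy_iou(pred, target, threshold=0.5):
    """Intersection over Union of two occupancy arrays of the same shape.

    Values >= threshold are treated as occupied. If neither array has any
    occupied sample, 1.0 is returned (a convention chosen for this sketch).
    """
    pred_in = np.asarray(pred) >= threshold
    target_in = np.asarray(target) >= threshold
    union = np.logical_or(pred_in, target_in).sum()
    if union == 0:
        return 1.0
    intersection = np.logical_and(pred_in, target_in).sum()
    return float(intersection / union)


# Four sample points: one occupied in both, two occupied in only one.
pred = np.array([0.9, 0.8, 0.1, 0.2])
target = np.array([1.0, 0.0, 1.0, 0.0])
print(occupancy_iou(pred, target))  # 1 shared / 3 in the union = 0.333...
```

Because the metric counts occupied samples only, two meshes that agree on bulk volume but differ in thin surface detail can still score nearly the same, which is the limitation raised in the commentary above.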

Ablation studies show that increasing the number of planes N from 32 to 256 improves IoU but with diminishing returns. The convex decomposition learned by BSP-Net often aligns with semantically meaningful parts — for instance, chair legs, seat, and back are separated into distinct convex groups without any part-level supervision. Compared to SQ (superquadrics) and CvxNet, BSP-Net produces more faithful reconstructions with tighter surface fitting.

消融研究顯示，將平面數量 N 從 32 增加至 256 能改善 IoU，但效益遞減。BSP-Net 所學習的凸面分解往往與語意上有意義的部件對齊——例如，椅子的腿部、座面與靠背被分離為不同的凸面組別，完全無需部件級別的監督。相較於超二次曲面（SQ）和 CvxNet，BSP-Net 產生更忠實的重建結果，具有更緊密的表面擬合。

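Paragraph function: Ablation and comparison experiments that validate the hyperparameter choices and compare against similar methods.

Logical role: The ablation study provides a basis for choosing hyperparameters, while the semantic decomposition finding adds interpretability as an extra benefit. The comparison with CvxNet is especially important because both methods belong to the convex decomposition camp.

Argumentation technique / potential weaknesses: Unsupervised semantic decomposition is an appealing "unexpected finding", but the authors provide no quantitative measure of semantic alignment and support it only with qualitative observations. The observation of diminishing returns helps practical choices, but why the returns diminish is not explained.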

5. Conclusion — 結論

We have introduced BSP-Net, a novel approach that leverages Binary Space Partitioning for deep 3D shape generation. The method directly produces compact, watertight polygon meshes without requiring iso-surface extraction. By decomposing shapes into learned convex components via hyperplane arrangement, BSP-Net bridges the gap between classical computational geometry and modern deep learning, offering a principled and efficient solution for 3D shape generation.

本文提出 BSP-Net，一種利用二元空間分割進行深度三維形狀生成的新方法。該方法直接產生緊湊的水密多邊形網格，無需等值面提取。透過以超平面排列將形狀分解為學習式的凸面組件，BSP-Net 架構起經典計算幾何與現代深度學習之間的橋梁，為三維形狀生成提供了一個具原理性且高效的解決方案。

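Paragraph function: Summarizes the paper, restating BSP-Net's core advantages and its academic positioning.

Logical role: The conclusion uses a bridge metaphor to sum up the contribution: injecting classical theory (BSP) into deep learning gives neural networks geometric structural guarantees.

Argumentation technique / potential weaknesses: "A bridge between classical and modern" is effective academic rhetoric, but the conclusion does not discuss the method's limitations (such as difficulty with thin-walled structures and high-genus topology) or directions for future improvement.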

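Argument Structure Overview

Problem: Existing 3D generation methods output meshes that are overly dense or not watertight.

→ Claim: BSP convex decomposition can directly generate compact meshes.

→ Evidence: 50x to 100x fewer polygons with comparable IoU.

→ Rebuttal: Convex decomposition can learn semantic part structure.

→ Conclusion: An effective combination of classical geometry and deep learning.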

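The Authors' Core Claim (in One Sentence)

By bringing binary space partitioning into a deep learning framework, compact and watertight 3D polygon meshes can be generated directly, without costly iso-surface extraction as post-processing.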

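Strongest Point of the Argument

Quality guaranteed by construction: the mathematical properties of BSP inherently guarantee that the output mesh is watertight, a theoretical guarantee that other deep 3D generation methods cannot offer. The 50x to 100x polygon reduction is an overwhelming practical advantage, and unsupervised semantic decomposition, as a side finding, demonstrates the method's ability to learn structure.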

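Weakest Point of the Argument

Upper bound on expressiveness: representing shapes as a union of convex components inherently limits how precisely curved surface detail can be reproduced. For objects with complex topology (high genus) or fine surface texture, BSP decomposition may need a very large number of planes to reach sufficient accuracy. In addition, the mismatch between sigmoid relaxation during training and hard decisions at inference may degrade boundary quality.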