Mesh-based Non-collision Constraints #771
🤖 Isaac Lab Review Bot — PR #771
Mesh-based Non-collision Constraints
Summary
This PR adds sphere-to-SDF mesh collision support as an alternative to AABB overlap detection. The architecture is clean — CollisionMode.MESH integrates well into the existing NoCollisionLossStrategy dispatch, and the greedy sphere decomposition + Warp SDF kernel approach is sound. The test suite is thorough (542 lines!) with good coverage of dispatch routing, gradient flow, and integration.
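For readers new to the sphere-to-SDF idea, here is a minimal sketch of a hinge-style penetration loss. It is not the PR's Warp kernel: an analytic sphere SDF stands in for the mesh SDF, and `sphere_sdf`, `penetration_loss`, and the `clearance` parameter are hypothetical names chosen for illustration.

```python
import math


def sphere_sdf(point, center, radius):
    """Signed distance from `point` to a sphere surface (negative inside)."""
    return math.dist(point, center) - radius


def penetration_loss(sphere_centers, sphere_radii, sdf, clearance=0.0):
    """Sum over child spheres of max(0, r + clearance - sdf(center)).

    A sphere contributes only when its surface comes closer to the parent
    than `clearance`; well-separated spheres contribute zero.
    """
    return sum(
        max(0.0, r + clearance - sdf(c))
        for c, r in zip(sphere_centers, sphere_radii)
    )


# Assumption: a unit sphere at the origin stands in for the parent mesh SDF.
def parent_sdf(p):
    return sphere_sdf(p, (0.0, 0.0, 0.0), 1.0)


child_centers = [(1.2, 0.0, 0.0), (3.0, 0.0, 0.0)]
child_radii = [0.5, 0.5]
loss = penetration_loss(child_centers, child_radii, parent_sdf)
```

Only the first child sphere overlaps the parent here, so it alone contributes to the loss.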
Findings
| # | Severity | Finding |
|---|---|---|
| 1 | 🟡 Warning | Validator creates fresh WarpMeshManager per call — cache never reused |
| 2 | 🟡 Warning | Scale applied post-transform in extract_trimesh_from_usd may be incorrect for nested prims |
| 3 | 🔵 Suggestion | object_base.py abstract method has no explicit return None |
| 4 | 🔵 Suggestion | Sentinel warning pattern on function object is not thread-safe |
| 5 | 🔵 Suggestion | Consider documenting the rotated-anchor limitation more prominently |
See inline comments for details.
Update (5e86ed0a): Reviewed incremental changes since 655ac73.
Addressed Findings
- ✅ Finding #1 resolved: `_get_cpu_mesh_manager()` now lazily creates and caches the `WarpMeshManager` on the instance, eliminating redundant allocations per validation call. Good fix.
- ✅ Finding #2 resolved: Removed erroneous `.T` transpose on `ComputeLocalToWorldTransform` in `usd_helpers.py`. USD returns row-major matrices; the transpose was producing incorrect vertex transforms for nested prims.
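To illustrate why the stray transpose matters, the sketch below applies a 4x4 transform in the row-vector convention (`p' = p @ M`, translation in the last row). The helper names are hypothetical, and plain lists stand in for the USD matrix type.

```python
def transform_points_row_vector(points, m):
    """Apply a 4x4 matrix to 3D points using p' = p @ M (row vectors)."""
    out = []
    for x, y, z in points:
        row = (x, y, z, 1.0)
        out.append(tuple(sum(row[k] * m[k][j] for k in range(4)) for j in range(3)))
    return out


def transpose(m):
    """Return the transpose of a square matrix given as nested lists."""
    return [list(col) for col in zip(*m)]


# Translation by (10, 0, 0): in the row-vector layout it sits in the last row.
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 0.0, 0.0, 1.0],
]

correct = transform_points_row_vector([(1.0, 2.0, 3.0)], M)
# Transposing first moves the translation out of the last row, so it is lost.
wrong = transform_points_row_vector([(1.0, 2.0, 3.0)], transpose(M))
```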
Other Changes
- Validation logic refactored (`_validate_placement`): Mesh mode now skips AABB validation entirely (`else` branch). Previously both checks ran in mesh mode; the AABB check was redundant and could produce false negatives for non-convex shapes. Clean improvement.
- Test suite trimmed: Removed `test_sphere_count_respects_budget`, `test_cache_key_differs_for_different_meshes`, `test_dispatch_falls_back_when_obj_is_none`, and `test_mesh_zero_loss_separated_cylinders`. These removals look intentional (simplified scope / covered elsewhere), though removing the cache-key differentiation test reduces regression coverage on the caching layer.
Remaining Observations
- Findings #3–#5 from original review remain unaddressed (low priority, suggestions only).
- The new `_get_cpu_mesh_manager` uses a `hasattr` check; this works fine, but an `Optional` attribute initialized in `__init__` would be more explicit.
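A sketch of the more explicit variant suggested above, with an `Optional` attribute declared in `__init__`. `WarpMeshManagerStub` and `PlacerSketch` are hypothetical stand-ins; the real constructor arguments are not shown in this thread.

```python
from typing import Optional


class WarpMeshManagerStub:
    """Hypothetical stand-in for the real manager; counts constructions."""

    instances = 0

    def __init__(self, device: str):
        WarpMeshManagerStub.instances += 1
        self.device = device


class PlacerSketch:
    def __init__(self) -> None:
        # Declared up front, so no hasattr() probing is needed later.
        self._cpu_mesh_manager: Optional[WarpMeshManagerStub] = None

    def _get_cpu_mesh_manager(self) -> WarpMeshManagerStub:
        # Lazily create on first use, then reuse the cached instance.
        if self._cpu_mesh_manager is None:
            self._cpu_mesh_manager = WarpMeshManagerStub(device="cpu")
        return self._cpu_mesh_manager


placer = PlacerSketch()
first = placer._get_cpu_mesh_manager()
second = placer._get_cpu_mesh_manager()
```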
Overall: Good incremental improvement. The two main warnings from the initial review are resolved. No new concerns.
Update (729d892c): Reviewed incremental changes since 5e86ed0a.
Changes in this push (2 files)
- `relation_loss_strategies.py`: Added `parent_pos_resolved.expand(batch_size, -1)` before the per-batch loop. This fixes a shape mismatch when `parent_pos_resolved` is not already batch-expanded (e.g., single parent broadcast to multiple children). Correct fix.
- `warp_mesh_manager.py`: Wrapped `getattr(obj, "scale", ...)` in `tuple()` for cache key computation. This prevents unhashable types (e.g., numpy arrays or torch tensors returned by `.scale`) from breaking the dict lookup. Necessary bugfix.
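The cache-key fix can be sketched as follows. A plain list stands in for an unhashable array type, `scale_cache_key` is a hypothetical helper, and converting each element to `float` is an added assumption (not stated in the PR) so that equal scales compare equal by value.

```python
def scale_cache_key(obj, default=(1.0, 1.0, 1.0)):
    """Build a hashable, value-comparable key from an object's scale."""
    scale = getattr(obj, "scale", default)
    # tuple() makes the key hashable; float() normalizes element types.
    return tuple(float(s) for s in scale)


class ScaledObject:
    scale = [1, 2, 3]  # a list is unhashable, like many array types


class UnscaledObject:
    pass


cache = {scale_cache_key(ScaledObject()): "mesh-A"}
hit = cache[scale_cache_key(ScaledObject())]
default_key = scale_cache_key(UnscaledObject())
```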
Assessment
Both changes are small, targeted bugfixes. No new concerns introduced. All previous suggestions (#3–#5) remain low-priority and unaddressed.
Greptile Summary
This PR adds mesh-based non-collision constraints as an alternative to AABB, using greedy sphere decomposition and differentiable Warp SDF queries (
Confidence Score: 4/5
The collision-detection logic is substantially correct: yaw transforms, broadphase, gradient flow, and AABB fallback all look sound. The greedy sphere decomposition contains a double-selection bug that can quietly waste sphere budget, but it does not crash and does not break the AABB safety net. The yaw-transform math in both the solver and the validator is consistently correct.
The isaaclab_arena/relations/warp_mesh_manager.py - the greedy set-cover loop in
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[ObjectPlacer.place] --> B[RelationSolver.solve]
B --> C{collision_mode?}
C -->|MESH| D[_prepare_mesh_collision_cache]
D --> E[WarpMeshAndSphereCache]
E --> F[get_collision_mesh per object]
E --> G[greedy_sphere_decomposition]
E --> H[get_warp_mesh: build BVH]
D --> I[_finalize_mesh_cache → MeshPairCache]
C -->|MESH| J[_compute_no_overlap_loss_mesh]
J --> K[AABB broadphase per env]
K --> L[multi_mesh_sdf kernel]
L --> M[_MultiMeshSDFFunction.backward]
C -->|BBOX / fallback| N[_compute_no_overlap_loss_aabb]
B --> O[Solved positions]
O --> P[_validate_placement]
P --> Q{collision_mode?}
Q -->|MESH| R[_validate_no_overlap skip_mesh_pairs=True]
R --> S[_validate_no_overlap_mesh]
S --> T{has mesh?}
T -->|Yes| U[_spheres_penetrate_mesh: mesh_sdf on CPU]
T -->|No| V[_pair_aabb_overlaps fallback]
Q -->|BBOX| W[_validate_no_overlap AABB only]
Reviews (21): Last reviewed commit: "unify AABB and mesh"
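The double-selection issue mentioned in the Greptile summary is not shown in this thread. As a generic illustration only, a greedy set-cover loop can avoid picking the same candidate twice by removing each chosen candidate from the pool. `greedy_cover` and its inputs are hypothetical and not taken from `warp_mesh_manager.py`.

```python
def greedy_cover(candidate_covers, budget):
    """Pick up to `budget` distinct candidates, greedily maximizing new coverage.

    candidate_covers[i] is the set of surface-sample indices that candidate
    sphere i covers.
    """
    uncovered = set().union(*candidate_covers)
    available = set(range(len(candidate_covers)))
    chosen = []
    while len(chosen) < budget and available:
        # Ties (including zero gain) resolve to the lowest available index.
        best = max(sorted(available), key=lambda i: len(candidate_covers[i] & uncovered))
        chosen.append(best)
        available.discard(best)  # a selected candidate can never be picked again
        uncovered -= candidate_covers[best]
    return chosen


covers = [{0, 1, 2}, {2, 3}, {0, 1}]
picked = greedy_cover(covers, budget=3)
```

Removing chosen indices from `available` guarantees every slot in the budget goes to a different sphere, even once all samples are covered.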
729d892 to db06239
ef73a02 to 7c46283
alexmillane left a comment
First partial review.
Looks good. I haven't got to the warp mesh based stuff.
    # Rotate child sphere centers by net_yaw = child_yaw - parent_yaw.
    net_yaw = child_yaw - parent_yaw
    if net_yaw != 0.0:
        cos_n = math.cos(net_yaw)
        sin_n = math.sin(net_yaw)
        rx = centers_local[:, 0] * cos_n - centers_local[:, 1] * sin_n
        ry = centers_local[:, 0] * sin_n + centers_local[:, 1] * cos_n
        centers_local = torch.stack([rx, ry, centers_local[:, 2]], dim=-1)

    batch_size = child_pos.shape[0]
    parent_pos_resolved = parent_pos_resolved.expand(batch_size, -1)
    total_loss = torch.zeros(batch_size, device=device, dtype=child_pos.dtype)

    for b in range(batch_size):
        offset = child_pos[b] - parent_pos_resolved[b]
        # Rotate offset into the parent's local frame.
        if parent_yaw != 0.0:
            cos_p = math.cos(-parent_yaw)
            sin_p = math.sin(-parent_yaw)
            ox = offset[0] * cos_p - offset[1] * sin_p
            oy = offset[0] * sin_p + offset[1] * cos_p
            offset = torch.stack([ox, oy, offset[2]])
        centers_in_parent = centers_local + offset
Two suggestions:
- Part of this code transforms vectors from one frame to another, under a transformation that happens to have only a yaw component. I'm wondering if we can use more general tools for this. For example, can we use `isaaclab.utils.math`, for example `quat_apply`, for applying rotations, rather than re-coding the rotations here?
- It is useful if you use structured notation. For example, `q_B_A` represents the rotation from the frame `A` to the frame `B`.
I just wrote a good example of doing this type of thing here
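A rough sketch of what the two suggestions could look like together. To stay self-contained it defines a local pure-Python `quat_apply` rather than importing the one from `isaaclab.utils.math`; the `(w, x, y, z)` ordering and the frame names `W` (world), `P` (parent), and `C` (child) are assumptions made for this example.

```python
import math


def quat_from_yaw(yaw):
    """Unit quaternion (w, x, y, z) for a rotation of `yaw` radians about +Z."""
    return (math.cos(yaw / 2.0), 0.0, 0.0, math.sin(yaw / 2.0))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def quat_apply(q, v):
    """Rotate v by unit quaternion q: v' = v + w*t + q_xyz x t, t = 2*(q_xyz x v)."""
    w, xyz = q[0], q[1:]
    t = tuple(2.0 * c for c in _cross(xyz, v))
    u = _cross(xyz, t)
    return tuple(v[i] + w * t[i] + u[i] for i in range(3))


# q_B_A rotates vectors expressed in frame A into frame B.
parent_yaw, child_yaw = 0.3, 1.1
q_P_W = quat_from_yaw(-parent_yaw)             # world -> parent
q_P_C = quat_from_yaw(child_yaw - parent_yaw)  # child -> parent (yaw only)

center_C = (0.2, 0.0, 0.1)  # sphere center expressed in the child frame
offset_W = (1.0, 0.5, 0.0)  # child origin minus parent origin, in world frame

# Sphere center expressed in the parent frame.
center_P = tuple(
    a + b for a, b in zip(quat_apply(q_P_C, center_C), quat_apply(q_P_W, offset_W))
)
```

With the frame suffixes in the names, each line reads as "rotate a `C` (or `W`) vector into `P`", which is the same computation as the hand-written cos/sin code in the diff.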
    n_candidates = max(num_spheres, n_candidates)
    n_surface = max(n_candidates, n_surface)

    rng = np.random.default_rng(seed)
This might destroy the seeding we do elsewhere.
I'm wondering if we should avoid setting the seed anywhere in our codebase, except for the one place we set it in Isaac Lab.
@xyao-nv @peterd-NV What do you think? Should we outlaw setting seeds in our code?
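One data point for this discussion: per NumPy's documentation, `np.random.default_rng(seed)` builds a separate `Generator` with its own state and does not reseed the legacy global `np.random` functions. The sketch below shows the same local-versus-global distinction using only the standard library, so it runs without NumPy; `sample_candidates` is a hypothetical helper.

```python
import random


def sample_candidates(n, seed):
    """Draw n values from a local generator; global random state is untouched."""
    rng = random.Random(seed)  # local generator, analogous to default_rng(seed)
    return [rng.random() for _ in range(n)]


# Reference: three draws from the globally seeded stream.
random.seed(1234)
expected = [random.random() for _ in range(3)]

# Same global seed, but with a locally seeded helper called mid-stream.
random.seed(1234)
first = random.random()
_ = sample_candidates(5, seed=0)
rest = [random.random() for _ in range(2)]
observed = [first] + rest
```

The global stream is identical in both runs. Whether a fixed local seed is desirable for reproducibility is a separate policy question.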
af4e742 to bc78db6
alexmillane left a comment
Another partial review
e844ccc to e307fcd
                parent_world_bbox=anchor_world_bbox,
                child_obj=child,
                parent_obj=anchor,
                parent_pos=None,
            )
            if debug:
                print(f" [NoOverlap] {child.name} vs {anchor.name}: loss={loss.mean().item():.6f}")
            total_loss = total_loss + loss

        # Against other non-anchors (unique pairs, both directions)
        for j in range(i + 1, len(non_anchor_objects)):
Null-dereference on `_mesh_manager` when `skip_mesh_pairs=True`
`_compute_no_overlap_loss_aabb(skip_mesh_pairs=True)` is only reached when `collision_mode == MESH`, and it calls `self._mesh_manager.get_collision_mesh(child)` directly. The custom `__deepcopy__` explicitly sets `_mesh_manager = None`. If the copied solver ever enters `_compute_total_loss` without first calling `solve()` (the only method that calls `_prepare_mesh_collision_cache`), this raises `AttributeError: 'NoneType' object has no attribute 'get_collision_mesh'`. The method should assert `self._mesh_manager is not None` or return early before dereferencing it when `skip_mesh_pairs=True`.
e307fcd to 2b8adcd
        source,
        source_mesh,
        source_pos,
        target,
        target_mesh,
        target_pos,
        mesh_manager,
        tolerance,
        orientations,
    ) -> bool:
        """True if source's spheres penetrate target's mesh."""
AssertionError crash for roll/pitch anchor with no mesh in MESH mode
When MESH mode is active and an anchor object has (a) no collision mesh and (b) a non-pure-Z rotation (roll or pitch), _pair_aabb_overlaps is reached via the AABB fallback in _validate_no_overlap_mesh. The assert on qx/qy then propagates as an unhandled AssertionError, crashing the entire place() call instead of returning False or raising a descriptive ValueError.
The solver-side guard in _prepare_mesh_collision_cache only fires for anchor pairs that have a mesh; anchors without meshes silently bypass it, so this is the only gate. Replacing the assert with an explicit ValueError (or a return True for conservative rejection) would make the failure mode predictable.
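A sketch of the explicit-error variant suggested above. `require_yaw_only` is a hypothetical helper, and the `(w, x, y, z)` quaternion ordering and the tolerance are assumptions; the actual check inside `_pair_aabb_overlaps` is not shown in this thread.

```python
import math


def require_yaw_only(quat_wxyz, name, tol=1e-6):
    """Raise a descriptive ValueError if the rotation has roll or pitch.

    A pure rotation about Z has zero x and y quaternion components.
    """
    w, x, y, z = quat_wxyz
    if abs(x) > tol or abs(y) > tol:
        raise ValueError(
            f"Object '{name}' has a non-yaw rotation (qx={x:.4g}, qy={y:.4g}); "
            "the AABB fallback only supports rotations about Z."
        )


# Yaw-only rotation passes silently.
require_yaw_only((math.cos(0.4), 0.0, 0.0, math.sin(0.4)), "table")

# A roll rotation is reported with a clear message instead of a bare assert.
try:
    require_yaw_only((math.cos(0.4), math.sin(0.4), 0.0, 0.0), "table")
    message = None
except ValueError as err:
    message = str(err)
```

Unlike `assert`, this check also survives running Python with `-O`, which strips assertions.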
                continue
            if (
                skip_mesh_pairs
                and self._mesh_manager.get_collision_mesh(child) is not None
                and self._mesh_manager.get_collision_mesh(anchor) is not None
            ):
                continue
            anchor_world_bbox = anchor.get_world_bounding_box().to(device)
            loss = self._no_collision_strategy.compute_loss(
                clearance_m=self.params.clearance_m,
                child_pos=child_pos,
                child_bbox=child_bbox,
                parent_world_bbox=anchor_world_bbox,
                child_obj=child,
                parent_obj=anchor,
                parent_pos=None,
            )
            if debug:
                print(f" [NoOverlap] {child.name} vs {anchor.name}: loss={loss.mean().item():.6f}")
            total_loss = total_loss + loss

        # Against other non-anchors (unique pairs, both directions)
        for j in range(i + 1, len(non_anchor_objects)):
            other = non_anchor_objects[j]
            if (id(child), id(other)) in on_pairs:
                continue
            if (
                skip_mesh_pairs
                and self._mesh_manager.get_collision_mesh(child) is not None
                and self._mesh_manager.get_collision_mesh(other) is not None
            ):
                continue
_mesh_manager null-dereference when skip_mesh_pairs=True after __deepcopy__
_compute_no_overlap_loss_aabb accesses self._mesh_manager.get_collision_mesh(child) unconditionally when skip_mesh_pairs=True (lines 207 and 231). After __deepcopy__, the custom implementation explicitly sets _mesh_manager = None. Any path that calls _compute_total_loss on a copied solver without first invoking solve() will raise AttributeError: 'NoneType' object has no attribute 'get_collision_mesh'.
A short-circuit guard — if self._mesh_manager is None: skip_mesh_pairs = False or an early-return of zeros — would make post-copy solver state safe.
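A minimal sketch of such a guard, assuming the mesh-pair check is factored into a helper. `SolverSketch`, `StubMeshManager`, and `_pair_handled_by_mesh` are hypothetical; only `_mesh_manager`, `get_collision_mesh`, and the `__deepcopy__` behavior come from the comment above.

```python
import copy


class StubMeshManager:
    """Hypothetical stand-in: reports a collision mesh for every object."""

    def get_collision_mesh(self, obj):
        return object()


class SolverSketch:
    def __init__(self, mesh_manager=None):
        self._mesh_manager = mesh_manager

    def __deepcopy__(self, memo):
        # Mirrors the described behavior: the copy starts without a manager.
        clone = type(self)()
        clone._mesh_manager = None
        return clone

    def _pair_handled_by_mesh(self, a, b, skip_mesh_pairs):
        # Guard: with no manager, nothing is skipped and the AABB path runs.
        if not skip_mesh_pairs or self._mesh_manager is None:
            return False
        return (
            self._mesh_manager.get_collision_mesh(a) is not None
            and self._mesh_manager.get_collision_mesh(b) is not None
        )


solver = SolverSketch(StubMeshManager())
clone = copy.deepcopy(solver)
original_skips = solver._pair_handled_by_mesh("child", "anchor", True)
clone_skips = clone._pair_handled_by_mesh("child", "anchor", True)
```

Falling back to the AABB loss on a copied solver is conservative: the pair is still penalized rather than silently ignored.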
ca965ca to 923bc46
Signed-off-by: zhx06 <zihaox@nvidia.com>
923bc46 to 0e57d62
Summary
Add mesh-based non-collision constraints via sphere-to-SDF, unified with the existing AABB path
Detailed description
- `CollisionMode.MESH` as an alternative to AABB for no-overlap constraints, using greedy sphere decomposition + differentiable Warp SDF queries against actual collision geometry.
- `MeshPairCache` dataclass and `MeshPairEntry` NamedTuple give the mesh path the same collect-then-batch structure as the AABB vectorized path.
- `placer_params` is the primary configuration source; CLI `--collision_mode mesh` is a fallback.
Core files
- `relations/warp_sdf_kernels.py`: differentiable SDF queries on Warp meshes
- `relations/warp_mesh_manager.py`: sphere decomposition and mesh caching
- `relations/relation_solver.py`: vectorized mesh collision loss during optimization
- `relations/mesh_pair_cache.py`: typed dataclass for precomputed per-pair collision data
- `relations/object_placer.py`: mesh collision validation at placement time
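The configuration precedence described above (`placer_params` first, CLI flag as fallback) could be resolved along these lines. This is a sketch under assumptions: the enum string values, the dict-shaped `placer_params`, the `BBOX` default, and `resolve_collision_mode` itself are hypothetical.

```python
from enum import Enum
from typing import Optional


class CollisionMode(Enum):
    BBOX = "bbox"  # string values are assumed for illustration
    MESH = "mesh"


def resolve_collision_mode(placer_params: dict, cli_value: Optional[str]) -> CollisionMode:
    """Prefer placer_params; fall back to the CLI value; default to BBOX."""
    configured = placer_params.get("collision_mode")
    if configured is not None:
        return CollisionMode(configured)
    if cli_value is not None:
        return CollisionMode(cli_value)
    return CollisionMode.BBOX


from_params = resolve_collision_mode({"collision_mode": "bbox"}, cli_value="mesh")
from_cli = resolve_collision_mode({}, cli_value="mesh")
from_default = resolve_collision_mode({}, cli_value=None)
```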