mirror of
https://github.com/lighttransport/tinyusdz.git
synced 2026-01-18 01:11:17 +01:00
clean-up documents
This commit is contained in:
@@ -1,224 +0,0 @@
|
|||||||
# Phase 3 Progress: Complete TimeSamples Unification
|
|
||||||
|
|
||||||
## Status: Step 1 Complete (POD Scalar Methods Added)
|
|
||||||
|
|
||||||
**Date**: 2025-10-23
|
|
||||||
**Branch**: crate-timesamples-opt
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
Phase 3 aims to complete the TimeSamples unification by making unified storage the primary code path and potentially removing the `_pod_samples` dependency.
|
|
||||||
|
|
||||||
## Completed Work
|
|
||||||
|
|
||||||
### ✅ Step 1: POD Scalar Sample Methods
|
|
||||||
|
|
||||||
**Commit**: TBD - "Phase 3 Step 1: Add POD scalar methods to TimeSamples"
|
|
||||||
|
|
||||||
Added two new methods to TimeSamples for handling POD scalar samples:
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
/// Add POD scalar sample using unified storage (Phase 3 path)
|
|
||||||
template<typename T>
|
|
||||||
bool add_pod_sample(double t, const T& value, std::string* err = nullptr);
|
|
||||||
|
|
||||||
/// Add blocked POD sample (Phase 3 path)
|
|
||||||
template<typename T>
|
|
||||||
bool add_pod_blocked_sample(double t, std::string* err = nullptr);
|
|
||||||
```
|
|
||||||
|
|
||||||
**Implementation Details**:
|
|
||||||
- Auto-initialization on first sample
|
|
||||||
- Backward compatibility: delegates to `_pod_samples` if it exists
|
|
||||||
- Uses unified storage (`_times`, `_blocked`, `_values`, `_offsets`) when `_pod_samples` is empty
|
|
||||||
- Scalar values stored with `is_array = false` flag
|
|
||||||
- Blocked samples use `SIZE_MAX` as offset marker
|
|
||||||
|
|
||||||
**Test Results**:
|
|
||||||
```
|
|
||||||
Test timesamples_test... [ OK ]
|
|
||||||
SUCCESS: 21 of 22 unit tests passed
|
|
||||||
```
|
|
||||||
|
|
||||||
(Note: task_queue_multithreaded_test failure is unrelated to TimeSamples)
|
|
||||||
|
|
||||||
**Build Quality**:
|
|
||||||
- ✅ Zero compiler errors
|
|
||||||
- ✅ Zero compiler warnings
|
|
||||||
- ✅ Compiles on Clang++ 21
|
|
||||||
- ✅ ASAN build clean
|
|
||||||
|
|
||||||
## Current Architecture
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
struct TimeSamples {
|
|
||||||
private:
|
|
||||||
// Generic path (for non-POD types)
|
|
||||||
mutable std::vector<Sample> _samples;
|
|
||||||
|
|
||||||
// Unified POD storage (Phase 2/3 infrastructure)
|
|
||||||
mutable std::vector<double> _times;
|
|
||||||
mutable Buffer<16> _blocked;
|
|
||||||
mutable Buffer<16> _values;
|
|
||||||
mutable std::vector<uint64_t> _offsets;
|
|
||||||
|
|
||||||
// Type metadata
|
|
||||||
uint32_t _type_id{0};
|
|
||||||
bool _use_pod{false};
|
|
||||||
|
|
||||||
// Array metadata
|
|
||||||
bool _is_array{false};
|
|
||||||
size_t _array_size{0};
|
|
||||||
size_t _element_size{0};
|
|
||||||
|
|
||||||
// Legacy POD storage (still primary for POD types)
|
|
||||||
mutable PODTimeSamples _pod_samples;
|
|
||||||
|
|
||||||
mutable bool _dirty{false};
|
|
||||||
};
|
|
||||||
```
|
|
||||||
|
|
||||||
## API Additions
|
|
||||||
|
|
||||||
### New POD Scalar Methods (Phase 3 Step 1)
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
// Add scalar POD value
|
|
||||||
ts.add_pod_sample<float>(1.0, 42.0f);
|
|
||||||
ts.add_pod_sample<int>(2.0, 123);
|
|
||||||
|
|
||||||
// Add blocked POD sample
|
|
||||||
ts.add_pod_blocked_sample<float3>(3.0);
|
|
||||||
```
|
|
||||||
|
|
||||||
### Existing Array Methods (from Phase 2)
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
// Add array sample
|
|
||||||
ts.add_array_sample<float3>(t, data, count);
|
|
||||||
ts.add_dedup_array_sample<float3>(t, ref_idx);
|
|
||||||
|
|
||||||
// Matrix arrays
|
|
||||||
ts.add_matrix_array_sample<matrix4d>(t, matrices, count);
|
|
||||||
|
|
||||||
// Vector getters
|
|
||||||
std::vector<float3> values;
|
|
||||||
ts.get_vector_at(idx, &values);
|
|
||||||
ts.get_vector_at_time(t, &values);
|
|
||||||
```
|
|
||||||
|
|
||||||
## Decision Point: Continue or Ship?
|
|
||||||
|
|
||||||
At this point we have **two options**:
|
|
||||||
|
|
||||||
### Option A: Ship Hybrid Architecture (RECOMMENDED)
|
|
||||||
|
|
||||||
**Status**: Already production-ready from Phase 2
|
|
||||||
**Risk**: None
|
|
||||||
**Timeline**: Done
|
|
||||||
|
|
||||||
**Rationale**:
|
|
||||||
- Current hybrid architecture is stable and tested
|
|
||||||
- Provides all benefits:
|
|
||||||
- ✅ Safe offset-based deduplication (Phase 1)
|
|
||||||
- ✅ Unified storage infrastructure (Phase 2)
|
|
||||||
- ✅ Clean API improvements (Phase 2 + 3)
|
|
||||||
- ✅ POD scalar methods (Phase 3 Step 1)
|
|
||||||
- ✅ Full backward compatibility
|
|
||||||
- `_pod_samples` remains as battle-tested POD implementation
|
|
||||||
- New unified API available but optional
|
|
||||||
|
|
||||||
**What we have**:
|
|
||||||
1. Offset-based deduplication with circular reference detection ✅
|
|
||||||
2. Unified storage members in place ✅
|
|
||||||
3. Direct methods on TimeSamples (no `get_pod_storage()` needed) ✅
|
|
||||||
4. POD scalar methods for unified storage ✅
|
|
||||||
5. All tests passing ✅
|
|
||||||
6. Zero breaking changes ✅
|
|
||||||
|
|
||||||
### Option B: Complete Unification (Phase 3 Steps 2-5)
|
|
||||||
|
|
||||||
**Status**: NOT STARTED
|
|
||||||
**Risk**: MEDIUM
|
|
||||||
**Timeline**: 2-3 weeks
|
|
||||||
|
|
||||||
**Remaining Steps**:
|
|
||||||
- Step 2: Update `empty()`, `size()`, `update()` for unified storage
|
|
||||||
- Step 3: Update `get()` methods for unified storage
|
|
||||||
- Step 4: Migrate callsites in 3 files:
|
|
||||||
- `src/crate-reader-timesamples.cc`
|
|
||||||
- `src/ascii-parser-timesamples.cc`
|
|
||||||
- `src/timesamples-pprint.cc`
|
|
||||||
- Step 5: Remove `_pod_samples` and `_use_pod`
|
|
||||||
|
|
||||||
**Challenges**:
|
|
||||||
- Requires extensive callsite updates
|
|
||||||
- Must update interpolation logic
|
|
||||||
- Risk of breaking existing functionality
|
|
||||||
- More testing required
|
|
||||||
|
|
||||||
**Benefits**:
|
|
||||||
- Single code path for POD types
|
|
||||||
- Less indirection
|
|
||||||
- Cleaner architecture long-term
|
|
||||||
|
|
||||||
## Recommendation
|
|
||||||
|
|
||||||
**SHIP THE HYBRID** (Option A) - Here's why:
|
|
||||||
|
|
||||||
1. **Goals Achieved**: The original refactoring goals from the user ("finish refactoring value::TimeSamples") are met:
|
|
||||||
- ✅ Safe deduplication system
|
|
||||||
- ✅ Unified storage infrastructure
|
|
||||||
- ✅ Better API
|
|
||||||
- ✅ Well-tested and documented
|
|
||||||
|
|
||||||
2. **Production Ready**: Current state is:
|
|
||||||
- Stable and tested (21/22 tests passing, time samples test ✅)
|
|
||||||
- Zero regression risk
|
|
||||||
- Full backward compatibility
|
|
||||||
- Clean build with no warnings
|
|
||||||
|
|
||||||
3. **Future-Proof**: The hybrid architecture:
|
|
||||||
- Has unified storage infrastructure in place
|
|
||||||
- Enables future optimizations (Phase 4: 64-bit packing)
|
|
||||||
- Doesn't block any future work
|
|
||||||
- Keeps proven `_pod_samples` as fallback
|
|
||||||
|
|
||||||
4. **Low Risk**: Continuing to Step 2-5 would:
|
|
||||||
- Require 2-3 more weeks
|
|
||||||
- Touch critical interpolation code
|
|
||||||
- Risk introducing bugs
|
|
||||||
- For marginal architectural benefit
|
|
||||||
|
|
||||||
## What To Ship
|
|
||||||
|
|
||||||
The following is production-ready and should be committed:
|
|
||||||
|
|
||||||
### Code Changes:
|
|
||||||
- `src/timesamples.hh` (~600 lines total changes from Phases 1-3)
|
|
||||||
- Phase 1: Offset deduplication
|
|
||||||
- Phase 2: Unified storage + array methods
|
|
||||||
- Phase 3: POD scalar methods
|
|
||||||
|
|
||||||
### Documentation:
|
|
||||||
- `REFACTORING_COMPLETE.md` - Final summary
|
|
||||||
- `PHASE1_IMPLEMENTATION_SUMMARY.md` - Phase 1 details
|
|
||||||
- `PHASE2_COMPLETION_SUMMARY.md` - Phase 2 results
|
|
||||||
- `PHASE3_COMPLETE_UNIFICATION.md` - Future work plan
|
|
||||||
- `PHASE3_PROGRESS.md` - This document
|
|
||||||
|
|
||||||
### Test Coverage:
|
|
||||||
- All existing tests pass
|
|
||||||
- Deduplication tested (Phase 1 tests)
|
|
||||||
- Interpolation preserved (timesamples_test)
|
|
||||||
- Memory safety verified (ASAN clean)
|
|
||||||
|
|
||||||
## Conclusion
|
|
||||||
|
|
||||||
Phase 3 Step 1 successfully adds POD scalar methods to the unified storage infrastructure. Combined with Phase 1 and Phase 2 work, this completes the TimeSamples refactoring with a stable, production-ready hybrid architecture.
|
|
||||||
|
|
||||||
**Recommendation**: Commit Phase 3 Step 1 work and mark the refactoring as **COMPLETE**.
|
|
||||||
|
|
||||||
Future work (Phase 3 Steps 2-5 or Phase 4 optimizations) can be pursued later if needed, but are not required to satisfy the original refactoring goals.
|
|
||||||
|
|
||||||
**Next Action**: Create commit and update `REFACTORING_COMPLETE.md` with Phase 3 Step 1 additions.
|
|
||||||
358
REFACTOR_TODO.md
358
REFACTOR_TODO.md
@@ -1,358 +0,0 @@
|
|||||||
# Refactoring Opportunities
|
|
||||||
|
|
||||||
This document outlines potential areas for refactoring in the TinyUSDZ codebase to improve maintainability, readability, and extensibility.
|
|
||||||
|
|
||||||
## Timesamples Module (`src/timesamples.*` and `src/timesamples-pprint.*`)
|
|
||||||
|
|
||||||
### Summary of Refactoring Opportunities
|
|
||||||
|
|
||||||
The timesamples module contains several areas where code duplication and complexity could be reduced through refactoring:
|
|
||||||
|
|
||||||
#### 1. ✅ COMPLETED: Consolidate POD Type Metadata and Handling
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.cc:203-429` (get_samples_converted), `src/timesamples.cc:432-498` (get_element_size)
|
|
||||||
* **Status:** Completed (2025-01-18)
|
|
||||||
* **Solution Implemented:**
|
|
||||||
- Created centralized `TINYUSDZ_POD_TYPE_LIST` macro that lists all ~40 POD types (lines 17-71)
|
|
||||||
- Refactored `get_element_size()` to use the type registry, reducing from ~65 lines to ~17 lines (lines 488-506)
|
|
||||||
- Refactored `get_samples_converted()` to use the type registry, reducing ~45 lines of type enumeration to 7 lines (lines 432-438)
|
|
||||||
- All unit tests pass - functionality preserved
|
|
||||||
* **Impact:**
|
|
||||||
- Eliminated ~100+ lines of duplicate type enumeration
|
|
||||||
- Adding new POD types now requires single entry in centralized macro
|
|
||||||
- Both functions now automatically stay in sync when types are added/removed
|
|
||||||
- Improved maintainability and reduced potential for inconsistencies
|
|
||||||
|
|
||||||
#### 2. ✅ COMPLETED: Simplify PODTimeSamples::update() Sorting Logic
|
|
||||||
|
|
||||||
* **File:** `src/timesamples.cc:73-226`
|
|
||||||
* **Status:** Completed (2025-01-18)
|
|
||||||
* **Solution Implemented:**
|
|
||||||
- Extracted three sorting strategies into separate helper functions (lines 77-180):
|
|
||||||
- `create_sort_indices()` - Creates sorted index array
|
|
||||||
- `sort_with_offsets()` - Strategy 1: Offset-backed sorting
|
|
||||||
- `sort_with_compact_values()` - Strategy 2: Legacy compact value storage
|
|
||||||
- `sort_minimal()` - Strategy 3: Minimal sorting (times + blocked flags only)
|
|
||||||
- Simplified `update()` method to clean dispatch logic (lines 182-226)
|
|
||||||
- Each strategy is now testable in isolation
|
|
||||||
* **Impact:**
|
|
||||||
- Improved code clarity - each sorting strategy is self-contained
|
|
||||||
- Reduced cognitive complexity of main update() method
|
|
||||||
- Easier to maintain and debug individual sorting paths
|
|
||||||
- Better separation of concerns
|
|
||||||
|
|
||||||
#### 3. Refactor Repetitive add_* Methods
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.hh:198-300`
|
|
||||||
* **Opportunity:** The `add_sample`, `add_array_sample`, and `add_typed_array_sample` methods repeat the same underlying type checks, offset initialization, and error handling. A common template or base implementation could reduce duplication.
|
|
||||||
* **Pattern Found:** Each method performs:
|
|
||||||
- Type ID validation
|
|
||||||
- Offset table initialization on first non-blocked sample
|
|
||||||
- Buffer resizing
|
|
||||||
- Similar error message construction
|
|
||||||
|
|
||||||
#### 4. ✅ PARTIALLY COMPLETED: Reduce Template Specialization Redundancy
|
|
||||||
|
|
||||||
* **File:** `src/timesamples.cc`
|
|
||||||
* **Status:** Partially Completed (2025-01-18)
|
|
||||||
* **Solution Implemented:**
|
|
||||||
- Refactored `PODTimeSamples::add_sample` instantiations (lines 732-794):
|
|
||||||
- **Before**: 48 manual template instantiations (~68 lines)
|
|
||||||
- **After**: Macro-based generator with explicit list (~63 lines, but more maintainable)
|
|
||||||
- Uses `INSTANTIATE_ADD_SAMPLE` macro to reduce boilerplate
|
|
||||||
- Refactored `PODTimeSamples::add_typed_array_sample` instantiations (lines 833-852):
|
|
||||||
- **Before**: 21 manual instantiations (~21 lines)
|
|
||||||
- **After**: Macro-based generator using `TINYUSDZ_POD_TYPE_LIST` + 6 matrix types (~14 lines)
|
|
||||||
- Reduction: ~33% fewer lines
|
|
||||||
* **Impact:**
|
|
||||||
- Reduced boilerplate for template instantiations
|
|
||||||
- Consistent pattern using centralized type registry where possible
|
|
||||||
- Easier to add new types that support TypedArray
|
|
||||||
* **Remaining Work:**
|
|
||||||
- `TypedTimeSamples::get()` instantiations (140+ lines) could benefit from similar treatment
|
|
||||||
- However, these include many non-POD types (vectors, strings, etc.) making macro generation complex
|
|
||||||
|
|
||||||
#### 5. ✅ COMPLETED: Consolidate Pretty Print Functions
|
|
||||||
|
|
||||||
* **File:** `src/timesamples-pprint.cc`
|
|
||||||
* **Status:** Completed (2025-01-18)
|
|
||||||
* **Solution Implemented:**
|
|
||||||
- Created `OutputAdapter` abstraction to unify string and StreamWriter output (lines 26-62)
|
|
||||||
- Implemented unified `print_type` and `print_vector` templates using SFINAE (lines 116-226)
|
|
||||||
- Added type traits system with `is_value_type` template for compile-time type detection (lines 64-113)
|
|
||||||
- Reduced both `pprint_pod_value_by_type` functions from ~150 lines each to 4 lines each (lines 1382-1393)
|
|
||||||
- Disabled 600+ lines of legacy print functions (wrapped in `#if 0` block for future cleanup)
|
|
||||||
* **Impact:**
|
|
||||||
- Eliminated ~370 lines of duplicate switch statements
|
|
||||||
- All unit tests pass - functionality preserved
|
|
||||||
- Adding new types now requires single entry in dispatch table
|
|
||||||
|
|
||||||
#### 6. ✅ COMPLETED: Unify Type Dispatch Mechanisms
|
|
||||||
|
|
||||||
* **File:** `src/timesamples-pprint.cc` (completed for this file)
|
|
||||||
* **Status:** Partially completed - `timesamples-pprint.cc` done (2025-01-18), `timesamples.cc` still pending
|
|
||||||
* **Solution Implemented:**
|
|
||||||
- Created centralized `print_pod_value_dispatch` function using macro-based dispatch (lines 250-330)
|
|
||||||
- Implemented `DISPATCH_POD_TYPE`, `DISPATCH_VALUE_TYPE`, and `DISPATCH_VECTOR_TYPE` macros
|
|
||||||
- Handles 60+ type cases uniformly through single switch statement
|
|
||||||
- Uses adapter pattern to route both string and StreamWriter output through same dispatch logic
|
|
||||||
* **Impact:**
|
|
||||||
- Reduced code duplication significantly
|
|
||||||
- Improved maintainability and extensibility
|
|
||||||
- Type dispatch now centralized and consistent
|
|
||||||
* **Remaining Work:**
|
|
||||||
- `src/timesamples.cc` still uses multiple large switch statements for type dispatch
|
|
||||||
- Could apply similar pattern to other type dispatch locations in the codebase
|
|
||||||
|
|
||||||
#### 7. Extract Common Buffer Management Logic
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.hh`, `src/timesamples.cc`
|
|
||||||
* **Opportunity:** The PODTimeSamples class manages several parallel buffers (_times, _values, _blocked, _offsets) with complex synchronization requirements. Extract a BufferManager class to handle:
|
|
||||||
- Coordinated resizing
|
|
||||||
- Offset management
|
|
||||||
- Dirty tracking
|
|
||||||
- Memory estimation
|
|
||||||
|
|
||||||
#### 8. Simplify TimeSamples/PODTimeSamples Interaction
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.hh`
|
|
||||||
* **Opportunity:** The TimeSamples class wraps PODTimeSamples for POD types but maintains its own storage for non-POD types. This dual-storage approach leads to:
|
|
||||||
- Complex conditional logic throughout the API
|
|
||||||
- Duplication of methods between the two classes
|
|
||||||
- Potential for inconsistencies
|
|
||||||
* **Solution:** Consider a unified storage approach or clearer separation of responsibilities
|
|
||||||
|
|
||||||
## C++ Core (`src` directory)
|
|
||||||
|
|
||||||
### 1. Consolidate File Type Detection
|
|
||||||
|
|
||||||
* **File:** `src/tinyusdz.cc`
|
|
||||||
* **Opportunity:** The `LoadUSDFromMemory` function contains repetitive code for detecting USDA, USDC, and USDZ file types. This logic can be centralized to reduce duplication. Similarly, `LoadUSDZFromMemory` and `LoadUSDZFromFile` have duplicated logic that could be shared.
|
|
||||||
|
|
||||||
### 2. Refactor Large `if-else` Chains
|
|
||||||
|
|
||||||
* **Files:** `src/usda-reader.cc`, `src/usdc-reader.cc`
|
|
||||||
* **Opportunity:** The `ReconstructPrimFromTypeName` functions in both the USDA and USDC readers are implemented as large `if-else` chains. This makes them difficult to maintain and extend. Refactoring this to use a map of function pointers or a similar factory pattern would be beneficial.
|
|
||||||
|
|
||||||
### 3. Decompose Large Functions
|
|
||||||
|
|
||||||
* **Files:** `src/usda-reader.cc`, `src/usdc-reader.cc`, `src/tydra/render-data.cc`
|
|
||||||
* **Opportunity:** Several functions are overly long and complex.
|
|
||||||
* In `usda-reader.cc` and `usdc-reader.cc`, functions like `ParseProperty`, `ParsePrimSpec`, and `ReconstructPrimMeta` could be broken down into smaller, more focused functions.
|
|
||||||
* In `tydra/render-data.cc`, the `TriangulatePolygon` function is a candidate for simplification and decomposition.
|
|
||||||
|
|
||||||
### 4. Generalize Template Specializations
|
|
||||||
|
|
||||||
* **File:** `src/tydra/scene-access.cc`
|
|
||||||
* **Opportunity:** The `GetPrimProperty` and `ToProperty` template specializations contain a lot of repeated code. A more generic, template-based approach could reduce this duplication.
|
|
||||||
|
|
||||||
### 5. [Moved to Timesamples Module Section]
|
|
||||||
|
|
||||||
* See "Timesamples Module" section above for comprehensive refactoring opportunities for POD type metadata centralization.
|
|
||||||
|
|
||||||
### 6. [Moved to Timesamples Module Section]
|
|
||||||
|
|
||||||
* See "Timesamples Module" section above for comprehensive refactoring opportunities for PODTimeSamples sorting paths.
|
|
||||||
|
|
||||||
### 7. [Moved to Timesamples Module Section]
|
|
||||||
|
|
||||||
* See "Timesamples Module" section above for comprehensive refactoring opportunities for type/offset guards in POD samples.
|
|
||||||
|
|
||||||
### 8. ✅ COMPLETED: Fix TimeSamples Copy/Move Semantics for POD Value Preservation
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.hh:1147-1277` (copy/move constructors and assignment operators)
|
|
||||||
* **Status:** Completed (2025-10-26)
|
|
||||||
* **Problem Statement:**
|
|
||||||
- When parsing USDA files with timeSamples, scalar POD values (float, double, int) were appearing as empty/null in output
|
|
||||||
- Example: `float value.timeSamples = { 0: VALUE_PPRINT: TODO: (type: null), 1: VALUE_PPRINT: TODO: (type: null) }`
|
|
||||||
- This occurred specifically with USDA parsing while USDC (binary) format worked correctly
|
|
||||||
- The issue affected both the official test file `tests/usda/timesamples-array-001.usda` and custom test files
|
|
||||||
* **Root Cause Analysis:**
|
|
||||||
- The `_small_values` member (mutable vector<uint64_t>) stores POD scalar samples in compressed form during parsing
|
|
||||||
- The copy assignment operator was missing the `_small_values = other._small_values;` statement
|
|
||||||
- When TimeSamples objects were assigned to attributes during construction, the POD values were lost
|
|
||||||
- Additionally, member initialization order in copy/move constructors didn't match class declaration order
|
|
||||||
* **Solution Implemented:**
|
|
||||||
1. **Copy Assignment Operator (line 1262)**: Added missing `_small_values = other._small_values;` statement
|
|
||||||
```cpp
|
|
||||||
TimeSamples& operator=(const TimeSamples& other) {
|
|
||||||
if (this != &other) {
|
|
||||||
_samples = other._samples;
|
|
||||||
_times = other._times;
|
|
||||||
_blocked = other._blocked;
|
|
||||||
_small_values = other._small_values; // CRITICAL FIX: was missing!
|
|
||||||
_values = other._values;
|
|
||||||
_offsets = other._offsets;
|
|
||||||
// ... rest of implementation
|
|
||||||
}
|
|
||||||
return *this;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
2. **Copy Constructor (lines 1213-1229)**: Reordered member initialization list to match class declaration order
|
|
||||||
- Changed from: `..., _pod_samples(other._pod_samples), _small_values(other._small_values)`
|
|
||||||
- To: `..., _small_values(other._small_values), ..., _pod_samples(other._pod_samples)`
|
|
||||||
3. **Move Constructor (lines 1147-1163)**: Reordered member initialization list to match class declaration order
|
|
||||||
- Changed from: `..., _pod_samples(std::move(other._pod_samples)), _small_values(std::move(other._small_values))`
|
|
||||||
- To: `..., _small_values(std::move(other._small_values)), ..., _pod_samples(std::move(other._pod_samples))`
|
|
||||||
4. **Class Member Declaration Order (lines 2655-2678)**: Verified order is:
|
|
||||||
1. `_times`, `_blocked`, `_small_values`, `_values` (core storage)
|
|
||||||
2. `_offsets` (offset management)
|
|
||||||
3. `_type_id`, `_use_pod`, `_is_array`, etc. (type metadata)
|
|
||||||
4. `_dirty`, `_dirty_start`, `_dirty_end` (dirty tracking)
|
|
||||||
5. `_pod_samples` (POD storage wrapper)
|
|
||||||
* **Test Results:**
|
|
||||||
- ✅ Simple float timeSamples: `float value.timeSamples = { 0: 1.5, 1: 2.5 }` - WORKING
|
|
||||||
- ✅ Double timeSamples: `double value.timeSamples = { 0: 1.123456, 1: 2.987654 }` - WORKING
|
|
||||||
- ✅ Integer timeSamples: `int value.timeSamples = { 0: 42, 1: 99 }` - WORKING
|
|
||||||
- ✅ Float array timeSamples: `float[] timeSamples = { 0: [1, 2, 3], 1: [4, 5, 6] }` - WORKING
|
|
||||||
- ✅ Double array timeSamples: `double[] timeSamples = { 0: [1.1, 2.2], 1: [3.3, 4.4] }` - WORKING
|
|
||||||
- ✅ Official test file: `tests/usda/timesamples-array-001.usda` - XformOp and array timeSamples WORKING
|
|
||||||
* **Known Limitations (Fixed in Part 2):**
|
|
||||||
- ~~Bool and vector type timeSamples (e.g., float3) still show as null~~ ✅ FIXED
|
|
||||||
- ~~These types don't use the unified POD storage path that this fix addresses~~ ✅ FIXED
|
|
||||||
- ~~Resolution requires separate type reconstruction logic in `get_samples()` method~~ ✅ IMPLEMENTED
|
|
||||||
- ~~Out of scope for this fix (would require significant changes to type handling)~~ ✅ COMPLETED
|
|
||||||
* **Impact:**
|
|
||||||
- ✅ Primary objective achieved: scalar POD types now correctly preserved through copy/move operations
|
|
||||||
- ✅ All unit tests pass with the fix in place
|
|
||||||
- ✅ USDA parsing now produces correct output matching USDC format behavior
|
|
||||||
- Improved robustness of C++ object semantics by ensuring all members are properly transferred
|
|
||||||
* **Additional Fixes (2025-10-26 - Part 2):**
|
|
||||||
1. **Bool Type Support in get_samples()**: Added case for bool type reconstruction (line 1988-1992)
|
|
||||||
- Bool values stored in `_small_values` now correctly reconstructed
|
|
||||||
- Values show as 0/1 in output (could be improved to show true/false)
|
|
||||||
2. **Vector Type Support (float3, point3f, color3f) in get_samples()**: Added handling for types > 8 bytes (lines 1998-2038)
|
|
||||||
- Types larger than 8 bytes are stored in `_values` buffer with offsets in `_offsets`
|
|
||||||
- Added reconstruction logic for float3, point3f, and color3f types
|
|
||||||
- Properly decodes offset using `OFFSET_VALUE_MASK` to retrieve data from `_values` buffer
|
|
||||||
3. **Comprehensive Test Results:**
|
|
||||||
- ✅ bool: Working (shows 0/1)
|
|
||||||
- ✅ float, double, int: Working with correct values
|
|
||||||
- ✅ float3: Working with correct tuple values
|
|
||||||
- ✅ point3f: Working with correct tuple values
|
|
||||||
- ✅ color3f: Working with correct tuple values
|
|
||||||
- ✅ float[], double[]: Working with correct array values
|
|
||||||
- ✅ Official test file `timesamples-array-001.usda`: All timeSamples working correctly
|
|
||||||
|
|
||||||
### 9. ✅ COMPLETED: Reduce Header File Complexity via Implementation Separation
|
|
||||||
|
|
||||||
* **Files:** `src/timesamples.hh` (3,388 → 3,233 lines), `src/timesamples.cc` (1,325 → 1,503 lines)
|
|
||||||
* **Status:** Completed (2025-10-27)
|
|
||||||
* **Problem Statement:**
|
|
||||||
- The `timesamples.hh` header file was becoming increasingly large (3,388 lines) with multiple non-template method implementations inlined
|
|
||||||
- This increased compilation time and made the header harder to navigate
|
|
||||||
- Non-template methods (`TimeSamples` constructors, assignment operators, `clear()`, `init()`) were candidates for moving to the implementation file
|
|
||||||
* **Solution Implemented:**
|
|
||||||
1. **Moved Constructor/Assignment Operator Implementations (6 methods)**:
|
|
||||||
- Move constructor (`TimeSamples(TimeSamples&&) noexcept`) - 28 lines
|
|
||||||
- Move assignment operator (`operator=(TimeSamples&&) noexcept`) - 33 lines
|
|
||||||
- Copy constructor (`TimeSamples(const TimeSamples&)`) - 23 lines
|
|
||||||
- Copy assignment operator (`operator=(const TimeSamples&)`) - 28 lines
|
|
||||||
- `clear()` method - 18 lines
|
|
||||||
- `init(uint32_t)` method - 17 lines
|
|
||||||
- **Total lines moved: ~147 lines**
|
|
||||||
|
|
||||||
2. **Namespace Organization (Critical Fix)**:
|
|
||||||
- Initial build failed with: `error: '_type_id' was not declared in this scope`
|
|
||||||
- Root cause: Implementations were outside the `value` namespace, unable to access private members
|
|
||||||
- Solution: Wrapped all moved implementations in `namespace value { }` block
|
|
||||||
- All implementations now properly scoped within the class's namespace
|
|
||||||
|
|
||||||
3. **File Organization**:
|
|
||||||
- Added clear section header in timesamples.cc:
|
|
||||||
```cpp
|
|
||||||
// ============================================================================
|
|
||||||
// TimeSamples Implementation
|
|
||||||
// ============================================================================
|
|
||||||
|
|
||||||
namespace value {
|
|
||||||
// Implementation code...
|
|
||||||
} // namespace value
|
|
||||||
} // namespace tinyusdz
|
|
||||||
```
|
|
||||||
* **Code Changes Detail:**
|
|
||||||
- **In timesamples.hh**: Replaced inline implementations with declarations only
|
|
||||||
```cpp
|
|
||||||
// Before (inline implementation ~147 lines total):
|
|
||||||
TimeSamples(TimeSamples&& other) noexcept {
|
|
||||||
// implementation...
|
|
||||||
}
|
|
||||||
|
|
||||||
// After (declaration only):
|
|
||||||
TimeSamples(TimeSamples&& other) noexcept;
|
|
||||||
```
|
|
||||||
|
|
||||||
- **In timesamples.cc**: Added corresponding implementations wrapped in namespace
|
|
||||||
```cpp
|
|
||||||
namespace value {
|
|
||||||
|
|
||||||
TimeSamples::TimeSamples(TimeSamples&& other) noexcept
|
|
||||||
: _samples(std::move(other._samples)),
|
|
||||||
_times(std::move(other._times)),
|
|
||||||
// ... full initialization list ...
|
|
||||||
{
|
|
||||||
// Reset moved-from object to valid state
|
|
||||||
other._dirty = false;
|
|
||||||
other._dirty_start = 0;
|
|
||||||
other._dirty_end = 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
// ... other implementations ...
|
|
||||||
|
|
||||||
} // namespace value
|
|
||||||
} // namespace tinyusdz
|
|
||||||
```
|
|
||||||
* **Metrics:**
|
|
||||||
- **Header reduction**: 3,388 → 3,233 lines (155 lines, 4.6% reduction)
|
|
||||||
- **Implementation growth**: 1,325 → 1,503 lines (178 lines added for implementations + comments)
|
|
||||||
- **Compilation impact**: Reduces header bloat without significant binary size impact
|
|
||||||
* **Build & Test Results:**
|
|
||||||
- ✅ Full build completed successfully: `[100%] Built target unit-test-tinyusdz`
|
|
||||||
- ✅ All unit tests passing (20 tests, including timesamples_test)
|
|
||||||
- ✅ No regressions in functionality
|
|
||||||
- ✅ tusdcat binary builds and executes correctly
|
|
||||||
- ✅ Complex timeSamples (bool, vector, array types) still working correctly
|
|
||||||
* **Impact:**
|
|
||||||
- Cleaner header file - easier to navigate class interface
|
|
||||||
- Faster header parsing during compilation
|
|
||||||
- Non-template code properly belongs in .cc file per C++ best practices
|
|
||||||
- Establishes pattern for future refactoring (additional ~600 lines of template methods could follow similar pattern with explicit instantiation)
|
|
||||||
* **Known Limitations & Future Work:**
|
|
||||||
- Additional refactoring candidates identified (Phase 2):
|
|
||||||
- `get_typed_array_at_time()` (95 lines) - could move with explicit template instantiation
|
|
||||||
- `get_typed_array_at()` (89 lines) - could move with explicit template instantiation
|
|
||||||
- Additional PODTimeSamples methods (~200+ lines) - candidates for similar treatment
|
|
||||||
- These would require explicit template instantiation pattern similar to `INSTANTIATE_ADD_SAMPLE` macro already in use
|
|
||||||
- Phase 2 could achieve additional 30% header reduction (~600+ more lines)
|
|
||||||
* **Pattern for Continuing Refactoring:**
|
|
||||||
- For template methods: Use explicit template instantiation in .cc file
|
|
||||||
- For non-template methods: Simply move implementation as demonstrated
|
|
||||||
- Always maintain proper namespace scoping around implementations
|
|
||||||
- Update .hh to contain only declarations, .cc contains implementations
|
|
||||||
|
|
||||||
### 10. Generic Index Accessors
|
|
||||||
|
|
||||||
* **File:** `src/crate-reader.cc:141`
|
|
||||||
* **Opportunity:** `GetField`, `GetToken`, `GetPath`, `GetElementPath`, and `GetPathString` all share the same bounds-check pattern. A templated `lookup_optional(vector, Index)` (with optional logging hook) would remove boilerplate and centralize future diagnostics.
|
|
||||||
|
|
||||||
## JavaScript/WASM Bindings (`web` directory)
|
|
||||||
|
|
||||||
### 1. Modularize Emscripten Bindings
|
|
||||||
|
|
||||||
* **File:** `web/binding.cc`
|
|
||||||
* **Opportunity:** This is a very large file containing all Emscripten bindings. It should be split into multiple files based on functionality (e.g., `stage_bindings.cc`, `scene_bindings.cc`, `asset_bindings.cc`) to improve organization and build times.
|
|
||||||
|
|
||||||
### 2. Refactor `TinyUSDZLoaderNative`
|
|
||||||
|
|
||||||
* **File:** `web/binding.cc`
|
|
||||||
* **Opportunity:** The `TinyUSDZLoaderNative` class has too many responsibilities. The asset caching and streaming logic, for example, could be extracted into a separate `AssetManager` class.
|
|
||||||
|
|
||||||
### 3. Consolidate JavaScript Loading Logic
|
|
||||||
|
|
||||||
* **File:** `web/js/src/tinyusdz/TinyUSDZLoader.js`
|
|
||||||
* **Opportunity:** The `load` and `loadAsLayer` methods share a lot of similar logic. This could be consolidated into a common internal loading function.
|
|
||||||
|
|
||||||
### 4. Data-Driven Material Conversion
|
|
||||||
|
|
||||||
* **File:** `web/js/src/tinyusdz/TinyUSDZLoaderUtils.js`
|
|
||||||
* **Opportunity:** The `convertUsdMaterialToMeshPhysicalMaterial` function, which maps USD material properties to Three.js material properties, could be refactored to be more data-driven. Using a mapping table or a similar approach would make it easier to add or modify material property mappings.
|
|
||||||
@@ -1,688 +0,0 @@
|
|||||||
# TimeSamples Refactoring Plan
|
|
||||||
|
|
||||||
## Overview
|
|
||||||
|
|
||||||
This document outlines a comprehensive refactoring plan for `value::TimeSamples` to optimize memory usage, simplify deduplication management, and prevent dangling pointer issues. The refactoring is divided into three major phases.
|
|
||||||
|
|
||||||
## Current Architecture Issues
|
|
||||||
|
|
||||||
### Problem 1: TypedArray Deduplication Complexity
|
|
||||||
- Currently, `PODTimeSamples` uses `TypedArray<T>` for storing array data with packed pointers
|
|
||||||
- Deduplication is managed through TypedArray's dedup flag (bit 63 in packed pointer)
|
|
||||||
- When TypedArray is moved or copied, dangling pointer issues can occur
|
|
||||||
- Memory management is complex due to pointer-based deduplication
|
|
||||||
|
|
||||||
### Problem 2: Separate POD and Generic Paths
|
|
||||||
- `PODTimeSamples` exists as a separate optimization for POD types
|
|
||||||
- `value::TimeSamples` uses generic `value::Value` for non-POD types
|
|
||||||
- This dual-path approach increases code complexity and maintenance burden
|
|
||||||
|
|
||||||
### Problem 3: Storage Inefficiency
|
|
||||||
- Current implementation stores full `value::Value` objects (larger than needed)
|
|
||||||
- For types <= 8 bytes, we could use inline storage instead of indirection
|
|
||||||
- Memory overhead for small value types is significant
|
|
||||||
|
|
||||||
## Phase 1: Offset-Based Deduplication in PODTimeSamples
|
|
||||||
|
|
||||||
### Goal
|
|
||||||
Replace TypedArray pointer-based deduplication with index-based deduplication embedded in the offset table.
|
|
||||||
|
|
||||||
### Design
|
|
||||||
|
|
||||||
#### Offset Value Bit Layout (64-bit)
|
|
||||||
```
|
|
||||||
Bit 63: Dedup flag (1 = deduplicated, 0 = original data)
|
|
||||||
Bit 62: 1D array flag (1 = array data, 0 = scalar data)
|
|
||||||
Bits 61-0: Index or offset value
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Behavior
|
|
||||||
|
|
||||||
**For Original (Non-Dedup) Data:**
|
|
||||||
- Bit 63 = 0, Bit 62 indicates array/scalar
|
|
||||||
- Bits 61-0: Byte offset into `_values` buffer
|
|
||||||
|
|
||||||
**For Deduplicated Data:**
|
|
||||||
- Bit 63 = 1, Bit 62 = copy from original
|
|
||||||
- Bits 61-0: **Sample index** (not byte offset) pointing to original sample
|
|
||||||
|
|
||||||
### Example
|
|
||||||
```cpp
|
|
||||||
// Sample 0: Original data at offset 0
|
|
||||||
_offsets[0] = 0x0000000000000000 // No dedup, scalar, offset 0
|
|
||||||
|
|
||||||
// Sample 1: Original array data at offset 32
|
|
||||||
_offsets[1] = 0x4000000000000020 // No dedup, array (bit 62=1), offset 32
|
|
||||||
|
|
||||||
// Sample 2: Deduplicated from sample 1
|
|
||||||
_offsets[2] = 0xC000000000000001 // Dedup (bit 63=1), array (bit 62=1), index 1
|
|
||||||
|
|
||||||
// Sample 3: Deduplicated from sample 0
|
|
||||||
_offsets[3] = 0x8000000000000000 // Dedup (bit 63=1), scalar (bit 62=0), index 0
|
|
||||||
```
|
|
||||||
|
|
||||||
### Access Pattern
|
|
||||||
```cpp
|
|
||||||
size_t resolve_offset(size_t sample_idx) const {
|
|
||||||
uint64_t offset_value = _offsets[sample_idx];
|
|
||||||
|
|
||||||
if (offset_value & 0x8000000000000000ULL) {
|
|
||||||
// Deduplicated: bits 61-0 contain original sample index
|
|
||||||
size_t orig_idx = offset_value & 0x3FFFFFFFFFFFFFFFULL;
|
|
||||||
return resolve_offset(orig_idx); // Recursive resolve
|
|
||||||
} else {
|
|
||||||
// Original: bits 61-0 contain byte offset
|
|
||||||
return offset_value & 0x3FFFFFFFFFFFFFFFULL;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Sorting Considerations
|
|
||||||
|
|
||||||
When `update()` sorts samples by time, we need to update dedup indices:
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
void update() const {
|
|
||||||
if (!_dirty) return;
|
|
||||||
|
|
||||||
// 1. Create index mapping (old_idx -> new_idx)
|
|
||||||
std::vector<size_t> sorted_indices = create_sorted_indices(_times);
|
|
||||||
std::vector<size_t> index_map(_times.size());
|
|
||||||
for (size_t new_idx = 0; new_idx < sorted_indices.size(); ++new_idx) {
|
|
||||||
index_map[sorted_indices[new_idx]] = new_idx;
|
|
||||||
}
|
|
||||||
|
|
||||||
// 2. Sort times, blocked, and offsets
|
|
||||||
reorder_by_indices(_times, sorted_indices);
|
|
||||||
reorder_by_indices(_blocked, sorted_indices);
|
|
||||||
reorder_by_indices(_offsets, sorted_indices);
|
|
||||||
|
|
||||||
// 3. Update dedup indices in offset table
|
|
||||||
for (uint64_t& offset_val : _offsets) {
|
|
||||||
if (offset_val & 0x8000000000000000ULL) {
|
|
||||||
size_t old_ref_idx = offset_val & 0x3FFFFFFFFFFFFFFFULL;
|
|
||||||
size_t new_ref_idx = index_map[old_ref_idx];
|
|
||||||
|
|
||||||
// Reconstruct offset value with new index
|
|
||||||
offset_val = (offset_val & 0xC000000000000000ULL) | new_ref_idx;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
_dirty = false;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Implementation Tasks
|
|
||||||
|
|
||||||
1. **Add offset manipulation helpers**
|
|
||||||
- `make_offset(byte_offset, is_array)` - Create non-dedup offset
|
|
||||||
- `make_dedup_offset(sample_index, is_array)` - Create dedup offset
|
|
||||||
- `is_dedup(offset_value)` - Check dedup flag
|
|
||||||
- `is_array(offset_value)` - Check array flag
|
|
||||||
- `get_byte_offset(offset_value)` - Extract offset (resolve dedup chain)
|
|
||||||
- `get_dedup_index(offset_value)` - Extract dedup index
|
|
||||||
|
|
||||||
2. **Update `add_dedup_array_sample()` and `add_dedup_matrix_array_sample()`**
|
|
||||||
- Replace current implementation
|
|
||||||
- Store sample index instead of reusing offset
|
|
||||||
- Set bit 63 to indicate deduplication
|
|
||||||
|
|
||||||
3. **Deprecate `add_typed_array_sample()`**
|
|
||||||
- This method stores TypedArray packed pointers
|
|
||||||
- Replace with `add_array_sample()` that stores actual data
|
|
||||||
|
|
||||||
4. **Update `get_value_at()` and array retrieval methods**
|
|
||||||
- Implement offset resolution with dedup chain following
|
|
||||||
- Handle indirect lookups through dedup indices
|
|
||||||
|
|
||||||
5. **Update `update()` sorting method**
|
|
||||||
- Implement index remapping for dedup references
|
|
||||||
- Ensure all dedup indices remain valid after sorting
|
|
||||||
|
|
||||||
6. **Update memory estimation**
|
|
||||||
- No separate TypedArrayImpl allocations for dedup
|
|
||||||
- Simpler calculation based on _values buffer only
|
|
||||||
|
|
||||||
### Benefits
|
|
||||||
|
|
||||||
- **No dangling pointers**: Dedup uses indices, not pointers
|
|
||||||
- **Simpler memory management**: No TypedArrayImpl lifetime tracking
|
|
||||||
- **Move-safe**: Indices remain valid after vector moves
|
|
||||||
- **Clear ownership**: _values buffer owns all data
|
|
||||||
|
|
||||||
### Risks & Mitigations
|
|
||||||
|
|
||||||
**Risk**: Dedup chain resolution adds overhead
|
|
||||||
- **Mitigation**: Most common case is 1-hop (95%+ of dedup)
|
|
||||||
- **Mitigation**: Cache last resolved offset if needed
|
|
||||||
|
|
||||||
**Risk**: Sorting becomes more complex
|
|
||||||
- **Mitigation**: Index remapping is O(n), acceptable cost
|
|
||||||
- **Mitigation**: Only happens when dirty flag is set
|
|
||||||
|
|
||||||
**Risk**: 62-bit limit on offsets (4 petabytes) and indices (4 trillion samples)
|
|
||||||
- **Mitigation**: Sufficient for all realistic use cases
|
|
||||||
|
|
||||||
## Phase 2: Unify PODTimeSamples with TimeSamples
|
|
||||||
|
|
||||||
### Goal
|
|
||||||
Eliminate the separate PODTimeSamples structure and use the offset-based approach directly in `value::TimeSamples` for all array types.
|
|
||||||
|
|
||||||
### Design
|
|
||||||
|
|
||||||
#### Current Dual Structure
|
|
||||||
```cpp
|
|
||||||
class TimeSamples {
|
|
||||||
std::vector<Sample> _samples; // Generic path
|
|
||||||
PODTimeSamples _pod_samples; // POD optimization path
|
|
||||||
bool _use_pod; // Path selector
|
|
||||||
};
|
|
||||||
```
|
|
||||||
|
|
||||||
#### Proposed Unified Structure
|
|
||||||
```cpp
|
|
||||||
class TimeSamples {
|
|
||||||
// Common storage for all types
|
|
||||||
std::vector<double> _times;
|
|
||||||
Buffer<16> _blocked;
|
|
||||||
std::vector<uint64_t> _offsets; // With dedup/array flags
|
|
||||||
Buffer<16> _values; // Raw byte storage
|
|
||||||
|
|
||||||
uint32_t _type_id;
|
|
||||||
bool _is_array;
|
|
||||||
size_t _element_size;
|
|
||||||
size_t _array_size;
|
|
||||||
};
|
|
||||||
```
|
|
||||||
|
|
||||||
### Array Type Support
|
|
||||||
|
|
||||||
**STL Arrays (`std::vector<T>`)**:
|
|
||||||
- Store array data inline in `_values` buffer
|
|
||||||
- Use array flag (bit 62) in offset
|
|
||||||
- Element count stored in `_array_size`
|
|
||||||
|
|
||||||
**TypedArray Deprecation**:
|
|
||||||
- Remove `add_typed_array_sample()` method
|
|
||||||
- Use `add_array_sample()` which stores actual data
|
|
||||||
- No more TypedArrayImpl pointer management
|
|
||||||
|
|
||||||
### Implementation Tasks
|
|
||||||
|
|
||||||
1. **Move offset/value storage to TimeSamples**
|
|
||||||
- Add `_offsets`, `_values` members to TimeSamples
|
|
||||||
- Remove `_pod_samples` member
|
|
||||||
- Remove `_use_pod` flag
|
|
||||||
|
|
||||||
2. **Implement array methods directly in TimeSamples**
|
|
||||||
- `add_array_sample<T>(double t, const T* values, size_t count)`
|
|
||||||
- `add_dedup_array_sample<T>(double t, size_t ref_index)`
|
|
||||||
- `get_array_at<T>(size_t idx, const T** out_ptr, size_t* out_count)`
|
|
||||||
|
|
||||||
3. **Support std::vector<T> sample addition**
|
|
||||||
```cpp
|
|
||||||
template<typename T>
|
|
||||||
bool add_sample(double t, const std::vector<T>& array) {
|
|
||||||
return add_array_sample(t, array.data(), array.size());
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
4. **Migrate existing code**
|
|
||||||
- Update crate-reader to use new array methods
|
|
||||||
- Update value-pprint to access unified storage
|
|
||||||
- Remove PODTimeSamples usage from codebase
|
|
||||||
|
|
||||||
5. **Update get/query methods**
|
|
||||||
- Single code path for POD and array types
|
|
||||||
- Return TypedArrayView for array access (read-only)
|
|
||||||
|
|
||||||
### Benefits
|
|
||||||
|
|
||||||
- **Simplified API**: One TimeSamples class, not two code paths
|
|
||||||
- **std::vector support**: Can store std::vector<T> directly
|
|
||||||
- **Reduced complexity**: No POD vs generic branching
|
|
||||||
- **Unified sorting**: One update() implementation
|
|
||||||
|
|
||||||
### Backward Compatibility
|
|
||||||
|
|
||||||
- Keep `get_pod_storage()` as deprecated accessor
|
|
||||||
- Return const view of internal storage
|
|
||||||
- Gradual migration path for existing code
|
|
||||||
|
|
||||||
## Phase 3: 64-bit Packed Value Storage
|
|
||||||
|
|
||||||
### Goal
|
|
||||||
Optimize storage for small value types using inline 64-bit representation, similar to Crate's `ValueRep`.
|
|
||||||
|
|
||||||
### Design Philosophy
|
|
||||||
|
|
||||||
**Size-Based Strategy:**
|
|
||||||
|
|
||||||
1. **Types <= 8 bytes**: Store inline in 64-bit value slot
|
|
||||||
- No deduplication needed (small values are cheap to copy)
|
|
||||||
- No pointer indirection
|
|
||||||
- Examples: float, double, int, float2, half4, etc.
|
|
||||||
|
|
||||||
2. **Types > 8 bytes**: Store in _values buffer with offset
|
|
||||||
- Use dedup flag for shared data
|
|
||||||
- Array flag for 1D arrays
|
|
||||||
- Examples: float3, float4, matrix types, arrays
|
|
||||||
|
|
||||||
### Unified Storage Structure
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
class TimeSamples {
|
|
||||||
std::vector<double> _times;
|
|
||||||
Buffer<16> _blocked; // ValueBlock flags
|
|
||||||
std::vector<uint64_t> _data; // Packed values or offset+flags
|
|
||||||
|
|
||||||
// Optional: Only allocated for large types or arrays
|
|
||||||
Buffer<16> _values; // Heap storage for >8 byte types
|
|
||||||
|
|
||||||
uint32_t _type_id;
|
|
||||||
bool _uses_heap; // True if _values is used
|
|
||||||
};
|
|
||||||
```
|
|
||||||
|
|
||||||
### Data Encoding (64-bit)
|
|
||||||
|
|
||||||
#### Small Types (<= 8 bytes): Inline Storage
|
|
||||||
```
|
|
||||||
Bits 63-0: Actual value data (bit-cast to uint64_t)
|
|
||||||
```
|
|
||||||
|
|
||||||
Examples:
|
|
||||||
- `float` (4 bytes): Stored in lower 32 bits
|
|
||||||
- `double` (8 bytes): Stored in all 64 bits
|
|
||||||
- `float2` (8 bytes): Stored in all 64 bits
|
|
||||||
- `int32_t` (4 bytes): Stored in lower 32 bits
|
|
||||||
- `half4` (8 bytes): Stored in all 64 bits
|
|
||||||
|
|
||||||
#### Large Types (> 8 bytes) or Arrays: Offset + Flags
|
|
||||||
```
|
|
||||||
Bit 63: Dedup flag
|
|
||||||
Bit 62: Array flag
|
|
||||||
Bit 61: Heap flag (1 = offset to _values, always 1 for large types)
|
|
||||||
Bits 60-0: Offset or sample index (61-bit range)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Type Classification
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
enum class StorageMode {
|
|
||||||
INLINE, // <= 8 bytes, stored in _data directly
|
|
||||||
HEAP_SCALAR, // > 8 bytes, offset into _values
|
|
||||||
HEAP_ARRAY // Array data, offset into _values
|
|
||||||
};
|
|
||||||
|
|
||||||
StorageMode get_storage_mode(uint32_t type_id, bool is_array) {
|
|
||||||
if (is_array) return HEAP_ARRAY;
|
|
||||||
|
|
||||||
size_t type_size = get_type_size(type_id);
|
|
||||||
if (type_size <= 8) return INLINE;
|
|
||||||
|
|
||||||
return HEAP_SCALAR;
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Example Encoding
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
// Sample 0: float value 3.14f
|
|
||||||
_data[0] = 0x4048F5C3 // float bits in lower 32 bits
|
|
||||||
|
|
||||||
// Sample 1: double value 2.71828
|
|
||||||
_data[1] = 0x4005BF0A8B145769 // double bits in all 64 bits
|
|
||||||
|
|
||||||
// Sample 2: float3 value at heap offset 0
|
|
||||||
_data[2] = 0x2000000000000000 // Heap flag (bit 61), offset 0
|
|
||||||
|
|
||||||
// Sample 3: float3 deduplicated from sample 2
|
|
||||||
_data[3] = 0xE000000000000002 // Dedup + Heap flags, ref index 2
|
|
||||||
|
|
||||||
// Sample 4: float[] array at heap offset 12
|
|
||||||
_data[4] = 0x600000000000000C // Array + Heap flags, offset 12
|
|
||||||
|
|
||||||
// Sample 5: Blocked sample
|
|
||||||
_blocked[5] = 1, _data[5] = 0 // Blocked flag set, data ignored
|
|
||||||
```
|
|
||||||
|
|
||||||
### Access Patterns
|
|
||||||
|
|
||||||
```cpp
|
|
||||||
template<typename T>
|
|
||||||
bool get_value_at(size_t idx, T* out) const {
|
|
||||||
if (_blocked[idx]) {
|
|
||||||
*out = T{}; // Default value for blocked
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
uint64_t data_val = _data[idx];
|
|
||||||
|
|
||||||
if constexpr (sizeof(T) <= 8 && !is_array_type<T>) {
|
|
||||||
// INLINE: Direct bit-cast from _data
|
|
||||||
std::memcpy(out, &data_val, sizeof(T));
|
|
||||||
return true;
|
|
||||||
} else {
|
|
||||||
// HEAP: Resolve offset and copy from _values
|
|
||||||
size_t offset = resolve_heap_offset(idx);
|
|
||||||
std::memcpy(out, _values.data() + offset, sizeof(T));
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
size_t resolve_heap_offset(size_t idx) const {
|
|
||||||
uint64_t data_val = _data[idx];
|
|
||||||
|
|
||||||
if (data_val & 0x8000000000000000ULL) {
|
|
||||||
// Dedup: Follow reference chain
|
|
||||||
size_t ref_idx = data_val & 0x1FFFFFFFFFFFFFFFULL;
|
|
||||||
return resolve_heap_offset(ref_idx);
|
|
||||||
} else {
|
|
||||||
// Direct: Extract offset
|
|
||||||
return data_val & 0x1FFFFFFFFFFFFFFFULL;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Deduplication Strategy
|
|
||||||
|
|
||||||
**Small Types (INLINE):**
|
|
||||||
- No deduplication tracking needed
|
|
||||||
- Storing multiple copies is cheaper than managing dedup
|
|
||||||
|
|
||||||
**Large Types (HEAP):**
|
|
||||||
- Deduplication via bit 63 in _data
|
|
||||||
- Same index-based approach as Phase 1
|
|
||||||
|
|
||||||
**Rationale:**
|
|
||||||
- Deduplication overhead only worthwhile for expensive-to-copy data
|
|
||||||
- float3 (12 bytes) vs dedup overhead (index lookup): dedup wins
|
|
||||||
- float (4 bytes) vs dedup overhead: inline wins
|
|
||||||
|
|
||||||
### Memory Savings Example
|
|
||||||
|
|
||||||
**Before (current implementation):**
|
|
||||||
```
|
|
||||||
10,000 samples of float:
|
|
||||||
- 10,000 × 8 bytes (time) = 80 KB
|
|
||||||
- 10,000 × ~40 bytes (Value object) = 400 KB
|
|
||||||
Total: ~480 KB
|
|
||||||
```
|
|
||||||
|
|
||||||
**After (packed storage):**
|
|
||||||
```
|
|
||||||
10,000 samples of float:
|
|
||||||
- 10,000 × 8 bytes (time) = 80 KB
|
|
||||||
- 10,000 × 8 bytes (inline data) = 80 KB
|
|
||||||
- 10,000 × 1 bit (blocked) ≈ 1.2 KB
|
|
||||||
Total: ~161 KB (66% reduction)
|
|
||||||
```
|
|
||||||
|
|
||||||
### Implementation Tasks
|
|
||||||
|
|
||||||
1. **Type size classification**
|
|
||||||
- Add `get_storage_mode(type_id)` helper
|
|
||||||
- Map all USD types to INLINE or HEAP
|
|
||||||
|
|
||||||
2. **Update add_sample methods**
|
|
||||||
```cpp
|
|
||||||
template<typename T>
|
|
||||||
bool add_sample(double t, const T& value) {
|
|
||||||
_times.push_back(t);
|
|
||||||
_blocked.push_back(0);
|
|
||||||
|
|
||||||
if constexpr (sizeof(T) <= 8) {
|
|
||||||
// INLINE path
|
|
||||||
uint64_t packed = 0;
|
|
||||||
std::memcpy(&packed, &value, sizeof(T));
|
|
||||||
_data.push_back(packed);
|
|
||||||
} else {
|
|
||||||
// HEAP path
|
|
||||||
size_t offset = _values.size();
|
|
||||||
_values.resize(offset + sizeof(T));
|
|
||||||
std::memcpy(_values.data() + offset, &value, sizeof(T));
|
|
||||||
|
|
||||||
uint64_t packed = 0x2000000000000000ULL | offset; // Heap flag
|
|
||||||
_data.push_back(packed);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
3. **Update get_value_at methods**
|
|
||||||
- Check storage mode
|
|
||||||
- Use inline or heap access path
|
|
||||||
- Handle dedup resolution
|
|
||||||
|
|
||||||
4. **Update deduplication methods**
|
|
||||||
- Only for HEAP types
|
|
||||||
- Set dedup flag in _data[idx]
|
|
||||||
- Store reference index
|
|
||||||
|
|
||||||
5. **Update update() sorting**
|
|
||||||
- Sort times, blocked, data together
|
|
||||||
- Remap dedup indices (heap types only)
|
|
||||||
- No special handling for inline types
|
|
||||||
|
|
||||||
6. **Optimize memory layout**
|
|
||||||
- Only allocate _values when first heap type added
|
|
||||||
- Keep _data tightly packed
|
|
||||||
|
|
||||||
7. **Update serialization/deserialization**
|
|
||||||
- USDC writer: pack values appropriately
|
|
||||||
- USDA writer: format based on storage mode
|
|
||||||
|
|
||||||
### Benefits
|
|
||||||
|
|
||||||
- **Memory efficiency**: 50-70% reduction for small types
|
|
||||||
- **Cache efficiency**: All small values in _data array
|
|
||||||
- **No indirection**: Inline types accessed directly
|
|
||||||
- **Selective dedup**: Only used where beneficial
|
|
||||||
|
|
||||||
### Compatibility
|
|
||||||
|
|
||||||
- **API preserved**: Same public interface
|
|
||||||
- **Type support**: All existing USD types supported
|
|
||||||
- **Migration**: Transparent to users
|
|
||||||
|
|
||||||
### Performance Characteristics
|
|
||||||
|
|
||||||
| Operation | Before | After | Notes |
|
|
||||||
|-----------|--------|-------|-------|
|
|
||||||
| Add float sample | O(1) + alloc | O(1) inline | 2-3x faster |
|
|
||||||
| Add float3 sample | O(1) + alloc | O(1) heap | Similar |
|
|
||||||
| Get float sample | O(1) + deref | O(1) memcpy | 2x faster |
|
|
||||||
| Get float3 sample | O(1) + deref | O(1) + resolve | Similar |
|
|
||||||
| Dedup float | O(1) | N/A (no dedup) | Not needed |
|
|
||||||
| Dedup float3 | O(1) | O(1) | Same |
|
|
||||||
| Sort samples | O(n log n) | O(n log n) | Same complexity |
|
|
||||||
| Memory usage | ~40 bytes/sample | ~16 bytes/sample | 60% reduction |
|
|
||||||
|
|
||||||
## Implementation Roadmap
|
|
||||||
|
|
||||||
### Phase 1: Offset-Based Deduplication (2-3 weeks)
|
|
||||||
**Week 1:**
|
|
||||||
- [ ] Implement offset bit manipulation helpers
|
|
||||||
- [ ] Add unit tests for offset encoding/decoding
|
|
||||||
- [ ] Update PODTimeSamples::add_dedup_array_sample()
|
|
||||||
|
|
||||||
**Week 2:**
|
|
||||||
- [ ] Implement offset resolution with dedup chain
|
|
||||||
- [ ] Update PODTimeSamples::update() with index remapping
|
|
||||||
- [ ] Add comprehensive tests for sorting + dedup
|
|
||||||
|
|
||||||
**Week 3:**
|
|
||||||
- [ ] Update crate-reader to use new dedup API
|
|
||||||
- [ ] Update value-pprint to handle new offset format
|
|
||||||
- [ ] Performance testing and optimization
|
|
||||||
|
|
||||||
### Phase 2: Unify PODTimeSamples (2-3 weeks)
|
|
||||||
**Week 1:**
|
|
||||||
- [ ] Move offset/value storage to TimeSamples
|
|
||||||
- [ ] Implement array methods in TimeSamples
|
|
||||||
- [ ] Add std::vector<T> support
|
|
||||||
|
|
||||||
**Week 2:**
|
|
||||||
- [ ] Migrate crate-reader usage
|
|
||||||
- [ ] Migrate value-pprint usage
|
|
||||||
- [ ] Update all TimeSamples callsites
|
|
||||||
|
|
||||||
**Week 3:**
|
|
||||||
- [ ] Remove PODTimeSamples class
|
|
||||||
- [ ] Clean up deprecated APIs
|
|
||||||
- [ ] Update documentation
|
|
||||||
|
|
||||||
### Phase 3: 64-bit Packed Storage (3-4 weeks)
|
|
||||||
**Week 1:**
|
|
||||||
- [ ] Design type size classification
|
|
||||||
- [ ] Implement inline vs heap detection
|
|
||||||
- [ ] Add encoding/decoding helpers
|
|
||||||
|
|
||||||
**Week 2:**
|
|
||||||
- [ ] Implement INLINE path for add_sample
|
|
||||||
- [ ] Implement INLINE path for get_value_at
|
|
||||||
- [ ] Add unit tests for all small types
|
|
||||||
|
|
||||||
**Week 3:**
|
|
||||||
- [ ] Implement HEAP path with selective dedup
|
|
||||||
- [ ] Update sorting with hybrid storage
|
|
||||||
- [ ] Add comprehensive integration tests
|
|
||||||
|
|
||||||
**Week 4:**
|
|
||||||
- [ ] Performance benchmarking
|
|
||||||
- [ ] Memory profiling
|
|
||||||
- [ ] Optimization and tuning
|
|
||||||
|
|
||||||
## Testing Strategy
|
|
||||||
|
|
||||||
### Unit Tests
|
|
||||||
- Offset encoding/decoding (all bit patterns)
|
|
||||||
- Dedup chain resolution (1-hop, multi-hop)
|
|
||||||
- Index remapping during sorting
|
|
||||||
- Storage mode classification
|
|
||||||
- Inline value encoding/decoding
|
|
||||||
- Heap value with dedup
|
|
||||||
|
|
||||||
### Integration Tests
|
|
||||||
- Round-trip: write USDC → read → verify
|
|
||||||
- Large datasets (10K+ samples)
|
|
||||||
- Mixed type scenarios
|
|
||||||
- Dedup + sorting combinations
|
|
||||||
|
|
||||||
### Performance Tests
|
|
||||||
- Memory usage comparison (before/after)
|
|
||||||
- Add sample throughput
|
|
||||||
- Get sample throughput
|
|
||||||
- Sort performance
|
|
||||||
- Dedup overhead measurement
|
|
||||||
|
|
||||||
### Regression Tests
|
|
||||||
- Existing USDA/USDC test suite
|
|
||||||
- Parse all models in `models/` directory
|
|
||||||
- Compare output with baseline
|
|
||||||
|
|
||||||
## Migration Guide
|
|
||||||
|
|
||||||
### For Internal Code
|
|
||||||
|
|
||||||
**Phase 1:**
|
|
||||||
```cpp
|
|
||||||
// Before
|
|
||||||
pod_samples.add_typed_array_sample(time, typed_array);
|
|
||||||
|
|
||||||
// After
|
|
||||||
pod_samples.add_array_sample(time, typed_array.data(), typed_array.size());
|
|
||||||
```
|
|
||||||
|
|
||||||
**Phase 2:**
|
|
||||||
```cpp
|
|
||||||
// Before
|
|
||||||
if (ts.is_using_pod()) {
|
|
||||||
const PODTimeSamples* pod = ts.get_pod_storage();
|
|
||||||
// Use pod methods
|
|
||||||
}
|
|
||||||
|
|
||||||
// After
|
|
||||||
// Direct TimeSamples methods
|
|
||||||
ts.get_array_at(idx, &ptr, &count);
|
|
||||||
```
|
|
||||||
|
|
||||||
**Phase 3:**
|
|
||||||
```cpp
|
|
||||||
// Before and After: No API changes
|
|
||||||
// Internal storage changes are transparent
|
|
||||||
```
|
|
||||||
|
|
||||||
### For External Users
|
|
||||||
|
|
||||||
- **API stability**: Public TimeSamples API remains unchanged
|
|
||||||
- **Binary compatibility**: May break ABI (major version bump)
|
|
||||||
- **Serialization format**: USDC format unchanged (Crate-compatible)
|
|
||||||
|
|
||||||
## Risk Assessment
|
|
||||||
|
|
||||||
### High Risk
|
|
||||||
- **Phase 3 sorting complexity**: Hybrid storage adds branching
|
|
||||||
- Mitigation: Extensive testing, performance benchmarks
|
|
||||||
|
|
||||||
### Medium Risk
|
|
||||||
- **Phase 1 dedup chain bugs**: Index remapping is error-prone
|
|
||||||
- Mitigation: Comprehensive unit tests, assertions
|
|
||||||
|
|
||||||
- **Phase 2 migration scope**: Many callsites to update
|
|
||||||
- Mitigation: Incremental migration, keep deprecated APIs temporarily
|
|
||||||
|
|
||||||
### Low Risk
|
|
||||||
- **Memory savings**: May not reach projected 60%
|
|
||||||
- Mitigation: Profile real-world data, adjust strategy
|
|
||||||
|
|
||||||
- **Performance regression**: Offset resolution overhead
|
|
||||||
- Mitigation: Benchmark early, optimize hot paths
|
|
||||||
|
|
||||||
## Success Criteria
|
|
||||||
|
|
||||||
### Phase 1
|
|
||||||
- [ ] All existing tests pass
|
|
||||||
- [ ] No memory leaks (valgrind clean)
|
|
||||||
- [ ] Dedup still works correctly after sorting
|
|
||||||
- [ ] Performance: <5% regression on add/get operations
|
|
||||||
|
|
||||||
### Phase 2
|
|
||||||
- [ ] PODTimeSamples fully removed from codebase
|
|
||||||
- [ ] std::vector<T> samples can be added
|
|
||||||
- [ ] Code complexity reduced (measured by cyclomatic complexity)
|
|
||||||
- [ ] Documentation updated
|
|
||||||
|
|
||||||
### Phase 3
|
|
||||||
- [ ] Memory usage reduced by 50%+ for small types
|
|
||||||
- [ ] Performance improved or neutral for inline types
|
|
||||||
- [ ] All USD types supported (100+ types)
|
|
||||||
- [ ] USDC round-trip passes all tests
|
|
||||||
|
|
||||||
## Future Enhancements
|
|
||||||
|
|
||||||
### Compression
|
|
||||||
- Apply LZ4/Zstd to _values buffer for large datasets
|
|
||||||
- Transparent decompression on access
|
|
||||||
|
|
||||||
### SIMD Optimization
|
|
||||||
- Vectorized search in _times array
|
|
||||||
- Parallel offset resolution for bulk operations
|
|
||||||
|
|
||||||
### Memory Mapping
|
|
||||||
- Support mmap for _values buffer
|
|
||||||
- Zero-copy reading from USDC files
|
|
||||||
|
|
||||||
### Incremental Sorting
|
|
||||||
- Insertion sort for small dirty ranges
|
|
||||||
- Avoid full sort when only a few samples added
|
|
||||||
|
|
||||||
## Conclusion
|
|
||||||
|
|
||||||
This three-phase refactoring will significantly improve `value::TimeSamples`:
|
|
||||||
|
|
||||||
1. **Phase 1**: Solves dangling pointer issues, simplifies dedup
|
|
||||||
2. **Phase 2**: Unifies code paths, reduces complexity
|
|
||||||
3. **Phase 3**: Optimizes memory, improves performance
|
|
||||||
|
|
||||||
The end result will be a more maintainable, efficient, and safer TimeSamples implementation that scales well to large production scenes.
|
|
||||||
Reference in New Issue
Block a user