mirror of
https://github.com/lighttransport/tinyusdz.git
synced 2026-01-18 01:11:17 +01:00
This commit implements Phase 1 of header complexity reduction for timesamples.hh: **Changes:** - Moved 6 non-template methods from timesamples.hh to timesamples.cc: - Move constructor (TimeSamples&&) - Move assignment operator - Copy constructor - Copy assignment operator - clear() method - init() method - Total: 147 lines moved - All implementations properly wrapped in namespace value blocks - Header reduced from 3,388 to 3,233 lines (155 lines, 4.6% reduction) - Implementation file increased from 1,325 to 1,503 lines (+178 lines) **Benefits:** - Cleaner header file - easier to navigate class interface - Faster header parsing during compilation - Non-template code properly belongs in .cc per C++ best practices - Establishes pattern for Phase 2 (template methods with explicit instantiation) **Testing:** - All unit tests pass (20 tests including timesamples_test) - No regressions in functionality - tusdcat builds and executes correctly with all timeSamples types **Future Work Identified:** - Phase 2: Move large template methods (~600 lines) using explicit instantiation - Could achieve additional 30% header reduction 🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com>
20 KiB
20 KiB
Refactoring Opportunities
This document outlines potential areas for refactoring in the TinyUSDZ codebase to improve maintainability, readability, and extensibility.
Timesamples Module (src/timesamples.* and src/timesamples-pprint.*)
Summary of Refactoring Opportunities
The timesamples module contains several areas where code duplication and complexity could be reduced through refactoring:
1. ✅ COMPLETED: Consolidate POD Type Metadata and Handling
- Files:
src/timesamples.cc:203-429(get_samples_converted),src/timesamples.cc:432-498(get_element_size) - Status: Completed (2025-01-18)
- Solution Implemented:
- Created centralized
TINYUSDZ_POD_TYPE_LISTmacro that lists all ~40 POD types (lines 17-71) - Refactored
get_element_size()to use the type registry, reducing from ~65 lines to ~17 lines (lines 488-506) - Refactored
get_samples_converted()to use the type registry, reducing ~45 lines of type enumeration to 7 lines (lines 432-438) - All unit tests pass - functionality preserved
- Created centralized
- Impact:
- Eliminated ~100+ lines of duplicate type enumeration
- Adding new POD types now requires single entry in centralized macro
- Both functions now automatically stay in sync when types are added/removed
- Improved maintainability and reduced potential for inconsistencies
2. ✅ COMPLETED: Simplify PODTimeSamples::update() Sorting Logic
- File:
src/timesamples.cc:73-226 - Status: Completed (2025-01-18)
- Solution Implemented:
- Extracted three sorting strategies into separate helper functions (lines 77-180):
create_sort_indices()- Creates sorted index arraysort_with_offsets()- Strategy 1: Offset-backed sortingsort_with_compact_values()- Strategy 2: Legacy compact value storagesort_minimal()- Strategy 3: Minimal sorting (times + blocked flags only)
- Simplified
update()method to clean dispatch logic (lines 182-226) - Each strategy is now testable in isolation
- Extracted three sorting strategies into separate helper functions (lines 77-180):
- Impact:
- Improved code clarity - each sorting strategy is self-contained
- Reduced cognitive complexity of main update() method
- Easier to maintain and debug individual sorting paths
- Better separation of concerns
3. Refactor Repetitive add_* Methods
- Files:
src/timesamples.hh:198-300 - Opportunity: The
add_sample,add_array_sample, andadd_typed_array_samplemethods repeat the same underlying type checks, offset initialization, and error handling. A common template or base implementation could reduce duplication. - Pattern Found: Each method performs:
- Type ID validation
- Offset table initialization on first non-blocked sample
- Buffer resizing
- Similar error message construction
4. ✅ PARTIALLY COMPLETED: Reduce Template Specialization Redundancy
- File:
src/timesamples.cc - Status: Partially Completed (2025-01-18)
- Solution Implemented:
- Refactored
PODTimeSamples::add_sampleinstantiations (lines 732-794):- Before: 48 manual template instantiations (~68 lines)
- After: Macro-based generator with explicit list (~63 lines, but more maintainable)
- Uses
INSTANTIATE_ADD_SAMPLEmacro to reduce boilerplate
- Refactored
PODTimeSamples::add_typed_array_sampleinstantiations (lines 833-852):- Before: 21 manual instantiations (~21 lines)
- After: Macro-based generator using
TINYUSDZ_POD_TYPE_LIST+ 6 matrix types (~14 lines) - Reduction: ~33% fewer lines
- Refactored
- Impact:
- Reduced boilerplate for template instantiations
- Consistent pattern using centralized type registry where possible
- Easier to add new types that support TypedArray
- Remaining Work:
TypedTimeSamples::get()instantiations (140+ lines) could benefit from similar treatment- However, these include many non-POD types (vectors, strings, etc.) making macro generation complex
5. ✅ COMPLETED: Consolidate Pretty Print Functions
- File:
src/timesamples-pprint.cc - Status: Completed (2025-01-18)
- Solution Implemented:
- Created
OutputAdapterabstraction to unify string and StreamWriter output (lines 26-62) - Implemented unified
print_typeandprint_vectortemplates using SFINAE (lines 116-226) - Added type traits system with
is_value_typetemplate for compile-time type detection (lines 64-113) - Reduced both
pprint_pod_value_by_typefunctions from ~150 lines each to 4 lines each (lines 1382-1393) - Disabled 600+ lines of legacy print functions (wrapped in
#if 0block for future cleanup)
- Created
- Impact:
- Eliminated ~370 lines of duplicate switch statements
- All unit tests pass - functionality preserved
- Adding new types now requires single entry in dispatch table
6. ✅ COMPLETED: Unify Type Dispatch Mechanisms
- File:
src/timesamples-pprint.cc(completed for this file) - Status: Partially completed -
timesamples-pprint.ccdone (2025-01-18),timesamples.ccstill pending - Solution Implemented:
- Created centralized
print_pod_value_dispatchfunction using macro-based dispatch (lines 250-330) - Implemented
DISPATCH_POD_TYPE,DISPATCH_VALUE_TYPE, andDISPATCH_VECTOR_TYPEmacros - Handles 60+ type cases uniformly through single switch statement
- Uses adapter pattern to route both string and StreamWriter output through same dispatch logic
- Created centralized
- Impact:
- Reduced code duplication significantly
- Improved maintainability and extensibility
- Type dispatch now centralized and consistent
- Remaining Work:
src/timesamples.ccstill uses multiple large switch statements for type dispatch- Could apply similar pattern to other type dispatch locations in the codebase
7. Extract Common Buffer Management Logic
- Files:
src/timesamples.hh,src/timesamples.cc - Opportunity: The PODTimeSamples class manages several parallel buffers (_times, _values, _blocked, _offsets) with complex synchronization requirements. Extract a BufferManager class to handle:
- Coordinated resizing
- Offset management
- Dirty tracking
- Memory estimation
8. Simplify TimeSamples/PODTimeSamples Interaction
- Files:
src/timesamples.hh - Opportunity: The TimeSamples class wraps PODTimeSamples for POD types but maintains its own storage for non-POD types. This dual-storage approach leads to:
- Complex conditional logic throughout the API
- Duplication of methods between the two classes
- Potential for inconsistencies
- Solution: Consider a unified storage approach or clearer separation of responsibilities
C++ Core (src directory)
1. Consolidate File Type Detection
- File:
src/tinyusdz.cc - Opportunity: The
LoadUSDFromMemoryfunction contains repetitive code for detecting USDA, USDC, and USDZ file types. This logic can be centralized to reduce duplication. Similarly,LoadUSDZFromMemoryandLoadUSDZFromFilehave duplicated logic that could be shared.
2. Refactor Large if-else Chains
- Files:
src/usda-reader.cc,src/usdc-reader.cc - Opportunity: The
ReconstructPrimFromTypeNamefunctions in both the USDA and USDC readers are implemented as largeif-elsechains. This makes them difficult to maintain and extend. Refactoring this to use a map of function pointers or a similar factory pattern would be beneficial.
3. Decompose Large Functions
- Files:
src/usda-reader.cc,src/usdc-reader.cc,src/tydra/render-data.cc - Opportunity: Several functions are overly long and complex.
- In
usda-reader.ccandusdc-reader.cc, functions likeParseProperty,ParsePrimSpec, andReconstructPrimMetacould be broken down into smaller, more focused functions. - In
tydra/render-data.cc, theTriangulatePolygonfunction is a candidate for simplification and decomposition.
- In
4. Generalize Template Specializations
- File:
src/tydra/scene-access.cc - Opportunity: The
GetPrimPropertyandToPropertytemplate specializations contain a lot of repeated code. A more generic, template-based approach could reduce this duplication.
5. [Moved to Timesamples Module Section]
- See "Timesamples Module" section above for comprehensive refactoring opportunities for POD type metadata centralization.
6. [Moved to Timesamples Module Section]
- See "Timesamples Module" section above for comprehensive refactoring opportunities for PODTimeSamples sorting paths.
7. [Moved to Timesamples Module Section]
- See "Timesamples Module" section above for comprehensive refactoring opportunities for type/offset guards in POD samples.
8. ✅ COMPLETED: Fix TimeSamples Copy/Move Semantics for POD Value Preservation
- Files:
src/timesamples.hh:1147-1277(copy/move constructors and assignment operators) - Status: Completed (2025-10-26)
- Problem Statement:
- When parsing USDA files with timeSamples, scalar POD values (float, double, int) were appearing as empty/null in output
- Example:
float value.timeSamples = { 0: VALUE_PPRINT: TODO: (type: null), 1: VALUE_PPRINT: TODO: (type: null) } - This occurred specifically with USDA parsing while USDC (binary) format worked correctly
- The issue affected both the official test file
tests/usda/timesamples-array-001.usdaand custom test files
- Root Cause Analysis:
- The
_small_valuesmember (mutable vector<uint64_t>) stores POD scalar samples in compressed form during parsing - The copy assignment operator was missing the
_small_values = other._small_values;statement - When TimeSamples objects were assigned to attributes during construction, the POD values were lost
- Additionally, member initialization order in copy/move constructors didn't match class declaration order
- The
- Solution Implemented:
- Copy Assignment Operator (line 1262): Added missing
_small_values = other._small_values;statementTimeSamples& operator=(const TimeSamples& other) { if (this != &other) { _samples = other._samples; _times = other._times; _blocked = other._blocked; _small_values = other._small_values; // CRITICAL FIX: was missing! _values = other._values; _offsets = other._offsets; // ... rest of implementation } return *this; } - Copy Constructor (lines 1213-1229): Reordered member initialization list to match class declaration order
- Changed from:
..., _pod_samples(other._pod_samples), _small_values(other._small_values) - To:
..., _small_values(other._small_values), ..., _pod_samples(other._pod_samples)
- Changed from:
- Move Constructor (lines 1147-1163): Reordered member initialization list to match class declaration order
- Changed from:
..., _pod_samples(std::move(other._pod_samples)), _small_values(std::move(other._small_values)) - To:
..., _small_values(std::move(other._small_values)), ..., _pod_samples(std::move(other._pod_samples))
- Changed from:
- Class Member Declaration Order (lines 2655-2678): Verified order is:
_times,_blocked,_small_values,_values(core storage)_offsets(offset management)_type_id,_use_pod,_is_array, etc. (type metadata)_dirty,_dirty_start,_dirty_end(dirty tracking)_pod_samples(POD storage wrapper)
- Copy Assignment Operator (line 1262): Added missing
- Test Results:
- ✅ Simple float timeSamples:
float value.timeSamples = { 0: 1.5, 1: 2.5 }- WORKING - ✅ Double timeSamples:
double value.timeSamples = { 0: 1.123456, 1: 2.987654 }- WORKING - ✅ Integer timeSamples:
int value.timeSamples = { 0: 42, 1: 99 }- WORKING - ✅ Float array timeSamples:
float[] timeSamples = { 0: [1, 2, 3], 1: [4, 5, 6] }- WORKING - ✅ Double array timeSamples:
double[] timeSamples = { 0: [1.1, 2.2], 1: [3.3, 4.4] }- WORKING - ✅ Official test file:
tests/usda/timesamples-array-001.usda- XformOp and array timeSamples WORKING
- ✅ Simple float timeSamples:
- Known Limitations (Fixed in Part 2):
Bool and vector type timeSamples (e.g., float3) still show as null✅ FIXEDThese types don't use the unified POD storage path that this fix addresses✅ FIXEDResolution requires separate type reconstruction logic in✅ IMPLEMENTEDget_samples()methodOut of scope for this fix (would require significant changes to type handling)✅ COMPLETED
- Impact:
- ✅ Primary objective achieved: scalar POD types now correctly preserved through copy/move operations
- ✅ All unit tests pass with the fix in place
- ✅ USDA parsing now produces correct output matching USDC format behavior
- Improved robustness of C++ object semantics by ensuring all members are properly transferred
- Additional Fixes (2025-10-26 - Part 2):
- Bool Type Support in get_samples(): Added case for bool type reconstruction (line 1988-1992)
- Bool values stored in
_small_valuesnow correctly reconstructed - Values show as 0/1 in output (could be improved to show true/false)
- Bool values stored in
- Vector Type Support (float3, point3f, color3f) in get_samples(): Added handling for types > 8 bytes (lines 1998-2038)
- Types larger than 8 bytes are stored in
_valuesbuffer with offsets in_offsets - Added reconstruction logic for float3, point3f, and color3f types
- Properly decodes offset using
OFFSET_VALUE_MASKto retrieve data from_valuesbuffer
- Types larger than 8 bytes are stored in
- Comprehensive Test Results:
- ✅ bool: Working (shows 0/1)
- ✅ float, double, int: Working with correct values
- ✅ float3: Working with correct tuple values
- ✅ point3f: Working with correct tuple values
- ✅ color3f: Working with correct tuple values
- ✅ float[], double[]: Working with correct array values
- ✅ Official test file
timesamples-array-001.usda: All timeSamples working correctly
- Bool Type Support in get_samples(): Added case for bool type reconstruction (line 1988-1992)
9. ✅ COMPLETED: Reduce Header File Complexity via Implementation Separation
- Files:
src/timesamples.hh(3,388 → 3,233 lines),src/timesamples.cc(1,325 → 1,503 lines) - Status: Completed (2025-10-27)
- Problem Statement:
- The
timesamples.hhheader file was becoming increasingly large (3,388 lines) with multiple non-template method implementations inlined - This increased compilation time and made the header harder to navigate
- Non-template methods (
TimeSamplesconstructors, assignment operators,clear(),init()) were candidates for moving to the implementation file
- The
- Solution Implemented:
-
Moved Constructor/Assignment Operator Implementations (6 methods):
- Move constructor (
TimeSamples(TimeSamples&&) noexcept) - 28 lines - Move assignment operator (
operator=(TimeSamples&&) noexcept) - 33 lines - Copy constructor (
TimeSamples(const TimeSamples&)) - 23 lines - Copy assignment operator (
operator=(const TimeSamples&)) - 28 lines clear()method - 18 linesinit(uint32_t)method - 17 lines- Total lines moved: ~147 lines
- Move constructor (
-
Namespace Organization (Critical Fix):
- Initial build failed with:
error: '_type_id' was not declared in this scope - Root cause: Implementations were outside the
valuenamespace, unable to access private members - Solution: Wrapped all moved implementations in
namespace value { }block - All implementations now properly scoped within the class's namespace
- Initial build failed with:
-
File Organization:
- Added clear section header in timesamples.cc:
// ============================================================================ // TimeSamples Implementation // ============================================================================ namespace value { // Implementation code... } // namespace value } // namespace tinyusdz
- Added clear section header in timesamples.cc:
-
- Code Changes Detail:
-
In timesamples.hh: Replaced inline implementations with declarations only
// Before (inline implementation ~147 lines total): TimeSamples(TimeSamples&& other) noexcept { // implementation... } // After (declaration only): TimeSamples(TimeSamples&& other) noexcept; -
In timesamples.cc: Added corresponding implementations wrapped in namespace
namespace value { TimeSamples::TimeSamples(TimeSamples&& other) noexcept : _samples(std::move(other._samples)), _times(std::move(other._times)), // ... full initialization list ... { // Reset moved-from object to valid state other._dirty = false; other._dirty_start = 0; other._dirty_end = 0; } // ... other implementations ... } // namespace value } // namespace tinyusdz
-
- Metrics:
- Header reduction: 3,388 → 3,233 lines (155 lines, 4.6% reduction)
- Implementation growth: 1,325 → 1,503 lines (178 lines added for implementations + comments)
- Compilation impact: Reduces header bloat without significant binary size impact
- Build & Test Results:
- ✅ Full build completed successfully:
[100%] Built target unit-test-tinyusdz - ✅ All unit tests passing (20 tests, including timesamples_test)
- ✅ No regressions in functionality
- ✅ tusdcat binary builds and executes correctly
- ✅ Complex timeSamples (bool, vector, array types) still working correctly
- ✅ Full build completed successfully:
- Impact:
- Cleaner header file - easier to navigate class interface
- Faster header parsing during compilation
- Non-template code properly belongs in .cc file per C++ best practices
- Establishes pattern for future refactoring (additional ~600 lines of template methods could follow similar pattern with explicit instantiation)
- Known Limitations & Future Work:
- Additional refactoring candidates identified (Phase 2):
get_typed_array_at_time()(95 lines) - could move with explicit template instantiationget_typed_array_at()(89 lines) - could move with explicit template instantiation- Additional PODTimeSamples methods (~200+ lines) - candidates for similar treatment
- These would require explicit template instantiation pattern similar to
INSTANTIATE_ADD_SAMPLEmacro already in use - Phase 2 could achieve additional 30% header reduction (~600+ more lines)
- Additional refactoring candidates identified (Phase 2):
- Pattern for Continuing Refactoring:
- For template methods: Use explicit template instantiation in .cc file
- For non-template methods: Simply move implementation as demonstrated
- Always maintain proper namespace scoping around implementations
- Update .hh to contain only declarations, .cc contains implementations
10. Generic Index Accessors
- File:
src/crate-reader.cc:141 - Opportunity:
GetField,GetToken,GetPath,GetElementPath, andGetPathStringall share the same bounds-check pattern. A templatedlookup_optional(vector, Index)(with optional logging hook) would remove boilerplate and centralize future diagnostics.
JavaScript/WASM Bindings (web directory)
1. Modularize Emscripten Bindings
- File:
web/binding.cc - Opportunity: This is a very large file containing all Emscripten bindings. It should be split into multiple files based on functionality (e.g.,
stage_bindings.cc,scene_bindings.cc,asset_bindings.cc) to improve organization and build times.
2. Refactor TinyUSDZLoaderNative
- File:
web/binding.cc - Opportunity: The
TinyUSDZLoaderNativeclass has too many responsibilities. The asset caching and streaming logic, for example, could be extracted into a separateAssetManagerclass.
3. Consolidate JavaScript Loading Logic
- File:
web/js/src/tinyusdz/TinyUSDZLoader.js - Opportunity: The
loadandloadAsLayermethods share a lot of similar logic. This could be consolidated into a common internal loading function.
4. Data-Driven Material Conversion
- File:
web/js/src/tinyusdz/TinyUSDZLoaderUtils.js - Opportunity: The
convertUsdMaterialToMeshPhysicalMaterialfunction, which maps USD material properties to Three.js material properties, could be refactored to be more data-driven. Using a mapping table or a similar approach would make it easier to add or modify material property mappings.