Strategy Comparison and Selection Guide
This guide helps you choose the right strategy for your data processing needs by comparing the characteristics, use cases, and performance implications of each strategy.
Quick Strategy Selector
`rst
+------------------+----------------+--------------+---------------+------------------+---------------+
| **Use Case** | **DropCreate** | **Truncate** | **Append** | **Upsert** | **Historize** |
+==================+================+==============+===============+==================+===============+
| Full reload | ✅ | ✅ | ❌ | ❌ | ❌ |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Incremental load | ❌ | ❌ | ✅ | ✅ | ✅ |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Change tracking | ❌ | ❌ | ❌ | ❌ | ✅ |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Schema evolution | ❌ | ✅ | ✅ | ✅ | ✅ |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Duplicate | N/A | N/A | None | Update | Version |
| handling | | | | | |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Performance | Fast | Fast | Fast | Medium | Slow |
+------------------+----------------+--------------+---------------+------------------+---------------+
| Storage overhead | Low | Low | Low | Low | High |
+------------------+----------------+--------------+---------------+------------------+---------------+
`
.. raw:: html
</div>
Detailed Strategy Comparison
Data Handling Characteristics
- DropCreate
Target handling: Completely recreates target (drop + create)
Source processing: All source data treated as new
Existing data: Permanently lost
Schema changes: Target inherits source schema exactly
Metadata: Uniform across all records
- Truncate
Target handling: Clears target data, preserves structure
Source processing: All source data treated as new
Existing data: Removed but structure preserved
Schema changes: Applied before loading data
Metadata: Uniform across all records
- Append
Target handling: Preserves all existing data
Source processing: All source data added to target
Existing data: Completely preserved
Schema changes: Applied without affecting existing records
Metadata: Mixed hashes (existing vs new data)
- Upsert
Target handling: Updates existing, inserts new based on keys
Source processing: Key-based matching determines insert vs update
Existing data: Updated if keys match, otherwise preserved
Schema changes: Applied before key-based operations
Metadata: Mixed hashes (updated vs unchanged records)
- Historize
Target handling: Maintains full history of changes over time
Source processing: Creates new versions for changed records
Existing data: Preserved with validity timestamps
Schema changes: Applied while maintaining historical continuity
Metadata: Complex versioning with validity periods
Performance Characteristics
Strategy |
Write Speed |
Read Speed |
Storage |
Memory |
Complexity |
|---|---|---|---|---|---|
DropCreate |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐ |
⭐ |
Truncate |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐⭐ |
⭐⭐⭐⭐ |
⭐⭐ |
Append |
⭐⭐⭐⭐ |
⭐⭐⭐⭐ |
⭐⭐⭐⭐ |
⭐⭐⭐ |
⭐⭐ |
Upsert |
⭐⭐⭐ |
⭐⭐⭐ |
⭐⭐⭐⭐ |
⭐⭐ |
⭐⭐⭐⭐ |
Historize |
⭐⭐ |
⭐⭐ |
⭐⭐ |
⭐⭐ |
⭐⭐⭐⭐⭐ |
Use Case Decision Tree
Do you need to preserve existing data?
├── NO → Do you need to preserve target structure?
│ ├── NO → Use DropCreate (fastest, complete recreation)
│ └── YES → Use Truncate (fast, preserves schema)
└── YES → Do you have identifying keys for records?
├── NO → Use Append (simple addition, no deduplication)
└── YES → Do you need to track changes over time?
├── NO → Use Upsert (update existing, insert new)
└── YES → Use Historize (full change tracking)
Real-World Scenarios
- ETL Data Warehouse Loading
Daily fact table refresh: DropCreate or Truncate
Incremental dimension updates: Upsert
Event stream processing: Append
Slowly changing dimensions: Historize
- Data Lake Operations
Raw data ingestion: Append
Curated data updates: Upsert
Historical data preservation: Historize
Data quality corrections: DropCreate
- Analytics and Reporting
Dashboard data refresh: Truncate
Audit trail maintenance: Historize
Incremental aggregations: Append or Upsert
Data mart population: DropCreate
Strategy Selection Guidelines
Choose DropCreate when: - Target schema frequently changes - Data quality issues require clean rebuilds - Processing time is not critical - Storage space is limited - Simple, predictable operations are preferred
Choose Truncate when: - Target schema is stable but data changes completely - Fast reload performance is critical - Schema evolution support is needed - Existing target structure should be preserved
Choose Append when: - All data is valuable and should be retained - Source provides only new/incremental data - Duplicate detection is handled upstream - Simple growth patterns are acceptable
Choose Upsert when: - Source contains mix of new and updated records - Business keys can identify record relationships - Current state accuracy is more important than history - Moderate complexity is acceptable for accuracy
Choose Historize when: - Regulatory compliance requires change tracking - Business analysis needs historical trends - Audit capabilities are essential - Storage costs are less important than data completeness - Complex temporal queries will be performed
Common Anti-Patterns
Avoid these combinations:
Append for dimension tables: Use Upsert instead to handle changes
DropCreate for large fact tables: Use Truncate for better performance
Historize for high-frequency data: Consider Append with custom timestamps
Upsert without proper keys: Use Append to avoid incorrect matching
Truncate with frequent schema changes: Use DropCreate for flexibility
Migration Strategies
When changing between strategies:
From DropCreate/Truncate to Append/Upsert: Requires one-time historical data load
From Append to Upsert: Requires deduplication and key identification
From Upsert to Historize: Existing data becomes initial historical state
From any strategy to DropCreate: Always safe but loses historical data
Each migration path has specific considerations for data preservation and downtime requirements.