
Conversation


@pnodet pnodet commented Sep 1, 2025

Summary

  • Replaced inefficient linear search algorithm with direct mathematical calculation
  • Achieved up to 447x performance improvement for large coordinate values
  • Maintains functional equivalence with comprehensive test coverage

Changes

  • Algorithm optimization: Replaced the O(n) linear search (incrementing by 0.0001) with an O(1) direct calculation (see the sketch after this list)
  • Early return optimization: Added fast path for small values that work with minimum scale
  • Comprehensive testing: Added unit tests covering edge cases and boundary conditions
  • Performance benchmarking: Added criterion benchmarks to measure improvements
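A minimal sketch of the before/after: the new function below matches the closed form shown in the review diffs further down this page, while the old loop is reconstructed from its description (0.0001 increments), so treat its exact shape as approximate.

```rust
/// Old approach (reconstructed): grow the scale in 0.0001 steps until
/// |x| / scale fits into an i32 coordinate -- O(n) in the magnitude of x.
fn find_smallest_scale_old(x: f64) -> f64 {
    const MIN_SCALE: f64 = 0.001;
    const QUANTUM: f64 = 1e-4;
    let mut scale = MIN_SCALE;
    while x.abs() / scale > f64::from(i32::MAX) {
        scale += QUANTUM;
    }
    scale
}

/// New approach: jump straight to the smallest admissible scale -- O(1).
fn find_smallest_scale(x: f64) -> f64 {
    const MIN_SCALE: f64 = 0.001;
    const QUANTUM: f64 = 1e-4;
    // Fast path: small values already fit at the minimum scale.
    if x.abs() <= f64::from(i32::MAX) * MIN_SCALE {
        return MIN_SCALE;
    }
    // Smallest scale that maps |x| into i32 range, rounded up to the
    // next 0.0001 step to match the legacy increment.
    let theoretical_min = x.abs() / f64::from(i32::MAX);
    let scale = (theoretical_min / QUANTUM).ceil() * QUANTUM;
    scale.max(MIN_SCALE)
}
```

Both versions land on the same 0.0001-step grid, which is why the results stay identical while the cost drops from hundreds of thousands of loop iterations to a handful of float operations.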

Performance Improvements

Benchmark results using Criterion 0.5:

| Input Size | Old Algorithm | New Algorithm | Speedup |
|---|---|---|---|
| Small (1000) | 631.60 ps | 491.79 ps | 1.3x |
| Medium (2.15e9) | 7.50 µs | 822.42 ps | 9.1x |
| Large (1e10) | 35.85 µs | 838.39 ps | 42.8x |
| Very Large (1e11) | 366.85 µs | 820.28 ps | 447.3x |
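To tie the table to the formula: for the Large input (1e10), the closed form computes 1e10 / i32::MAX ≈ 4.65661 and rounds it up to a scale of 4.6567 in a few float operations, where the old loop needed roughly 46,000 increments of 0.0001 to reach the same value.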

Testing

  • ✅ All existing tests pass
  • ✅ Added 5 new unit tests for edge cases
  • ✅ Criterion benchmarks demonstrate performance gains (see the harness sketch after this list)
  • ✅ Algorithm produces identical scales to original implementation
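For reference, a minimal sketch of how such a comparison is typically registered with Criterion 0.5; the PR's actual benches/scale_finding.rs appears only in fragments in the review below, so the group/ID layout mirrors those fragments and the stub stands in for the real function under test.

```rust
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion};

// Stand-in for the function under test (see the review diffs below).
fn find_smallest_scale_new(x: f64) -> f64 {
    let scale = ((x.abs() / f64::from(i32::MAX)) / 1e-4).ceil() * 1e-4;
    scale.max(0.001)
}

fn benchmark_scale_finding(c: &mut Criterion) {
    let mut group = c.benchmark_group("scale_finding");
    for (name, value) in [("small", 1000.0_f64), ("very_large", 1e11)] {
        // black_box keeps the optimizer from constant-folding the input.
        group.bench_with_input(BenchmarkId::new("new", name), &value, |b, &val| {
            b.iter(|| find_smallest_scale_new(black_box(val)))
        });
    }
    group.finish();
}

criterion_group!(benches, benchmark_scale_finding);
criterion_main!(benches);
```

Together with `harness = false` in the `[[bench]]` section (shown in the Cargo.toml nitpick below), `cargo bench` runs this file under Criterion's harness instead of libtest's.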

The optimization provides substantial performance benefits, especially for the large coordinate values commonly found in real-world point cloud data.

Summary by CodeRabbit

  • New Features
    • None
  • Refactor
    • Reworked coordinate scaling to a deterministic calculation with fixed precision, improving consistency and performance when writing LAS files.
  • Bug Fixes
    • Increased reliability near integer bounds and for negative coordinates by ensuring scaled values stay within valid ranges.
  • Tests
    • Added comprehensive unit tests for scaling logic and edge cases.
  • Chores
    • Adjusted release build settings to favor runtime performance (single codegen unit; see the snippet after this list).
    • Minor Cargo configuration housekeeping.
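For reference, the release-profile tweak mentioned above corresponds to a Cargo.toml stanza like this (placement per the Cargo.toml diff reviewed below):

```toml
[profile.release]
codegen-units = 1  # fewer, larger codegen units: slower builds, better-optimized binaries
```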


coderabbitai bot commented Sep 1, 2025

Walkthrough

Replaced an iterative scale search in src/get_las_writer.rs with a closed-form calculation using MIN_SCALE and QUANTUM. Added unit tests for the scale function. Updated Cargo.toml to include an empty [dev-dependencies] section and set codegen-units = 1 under [profile.release].

Changes

| Cohort / File(s) | Summary |
|---|---|
| Build and profiles<br>`Cargo.toml` | Added empty `[dev-dependencies]`. Set `[profile.release]` `codegen-units = 1`. No other dependency/profile changes. |
| Scale computation logic and tests<br>`src/get_las_writer.rs` | Replaced loop-based smallest-scale search with closed-form computation using `MIN_SCALE = 0.001`, `QUANTUM = 1e-4`. Ensures scale ≥ min and quantized to 4 decimals. Added unit tests covering small/large/boundary/negative cases and precision. No public API changes. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor Caller
    participant Writer as get_las_writer
    participant Scale as find_smallest_scale

    Caller->>Writer: request LAS writer for value x
    activate Writer
    Writer->>Scale: compute smallest_scale(x)
    activate Scale
    alt |x| <= i32::MAX * MIN_SCALE
        Scale-->>Writer: return MIN_SCALE
    else |x| > threshold
        Scale-->>Writer: return ceil((|x|/i32::MAX)/QUANTUM)*QUANTUM
    end
    deactivate Scale
    Writer-->>Caller: constructed writer with scale
    deactivate Writer
```

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I nibbled the loops to a crisp, you see,
Swapped churn for a neat little formula spree.
With MINs and QUANTUMs I hop in line,
Four tiny decimals—precise and fine.
Cargo packed tighter, codegen set to one—
A hare’s quick fix, and off I run! 🥕🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 71.43%, below the required threshold of 80.00%. | Run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title clearly and concisely summarizes the primary change: replacing an O(n) scale-finding algorithm with an O(1) approach for a performance improvement. It is specific (mentions algorithmic complexity and perf), relates directly to the PR description, tests, and benchmarks, and contains no noisy or irrelevant wording. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 45b758e and d2c68fa.

📒 Files selected for processing (2)
  • Cargo.toml (1 hunks)
  • src/get_las_writer.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/get_las_writer.rs
  • Cargo.toml


@pnodet pnodet requested a review from pnwatin September 1, 2025 16:12

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (8)
Cargo.toml (1)

36-38: Bench target config: consider minor QoL tweaks

Optional:

  • Enable HTML reports for clearer output.
  • Name the group consistently with the file.

Suggested change:

```diff
 [[bench]]
 name = "scale_finding"
 harness = false
+## optional: nicer reports
+# criterion is dev-only; enabling reports won’t affect the library
+# run with: cargo bench
```
src/get_las_writer.rs (4)

15-24: Replace magic 10000.0 with a quantization constant; keep math identical

Improves readability and keeps benches/tests in sync with the same quantum.

```diff
-    let theoretical_min = x.abs() / f64::from(i32::MAX);
-
-    let scale = (theoretical_min * 10000.0).ceil() / 10000.0;
-
-    scale.max(MIN_SCALE)
+    let theoretical_min = x.abs() / f64::from(i32::MAX);
+    // Quantize to 0.0001 steps, matching the legacy increment.
+    const QUANTUM: f64 = 1e-4;
+    let scale = ((theoretical_min / QUANTUM).ceil()) * QUANTUM;
+    scale.max(MIN_SCALE)
```

60-71: Tests: avoid repeating the magic 0.001 and tie to a single constant

Hoist MIN_SCALE to module scope so tests and impl share the same value, preventing drift.

Patch (also updates the function to use the shared constant):

```diff
-use crate::LasVersion;
+use crate::LasVersion;
+
+const MIN_SCALE: f64 = 0.001;
@@
-fn find_smallest_scale(x: f64) -> f64 {
-    const MIN_SCALE: f64 = 0.001;
+fn find_smallest_scale(x: f64) -> f64 {
@@
-        assert_eq!(find_smallest_scale(0.0), 0.001);
-        assert_eq!(find_smallest_scale(1.0), 0.001);
-        assert_eq!(find_smallest_scale(1000.0), 0.001);
-        assert_eq!(find_smallest_scale(-1000.0), 0.001);
+        assert_eq!(find_smallest_scale(0.0), MIN_SCALE);
+        assert_eq!(find_smallest_scale(1.0), MIN_SCALE);
+        assert_eq!(find_smallest_scale(1000.0), MIN_SCALE);
+        assert_eq!(find_smallest_scale(-1000.0), MIN_SCALE);
@@
-        let max_with_min_scale = f64::from(i32::MAX) * 0.001;
-        assert_eq!(find_smallest_scale(max_with_min_scale), 0.001);
+        let max_with_min_scale = f64::from(i32::MAX) * MIN_SCALE;
+        assert_eq!(find_smallest_scale(max_with_min_scale), MIN_SCALE);
```

89-99: Boundary test: add a “half-step” case to pin rounding behavior

This catches off-by-one due to .round() semantics near the threshold.

```diff
     #[test]
     fn test_find_smallest_scale_boundary() {
         // Test value just above the threshold
-        let just_above = f64::from(i32::MAX) * 0.001 + 1.0;
+        let just_above = f64::from(i32::MAX) * MIN_SCALE + 1.0;
         let scale = find_smallest_scale(just_above);
-        assert!(scale > 0.001);
+        assert!(scale > MIN_SCALE);

         // Verify it still works
         let scaled = (just_above / scale).round();
         assert!(scaled <= f64::from(i32::MAX));
+
+        // Half-step close to the threshold
+        let half_step = f64::from(i32::MAX) * MIN_SCALE + (MIN_SCALE / 2.0);
+        let s2 = find_smallest_scale(half_step);
+        let scaled2 = (half_step / s2).round();
+        assert!(scaled2 <= f64::from(i32::MAX));
     }
```

113-124: Precision test: include negatives and assert “scale >= theoretical_min”

Strengthens invariants and symmetry.

```diff
-        let test_values = [2.15e9, 3.7e9, 5.5e9, 1e11];
+        let test_values = [2.15e9, 3.7e9, 5.5e9, 1e11, -2.15e9, -1e11];
@@
             let scale = find_smallest_scale(value);
             // Check that scale has at most 4 decimal places
             let multiplied = scale * 10000.0;
             assert!((multiplied - multiplied.round()).abs() < 1e-10);
+            // And that it meets the theoretical lower bound
+            let theoretical_min = value.abs() / f64::from(i32::MAX);
+            assert!(scale + 1e-12 >= theoretical_min);
```
benches/scale_finding.rs (3)

17-26: Keep constants/quantization aligned with src to avoid drift

Define MIN_SCALE/QUANTUM once in this bench file and reuse in both impls for readability and consistency.

```diff
-// New optimized implementation
-fn find_smallest_scale_new(x: f64) -> f64 {
-    const MIN_SCALE: f64 = 0.001;
+const MIN_SCALE: f64 = 0.001;
+const QUANTUM: f64 = 1e-4;
+
+// New optimized implementation
+fn find_smallest_scale_new(x: f64) -> f64 {
     if x.abs() <= f64::from(i32::MAX) * MIN_SCALE {
         return MIN_SCALE;
     }

     let theoretical_min = x.abs() / f64::from(i32::MAX);
-    let scale = (theoretical_min * 10000.0).ceil() / 10000.0;
+    let scale = ((theoretical_min / QUANTUM).ceil()) * QUANTUM;
     scale.max(MIN_SCALE)
 }
```

28-53: Broaden inputs (negatives) and assert parity once before timing

Adds quick sanity to ensure both paths match before benchmarking.

```diff
-    let test_values = [
-        ("small", 1000.0),
-        ("medium", 2.15e9),
-        ("large", 1e10),
-        ("very_large", 1e11),
-    ];
+    let test_values = [
+        ("small", 1000.0),
+        ("medium", 2.15e9),
+        ("large", 1e10),
+        ("very_large", 1e11),
+        ("neg_medium", -2.15e9),
+        ("neg_very_large", -1e11),
+    ];
@@
-    for (name, value) in test_values {
+    for (name, value) in test_values {
+        // Quick equivalence check (outside the hot loop)
+        assert!(
+            (find_smallest_scale_old(value) - find_smallest_scale_new(value)).abs() < 1e-12,
+            "old/new mismatch for {name}"
+        );
         group.bench_with_input(
             BenchmarkId::new("old", name),
             &value,
             |b, &val| b.iter(|| find_smallest_scale_old(black_box(val)))
         );
```

55-79: Throughput bench: add throughput metadata and reuse the same dataset for both benches

This improves report interpretability and avoids accidental optimizer bias.

```diff
 fn benchmark_throughput(c: &mut Criterion) {
-    let mut group = c.benchmark_group("throughput");
+    let mut group = c.benchmark_group("throughput");

-    let values: Vec<f64> = (0..1000)
+    let values: Vec<f64> = (0..1000)
         .map(|i| 1e6 * (i as f64))
         .collect();
+    group.throughput(criterion::Throughput::Elements(values.len() as u64));

     group.bench_function("old_algorithm", |b| {
         b.iter(|| {
-            for &val in &values {
+            for &val in black_box(&values) {
                 find_smallest_scale_old(black_box(val));
             }
         })
     });

     group.bench_function("new_algorithm", |b| {
         b.iter(|| {
-            for &val in &values {
+            for &val in black_box(&values) {
                 find_smallest_scale_new(black_box(val));
             }
         })
     });
```
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between af7284d and 45b758e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • Cargo.toml (1 hunks)
  • benches/scale_finding.rs (1 hunks)
  • src/get_las_writer.rs (2 hunks)
🔇 Additional comments (3)
Cargo.toml (1)

28-30: Dev-only Criterion addition looks good

Scoped under [dev-dependencies]; won’t bloat release builds.

src/get_las_writer.rs (1)

10-13: Add finite-input guard to find_smallest_scale (optional)

If max_cartesian can be NaN or infinite (e.g. from corrupt E57 data), the arithmetic can propagate a non-finite scale. Consider adding at the top of find_smallest_scale:

```diff
 fn find_smallest_scale(x: f64) -> f64 {
     const MIN_SCALE: f64 = 0.001;
+    debug_assert!(x.is_finite(), "max_cartesian must be finite");
+    if !x.is_finite() {
+        return MIN_SCALE;
+    }
     if x.abs() <= f64::from(i32::MAX) * MIN_SCALE {
         return MIN_SCALE;
     }
```

Confirm whether the E57 reader always yields finite coordinates; if it does, this guard can be omitted.

benches/scale_finding.rs (1)

4-14: Old baseline retained for comparison — OK

Keeping the slow path here is useful for parity checks and clarity in benches.

pnodet and others added 4 commits September 17, 2025 22:12
- Replace linear search with direct mathematical calculation
- Add early return for small values that work with minimum scale
- Calculate theoretical minimum scale based on i32 bounds
- Add criterion benchmarks showing up to 447x speedup
- Maintain functional equivalence with comprehensive tests
- Replace magic numbers with named constants (MIN_SCALE, QUANTUM)
- Add helpful comments to Cargo.toml for benchmark usage
- Improve test coverage with negative values and half-step boundary cases
- Add theoretical minimum assertions to validate scale correctness
- Align constants between implementation and benchmarks
- Add throughput metadata to benchmarks for better reporting
- Add equivalence checks in benchmarks to ensure parity
@pnwatin pnwatin force-pushed the optimize-scale-finding branch from 2b780ce to d2c68fa Compare September 17, 2025 20:15
@pnwatin pnwatin merged commit f439629 into main Sep 17, 2025
2 checks passed
@pnwatin pnwatin deleted the optimize-scale-finding branch September 17, 2025 20:19