[otap-df-quiver] Add Write-Ahead-Log (WAL) implementation to Quiver #1537

AaronRM · 2025-12-05T21:52:04Z

This pull request introduces an implementation of the Quiver write-ahead log (WAL). The most significant change from the initial spec is a rewrite and clarification of the WAL file rotation and checkpointing mechanism. Documentation has been updated to reflect the new design.

rust/otap-dataflow/deny.toml

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

lalitb · 2025-12-06T21:01:35Z

rust/otap-dataflow/crates/quiver/src/wal/reader.rs

+        }
+
+        let entry_len = u32::from_le_bytes(len_buf) as usize;
+        self.buffer.resize(entry_len, 0);


Here, the reader trusts the length read from the file and allocates accordingly. If the WAL file is corrupted or maliciously crafted and the length is very large (say 0xFFFF..), the reader will try to allocate that much memory. While the 4-byte length field limits a single allocation to 4GB, every df_instance doing this allocation can result in an OOM crash. Should we have some kind of maximum limit check (say, WAL size won't be more than 64MB)?

Yes, good point. The default rotation target size for a single file is 64MB. I added an upper bound of 256MB for the rotation target file size, which the reader also uses as an upper bound on entry size.

Additionally, when we are reading the file, I added validation to ensure entry_len doesn't exceed the remaining file length to guard against this corruption/attack scenario.
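For readers following the thread, the two guards described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the names `MAX_ENTRY_LEN` and `read_entry`, the `remaining` parameter, and the use of `io::ErrorKind::InvalidData` are all assumptions made for the example.

```rust
use std::io::{self, Read};

/// Hypothetical cap on a single entry, mirroring the 256MB upper bound on the
/// rotation target mentioned above. The constant name is an assumption.
const MAX_ENTRY_LEN: u64 = 256 * 1024 * 1024;

/// Reads a little-endian u32 length prefix, validates it, then reads the body.
/// `remaining` is the number of bytes left in the file at the current position,
/// including the 4-byte prefix itself.
fn read_entry<R: Read>(
    reader: &mut R,
    remaining: u64,
    buffer: &mut Vec<u8>,
) -> io::Result<usize> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let entry_len = u64::from(u32::from_le_bytes(len_buf));

    // Bytes that can still follow the length prefix in this file.
    let after_prefix = remaining.saturating_sub(4);

    // Reject the length before allocating anything based on it.
    if entry_len > MAX_ENTRY_LEN || entry_len > after_prefix {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            format!("entry length {entry_len} exceeds allowed bounds"),
        ));
    }

    // Safe to allocate: the length is bounded by both checks above.
    buffer.resize(entry_len as usize, 0);
    reader.read_exact(buffer)?;
    Ok(entry_len as usize)
}
```

The point of the ordering is that `buffer.resize` only runs after both checks pass, so a corrupted prefix costs a 4-byte read and an error rather than a multi-gigabyte allocation.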

rust/otap-dataflow/crates/quiver/src/engine.rs

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

lalitb · 2025-12-06T21:27:54Z

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

+            path,
+            segment_cfg_hash,
+            flush_policy,
+            max_wal_size: u64::MAX,


More of a question: what happens if we have reached the limit of 8 rotated WAL files of 64MB each, while the max_wal_size limit has still not been reached?

There are two different size enforcement mechanisms:

1. rotation_target_bytes x max_rotated_files
2. max_wal_size

If we reach the limit for (1), we return a WalAtCapacity error and (2) is not enforced. (1) controls rotation behavior and limits the number of file descriptors used. (2) is for global size enforcement. (We could potentially simplify and eliminate (2), but I will leave it in for now.)

See tests: wal_writer_errors_when_rotated_file_cap_reached and wal_writer_preflight_rejects_when_rotated_file_cap_hit.
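As a rough illustration of how the two mechanisms interact, here is a small sketch. It is not the writer's actual preflight logic: `WalLimits`, `preflight_append`, the `MaxWalSizeExceeded` variant, and the rule that rotation is needed when the active file would exceed the target are all assumptions for the example. Only the `WalAtCapacity` error name and the three configuration names come from the discussion above.

```rust
const MB: u64 = 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
enum WalError {
    /// Mechanism (1): the rotated-file budget is exhausted.
    WalAtCapacity,
    /// Mechanism (2): the global byte budget would be exceeded.
    /// The variant name is hypothetical.
    MaxWalSizeExceeded,
}

/// Hypothetical bundle of the limits discussed in this thread.
struct WalLimits {
    rotation_target_bytes: u64,
    max_rotated_files: usize,
    max_wal_size: u64,
}

/// Checks whether an entry of `entry_len` bytes may be appended.
fn preflight_append(
    limits: &WalLimits,
    rotated_files: usize,
    active_len: u64,
    total_bytes: u64,
    entry_len: u64,
) -> Result<(), WalError> {
    // Assumed rule: rotate when the active file would exceed the target.
    let needs_rotation =
        active_len.saturating_add(entry_len) > limits.rotation_target_bytes;

    // Mechanism (1) is checked first, so (2) is never consulted once the
    // rotated-file cap is hit.
    if needs_rotation && rotated_files >= limits.max_rotated_files {
        return Err(WalError::WalAtCapacity);
    }

    // Mechanism (2): with max_wal_size = u64::MAX this can never trigger.
    if total_bytes.saturating_add(entry_len) > limits.max_wal_size {
        return Err(WalError::MaxWalSizeExceeded);
    }
    Ok(())
}
```

With `max_wal_size` left at `u64::MAX`, as in the diff above, only the rotated-file cap can reject a write in this sketch, which matches the scenario asked about: 8 rotated files of 64MB each.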

lalitb · 2025-12-06T21:43:37Z

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

+            }
+        }
+        Ok(highest.map_or(0, |seq| seq.wrapping_add(1)))
+    }


detect_next_sequence() scans all entries in all WAL files on startup. For the default capacity (64MB * 8 = 512MB) this is fine, but would it make sense to persist the last sequence in the checkpoint sidecar for faster recovery?

Better yet, detect_next_sequence() doesn't need to scan all the rotated WAL files. It should only look at the active file (or the single most recent rotated file if the active file doesn't exist). Measuring this locally, the time is on the order of ~30ms with a 64MB file, so it doesn't seem worth putting in the sidecar.

> Measuring this locally and time is on the order of ~30ms with a 64MB file, so doesn't seem worth putting in the sidecar.

Agreed on 30ms for typical cases, though edge devices with slower storage could see higher times. For now, it should be fine to scan only the active file, starting from the checkpoint offset (since we read the sidecar before the WAL) rather than from the beginning. We can revisit persisting the sequence number if startup time becomes an issue in production.
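The suggestion to scan from the checkpoint offset can be sketched as below. Everything about the framing here is an assumption for illustration only (a u32 little-endian body length followed by a u64 little-endian sequence number and a payload), as are the function names. The fallback to a full scan when nothing follows the checkpoint is also an assumption, and the case where the active file is missing is not modeled.

```rust
/// Encodes one entry in the hypothetical framing used by this sketch:
/// [u32 LE body length][u64 LE sequence][payload].
fn encode_entry(seq: u64, payload: &[u8]) -> Vec<u8> {
    let body_len = (8 + payload.len()) as u32;
    let mut out = Vec::with_capacity(4 + body_len as usize);
    out.extend_from_slice(&body_len.to_le_bytes());
    out.extend_from_slice(&seq.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns the highest sequence number found at or after `start`,
/// stopping at the first partial or malformed entry.
fn highest_sequence_from(file: &[u8], start: usize) -> Option<u64> {
    let mut pos = start;
    let mut highest: Option<u64> = None;
    while let Some(header) = file.get(pos..).and_then(|rest| rest.get(..4)) {
        let body_len =
            u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
        let body_start = pos + 4;
        let body = match file.get(body_start..).and_then(|rest| rest.get(..body_len)) {
            Some(body) if body.len() >= 8 => body,
            _ => break, // truncated tail or body too short to hold a sequence
        };
        let mut seq_bytes = [0u8; 8];
        seq_bytes.copy_from_slice(&body[..8]);
        let seq = u64::from_le_bytes(seq_bytes);
        highest = Some(highest.map_or(seq, |h| h.max(seq)));
        pos = body_start + body_len;
    }
    highest
}

/// Scans from the checkpoint offset first; if no entry follows it, falls back
/// to scanning the whole active file (an assumption of this sketch).
fn next_sequence_from_checkpoint(active_file: &[u8], checkpoint_offset: usize) -> u64 {
    highest_sequence_from(active_file, checkpoint_offset)
        .or_else(|| highest_sequence_from(active_file, 0))
        .map_or(0, |seq| seq.wrapping_add(1))
}
```

The final `map_or(0, |seq| seq.wrapping_add(1))` mirrors the line in the diff above: an empty log starts at sequence 0, otherwise the next sequence is one past the highest seen.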

rust/otap-dataflow/crates/quiver/src/engine.rs

rust/otap-dataflow/crates/quiver/src/wal/checkpoint_sidecar.rs

rust/otap-dataflow/crates/quiver/src/wal/mod.rs

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

rust/otap-dataflow/crates/quiver/src/wal/writer.rs

albertlockett

LGTM @AaronRM !

github-project-automation bot added this to OTel-Arrow Dec 5, 2025

github-actions bot added the rust Pull requests that update Rust code label Dec 5, 2025

AaronRM added 28 commits December 5, 2025 13:52

Remove stray README.md

3404624

Add initial WAL framing and writer

047f7a2

Refactor WAL contants to mod level; initial WAL reader

6f1f630

Initial engine + WAL integration

4018dbd

Add additional test coverage, cross-cutting WAL tests

86c9ea0

Address clippy errors

5157167

Add copyright headers

d20b541

Add missing copyright header line

b9a90a5

Add error injection plumbing to reader tests

ff109fa

Add header tests for additional line coverage

48c90da

Add tests for cases around header/config mismatch.

febe725

Add truncate_to to WalWriter; tests for recovery and error condition.

da7fc58

Add flush on drop and when bytes have exceeded max_unflushed_bytes.

2ebdd88

Add cross segment iteration tests

25825d5

Add tests for 64 bitmap slots, large RecordBundles

38ec8f2

Update ARCHITECTURE.md to clarify the CRC algorithm.

7347361

Add test + implementation for flush syncing data for durability

aeecbbb

Add truncate.offset implementation

948d465

Add tests and implementation for Prefix Reclamation (Hole Punching)

2ca8053

Add rotation_target_bytes; additional rotation & cap tests

fe9a7b0

Fix doctest to write to tempdir

7fbb0c0

Add support for safe offset boundaries

80c8c8b

Add additional comments to key methods in the WAL reader/writer implementations

91a28e6

Add tests for crash/resume scenarios

699dddd

Fix warning about unused variable

9206451

Formatting

5e00da6

Address clippy errors, formatting

67c01ed

Reload rotated chunks on WalWriter restart

31f8065

AaronRM added 2 commits December 5, 2025 13:57

Remove non-ASCII chars

d693425

Add BSD-2-Clause to allowed licenses

d6096d2

AaronRM commented Dec 5, 2025

rust/otap-dataflow/deny.toml (resolved)

AaronRM marked this pull request as ready for review December 5, 2025 22:57

AaronRM requested a review from a team as a code owner December 5, 2025 22:57

lalitb reviewed Dec 6, 2025

rust/otap-dataflow/crates/quiver/src/wal/writer.rs (outdated, resolved)

lalitb reviewed Dec 6, 2025

rust/otap-dataflow/crates/quiver/src/engine.rs (resolved)

lalitb reviewed Dec 6, 2025

rust/otap-dataflow/crates/quiver/src/wal/writer.rs (outdated, resolved)

lalitb reviewed Dec 6, 2025

albertlockett reviewed Dec 8, 2025

rust/otap-dataflow/crates/quiver/src/engine.rs (outdated, resolved)

albertlockett reviewed Dec 8, 2025

rust/otap-dataflow/crates/quiver/src/wal/checkpoint_sidecar.rs (resolved)

Switch segment_cfg_hash to use a placeholder value

2c5ccf1

albertlockett reviewed Dec 8, 2025

rust/otap-dataflow/crates/quiver/src/wal/mod.rs (resolved)

AaronRM added 3 commits December 8, 2025 10:30

Truncate partial/corrupted entries on WalWriter::open; Remove trim_active_file()/trim_partial_entries()

6715300

Added doc comments for 'magic values'

4f36b50

Guard against corrupted/malicious length values by validating entry sizes against remaining file bytes before allocation

de6e453

albertlockett reviewed Dec 8, 2025

rust/otap-dataflow/crates/quiver/src/wal/writer.rs (outdated, resolved)

Set upper bound on rotation target to 256MB, add validation for malicious/corrupted length values when reading

4024ae8

jmacd approved these changes Dec 8, 2025

albertlockett reviewed Dec 8, 2025

rust/otap-dataflow/crates/quiver/src/wal/writer.rs (resolved)

albertlockett approved these changes Dec 8, 2025

AaronRM added 2 commits December 8, 2025 11:48

Eliminate temporary Vec<EncodedSlot> during RecordBundle writing to WAL

30a744f

Formatting

9d38507

jmacd added this pull request to the merge queue Dec 8, 2025

Merged via the queue into open-telemetry:main with commit 8d48e9f Dec 8, 2025
33 checks passed

github-project-automation bot moved this to Done in OTel-Arrow Dec 8, 2025

AaronRM deleted the quiver-wal branch December 8, 2025 21:23
