Releases: jepsen-io/jepsen
0.3.9
A medium-sized release, this version makes several quality-of-life improvements. We (finally!) download log files only once, rather than twice at the end of a run; this should make multi-GB log downloads less painful. A new namespace, db.watchdog, supports restarting flaky DBs automatically. Generators are a little more sophisticated, and a little easier to compose. Several error messages are nicer. Enjoy!
Bugfixes
- Jepsen now downloads log files only once, rather than twice at the end of a run
- nemesis.file: converts rational `:probability` values to floats, rather than letting corrupt-file silently mis-parse the rational.
API Changes
New Features
- db.watchdog: automatically restarts DB nodes when they crash
- cli/without-defaults-for: strips defaults from option specifications. This significantly streamlines using the same option spec for both `test` and `test-all`.
- control.util/tarball!: creates a tarball of a directory. Particularly helpful for downloading data dirs as a part of `db/log-files`.
- generator/map: functions can now take `(f op test context)` for more control.
- generator/shortest-any: like `any`, but ends as soon as any single generator does. Helpful when you have a fixed-length generator, and you want to mix in some other operations, but don't know for how long.
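The stopping rule behind generator/shortest-any can be illustrated with plain iterators. This is a language-neutral sketch in Python, not Jepsen's Clojure implementation: the helper name `shortest_any`, the round-robin choice between sources, and the sample operations are all assumptions made for illustration, and the real generator schedules operations differently.

```python
from itertools import count, cycle

def shortest_any(*sources):
    """Interleave items from several iterables, stopping as soon as any
    one of them runs out (hypothetical helper, round-robin scheduling)."""
    iterators = [iter(s) for s in sources]
    for it in cycle(iterators):
        try:
            yield next(it)
        except StopIteration:
            return  # one source is exhausted, so the combined stream ends

# A fixed-length stream of writes mixed with an unbounded stream of reads.
fixed = [{"f": "write", "value": v} for v in range(3)]
background = ({"f": "read"} for _ in count())
ops = list(shortest_any(fixed, background))
```

The combined stream ends once the three writes are used up, even though the read stream could continue forever, which is the situation the release note describes.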
Minor Changes
- `lein test` now runs integration tests by default. I'm not entirely sure about this one; will try it out and see.
- When downloading log files, Jepsen tries to kill all DB nodes first. This reduces the chances of files shifting as we read them.
- generator: many constructors now pass through `nil` generators intelligently, returning `nil` immediately. This makes it easier to determine whether a generator will do anything before the test starts.
- nemesis/node-start-stopper: re-use existing SSH connections rather than opening new ones
 - reconnect/with-conn: throws if no connection is available. This is a significantly clearer error message.
 - Improved error messages for ex-infos thrown in generators and from a test as a whole
 - nemesis.file: returns [:io-error ...] instead of throwing on IO errors. This cuts down on noise when trying to corrupt files the DB is (e.g.) deleting out from under us.
 - nemesis.combined: shorter printed representations for some generators
 - Jepsen's interrupt handler logging has more informative log lines
 - reconnect: fixed a docstring typo
 
Dependencies
- elle 0.2.4
 - jepsen.history 0.1.5
 - knossos 0.3.12
 - ring 1.14.1
 - test.chuck 0.2.15
 
0.3.8
The centerpiece of 0.3.8 is a new nemesis for corrupting files: jepsen.nemesis.file. This nemesis can be scoped to specific regions of a file, is aware of chunk structure (e.g. database pages) and can be used to snapshot, restore, shuffle, and introduce bitflips in chunks stochastically. Its faults can also be scoped to specific nodes, or striped helically around a cluster. A big thank you to Ellen Marie Dash (@duckinator), who reviewed and contributed C expertise on this project.
You'll also find a handful of new utility functions, including a fast Zipfian PRNG, and a variety of bugfixes and improvements for clock skew nemeses.
Bugfixes
- SystemD took over clock management from NTP on newer Debian installs, and has its own way to disable clock sync. We now disable both NTP and systemd's time daemon.
- `nemesis/compose` offers better error messages when given an unknown `:f`
API Changes
- `nemesis.time`'s bump operations used to generate lots of very small (e.g. 10 ms) clock adjustments, and only a few large ones. This made it hard to find errors with large (e.g. 10+ second) clock skew. We now have a higher chance of picking a large offset.
New Features
- `nemesis.file`: a new tool for injecting file corruption.
- `generator/concurrency-limit`, which limits the number of threads executing ops from a given generator to at most `n`.
- `jepsen.web` will try to shell out to the system's `zip` command when building zip files. This can be noticeably faster than the built-in Java zip.
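The constraint that generator/concurrency-limit enforces can be modelled as a small piece of bookkeeping over in-flight threads. The sketch below is Python written for illustration only; the class `ConcurrencyLimit` and its methods `next_op` and `complete` are hypothetical names, and nothing here reflects how Jepsen's generator is actually implemented.

```python
class ConcurrencyLimit:
    """Hand out ops from `ops`, allowing at most `n` threads to be
    executing one of them at any moment (hypothetical model)."""

    def __init__(self, n, ops):
        self.n = n
        self.ops = iter(ops)
        self.in_flight = set()  # threads currently executing one of our ops

    def next_op(self, thread):
        """Return an op for `thread`, or None if it must wait."""
        if thread in self.in_flight or len(self.in_flight) >= self.n:
            return None
        op = next(self.ops, None)
        if op is not None:
            self.in_flight.add(thread)
        return op

    def complete(self, thread):
        """Record that `thread` finished its op, freeing a slot."""
        self.in_flight.discard(thread)

limit = ConcurrencyLimit(2, ["a", "b", "c"])
granted = [limit.next_op(t) for t in (0, 1, 2)]  # thread 2 has to wait
limit.complete(0)
retry = limit.next_op(2)                         # a slot is free again
```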
Minor Changes
- `nemesis.time`'s C-compiling powers now live in `nemesis`, where they're used by `nemesis.file`.
- `util/minority`: like `majority`, computes just shy of half an integer
- `util/zipf`: generates Zipfian-distributed integers
- `util/rand-distribution` can generate Zipfians as well
- `os.debian` now includes `build-essential`, rather than installing GCC on-demand
- `checker.perf` can now take configurable filenames for latency graphs, in case you want to spit out custom latency plots
- Rate graphs now skip over empty values of `[:f :type]`
- Knossos 0.3.11
 - Ring 1.13.0
 - Fipp 0.6.27
 - Small documentation fixups
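For readers unfamiliar with Zipfian distributions, the idea behind util/zipf can be sketched as sampling ranks whose weights fall off as a power of the rank. The following is a simple inverse-CDF sampler in Python, given purely as an illustration: the names `make_zipf` and `skew` are assumptions, and Jepsen's own generator is described as a fast PRNG, so it almost certainly uses a different algorithm.

```python
import bisect
import random
from itertools import accumulate

def make_zipf(n, skew=1.0, rng=None):
    """Return a function sampling integers in [0, n), where rank k has
    weight 1 / (k + 1) ** skew (hypothetical helper, inverse-CDF method)."""
    rng = rng or random.Random()
    weights = [1.0 / (k + 1) ** skew for k in range(n)]
    cdf = list(accumulate(weights))  # running totals of the weights
    total = cdf[-1]

    def sample():
        # Binary-search the cumulative weights for a uniform draw.
        return min(bisect.bisect_left(cdf, rng.random() * total), n - 1)

    return sample

sample = make_zipf(10, skew=1.0, rng=random.Random(42))
draws = [sample() for _ in range(10_000)]
```

Low ranks dominate the draws, which is what makes a Zipfian source useful for concentrating load on a few hot keys.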
 
v0.3.7
A very small bugfix release. Jepsen.history incorrectly threw IndexOutOfBoundsException when asked for an out-of-bounds index with a default value. Clojure's semantics are to return the default value, which we now match. This fixes specific kinds of destructuring bind on empty histories.
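The semantics in question are those of an indexed lookup with a fallback: Clojure's `(nth coll i not-found)` returns `not-found` rather than throwing, and sequential destructuring relies on that behaviour. A minimal model of the rule, written in Python purely for illustration (the helper names `nth` and `destructure_pair` are hypothetical and not part of jepsen.history):

```python
_MISSING = object()  # sentinel meaning "no default supplied"

def nth(coll, index, default=_MISSING):
    """Indexed lookup that returns `default` when `index` is out of bounds."""
    if 0 <= index < len(coll):
        return coll[index]
    if default is _MISSING:
        raise IndexError(f"index {index} out of bounds")
    return default

def destructure_pair(history):
    """Model of binding [a b] against a sequence: missing slots become None."""
    return nth(history, 0, None), nth(history, 1, None)

first, second = destructure_pair([])  # both are None, no exception
```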
v0.3.6
This is a sizeable release. It includes a significant correctness bugfix for a rare condition that could make operations in the history print with the wrong data. It also adds a new namespace for composing databases, nemeses, and generators when working with systems where each node has a different role. Kafka-style tests gain new powers and are significantly faster. And we have the usual slew of small bugfixes, dependency bumps, and quality-of-life improvements. Happy testing!
Bugfixes
- `generator/fill-in-map` no longer generates Ops with duplicate fields in their record extmaps. This fixes a rare bug where operations which used extra fields could wind up with two different values for (e.g.) `(:value op)` vs `(pprint op)`. It should also improve speed and size on disk.
- `checker.perf/with-range`: fix a bug causing plots with zero data points to convert the plot to a string. This was expensive if the plot is large, and caused very confusing error messages. We now provide a short string message instead.
- `net/iptables` now handles the new error message from `tc qdisc` when calling `net/fast!`
API Changes
- `control/exec` will now throw a `:nonzero-exit` error when an exit status code is `nil`. Yes, this is apparently a thing that's possible.
- `generator.test/with-fixed-rand-nth` has been replaced by `with-fixed-rands`, which controls rand, rand-int, and rand-nth.
- `tests/kafka`: failed and info operations are now assumed to roll back consumer positions, rather than advancing them.
- `tests/kafka`: emit subscribe/assign ops only 1:64 ops, rather than 1:8. Now tunable via `(:sub-p test)`.
New Features
- A new namespace, `jepsen.role`, supports systems where different nodes run different software.
- `db/map-test` wraps a DB in another which alters the test map. Helpful for composing DBs together which expect different things from their test maps.
- `generator/each-process`: like `each-thread`, this facets an underlying generator into a distinct one for each process.
- `tests/kafka` checks for transactions which read their own writes prior to commit.
Minor Changes
- `os/centos` now uses dpkg 1.19.8
- `control.net/ip*` now prefers v4 addresses
- `control/on-nodes` no longer spawns a future when given a single node--slightly more efficient.
- SSHJ now falls back to other auth methods after an AgentProxyException
- `generator/map` and `f-map` now return `nil` when given a `nil` generator, which simplifies some before-run checks.
- `generator.test/default-test` now includes a pair of `:nodes`, for generators that use nodes
- `checker/check-safe` now writes exceptions as data to the `:error` field of the results, rather than an unreadable string stacktrace
- `tests.kafka` now detects all duplicates even when given inconsistent offsets. It's nice to have both, it turns out.
- `tests/kafka` includes an `:unseen` key in poll operations to help operators track how far behind we are
- `tests/kafka`: new tests for the checker & generator
- `tests/kafka`: duplicate errors now include specific offsets
- `tests/kafka`: inconsistent-offsets errors now emit sorted sets, for readability
- `tests/kafka` is roughly 8x faster now, thanks to a slew of performance improvements
- `tests/kafka` also ignores the new cycle-exists variants of G0, G1c, etc.
- Jepsen's internal tests log less noise now
 - Clojure 1.12.0
 - tools.logging 1.3.0
 - tools.cli 1.1.230
 - unilog 0.7.32
 - elle 0.2.2
 - http-kit 2.8.0
 - ring 1.12.2
 - sshj 0.39.0
 - data.codec 0.2.0
 - data.fressian 1.1.0
 
Full Changelog: v0.3.5...v0.3.6
0.3.5
This is a relatively small release. It incorporates a new version of Elle which brings dramatic performance improvements, and has a few quality-of-life improvements.
Bugfixes
- os.debian/install and remove now acquire locks, which means you can do multiple debian-package-affecting operations concurrently against a single node.
- Thread/sleep callsites now have explicit integer coercion, fixing crashes in newer JVMs/Clojure.
 
API Changes
- control.util/grepkill! now matches the full pattern of the process, rather than just the first 15 characters. This is particularly helpful for killing, say, one out of several `java` processes.
- net/Net has been moved to net.proto/Net. All its functions are still available in jepsen.net too.
 
New Features
- checker.perf: nemesis specifications can now include `:hidden? true`, which prevents them from appearing on graphs.
Minor Changes
- fs-cache/deploy-remote! returns the remote path it uploaded, making it easier to thread into other expressions.
 - control.util/install-archive! now has docs that explain the use of file:// URLs.
 - net/drop! now uses existing SSH connections, making it faster
 - net/drop! has a clearer docstring
 - Elle 0.2.1
 - SSHJ 0.38.0
 - Ring 1.11.0
 
0.3.4
This is a small bugfix & performance release. Just a little faster, a little more correct, a little easier to use. :-)
Bugfixes
- control.util/await-tcp-port no longer logs a truncated error message
 - tests.kafka no longer crashes when checking histories without any received messages
 - jepsen.independent's generators properly unlift the :value fields of the operations they pass through to underlying generators
 
Removals
- os.debian/install-jdk8! is gone now. The repos it relies on haven't worked in years.
 
Minor changes
- independent/checker uses a concurrent fold for breaking apart histories in fewer passes. This roughly doubles throughput in tests with lots of independent keys.
 - independent/checker now returns results in a sorted map, which is easier to read
 - generator.interpreter-test now tests matching open/close! invocations
 - lazyfs version 0.2.0
 - store.format logs more informative errors when serialization fails. You'll get a path to the specific element that couldn't be serialized, as well as its class. This makes serializing tests with new datatypes much less frustrating.
 
v0.3.3
This release updates Jepsen to run with Debian Bookworm. It also includes performance improvements aimed at testing large histories. Jepsen can run and check histories of up to a billion operations now.
Significant API Changes
- During test setup, `(:generator test)` is now wrapped in a new `Forgettable` reference type. You can deref this if you want access to the generator for some reason, but be aware that retaining the head of the generator often causes linear memory consumption during the test.
- After the test starts generating operations, attempting to deref `(:generator test)` will throw.
- core/run-case! and generator.interpreter/run! now return tests, rather than just histories.
 
Performance Improvements
- Jepsen no longer retains the head of the generator. This dramatically improves memory consumption on long-running tests: running tests of a billion operations in a 512 MB heap is entirely reasonable.
- Elle 0.1.7 comes with significant speed and memory improvements to the G1a, G1b, and internal checkers for list-append and rw-register.
 - Jepsen.history is much faster to execute folds on large (e.g. 100+ million op) histories. We converted a quadratic-time loop to linear.
 
Bugfixes
- control.net/ip filters out loopback interfaces. Bookworm started returning 127.x.x.x interfaces from `getent ahosts` on some platforms.
Minor Changes
- os.debian now refers to the new `netcat` package name.
- nemesis.time allows ntpdate to fail during setup/teardown (which now happens on Bookworm).
- nemesis.time no longer tries to use `ntpdate -p`, which is deprecated in Bookworm
- tools.cli 1.0.219
 - elle 0.1.7
 - jepsen.history 0.1.1
 - http-kit 2.7.0
 
Full Changelog: v0.3.2...v0.3.3
0.3.2
This is a relatively small release with a few minor bugfixes and tweaks.
Bugfixes
- `control.util/wget!` correctly throws exceptions when encountering unrecoverable failures.
- `generator.context` Contexts now correctly handle `assoc`.
New Features
- `client/timeout` wraps an existing client in a new one that times out all operations after some time.
Minor Changes
- `net/net-dev` uses `ip`, rather than `/sys/class/net`, for identifying network interfaces
- Ring 1.10.0
 - SSHJ 0.35.0
 
0.3.1
0.3.0
This release replaces many of Jepsen's internals with faster or more scalable data structures. It introduces significant new datatypes and adds new support libraries. Core generators are much faster, thanks to new Context and Op types. Running and analyzing tests can be 1-2 orders of magnitude faster: Jepsen can now run list-append tests at ~45,000 ops/sec and check them at ~30,000 ops/sec. Histories are streamed and loaded incrementally, which improves crash recovery, allows for histories larger than RAM, and speeds up REPL work. Histories in the hundreds of millions or even billions of operations are now tractable. Most checkers are parallelized and take advantage of sophisticated multi-query optimization for reductions over histories. A new dependency-aware executor allows checkers to run in parallel without starvation. New nemesis.combined packages support file truncation and bitflips, as well as network latency and packet loss.
As usual, most things should be API compatible, and we try to issue Obvious Warnings when they're not--but this is a big enough change that we're bumping the minor version from 0.2.7 to 0.3.0. Users integrating tightly with histories and generators should test their code carefully.
New Features
- A new library, jepsen.history, provides support for writing efficient checkers. It includes a transactional dependency-aware concurrent executor, concurrent and linear folds with multi-query optimization, and lazy datatypes for working with large histories.
 - Operations are now represented by an Op defrecord (jepsen.history.Op) instead of maps. This yields significant performance and speed improvements. Ops have mandatory :index and :time fields, both longs. See jepsen.history for more details.
- Histories are incrementally streamed to the `test.jepsen` file, and sealed in 16384-operation chunks. If a test crashes during the run or analysis phase, you can likely recover some of its history and re-analyze it.
- Histories are now represented by subtypes of jepsen.history.History. These should be compatible with vectors, but stream their contents lazily from disk. Mapping between invocations and completions is now built in to histories, rather than being an external pair-index structure. Histories support efficient linear and concurrent folds with stream fusion and multi-query optimization, and directly support Tesser folds. Analyses may be 1-2 orders of magnitude faster, depending on hardware. See jepsen.history for details.
- dom-top.core has a new `reducer` macro which roughly doubles performance for reductions with multiple accumulator variables.
- Elle can catch new classes of anomalies, especially involving realtime and process-including anti-dependency cycles.
- `lein run analyze` now pulls the test arguments out of the test; you don't have to pass them every time.
- A new `nemesis.combined/file-corruption-package` provides support for bitflips and truncation of files.
- A new `nemesis.combined/packet-package` induces network latency and packet loss.
- A new `tests.kafka` namespace supports tests for Kafka-style append-only ordered logs.
- `util/rand-distribution` supports picking random numbers
Significant API Changes
- Operations are now jepsen.history.Ops, not maps. `:index` and `:time` fields are now mandatory.
- Histories are now subtypes of jepsen.history.History, not vectors. They should be mostly API compatible, and will transparently promote themselves to vectors on certain operations (for instance, conj).
- Generator contexts are now jepsen.generator.context.Contexts, rather than maps. Accessing their old fields will throw and warn you to use new polymorphic functions in jepsen.generator.context.
- `lein run analyze` now takes `-t path-to-test` or `-t test-index`, rather than the full arguments to recreate the test map.
- `test.fressian` files, deprecated in 0.2.x, are no longer generated. Use `test.jepsen` instead.
Performance Improvements
- Accessing operations is much faster thanks to jepsen.history.Op
 - jepsen.generator is roughly an order of magnitude faster, especially for high (~thousands of threads) concurrency tests, thanks to the new generator.context.Context type.
- Generators can now dynamically compile context-filtering operations to BitSet intersections, which speeds up `reserve`, `on-threads`, `clients`, `nemesis`, and other generators.
- Reductions over histories (e.g. basically every checker) are 1-2 orders of magnitude faster, thanks to jepsen.history.
 - Elle is roughly an order of magnitude faster, thanks to jepsen.history and careful parallelization.
 - Assorted optimizations to generator/fill-in-op, soonest-op-mop, and reserve make them significantly faster.
 - Tests no longer need to wait for history writing at the end of the test, since it's streamed to disk.
 - Using functions as generators is now faster; we perform arity reflection only once rather than on every op.
 - store.fressian decodes lists as vectors directly, rather than post-processing them. This makes Fressian decoding significantly faster.
 
Minor Improvements
- Jepsen and Elle used knossos.history and knossos.op extensively. These have been almost entirely replaced with jepsen.history.
 - Most checkers have been rewritten to use jepsen.history; many reductions are now concurrent folds.
 - Knossos 0.3.9
 - Tools.cli 1.0.214
 - Unilog 0.7.31
 - Ring 1.9.6
 - SSHJ 0.34.0
 - Elle 0.1.6
 - Lazyfs c16518f6
 - Assorted type hints and compiler warnings resolved
 - Contexts are deterministic again, rather than stochastic. This may break tests that depended on specific nondeterministic orders.
 
Full Changelog: v0.2.7...v0.3.0