Commit d0031d8

add info about flags to readme
1 parent 4dda12a commit d0031d8

2 files changed: 67 additions, 5 deletions
README.md

Lines changed: 36 additions & 5 deletions
@@ -5,7 +5,7 @@

 # zlib-rs: a safer zlib

-This repository contains a Rust implementation of the zlib file format that is compatible with the zlib API.
+This repository contains a Rust implementation of the zlib file format that is compatible with the zlib API.

 This repository contains two public crates:

@@ -23,14 +23,43 @@ zlib-rs can be used in both Rust and C projects.
 By far the easiest way to use zlib-rs is through the [flate2](https://crates.io/crates/flate2) crate, by simply enabling the `zlib-rs` feature gate. This will enable the `zlib-rs`
 backend.

-## C projects
+### C projects

 zlib-rs can be built as a shared object file for usage by C programs that dynamically link to zlib. Please see the example in [libz-rs-sys-cdylib](https://github.com/trifectatechfoundation/zlib-rs/tree/main/libz-rs-sys-cdylib).

-## Acknowledgment
+## Performance

-This project is heavily based on the [zlib](https://github.com/madler/zlib) and
-[zlib-ng](https://github.com/zlib-ng/zlib-ng) projects.
+Performance is generally on-par with [zlib-ng].
+
+### Compiler Flags
+
+Compiler flags that can be used to improve performance.
+
+#### `-Ctarget-cpu=...`
+
+Providing more information about the SIMD capabilities of the target machine can improve performance. E.g.
+
+```
+RUSTFLAGS="-Ctarget-cpu=native" cargo build --release ...
+```
+
+The resulting binary statically assumes the SIMD capabilities of the current machine.
+
+Note: binaries built with `-Ctarget-cpu` almost certainly crash on systems that don't have the specified CPU! Only use this flag if you control how the binary is deployed, and can guarantee that the CPU assumptions are never violated.
+
+#### `-Cllvm-args=-enable-dfa-jump-thread`
+
+For best performance with very small input sizes, compile with:
+
+```
+RUSTFLAGS="-Cllvm-args=-enable-dfa-jump-thread" cargo build --release ...
+```
+
+This flag gives around a 10% boost when the input arrives in chunks of 16 bytes, and a couple percent when input arrives in chunks of under 1024 bytes. Beyond that, the effect is not significant. Using this flag can lead to longer compile times, but otherwise has no adverse effects.
+
+## Acknowledgments
+
+This project is heavily based on the [zlib](https://github.com/madler/zlib) and [zlib-ng] projects.

 ## About

@@ -39,3 +68,5 @@ zlib-rs is part of Trifecta Tech Foundation's [Data compression initiative](http
 ## History

 The initial development of zlib-rs was started and funded by the [Internet Security Research Group](https://www.abetterinternet.org/) as part of the [Prossimo project](https://www.memorysafety.org/).
+
+[zlib-ng]: https://github.com/zlib-ng/zlib-ng
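
Reviewer note on the `-Ctarget-cpu` warning added to README.md: the "statically assumes" behaviour can be inspected on a given build. The following is a minimal Rust sketch and is not part of this commit or of zlib-rs. The function names are hypothetical, and AVX2 is picked only as an example feature. It prints what the compiler was told to assume next to what the running CPU reports.

```rust
/// True if the compiler was allowed to assume AVX2 for this build,
/// for example through `-Ctarget-cpu=native` on an AVX2-capable machine.
/// (Hypothetical helper name, for illustration only.)
fn avx2_assumed_at_compile_time() -> bool {
    cfg!(target_feature = "avx2")
}

/// True if the CPU running this binary reports AVX2 support.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
fn avx2_available_at_runtime() -> bool {
    is_x86_feature_detected!("avx2")
}

/// On non-x86 targets AVX2 does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
fn avx2_available_at_runtime() -> bool {
    false
}

fn main() {
    let assumed = avx2_assumed_at_compile_time();
    let available = avx2_available_at_runtime();
    println!("AVX2 assumed at compile time: {assumed}");
    println!("AVX2 available at runtime:    {available}");
}
```

Building this once with plain `cargo build --release` and once with `RUSTFLAGS="-Ctarget-cpu=native"` should show the first line change on an AVX2-capable machine. This is a way to inspect a build, not a guard against a mismatched deployment: a binary that assumes a feature the CPU lacks may fault before any check runs, and the detection macro is understood to report `true` for features already enabled at compile time.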

libz-rs-sys-cdylib/README.md

Lines changed: 31 additions & 0 deletions
@@ -112,3 +112,34 @@ target
 ├── libz_rs.so
 └── libz_rs-uninstalled.pc
 ```
+
+## Performance
+
+Performance is generally on-par with [zlib-ng].
+
+### Compiler Flags
+
+Compiler flags that can be used to improve performance.
+
+#### `-Ctarget-cpu=...`
+
+Providing more information about the SIMD capabilities of the target machine can improve performance. E.g.
+
+```
+RUSTFLAGS="-Ctarget-cpu=native" cargo build --release ...
+```
+
+The resulting binary statically assumes the SIMD capabilities of the current machine.
+
+Note: binaries built with `-Ctarget-cpu` almost certainly crash on systems that don't have the specified CPU! Only use this flag if you control how the binary is deployed, and can guarantee that the CPU assumptions are never violated.
+
+#### `-Cllvm-args=-enable-dfa-jump-thread`
+
+For best performance with very small input sizes, compile with:
+
+```
+RUSTFLAGS="-Cllvm-args=-enable-dfa-jump-thread" cargo build --release ...
+```
+
+This flag gives around a 10% boost when the input arrives in chunks of 16 bytes, and a couple percent when input arrives in chunks of under 1024 bytes. Beyond that, the effect is not significant. Using this flag can lead to longer compile times, but otherwise has no adverse effects.
+
