Allow a `MAX_SIZE` constant trait and generation with derive #784
Description
Motivation:
For allocating buffers in memory-constrained environments and for automated security checks (like in #764), a constant that supplies a reasonable upper bound for a type's encoded size is useful. Also, while decoding untrusted input, bounding the entire decoder does not prevent tampering with the sizes of the internal fields of the encoded data, which may lead to unexpected behavior. In addition, a bound that is validated during `Encode` as well as `Decode` is expected to catch logical bugs earlier, reducing development time. Finally, it is often unclear what the correct global limit for the `with_limit` method is, leading to redundant trial and error.
Intuition:
For most types, the maximum size is defined at compile time (e.g. a `u32` is at most 4 bytes, and an enum's maximum is the maximum over its variants' maxima plus the discriminant size). However, variable-size collections are by definition unbounded, so in order to provide the upper bound mentioned above, the user must supply additional data, such as the maximum length of the collection (not its size in bytes).
API Example for Getting the Maximum Size Without Collections:

```rust
#[derive(bincode::MaxSize)]
pub struct TestMaxSize {
    b: u32,
    c: u32,
}

#[test]
fn test_() {
    assert_eq!(TestMaxSize::MAX_ENCODED_SIZE, 4 + 4);
}
```
Implementation Suggestion:
Like Postcard's `max_size` feature, an implementation would likely include a trait called `MaxSize` with a `const MAX_ENCODED_SIZE: usize` associated constant. Such a trait could be implemented differently for every type. The `derive` feature would allow automatic generation of this trait's implementation, similar to the already existing derive macros.
Possible Solutions For Communicating the Maximum Length of a Collection:
1. Create a newtype generic over its maximum length while re-implementing `Encode`, `Decode`, and `MaxSize` to also assert that the current length is less than or equal to the maximum length. This solution is very simple, but it is awkward because every access to this type must go through `.0`.
2. Add an attribute providing the maximum length where required. This is more complicated because it requires macro-generated length checking during `encode` and `decode`, but it provides a more natural API because it keeps the encoding-related configuration in the macro-attribute world instead of the type system.
In order to avoid editing the internal encoding and decoding logic, `derive_encode` and `derive_decode` could combine options 1 and 2 above as follows: for every field of type `T` carrying a `maxlen` attribute with value `N`, the macro converts it to `bincode::MaxLenValidated<T, N>` and encodes/decodes the latter.
`MaxLenValidated` would implement `encode` for `T: Len + Encode` and `const N: usize` simply by wrapping `T::encode` with a length check (provided by a `Len` trait).
Decoding for `T: Len + Decode` may be more complicated and would require either re-implementing the `Decode` trait for `bincode::MaxLenValidated<T, N>` or unifying the current `Decode` implementations for `Vec`, `BTreeMap`, and `BTreeSet`, since they are essentially the same "iterate `len` times, `decode` each element, and `push`" logic.
API example:

```rust
#[derive(bincode::MaxSize)]
pub struct Example {
    a: u64,
    b: (),
    #[bincode(maxlen = 50)]
    c: Vec<isize>,
}

#[test]
fn test_() {
    assert_eq!(Example::MAX_ENCODED_SIZE, 8 + 0 + (8 + 50 * 8));
}
```