@@ -611,7 +611,99 @@ As part of its testing, the NetCDF build process creates a number of shared libr
If you need a filter from that set, you may be able to set *HDF5\_PLUGIN\_PATH*
to point to that directory or you may be able to copy the shared libraries out of that directory to your own location.
- ## Debugging {#filters_debug}
+ # Lossy One-Way Filters
+
+ As of NetCDF version 4.8.2, the netcdf-c library supports
+ bit-grooming filters.
+ ````
+ Bit-grooming is a lossy compression algorithm that removes the
+ bloat due to false-precision, those bits and bytes beyond the
+ meaningful precision of the data. Bit Grooming is statistically
+ unbiased, applies to all floating point numbers, and is easy to
+ use. Bit-Grooming reduces data storage requirements by
+ 25-80%. Unlike its best-known competitor Linear Packing, Bit
+ Grooming imposes no software overhead on users, and guarantees
+ its precision throughout the whole floating point range
+ [https://doi.org/10.5194/gmd-9-3199-2016].
+ ````
+ The generic term "quantize" is used to refer collectively to the various
+ precision-trimming algorithms. The key thing to note about quantization is that
+ it occurs only when data is written. Since its output is
+ legal data, it does not need to be "de-quantized" when the data is read.
+ Because of this, quantization is not part of the standard filter
+ mechanism and has a separate API.
+
+ The quantization API is currently as follows.
+ ````
+ int nc_def_var_quantize(int ncid, int varid, int quantize_mode, int nsd);
+ int nc_inq_var_quantize(int ncid, int varid, int *quantize_modep, int *nsdp);
+ ````
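+ The *quantize_mode* argument specifies the particular algorithm.
+ Currently, three are supported: NC_QUANTIZE_BITGROOM, NC_QUANTIZE_GRANULARBR,
+ and NC_QUANTIZE_BITROUND. In addition, quantization can be disabled using
+ the value NC_NOQUANTIZE.
+ The *nsd* argument is the number of significant digits to keep: decimal digits
+ for NC_QUANTIZE_BITGROOM and NC_QUANTIZE_GRANULARBR, bits for NC_QUANTIZE_BITROUND.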
+
+ The input to ncgen and the output from ncdump support special attributes
+ that indicate whether quantization was applied to a given variable.
+ These attributes have the following form.
+ ````
+ _QuantizeBitGroomNumberOfSignificantDigits = <NSD>
+ or
+ _QuantizeGranularBitRoundNumberOfSignificantDigits = <NSD>
+ or
+ _QuantizeBitRoundNumberOfSignificantBits = <NSB>
+ ````
+ The value NSD is the number of significant (decimal) digits to keep.
+ The value NSB is the number of bits to keep in the fraction part of an
+ IEEE754 floating-point number. Note that NSB of QuantizeBitRound is the same as
+ "number of explicit mantissa bits" (https://doi.org/10.5194/gmd-9-3199-2016) and the same as
+ the number of "keep-bits" (https://doi.org/10.5194/gmd-14-377-2021), but is
+ one less than the number of significant binary figures:
+ `_QuantizeBitRoundNumberOfSignificantBits = 0` means one significant binary figure,
+ `_QuantizeBitRoundNumberOfSignificantBits = 1` means two significant binary figures etc.
+
+ ## Distortions introduced by lossy filters
+
+ Any lossy filter introduces distortions to data.
+ The lossy filters implemented in netcdf-c introduce a distortion
+ that can be quantified in terms of a _relative_ error. For BitRound, the magnitude of
+ the distortion introduced to every single value V is guaranteed to be within
+ a certain fraction of V, expressed as 0.5 * V * 2**{-NSB}:
+ i.e. it is 0.5V for NSB=0, 0.25V for NSB=1, 0.125V for NSB=2 etc.
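
To make the relative-error bound concrete, here is a minimal sketch of rounding a single-precision value to NSB explicit mantissa bits. It is not the netcdf-c implementation: the names `bit_round_f32` and `bit_round_rel_bound` are hypothetical, and the sketch assumes IEEE 754 binary32 `float` and a rounding rule of "add half of the last kept bit, then truncate". The bound applies to normal (non-subnormal) values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (an assumption, not the library's code): keep `nsb`
 * explicit mantissa bits of an IEEE 754 binary32 value, rounding to nearest. */
float bit_round_f32(float value, int nsb)
{
    assert(nsb >= 0 && nsb <= 23);
    if (nsb == 23)
        return value;                      /* all 23 mantissa bits are kept */

    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);    /* reinterpret the float as raw bits */

    if (((bits >> 23) & 0xFFu) == 0xFFu)
        return value;                      /* leave NaN and infinities untouched */

    int drop = 23 - nsb;                               /* trailing bits to discard */
    uint32_t half = UINT32_C(1) << (drop - 1);         /* half of the last kept bit */
    uint32_t keep = (uint32_t)(~UINT32_C(0) << drop);  /* zeroes the discarded bits */

    /* Add half, then truncate. A carry out of the mantissa bumps the exponent,
     * which is the correct rounded result (values next to FLT_MAX may round
     * up to infinity). Ties round away from zero. */
    bits = (bits + half) & keep;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Relative error bound quoted in the text: 0.5 * 2**(-nsb). */
double bit_round_rel_bound(int nsb)
{
    double bound = 0.5;
    for (int i = 0; i < nsb; i++)
        bound /= 2.0;
    return bound;
}
```

For example, `bit_round_f32(3.14159f, 3)` yields `3.25f`: the relative error is about 0.035, inside the bound `bit_round_rel_bound(3)`, which is 0.0625.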
+
+
+ The two other methods, BitGroom and GranularBitRound, use different definitions of
+ _decimal precision_, though both are guaranteed to reproduce NSD decimals when printed.
+ The margins for the relative error introduced by these methods are summarised in the following table.
+
+ ```
+ NSD               1       2       3       4       5       6       7
+
+ BitGroom
+ Error Margin      3.1e-2  3.9e-3  4.9e-4  3.1e-5  3.8e-6  4.7e-7  -
+
+ GranularBitRound
+ Error Margin      1.4e-1  1.9e-2  2.2e-3  1.4e-4  1.8e-5  2.2e-6  -
+
+ ```
+
+
+ If one defines decimal precision as in BitGroom, i.e. the introduced relative
+ error must not exceed half of the unit at the decimal place NSD in the
+ worst-case scenario, the following values of NSB should be used for BitRound:
+
+ ```
+ NSD   1   2   3   4   5   6   7
+ NSB   3   6   9  13  16  19  23
+ ```
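
The NSD-to-NSB table above can be captured in a small lookup helper. This is only a sketch: the function name `bitround_nsb_for_nsd` is hypothetical and not part of netcdf-c, and the values are copied directly from the table rather than derived.

```c
#include <assert.h>

/* NSB values for NSD = 1..7, copied from the table in the text. */
static const int nsd_to_nsb_table[] = { 3, 6, 9, 13, 16, 19, 23 };

/* Hypothetical helper: return the BitRound NSB matching a BitGroom-style NSD,
 * or -1 if NSD is outside the range covered by the table. */
int bitround_nsb_for_nsd(int nsd)
{
    int count = (int)(sizeof nsd_to_nsb_table / sizeof nsd_to_nsb_table[0]);
    if (nsd < 1 || nsd > count)
        return -1;

    int nsb = nsd_to_nsb_table[nsd - 1];
    assert(nsb >= 0 && nsb <= 23);   /* sanity check: fits a binary32 mantissa */
    return nsb;
}
```

A caller would presumably pass the returned value as the `nsd` argument of `nc_def_var_quantize` together with `NC_QUANTIZE_BITROUND`, after checking for the `-1` case.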
+
+ The resulting application of BitRound is as fast as BitGroom, and is free from
+ artifacts in multipoint statistics introduced by BitGroom
+ (see https://doi.org/10.5194/gmd-14-377-2021).
+
+
+ # Debugging {#filters_debug}
+
Depending on the debugger one uses, debugging plugins can be very difficult.
It may be necessary to use the old printf approach for debugging the filter itself.