Skip to content

Commit 6b430c9

Browse files
authored
Merge pull request #2716 from DennisHeimbigner/filtervlen.dmh
Suppress filters on variables with non-fixed-size types.
2 parents ddc67c7 + fb422e6 commit 6b430c9

File tree

17 files changed

+583
-184
lines changed

17 files changed

+583
-184
lines changed

RELEASE_NOTES.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ This file contains a high-level description of this package's evolution. Release
77

88
## 4.9.3 - TBD
99

10+
* Suppress filters on variables with non-fixed-size types. See [Github #2716](https://github.com/Unidata/netcdf-c/pull/2716).
1011
* Provide a single option to disable all network access and testing. See [Github #2708](https://github.com/Unidata/netcdf-c/pull/2708).
1112
* Fix some problems with earthdata authorization and data access. See [Github #2709](https://github.com/Unidata/netcdf-c/pull/2709).
1213
* Fix a race condition in some ncdump tests. See [Github #2682](https://github.com/Unidata/netcdf-c/pull/2682).

docs/filters.md

Lines changed: 41 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,9 @@ The most common kind of filter is a compression-decompression
3333
filter, and that is the focus of this document.
3434
But non-compression filters – fletcher32, for example – also exist.
3535

36+
This document describes the support for HDF5 filters and also
37+
the newly added support for NCZarr filters.
38+
3639
The netCDF enhanced (aka netCDF-4) library inherits this
3740
capability since it depends on the HDF5 library. The HDF5
3841
library (1.8.11 and later) supports filters, and netCDF is based
@@ -45,30 +48,22 @@ multiple filters are defined on a variable, they are applied in
4548
first-defined order on writing and on the reverse order when
4649
reading.
4750

48-
This document describes the support for HDF5 filters and also
49-
the newly added support for NCZarr filters.
50-
51-
## A Warning on Backward Compatibility {#filters_compatibility}
52-
53-
The API defined in this document should accurately reflect the
54-
current state of filters in the netCDF-c library. Be aware that
55-
there was a short period in which the filter code was undergoing
56-
some revision and extension. Those extensions have largely been
57-
reverted. Unfortunately, some users may experience some
58-
compilation problems for previously working code because of
59-
these reversions. In that case, please revise your code to
60-
adhere to this document. Apologies are extended for any
61-
inconvenience.
62-
63-
A user may encounter an incompatibility if any of the following appears in user code.
64-
65-
* The function *\_nc\_inq\_var\_filter* was returning the error value NC\_ENOFILTER if a variable had no associated filters.
66-
It has been reverted to the previous case where it returns NC\_NOERR and the returned filter id was set to zero if the variable had no filters.
67-
* The function *nc\_inq\_var\_filterids* was renamed to *nc\_inq\_var\_filter\_ids*.
68-
* Some auxilliary functions for parsing textual filter specifications have been moved to the file *netcdf\_aux.h*. See [Appendix A](#filters_appendixa).
69-
* All of the "filterx" functions have been removed. This is unlikely to cause problems because they had limited visibility.
70-
71-
For additional information, see [Appendix B](#filters_appendixb).
51+
There is an important "caveat" with respect to filters
52+
and their application to variables.
53+
If the type of the variable is variable-sized, then attempts
54+
to define a filter on such a variable will not be allowed.
55+
In this case, the call to *nc\_def\_var\_filter* will succeed
56+
but the filter will be suppressed and a warning will be logged.
57+
Similarly, if an existing file is opened, and there is a
58+
variable-sized variable with a filter, then that variable will be
59+
suppressed and will be inaccessible through the netcdf-c API.
60+
61+
The concept of a variable-sized type is defined as follows:
62+
1. The type NC_STRING is variable-sized.
63+
2. Any user defined type of the class NC_VLEN is variable sized.
64+
3. If a compound type has any field that is (transitively) variable-sized,
65+
then that compound type is variable-sized.
66+
4. All other types are fixed-size.
7267

7368
## Enabling A HDF5 Compression Filter {#filters_enable}
7469

@@ -1150,6 +1145,28 @@ The important thing to note is that at run-time, there are several cases to cons
11501145
3. HDF5_PLUGIN_DIR is not defined at either run-time or build-time -- no action needed
11511146
4. HDF5_PLUGIN_DIR is not defined at run-time but was defined at build-time -- this will probably fail
11521147

1148+
## Appendix I. A Warning on Backward Compatibility {#filters_appendixi}
1149+
1150+
The API defined in this document should accurately reflect the
1151+
current state of filters in the netCDF-c library. Be aware that
1152+
there was a short period in which the filter code was undergoing
1153+
some revision and extension. Those extensions have largely been
1154+
reverted. Unfortunately, some users may experience some
1155+
compilation problems for previously working code because of
1156+
these reversions. In that case, please revise your code to
1157+
adhere to this document. Apologies are extended for any
1158+
inconvenience.
1159+
1160+
A user may encounter an incompatibility if any of the following appears in user code.
1161+
1162+
* The function *\_nc\_inq\_var\_filter* was returning the error value NC\_ENOFILTER if a variable had no associated filters.
1163+
It has been reverted to the previous case where it returns NC\_NOERR and the returned filter id was set to zero if the variable had no filters.
1164+
* The function *nc\_inq\_var\_filterids* was renamed to *nc\_inq\_var\_filter\_ids*.
1165+
* Some auxilliary functions for parsing textual filter specifications have been moved to the file *netcdf\_aux.h*. See [Appendix A](#filters_appendixa).
1166+
* All of the "filterx" functions have been removed. This is unlikely to cause problems because they had limited visibility.
1167+
1168+
For additional information, see [Appendix B](#filters_appendixb).
1169+
11531170
## Point of Contact {#filters_poc}
11541171

11551172
*Author*: Dennis Heimbigner<br>

include/nc4internal.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -451,6 +451,7 @@ extern int NC4_inq_type_fixed_size(int ncid, nc_type xtype, int* isfixedsizep);
451451
/* Manage the fixed/var sized'ness of a type */
452452
extern int NC4_recheck_varsize(NC_TYPE_INFO_T* parenttype, nc_type addedtype);
453453
extern int NC4_set_varsize(NC_TYPE_INFO_T* parenttype);
454+
extern int NC4_var_varsized(NC_VAR_INFO_T* var);
454455

455456
/* Close the file. */
456457
extern int nc4_close_netcdf4_file(NC_FILE_INFO_T *h5, int abort, NC_memio *memio);

libdispatch/dfilter.c

Lines changed: 5 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@
1919
#include "netcdf_filter.h"
2020
#include "ncdispatch.h"
2121
#include "nc4internal.h"
22+
#include "nclog.h"
2223

2324
#ifdef USE_HDF5
2425
#include "hdf5internal.h"
@@ -134,7 +135,10 @@ nc_def_var_filter(int ncid, int varid, unsigned int id, size_t nparams, const un
134135
if((stat = nc_inq_vartype(ncid,varid,&xtype))) return stat;
135136
/* If the variable's type is not fixed-size, then signal error */
136137
if((stat = NC4_inq_type_fixed_size(ncid, xtype, &fixedsize))) return stat;
137-
if(!fixedsize) return NC_EFILTER;
138+
if(!fixedsize) {
139+
nclog(NCLOGWARN,"Filters cannot be applied to variable length data types.");
140+
return NC_NOERR; /* Deliberately suppress */
141+
}
138142
if((stat = ncp->dispatch->def_var_filter(ncid,varid,id,nparams,params))) goto done;
139143
done:
140144
return stat;

libhdf5/hdf5open.c

Lines changed: 12 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1061,6 +1061,7 @@ static int get_filter_info(hid_t propid, NC_VAR_INFO_T *var)
10611061
int f;
10621062
int stat = NC_NOERR;
10631063
NC_HDF5_VAR_INFO_T *hdf5_var;
1064+
int varsized = 0;
10641065

10651066
assert(var);
10661067

@@ -1070,12 +1071,16 @@ static int get_filter_info(hid_t propid, NC_VAR_INFO_T *var)
10701071
if ((num_filters = H5Pget_nfilters(propid)) < 0)
10711072
{stat = NC_EHDFERR; goto done;}
10721073

1074+
/* If the type of the variable is variable length, and
1075+
it has filters defined, suppress the variable. */
1076+
varsized = NC4_var_varsized(var);
1077+
10731078
for (f = 0; f < num_filters; f++)
10741079
{
1075-
int flags = 0;
10761080
htri_t avail = -1;
1081+
unsigned flags = 0;
10771082
cd_nelems = 0;
1078-
if ((filter = H5Pget_filter2(propid, f, NULL, &cd_nelems, NULL, 0, NULL, NULL)) < 0)
1083+
if ((filter = H5Pget_filter2(propid, f, &flags, &cd_nelems, NULL, 0, NULL, NULL)) < 0)
10791084
{stat = NC_ENOFILTER; goto done;} /* Assume this means an unknown filter */
10801085
if((avail = H5Zfilter_avail(filter)) < 0)
10811086
{stat = NC_EHDFERR; goto done;} /* Something in HDF5 went wrong */
@@ -1084,6 +1089,11 @@ static int get_filter_info(hid_t propid, NC_VAR_INFO_T *var)
10841089
/* mark variable as unreadable */
10851090
hdf5_var->flags |= NC_HDF5_VAR_FILTER_MISSING;
10861091
}
1092+
/* If variable type is varsized and filter is mandatory then this variable is unreadable */
1093+
if(varsized && (flags & H5Z_FLAG_MANDATORY) != 0) {
1094+
/* mark variable as unreadable */
1095+
hdf5_var->flags |= NC_HDF5_VAR_FILTER_MISSING;
1096+
}
10871097
if((cd_values = calloc(sizeof(unsigned int),cd_nelems))==NULL)
10881098
{stat = NC_ENOMEM; goto done;}
10891099
if ((filter = H5Pget_filter2(propid, f, NULL, &cd_nelems, cd_values, 0, NULL, NULL)) < 0)

libnczarr/zclose.c

Lines changed: 46 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,49 @@ zclose_gatts(NC_GRP_INFO_T* grp)
135135
return stat;
136136
}
137137

138+
/**
139+
* @internal Close resources for single var
140+
*
141+
* @param var var to reclaim
142+
*
143+
* @return ::NC_NOERR No error.
144+
* @author Dennis Heimbigner
145+
*/
146+
int
147+
NCZ_zclose_var1(NC_VAR_INFO_T* var)
148+
{
149+
int stat = NC_NOERR;
150+
NCZ_VAR_INFO_T* zvar;
151+
NC_ATT_INFO_T* att;
152+
int a;
153+
154+
assert(var && var->format_var_info);
155+
zvar = var->format_var_info;;
156+
for(a = 0; a < ncindexsize(var->att); a++) {
157+
NCZ_ATT_INFO_T* zatt;
158+
att = (NC_ATT_INFO_T*)ncindexith(var->att, a);
159+
assert(att && att->format_att_info);
160+
zatt = att->format_att_info;
161+
nullfree(zatt);
162+
att->format_att_info = NULL; /* avoid memory errors */
163+
}
164+
#ifdef ENABLE_NCZARR_FILTERS
165+
/* Reclaim filters */
166+
if(var->filters != NULL) {
167+
(void)NCZ_filter_freelists(var);
168+
}
169+
var->filters = NULL;
170+
#endif
171+
/* Reclaim the type */
172+
if(var->type_info) (void)zclose_type(var->type_info);
173+
if(zvar->cache) NCZ_free_chunk_cache(zvar->cache);
174+
/* reclaim xarray */
175+
if(zvar->xarray) nclistfreeall(zvar->xarray);
176+
nullfree(zvar);
177+
var->format_var_info = NULL; /* avoid memory errors */
178+
return stat;
179+
}
180+
138181
/**
139182
* @internal Close resources for vars in a group.
140183
*
@@ -148,37 +191,14 @@ zclose_vars(NC_GRP_INFO_T* grp)
148191
{
149192
int stat = NC_NOERR;
150193
NC_VAR_INFO_T* var;
151-
NCZ_VAR_INFO_T* zvar;
152-
NC_ATT_INFO_T* att;
153-
int a, i;
194+
int i;
154195

155196
for(i = 0; i < ncindexsize(grp->vars); i++) {
156197
var = (NC_VAR_INFO_T*)ncindexith(grp->vars, i);
157198
assert(var && var->format_var_info);
158-
zvar = var->format_var_info;;
159-
for(a = 0; a < ncindexsize(var->att); a++) {
160-
NCZ_ATT_INFO_T* zatt;
161-
att = (NC_ATT_INFO_T*)ncindexith(var->att, a);
162-
assert(att && att->format_att_info);
163-
zatt = att->format_att_info;
164-
nullfree(zatt);
165-
att->format_att_info = NULL; /* avoid memory errors */
166-
}
167-
#ifdef ENABLE_NCZARR_FILTERS
168-
/* Reclaim filters */
169-
if(var->filters != NULL) {
170-
(void)NCZ_filter_freelists(var);
171-
}
172-
var->filters = NULL;
173-
#endif
174-
/* Reclaim the type */
175-
if(var->type_info) (void)zclose_type(var->type_info);
176-
if(zvar->cache) NCZ_free_chunk_cache(zvar->cache);
177-
/* reclaim xarray */
178-
if(zvar->xarray) nclistfreeall(zvar->xarray);
179-
nullfree(zvar);
180-
var->format_var_info = NULL; /* avoid memory errors */
199+
if((stat = NCZ_zclose_var1(var))) goto done;
181200
}
201+
done:
182202
return stat;
183203
}
184204

libnczarr/zinternal.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -253,6 +253,7 @@ int ncz_closeorabort(NC_FILE_INFO_T*, void* params, int abort);
253253

254254
/* zclose.c */
255255
int ncz_close_ncz_file(NC_FILE_INFO_T* file, int abort);
256+
int NCZ_zclose_var1(NC_VAR_INFO_T* var);
256257

257258
/* zattr.c */
258259
int ncz_getattlist(NC_GRP_INFO_T *grp, int varid, NC_VAR_INFO_T **varp, NCindex **attlist);

libnczarr/zsync.c

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1451,6 +1451,8 @@ define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames)
14511451
NCjson* jfilter = NULL;
14521452
int chainindex;
14531453
#endif
1454+
int varsized;
1455+
int suppressvar = 0; /* 1 => make this variable invisible */
14541456

14551457
ZTRACE(3,"file=%s grp=%s |varnames|=%u",file->controller->path,grp->hdr.name,nclistlength(varnames));
14561458

@@ -1533,6 +1535,9 @@ define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames)
15331535
}
15341536
}
15351537

1538+
/* See if this variable is variable sized */
1539+
varsized = NC4_var_varsized(var);
1540+
15361541
if(!purezarr) {
15371542
/* Extract the _NCZARR_ARRAY values */
15381543
/* Do this first so we know about storage esp. scalar */
@@ -1719,7 +1724,7 @@ define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames)
17191724
/* compressor key */
17201725
/* From V2 Spec: A JSON object identifying the primary compression codec and providing
17211726
configuration parameters, or ``null`` if no compressor is to be used. */
1722-
{
1727+
if(!varsized) { /* Only process if variable is fixed-size */
17231728
#ifdef ENABLE_NCZARR_FILTERS
17241729
if(var->filters == NULL) var->filters = (void*)nclistnew();
17251730
if((stat = NCZ_filter_initialize())) goto done;
@@ -1730,6 +1735,9 @@ define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames)
17301735
}
17311736
#endif
17321737
}
1738+
/* Suppress variable if there are filters and var is not fixed-size */
1739+
if(varsized && nclistlength((NClist*)var->filters) > 0)
1740+
suppressvar = 1;
17331741

17341742
if((stat = computedimrefs(file, var, purezarr, xarray, rank, dimnames, shapes, var->dim)))
17351743
goto done;
@@ -1741,9 +1749,16 @@ define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames)
17411749
}
17421750

17431751
#ifdef ENABLE_NCZARR_FILTERS
1744-
/* At this point, we can finalize the filters */
1745-
if((stat = NCZ_filter_setup(var))) goto done;
1752+
if(!suppressvar) {
1753+
/* At this point, we can finalize the filters */
1754+
if((stat = NCZ_filter_setup(var))) goto done;
1755+
}
17461756
#endif
1757+
1758+
if(suppressvar) {
1759+
if((stat = NCZ_zclose_var1(var))) goto done;
1760+
}
1761+
17471762
/* Clean up from last cycle */
17481763
nclistfreeall(dimnames); dimnames = nclistnew();
17491764
nullfree(varpath); varpath = NULL;

libsrc4/nc4type.c

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -799,3 +799,21 @@ NC4_set_varsize(NC_TYPE_INFO_T* typ)
799799
done:
800800
return stat;
801801
}
802+
803+
/**
804+
* Test if a variable's type is fixed sized or not.
805+
* @param var - to test
806+
* @return 0 if fixed size, 1 otherwise.
807+
*/
808+
int
809+
NC4_var_varsized(NC_VAR_INFO_T* var)
810+
{
811+
NC_TYPE_INFO_T* vtype = NULL;
812+
813+
/* Check the variable type */
814+
vtype = var->type_info;
815+
if(vtype->hdr.id < NC_STRING) return 0;
816+
if(vtype->hdr.id == NC_STRING) return 1;
817+
return vtype->varsized;
818+
}
819+

nc_test4/CMakeLists.txt

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -48,15 +48,20 @@ IF(USE_HDF5 AND ENABLE_FILTER_TESTING)
4848
build_bin_test(tst_multifilter)
4949
build_bin_test(test_filter_order)
5050
build_bin_test(test_filter_repeat)
51-
build_bin_test(tst_filter_avail)
52-
build_bin_test(test_filter_vlen)
51+
build_bin_test(tst_filter_vlen)
5352
ADD_SH_TEST(nc_test4 tst_filter)
5453
ADD_SH_TEST(nc_test4 tst_specific_filters)
5554
ADD_SH_TEST(nc_test4 tst_bloscfail)
55+
ADD_SH_TEST(nc_test4 tst_filter_vlen)
5656
IF(FALSE)
5757
# This test is too dangerous to run in a parallel make environment.
5858
# It causes race conditions. So suppress and only test by hand.
5959
ADD_SH_TEST(nc_test4 tst_unknown)
60+
# The tst_filterinstall test can only be run after an install
61+
# occurred with --with-plugin-dir enabled. So there is no point
62+
#in running it via make check. It is kept here so it can be
63+
# manually invoked if desired
64+
ADD_SH_TEST(nc_test4 tst_filterinstall)
6065
ENDIF(FALSE)
6166
ENDIF(USE_HDF5 AND ENABLE_FILTER_TESTING)
6267

0 commit comments

Comments
 (0)