Commit a366a05

Merge pull request #31 from facelessuser/extmatch-fixes
- Allow literal dots in extmatch patterns; they should not be treated the way character sequences treat dots.
- Extended match patterns should be allowed to match the '.' and '..' directories, just like Bash, if the extended pattern starts with '.'.
- Extended pattern inverse should handle path separators correctly.
- `?` or `[.]` should not trigger matching the directories `.` and `..`.

Closes #30 #32
2 parents 4f7e9c8 + 77b0a56 commit a366a05

10 files changed: +107 -38 lines changed

.coveragerc

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 [run]
 omit=
-    backrefs/pep562.py
+    wcmatch/pep562.py
 
 [report]
 omit=
-    backrefs/pep562.py
+    wcmatch/pep562.py

docs/src/markdown/changelog.md

Lines changed: 7 additions & 0 deletions
@@ -1,5 +1,12 @@
 # Changelog
 
+## 2.2.1
+
+- **FIX**: `EXTMATCH`/`EXTGLOB` should allow literal dots and should not treat dots like sequences do.
+- **FIX**: Fix `!(...)` extended match patterns in `glob` and `globmatch` so that they properly match `.` and `..` if their pattern starts with `.`.
+- **FIX**: Fix `!(...)` extended match patterns so that they handle path separators correctly.
+- **FIX**: Patterns such as `?` or `[.]` should not trigger matching directories `.` and `..` in `glob` and `globmatch`.
+
 ## 2.2.0
 
 - **NEW**: Officially support Python 3.8.

docs/src/markdown/fnmatch.md

Lines changed: 2 additions & 2 deletions
@@ -29,7 +29,7 @@ Pattern | Meaning
 
 - Slashes are generally treated as normal characters, but on windows they will be normalized: `/` will become `\\`. There is no need to explicitly use `\\` in patterns on Windows, but if you do, it will be handled. This applies to matching patterns and the file names the patterns are applied to.
 - If case sensitivity is applied on a Windows system, slashes will not be normalized and pattern and file names will be treated as a Linux/Unix path.
-- By default, `.` is *not* matched by `*`, `?`, `[]`, and extended patterns such as `*(...)`. See the [`DOTMATCH`](#fnmatchdotmatch) flag to match `.` at the start of a filename without a literal `.`.
+- By default, `.` is *not* matched by `*`, `?`, and `[]`. See the [`DOTMATCH`](#fnmatchdotmatch) flag to match `.` at the start of a filename without a literal `.`.
 
 --8<-- "posix.txt"
 
@@ -145,7 +145,7 @@ When `MINUSNEGATE` is used with [`NEGATE`](#fnmatchnegate), negate patterns are
 
 #### `fnmatch.DOTMATCH, fnmatch.D` {: #fnmatchdotmatch}
 
-By default, [`glob`](#fnmatchfnmatch) and related functions will not match file or directory names that start with dot `.` unless matched with a literal dot. `DOTMATCH` allows the meta characters (such as `*`) to match dots like any other character. Dots will not be matched in `[]`, `*`, `?`, or extended patterns like `+(...)`.
+By default, [`glob`](#fnmatchfnmatch) and related functions will not match file or directory names that start with dot `.` unless matched with a literal dot. `DOTMATCH` allows the meta characters (such as `*`) to match dots like any other character. Dots will not be matched in `[]`, `*`, or `?`.
 
 #### `fnmatch.EXTMATCH, fnmatch.E` {: #fnmatchextmatch}

docs/src/markdown/glob.md

Lines changed: 2 additions & 2 deletions
@@ -34,7 +34,7 @@ Pattern | Meaning
 - Windows drives are recognized as either `C:\\` or `\\\\Server\\mount\\` (or `C:/` and `//Server/mount/`).
 - Meta characters have no effect when inside a UNC path: `\\\\Server?\\mount*\\`.
 - If case sensitivity is applied on a Windows system, slashes will not be normalized and pattern and paths will be treated as if on Linux/Unix. Also Windows drives are no longer handled special. One exception is when using the functions [`glob`](#globglob) or [`iglob`](#globiglob). Since `glob` and `iglob` work on the actual file system of the host, it *must* normalize slashes and handle drives to work properly on the system.
-- By default, file and directory names starting with `.` are only matched with literal `.`. The patterns `*`, `?`, `[]`, and extended patterns like `*(...)` will not match a leading `.`. To alter this behavior, you can use the [`DOTGLOB`](#globdotglob) flab, but even with `DOTGLOB`, `*` and `**` will not match a directory `.` or `..`. But a pattern like `.*` will match `.` and `..`.
+- By default, file and directory names starting with `.` are only matched with literal `.`. The patterns `*`, `**`, `?`, and `[]` will not match a leading `.`. To alter this behavior, you can use the [`DOTGLOB`](#globdotglob) flag, but even with `DOTGLOB` these special tokens will not match a special directory (`.` or `..`). But when a literal `.` is used, for instance in the pattern `.*`, the pattern will match `.` and `..`.
 - Relative paths and patterns are supported.
 
 ```pycon3
@@ -268,7 +268,7 @@ When `MINUSNEGATE` is used with [`NEGATE`](#globnegate), negate patterns are rec
 
 #### `glob.DOTGLOB, glob.D` {: #globdotglob}
 
-By default, [`glob`](#globglob) and [`globmatch`](#globglobmatch) will not match file or directory names that start with dot `.` unless matched with a literal dot. `DOTGLOB` allows the meta characters (such as `*`) to glob dots like any other character. Dots will not be matched in `[]`, `*`, `?`, or extended patterns like `+(...)`.
+By default, [`glob`](#globglob) and [`globmatch`](#globglobmatch) will not match file or directory names that start with dot `.` unless matched with a literal dot. `DOTGLOB` allows the meta characters (such as `*`) to glob dots like any other character. Dots will not be matched in `[]`, `*`, or `?`.
 
 Alternatively `DOTMATCH` will also be accepted for consistency with the other provided libraries. Both flags are exactly the same and are provided as a convenience in case the user finds one more intuitive than the other since `DOTGLOB` is often the name used in Bash.

tests/test_fnmatch.py

Lines changed: 3 additions & 3 deletions
@@ -112,10 +112,10 @@ class TestFnMatch:
         ['?abc', '.abc', False, fnmatch.D],
         ['*abc', '.abc', False, fnmatch.D],
         ['[.]abc', '.abc', False, fnmatch.D],
-        ['*(.)abc', '.abc', False, fnmatch.E | fnmatch.D],
-        [r'*(\.)abc', '.abc', False, fnmatch.E | fnmatch.D],
+        ['*(.)abc', '.abc', True, fnmatch.E | fnmatch.D],
+        [r'*(\.)abc', '.abc', True, fnmatch.E | fnmatch.D],
         ['*(?)abc', '.abc', False, fnmatch.E | fnmatch.D],
-        ['*(?|.)abc', '.abc', False, fnmatch.E | fnmatch.D],
+        ['*(?|.)abc', '.abc', True, fnmatch.E | fnmatch.D],
         ['*(?|*)abc', '.abc', False, fnmatch.E | fnmatch.D],
         ['a.bc', 'a.bc', True, fnmatch.D],
         ['a?bc', 'a.bc', True, fnmatch.D],

tests/test_glob.py

Lines changed: 6 additions & 2 deletions
@@ -363,12 +363,16 @@ class Testglob(_TestGlob):
         [('aa?',), [('aaa',), ('aab',)]],
         [('aa[ab]',), [('aaa',), ('aab',)]],
         [('*q',), []],
+        [('.',), [('.',)]],
+        [('?',), [('a',)]],
+        [('[.a]',), [('a',)]],
+        [('*.',), []],
 
         # Glob inverse
         [
             ('a*', '**'),
             [
-                ('EF',), ('ZZZ',), ('',)
+                ('EF',), ('ZZZ',), ('',), ('a',), ('aaa', ), ('aab', )
             ] if not can_symlink() else [
                 ('EF',), ('ZZZ',), ('',), ('sym1',), ('sym3',), ('sym2',),
                 ('a',), ('aaa', ), ('aab', ), ('sym3', 'efg'), ('sym3', 'efg', 'ha'), ('sym3', 'EF')
@@ -716,7 +720,7 @@ class Testglob(_TestGlob):
         [
             ('a*', '**'),
             [
-                ('EF',), ('ZZZ',)
+                ('EF',), ('ZZZ',), ('a',), ('aaa', ), ('aab', )
             ] if not can_symlink() else [
                 ('EF',), ('ZZZ',),
                 ('a',), ('aaa', ), ('aab', ), ('sym1',), ('sym3',), ('sym2',), ('sym3', 'efg'), ('sym3', 'efg', 'ha'),

tests/test_globmatch.py

Lines changed: 46 additions & 3 deletions
@@ -382,6 +382,46 @@ class TestGlobFilter:
         ['!(test)', ['.abc', 'abc'], glob.D | glob.M],
         ['.!(test)', ['.', '..', '.abc'], glob.M],
         ['.!(test)', ['.', '..', '.abc'], glob.D | glob.M],
+        ['!(.)', ['..', '.abc', 'abc'], glob.M],
+        [r'!(\.)', ['..', '.abc', 'abc'], glob.M],
+        [r'!(\x2e)', ['..', '.abc', 'abc'], glob.M | glob.R],
+        ['@(!(.))', ['..', '.abc', 'abc'], glob.M],
+        ['!(@(.))', ['..', '.abc', 'abc'], glob.M],
+        ['+(!(.))', ['..', '.abc', 'abc'], glob.M],
+        ['!(+(.))', ['.abc', 'abc'], glob.M],
+        ['!(?)', ['abc'], glob.M],
+        ['!(*)', [], glob.M],
+        ['!([.])', ['abc'], glob.M],
+        ['!(.)', ['..', '.abc', 'abc'], glob.M | glob.D],
+        [r'!(\.)', ['..', '.abc', 'abc'], glob.M | glob.D],
+        [r'!(\x2e)', ['..', '.abc', 'abc'], glob.M | glob.R | glob.D],
+        ['@(!(.))', ['..', '.abc', 'abc'], glob.M | glob.D],
+        ['!(@(.))', ['..', '.abc', 'abc'], glob.M | glob.D],
+        ['+(!(.))', ['..', '.abc', 'abc'], glob.M | glob.D],
+        ['!(+(.))', ['.abc', 'abc'], glob.M | glob.D],
+        ['!(?)', ['.abc', 'abc'], glob.M | glob.D],
+        ['!(*)', [], glob.M | glob.D],
+        ['!([.])', ['.abc', 'abc'], glob.M | glob.D],
+
+        # More extended pattern dot related tests
+        ['*(.)', ['.', '..']],
+        [r'*(\.)', ['.', '..']],
+        ['*([.])', []],
+        ['*(?)', ['abc']],
+        ['@(.?)', ['..']],
+        ['@(?.)', []],
+        ['*(.)', ['.', '..'], glob.D],
+        [r'*(\.)', ['.', '..'], glob.D],
+        ['*([.])', [], glob.D],
+        ['*(?)', ['.abc', 'abc'], glob.D],
+        ['@(.?)', ['..'], glob.D],
+        ['@(?.)', [], glob.D],
+
+        GlobFiles(['folder/abc', 'directory/abc', 'dir/abc']),
+        # Test that inverse works properly mid path.
+        ['!(folder)/*', ['directory/abc', 'dir/abc'], glob.M],
+        ['!(folder)dir/abc', ['dir/abc'], glob.M],
+        ['!(dir)/abc', ['directory/abc', 'folder/abc'], glob.M],
 
         # Slash exclusion
         GlobFiles(
@@ -725,23 +765,26 @@ def test_glob_translate(self, mock__iscase_sensitive):
         if util.PY37:
             value = (
                 [
-                    '^(?s:(?:(?!(?:/|^)\\.).)*?(?:^|$|/)+(?![/.])[\x00-\x7f]/+stuff/+(?=.)'
+                    '^(?s:(?:(?!(?:/|^)\\.).)*?(?:^|$|/)+'
+                    '(?!(?:\\.{1,2})(?:$|/))(?![/.])[\x00-\x7f]/+stuff/+(?=.)'
                     '(?!(?:\\.{1,2})(?:$|/))(?:(?!\\.)[^/]*?)?[/]*?)$'
                 ],
                 []
             )
         elif util.PY36:
             value = (
                 [
-                    '^(?s:(?:(?!(?:\\/|^)\\.).)*?(?:^|$|\\/)+(?![\\/.])[\x00-\x7f]\\/+stuff\\/+(?=.)'
+                    '^(?s:(?:(?!(?:\\/|^)\\.).)*?(?:^|$|\\/)+'
+                    '(?!(?:\\.{1,2})(?:$|\\/))(?![\\/.])[\x00-\x7f]\\/+stuff\\/+(?=.)'
                     '(?!(?:\\.{1,2})(?:$|\\/))(?:(?!\\.)[^\\/]*?)?[\\/]*?)$'
                 ],
                 []
             )
         else:
             value = (
                 [
-                    '(?s)^(?:(?:(?!(?:\\/|^)\\.).)*?(?:^|$|\\/)+(?![\\/.])[\x00-\x7f]\\/+stuff\\/+(?=.)'
+                    '(?s)^(?:(?:(?!(?:\\/|^)\\.).)*?(?:^|$|\\/)+'
+                    '(?!(?:\\.{1,2})(?:$|\\/))(?![\\/.])[\x00-\x7f]\\/+stuff\\/+(?=.)'
                     '(?!(?:\\.{1,2})(?:$|\\/))(?:(?!\\.)[^\\/]*?)?[\\/]*?)$'
                 ],
                 []

tox.ini

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@ deps=
 commands=
     {envbindir}/py.test --cov wcmatch --cov-append tests
     {envbindir}/coverage html -d {envtmpdir}/coverage
+    {envbindir}/coverage report --show-missing
 
 [testenv:lint]
 deps=

wcmatch/__meta__.py

Lines changed: 1 addition & 1 deletion
@@ -186,5 +186,5 @@ def parse_version(ver, pre=False):
     return Version(major, minor, micro, release, pre, post, dev)
 
 
-__version_info__ = Version(2, 2, 0, "final")
+__version_info__ = Version(2, 2, 1, "final")
 __version__ = __version_info__._get_canonical()

wcmatch/_wcparse.py

Lines changed: 37 additions & 23 deletions
@@ -102,6 +102,7 @@
 _ONE_OR_MORE = r'+'
 # End of pattern
 _EOP = r'$'
+_PATH_EOP = r'(?:$|%(sep)s)'
 # Divider between `globstar`. Can match start or end of pattern
 # in addition to slashes.
 _GLOBSTAR_DIV = r'(?:^|$|%s)+'
@@ -603,6 +604,8 @@ def __init__(self, pattern, flags=0):
             self.bslash_abort = False
             self.sep = '/'
         sep = {"sep": re.escape(self.sep)}
+        self.path_eop = _PATH_EOP % sep
+        self.no_dir = _NO_DIR % sep
         self.seq_path = _PATH_NO_SLASH % sep
         self.seq_path_dot = _PATH_NO_SLASH_DOT % sep
         self.path_star = _PATH_STAR % sep
@@ -643,11 +646,18 @@ def update_dir_state(self):
         elif not self.dir_start and self.after_start:
             self.reset_dir_track()
 
+    def _restrict_extended_slash(self):
+        """Restrict extended slash."""
+
+        return self.seq_path if self.pathname else ''
+
     def _restrict_sequence(self):
         """Restrict sequence."""
 
         if self.pathname:
             value = self.seq_path_dot if self.after_start and not self.dot else self.seq_path
+            if self.after_start:
+                value = self.no_dir + value
         else:
             value = _NO_DOT if self.after_start and not self.dot else ""
         self.reset_dir_track()
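
The effect of prefixing a sequence with `self.no_dir` can be illustrated with plain `re`. This is a minimal sketch, not the library's code: the lookahead is copied from the expected regex in `test_glob_translate` in this commit, while the name `NO_DIR`, the helper `build`, and the simplified `[.a]` sequence regex are assumptions made for illustration.

```python
import re

# Lookahead taken from the expected regex in test_glob_translate ("/" separator).
# The name NO_DIR is an assumption mirroring `_NO_DIR`, whose definition is not
# shown in this diff.
NO_DIR = r'(?!(?:\.{1,2})(?:$|/))'


def build(sequence, guard):
    """Hypothetical helper: anchor a sequence regex, optionally with the guard."""
    return re.compile('^' + (NO_DIR if guard else '') + sequence + '$')


# `[.a]` with dot matching enabled: without the guard it matches the directory `.`.
unguarded = build(r'[.a]', guard=False)
guarded = build(r'[.a]', guard=True)

print(bool(unguarded.match('.')))  # True
print(bool(guarded.match('.')))    # False
print(bool(guarded.match('a')))    # True

# Two-character case: `..` is rejected, but an ordinary name such as `.a` still matches.
pair = build(r'[.a][.a]', guard=True)
print(bool(pair.match('..')))      # False
print(bool(pair.match('.a')))      # True
```

The guard only rejects a segment that is exactly `.` or `..` (followed by the end of the string or a separator), which is why dot-prefixed file names are unaffected by it.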
@@ -794,7 +804,7 @@ def _sequence(self, i):
         else:
             result = [value]
 
-        if self.pathname or (self.after_start and not self.dot):
+        if self.pathname or self.after_start:
             return self._restrict_sequence() + ''.join(result)
 
         return ''.join(result)
@@ -814,23 +824,25 @@ def _references(self, i, sequence=False):
                     value = self.get_path_sep() + _ONE_OR_MORE
                     self.set_start_dir()
                 else:
-                    value = self._restrict_sequence() + value
+                    value = self._restrict_extended_slash() + value
         elif c == '/':
             # \/
             if sequence and self.pathname:
                 raise PathNameException
             if self.pathname:
                 value = r'\\'
                 if self.in_list:
-                    value = self._restrict_sequence() + value
+                    value = self._restrict_extended_slash() + value
                 i.rewind(1)
             else:
                 value = re.escape(c)
-        elif self.in_list and not sequence and self.after_start and c == '.':
-            value = _NO_DOT + re.escape(c)
         else:
             # \a, \b, \c, etc.
             value = re.escape(c)
+            if c == '.' and self.after_start and self.in_list:
+                self.allow_special_dir = True
+                self.reset_dir_track()
+
         return value
 
     def _handle_star(self, i, current):
@@ -923,7 +935,7 @@ def _handle_star(self, i, current):
             else:
                 current.append(value)
 
-    def clean_up_inverse(self, current, default=None):
+    def clean_up_inverse(self, current):
         """
         Clean up current.
 
@@ -941,18 +953,16 @@ def clean_up_inverse(self, current, default=None):
         if not self.inv_ext:
             return
 
-        if default is None:
-            default = ''
-
         index = len(current) - 1
         while index >= 0:
             if isinstance(current[index], InvPlaceholder):
                 content = current[index + 1:]
-                current[index] = (''.join(content) if content else default) + (_EXCLA_GROUP_CLOSE % str(current[index]))
+                content.append(_EOP if not self.pathname else self.path_eop)
+                current[index] = (''.join(content)) + (_EXCLA_GROUP_CLOSE % str(current[index]))
             index -= 1
         self.inv_ext = 0
 
-    def parse_extend(self, c, i, current):
+    def parse_extend(self, c, i, current, reset_dot=False):
         """Parse extended pattern lists."""
 
         # Save state
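
Why `clean_up_inverse` now appends `self.path_eop` rather than a bare end anchor in pathname mode can be shown with a hand-written regex. The sketch below is an assumption-laden illustration, not the regex the library actually emits: `inverse_matches`, `FILES`, and the simplified `!(name)/abc` translation are hypothetical, and only the `_PATH_EOP` template comes from the diff. The file list is borrowed from the new `GlobFiles` test case.

```python
import re

# Template from the diff; "/" is the separator used in this sketch.
_PATH_EOP = r'(?:$|%(sep)s)'
PATH_EOP = _PATH_EOP % {"sep": re.escape('/')}

FILES = ['folder/abc', 'directory/abc', 'dir/abc']


def inverse_matches(name, anchor):
    """Hypothetical helper: files matched by a simplified regex for `!(name)/abc`."""
    pattern = re.compile(r'^(?!%s%s)[^/]+/abc$' % (re.escape(name), anchor))
    return [f for f in FILES if pattern.match(f)]


# No anchor: `directory/abc` is wrongly excluded because it starts with `dir`.
print(inverse_matches('dir', ''))        # ['folder/abc']
# End-of-pattern anchor only: `dir/abc` is wrongly kept because `dir` is not at the end.
print(inverse_matches('dir', '$'))       # ['folder/abc', 'directory/abc', 'dir/abc']
# End of pattern or separator: only the exact segment `dir` is excluded.
print(inverse_matches('dir', PATH_EOP))  # ['folder/abc', 'directory/abc']
```

The last result agrees with the expectation for `'!(dir)/abc'` in the tests added by this commit.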
@@ -961,6 +971,8 @@ def parse_extend(self, c, i, current):
         temp_in_list = self.in_list
         temp_inv_ext = self.inv_ext
         self.in_list = True
+        if reset_dot:
+            self.allow_special_dir = False
 
         # Start list parsing
         success = True
@@ -979,14 +991,15 @@ def parse_extend(self, c, i, current):
                     pass
                 elif c == '*':
                     self._handle_star(i, extended)
-                elif c == '.' and not self.dot and self.after_start:
-                    extended.append(_NO_DOT + re.escape(c))
+                elif c == '.' and self.after_start:
+                    extended.append(re.escape(c))
+                    self.allow_special_dir = True
                     self.reset_dir_track()
                 elif c == '?':
                     extended.append(self._restrict_sequence() + _QMARK)
                 elif c == '/':
                     if self.pathname:
-                        extended.append(self._restrict_sequence())
+                        extended.append(self._restrict_extended_slash())
                     extended.append(re.escape(c))
                 elif c == "|":
                     self.clean_up_inverse(extended)
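
The removed `_NO_DOT + re.escape(c)` branch explains why a literal dot inside an extended pattern could never match before this change. A minimal sketch, assuming `_NO_DOT` is a negative lookahead for a dot (its definition is not part of this diff, so `(?!\.)` here is an assumption), and using simplified stand-ins for the translation of `*(.)abc`:

```python
import re

# Assumption: `_NO_DOT` is a negative lookahead for a literal dot; its real
# definition is not shown in this diff.
NO_DOT = r'(?!\.)'

# Simplified stand-ins for what `*(.)abc` might translate to before and after.
before = re.compile(r'^(?:' + NO_DOT + re.escape('.') + r')*abc$')
after = re.compile(r'^(?:' + re.escape('.') + r')*abc$')

# "Not a dot" followed by "a dot" is a contradiction, so the group never matches.
print(bool(before.match('.abc')))  # False
print(bool(after.match('.abc')))   # True
print(bool(after.match('abc')))    # True
```

This lines up with the `tests/test_fnmatch.py` rows in this commit whose expected value for `'*(.)abc'` against `'.abc'` flips from `False` to `True`.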
@@ -1026,17 +1039,18 @@ def parse_extend(self, c, i, current):
                 # If pattern is at the end, anchor the match to the end.
                 current.append(_EXCLA_GROUP % ''.join(extended))
                 if self.pathname:
-                    if temp_after_start and not self.dot:
+                    if not temp_after_start or self.allow_special_dir:
+                        star = self.path_star
+                    elif temp_after_start and not self.dot:
                         star = self.path_star_dot2
-                    elif temp_after_start:
-                        star = self.path_star_dot1
                     else:
-                        star = self.path_star
+                        star = self.path_star_dot1
                 else:
-                    if temp_after_start and not self.dot:
-                        star = _NO_DOT + _STAR
-                    else:
+                    if not temp_after_start or self.dot:
                         star = _STAR
+                    else:
+                        star = _NO_DOT + _STAR
+
                 if temp_after_start:
                     star = _NEED_CHAR + star
                 # Place holder for closing, but store the proper star
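
The reordered branches that pick the `star` fragment for `!(...)` can be read as a small decision table. The sketch below restates the new logic as a standalone function that returns labels instead of regex fragments; `choose_star` is a hypothetical name, and the labels stand in for attributes and constants whose regex values are not shown in this diff.

```python
def choose_star(pathname, after_start, dot, allow_special_dir):
    """Return a label for the star fragment chosen after an inverse group."""
    if pathname:
        if not after_start or allow_special_dir:
            # Mid-segment, or the group began with a literal dot.
            star = 'path_star'
        elif after_start and not dot:
            star = 'path_star_dot2'
        else:
            star = 'path_star_dot1'
    else:
        if not after_start or dot:
            star = '_STAR'
        else:
            star = '_NO_DOT + _STAR'

    if after_start:
        # At the start of a segment at least one character is required.
        star = '_NEED_CHAR + ' + star
    return star


print(choose_star(True, True, False, True))    # _NEED_CHAR + path_star
print(choose_star(True, True, False, False))   # _NEED_CHAR + path_star_dot2
print(choose_star(False, True, False, False))  # _NEED_CHAR + _NO_DOT + _STAR
```

The notable change is the first branch: once `allow_special_dir` has been set by a leading literal dot, the unrestricted `path_star` is used even at the start of a segment.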
@@ -1108,7 +1122,7 @@ def root(self, pattern, current):
         for c in i:
 
             index = i.index
-            if self.extend and c in EXT_TYPES and self.parse_extend(c, i, current):
+            if self.extend and c in EXT_TYPES and self.parse_extend(c, i, current, True):
                 # Nothing to do
                 pass
             elif c == '*':
@@ -1146,7 +1160,7 @@ def root(self, pattern, current):
 
             self.update_dir_state()
 
-        self.clean_up_inverse(current, default=_EOP)
+        self.clean_up_inverse(current)
         if self.pathname:
             current.append(_PATH_TRAIL % self.get_path_sep())

0 commit comments
