@jayvdb jayvdb commented Sep 25, 2020

This is an idea for allowing custom thinning of repeats. It currently replaces \w\w and similar with three preset values, but the replacement could instead be a generated set, and more generally it could reduce the complete set of strings defined by the regex to a subset (i.e. sampling, or lossy compression).

Only the test depends on https://github.com/jayvdb/sre-tools. I am happy to contribute a copy of the relevant code directly here if there is a desire to incorporate this type of functionality. Personally, I am happy to call the simplification algorithm before invoking sre_yield, so I am not pushing for inclusion of that feature. However, there are a lot of optimisations that can be done just by merging adjacent identical nodes during the expansion, because sre_yield is already creating its own internal tree.
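
To make the idea concrete, here is a rough, self-contained sketch of merging adjacent identical nodes and then thinning the repeated ones. It does not use sre_yield or sre-tools: the node representation (a tuple of matchable characters), the names `merge_adjacent`, `thin` and `expand`, the simplified `WORD` class, and the choice of three sample characters are all assumptions made purely for illustration, not the code in this PR.

```python
from itertools import groupby, product

# Hypothetical stand-ins for parsed regex nodes: each node is simply the
# tuple of characters it can match. WORD is a simplified \w (lowercase only).
WORD = tuple("abcdefghijklmnopqrstuvwxyz")
DIGIT = tuple("0123456789")


def merge_adjacent(nodes):
    """Collapse runs of identical adjacent nodes into (node, count) pairs."""
    return [(node, len(list(run))) for node, run in groupby(nodes)]


def thin(runs, samples):
    """Build the list of candidate substrings for each run.

    A repeated node that has an entry in `samples` is reduced to a few
    preset values (each sample character repeated `count` times).
    Every other run is expanded in full.
    """
    choices = []
    for node, count in runs:
        if count > 1 and node in samples:
            choices.append([ch * count for ch in samples[node]])
        else:
            choices.append(["".join(p) for p in product(node, repeat=count)])
    return choices


def expand(choices):
    """Yield every string formed by picking one candidate per run."""
    for parts in product(*choices):
        yield "".join(parts)


# Roughly \w\w\d: two identical adjacent nodes followed by a digit.
nodes = [WORD, WORD, DIGIT]
runs = merge_adjacent(nodes)

full = list(expand(thin(runs, {})))                          # no thinning
thinned = list(expand(thin(runs, {WORD: ("a", "m", "z")})))  # three presets

print(len(full), len(thinned))
```

With these assumed inputs the full expansion has 26 * 26 * 10 = 6760 strings, while the thinned one has 3 * 10 = 30, and every thinned string is still a member of the full set. A generated or randomly sampled set could be swapped in for the fixed `samples` mapping without changing the rest of the sketch.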
