rereplace side effect of performance tweak #352

@chuckbecker

In the docs for rereplace, you say:

An small point to note, is that the replacements are first searched, and then all replacements are made. This is done for performance and reliability reasons. Generally this will have no side effects, however there may be cases where it makes a difference. (Author’s note: If you have such a case, please post a note on the forums such that it can be added to the documentation, or corrected).

Well, I found a situation: When the callback function changes the text in addition to returning the replacement, that can change the position of future matches.

My use case: I have a JSON file whose structure includes a bunch of "id" values. The ids are hierarchical, so "6.1" should live inside "6". Currently the ids in the file are all out of order. I want to go through the file and renumber each integer id (e.g. "6") according to the order in which it appears, and also rename all of its child ids (e.g. "6.1") so that they stay together with their parent.

So I want to go from something like this:

    {
      "type": "html",
      "id": 15,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 6,
      "formId": 73,
      "inputs": [
        {
          "id": "6.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "6.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 14,
      "formId": 73
    }

to this:

    {
      "type": "html",
      "id": 1,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 2,
      "formId": 73,
      "inputs": [
        {
          "id": "2.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "2.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 3,
      "formId": 73
    }

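Here's the Python code I'm using:

    i = 1  # next sequential id to assign

    def change(match):
        global i
        # Nested replace: renumber the children of this id (e.g. "6.2" -> "2.2").
        editor.rereplace(r'"id": "' + match.group(2) + r'\.(\d+)"', r'"id": "' + str(i) + '.' + r'\1"')
        i = i + 1
        return match.group(1) + str(i - 1) + ','

    # Raw string so that \d reaches the regex engine unchanged.
    editor.rereplace(r'("id": )(\d+),', change)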

Notice that the callback function (change()) also performs another replace before it returns the replacement value. That inner replace should only affect the part of the file after the current match. Since the current implementation apparently gets the positions of all the matches before applying the replacements (I'm assuming that to be the case?), the result looks something like this:

    {
      "type": "html",
      "id": 1,
      "formId": 73,
    },
    {
      "type": "text",
      "id": 2,
      "formId": 73,
      "inputs": [
        {
          "id": "2.2",
          "label": "Prefix",
          "name": "",
        },
        {
          "id": "2.3",
          "label": "First",
          "name": "",
        }
      ]
    },
    {
      "type": "text",
      "id": 14,
    3,"formId": 73
    }

Notice that in the last object, the "14" has not been replaced, and the "3" gets inserted in the wrong place.
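
To illustrate the kind of failure I suspect, here is a toy model in plain Python. It is not PythonScript's actual implementation; the function names, the `buffer` list and the `id=`/`sub=` sample text are all made up for this sketch. It collects every match position first, then applies the replacements by offset, while the callback also edits the text behind the loop's back:

```python
import re

def replace_with_precomputed_spans(text, pattern, callback):
    """Toy model (an assumption, not PythonScript's code):
    find every match first, then apply the replacements by offset."""
    matches = list(re.finditer(pattern, text))  # offsets are frozen here
    buffer = [text]  # one-element list so the callback can edit the text
    shift = 0        # tracks only the length changes made by this loop
    for m in matches:
        replacement = callback(m, buffer)
        start, end = m.start() + shift, m.end() + shift
        buffer[0] = buffer[0][:start] + replacement + buffer[0][end:]
        shift += len(replacement) - (m.end() - m.start())
    return buffer[0]

def make_callback():
    counter = [0]
    def callback(match, buffer):
        counter[0] += 1
        new_id, old_id = str(counter[0]), match.group(1)
        # Nested edit, comparable to calling a replace inside the callback.
        buffer[0] = buffer[0].replace('sub=' + old_id + '.', 'sub=' + new_id + '.')
        return 'id=' + new_id
    return callback

corrupted = replace_with_precomputed_spans(
    'id=10 sub=10.1 id=20', r'id=(\d+)', make_callback())
print(corrupted)  # id=1 sub=1.1 iid=2 (intended: id=1 sub=1.1 id=2)
```

The nested edit shortens the text by one character that the outer loop never hears about, so the second replacement lands one character too late. That looks a lot like what happens to the "14" above, though I can't confirm that the real cause is the same.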

I get that maybe I'm using the callback function inappropriately, but since you asked for examples, I figured I'd pass this along.

Perhaps there could be an optional parameter that turns off the performance optimization in rereplace?
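
In the meantime, a possible workaround is to avoid the nested replace entirely and handle parent and child ids in one pass with a single callback. The sketch below uses the standard `re` module on a plain string so it can be run anywhere. `renumber_ids` and `ID_PATTERN` are hypothetical names, and it assumes every child id appears after its parent in the file (a child seen before its parent is left unchanged). Presumably the same callback could be passed to rereplace, but I have not tested that.

```python
import re

# Matches either a top-level id (`"id": 6,`) or a child id (`"id": "6.2"`).
ID_PATTERN = re.compile(r'"id": (?:(\d+),|"(\d+)\.(\d+)")')

def renumber_ids(text):
    mapping = {}   # old top-level id -> new sequential id
    counter = [0]  # list so the nested function can update it

    def change(match):
        if match.group(1) is not None:
            # Top-level id: assign the next number in order of appearance.
            counter[0] += 1
            mapping[match.group(1)] = str(counter[0])
            return '"id": ' + str(counter[0]) + ','
        # Child id: reuse the parent's new number if the parent was seen.
        parent = mapping.get(match.group(2), match.group(2))
        return '"id": "' + parent + '.' + match.group(3) + '"'

    return ID_PATTERN.sub(change, text)

sample = '"id": 15,\n"id": 6,\n"id": "6.2"\n"id": "6.3"\n"id": 14,'
print(renumber_ids(sample))
```

Because the callback only returns a replacement and never edits the text itself, the order in which matches are found and replaced should no longer matter.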
