Support migrating arbitrary time ranges of data #17249

@e-dard

Description

Right now the influxd migrate tool only supports migrating entire shards of 1.x data into a 2.x OSS instance.

The PR for the tool is here: #17016

We should support arbitrary time ranges of data, so that a user can decide how much data they want to migrate from 1.x to 2.x.

I have already added the flags to support this: --from and --to. Both flags expect RFC3339Nano-formatted timestamps. Currently, using either flag causes an error, so that needs to be rectified.
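As a rough illustration of what validating the two flag values could look like, here is a minimal sketch. The helper names `parseBound` and `parseRange` are hypothetical and not part of the tool. Treating an empty flag as an open-ended bound and both bounds as inclusive are assumptions, not behavior confirmed by the PR.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// parseBound is a hypothetical helper. It converts an RFC3339Nano flag value
// into Unix nanoseconds and returns def when the flag was left empty.
func parseBound(value string, def int64) (int64, error) {
	if value == "" {
		return def, nil
	}
	t, err := time.Parse(time.RFC3339Nano, value)
	if err != nil {
		return 0, fmt.Errorf("invalid timestamp %q (want RFC3339Nano, e.g. 2020-03-01T00:00:00Z): %w", value, err)
	}
	return t.UnixNano(), nil
}

// parseRange validates both flag values together and rejects an inverted range.
// Assumption: an empty --from or --to means "unbounded" on that side.
func parseRange(from, to string) (lo, hi int64, err error) {
	if lo, err = parseBound(from, math.MinInt64); err != nil {
		return 0, 0, fmt.Errorf("--from: %w", err)
	}
	if hi, err = parseBound(to, math.MaxInt64); err != nil {
		return 0, 0, fmt.Errorf("--to: %w", err)
	}
	if lo > hi {
		return 0, 0, fmt.Errorf("--from (%s) is after --to (%s)", from, to)
	}
	return lo, hi, nil
}

func main() {
	lo, hi, err := parseRange("2020-01-01T00:00:00Z", "2020-02-01T00:00:00.5Z")
	fmt.Println(lo, hi, err)

	_, _, err = parseRange("yesterday", "")
	fmt.Println(err)
}
```

Putting the expected format in the error message also helps with the documentation point in the Definition of Done, since a user who gets the format wrong sees an example of a valid value.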

Secondly, the processTSMFile method needs to be implemented. Unlike processTSMFileFast, processTSMFile needs to look at the time range of each block within a TSM file and compare it to the --from and --to flag values.

If a block is completely covered by the desired time range, it can be migrated over as-is. However, if a block is only partially covered, it will need to be decoded, trimmed to the range, and then encoded again before migration. If a block does not overlap the range at all, it should be excluded from the migration.
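The three cases above can be sketched as follows. This is only an illustration: `overlap`, `classify`, `point`, and `trim` are hypothetical names, `point` is a stand-in for a decoded TSM value, and the sketch assumes inclusive bounds with `from <= to` and points sorted by timestamp.

```go
package main

import (
	"fmt"
	"sort"
)

type overlap int

const (
	none    overlap = iota // block lies entirely outside the range: skip it
	partial                // block straddles a boundary: decode, trim, re-encode
	full                   // block lies entirely inside the range: copy as-is
)

// classify compares a block's [blockMin, blockMax] time range with the
// requested [from, to] range. All bounds are treated as inclusive.
func classify(blockMin, blockMax, from, to int64) overlap {
	switch {
	case blockMax < from || blockMin > to:
		return none
	case blockMin >= from && blockMax <= to:
		return full
	default:
		return partial
	}
}

// point is a stand-in for a decoded value; real blocks hold typed values.
type point struct {
	ts int64
	v  float64
}

// trim returns the sub-slice of pts whose timestamps fall within [from, to].
// It assumes pts is sorted by timestamp and that from <= to.
func trim(pts []point, from, to int64) []point {
	i := sort.Search(len(pts), func(i int) bool { return pts[i].ts >= from })
	j := sort.Search(len(pts), func(j int) bool { return pts[j].ts > to })
	return pts[i:j]
}

func main() {
	pts := []point{{10, 1.0}, {15, 2.0}, {20, 3.0}}
	fmt.Println(classify(10, 20, 0, 30) == full)
	fmt.Println(classify(10, 20, 15, 30) == partial)
	fmt.Println(classify(10, 20, 21, 30) == none)
	fmt.Println(len(trim(pts, 15, 30)))
}
```

Note that a partially overlapped block can end up with no points after trimming, for example when the range falls in a gap between two of its timestamps. Such a block should be treated like a non-overlapping one.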

It is important that index entries for blocks excluded from the migration are omitted from the 2.x file, and that index entries for partially overlapped blocks are updated appropriately (e.g., updating the start and end times).
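To make the index requirement concrete, here is a simplified sketch of rebuilding index entries while writing blocks. Everything in it is an assumption for illustration: `indexEntry` is a hypothetical type with the fields one would expect an entry to carry, `fakeEncode` is a placeholder for real block encoding, and blocks are modeled as plain sorted timestamp slices. For brevity it filters every block, whereas a fully covered block could be copied without decoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// indexEntry is a hypothetical index record for one block in the output.
type indexEntry struct {
	MinTime, MaxTime int64  // first and last timestamp actually stored
	Offset           int64  // byte offset of the block in the output
	Size             uint32 // encoded size of the block in bytes
}

// fakeEncode is a stand-in for real block encoding: 8 bytes per timestamp.
func fakeEncode(ts []int64) []byte {
	buf := make([]byte, 8*len(ts))
	for i, t := range ts {
		binary.BigEndian.PutUint64(buf[8*i:], uint64(t))
	}
	return buf
}

// migrateBlocks writes the parts of each block that fall within [from, to]
// and builds a matching index. Excluded blocks get no data and no entry.
func migrateBlocks(blocks [][]int64, from, to int64) (data []byte, index []indexEntry) {
	for _, ts := range blocks {
		var kept []int64
		for _, t := range ts {
			if t >= from && t <= to {
				kept = append(kept, t)
			}
		}
		if len(kept) == 0 {
			continue // nothing survives: omit the block and its index entry
		}
		enc := fakeEncode(kept)
		index = append(index, indexEntry{
			MinTime: kept[0],           // taken from the retained points,
			MaxTime: kept[len(kept)-1], // not from the requested range
			Offset:  int64(len(data)),
			Size:    uint32(len(enc)),
		})
		data = append(data, enc...)
	}
	return data, index
}

func main() {
	blocks := [][]int64{{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}
	data, index := migrateBlocks(blocks, 5, 20)
	fmt.Println(len(data), index)
}
```

The point of the sketch is that the new start and end times come from the points that were kept rather than from the flag values, and that offsets and sizes must be recomputed because earlier blocks may have been dropped or shrunk.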

Definition of Done

  • The --from and --to flags don't return an error.
  • processTSMFile is implemented.
  • Test cases exist for processTSMFile (hint: consider accepting an interface that can write and seek, and then using buffers in tests to write into).
  • The --from and --to flag documentation is clear. For example, right now it doesn't mention the format the flag values need to be in.
