Workflow format #35

@kba

Current situation

We currently ship the simplistic ocrd process tool, which runs sequential workflows with minimal validation of inputs, outputs and parameters. This approach does not scale to more complex workflows or to workspaces with many files:

  • No error handling, graceful or otherwise. A single failure of a single processor on a single image breaks the whole workflow and leaves inconsistent state behind.
  • No support for dynamic behavior at runtime, apart from simple mappings based on XPath or similar.
  • Inefficient: it does not make full or smart use of the available computing resources.
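
To make the first point concrete, here is a minimal sketch of per-page failure isolation, where a failing step only stops the page it failed on. It is not how ocrd process or any existing engine works; the `Step` type and the `run_isolated` function are hypothetical names introduced for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical step type (an assumption, not an OCR-D API):
# a callable that processes one page ID and raises on failure.
Step = Callable[[str], None]


def run_isolated(steps: List[Tuple[str, Step]], pages: List[str]) -> Dict[str, str]:
    """Run all steps on every page, isolating failures per page.

    Returns a mapping from failed page ID to a short error description.
    Pages that are not in the mapping completed every step.
    """
    failures: Dict[str, str] = {}
    for page in pages:
        for name, step in steps:
            try:
                step(page)
            except Exception as exc:
                # Record the failure and skip the remaining steps for this
                # page only; the other pages are still processed.
                failures[page] = f"{name}: {exc}"
                break
    return failures
```

A real engine would additionally have to clean up or roll back the partial results of a failed page, which this sketch leaves out.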

So we need a proper workflow engine as a backend, which is being worked on in different contexts. Regardless of the implementation, however, we should specify a common syntax for OCR-D workflows.

How it should be

OCR-D users should be able to model even complex, dynamic workflows with an easy-to-understand and well-defined syntax. It should be easy to share workflows and to validate them for consistency with OCR-D tooling.
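
As one possible reading of "validate for consistency", the sketch below checks that every step only reads file groups that already exist when it runs and never writes to a group that already exists. The `StepSpec` data model and `validate_workflow` are assumptions made for this example, not a proposed syntax and not part of existing OCR-D tooling.

```python
from typing import List, NamedTuple


class StepSpec(NamedTuple):
    """Hypothetical description of one workflow step (illustration only)."""
    processor: str
    input_groups: List[str]
    output_groups: List[str]


def validate_workflow(steps: List[StepSpec], initial_groups: List[str]) -> List[str]:
    """Return a list of consistency errors; an empty list means the chain is valid."""
    available = set(initial_groups)
    errors: List[str] = []
    for number, step in enumerate(steps, start=1):
        for group in step.input_groups:
            if group not in available:
                errors.append(
                    f"step {number} ({step.processor}): "
                    f"input group {group!r} does not exist at this point"
                )
        for group in step.output_groups:
            if group in available:
                errors.append(
                    f"step {number} ({step.processor}): "
                    f"output group {group!r} already exists"
                )
        # Outputs become available to all later steps.
        available.update(step.output_groups)
    return errors
```

Because the check only needs the workflow description and the names of the groups already present, it could run before any processor is started.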

Requirements list

https://pad.gwdg.de/AosGiphcQoKKIqoRYBqK-A
