Highlight OCR #1580

@wgilling

Tesseract can be used to generate HOCR again, but displaying it raises several challenges.

This likely does not apply to objects viewed with the PDF.js viewer, because that viewer already handles search highlighting.

  1. Editing OCR would be much trickier, since the HOCR file might need to be updated as well.
  2. Displaying rectangles per search term was a much easier concept in HTML, using the CSS classes in the HOCR file. The challenge is how to use those rectangles to make overlays in the OpenSeadragon viewer.
  3. Make a corresponding action trigger that can generate HOCR for any objects that have already been ingested (similar to the "Index node in Fedora" action).
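
To make point 2 more concrete, here is a rough sketch of how the HOCR word boxes might be turned into overlay rectangles. It assumes Tesseract-style HOCR, where each `ocrx_word` span carries a `bbox x0 y0 x1 y1` entry in its `title` attribute and the `ocr_page` element carries the page bbox starting at `0 0`. It also assumes a single-image OpenSeadragon viewer, where the image width maps to a viewport width of 1 and both axes are scaled by the image width. The names `HocrWordParser`, `overlay_rects`, and `SAMPLE_HOCR` are made up for illustration and are not part of Islandora.

```python
import re
from html.parser import HTMLParser

# Matches the "bbox x0 y0 x1 y1" entry inside an HOCR title attribute.
BBOX_RE = re.compile(r"bbox\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) for each ocrx_word span, plus the page width."""

    def __init__(self):
        super().__init__()
        self.page_width = None
        self.words = []  # list of (text, (x0, y0, x1, y1)) in image pixels
        self._bbox = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        match = BBOX_RE.search(attrs.get("title") or "")
        if match is None:
            return
        bbox = tuple(int(v) for v in match.groups())
        if "ocr_page" in classes:
            # Assumption: the page bbox starts at 0 0, so x1 is the width.
            self.page_width = bbox[2]
        elif "ocrx_word" in classes:
            self._bbox = bbox
            self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: word spans do not contain nested <span> elements.
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def overlay_rects(hocr, term):
    """Return viewport-space rectangles for words containing `term`."""
    parser = HocrWordParser()
    parser.feed(hocr)
    if not parser.page_width:
        raise ValueError("no ocr_page bbox found in HOCR")
    width = float(parser.page_width)
    needle = term.casefold()
    rects = []
    for text, (x0, y0, x1, y1) in parser.words:
        if needle in text.casefold():
            # Both axes are divided by the image width (viewport convention).
            rects.append({
                "x": x0 / width,
                "y": y0 / width,
                "width": (x1 - x0) / width,
                "height": (y1 - y0) / width,
            })
    return rects


SAMPLE_HOCR = """
<div class='ocr_page' id='page_1' title='image "page.png"; bbox 0 0 1000 1500; ppageno 0'>
  <span class='ocr_line' title='bbox 100 200 600 250'>
    <span class='ocrx_word' title='bbox 100 200 300 250; x_wconf 96'>Islandora</span>
    <span class='ocrx_word' title='bbox 320 200 600 250; x_wconf 91'>repository</span>
  </span>
</div>
"""

if __name__ == "__main__":
    print(overlay_rects(SAMPLE_HOCR, "islandora"))
```

On the viewer side, each returned rectangle could then be passed to `viewer.addOverlay` as an `OpenSeadragon.Rect(x, y, width, height)` with a styled element. Multi-image or paged viewers would need a different coordinate conversion, so treat this only as a starting point.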

Labels: Stale, Type: enhancement (identifies work on an enhancement to the Islandora codebase)
