High latency exporting to hocr

Hello,

This issue follows up on Google Support Case 51622001 ("High latency exporting to hocr"), which was redirected here.

#### Details
The latency occurs in `export_hocr_str`, called on the toolbox `Document` wrapper (imported with `from google.cloud.documentai_toolbox import document as documentai_document_wrapper`):

```python
documentai_document_wrapper.Document.from_documentai_document(
    documentai_document=result.document
).export_hocr_str(title="title")
```

When the document contains tables, the export takes 30 to 50 seconds, depending on the complexity of the page (pages with a lot of tabular data are the slowest).

Any optimization of this step would be appreciated.

#### Environment details

  - OS type and version: GCP Cloud Shell
  - `google-cloud-documentai-toolbox` version: 0.13.3a0

#### Steps to reproduce

  1. Create a venv with the provided `requirements.txt`.
  2. Run `python3 main-hocr.py test.pdf`.

#### Code example

```python
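import time

from google.cloud import documentai
from google.cloud.documentai_toolbox import document as documentai_document_wrapper

# client, resource_name, raw_document and process_options are defined
# earlier in the full script (main-hocr.py in the attachment).
request = documentai.ProcessRequest(
    name=resource_name,
    raw_document=raw_document,
    process_options=process_options,
)

# Time the Document AI processing call.
start = time.time()
result = client.process_document(request=request)
print(f"process_document {time.time() - start}")

# Time wrapping the response in the toolbox Document.
start = time.time()
wrapped_document = documentai_document_wrapper.Document.from_documentai_document(
    documentai_document=result.document
)
print(f"wrapped_document {time.time() - start}")

# Time the hOCR export (the slow step).
start = time.time()
hocr_result = wrapped_document.export_hocr_str(title="hocr")
print(f"export_hocr_str {time.time() - start}")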
```

#### Stack trace
N/A. The execution completes correctly, but the export takes about 35 seconds.

Attached sources to reproduce the test:
[sources.zip](https://github.com/user-attachments/files/15807105/sources.zip)
* main-hocr.py, with the full code of the example
* requirements.txt
* test.pdf, the file to process with Document AI (OCR plus hOCR export)

Thanks!

