Google Vision

Description

The GoogleVision node connects to the Google Cloud Vision API to perform image and document analysis tasks such as Optical Character Recognition (OCR) and Object Detection.
It supports both local file OCR and Google Cloud Storage (GCS)-based document and image operations.

This node allows workflows to automatically extract text from images or PDFs, detect objects within images, and feed these results into downstream nodes like AI analyzers, Excel writers, or document classifiers.

How It Works

Validates the action type (Image OCR, Document OCR, or Object Detection).
Retrieves an OAuth access token from a configured Google Cloud third-party credential.
Invokes the appropriate Google Vision API endpoint depending on the selected action.
Returns extracted text, structured detection data, or GCS output references.
Logs the operation and makes results available for other workflow steps.
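
The steps above can be sketched as a small dispatcher. This is an illustration only, not the node's actual source: `run_google_vision`, `get_token`, and `handlers` are hypothetical names, the success message format is simplified, and the `Fail` branch assumes `out` is empty on error.

```python
from typing import Any, Callable, Dict

ACTIONS = (
    "Image OCR (local file)",
    "Document OCR (GCS pdf/tif)",
    "Object Detection (GCS image)",
)


def run_google_vision(
    action: str,
    get_token: Callable[[], str],
    handlers: Dict[str, Callable[[str], Any]],
) -> Dict[str, Any]:
    """Hypothetical node body: validate, authenticate, dispatch, wrap the result."""
    try:
        # Step 1: validate the action type.
        if action not in ACTIONS:
            raise ValueError(f"Unsupported action: {action!r}")
        # Step 2: obtain an OAuth access token from the stored credential.
        token = get_token()
        # Step 3: call the handler registered for the selected action.
        out = handlers[action](token)
        # Step 4: return the result in the node's output shape.
        return {
            "out": out,
            "taskMessage": f"{action} completed successfully",
            "statusReturn": "Completed",
        }
    except Exception as exc:
        # Any failure is reported as a "Fail" result (assumed shape).
        return {"out": None, "taskMessage": str(exc), "statusReturn": "Fail"}
```

Passing the token provider and handlers as arguments keeps the sketch runnable without network access; a real implementation would call the Vision API inside each handler.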

Supported Actions

| Action | Description |
| --- | --- |
| Image OCR (local file) | Extracts text from a local image (PNG, JPG, BMP, etc.) using Google Vision OCR. |
| Document OCR (GCS pdf/tif) | Processes multi-page PDF or TIFF documents stored in Google Cloud Storage and exports text to a GCS bucket. |
| Object Detection (GCS image) | Detects and labels objects within a GCS-hosted image, including confidence scores. |
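
For orientation, the sketch below pairs each action with the Cloud Vision REST method and feature type that provides the matching capability. The method paths and feature names come from the public Vision API; whether the node calls exactly these methods is an assumption, and `VisionCall`, `ACTION_TO_CALL`, and `endpoint_url` are illustrative names.

```python
from dataclasses import dataclass

VISION_BASE = "https://vision.googleapis.com/v1"


@dataclass(frozen=True)
class VisionCall:
    endpoint: str       # REST method path under VISION_BASE
    feature_type: str   # Vision feature requested in the call
    asynchronous: bool  # True when results are written to GCS later


# Assumed mapping from node actions to Vision API calls.
ACTION_TO_CALL = {
    "Image OCR (local file)": VisionCall(
        "images:annotate", "TEXT_DETECTION", False
    ),
    "Document OCR (GCS pdf/tif)": VisionCall(
        "files:asyncBatchAnnotate", "DOCUMENT_TEXT_DETECTION", True
    ),
    "Object Detection (GCS image)": VisionCall(
        "images:annotate", "OBJECT_LOCALIZATION", False
    ),
}


def endpoint_url(action: str) -> str:
    """Return the full REST URL for the given node action."""
    return f"{VISION_BASE}/{ACTION_TO_CALL[action].endpoint}"
```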

Input Fields

| Field | Type | Description | Required |
| --- | --- | --- | --- |
| ThirdParty - Google Cloud | Third-Party Token | Reference to a valid Google Cloud OAuth credential stored in MinuteView. | ✅ |
| Action | Picklist | The analysis type to perform: `Image OCR (local file)`, `Document OCR (GCS pdf/tif)`, or `Object Detection (GCS image)`. | ✅ |

When Action = "Image OCR (local file)"

| Field | Type | Description | Required |
| --- | --- | --- | --- |
| Local Image Path | Text | Full path to the image file on the automation server. | ✅ |
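
A local image has to be sent inline, since the Vision API cannot read the automation server's disk. The sketch below shows one way to turn `Local Image Path` into an `images:annotate` request body with base64-encoded content. The request shape follows the public Vision REST API; `build_image_ocr_request` is a hypothetical helper, and the choice of `TEXT_DETECTION` is an assumption about what the node requests.

```python
import base64
from pathlib import Path


def build_image_ocr_request(local_image_path: str) -> dict:
    """Build an images:annotate body for a file on the local disk."""
    # Raises FileNotFoundError if the path does not exist.
    image_bytes = Path(local_image_path).read_bytes()
    return {
        "requests": [
            {
                # The REST API expects image bytes as a base64 string.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
```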

When Action = "Document OCR (GCS pdf/tif)"

| Field | Type | Description | Required |
| --- | --- | --- | --- |
| GCS Source URI | Text | Source file URI (e.g. `gs://bucket/path/document.pdf`). | ✅ |
| GCS Destination Bucket | Text | Destination bucket where OCR JSON will be written. | ✅ |
| GCS Destination Prefix | Text | Destination prefix (folder) for OCR results. | ✅ |
| MIME Type Override | Text | (Optional) Custom MIME type for special formats. | ❌ |
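
To show how these fields could fit together, the sketch below assembles a `files:asyncBatchAnnotate` request body. The body layout (`inputConfig`, `features`, `outputConfig`) follows the public Vision REST API. Everything else is an assumption: `build_document_ocr_request` and `MIME_BY_EXTENSION` are hypothetical names, and inferring the MIME type from the file extension when no override is given is a guess at sensible behavior, not documented node logic.

```python
import posixpath
from typing import Optional

# Assumed defaults when MIME Type Override is left empty.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def build_document_ocr_request(
    source_uri: str,
    dest_bucket: str,
    dest_prefix: str,
    mime_override: Optional[str] = None,
) -> dict:
    """Build a files:asyncBatchAnnotate body from the node's input fields."""
    if not source_uri.startswith("gs://"):
        raise ValueError(f"GCS Source URI must start with gs://, got {source_uri!r}")

    # Use the override when given; otherwise infer the type from the extension.
    mime_type = mime_override or MIME_BY_EXTENSION.get(
        posixpath.splitext(source_uri)[1].lower()
    )
    if mime_type is None:
        raise ValueError(f"Cannot infer MIME type for {source_uri!r}")

    # Accept the bucket with or without a gs:// prefix.
    bucket = dest_bucket[5:] if dest_bucket.startswith("gs://") else dest_bucket
    destination = f"gs://{bucket.strip('/')}/{dest_prefix.strip('/')}/"

    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {"gcsDestination": {"uri": destination}},
            }
        ]
    }
```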

When Action = "Object Detection (GCS image)"

| Field | Type | Description | Required |
| --- | --- | --- | --- |
| GCS Image URI | Text | URI of the image stored in GCS (e.g. `gs://bucket/images/photo.jpg`). | ✅ |
| Confidence Threshold | Number | Minimum detection confidence between `0.0` and `1.0`. Default = 0.0. | ❌ |
| Max Results | Number | Maximum number of detected objects to return. Default = all. | ❌ |
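
The effect of `Confidence Threshold` and `Max Results` can be pictured as a filter over the detected objects. The sketch below is one plausible reading, not the node's confirmed logic: it assumes each detection carries `name` and `score` keys (as in the Vision API's `localizedObjectAnnotations`), keeps scores at or above the threshold, and returns the highest-scoring entries first. `filter_objects` is a hypothetical name.

```python
from typing import Dict, List, Optional


def filter_objects(
    annotations: List[Dict],
    confidence_threshold: float = 0.0,
    max_results: Optional[int] = None,
) -> List[Dict]:
    """Keep detections at or above the threshold, best score first."""
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("Confidence Threshold must be between 0.0 and 1.0")

    kept = [
        {"name": item["name"], "score": item["score"]}
        for item in annotations
        if item.get("score", 0.0) >= confidence_threshold
    ]
    kept.sort(key=lambda obj: obj["score"], reverse=True)

    # max_results=None mirrors the documented default of returning all objects.
    return kept if max_results is None else kept[:max_results]
```

With the defaults (`0.0` and no limit), every detection is returned, which matches the defaults listed in the table.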

Output Data

| Output Variable | Type | Description |
| --- | --- | --- |
| out | Object / String | The raw or structured output of the selected action. |
| taskMessage | String | Message describing the outcome. |
| statusReturn | String | `Completed` on success or `Fail` on error. |

Example Outputs

🧾 Image OCR (local file)

```json
{
  "out": "Valve No. 204-B\nPressure Rating: 25 bar\nLast Service: 2024-08-05",
  "taskMessage": "Image OCR (local file) completed successfully",
  "statusReturn": "Completed"
}
```

📑 Document OCR (GCS pdf/tif)

```json
{
  "out": {
    "Source": "gs://engineering-docs/invoices/invoice123.pdf",
    "Output": "gs://engineering-docs/ocr-results/invoice123/",
    "Status": "QueuedOrCompleted",
    "Result": "Operation-123456789"
  },
  "taskMessage": "Document OCR request submitted successfully",
  "statusReturn": "Completed"
}
```

🧠 Object Detection (GCS image)

```json
{
  "out": {
    "Image": "gs://project-assets/inspection/site_photo.jpg",
    "MinScore": 0.6,
    "MaxResults": 10,
    "Objects": [
      { "name": "Hardhat", "score": 0.92 },
      { "name": "Person", "score": 0.87 },
      { "name": "Excavator", "score": 0.85 }
    ]
  },
  "taskMessage": "Object Detection completed successfully",
  "statusReturn": "Completed"
}
```
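
Because `out` is a string for Image OCR and an object for the other two actions, a downstream step has to branch on its shape. The sketch below shows one way to do that against the example payloads above. `summarize_result` is a hypothetical helper, and it assumes the node result is available as a JSON string.

```python
import json


def summarize_result(result_json: str) -> str:
    """Return a one-line summary of a GoogleVision node result."""
    result = json.loads(result_json)
    if result.get("statusReturn") != "Completed":
        # Surface the node's own message when the task failed.
        raise RuntimeError(result.get("taskMessage", "GoogleVision task failed"))

    out = result["out"]
    if isinstance(out, str):
        # Image OCR: out is the extracted text itself.
        return out
    if "Objects" in out:
        # Object Detection: list each label with its score.
        return ", ".join(f'{o["name"]} ({o["score"]:.2f})' for o in out["Objects"])
    # Document OCR: out points at the GCS location of the results.
    return out["Output"]
```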

Example Workflow Use Cases

| Scenario | Description |
| --- | --- |
| 🔎 Drawing OCR | Extract text and dimensions from scanned PDFs or TIFF drawings. |
| 📄 Document Digitization | Read legacy engineering documents and export OCR text into databases. |
| 🧰 Object Recognition | Automatically tag and classify images (e.g., identify equipment, safety gear, or site conditions). |
| 🧾 Invoice OCR Pipeline | Read invoice PDFs from a GCS bucket, parse text via OCR, and load results into SharePoint or SQL. |

Task Flow Summary

| Step | Action |
| --- | --- |
| 1 | Validates the selected action type. |
| 2 | Retrieves Google Cloud third-party credentials. |
| 3 | Executes the appropriate Vision API function. |
| 4 | Processes and filters the response. |
| 5 | Returns structured results or GCS reference URIs. |

Notes

The node supports both local file-based and Google Cloud Storage-based operations.
Returned data can be passed into subsequent workflow nodes for analysis, classification, or AI processing.
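Document OCR operations write their JSON results to the GCS destination path; the node's `out` value returns references to the source, the output location, and the operation rather than the extracted text itself.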
Object detection output can be filtered using Confidence Threshold and Max Results.
The node relies on the Google Cloud Vision API under your provided credentials and permissions.
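
Since Document OCR results are written to GCS as JSON files, a later step has to read those files to get the text. The sketch below assumes the file content has already been downloaded into a string and that it follows the Vision API's asynchronous output layout: a `responses` array whose entries carry `fullTextAnnotation.text` and `context.pageNumber`. `pages_from_ocr_output` is an illustrative name.

```python
import json
from typing import List, Optional, Tuple


def pages_from_ocr_output(output_json: str) -> List[Tuple[Optional[int], str]]:
    """Extract (page number, text) pairs from one OCR output file."""
    document = json.loads(output_json)
    pages = []
    for response in document.get("responses", []):
        # Pages without recognized text have no fullTextAnnotation.
        text = response.get("fullTextAnnotation", {}).get("text", "")
        page_number = response.get("context", {}).get("pageNumber")
        pages.append((page_number, text))
    return pages
```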

Error Handling

If the task fails, the node logs the error, returns a message describing it in `taskMessage`, and sets `statusReturn` to `Fail`. Common error causes include:

Missing or invalid Google Cloud token
Incorrect GCS bucket URI or permissions
Unsupported file format
Missing required fields (e.g., file path or GCS prefix)
API or network error from Google Cloud services
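
Several of these causes can be caught before any API call is made. The sketch below is a hypothetical pre-flight check, not part of the node: it uses the required fields listed under Input Fields and a deliberately loose `gs://bucket/object` pattern, so it will not catch every invalid bucket name. `REQUIRED_FIELDS`, `GCS_URI`, and `find_input_errors` are illustrative names.

```python
import re
from typing import Dict, List

REQUIRED_FIELDS = {
    "Image OCR (local file)": ["Local Image Path"],
    "Document OCR (GCS pdf/tif)": [
        "GCS Source URI",
        "GCS Destination Bucket",
        "GCS Destination Prefix",
    ],
    "Object Detection (GCS image)": ["GCS Image URI"],
}

# Loose shape check only: gs://<bucket>/<object path>
GCS_URI = re.compile(r"^gs://[^/]+/.+$")


def find_input_errors(action: str, fields: Dict[str, str]) -> List[str]:
    """Return human-readable problems with the inputs; empty means OK."""
    if action not in REQUIRED_FIELDS:
        return [f"Unsupported action: {action}"]

    required = REQUIRED_FIELDS[action]
    errors = [
        f"Missing required field: {name}"
        for name in required
        if not str(fields.get(name) or "").strip()
    ]
    # Check the URI shape of whichever GCS input this action uses.
    for name in ("GCS Source URI", "GCS Image URI"):
        value = fields.get(name)
        if name in required and value and not GCS_URI.match(value):
            errors.append(f"Invalid GCS URI in {name}: {value}")
    return errors
```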

Example Workflow Integration

```mermaid
graph LR
    A[Get File From SharePoint] --> B["GoogleVision (Image OCR)"]
    B --> C[Extract Keywords]
    C --> D[Add Metadata to Vault]
```

Category: AI & Google Cloud

Task Name: GoogleVision
