Google Vision

Performs image and document analysis using the Google Cloud Vision API. It supports OCR on local image files, asynchronous document OCR on files stored in Google Cloud Storage (GCS), and object detection on GCS-hosted images.

Purpose

Use this task when a workflow needs to extract text from scanned images or multi-page documents, or to detect and label objects within images. The Action dropdown selects which Vision capability to invoke. Local image OCR takes only a file path. Document OCR takes a GCS source URI together with a destination bucket and prefix, and object detection takes a GCS image URI. All modes require a configured Google Cloud Platform third-party account. Output variables differ by action and are listed below.

Inputs

Field	Type	Required	Description
Action	Dropdown	Yes	The Vision operation to perform. Options: `Image OCR (local file)`, `Document OCR (GCS pdf/tif)`, `Object Detection (GCS image)`.
Local Image Path	Text	No	The absolute local path to the image file to process with OCR. Supported formats include PNG, JPG, and BMP.
GCS Source URI	Text	No	The GCS URI of the PDF or TIFF document to process, for example `gs://my-bucket/docs/file.pdf`.
GCS Destination Bucket	Text	No	The name of the GCS bucket where OCR result JSON will be written.
GCS Destination Prefix	Text	No	The folder prefix within the destination bucket for OCR output files.
MIME Type Override	Text	No	An optional MIME type to use when the file extension is ambiguous, for example `application/pdf` or `image/tiff`.
GCS Image URI	Text	No	The GCS URI of the image to analyse for object detection, for example `gs://my-bucket/images/photo.jpg`.
Confidence Threshold	Text	No	The minimum detection confidence score to include in results. Accepts a value between `0.0` and `1.0`. Defaults to `0.0`.
Max Results	Text	No	The maximum number of detected objects to return. Defaults to all results.
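
Because Confidence Threshold and Max Results are entered as text, a workflow that prepares these values upstream may want to check them before they reach the task. The following is a minimal sketch of such a check, assuming the defaults and range given in the table above. The function names `parse_threshold` and `parse_max_results` are hypothetical and are not part of the task itself.

```python
from typing import Optional


def parse_threshold(raw: Optional[str]) -> float:
    """Parse a Confidence Threshold text value; blank falls back to 0.0."""
    if raw is None or raw.strip() == "":
        return 0.0
    value = float(raw)  # raises ValueError on non-numeric text
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"threshold must be between 0.0 and 1.0, got {value}")
    return value


def parse_max_results(raw: Optional[str]) -> Optional[int]:
    """Parse a Max Results text value; blank means no limit (None)."""
    if raw is None or raw.strip() == "":
        return None
    value = int(raw)  # raises ValueError on non-integer text
    if value < 1:
        raise ValueError(f"max results must be at least 1, got {value}")
    return value
```

Returning `None` for a blank Max Results keeps "no limit" distinct from any numeric value, which mirrors the documented default of returning all results.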

Visibility Rules

Local Image Path is only shown when Action is set to Image OCR (local file).

GCS Source URI is only shown when Action is set to Document OCR (GCS pdf/tif).

GCS Destination Bucket is only shown when Action is set to Document OCR (GCS pdf/tif).

GCS Destination Prefix is only shown when Action is set to Document OCR (GCS pdf/tif).

MIME Type Override is only shown when Action is set to Document OCR (GCS pdf/tif).

GCS Image URI is only shown when Action is set to Object Detection (GCS image).

Confidence Threshold is only shown when Action is set to Object Detection (GCS image).

Max Results is only shown when Action is set to Object Detection (GCS image).
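
The visibility rules above amount to a lookup from the selected action to the fields it exposes. As an illustration only, they can be summarised in a small table like the one below. The names `VISIBLE_FIELDS` and `visible_fields` are hypothetical, and the sketch assumes the Action field itself is always shown.

```python
from typing import Dict, List

# Fields shown for each Action value, taken from the rules listed above.
VISIBLE_FIELDS: Dict[str, List[str]] = {
    "Image OCR (local file)": ["Local Image Path"],
    "Document OCR (GCS pdf/tif)": [
        "GCS Source URI",
        "GCS Destination Bucket",
        "GCS Destination Prefix",
        "MIME Type Override",
    ],
    "Object Detection (GCS image)": [
        "GCS Image URI",
        "Confidence Threshold",
        "Max Results",
    ],
}


def visible_fields(action: str) -> List[str]:
    """Return the fields shown for an action (Action is assumed always visible)."""
    return ["Action"] + VISIBLE_FIELDS[action]
```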

Operations

Operation	Description
Image OCR (local file)	Reads a local image file, encodes it, and submits it to the Vision API to extract all text. Returns the extracted text as a string.
Document OCR (GCS pdf/tif)	Submits a multi-page PDF or TIFF stored in GCS to the Vision API for asynchronous document OCR. Google writes JSON result files to the specified destination bucket and prefix.
Object Detection (GCS image)	Analyses a GCS-hosted image for recognisable objects and returns a list of detected items with confidence scores, filtered by the optional threshold and result count limit.

Outputs

The output variables produced depend on the selected action.

For Image OCR (local file):

Name	Description
OCR Text	The full text extracted from the image by the Vision API.
Source Image	The local image path that was processed.
Action	The action that was performed.
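
This page does not describe how the task encodes and submits the image. For orientation, the sketch below shows one way an equivalent request could be built for the public Vision REST method `images:annotate`, which accepts base64-encoded image content. Whether the task uses `TEXT_DETECTION` or `DOCUMENT_TEXT_DETECTION` is an assumption here, and the helper names `build_ocr_request` and `extract_text` are hypothetical. No network call is made.

```python
import base64


def build_ocr_request(image_bytes: bytes) -> dict:
    """Build an images:annotate request body for OCR on raw image bytes."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                # Assumption: plain text detection rather than document mode.
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Return the full text from an images:annotate response, or '' if none."""
    first = response["responses"][0]
    return first.get("fullTextAnnotation", {}).get("text", "")
```

In a real call, the image bytes would come from the file at Local Image Path, and the string returned by `extract_text` corresponds to what this task exposes as OCR Text.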

For Document OCR (GCS pdf/tif):

Name	Description
Source URI	The GCS source URI of the document that was processed.
Output URI	The GCS URI prefix where OCR result JSON files were written.
Destination Bucket	The destination bucket name.
Destination Prefix	The destination prefix used for OCR output.
Result	The raw result or operation reference returned by the Vision API.
Action	The action that was performed.
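
How the task combines the destination bucket and prefix into the Output URI is not stated on this page. The sketch below assumes the form `gs://<bucket>/<prefix>/` and shows how a matching request body for the public Vision REST method `files:asyncBatchAnnotate` could be assembled. The helper names and the extension-to-MIME mapping are hypothetical, and no network call is made.

```python
from typing import Optional

# Assumed mapping for the two document formats this action accepts.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def resolve_mime_type(source_uri: str, override: Optional[str] = None) -> str:
    """Use the override if given, otherwise infer from the file extension."""
    if override:
        return override
    lowered = source_uri.lower()
    for extension, mime in MIME_BY_EXTENSION.items():
        if lowered.endswith(extension):
            return mime
    raise ValueError(f"cannot infer MIME type for {source_uri}; supply an override")


def build_output_uri(bucket: str, prefix: str) -> str:
    """Join bucket and prefix into a gs:// URI prefix ending in a slash."""
    cleaned = prefix.strip("/")
    return f"gs://{bucket}/{cleaned}/" if cleaned else f"gs://{bucket}/"


def build_document_ocr_request(
    source_uri: str, bucket: str, prefix: str, mime_override: Optional[str] = None
) -> dict:
    """Build a files:asyncBatchAnnotate request body for document OCR."""
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": resolve_mime_type(source_uri, mime_override),
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": build_output_uri(bucket, prefix)}
                },
            }
        ]
    }
```

The asynchronous method returns a long-running operation reference rather than text, which is consistent with the Result output above. The OCR text itself has to be read from the JSON files written under the Output URI.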

For Object Detection (GCS image):

Name	Description
Image URI	The GCS URI of the image that was analysed.
Min Score	The confidence threshold that was applied to filter results.
Max Results	The maximum result count that was applied, or `Unlimited` if not set.
Detected Objects	A list of detected objects with their labels and confidence scores.
Object Count	The total number of objects returned after filtering.
Action	The action that was performed.
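
To make the relationship between Min Score, Max Results, Detected Objects, and Object Count concrete, the sketch below filters a list shaped like the `localizedObjectAnnotations` array in a Vision API response, where each entry has a `name` and a `score`. The function name `filter_objects` is hypothetical. Sorting by score and applying the limit after the threshold are assumptions, since this page does not specify the order in which the task applies them.

```python
from typing import Dict, List, Optional


def filter_objects(
    annotations: List[Dict],
    min_score: float = 0.0,
    max_results: Optional[int] = None,
) -> List[Dict]:
    """Keep objects at or above min_score, highest score first, up to max_results."""
    kept = [
        {"label": item["name"], "score": item["score"]}
        for item in annotations
        if item["score"] >= min_score
    ]
    # Assumption: the most confident detections are returned first.
    kept.sort(key=lambda obj: obj["score"], reverse=True)
    if max_results is not None:
        kept = kept[:max_results]
    return kept


sample = [
    {"name": "Bicycle", "score": 0.91},
    {"name": "Wheel", "score": 0.42},
    {"name": "Person", "score": 0.78},
]
detected = filter_objects(sample, min_score=0.5, max_results=1)
object_count = len(detected)  # corresponds to the Object Count output
```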
