Performs image and document analysis using the Google Cloud Vision API. It supports OCR on local image files, asynchronous document OCR on files in Google Cloud Storage (GCS), and object detection on GCS-hosted images.
Use this task when a workflow needs to extract text from scanned images or multi-page documents, or to detect and label objects within images. The Action dropdown selects the Vision capability to invoke. Image OCR takes only a local file path. Document OCR takes a GCS source URI plus a destination bucket and prefix for the results. Object Detection takes a GCS image URI, with an optional confidence threshold and result limit. All modes require a Google Cloud Platform third-party account to be configured. Output variables differ by action and are listed below.
| Field | Type | Required | Description |
|---|---|---|---|
| Action | Dropdown | Yes | The Vision operation to perform. Options: Image OCR (local file), Document OCR (GCS pdf/tif), Object Detection (GCS image). |
| Local Image Path | Text | No | The absolute local path to the image file to process with OCR. Supported formats include PNG, JPG, and BMP. |
| GCS Source URI | Text | No | The GCS URI of the PDF or TIFF document to process, for example gs://my-bucket/docs/file.pdf. |
| GCS Destination Bucket | Text | No | The name of the GCS bucket where OCR result JSON will be written. |
| GCS Destination Prefix | Text | No | The folder prefix within the destination bucket for OCR output files. |
| MIME Type Override | Text | No | An optional MIME type to use when the file extension is ambiguous, for example application/pdf or image/tiff. |
| GCS Image URI | Text | No | The GCS URI of the image to analyse for object detection, for example gs://my-bucket/images/photo.jpg. |
| Confidence Threshold | Text | No | The minimum detection confidence score to include in results. Accepts a value between 0.0 and 1.0. Defaults to 0.0. |
| Max Results | Text | No | The maximum number of detected objects to return. If left blank, all results are returned. |
Local Image Path is only shown when Action is set to Image OCR (local file).
GCS Source URI, GCS Destination Bucket, GCS Destination Prefix, and MIME Type Override are only shown when Action is set to Document OCR (GCS pdf/tif).
GCS Image URI, Confidence Threshold, and Max Results are only shown when Action is set to Object Detection (GCS image).
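The visibility rules above amount to a lookup from action to fields. The following is a minimal sketch of that lookup in Python; the names `FIELDS_BY_ACTION` and `visible_fields` are hypothetical and only illustrate the rules, not how the task is implemented.

```python
# Hypothetical mapping of each action to the fields shown for it,
# transcribed from the visibility rules in this document.
FIELDS_BY_ACTION = {
    "Image OCR (local file)": ["Local Image Path"],
    "Document OCR (GCS pdf/tif)": [
        "GCS Source URI",
        "GCS Destination Bucket",
        "GCS Destination Prefix",
        "MIME Type Override",
    ],
    "Object Detection (GCS image)": [
        "GCS Image URI",
        "Confidence Threshold",
        "Max Results",
    ],
}


def visible_fields(action: str) -> list[str]:
    """Return the fields shown for an action (Action itself is always shown)."""
    if action not in FIELDS_BY_ACTION:
        raise ValueError(f"Unknown action: {action!r}")
    return ["Action"] + FIELDS_BY_ACTION[action]
```

For example, `visible_fields("Image OCR (local file)")` yields only Action and Local Image Path, which matches the statement that local image OCR needs nothing beyond a file path.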
| Operation | Description |
|---|---|
| Image OCR (local file) | Reads a local image file, encodes it, and submits it to the Vision API to extract all text. Returns the extracted text as a string. |
| Document OCR (GCS pdf/tif) | Submits a multi-page PDF or TIFF stored in GCS to the Vision API for asynchronous document OCR. Google writes JSON result files to the specified destination bucket and prefix. |
| Object Detection (GCS image) | Analyses a GCS-hosted image for recognisable objects and returns a list of detected items with confidence scores, filtered by the optional threshold and result count limit. |
The output variables produced depend on the selected action.
For Image OCR (local file):
| Name | Description |
|---|---|
| OCR Text | The full text extracted from the image by the Vision API. |
| Source Image | The local image path that was processed. |
| Action | The action that was performed. |
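To illustrate what "reads a local image file, encodes it, and submits it" can look like, here is a minimal sketch that builds a request body for the public Vision REST method `images:annotate` and pulls the text out of a response. It assumes base64 encoding and the `TEXT_DETECTION` feature; the task's actual encoding and feature choice are not stated in this document, and the function names are hypothetical. No network call is made.

```python
import base64
from pathlib import Path


def build_image_ocr_request(image_path: str) -> dict:
    """Build an images:annotate request body for a local image (assumed shape)."""
    image_bytes = Path(image_path).read_bytes()
    # The REST API expects inline image bytes as a base64 string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "requests": [
            {
                "image": {"content": encoded},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Return the full extracted text, or an empty string if none is present."""
    responses = response.get("responses") or [{}]
    return responses[0].get("fullTextAnnotation", {}).get("text", "")
```

The value returned by `extract_text` corresponds to what the OCR Text output variable is described as holding: the full text extracted from the image.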
For Document OCR (GCS pdf/tif):
| Name | Description |
|---|---|
| Source URI | The GCS source URI of the document that was processed. |
| Output URI | The GCS URI prefix where OCR result JSON files were written. |
| Destination Bucket | The destination bucket name. |
| Destination Prefix | The destination prefix used for OCR output. |
| Result | The raw result or operation reference returned by the Vision API. |
| Action | The action that was performed. |
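The Document OCR inputs map fairly directly onto the public Vision REST method `files:asyncBatchAnnotate`. The sketch below builds such a request body, under several assumptions: the output URI is composed as `gs://<bucket>/<prefix>/`, the MIME type is inferred from the file extension unless an override is given, and the `DOCUMENT_TEXT_DETECTION` feature is used. The function name and the extension table are hypothetical, and the task may compose these values differently.

```python
from pathlib import PurePosixPath

# Assumed extension-to-MIME mapping for the two documented input types.
MIME_BY_EXTENSION = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
}


def build_document_ocr_request(source_uri, dest_bucket, dest_prefix="",
                               mime_override=None):
    """Build a files:asyncBatchAnnotate request body (assumed shape)."""
    if not source_uri.startswith("gs://"):
        raise ValueError("GCS Source URI must start with gs://")
    # Prefer the explicit override; otherwise infer from the extension.
    suffix = PurePosixPath(source_uri).suffix.lower()
    mime_type = mime_override or MIME_BY_EXTENSION.get(suffix)
    if mime_type is None:
        raise ValueError("Cannot infer MIME type; set MIME Type Override")
    # Assumption: output URI is bucket plus prefix, with a trailing slash.
    prefix = dest_prefix.strip("/")
    output_uri = f"gs://{dest_bucket}/{prefix}/" if prefix else f"gs://{dest_bucket}/"
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": source_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {"gcsDestination": {"uri": output_uri}},
            }
        ]
    }
```

Because the operation is asynchronous, a request like this returns an operation reference rather than the text itself, and the JSON results appear under the output URI once processing finishes. That is consistent with the Result and Output URI variables listed above.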
For Object Detection (GCS image):
| Name | Description |
|---|---|
| Image URI | The GCS URI of the image that was analysed. |
| Min Score | The confidence threshold that was applied to filter results. |
| Max Results | The maximum result count that was applied, or Unlimited if not set. |
| Detected Objects | A list of detected objects with their labels and confidence scores. |
| Object Count | The total number of objects returned after filtering. |
| Action | The action that was performed. |
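The effect of Confidence Threshold and Max Results on the Detected Objects and Object Count outputs can be sketched as a simple filter. This example assumes each detection is a dictionary with `name` and `score` keys, as in the `localizedObjectAnnotations` entries of a Vision REST response, and that results are ordered by descending score before the limit is applied. That ordering is an assumption, and `filter_objects` is a hypothetical helper.

```python
def filter_objects(annotations, min_score=0.0, max_results=None):
    """Keep detections at or above min_score, capped at max_results."""
    if not 0.0 <= min_score <= 1.0:
        raise ValueError("Confidence Threshold must be between 0.0 and 1.0")
    kept = [a for a in annotations if a.get("score", 0.0) >= min_score]
    # Assumption: highest-confidence detections are kept first.
    kept.sort(key=lambda a: a.get("score", 0.0), reverse=True)
    if max_results is not None:
        kept = kept[:max_results]
    return kept


detections = [
    {"name": "Dog", "score": 0.91},
    {"name": "Ball", "score": 0.42},
    {"name": "Tree", "score": 0.77},
]
filtered = filter_objects(detections, min_score=0.5, max_results=1)
object_count = len(filtered)
```

With a threshold of 0.5 and a limit of 1, only the "Dog" detection remains, so the object count is 1. With the defaults (threshold 0.0, no limit), all three detections are returned.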