AlekSystem Workflow Detail

AI Analysis Workflow for Images & Extract Text with GPT-4o Vision and Telegram

Analyze Images & Extract Text with GPT-4o Vision and Telegram

Who’s it for Teams and makers who want a plug-and-play vision bot: users send a photo in Telegram, the bot returns a concise description plus OCR text.

Rank 51 Verified workflow

Workflow overview

Why this workflow matters

Supports knowledge capture and document intelligence use cases.

Who’s it for Teams and makers who want a plug-and-play vision bot: users send a photo in Telegram, the bot returns a concise description plus OCR text. No custom servers required—just AlekSystem, a Telegram bot, and an AIMLAPI key. What it does / How it works The workflow listens for new Telegram messages, fetches the highest-resolution photo, converts it to base64, normalizes the MIME type, and calls AIMLAPI (GPT-4o Vision) via the HTTP Request node using the OpenAI-compatible messages format with an image_url data URI. The model returns a short caption and extracted text. The answer is sent back to the same Telegram chat. Requirements AlekSystem instance (self-hosted or cloud) Telegram bot token (from @BotFather) AIMLAPI account and API key (OpenAI-compatible endpoint) How to set up Create a Telegram bot with @BotFather and copy the token. In AlekSystem, add Telegram credentials (no hardcoded tokens in nodes). Add AIMLAPI credentials with your API key (base URL: https://api.aimlapi.com/v1). Import the workflow JSON and connect credentials in the nodes. Execute the trigger and send a photo to your bot to test. How to customize the workflow Modify the vision prompt (e.g., add brand, language, or formatting rules). Switch models within AIMLAPI (any vision-capable model using the same messages schema). Add an IF branch for text-only messages (reply with guidance). Log usage to Google Sheets or a database (user id, file id, response). Add rate limits, user allowlists, or Markdown formatting in Telegram responses. Increase timeouts/retries in the HTTP Request node for long-running images.

Best fit

Categories

AI/MLCommunicationDocument Ops

Services

Telegram

Use cases

document intelligence