Workflow overview
Why this workflow matters
Relevant for managed services and support workflows. Supports knowledge capture and document intelligence use cases.
Turn large, unstructured documents into LLM-ready markdown using LandingAI Document Parsing. This workflow automatically watches a Google Drive folder, submits new documents to Landing.ai for parsing, caches processed files in Supabase to avoid reprocessing, and reliably polls for results with retry and timeout handling.

Use Cases
- Automated document ingestion for RAG pipelines
- Invoice, contract, or report parsing
- AI-powered document analysis workflows
- Knowledge base ingestion from Google Drive
- Preventing duplicate document processing in ETL pipelines

External services
- Google Drive
- Landing.ai
- Supabase

Credentials Required
- Google Drive OAuth2
- Landing.ai API (HTTP Bearer Token)
- Supabase API

How it works
Once a PDF lands in the configured Google Drive folder, the workflow triggers and converts it (even documents of more than 200 pages) into LLM-ready markdown. It also checks the database to see whether the document has already been parsed, which avoids unnecessary Landing.ai API calls.

Setup Instructions
Step 1: Google Drive
- Create or select a folder in Google Drive
- Copy the folder ID
- Update the Google Drive Trigger node with this folder ID
Step 2: Landing.ai
- Create a Landing.ai account
- Generate an API key
- Add it in AlekSystem as an HTTP Bearer Auth credential
- Update the organization-id header if required
Step 3: Supabase
- Create a Supabase project
- Create a table named landing_parse_cache
- Add fields such as: file_id, document_name, mime_type, file_size_bytes, job_id, job_status, markdown, uploaded_at, workflow_run_id
- Connect Supabase credentials in AlekSystem

Expected Input
A document uploaded into the configured Google Drive folder (PDF, DOCX, or other supported formats)

Expected Output
- Parsed markdown content stored in Supabase
- Metadata including: File ID, File name, MIME type, File size, Job ID, Processing status
- Early exit if the document already exists in cache

Error Handling & Edge Cases
- Cache check to prevent duplicate processing
- Retry-based polling for async job completion
- Timeout detection for stuck jobs
- Large file output URL handling
- Detailed logging for debugging and audits

Customization Ideas
- Push parsed output to a vector database
- Trigger Slack or email notifications
- Store results in cloud storage (S3, GCS)
- Extend into a RAG or AI agent pipeline

Categories
- Document Processing
- AI & LLM
- Knowledge Management
- Automation

Difficulty Level
Advanced

Happy Automating - from Alok
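
The cache check and retry-based polling described above can be sketched in Python as follows. This is only an illustration of the control flow, not the workflow's actual nodes: the `submit` and `fetch_status` callables are hypothetical stand-ins for the Landing.ai HTTP requests, the status strings "completed" and "failed" and the response keys "status" and "markdown" are assumptions, and a plain dict stands in for the landing_parse_cache table in Supabase.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CacheRow:
    """A subset of the landing_parse_cache fields listed in the setup steps."""
    file_id: str
    document_name: str
    job_id: str
    job_status: str
    markdown: Optional[str] = None


class JobTimeoutError(Exception):
    """Raised when a parse job is still pending after the last attempt."""


def poll_until_done(
    fetch_status: Callable[[str], dict],  # hypothetical: returns {"status": ..., "markdown": ...}
    job_id: str,
    max_attempts: int = 10,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status(job_id) until the job completes, fails, or times out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch_status(job_id)
        status = result.get("status")  # assumed status values, see lead-in
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if attempt < max_attempts:
            sleep(delay_seconds)  # wait before the next retry
    raise JobTimeoutError(
        f"job {job_id} still pending after {max_attempts} attempts"
    )


def process_file(
    file_id: str,
    document_name: str,
    cache: Dict[str, CacheRow],           # stand-in for the Supabase table
    submit: Callable[[str], str],         # hypothetical: submits a file, returns a job ID
    fetch_status: Callable[[str], dict],
    sleep: Callable[[float], None] = time.sleep,
) -> CacheRow:
    """Return the cached row if present; otherwise submit, poll, and cache."""
    cached = cache.get(file_id)
    if cached is not None and cached.job_status == "completed":
        return cached  # early exit: no parsing call is made
    job_id = submit(file_id)
    result = poll_until_done(fetch_status, job_id, sleep=sleep)
    row = CacheRow(file_id, document_name, job_id, "completed",
                   result.get("markdown"))
    cache[file_id] = row
    return row
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays. In a real setup, the dict lookup would be replaced by a query against landing_parse_cache keyed on file_id, and the two callables by authenticated requests to Landing.ai.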