Workflow overview
Why this workflow matters
Relevant for managed services and support workflows. Supports knowledge capture and document intelligence use cases.
KB Builder โ Historical Emails AlekSystem Workflow Template ๐ Description This workflow automates the process of building a structured Knowledge Base (KB) from your existing Gmail inbox by processing historical email threads, extracting customer-support conversation pairs, classifying them with AI, generating vector embeddings, and storing everything in a PostgreSQL database โ ready to power AI-assisted draft generation in downstream workflows. The workflow is triggered manually and fetches the last N emails from your connected Gmail account. Each email is parsed, filtered, and deduplicated before being processed. A thread-level fetch retrieves the full conversation context, separating customer messages from support replies. An AI classification step categorises each thread into a defined scenario type. Vector embeddings are generated for semantic similarity search. Qualified threads are inserted into three database tables โ kb_data for clean Q&A pairs, scenario_patterns for reusable handling logic, and corrections for diverse real-world examples โ with strict deduplication to keep the KB clean and non-redundant. โ๏ธ What This Workflow Does (Step-by-Step) ๐ฅ Manual Trigger โ Run on demand whenever you want to import a batch of historical emails into the KB. No scheduling required. ๐ฌ Fetch Emails from Gmail โ Pulls the last N emails (configurable, default 100) from your Gmail account using OAuth2. Returns full email metadata including thread IDs, sender, subject, labels, and body. ๐ Parse & Filter โ Extracts structured fields from each email: sender address, subject, body (cleaned of links and quoted history), date, and thread ID. Filters out emails from your own domain (outbound support replies) and auto-generated emails (no-reply, notifications, newsletters). Only genuine inbound customer emails proceed. ๐งต Fetch Full Thread โ For each qualified email, fetches the entire Gmail thread using the thread ID. Retrieves all messages in the conversation to identify the original customer message and the latest support reply. ๐๏ธ Assemble Thread Context โ Reconstructs the conversation by separating customer messages from support replies. Builds a clean conversationText block used as AI input. Flags threads with no support reply for conditional handling. ๐ค AI Classification โ Calls GPT-4o-mini (or Claude) with the assembled conversation. Returns a structured JSON output containing: scenario category, a concise Q&A pair, a handling pattern description, key entities, sentiment, resolution status, and a summary. ๐ข Generate Embeddings โ Calls the OpenAI Embeddings API to generate a 1536-dimension vector for both the KB entry and the correction record. Used for semantic similarity search in the downstream draft-generation workflow. ๐ Duplicate Detection & DB Insert โ Performs cosine similarity checks against existing records before inserting. KB entries and scenario patterns are blocked if a match exceeds 92% similarity. Corrections are inserted freely (deduped at 92%) since diverse examples improve AI draft quality. New records are written to three tables: kb_data, scenario_patterns, and corrections. ๐งฉ Prerequisites Gmail account** โ OAuth2 credentials connected in AlekSystem. The account must be the support inbox you want to import from. OpenAI API key** โ Used for GPT-4o-mini classification ($0.002โ0.005 per thread) and text-embedding-3-small for vector generation ($0.0001 per record). PostgreSQL database** โ With the pgvector extension enabled. Must have the three tables set up per the schema below. AlekSystem instance** โ Self-hosted or cloud. Requires the PostgreSQL and OpenAI nodes. ๐๏ธ Database Schema Table: kb_data | Field | Type | Notes | |---|---|---| | id | SERIAL PRIMARY KEY | Auto-increment | | thread_id | TEXT | Gmail thread ID | | subject | TEXT | Email subject | | category | TEXT | AI-assigned scenario type | | question | TEXT | Customer issue / question | | answer | TEXT | Support resolution | | entities | TEXT | Key entities extracted by AI | | sentiment | TEXT | Customer sentiment | | resolution_status | TEXT | Resolved / Unresolved | | embedding | vector(1536) | OpenAI embedding for similarity search | | source | TEXT | historical_import | | created_at | TIMESTAMP | Insert timestamp | Table: scenario_patterns | Field | Type | Notes | |---|---|---| | id | SERIAL PRIMARY KEY | Auto-increment | | category | TEXT | Scenario type | | pattern_description | TEXT | Handling logic summary | | example_subject | TEXT | Representative subject line | | embedding | vector(1536) | OpenAI embedding | | source | TEXT | historical_import | | created_at | TIMESTAMP | Insert timestamp | Table: corrections | Field | Type | Notes | |---|---|---| | id | SERIAL PRIMARY KEY | Auto-increment | | gmail_thread_id | TEXT | Gmail thread ID | | original_email_body | TEXT | Customer's original message | | human_sent_text | TEXT | Actual support reply sent | | ai_draft_text | TEXT | NULL for historical imports | | diff_summary | TEXT | Import note or live diff | | classification | TEXT | Scenario category | | embedding | vector(1536) | OpenAI embedding | | source | TEXT | historical_import | | created_at | TIMESTAMP | Insert timestamp | ๐ฐ Cost Estimate | Item | Estimated Cost | |---|---| | Gmail OAuth2 | Free | | GPT-4o-mini (100 threads classification) | ~$0.20โ0.50 | | text-embedding-3-small (100 records ร 2 embeddings) | ~$0.02 | | PostgreSQL + pgvector (self-hosted) | ~$5โ15/mo | | AlekSystem self-hosted (AWS t3.small) | ~$10โ15/mo | | Total per 100-email import run | ~$0.25โ0.75 | โ๏ธ Setup Instructions Gmail โ Connect your Gmail account in AlekSystem using OAuth2. Ensure the account is the support inbox. Grant read permissions for messages and threads. PostgreSQL โ Create a new database, enable the pgvector extension (CREATE EXTENSION IF NOT EXISTS vector), then create the three tables using the schema above. OpenAI โ Add your OpenAI API key as an AlekSystem credential. Used for both the chat completion node (classification) and the HTTP Request node (embeddings). Customise the AI prompt โ Open the AI Classification node and update the system prompt to reflect your business type, support tone, and the scenario categories relevant to your domain (e.g. Refund Request, Order Status, Technical Issue, Billing Query). Set your domain filter โ In the Parse & Filter node, update the YOUR_DOMAIN variable to your support team's email domain so outbound replies are correctly excluded from customer email processing. Set fetch limit โ In the Fetch Emails node, set the limit parameter to the number of historical emails you want to import per run. Start with 5โ10 to validate the pipeline before running at scale. Run manually โ Click Execute Workflow. Monitor the output of each node to verify parsing, classification, and DB inserts are working correctly. ๐ก Key Benefits โ Converts your inbox into a structured KB โ no manual tagging or categorisation required โ AI classification assigns scenario types and extracts Q&A pairs automatically โ Vector embeddings enable semantic similarity search in downstream draft workflows โ Smart deduplication keeps KB and scenario tables clean โ no near-duplicate entries โ Corrections table accumulates diverse examples โ improves AI draft quality over time โ Thread-aware โ reconstructs full conversations, not just individual emails โ Modular AI node โ swap GPT-4o-mini for Claude or any other model with minimal changes โ One-time historical import feeds directly into live draft-generation workflows ๐ฅ Perfect For Support teams wanting to automate email draft generation with AI trained on their own history Founders or small teams building a knowledge base from years of accumulated support emails Developers building AI-powered support automation on top of Gmail Anyone who wants their AI drafts to sound like their own team โ not generic templates ๐ Related Workflows This workflow is Step 1 in a two-workflow system: KB Builder โ Historical Emails* *(this workflow) โ Imports historical threads to seed the KB AI Draft Generator* *(coming soon) โ Monitors new inbound emails, retrieves semantically similar KB entries and corrections, and generates personalised reply drafts delivered to your inbox for one-click review and send
Best fit
Categories
Services
Use cases
Need another direction?