AI-Powered consensus-based answers using Claude, GPT, Grok and Gemini Generation

Workflow overview

Why this workflow matters

Useful for software delivery and engineering operations. Improves internal consulting operations and productivity.

The original LLM Council concept was introduced by Andrej Karpathy and published as an open-source repository demonstrating multi-model consensus and ranking. This workflow is my adaptation of that idea, reimplemented and structured as a production-ready AlekSystem template.

Original repository: https://github.com/karpathy/llm-council

This AlekSystem template implements the LLM Council pattern: a single user question is processed in parallel by multiple large language models, independently evaluated by peer models, and then synthesized into one high-quality, consensus-driven final answer. It is designed for use cases where answer quality, balance, and reduced single-model bias are critical.

📌 Section 1: Trigger & Input

⚡ When Chat Message Received (Chat Trigger)

Purpose: Receives a user's message and initiates the entire workflow.

How it works:
- A user sends a chat message
- The message is stored as the Original Question
- The same input is forwarded simultaneously to multiple LLM pipelines

Why it matters: Provides a clean, unified entry point for all downstream multi-model logic.

📌 Section 2: Stage 1 - Parallel LLM Responses

🤖 Basic LLM Chains (x4)

Models used:
- Anthropic Claude
- OpenAI GPT
- xAI Grok
- Google Gemini

Purpose: Each model independently generates its own response to the same question.

Key characteristics:
- Identical prompt structure for all models
- Independent reasoning paths
- No shared context between models

Why it matters: Produces diverse perspectives, reasoning styles, and solution approaches.

📌 Section 3: Stage 2 - Response Anonymization

🧾 Set Nodes (Response A / B / C / D)

Purpose: Stores model outputs in an anonymized format:
- Response A
- Response B
- Response C
- Response D

Why it matters: Prevents evaluator models from knowing which LLM authored which response, reducing bias during evaluation.
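
The fan-out and anonymization stages described above can be sketched in plain JavaScript. This is an illustration only, not the template's actual node configuration: `councilModels`, `collectResponses`, and `anonymize` are hypothetical names, and the four model functions are stubs standing in for the real chat model calls.

```javascript
// Hypothetical stand-ins for the four Basic LLM Chains. A real workflow
// would call each provider through its own chat model node instead.
const councilModels = {
  claude: async (question) => `Claude's answer to: ${question}`,
  gpt: async (question) => `GPT's answer to: ${question}`,
  grok: async (question) => `Grok's answer to: ${question}`,
  gemini: async (question) => `Gemini's answer to: ${question}`,
};

// Stage 1: send the same question to every model at once.
// No model sees another model's output.
async function collectResponses(question, models) {
  const entries = Object.entries(models);
  const texts = await Promise.all(entries.map(([, ask]) => ask(question)));
  return entries.map(([model], i) => ({ model, text: texts[i] }));
}

// Stage 2: replace model names with neutral labels (Response A, B, ...).
// The label-to-model map is kept aside and never shown to the evaluators.
function anonymize(responses) {
  const labelToModel = {};
  const anonymized = responses.map(({ model, text }, i) => {
    const label = `Response ${String.fromCharCode(65 + i)}`;
    labelToModel[label] = model;
    return { label, text };
  });
  return { anonymized, labelToModel };
}
```

Calling `collectResponses(question, councilModels)` and passing the result to `anonymize` yields the labeled responses handed to the evaluators in the next stage. Keeping `labelToModel` separate is one way to restore authorship after ranking without leaking it into the evaluation prompts.
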
📌 Section 4: Stage 3 - Peer Evaluation & Ranking

📊 Evaluation Chains (Claude / GPT / Grok / Gemini)

Purpose: Each model acts as a reviewer and:
- Analyzes all four anonymized responses
- Describes strengths and weaknesses of each
- Produces a strict FINAL RANKING from best to worst

Ranking format (strict):

FINAL RANKING:
Response B
Response A
Response D
Response C

Why it matters: Creates multiple independent quality assessments from different model perspectives.

📌 Section 5: Stage 4 - Ranking Aggregation

🧮 Code Node (JavaScript)

Purpose: Aggregates all peer rankings by:
- Parsing ranking positions
- Calculating average position per response
- Counting evaluation occurrences
- Sorting responses by best average score

Output includes:
- Aggregated rankings
- Best response label
- Best average score

Why it matters: Transforms subjective rankings into a structured, quantitative consensus.

📌 Section 6: Stage 5 - Final Consensus Answer

🧠 Chairman LLM Chain

Purpose: One model acts as the Council Chairman and:
- Reviews all original responses
- Considers peer rankings and aggregated scores
- Identifies consensus patterns and disagreements
- Produces a single, clear, high-quality final answer

Why it matters: Delivers a refined response that reflects collective model intelligence rather than a simple average.
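
The ranking aggregation step can be illustrated with a short JavaScript sketch. It is not the template's actual Code Node source: the function names `parseRanking` and `aggregateRankings` and the output field names are assumptions chosen for illustration. It assumes each evaluator's text contains the `FINAL RANKING:` marker followed by the labels in order from best to worst, and that a lower average position is better.

```javascript
// Extract the ordered labels that follow the last "FINAL RANKING:" marker.
// Labels mentioned earlier in the evaluator's analysis are ignored.
function parseRanking(evaluationText) {
  const marker = "FINAL RANKING:";
  const start = evaluationText.lastIndexOf(marker);
  if (start === -1) return [];
  const tail = evaluationText.slice(start + marker.length);
  const seen = new Set();
  const labels = [];
  for (const [label] of tail.matchAll(/Response [A-Z]\b/g)) {
    if (!seen.has(label)) {
      seen.add(label); // keep only the first mention of each label
      labels.push(label);
    }
  }
  return labels;
}

// Average each response's position (1 = best) across all evaluations,
// count how often it was ranked, and sort by best (lowest) average.
function aggregateRankings(evaluations) {
  const totals = new Map(); // label -> { sum, count }
  for (const text of evaluations) {
    parseRanking(text).forEach((label, index) => {
      const entry = totals.get(label) ?? { sum: 0, count: 0 };
      entry.sum += index + 1;
      entry.count += 1;
      totals.set(label, entry);
    });
  }
  const rankings = [...totals]
    .map(([label, { sum, count }]) => ({
      label,
      averagePosition: sum / count,
      evaluations: count,
    }))
    .sort((a, b) => a.averagePosition - b.averagePosition);
  return {
    rankings,
    bestResponse: rankings[0]?.label ?? null,
    bestAverage: rankings[0]?.averagePosition ?? null,
  };
}
```

Passing the four evaluators' raw outputs as an array of strings to `aggregateRankings` returns the sorted rankings together with the best label and its average position, which the Chairman stage can then take into account. An evaluation without the marker contributes nothing to the totals, so a single malformed review does not break the aggregation.
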
📊 Workflow Overview

Stage | Node / Logic      | Purpose
1     | Chat Trigger      | Receive user question
2     | LLM Chains        | Generate independent responses
3     | Set Nodes         | Anonymize outputs
4     | Evaluation Chains | Peer review & ranking
5     | Code Node         | Aggregate rankings
6     | Chairman LLM      | Final synthesized answer

🎯 Key Benefits

- 🧠 Multi-model intelligence: avoids reliance on a single LLM
- ⚖️ Reduced bias: anonymized peer evaluation
- 📊 Quality-driven selection: ranking-based consensus
- 🔁 Modular architecture: easy to add or replace models
- 🌍 Language-flexible: input and output languages configurable
- 🧩 Production-ready logic: clear stages, deterministic ranking

🚀 Ideal Use Cases

- High-stakes decision support
- Complex technical or architectural questions
- Strategy and research synthesis
- AI assistants requiring higher trust and reliability
- Comparing and selecting the best LLM-generated answers

Best fit

Services

Slack, Telegram, Gmail, WhatsApp Business Cloud, Basic LLM Chain, Anthropic Chat Model, OpenAI Chat Model, Google Gemini Chat Model

Use cases

support automation, engineering workflow automation, email workflow automation
