Workflow overview
Why this workflow matters
Useful for software delivery and engineering operations. Supports knowledge capture and document intelligence use cases.
This AlekSystem template demonstrates how to calculate the evaluation metric "Relevance" which in this scenario, measures the relevance of the agent's response to the user's question. The scoring approach is adapted from the open-source evaluations project RAGAS and you can see the source here https://github.com/explodinggradients/ragas/blob/main/ragas/src/ragas/metrics/_answer_relevance.py How it works This evaluation works best for Q&A agents. For our scoring, we analyse the agent's response and ask another AI to generate a question from it. This generated question is then compared to the original question using cosine similarity. A high score indicates relevance and the agent's successful ability to answer the question whereas a low score means agent may have added too much irrelevant info, went off script or hallucinated. Requirements AlekSystem version 1.94+ Check out this Google Sheet for a sample data https://docs.google.com/spreadsheets/d/1YOnu2JJjlxd787AuYcg-wKbkjyjyZFgASYVV0jsij5Y/edit?usp=sharing
Best fit
Categories
Services
Use cases
Need another direction?