#llm-evaluation
Atla MCP Server
Initial version of an MCP server for agents to interact with Atla's models
Prompt Quality Score
PQS is the fastest way to get better output from any AI model: score any prompt before it hits the model and get a letter grade (A-F), a score out of 80, a percentile, and a breakdown across 8 quality dimensions. Built on the PEEM, RAGAS, MT-Bench, G-Eval, and ROUGE frameworks. Pre-flight, not post-hoc. The AI input quality problem is real; PQS solves it.

MCP Tools:
- score_prompt: Free. Grade and percentile for any prompt; no API key needed.
- optimize_prompt: $0.025 USDC. Returns an optimized prompt with a full dimension breakdown.
- compare_models: $1.25 USDC. Side-by-side scoring across multiple models.

HTTP API (x402-native on Base):
- /api/score/free: Free. Grade and percentile; no payment required.
- /api/score: $0.025 USDC. Single score, pay-per-call.
- /api/score/full: $0.125 USDC. Grade, percentile, dimension breakdown, and a rewrite.
- /api/score/batch: $0.25 USDC. Score multiple prompts in a single call.
- /api/score/compare: $1.25 USDC. Multi-model side-by-side scoring.
- /api/preflight: $0.05 USDC. Lightweight pre-flight quality check.

Paste a prompt, get an optimized version, ship better work. Cheaper than one bad prompt.
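As a rough sketch of what a pre-flight call to the free scoring endpoint might look like: the listing specifies only the endpoint paths and prices, so the host (`example.com`) and the JSON body shape (`{"prompt": ...}`) below are assumptions, not documented API details.

```python
import json
import urllib.request

# Hypothetical base URL -- the listing does not name the server host.
BASE_URL = "https://example.com"

def build_score_request(prompt: str,
                        endpoint: str = "/api/score/free") -> urllib.request.Request:
    """Build a POST request for a PQS scoring endpoint.

    The request body format is an assumed {"prompt": ...} JSON object;
    only the endpoint paths come from the listing above.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Construct (without sending) a request against the free tier.
req = build_score_request("Summarize this report in three bullet points.")
print(req.full_url)
print(req.get_method())
```

The paid endpoints would presumably follow the same shape with an x402 payment header attached, but that flow is not described in the listing.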