LLM Vendor Lock-in Escape Test Kit

4.05

Derivation Chain

Step 1 Trump's directive to discontinue Anthropic usage

→

Step 2 Urgent vendor transition demand due to AI vendor political/regulatory risk

→

Step 3 Pre-diagnosis of compatibility, cost, and timeline risks during AI vendor transition

→

Step 4 Automated prompt migration testing based on transition diagnosis results

Problem

Even after diagnosing AI vendor transition risks, teams fail to detect vendor-specific response quality differences, token consumption variations, and edge case failures during actual prompt migration — leading to production incidents post-deployment. Manually A/B testing hundreds of prompts takes 2-4 weeks, and subjective comparison criteria delay decision-making.

Solution

A tool that takes uploaded prompt sets currently in use and simultaneously calls multiple LLM vendors to automatically compare response quality (accuracy, format compliance, consistency), token costs, and latency. Provides automated edge case generation, regression test automation, and per-vendor performance reports to support data-driven migration decisions.

Target: ML engineers at AI service startups with 3-20 employees operating 10+ prompts via LLM APIs

Revenue Model: 1,000 KRW (~$0.75) per test run (1 prompt × 1 vendor), monthly plan at 79,000 KRW (~$59)/month (includes 1,000 runs), overage at 500 KRW (~$0.37) per run

Ecosystem Role: Supplier

MVP Estimate: 2_weeks

NUMR-V Scores

N Novelty

5.0/5

U Urgency

5.0/5

M Market

4.0/5

R Realizability

3.0/5

V Validation

4.0/5

NUMR-V Scoring System

N Novelty	1-5	How uncommon the service is in market context.
U Urgency	1-5	How urgently users need this problem solved now.
M Market	1-5	Market size and growth potential from proxy indicators.
R Realizability	1-5	Buildability for a small team with realistic constraints.
V Validation	1-5	Validation signal quality from competition and demand data.

SaaS N=.15 U=.20 M=.15 R=.30 V=.20 Senior N=.25 U=.25 M=.05 R=.30 V=.15

Feasibility (69%)

Tech Complexity

29.3/40

Data Availability

19.4/25

MVP Timeline

20.0/20

API Bonus

0.0/15

Feasibility Breakdown

Tech Complexity	/ 40	Difficulty of core implementation stack.
Data Availability	/ 25	Practical availability and cost of required data.
MVP Timeline	/ 20	Expected time to ship a usable MVP.
API Bonus	/ 15	Bonus for viable public API leverage.

Market Validation (58/100)

Competition

8.0/20

Market Demand

6.2/20

Timing

16.0/20

Revenue Signals

9.0/15

Pick-Axe Fit

12.0/15

Solo Buildability

7.0/10

Validation Breakdown

Competition	/ 20	Signal quality from competitor landscape.
Market Demand	/ 20	Demand proxies from search and mention patterns.
Timing	/ 20	Fit with current shifts in tech, behavior, and regulation.
Revenue Signals	/ 15	Reference evidence for monetization viability.
Pick-Axe Fit	/ 15	How well the concept serves participants in a trend.
Solo Buildability	/ 10	Practicality for lean-team implementation.

Technical Requirements

Backend [medium] AI/ML [medium] Frontend [low]

Dashboard