B

AI Cancer Research Data Cleanser

2.75/5 (weighted NUMR-V score)

Derivation Chain

Step 1: Government policy expanding early (pre-cancer) screening and AI-based cancer research
Step 2: Training data preparation support for AI cancer research teams
Step 3: Automated tool for medical imaging data anonymization and labeling quality verification

Problem

University hospitals and research institutes conducting AI-based cancer research must anonymize DICOM metadata, verify IRB compliance, and quality-check labels before medical imaging data can be used for AI training. A single researcher needs 3–4 hours to process 100 images. With tens of thousands of images to process each year, manual errors risk leaking patient information, which can lead to research suspension and legal liability.

Solution

Researchers upload DICOM files, and the system automatically detects and removes patient identifiers (name, ID, date of birth, address, etc.), checks IRB compliance, validates labeling consistency (overlap detection, missing-region detection), and generates an anonymization audit log.
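As a rough illustration of the identifier-removal and audit-log steps described above, the sketch below works on a plain Python dict standing in for DICOM metadata. The keys are standard DICOM attribute keywords, but the short keyword list, the `anonymize` function, and the audit-entry layout are assumptions made for illustration, not the product's actual design. A real tool would read files with a DICOM library and follow a complete de-identification profile.

```python
from datetime import datetime, timezone

# Assumed subset of identifying attributes (standard DICOM keywords).
# A production tool would need a much longer list.
IDENTIFYING_KEYWORDS = (
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
)


def anonymize(metadata, file_name):
    """Return (cleaned_metadata, audit_entry) without mutating the input."""
    cleaned = dict(metadata)
    removed = []
    for keyword in IDENTIFYING_KEYWORDS:
        if keyword in cleaned:
            del cleaned[keyword]
            removed.append(keyword)
    # Hypothetical audit-log entry: which attributes were stripped, and when.
    audit_entry = {
        "file": file_name,
        "removed": removed,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return cleaned, audit_entry


# Made-up sample metadata, not real patient data.
sample = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "PatientBirthDate": "19700101",
    "Modality": "CT",
    "StudyDescription": "Chest CT",
}
cleaned, audit = anonymize(sample, "scan_001.dcm")
print(cleaned)
print(audit["removed"])
```

Recording the removed keywords per file, rather than only a pass/fail flag, is one way an audit log could show reviewers exactly what was stripped.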

Target: Radiology research teams at university hospitals and data teams at medical AI startups conducting AI cancer research (team size 3–15)
Revenue Model: Usage-based at ₩9,900 (~$7.50) per 100 DICOM files, or a monthly subscription at ₩490,000/mo (~$368) per research team for unlimited processing
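To make the two pricing options easier to compare, here is a small break-even sketch. The prices come from the listing. The billing rule (every started block of 100 files is charged in full) and the function names are assumptions, since the listing does not say how partial blocks are billed.

```python
import math

USAGE_PRICE_KRW = 9_900      # per 100 DICOM files (from the listing)
SUBSCRIPTION_KRW = 490_000   # per month, unlimited processing (from the listing)
BLOCK_SIZE = 100


def usage_cost(files):
    """Usage-based cost, assuming each started block of 100 files is billed in full."""
    return math.ceil(files / BLOCK_SIZE) * USAGE_PRICE_KRW


def cheaper_plan(files_per_month):
    """Name the cheaper option for a given monthly file volume."""
    if usage_cost(files_per_month) > SUBSCRIPTION_KRW:
        return "subscription"
    return "usage"


for files in (1_000, 4_900, 4_901, 10_000):
    print(files, usage_cost(files), cheaper_plan(files))
```

Under this rounding assumption, the subscription becomes the cheaper option once a team processes more than 4,900 files in a month.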
Ecosystem Role: Infrastructure
MVP Estimate: 1 month

NUMR-V Scores

N Novelty: 3.0/5
U Urgency: 4.0/5
M Market: 2.0/5
R Realizability: 2.0/5
V Validation: 3.0/5

NUMR-V Scoring System

N Novelty (1-5): How uncommon the service is in market context.
U Urgency (1-5): How urgently users need this problem solved now.
M Market (1-5): Market size and growth potential from proxy indicators.
R Realizability (1-5): Buildability for a small team with realistic constraints.
V Validation (1-5): Validation signal quality from competition and demand data.

Weights (SaaS): N=.15 U=.20 M=.15 R=.30 V=.20
Weights (Senior): N=.25 U=.25 M=.05 R=.30 V=.15
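
The page does not state how the five scores and the weight profiles are combined. A plain weighted sum is one plausible reading, and with the SaaS weights it reproduces the 2.75 figure shown near the top of the page. The sketch below makes that assumption explicit; `weighted_score` is an illustrative name, not part of any published scoring code.

```python
# Scores and weights are copied from the listing above.
SCORES = {"N": 3.0, "U": 4.0, "M": 2.0, "R": 2.0, "V": 3.0}
WEIGHTS = {
    "SaaS": {"N": 0.15, "U": 0.20, "M": 0.15, "R": 0.30, "V": 0.20},
    "Senior": {"N": 0.25, "U": 0.25, "M": 0.05, "R": 0.30, "V": 0.15},
}


def weighted_score(scores, weights):
    """Weighted sum of the five NUMR-V scores (assumed combination rule)."""
    return sum(scores[key] * weights[key] for key in scores)


for profile, weights in WEIGHTS.items():
    print(profile, round(weighted_score(SCORES, weights), 2))
```

With these inputs the SaaS profile gives 2.75 and the Senior profile gives 2.90, so the headline score appears to use the SaaS weights.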

Feasibility (51%)

Tech Complexity: 19.3/40
Data Availability: 19.4/25
MVP Timeline: 12.0/20
API Bonus: 0.0/15

Feasibility Breakdown

Tech Complexity (/40): Difficulty of core implementation stack.
Data Availability (/25): Practical availability and cost of required data.
MVP Timeline (/20): Expected time to ship a usable MVP.
API Bonus (/15): Bonus for viable public API leverage.

Market Validation (51/100)

Competition: 8.0/20
Market Demand: 3.8/20
Timing: 14.0/20
Revenue Signals: 10.5/15
Pick-Axe Fit: 12.0/15
Solo Buildability: 3.0/10

Validation Breakdown

Competition (/20): Signal quality from competitor landscape.
Market Demand (/20): Demand proxies from search and mention patterns.
Timing (/20): Fit with current shifts in tech, behavior, and regulation.
Revenue Signals (/15): Reference evidence for monetization viability.
Pick-Axe Fit (/15): How well the concept serves participants in a trend.
Solo Buildability (/10): Practicality for lean-team implementation.
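
The headline figures for Feasibility (51%) and Market Validation (51/100) appear to be simple sums of their component scores, rounded to the nearest whole number. The page does not state this rule, so the sketch below treats it as an assumption and only checks that the published components are consistent with it.

```python
# (score, maximum) pairs copied from the two breakdowns on this page.
FEASIBILITY = {
    "Tech Complexity": (19.3, 40),
    "Data Availability": (19.4, 25),
    "MVP Timeline": (12.0, 20),
    "API Bonus": (0.0, 15),
}
VALIDATION = {
    "Competition": (8.0, 20),
    "Market Demand": (3.8, 20),
    "Timing": (14.0, 20),
    "Revenue Signals": (10.5, 15),
    "Pick-Axe Fit": (12.0, 15),
    "Solo Buildability": (3.0, 10),
}


def total(components):
    """Return (points earned, points possible) for one breakdown."""
    earned = sum(score for score, _ in components.values())
    possible = sum(cap for _, cap in components.values())
    return earned, possible


for name, components in (("Feasibility", FEASIBILITY), ("Market Validation", VALIDATION)):
    earned, possible = total(components)
    print(name, round(earned, 1), "/", possible, "->", round(earned))
```

Both breakdowns sum to a maximum of 100 points, and the component totals (50.7 and 51.3) both round to 51, matching the two headline figures.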

Technical Requirements

Backend [high]
AI/ML [medium]
Infrastructure [medium]