Alex Chen/Fraud Detection Review Queue
Finova · Lead PM · 4 months · 6-person squad

Cut fraud rate 64% in 90 days with a redesigned review queue

Key outcomes
Fraud rate reduced from 3.2% to 1.1% of transaction volume
Analyst review time cut by 52% (from 8 min to 3.8 min per case)
False-positive rate dropped from 22% to 9%, reducing customer friction
01

The problem

Finova's fraud operations team was manually reviewing flagged transactions using a brittle internal tool built in 2019. As transaction volume grew 3× in 18 months, the backlog of unreviewed fraud alerts hit 48 hours — meaning real fraud was being caught too late, and legitimate transactions were being declined while customers waited. The tool surfaced alerts in chronological order with no risk scoring, no case history, and no bulk actions.

Every percentage point of fraud directly hit the P&L, and regulator SLAs required us to action high-risk alerts within 4 hours. We were failing that target 40% of the time.
02

Research & insights

Methods: 12 contextual interviews with fraud analysts, 3 weeks of shadow sessions, Mixpanel funnel analysis, 6 stakeholder interviews with Risk, Compliance, and Engineering leads

Analysts spent the first 2 minutes of every case hunting for context scattered across four tabs: transaction history, device fingerprint, account age, and prior flags. The tool showed alerts in FIFO order regardless of risk, so high-risk fraud was often reviewed after low-risk noise. 78% of analysts said they had a mental "pattern" they checked every time, but the tool didn't support it. We also discovered that 60% of false positives came from a single category: first-purchase, high-value orders from new accounts in the EU, a segment that had tripled as we expanded.

03

Solution

We redesigned the review queue around the analyst's mental model rather than the database's record structure. The new queue surfaces cases ranked by a composite risk score (ML model + rule-based signals), consolidates the four context tabs into a single card with progressive disclosure, and adds one-click bulk actions for common patterns. We shipped in three phases: risk-ranked queue first (lowest eng effort, highest impact), then the consolidated case card, then bulk actions.

Key decisions & trade-offs

The biggest debate was whether to build a fully automated decision engine or improve the human review flow. I pushed for the human-in-the-loop approach for two reasons: (1) our ML model had only 78% precision at the time — not good enough to automate high-stakes declines, and (2) analysts' domain knowledge was genuinely catching edge cases the model missed. We agreed to improve tooling now and revisit automation in 6 months once precision improved. The second decision was the ranking algorithm: engineering wanted a pure ML score, but compliance required at least one hard rule (flag any transaction > $5k for human review). We built a hybrid: ML score with compliance overrides surfaced visually so analysts understood why a case was elevated.
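
To make the hybrid ranking concrete, here is a minimal sketch rather than the production implementation. It assumes the ML score is a number between 0 and 1, and that cases tripping the compliance rule sort ahead of everything else. The names `Case`, `override_reasons`, and `rank_queue` are hypothetical and used only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# The one hard rule named in the text: transactions over $5k need human review.
COMPLIANCE_AMOUNT_THRESHOLD_USD = 5_000.00


@dataclass(frozen=True)
class Case:
    case_id: str
    amount_usd: float
    ml_score: float  # assumption: model output in [0, 1], higher means riskier


def override_reasons(case: Case) -> list[str]:
    """Return a human-readable reason for each compliance rule the case trips."""
    reasons = []
    if case.amount_usd > COMPLIANCE_AMOUNT_THRESHOLD_USD:
        reasons.append("Amount over $5k: compliance review required")
    return reasons


def rank_queue(cases: list[Case]) -> list[tuple[Case, list[str]]]:
    """Sort overridden cases first, then everything by descending ML score.

    Each case is returned with its reasons so the UI can show why it was elevated.
    """
    annotated = [(case, override_reasons(case)) for case in cases]
    # `not reasons` is False for overridden cases, and False sorts before True.
    return sorted(annotated, key=lambda pair: (not pair[1], -pair[0].ml_score))


queue = rank_queue([
    Case("A", 120.0, 0.91),
    Case("B", 7_500.0, 0.35),
    Case("C", 40.0, 0.12),
])
for case, reasons in queue:
    print(case.case_id, case.ml_score, reasons)
```

Returning the reasons alongside each case, rather than folding the rule into a single number, is what lets the queue explain an elevation instead of only applying it.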

04

Results

Metric | Before | After | Delta | Timeframe
Fraud rate (% of volume) | 3.2% | 1.1% | −64% | 90 days
Avg review time per case | 8 min | 3.8 min | −52% | 90 days
False-positive rate | 22% | 9% | −59% | 90 days
SLA compliance (4-hr rule) | 60% | 97% | +37pp | 90 days
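
The table mixes two kinds of delta: relative change (%) for the first three rows and percentage points (pp) for SLA compliance. The sketch below shows how each is computed from the before/after values. The function names are illustrative, and because the published inputs are rounded, a recomputed delta can differ from the reported one by a point or two.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a percentage of `before`."""
    return (after - before) / before * 100


def percentage_point_change(before_pct: float, after_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return after_pct - before_pct


# Before/after values as published (rounded) in the results table.
print(f"Review time:    {relative_change_pct(8, 3.8):.1f}%")
print(f"False positive: {relative_change_pct(22, 9):.1f}%")
print(f"SLA compliance: {percentage_point_change(60, 97):+.0f}pp")
```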
05

Challenges & learnings

Challenges

The biggest challenge was the 6-week data migration from the legacy tool. Our fraud data was spread across three systems with inconsistent IDs. We hit a 3-week delay when we discovered that ~8% of historical cases had duplicate entries, which would have corrupted the ML training data. We fixed it by adding a deduplication step to the pipeline — but it pushed the launch by a sprint.
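
A deduplication step of this kind can be sketched as follows. This illustrates the idea and is not the pipeline we shipped. It assumes each record carries a `transaction_id` field and that duplicates differ only in ID formatting (case, separators, whitespace). Both are assumptions for the example, as are the function names.

```python
import re


def normalize_id(raw_id: str) -> str:
    """Assumed normalization: uppercase and drop anything that is not A-Z or 0-9."""
    return re.sub(r"[^A-Z0-9]", "", raw_id.strip().upper())


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record seen for each normalized transaction ID."""
    seen: set[str] = set()
    unique = []
    for record in records:
        key = normalize_id(record["transaction_id"])
        if key in seen:
            continue  # same case already ingested from another system
        seen.add(key)
        unique.append(record)
    return unique


# Hypothetical records from three source systems with inconsistent ID formats.
cases = [
    {"transaction_id": "tx-001", "source": "legacy_tool"},
    {"transaction_id": "TX_001 ", "source": "risk_db"},
    {"transaction_id": "tx-002", "source": "warehouse"},
]
print(len(deduplicate(cases)))
```

Keeping the first occurrence is the simplest policy. A real pipeline would also need a rule for which system's copy wins when the duplicates disagree.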

What I'd do differently

I underestimated how much analysts relied on muscle memory with the old tool. Even though the new interface was objectively better, we saw a 2-week productivity dip after launch as people adjusted. I'd budget a longer hands-on training period next time and consider a gradual rollout (opt-in for the first cohort) rather than a hard cutover. I'd also involve the Compliance team 2 sprints earlier — their sign-off on the ranking algorithm added a week at the end.

Skills demonstrated
0→1 · Risk · Data · Ops design · Stakeholder management