
Engineer & founder building production-grade autonomous agents, grounded in strong software engineering to solve real-world problems.
I build agentic AI systems that actually work in the real world — reliable, observable, and designed to survive production, not just demos.
I'm a Software Engineer and MS Computer Science (AI/ML) student at Duke University, focused on building production-grade autonomous systems — from multi-agent orchestration and LLM tooling to the distributed backends that run them reliably at scale.
I'm the sole founder and engineer of VYNN AI, an agentic financial analyst platform built end-to-end and deployed to ~500 pilot users.
Previously, I designed core components of AutoCodeRover, an autonomous code repair system acquired by Sonar, integrating agentic reasoning directly into JetBrains IDEs. In parallel, I've led research as sole first author on multi-agent LLM frameworks for medical text mining, achieving 98.2% sensitivity across 15 systematic reviews (~150K citations).
This summer, I’ll be joining Robinhood in Menlo Park as a Machine Learning Engineer on the Agentic AI team, continuing my focus on building autonomous systems that operate reliably at real-world scale.
Systems that are reliable, observable, and production-ready — not just demo-ready. I care deeply about turning ideas into robust software that solves real problems and serves real users.
Agent harness design — the infrastructure layer that makes agents actually work in production. The agent itself is the easy part; the harness that makes it reliable, observable, and debuggable is what I want to build.
M.S. Computer Science (AI/ML)
2025 – 2027 · Graduate Teaching Assistant
Duke Scholar
View official scholar profile
B.Comp. in Computer Science (Honours)
2021 – 2025 · Distinction
Distinction in Software Engineering
View verified credential
Exchange Semester at HKU (Fall 2023)

Building agent infrastructure and tooling — prototyping reasoning loops, composing toolchains, and supporting evaluation pipelines for Robinhood's next-generation AI financial products. Bringing production experience from building multi-agent systems at VYNN AI and AutoCodeRover.

Designed, built, and deployed a full-stack agentic financial analysis platform single-handedly — from LangGraph multi-agent backend and FastAPI orchestration layer to React dashboard and production infrastructure on Hetzner Cloud. Serves ~500 pilot users with institutional-quality equity research (DCF modeling, news intelligence, automated reports) in under 7 minutes end-to-end.

Architected and led CS 590 (Software Development Studio), where graduate students build AI debugging agents inspired by AutoCodeRover and deploy full-stack applications. Also mentored teams in CS 408 and CS 390 on software architecture, DevOps, and LLM-oriented programming — shipping production software for real clients.
Built the JetBrains IDE plugin end-to-end for autonomous code repair — GumTree-based 3-way AST merge, embedded SonarLint analysis, and real-time SSE streaming with per-step developer feedback. Enhanced the agentic repair backend with LLM-as-a-Judge self-improvement, lifting SWE-bench Verified to 51.6% (state-of-the-art among open-source agents). Core technology acquired by Sonar.

Built backend validation infrastructure for Binance's Boosters campaign — automated API regression suites in CI, load-tested services to ~500K concurrent transactions via JMeter, and instrumented monitoring to catch consistency failures before production. Worked directly with backend and Web3 Wallet engineers to root-cause and patch defects, cutting resolution time by 40%.

Led research as first author on a multi-agent AI framework for medical evidence synthesis. Designed and built LUMINA, a four-agent LLM framework that automates citation screening for medical systematic reviews — achieving 98.2% sensitivity and 87.9% specificity across 15 SRMAs (~150K citations) with a 35× reduction in false negatives vs. prior state-of-the-art.
Compresses institutional-quality equity research that traditionally takes analysts 6–12 hours into a single autonomous pipeline that runs in under 7 minutes. A LangGraph supervisor orchestrates five specialized agents — financial data collection, DCF modeling, news intelligence, report generation, and a 3-layer recommendation engine with deterministic validation — serving ~500 pilot users in production. Built end-to-end as sole engineer: 50,000+ lines of Python, React/TypeScript frontend with real-time WebSocket streaming, and Docker-based infrastructure on Hetzner Cloud.
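The supervisor-plus-specialists pattern described above can be sketched in plain Python. Everything here is illustrative: the agent functions are stubs with made-up names, the toy DCF uses a flat 10% discount rate, and the production system wires real agents into a LangGraph graph rather than a hand-rolled loop.

```python
# Minimal sketch of a supervisor orchestrating five specialized agents
# over a shared state dict. All agent logic below is a stand-in.
from typing import Callable

State = dict

def collect_data(state: State) -> State:
    # Stand-in for the financial data collection agent.
    state["financials"] = {"ticker": state["ticker"], "cash_flows": [100, 110, 121]}
    return state

def model_dcf(state: State) -> State:
    # Toy DCF: discount each projected cash flow at a flat 10%.
    r = 0.10
    cfs = state["financials"]["cash_flows"]
    state["dcf_value"] = sum(cf / (1 + r) ** (i + 1) for i, cf in enumerate(cfs))
    return state

def gather_news(state: State) -> State:
    state["news"] = ["placeholder headline"]
    return state

def write_report(state: State) -> State:
    state["report"] = f"{state['ticker']}: DCF value {state['dcf_value']:.1f}"
    return state

def recommend(state: State) -> State:
    # Deterministic validation layer: refuse to recommend without a report.
    if "report" not in state:
        raise ValueError("report missing")
    state["recommendation"] = "hold"
    return state

PIPELINE: list[Callable[[State], State]] = [
    collect_data, model_dcf, gather_news, write_report, recommend,
]

def supervisor(ticker: str) -> State:
    # A real supervisor routes conditionally and can re-run agents;
    # this sketch just executes the stages in order.
    state: State = {"ticker": ticker}
    for agent in PIPELINE:
        state = agent(state)
    return state
```

The shared-state design is what lets later agents (report generation, recommendation) validate earlier agents' outputs deterministically instead of trusting them blindly.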
System Architecture
< 7 min
Full equity analysis — data scraping, DCF modeling, news intel, and PDF report generation
72%
Latency reduction via parallel agent execution and result caching
Real-Time
Dual WebSocket streams for live prices and news with auto-reconnect and health checks
~500
Pilot users on production Hetzner Cloud infrastructure with zero-downtime deployments
SSE streaming with log batching, multi-conversation management, downloadable XLSX + PDF reports
Live prices, stock charts, news aggregation
Multi-portfolio, real-time P&L, holdings CRUD
6 interactive chart types (area, bar, pie, radar, scatter, treemap), one-click PNG export
Company, sector, and global market reports with batch generation and smart polling
OAuth, passwordless login, HTTP-only cookies, cross-tab sync, user-scoped storage
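The auto-reconnect behavior of the live price and news streams follows a standard pattern: exponential backoff with a cap. This sketch is not the real client code; the connect function is injected so the loop is testable without a network, and the delay values are illustrative.

```python
# Exponential-backoff reconnect sketch for a streaming connection.
# The real system runs two WebSocket streams with health checks; this
# toy only shows the backoff logic around a failing connect() call.
import time
from typing import Callable

def connect_with_backoff(
    connect: Callable[[], object],
    max_attempts: int = 5,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
):
    """Try to open a stream, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            sleep(delay)
            delay = min(delay * 2, 1.0)  # cap the backoff
```

Injecting `sleep` keeps the retry schedule observable in tests and lets a health-check loop substitute its own timer.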
Brought autonomous code repair from research to a production developer tool. AutoCodeRover is a multi-agent system that resolves real GitHub issues end-to-end — reproducing bugs, searching codebases across 7 languages via tree-sitter, generating patches with iterative refinement, and self-correcting through an LLM-as-a-Judge reviewer. I built the JetBrains IDE plugin end-to-end in Kotlin: a conversational agent UI with real-time SSE streaming, GumTree-based three-way AST merge for conflict-free patch application, embedded SonarLint static analysis, and a feedback loop where developers can critique any reasoning step to trigger guided re-runs. On the backend, I designed the self-fix agent that diagnoses inapplicable patches and autonomously replays the pipeline from the most suspicious stage — lifting SWE-bench Verified to 51.6%. The core technology was acquired by Sonar. Sonar Foundation Agent, built on the AutoCodeRover core, has since reached 79.2% on SWE-bench Verified — #1 on the leaderboard (Feb 2026).
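The self-fix idea, where a judge pinpoints the most suspicious stage and the pipeline replays from there, can be sketched as a small control loop. The stage names, the rule-based "judge", and the failure condition below are all stand-ins: in AutoCodeRover the judge is an LLM and the stages are real agent steps.

```python
# Sketch of a self-fix loop: run the repair pipeline, and if the patch
# is inapplicable, let a judge blame a stage and replay from it with
# feedback injected. All stage logic here is a toy stand-in.
STAGES = ["reproduce", "localize", "generate_patch", "validate"]

def run_stage(stage: str, state: dict) -> dict:
    # Toy behavior: patch generation only succeeds once judge feedback
    # is available in the state; every other stage trivially completes.
    if stage == "generate_patch":
        state["patch"] = "good" if state.get("feedback") else "inapplicable"
    else:
        state[stage] = "done"
    return state

def run_pipeline(state: dict, start: int = 0) -> dict:
    for stage in STAGES[start:]:
        state = run_stage(stage, state)
    return state

def judge_suspicious_stage(state: dict) -> int:
    # Stand-in for LLM-as-a-Judge: blame patch generation whenever the
    # patch failed to apply, otherwise restart from the beginning.
    return STAGES.index("generate_patch") if state.get("patch") == "inapplicable" else 0

def self_fix(issue: str, max_replays: int = 2) -> dict:
    state = {"issue": issue, "replays": 0}
    state = run_pipeline(state)
    while state.get("patch") == "inapplicable" and state["replays"] < max_replays:
        start = judge_suspicious_stage(state)
        state["feedback"] = "judge critique of the failing step"
        state["replays"] += 1
        state = run_pipeline(state, start=start)
    return state
```

Replaying from the blamed stage rather than from scratch is what keeps the retry cheap: work from earlier, trusted stages is reused.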
Repair Pipeline Architecture
51.6%
SWE-bench Verified (Jan 2025)
State-of-the-art across 2,294 real GitHub issues — highest among open-source agents
13.2%
Resolve Rate Improvement
Lifted SWE-bench Verified from 38.4% (Jun 2024) to 51.6% (Jan 2025) — via Self-Fix Agent with LLM-as-a-Judge and interactive feedback loops
3-Way
AST Merge (GumTree)
Conflict-free patch application when local code has diverged from agent's baseline
7
Languages Supported
Tree-sitter search across Python, Java, JS, TS, C/C++, Go, PHP
Describe a bug → ACR localizes, patches, and validates autonomously
Embedded static analysis for Java/Python with one-click ACR fixes
GumTree conflict resolution across baseline/modified/patched
Critique any agent reasoning step — triggers guided pipeline re-run
LLM-as-a-Judge diagnoses inapplicable patches and replays from failure point
Auto-captures IDE build and test failures with one-click ACR submission
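The three-way merge across baseline, locally modified, and agent-patched versions can be illustrated at line level. This toy assumes line-aligned files and is not GumTree's algorithm, which merges ASTs and so tolerates moved and reformatted code; it only shows the core decision rule.

```python
# Line-level sketch of a three-way merge: for each position, prefer
# whichever side actually changed relative to the baseline, and mark a
# conflict only when both sides changed it differently.
def merge3(base: list[str], local: list[str], patched: list[str]) -> list[str]:
    assert len(base) == len(local) == len(patched)  # toy constraint
    merged = []
    for b, l, p in zip(base, local, patched):
        if l == b:            # only the agent changed this line
            merged.append(p)
        elif p == b:          # only the developer changed this line
            merged.append(l)
        elif l == p:          # both made the same change
            merged.append(l)
        else:                 # genuine conflict: keep both, marked
            merged.append(f"<<< local: {l} ||| patch: {p} >>>")
    return merged
```

Operating on ASTs instead of lines is what makes the real merge conflict-free in cases like this one, where a purely textual diff would already report a conflict after the developer reindents or reorders code.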
Designed and built LUMINA, a four-agent framework that automates citation screening for medical systematic reviews and meta-analyses. A classifier agent triages citations, a detailed screening agent applies PICOS-guided Chain-of-Thought evaluation, a reviewer agent audits each decision via LLM-as-a-Judge, and an improvement agent self-corrects when disagreements arise — mirroring the human peer-review process. Evaluated on 15 SRMAs across ~150K citations from BMJ, JAMA, and Lancet journals: achieved 98.2% mean sensitivity (10 of 15 at perfect 100%) with 87.9% mean specificity. Outperforms published sensitivity baselines from Li et al. (37%) and Strachan (58%), with 35× fewer missed studies.
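The classifier, screener, reviewer, improver loop described above can be sketched with stub agents. Every function body here is a keyword stand-in of my own invention; in LUMINA these are LLM calls with PICOS-guided prompts, and the reviewer is an LLM-as-a-Judge.

```python
# Sketch of the four-agent screening loop. Each "agent" is a toy stub;
# the keyword checks stand in for LLM reasoning over the abstract.
def classifier(citation: dict) -> str:
    # Cheap triage: obvious excludes never reach detailed screening.
    return "exclude" if "animal study" in citation["abstract"] else "maybe"

def detailed_screener(citation: dict) -> str:
    # PICOS-style detailed check, reduced to a keyword stand-in.
    return "include" if "trial" in citation["abstract"] else "exclude"

def reviewer(citation: dict, decision: str) -> bool:
    # Judge audits the decision; here it vetoes excluding anything
    # that looks like a randomized study.
    return not (decision == "exclude" and "randomized" in citation["abstract"])

def improver(citation: dict, decision: str) -> str:
    # On disagreement, re-screen leaning on the reviewer's critique.
    return "include" if decision == "exclude" else decision

def screen(citation: dict) -> str:
    triage = classifier(citation)
    if triage == "exclude":
        return "exclude"
    decision = detailed_screener(citation)
    if not reviewer(citation, decision):
        decision = improver(citation, decision)
    return decision
```

The reviewer-then-improver step is the part that drives sensitivity: a borderline exclude gets a second, audited look instead of silently dropping a relevant study.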
98.2%
Sensitivity
10 of 15 reviews at a perfect 100%, vs. 37% (Li et al.) and 58% (Strachan) — 35× fewer missed studies
87.9%
Specificity
Reduces manual screening by ~10× — reviewers examine only ~12% of citations instead of 100%
15
Systematic Reviews
~150K citations from BMJ, JAMA, Lancet — <$0.01 per article
Teaching software engineering, DevOps, and agentic AI systems to undergraduate and graduate students at Duke.