Aaron Bleiweiss - AI Portfolio

Chapter 01

Enabled AI across an enterprise

When Shopify's CEO mandated an AI-first operating model, I didn't wait for a playbook. I started small, proved value, and let adoption pull the work across the org. Each tool built on the last.

"While You Were Away" Agent

Proof of concept that gave the team confidence to tackle bigger problems.

Six-Week Executive Reviews

~80% reduction in manual reporting prep across all teams.

Program Briefs Agent

Timeline cut from 6-8 weeks to 3-4 weeks.

MCP-Powered Dashboards

Delivery time cut from 3-4 months to 1-2 weeks.

Performance Review Agent

30 hours per cycle down to 5-6 hours. Other leads adopted and customized it.

The through-line: Start small, prove value, let adoption spread. Treat AI agents as internal products with teams as customers. Measure quality, feed corrections back, improve over time.

~80%

Manual reporting reduced

~50%

Review processes streamlined

5-6 hrs

Down from 30 per cycle

Chapter 02

Built AI for clients

After Shopify, I went independent. First real engagement: a skincare brand drowning in repetitive customer questions and manual clinic management.

Client engagement / B2B skincare

Multi-Agent Customer Service System

Three connected AI systems: a customer chatbot grounded in website content, a clinician agent for licensed professionals, and an admin dashboard so non-technical staff manage both agents themselves.

The first version had staff using the OpenAI platform directly. It didn't stick. So I built the admin dashboard, bringing the tool to the user instead of the user to the tool. That principle has guided every build since.

~5 hrs

Saved weekly

98%

Cost reduction

<$25

Monthly cost

Next.js, TypeScript, Firebase, OpenAI, Vercel, Shopify API, WordPress

Chapter 03

Built my own products

Client work proved I could ship production AI for someone else. Next: building products I own, where I'm the user, the designer, and the operator.

Solo product / Music discovery SaaS

reDiscover

Your streaming history has thousands of songs you've played dozens of times but haven't heard in months or years. reDiscover finds them and brings them back.

Built by a musician who wanted to solve his own problem. Hybrid data ingest (one-time privacy export + daily API sync), AI-powered music chat, monthly concert digests for artists you actually listen to. Designed the visual identity and built a cohesive design system across the product.
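The selection rule described above (songs played many times but not heard recently) can be sketched as a filter over play history. This is a minimal illustration, not reDiscover's actual implementation: the `PlayEvent` shape, the `findForgottenFavorites` name, and the default thresholds (20 plays, 180 idle days) are all assumptions.

```typescript
// Hypothetical shape of one play record; the real schema is not described here.
interface PlayEvent {
  trackId: string;
  playedAt: Date;
}

interface ForgottenTrack {
  trackId: string;
  playCount: number;
  lastPlayedAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns tracks played at least `minPlays` times whose most recent play is
// more than `minIdleDays` days before `now`, most-played first.
function findForgottenFavorites(
  history: PlayEvent[],
  now: Date,
  minPlays = 20, // assumed threshold
  minIdleDays = 180, // assumed threshold
): ForgottenTrack[] {
  const byTrack = new Map<string, ForgottenTrack>();
  for (const ev of history) {
    const cur = byTrack.get(ev.trackId);
    if (!cur) {
      byTrack.set(ev.trackId, {
        trackId: ev.trackId,
        playCount: 1,
        lastPlayedAt: ev.playedAt,
      });
    } else {
      cur.playCount += 1;
      if (ev.playedAt.getTime() > cur.lastPlayedAt.getTime()) {
        cur.lastPlayedAt = ev.playedAt;
      }
    }
  }
  const cutoff = now.getTime() - minIdleDays * DAY_MS;
  return Array.from(byTrack.values())
    .filter((t) => t.playCount >= minPlays && t.lastPlayedAt.getTime() < cutoff)
    .sort((a, b) => b.playCount - a.playCount);
}
```

Both the one-time export and the daily sync would feed the same `history` array in this sketch. How the two sources are merged and de-duplicated is left out.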

Live. Paying users.

$2.99

Monthly price

Features shipped

Platforms integrated

Next.js, TypeScript, Firebase, Claude AI, Spotify API, Apple Music API, Ticketmaster, Setlist.fm

Solo product / AI-powered job search

Signal

Scrapes openings across ATS platforms (Greenhouse, Lever, Ashby, Workday, and more), scores them against configurable criteria, discovers hiring contacts, and drafts personalized outreach.

Two-agent outreach drafting (drafter + formatter). Dead role detection with auto-close. Auto-detects ATS platform from job URLs.
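Detecting the ATS platform from a job URL can be as simple as matching the hostname. The sketch below is an assumption about how such a detector might look, not Signal's actual rules: the `detectAtsPlatform` name is hypothetical, and the hostname suffixes are illustrative examples of commonly seen job-board domains.

```typescript
type AtsPlatform = "greenhouse" | "lever" | "ashby" | "workday" | "unknown";

// Illustrative hostname suffixes only. A real detector would need a longer,
// maintained list and a fallback for companies on custom career domains.
const HOST_RULES: Array<{ suffix: string; platform: AtsPlatform }> = [
  { suffix: "greenhouse.io", platform: "greenhouse" },
  { suffix: "lever.co", platform: "lever" },
  { suffix: "ashbyhq.com", platform: "ashby" },
  { suffix: "myworkdayjobs.com", platform: "workday" },
];

function detectAtsPlatform(jobUrl: string): AtsPlatform {
  let host: string;
  try {
    host = new URL(jobUrl).hostname.toLowerCase();
  } catch {
    return "unknown"; // not a parseable absolute URL
  }
  for (const rule of HOST_RULES) {
    // Match the bare domain or any subdomain of it.
    if (host === rule.suffix || host.endsWith("." + rule.suffix)) {
      return rule.platform;
    }
  }
  return "unknown";
}
```

Matching on the hostname rather than the full URL string keeps a path segment such as `/greenhouse-engineer` from producing a false positive.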

Key decision: Two-LLM cost split. Sonnet for high-stakes scoring and outreach, Haiku for high-volume classification. ~90% cost reduction without sacrificing quality where it matters.
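The model split can be expressed as a small routing table. This is a sketch under stated assumptions: the task names, `pickModel`, and `savingsVsAllSonnet` are hypothetical, and the unit costs passed in are placeholder relative prices, not real pricing.

```typescript
type Task = "scoring" | "outreach" | "classification";
type Model = "sonnet" | "haiku";

// Routing as described above: the stronger model for high-stakes work,
// the cheaper model for high-volume classification.
const MODEL_FOR_TASK: Record<Task, Model> = {
  scoring: "sonnet",
  outreach: "sonnet",
  classification: "haiku",
};

function pickModel(task: Task): Model {
  return MODEL_FOR_TASK[task];
}

// Fraction of spend saved by routing per task instead of sending every call
// to Sonnet. `unitCost` is a placeholder relative price per call.
function savingsVsAllSonnet(
  callCounts: Record<Task, number>,
  unitCost: Record<Model, number>,
): number {
  let splitCost = 0;
  let totalCalls = 0;
  for (const task of Object.keys(callCounts) as Task[]) {
    splitCost += callCounts[task] * unitCost[pickModel(task)];
    totalCalls += callCounts[task];
  }
  const allSonnetCost = totalCalls * unitCost.sonnet;
  return allSonnetCost === 0 ? 0 : 1 - splitCost / allSonnetCost;
}
```

The savings in this sketch depend on two inputs: how much of the traffic is mechanical, and how large the price gap between the two models is.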

Live. In beta with early users.

20+

Features shipped

10+

ATS platforms

~90%

Cost reduction via model split

Next.js, TypeScript, Firebase, Claude (Sonnet + Haiku), Apollo API, Vercel

Chapter 04

Went deeper

I'd shipped production AI across enterprise, client, and personal contexts. But I knew there were gaps. At Shopify, I'd consumed MCP servers (Looker, Salesforce, GitHub, Figma) but never built one. I'd built individual agents but never orchestrated multiple agents with genuinely different jobs working as a pipeline.

So I mapped the skills I wanted to develop and applied each one to real work. Not tutorials. Not demos. Real systems running in production right now.

Skill: MCP Server as Producer

Foundation Server

The gap: I'd consumed MCP servers at enterprise scale. Never designed the tool interface, handled auth, or built the serve-and-capture patterns from the other side.

A custom MCP server that serves professional context to any AI session - Claude, Cursor, any MCP-compatible client. But the real innovation is the capture loop: every AI session generates knowledge, that knowledge gets proposed back to a central source of truth, and nothing changes without review.

It's a PR workflow for AI-generated knowledge. Propose, review, merge. Most people's AI tools start from scratch every session. This system compounds.
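The propose, review, merge loop can be sketched with an in-memory store. This is an illustration of the pattern only: `ContextStore`, its method names, and the `Proposal` shape are hypothetical, and the real server's storage, auth, and review mechanism are not shown.

```typescript
type ProposalStatus = "pending" | "merged" | "rejected";

interface Proposal {
  id: number;
  key: string;
  value: string;
  sessionId: string; // which AI session proposed the change
  status: ProposalStatus;
}

class ContextStore {
  private entries = new Map<string, string>();
  private proposals = new Map<number, Proposal>();
  private nextId = 1;

  // Serve: sessions only ever read reviewed context.
  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  // Capture: a session proposes knowledge; the source of truth is untouched.
  propose(key: string, value: string, sessionId: string): Proposal {
    const p: Proposal = { id: this.nextId++, key, value, sessionId, status: "pending" };
    this.proposals.set(p.id, p);
    return p;
  }

  // Review: only an explicit approval merges a proposal into the store.
  review(id: number, approve: boolean): Proposal {
    const p = this.proposals.get(id);
    if (!p) throw new Error(`No proposal with id ${id}`);
    if (p.status !== "pending") throw new Error(`Proposal ${id} is already ${p.status}`);
    if (approve) {
      this.entries.set(p.key, p.value);
      p.status = "merged";
    } else {
      p.status = "rejected";
    }
    return p;
  }

  pending(): Proposal[] {
    return Array.from(this.proposals.values()).filter((p) => p.status === "pending");
  }
}
```

The key property is that `propose` never writes to `entries`, so nothing a session generates reaches other sessions until a reviewer approves it.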

Every session makes the next one better.

Projects connected

Daily

Usage frequency

100%

Provider agnostic

TypeScript, Vercel, GitHub API, MCP Protocol, OAuth

Skill: Multi-Agent Orchestration

KB Quality Agent

The gap: I'd built individual agents. Never orchestrated multiple agents with genuinely different architectures working as a pipeline.

My skincare client's chatbot was built on a point-in-time website scrape. Every time they updated their site, the knowledge base drifted. So I built a three-agent pipeline that monitors the site, evaluates what changed, and drafts updates for human review.

Same propose/review/approve pattern as the Foundation Server. Same architecture, completely different domain. The pattern proved portable.

Fully automated weekly scans. Staff review through the admin dashboard they already use.
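The monitor, evaluate, and draft stages could be wired together roughly as follows. This is a simplified sketch, not the production pipeline: every type and function name is hypothetical, and the evaluate and draft stages are passed in as plain synchronous functions where the real system would make asynchronous model calls.

```typescript
interface PageSnapshot {
  url: string;
  text: string;
}

interface DriftFinding {
  url: string;
  kbText: string; // what the knowledge base was built from
  liveText: string; // what the site says now
}

interface DraftUpdate {
  url: string;
  proposedText: string;
  status: "pending_review"; // drafts never apply themselves
}

const normalize = (s: string): string => s.replace(/\s+/g, " ").trim();

// Stage 1 (monitor): flag pages whose live text differs from the KB copy,
// ignoring whitespace-only changes. Pages missing from `live` are skipped.
function monitor(kb: PageSnapshot[], live: PageSnapshot[]): DriftFinding[] {
  const liveByUrl = new Map<string, string>();
  for (const page of live) liveByUrl.set(page.url, page.text);
  const findings: DriftFinding[] = [];
  for (const page of kb) {
    const liveText = liveByUrl.get(page.url);
    if (liveText !== undefined && normalize(liveText) !== normalize(page.text)) {
      findings.push({ url: page.url, kbText: page.text, liveText });
    }
  }
  return findings;
}

// Stages 2 and 3 (evaluate, draft) are injected so each can use its own model.
function runPipeline(
  kb: PageSnapshot[],
  live: PageSnapshot[],
  isMaterial: (finding: DriftFinding) => boolean,
  draft: (finding: DriftFinding) => string,
): DraftUpdate[] {
  return monitor(kb, live)
    .filter(isMaterial)
    .map((f) => ({ url: f.url, proposedText: draft(f), status: "pending_review" as const }));
}
```

Every draft leaves the pipeline as `pending_review`, which is where the staff review step described above takes over.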

Pages monitored

3

Specialized agents

<$1

Monthly cost

Next.js, TypeScript, Firebase, OpenAI (gpt-4o-mini + gpt-4o), WordPress REST API, Vercel Cron

Now it compounds

The Foundation Server isn't just another project. It's the layer underneath everything else. Lessons from one project surface when they're relevant in another. The KB Quality Agent proved the architecture is portable. One system. Every session makes it smarter. Every project feeds the next.

Principles

What I've learned

Start small, prove value, let adoption spread.

Don't mandate AI adoption. Build one tool, show the results, let demand pull it across the org.

Bring the tool to the user.

If people have to learn your platform, you've already lost. Build around how they already work.

Human-in-the-loop by design.

The propose/review/approve pattern isn't a safety net. It's an architecture choice that builds trust and catches things AI misses.

Model selection by task.

Strongest model where reasoning quality matters. Cheapest model where the task is mechanical. Know which is which.

Not all AI is expensive.

Cheap models can produce high-quality output. Test and try before defaulting to the most powerful option. The right model for the task is often smaller and faster than you'd expect.

Agents that improve over time.

Track edit rates. Feed corrections back. If output quality isn't trending up, the system isn't learning.

The system should compound.

If every session starts from scratch, you're leaving value on the table. Capture what you learn. Make it accessible. Let it build.
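The "track edit rates" principle above can be made concrete with a small metric: the share of agent outputs a human had to change before approving them, grouped by week. This is a sketch under assumptions: the `ReviewedOutput` shape, the week label format, and the crude first-versus-latest trend check are all illustrative.

```typescript
interface ReviewedOutput {
  week: string; // sortable label such as "2025-W03" (assumed format)
  wasEdited: boolean; // true if a human changed the output before approving
}

interface WeeklyEditRate {
  week: string;
  editRate: number; // edited outputs divided by total outputs, 0 to 1
}

function editRateByWeek(outputs: ReviewedOutput[]): WeeklyEditRate[] {
  const totals = new Map<string, { edited: number; total: number }>();
  for (const o of outputs) {
    const t = totals.get(o.week) ?? { edited: 0, total: 0 };
    t.total += 1;
    if (o.wasEdited) t.edited += 1;
    totals.set(o.week, t);
  }
  return Array.from(totals.entries())
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0))
    .map(([week, t]) => ({ week, editRate: t.edited / t.total }));
}

// Crude trend check: a falling edit rate means fewer corrections were needed.
function isImproving(rates: WeeklyEditRate[]): boolean {
  if (rates.length < 2) return false;
  return rates[rates.length - 1].editRate < rates[0].editRate;
}
```

A lower edit rate is a proxy for quality, not proof of it, so it works best alongside the corrections themselves being fed back into the agent.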

Aaron Bleiweiss / 2026

LinkedIn Email