Available for roles · SF · PST
Full-Stack AI Engineer · San Francisco

Shipping intelligent products —
from frontier LLMs
to pixel-perfect interfaces.

I'm Binaya Tripathi. I architect agentic AI systems, RAG pipelines and production LLM apps with Claude, GPT-5, LangGraph and MCP. Next.js on the front, FastAPI on the back, shipped on AWS Bedrock, Azure OpenAI & GCP Vertex AI.

Available · 12+ YRS
Now

Senior Full-Stack AI Engineer @ Builders Academy — mentoring 50+ devs on Claude, Cursor & MCP.

Current stack
Claude 4.7 · LangGraph · MCP · Next.js 16 · React 19 · Bun · Hono · pgvector
Portfolio · v2026.04 · San Francisco, CA · PST
Claude Opus 4.7 · GPT-5 · LangGraph · MCP · Claude Agent SDK · Next.js 16 · React 19 · Vercel AI SDK · Cursor · Claude Code · pgvector · Turbopuffer · FastAPI · Hono · Bun · Turso · AWS Bedrock · Azure OpenAI · LangSmith · Braintrust · Three.js · Motion
01 · About

AI-first.
Full-stack craft.

12+ years moving between research-grade ML, production web infra, and product thinking — turning frontier LLMs into shippable products people actually use.

I architect end-to-end LLM-powered web apps, pairing React / Next.js / TypeScript front ends with Python (FastAPI) and Node.js backends. I've shipped agentic AI tools, RAG pipelines with pgvector & Weaviate, and production LLM services on AWS Bedrock, Azure OpenAI and GCP Vertex AI. I care about evals, observability, and cost, not just demos.

12+
Years of Engineering
60%
Manual Effort Cut via AI Agents
50%
Faster Delivery Across Teams
70%
Research Time Reduction
45%
LLM Cost Reduction
2M+
Monthly Sessions Scaled
Education
  • Master's Degree, Data Science
    Eastern University
    2021 — 2022
  • Bachelor of Science, Physics
    Winona State University
    2009 — 2013
Certifications
  • JavaScript for Web Designers
  • Frontend Web Development Certificate
02 · Selected work

Things I've shipped

LLM products, AI storytelling SaaS, and commerce at scale. Each shaped by real constraints and measurable outcomes.

01 / 03

AIGen2o

AI + Blockchain Content Platform

An AI-generated content platform that combines multi-modal LLM generation with on-chain authenticity verification. Creators mint articles, audio, and visuals, each signed and provenance-tracked via Web3 primitives.

Next.js · Claude · LangChain · Web3 · pgvector
Multi-modal GenAI with blockchain-backed provenance.
Case study
02 / 03

StoryWonderBook

AI Storytelling for Kids

A magical story generation platform for families, powered by streaming LLMs, safety guardrails, and a credits-based billing model. 100+ stories generated, 30+ happy kids, and scaling.

Next.js · OpenAI · Streaming · Stripe · Supabase
100+ stories · 30+ active kids · 3-tier SaaS.
Case study
03 / 03

Responsive Commerce

Omni-device Shopping UI

A responsive e-commerce platform scaled to 2M+ monthly sessions and 200K+ SKUs. Cut checkout latency from 3s to 1s via Redis caching, NGINX load balancing, and a lean React + Node.js stack.

React · Redux · Node.js · PostgreSQL · Redis
2M+ sessions · 200K+ SKUs · 1s checkout.
Case study
— Principles

How I actually work

I'm wary of AI theater. Here are the four rules that have held up across 12 years and five engineering roles.

A / 01

Evals before demos

Every LLM feature ships with an eval harness — offline + online. If we can't measure it, we don't claim it.

A / 02

Cost is a product decision

Model choice, prompt caching, context discipline. I've cut real LLM bills by 45% without losing quality.

A / 03

Agents in the loop, humans in command

LangGraph state machines, MCP tools, and guardrails. Autonomous where it wins, human-gated where it matters.

A / 04

Ship the boring half too

AI features need auth, billing, observability, retries, and migrations. I build all of it — front to back.

03 · Journey

12 years shipping

From classical ML at scale to frontier agentic AI — condensed timeline of systems shipped and measurable impact.

  • 01

    Senior Full-Stack AI Engineer

    @ Builders Academy · Current
    04/2023 – Present · Remote
    • Directed AI agent workshops and technical onboarding for 50+ developers globally using Claude and Cursor.
    • Architected and shipped 5+ agentic AI tools using Claude, LangChain, LangGraph, and OpenAI GPT with multi-step workflow automation — cut manual team effort by 60%.
    • Built end-to-end LLM pipelines and agentic AI systems enabling production-grade autonomous behavior for startup clients.
    • Automated debugging, deployment, and task tracking via AI agents — cut delivery time by 50% across 10+ active teams.
    • Mentored developers on prompt engineering, Claude/LangChain agent design, MCP integrations, and AI-assisted full-stack architecture.
    Claude · LangGraph · MCP · Cursor · Next.js · React · Node.js · Python
  • 02

    Full-Stack AI Engineer & Co-Founder

    @ Co-FounderGPT
    12/2022 – 11/2023 · San Francisco, US
    • Architected an LLM-powered co-founder matching platform with GPT-4 / GPT-3.5 / Claude 2 and LangChain — served 10K+ founders with streaming SSE responses.
    • Built end-to-end RAG with pgvector + Weaviate and OpenAI embeddings — indexed 100K+ startup documents, cut manual research by 70%.
    • Shipped a full-stack Next.js + FastAPI + PostgreSQL app with TailwindCSS, shadcn/ui, Zustand, and React Query.
    • Deployed production LLM services on AWS Lambda/ECS with Docker + GitHub Actions, instrumented with LangSmith observability — cut model cost by 45%.
    • Automated 80% of founder onboarding via document parsing, web research, and CRM sync using tool-use / function calling.
    GPT-4 · Claude 2 · LangChain · pgvector · Weaviate · FastAPI · Next.js · AWS
  • 03

    AI Full Stack Engineer

    @ NFT Studios
    08/2022 – 05/2023 · San Francisco, US
    • Engineered a production GenAI NFT discovery & valuation platform with GPT-3.5/4, LangChain, and Weaviate; processed metadata for 1M+ tokens.
    • Built full-stack React + Next.js + Node.js marketplace with GraphQL, REST, PostgreSQL, Redis, and WebSockets on AWS (S3, Lambda, ECS).
    • Orchestrated LLM pipelines with LangChain + LlamaIndex blending GPT and Claude 1 for reasoning on on-chain events and investor reports.
    • Automated CI/CD with GitHub Actions, Docker, Kubernetes (EKS), and Terraform — launched 15+ features with zero production incidents over 9 months.
    • Applied prompt engineering, few-shot and chain-of-thought techniques with Hugging Face + PyTorch — raised valuation accuracy by 32%.
    GPT-4 · Claude · LangChain · LlamaIndex · Weaviate · React · GraphQL · Kubernetes
  • 04

    Senior Full Stack Engineer

    @ Bitfari
    03/2022 – 10/2022 · San Francisco, US
    • Developed a full-stack smart-city campaign platform in React, TypeScript, Node.js, and Python (FastAPI) — delivered real-time content to 500+ displays across 12 US markets.
    • Built an early AI creative-assistant with GPT-3 (davinci) for ad-copy, plus Hugging Face classification and scikit-learn predictive models — cut creative turnaround from 3 days to 4 hours.
    • Containerized microservices with Docker + Kubernetes on AWS with Terraform — moved deployment cadence from weekly to daily.
    • Added Datadog + Sentry observability — reduced mean-time-to-detect incidents by 55%.
    GPT-3 · Hugging Face · FastAPI · Node.js · AWS · Terraform · Docker
  • 05

    Full Stack Engineer

    @ WANAMAKER
    03/2013 – 08/2021 · San Francisco, US
    • Built and maintained a customer-facing e-commerce platform with React, Redux, TypeScript, Node.js, PostgreSQL — scaled to 2M+ monthly sessions and 200K+ SKUs across B2B/B2C.
    • Designed REST, GraphQL, and microservices on AWS with Docker, Jenkins, GitHub Actions; Redis caching + NGINX — cut checkout latency from 3s to 1s.
    • Built classical ML/NLP pipelines (scikit-learn, PyTorch, spaCy, Pandas) for product recs, search ranking, and demand forecasting across 200K+ SKUs.
    • Led code reviews, TDD (Jest, Playwright, Pytest), and mentored 6+ junior engineers — reduced production defects by 40%.
    React · Redux · Node.js · PostgreSQL · scikit-learn · PyTorch · spaCy · AWS
04 · Stack

The 2026 toolkit

Production-tested across agentic pipelines, SaaS platforms, and e-commerce at 2M+ sessions — not a sandbox list.

Frontier LLMs

Models
  • Claude Opus 4.7
  • Claude Sonnet 4.6
  • Claude Haiku 4.5
  • GPT-5 · o3 · o4-mini
  • Gemini 2.5 Pro
  • Llama 4
  • DeepSeek R1
  • Qwen 3
  • Mistral Large 2

Agentic AI

Agents
  • Claude Agent SDK
  • OpenAI Agents SDK
  • LangGraph
  • CrewAI
  • Pydantic AI
  • Mastra
  • Model Context Protocol (MCP)
  • FastMCP · MCP servers
  • Tool Use / Function Calling
  • Multi-agent orchestration
  • Computer-use agents

AI Coding Tooling

Craft
  • Claude Code
  • Cursor (Agent mode)
  • Windsurf
  • Zed AI
  • Cline · Continue · Aider
  • v0 · Lovable · Bolt
  • GitHub Copilot
  • Codegen / scaffolding

RAG & Vector

Retrieval
  • pgvector · Postgres FTS
  • Weaviate · Qdrant · Chroma
  • Turbopuffer · LanceDB
  • Upstash Vector
  • Hybrid search (BM25 + dense)
  • Contextual Retrieval
  • Rerankers (Cohere, Voyage)
  • Graph-RAG

LLMOps & Evals

Prod
  • LangSmith · LangFuse
  • Braintrust · Arize Phoenix
  • Helicone
  • Prompt caching
  • Structured outputs · JSON mode
  • Guardrails · PII redaction
  • Offline + online evals
  • LoRA · QLoRA · DPO fine-tuning

Frontend

UI
  • Next.js 16 (Turbopack, Cache Components)
  • React 19 (RSC, Compiler, Actions)
  • TanStack Start · Remix · Astro
  • TypeScript 5
  • Tailwind v4 · shadcn/ui · Radix
  • Motion (Framer) · GSAP · Lenis
  • Three.js · React Three Fiber · Rive
  • Zustand · TanStack Query
  • React Native (New Architecture)

Backend & APIs

Runtime
  • Node.js · Bun
  • Hono · Elysia · tRPC
  • Python FastAPI · Litestar
  • Vercel AI SDK
  • Nitro · NestJS
  • gRPC · GraphQL Yoga
  • Server-Sent Events · WebSockets
  • Streaming UI · Server Actions

Data & Infra

Platform
  • Postgres · Neon · Supabase · Turso
  • Redis · DragonflyDB
  • Prisma 6 · Drizzle
  • Cloudflare Workers · D1 · R2
  • AWS Bedrock · Lambda · ECS
  • Azure OpenAI · GCP Vertex AI
  • Modal · Railway · Fly.io
  • Docker · Kubernetes · Terraform
  • OpenTelemetry · Datadog · Sentry
05 · Contact

Let's ship something worth caring about.

Open to Senior / Staff Full-Stack AI Engineer roles, GenAI product teams, and consulting engagements on LLM apps, agents, and RAG infra. Ping me — I read every message.

San Francisco, CA
+1 (786) 471-8264
binayatripathi.dev@gmail.com