
AI Content Pipeline

Scrape, LLM analysis, multimodal generation, video render

role: Pipeline architecture, AI orchestration
year: 2025-26
type: Automation
// CAROUSEL TRACK
01 Scrape: YouTube · HN · RSS
02 LLM Analyze: Claude · topic + copy
03 Image Gen: Nano Banana Pro
04 HTML Render: Brand tokens
05 Screenshot: Playwright · 1080×1350
Carousel · 10 slides · ~5–10 min vs. 3 hrs by hand

// VIDEO TRACK · parallel
Whisper: transcribe
Silence-Cut: auto-edit
Remotion: captions + motion

Pipeline that runs LLM, image, and video models end to end. Built to feed a brand's social calendar without a designer or editor in the loop.

// goal

The brief

Producing consistent social content (carousels, short videos) by hand is slow: a single carousel set took roughly three hours. I wanted a pipeline that watches news sources, picks topics, builds slides, and post-produces video footage automatically.

// approach

What I built

Five stages: (1) scrape YouTube transcripts, Hacker News, and RSS; (2) analyze with Claude to pick topics and write the per-slide copy; (3) generate background images with Nano Banana Pro on the Gemini 3 Image API; (4) render HTML slides against brand tokens; (5) screenshot with Playwright to a 1080x1350 PNG carousel. The video workflow runs in parallel: Whisper transcription, automatic silence cuts, and a Remotion motion-graphics layer.

// features

Inside the build

Scrape and select

Reads YouTube transcripts, Hacker News, and RSS feeds. A Claude step picks topics that fit the target audience and brand voice, writes the per-slide copy, and returns a structured slide deck spec.
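
As an illustration of what a "structured slide deck spec" could look like, here is a minimal sketch that parses and validates the JSON text an LLM step might return. The field names (`topic`, `slides`, `headline`, `body`, `image_prompt`) and the `parse_deck_spec` helper are assumptions made for this example, not the schema actually used in the build.

```python
import json
from dataclasses import dataclass


@dataclass
class Slide:
    headline: str
    body: str
    image_prompt: str  # hypothetical field: prompt handed to the image step


@dataclass
class DeckSpec:
    topic: str
    slides: list[Slide]


def parse_deck_spec(raw: str, expected_slides: int = 10) -> DeckSpec:
    """Parse the JSON text returned by the LLM step and reject malformed decks."""
    data = json.loads(raw)
    slides = [
        Slide(
            headline=str(item["headline"]).strip(),
            body=str(item["body"]).strip(),
            image_prompt=str(item["image_prompt"]).strip(),
        )
        for item in data["slides"]
    ]
    # Fail early so a bad response never reaches the image or render stages.
    if len(slides) != expected_slides:
        raise ValueError(f"expected {expected_slides} slides, got {len(slides)}")
    if any(not s.headline for s in slides):
        raise ValueError("every slide needs a headline")
    return DeckSpec(topic=str(data["topic"]).strip(), slides=slides)
```

Validating at this boundary means a missing key or a short deck raises immediately, and the step can be retried before any image generation is paid for.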

Multimodal image generation

Nano Banana Pro on the Gemini 3 Image API generates background imagery for each slide. Image prompts are written by the Claude step using strict brand and color rules.
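
One way to keep brand and color rules strict is to append them to every image prompt in code instead of trusting the model to repeat them. The sketch below assumes a hypothetical `BrandRules` structure and `build_image_prompt` helper; the palette, style text, and prompt wording are placeholders, and the image API call itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandRules:
    palette: tuple[str, ...]    # allowed hex colors (placeholder values in usage)
    style: str                  # short style description
    forbidden: tuple[str, ...]  # things the image must not contain


def build_image_prompt(subject: str, rules: BrandRules) -> str:
    """Combine a slide's subject with fixed brand constraints."""
    parts = [
        f"Background image for a social carousel slide about: {subject.strip()}.",
        f"Style: {rules.style}.",
        f"Use only these colors: {', '.join(rules.palette)}.",
    ]
    if rules.forbidden:
        parts.append(f"Do not include: {', '.join(rules.forbidden)}.")
    # 1080x1350 is a 4:5 portrait frame.
    parts.append("Aspect ratio 4:5 (1080x1350), leave space for overlaid text.")
    return " ".join(parts)
```

The resulting string would then be passed to the image model as the prompt for that slide.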

HTML slide and Playwright screenshot

Slides are rendered as HTML against the brand tokens, then captured to 1080x1350 PNG via Playwright. Same brand system as the website, no design tool in the loop.
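
The render-then-capture step can be sketched roughly as follows: brand tokens are substituted into an HTML template as CSS custom properties, and Playwright captures the page at a fixed 1080x1350 viewport. The token names, colors, template markup, and both helper functions are assumptions for illustration; only the Playwright sync API calls are taken from the library itself.

```python
from html import escape
from string import Template

# Hypothetical brand tokens; the real token names and values are not given here.
BRAND_TOKENS = {
    "bg": "#0B0B0F",
    "fg": "#F5F5F0",
    "accent": "#FF5A36",
    "font": "Inter, sans-serif",
}

SLIDE_TEMPLATE = Template("""<!doctype html>
<html><head><meta charset="utf-8"><style>
  :root { --bg: $bg; --fg: $fg; --accent: $accent; }
  body { margin: 0; width: 1080px; height: 1350px;
         background: var(--bg); color: var(--fg); font-family: $font; }
  h1 { color: var(--accent); font-size: 96px; margin: 120px 80px 40px; }
  p  { font-size: 48px; margin: 0 80px; }
</style></head>
<body><h1>$headline</h1><p>$body</p></body></html>""")


def render_slide_html(headline: str, body: str, tokens: dict = BRAND_TOKENS) -> str:
    """Fill the template with brand tokens and HTML-escaped slide copy."""
    return SLIDE_TEMPLATE.substitute(
        tokens, headline=escape(headline), body=escape(body)
    )


def screenshot_slide(html: str, out_path: str) -> None:
    """Capture the slide as a 1080x1350 PNG (requires Playwright and Chromium)."""
    # Imported lazily so the HTML rendering works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport={"width": 1080, "height": 1350})
        page.set_content(html)
        page.screenshot(path=out_path)
        browser.close()
```

Calling `screenshot_slide(render_slide_html("Title", "Body copy"), "slide-01.png")` once per slide would produce the carousel images. Because the tokens live in one dictionary, a brand change becomes a data change, not a template edit.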

Video workflow

Runs in parallel to the carousel track: Whisper transcription, automatic silence cuts, and a Remotion motion-graphics layer for captions and overlays. Raw footage goes in, post-produced short-form video comes out.
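
One plausible way to derive silence cuts from a transcript is to treat the gaps between Whisper segments as silence and keep only the spoken intervals. The sketch below assumes segments shaped like the `segments` list in Whisper's transcription result (dicts with `start` and `end` in seconds); the `keep_intervals` helper and its `max_gap` and `padding` defaults are hypothetical, and the actual build may detect silence differently, for example from audio levels.

```python
def keep_intervals(segments, max_gap=0.6, padding=0.1):
    """Return (start, end) intervals of speech to keep, in seconds.

    Segments separated by a pause of at most `max_gap` are merged, and
    `padding` is added around each segment so cuts do not clip words.
    """
    intervals = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        start = max(0.0, seg["start"] - padding)
        end = seg["end"] + padding
        if intervals and start - intervals[-1][1] <= max_gap:
            # Short pause: extend the previous interval instead of cutting.
            intervals[-1] = (intervals[-1][0], max(intervals[-1][1], end))
        else:
            intervals.append((start, end))
    return intervals
```

Everything outside the returned intervals is a candidate for removal; the interval list can then drive the actual trim in whatever editing or rendering tool follows.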

// tech stack

Stack used

// AI

Claude API · google-genai SDK · Whisper · Nano Banana Pro

// Render

Playwright · Remotion · Next.js · HTML / CSS

// Sources

YouTube Transcript API · Hacker News API · RSS

// Runtime

Python · Node.js

// outcome

Carousel production time dropped from roughly three hours per set to under fifteen minutes. The video pipeline turns raw footage into captioned short-form videos automatically.