Technical Overview
Paste or upload any HTML file. Uncluster parses it, understands its structure, and converts it into whatever you need — clean JSX, a full TypeScript project, an EJS server setup, or just extracted and separated assets. No configuration. No local tooling required.
Features
Every feature runs on the same server endpoint model — send HTML in, get a transformed result back.
Takes messy, minified, or inconsistently indented HTML and outputs it with clean, normalized indentation. Tags are properly nested, whitespace is trimmed, and the structure becomes easy to read at a glance.
→ returns formatted HTML string
Rewrites the HTML as a valid React JSX component. Attribute names are translated
(class→className, for→htmlFor, and event handlers such as onclick→onClick),
inline styles are converted to JS objects, void elements are self-closed,
and structural wrapper tags (<html>, <head>, <body>) are stripped out.
→ returns ready-to-paste .jsx / .tsx component
Scans the DOM for elements that repeat with the same structure and class names. Patterns that appear three or more times and match known UI identifiers (card, button, badge, modal, etc.) are surfaced as component suggestions, each with a generated prop list and starter JSX body.
→ returns JSON array of component suggestions
Pulls all inline <style> and <script> blocks
out of the HTML into separate files, then downloads any external CDN stylesheets
and scripts referenced via <link> and <script src>
tags. The HTML is rewritten to reference these local files, then everything is
bundled into a ZIP.
→ returns .zip with separated HTML, CSS, JS files
Scaffolds a complete, runnable Express + Vite + TypeScript project around your HTML.
Generates package.json, vite.config.js, tsconfig.json,
ESLint and Prettier configs, a server.js, and a structured src/
directory with your converted component inside it.
→ returns .zip, unzip and run npm install
Scaffolds an Express + EJS server-rendered project. Your HTML is split into EJS
partials — one for the header, one for the footer, and one per major content block.
Each partial is wired up to an Express route that calls res.render().
No client-side framework needed.
→ returns .zip ready for server-side rendering
The Pipeline
Every request runs through the same five-stage pipeline, regardless of which feature you use. Only the final output stage changes.
The raw HTML string is parsed with html.Parse() from Go's golang.org/x/net/html
package. This produces a linked node tree, not an array or a flat list. Each node
has a type (element, text, comment), a tag name, a list of attributes, and pointers
to its first child and next sibling. Every subsequent stage reads this tree; nothing
works directly on the raw string.
<div class="card"> <h2>Title</h2> <p>Body text here</p> </div>
Element: div
  attr: class="card"
  └─ Element: h2
       └─ Text: "Title"
  └─ Element: p
       └─ Text: "Body text here"
Every stage visits the tree the same way: process the current node, descend into
FirstChild, then continue along NextSibling, recursing at each step. This depth-first
walk visits every node in the document exactly once. The converter uses this
walk to build JSX. The analyzer uses it to count repeating patterns.
The formatter uses it to track indent depth.
Diagram — depth-first traversal order
graph LR
A["html.Parse()"] --> B["DOM root"]
B --> C["visit FirstChild"]
C --> D{"has children?"}
D -->|yes| E["recurse into children"]
D -->|no| F["move to NextSibling"]
E --> F
F --> G{"sibling exists?"}
G -->|yes| C
G -->|no| H["return up"]
style A fill:#1565c0,color:#ffffff,stroke:#0d47a1
style B fill:#1b5e20,color:#ffffff,stroke:#1b5e20
style E fill:#e65100,color:#ffffff,stroke:#bf360c
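The walk described above can be sketched in a few lines of Go. This is a minimal illustration, not Uncluster's actual code: Node is a hypothetical stand-in that keeps only a label and the FirstChild/NextSibling pointers, and render shows how a formatter might turn walk depth into indentation.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical stand-in for the parser's node type. It keeps only
// the fields the walk needs: a label plus FirstChild/NextSibling pointers.
type Node struct {
	Label       string
	FirstChild  *Node
	NextSibling *Node
}

// walk visits n, then each of its children in document order, passing the
// current depth to visit. Every node is visited exactly once.
func walk(n *Node, depth int, visit func(n *Node, depth int)) {
	visit(n, depth)
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		walk(c, depth+1, visit)
	}
}

// render uses the walk the way a formatter might: depth drives indentation.
func render(root *Node) string {
	var b strings.Builder
	walk(root, 0, func(n *Node, depth int) {
		b.WriteString(strings.Repeat("  ", depth))
		b.WriteString(n.Label)
		b.WriteByte('\n')
	})
	return b.String()
}

// sampleTree builds the div > (h2 > text, p > text) tree from the example.
func sampleTree() *Node {
	title := &Node{Label: `Text: "Title"`}
	body := &Node{Label: `Text: "Body text here"`}
	p := &Node{Label: "Element: p", FirstChild: body}
	h2 := &Node{Label: "Element: h2", FirstChild: title, NextSibling: p}
	return &Node{Label: "Element: div", FirstChild: h2}
}

func main() {
	fmt.Print(render(sampleTree()))
}
```

Passing the visitor as a function lets one traversal serve several stages: a converter, an analyzer, and a formatter would each supply a different callback.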
For JSX conversion, each element's attributes are run through a lookup table of
70+ entries. class becomes className.
for becomes htmlFor. Lowercase event handlers such as onclick
become their camelCase equivalents (onClick). Inline style="color:red"
strings become style={{ color: 'red' }} objects.
Elements like <html>, <head>, and
<body> are skipped entirely because they don't belong in a React component.
<label for="email" class="field" onclick="go()" style="color:red"> Email </label>
<label htmlFor="email"
className="field"
onClick={go}
style={{ color: 'red' }}>
Email
</label>
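A rough sketch of the two translations shown above: a name lookup and a style-string conversion. The table here holds five illustrative entries rather than the full 70+, and the function names (jsxAttrName, styleObject, camelCase) are assumptions made for this example, not Uncluster's internals.

```go
package main

import (
	"fmt"
	"strings"
)

// attrNames is a tiny, illustrative subset of an HTML-to-JSX lookup table.
var attrNames = map[string]string{
	"class":    "className",
	"for":      "htmlFor",
	"onclick":  "onClick",
	"onchange": "onChange",
	"tabindex": "tabIndex",
}

// jsxAttrName returns the JSX spelling of an HTML attribute name, or the
// name unchanged when the table has no entry for it.
func jsxAttrName(name string) string {
	if mapped, ok := attrNames[strings.ToLower(name)]; ok {
		return mapped
	}
	return name
}

// camelCase turns a CSS property such as "background-color" into
// "backgroundColor".
func camelCase(prop string) string {
	parts := strings.Split(prop, "-")
	for i := 1; i < len(parts); i++ {
		if parts[i] != "" {
			parts[i] = strings.ToUpper(parts[i][:1]) + parts[i][1:]
		}
	}
	return strings.Join(parts, "")
}

// styleObject converts an inline style string into a JSX style expression.
// It splits on ";" and on the first ":" of each declaration, so values that
// themselves contain semicolons (for example data URLs) are not handled.
func styleObject(style string) string {
	var pairs []string
	for _, decl := range strings.Split(style, ";") {
		prop, value, ok := strings.Cut(decl, ":")
		if !ok {
			continue
		}
		prop, value = strings.TrimSpace(prop), strings.TrimSpace(value)
		if prop == "" || value == "" {
			continue
		}
		value = strings.ReplaceAll(value, "'", "\\'")
		pairs = append(pairs, fmt.Sprintf("%s: '%s'", camelCase(prop), value))
	}
	if len(pairs) == 0 {
		return "{{}}"
	}
	return "{{ " + strings.Join(pairs, ", ") + " }}"
}

func main() {
	fmt.Println(jsxAttrName("class"), jsxAttrName("for"), jsxAttrName("onclick"))
	fmt.Println(styleObject("color:red; background-color: #fff"))
}
```

Unknown attribute names pass through untouched, which keeps attributes like id and href valid without needing a table entry.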
For component analysis, each element visited during the walk is fingerprinted by
its tag name, CSS classes, and id — for example div.card#featured.
Elements that share the same fingerprint are grouped and counted. Any pattern that
appears three or more times and whose name matches a known UI keyword
(card, button, badge, modal, dialog, avatar, toast, alert...) is flagged as a
component candidate. Generic structural tags like div and
section are excluded even if they repeat.
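The grouping step might look roughly like this. The Element type, the short keyword list, and the substring match are all assumptions for illustration. One consequence of this reading is that a bare div or section never contains a keyword, so it drops out without a special case.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Element is a hypothetical, flattened view of one element seen by the walk.
type Element struct {
	Tag     string
	Classes []string
	ID      string
}

// uiKeywords is a short illustrative list; the text above names a few more.
var uiKeywords = []string{"card", "button", "badge", "modal", "dialog", "avatar", "toast", "alert"}

// fingerprint builds a key such as "div.card#featured".
func fingerprint(e Element) string {
	var b strings.Builder
	b.WriteString(e.Tag)
	for _, c := range e.Classes {
		b.WriteByte('.')
		b.WriteString(c)
	}
	if e.ID != "" {
		b.WriteByte('#')
		b.WriteString(e.ID)
	}
	return b.String()
}

// matchesKeyword reports whether the fingerprint contains a UI keyword.
// Substring matching is an assumption made for this sketch.
func matchesKeyword(fp string) bool {
	for _, k := range uiKeywords {
		if strings.Contains(fp, k) {
			return true
		}
	}
	return false
}

// candidates groups elements by fingerprint and returns, in sorted order,
// those seen at least minCount times that also match a keyword.
func candidates(elems []Element, minCount int) []string {
	counts := make(map[string]int)
	for _, e := range elems {
		counts[fingerprint(e)]++
	}
	var out []string
	for fp, n := range counts {
		if n >= minCount && matchesKeyword(fp) {
			out = append(out, fp)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	elems := []Element{
		{Tag: "div", Classes: []string{"card"}},
		{Tag: "div", Classes: []string{"card"}},
		{Tag: "div", Classes: []string{"card"}},
		{Tag: "div"}, {Tag: "div"}, {Tag: "div"},
		{Tag: "span", Classes: []string{"badge"}},
	}
	fmt.Println(candidates(elems, 3)) // prints [div.card]
}
```

Here div.card qualifies, the bare div repeats but matches no keyword, and span.badge matches a keyword but appears only once.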
All string output is built with strings.Builder, which appends into a single growing
buffer instead of allocating a new string on every concatenation. For JSX: imports at the top,
the component function body in the middle, any extracted script logic at the bottom.
For exports: the ZIP is assembled in memory using Go's archive/zip
and streamed directly back in the HTTP response.
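For the export half of this stage, assembling an archive in memory with archive/zip could look like the sketch below. The helper names buildZip and readZip are hypothetical, and readZip exists only to verify the round trip.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"sort"
)

// buildZip writes the given files into an in-memory ZIP archive and returns
// its bytes. Names are sorted because Go map iteration order is random.
func buildZip(files map[string]string) ([]byte, error) {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(w, files[name]); err != nil {
			return nil, err
		}
	}
	// Close writes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readZip reads an archive back into a name-to-content map.
func readZip(data []byte) (map[string]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for _, f := range zr.File {
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		body, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, err
		}
		out[f.Name] = string(body)
	}
	return out, nil
}

func main() {
	data, err := buildZip(map[string]string{
		"index.html":  `<link rel="stylesheet" href="style-0.css">`,
		"style-0.css": "body { margin: 0; }",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	files, err := readZip(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(files), "entries in", len(data), "bytes")
}
```

In an HTTP handler, the returned bytes can be written to the response with a Content-Type of application/zip, so nothing touches the disk.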
Internals
The Export ZIP and project scaffold features need to separate your HTML from its styles and scripts. Here's what the extractor actually does.
Every <style> block in the document is lifted out and written to
its own numbered CSS file (style-0.css, style-1.css, ...).
Every <script> block without a src attribute is
lifted into a separate JS file. The original tags are then removed from the HTML.
Any <link rel="stylesheet"> or <script src="...">
pointing to an external URL is fetched over HTTP and saved locally. CDN hostnames
are replaced with short aliases in the filename, unsafe characters are stripped,
and names are truncated to stay filesystem-safe. The HTML's href and
src attributes are rewritten to point to the local file.
cdn.jsdelivr.net/npm/bootstrap@5/dist/css/bootstrap.min.css
becomes jsdelivr-bootstrap-min.css in the ZIP.
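One way to derive such a filename is sketched below. The alias table, the set of characters treated as unsafe, and the 64-character limit are all assumptions; only the jsdelivr example comes from the text above.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"regexp"
	"strings"
)

// hostAliases is a hypothetical alias table; only the jsdelivr entry is
// implied by the example in the text.
var hostAliases = map[string]string{
	"cdn.jsdelivr.net": "jsdelivr",
}

// unsafeChars matches runs of characters replaced with "-" (assumed rule).
var unsafeChars = regexp.MustCompile(`[^a-zA-Z0-9_-]+`)

// maxStemLen is an assumed cap on the sanitized name, excluding extension.
const maxStemLen = 64

// localName maps an external asset URL to a filesystem-safe local filename.
func localName(rawURL string) (string, error) {
	// Accept scheme-less and protocol-relative references too.
	if !strings.Contains(rawURL, "://") {
		rawURL = "https://" + strings.TrimPrefix(rawURL, "//")
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}

	host := u.Hostname()
	alias, ok := hostAliases[host]
	if !ok {
		alias = unsafeChars.ReplaceAllString(host, "-")
	}

	base := path.Base(u.Path)
	if base == "." || base == "/" {
		base = "index" // URL had no path component
	}
	ext := path.Ext(base)
	stem := strings.TrimSuffix(base, ext)
	stem = strings.Trim(unsafeChars.ReplaceAllString(stem, "-"), "-")
	if len(stem) > maxStemLen {
		stem = stem[:maxStemLen] // stem is ASCII-only here, so slicing is safe
	}
	if stem == "" {
		stem = "file"
	}
	return alias + "-" + stem + ext, nil
}

func main() {
	name, err := localName("cdn.jsdelivr.net/npm/bootstrap@5/dist/css/bootstrap.min.css")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name) // prints jsdelivr-bootstrap-min.css
}
```

Keeping the extension outside the sanitized stem preserves the file type, so the rewritten href or src still resolves to a .css or .js file.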
Project Generation
The scaffold exporters don't just rename files. They generate a complete project:
unzip it, run npm install, and start developing immediately.
Your HTML is converted to JSX and placed in src/App.tsx. The scaffolder
generates all the boilerplate around it using Go's text/template engine —
inserting your project name, dependency versions, and converted component at the right
places. The output is a ready-to-run SPA with hot module replacement via Vite and an
Express server for API routes.
package.json · vite.config.js ·
server.js · tsconfig.json · .eslintrc.json ·
.prettierrc · .gitignore · src/ with your component
The HTML is parsed a second time and split into named EJS partial files — one for the
header, one for the footer, and one for each major content section found in the body.
Each partial is placed in views/partials/ and included from a root
views/index.ejs. Express route handlers call res.render() on
each view. No client-side framework is involved — pages are rendered server-side on
every request.
Codebase
Each internal package owns exactly one concern. No package imports another — they communicate only through function arguments and return values.
API Reference
All POST endpoints accept a JSON body with an html field. Export endpoints return a binary
ZIP. The server includes CORS, request logging, and panic recovery middleware.
| Method | Path | Request | Response |
|---|---|---|---|
| POST | /api/format | { html: string } | Formatted HTML string |
| POST | /api/convert | { html: string } | JSX component string |
| POST | /api/analyze | { html: string } | JSON array of component suggestions |
| POST | /api/export | { html: string } | ZIP: separated HTML, CSS, JS |
| POST | /api/export-nodejs | { html: string } | ZIP: full Express + Vite + TS project |
| POST | /api/export-nodejs-ejs | { html: string } | ZIP: full Express + EJS project |
| GET | /api/health | (none) | { status, service, version } |
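Calling one of the POST endpoints from Go might look like the sketch below. The helper name postHTML is hypothetical, and because the exact response envelope is not specified here, it returns the raw response body. So the example runs offline, a stub server stands in for a real Uncluster instance and simply echoes the html field back.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// postHTML sends {"html": "..."} to baseURL+path and returns the raw body.
func postHTML(baseURL, path, html string) ([]byte, error) {
	payload, err := json.Marshal(map[string]string{"html": html})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(baseURL+path, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("%s: unexpected status %s", path, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// stubServer stands in for a running Uncluster instance. It only echoes the
// html field back; a real server would return the transformed result.
func stubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req struct {
			HTML string `json:"html"`
		}
		if r.Method != http.MethodPost || json.NewDecoder(r.Body).Decode(&req) != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		io.WriteString(w, req.HTML)
	}))
}

func main() {
	srv := stubServer()
	defer srv.Close()

	out, err := postHTML(srv.URL, "/api/format", "<p>hi</p>")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```

Against a real deployment, replace srv.URL with the server's base URL. For the export endpoints, the returned bytes are the ZIP archive and can be written straight to a file.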
The cmd/uncluster-split binary runs the full
extraction pipeline locally without starting the HTTP server:
go run ./cmd/uncluster-split -input file.html -output ./out
Pass -manifest true to also write a split-manifest.json
listing every output file produced.