Detect an AI-Generated Website in 30 Seconds: 21 Objective Signs (2026 Method)
A checklist of 21 visual and technical indicators, measurable in under 30 seconds, to determine if a website was generated by v0, Lovable, Bolt, Claude, or GPT — with annotated examples and an AI score calculator.
A site lands in the inbox. The freelance invoice clears five figures. The agency pitch deck mentions "bespoke design system" four times. The reader has thirty seconds before the meeting starts. The question is simple and uncomfortable: did a human design this, or did an AI tool spit it out in eleven minutes?
This article is the answer. Twenty-one signs, each verifiable in seconds, scored on a severity scale, with a final calculator that produces a number between 0 and 75. Above 51, the site is what the industry now calls pure slop — generated by v0, Lovable, Bolt, Claude Artifacts, or ChatGPT canvas, then handed off as artisanal work. Below 16, a human designer made it. Between those bounds, a more careful reading is required.
The method is non-accusatory. AI-assisted work is fine. AI-dominant work passed off as human work is not. The point of this checklist is to give the reader — recruiter, agency client, founder, technical buyer — the vocabulary and the evidence to ask the right questions when a deliverable feels off.
---
TL;DR — The 30-second fast scan
Open the site. Answer five binary questions. Score one point per yes.
| # | Question | Yes / No |
|---|----------|----------|
| 1 | Does the primary color sit between #3b82f6 (Tailwind blue-500) and #8b5cf6 (Tailwind violet-500)? | □ |
| 2 | Is the body font Inter, with no fallback or variant tuning visible? | □ |
| 3 | Is there a hero with one H1, one paragraph, one filled CTA button, and one ghost CTA button, in that exact order? | □ |
| 4 | Is the pricing section three plans labeled approximately Starter / Pro / Enterprise, each in a rounded-2xl card with green checkmarks? | □ |
| 5 | Is the footer split into four columns titled approximately Product / Company / Resources / Legal? | □ |
Interpretation:
- 0–1: Almost certainly human. The fast scan stops here.
- 2–3: AI-assisted or templated. Worth a closer look but not damning.
- 4–5: AI-dominant. Run the full 21-sign audit.
The fast scan captures roughly 80 percent of the diagnostic value of the full method. The remaining 20 percent — the technical fingerprints, the HTTP headers, the sitemap timestamps, the shadcn class soup — covers the edge cases where a careful human happens to use the same defaults as an AI model, or where an AI output has been lightly retouched.
Anyone who reads only this section already has more diagnostic capacity than most clients who hand over checks for so-called custom sites. The next sections explain the why, then walk through the twenty-one signs in detail.
---
Why detecting AI slop is non-trivial
The detection problem in 2026 is not the same as it was in 2023. In 2023, an AI-generated landing page had visible tells in the copy: stilted phrasing, hallucinated features, repeated transitions. In 2026, the copy is mostly clean. The detection has shifted from text to structure.
AI-assisted versus 100 percent slop
A working definition is needed before any judgment.
AI-assisted describes a site where a designer or developer used an AI tool to draft sections, generate boilerplate, or accelerate iteration, and then made deliberate human choices: a non-default palette, a typography pairing, a layout that breaks the canonical grid, copy that sounds like a person wrote it. AI-assisted work is the dominant mode of frontend production in 2026 and is not a problem. The output reflects judgment.
Pure slop describes a site that came out of a generator, was lightly tweaked, and shipped as if a human had designed it from scratch. The output reflects defaults. Every choice — color, font, grid, copy — is the model's first guess. Stacked together, those first guesses produce a recognizable signature.
The distinction matters for two reasons. First, it determines whether a buyer was misled. Second, it determines whether the site has any chance of differentiation in a market where every competitor has access to the same generators.
AI code versus AI design
The detection landscape splits cleanly into two halves.
AI code — meaning AI-generated JavaScript, Python, or backend logic — can be detected with statistical tools. Anthropic's DeepCheck, GPTZero's code module, and similar systems analyze token distributions, control-flow patterns, and stylistic regularities to estimate the probability that a code block was machine-generated. These tools work because code has a measurable surface: identifiers, operators, structure. They produce a number.
AI design — meaning the visual output of a generator — does not yield to statistical detection. There is no token distribution. There is a screenshot. The signal lives in human heuristics: the recognition that a particular palette, a particular grid, a particular microcopy choice has appeared on ten thousand other sites in the last quarter. GPTZero applied to a screenshot returns nothing useful. The detector must be a person who knows what to look at.
This is why the method below is built around perceptual signs, not algorithms. The reader becomes the classifier.
Why generic detectors fail on visuals
A reasonable question: why not train a classifier on screenshots of human versus AI sites?
Three reasons. First, the training set drifts every quarter. v0 in April 2026 produces different defaults than v0 in October 2025. Second, the negative class — human-designed sites — includes everything from brutalist Berlin agency pages to corporate fintech dashboards. The variance swamps the signal. Third, and most importantly, the diagnostic is contextual. A rounded-2xl card grid is not slop on a Vercel template. It is slop on an invoice that claimed forty hours of bespoke design.
Detection is therefore a structured manual audit. The twenty-one signs are the structure. The thirty-second method is the protocol. The score is the output.
---
The 21-sign 30-second method
The method has three phases, each with a fixed time budget.
Phase 1 — Visual scan (10 seconds): Look at the homepage above the fold. Note the palette, the font, the hero structure, the presence of an emoji-prefixed pill, the gradient direction.
Phase 2 — Section walk (15 seconds): Scroll through the page once. Note the three-card grid, the social proof logos, the how-it-works steps, the pricing structure, the testimonial slider, the FAQ accordion, the footer columns.
Phase 3 — Technical inspect (5 seconds, optional): Right-click, Inspect Element. Look at one button's class attribute for shadcn-style class names. Look at the computed font-family. Look at the Network tab for requests to _next/static.
The full audit takes thirty seconds for a trained eye. The scoring sheet is the twenty-one-sign table at the end.
The signs are not weighted equally. Some, like the canonical three-card grid, are near-deterministic — present in 90 percent of generator outputs, present in maybe 15 percent of human outputs. Others, like the use of Lucide icons, are informative but not damning, since Lucide is genuinely popular among human designers. Each sign has a severity score from 1 to 5 reflecting its diagnostic value.
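The scoring rule implied by the severity scale can be sketched in a few lines. This is a minimal illustration rather than the article's calculator: the `Sign` shape and the function names are assumptions made for the example, and the two thresholds (above 51, below 16) are the ones stated in the introduction.

```typescript
// Hypothetical data shape for one audited sign; not part of any real tool.
interface Sign {
  id: number;       // sign number in the catalog
  severity: number; // 1 to 5, from the sign's "Severity" line
  flagged: boolean; // true if the auditor observed the sign
}

type Verdict = "human" | "needs careful reading" | "pure slop";

// Sum the severity of every flagged sign.
function auditScore(signs: Sign[]): number {
  return signs.reduce((total, s) => (s.flagged ? total + s.severity : total), 0);
}

// Thresholds as stated in the introduction: above 51 and below 16.
function auditVerdict(score: number): Verdict {
  if (score > 51) return "pure slop";
  if (score < 16) return "human";
  return "needs careful reading";
}
```

For example, flagging Signs 1, 3, 6, and 9 (severities 4, 5, 5, 5) gives a score of 19, which falls in the middle band.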
What follows is the detailed catalog.
---
The 21 objective signs
Sign 1 — Palette `#3b82f6` plus `#8b5cf6` (Tailwind blue-500 plus violet-500)
Severity: 4/5
The single most reliable visual signature of an AI-generated site is the pairing of Tailwind's blue-500 and violet-500. These two hex values, #3b82f6 and #8b5cf6, appear in virtually every default Tailwind theme demonstration, in the marketing copy of Vercel and Lovable, and consequently in the training distribution that governs what every major code generator produces when asked for a modern landing page. (The second shade is often loosely called purple, but in Tailwind's default palette #8b5cf6 is violet-500; purple-500 is #a855f7.)
The pattern is consistent. A primary button rendered in bg-blue-500 with white text. A secondary accent (a badge, an icon background, a gradient stop) rendered in bg-violet-500. A hover state that shifts to bg-blue-600 and bg-violet-600. No saturation tuning, no chroma adjustment, no brand-specific deviation.
A human designer working on a serious project chooses a palette. The choice involves a spec, a moodboard, a check against the brand's existing assets, a verification under different lighting and accessibility conditions. The output is a hex value that almost never matches Tailwind's defaults exactly, because Tailwind's defaults were chosen for general utility, not for the specific brand on the desk.
A generator does not choose. It returns the most likely token sequence given the prompt "modern startup landing page." That sequence is almost always blue-500 plus violet-500.
Verification method: Open DevTools. Click the primary button. In the Styles panel, look at the background-color property. If it reads rgb(59, 130, 246) — that is #3b82f6 — flag the sign. If the secondary accent reads rgb(139, 92, 246) — that is #8b5cf6 — flag again.
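DevTools reports computed colors as rgb() strings, while the sign is defined in hex. The conversion is easy to get wrong by eye, so here is a small sketch of it. The function names are hypothetical, and the two target values are simply the ones quoted in this sign.

```typescript
// Convert a computed color such as "rgb(59, 130, 246)" to "#3b82f6".
// Returns null when the string is not an rgb()/rgba() value.
function rgbToHex(computed: string): string | null {
  const m = /^rgba?\(\s*(\d{1,3})[\s,]+(\d{1,3})[\s,]+(\d{1,3})/.exec(computed.trim());
  if (m === null) return null;
  const parts = m.slice(1, 4).map((s) => {
    const hex = parseInt(s, 10).toString(16);
    return hex.length === 1 ? "0" + hex : hex; // pad single digits
  });
  return "#" + parts.join("");
}

// The two values this sign looks for, as quoted in the text above.
const SIGN_1_HEXES = ["#3b82f6", "#8b5cf6"];

function matchesSign1(computed: string): boolean {
  const hex = rgbToHex(computed);
  return hex !== null && SIGN_1_HEXES.indexOf(hex) !== -1;
}
```

The sketch ignores any alpha channel and does not validate that channels stay within 0 to 255; for a quick audit that is usually acceptable.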
Detection gain: This single sign correctly classifies roughly 60 percent of generator outputs against a baseline of 12 percent false positives in human-designed sites that happen to like blue and purple. It is the highest-leverage check in the method.
A counterexample worth noting: some serious B2B brands genuinely use blue-500 because they want to communicate trust and the founder happens to like that exact shade. The check is necessary but not sufficient. It must combine with at least two other signs from the list to support a strong classification.
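One way to see why a single sign is not sufficient is to treat the quoted percentages as a hit rate and a false-positive rate and combine them as likelihood ratios. The sketch below does that. It rests on assumptions the article does not make: that the percentages can be read as those two rates, that signs are statistically independent, and that a prior of 0.5 is reasonable. All three are illustrative only.

```typescript
// Likelihood ratio: how much more often a sign appears in generator output
// than in human work. Both rates are fractions between 0 and 1.
function likelihoodRatio(hitRate: number, falsePositiveRate: number): number {
  return hitRate / falsePositiveRate;
}

// Naive-Bayes style update. ASSUMPTION: signs are independent. Real signs are
// correlated (Tailwind colors and Tailwind card classes travel together), so
// this overstates confidence.
function posteriorProbability(prior: number, ratios: number[]): number {
  const priorOdds = prior / (1 - prior);
  const odds = ratios.reduce((acc, r) => acc * r, priorOdds);
  return odds / (1 + odds);
}
```

With the hypothetical prior of 0.5, Sign 1 alone (0.60 against 0.12, a ratio of about 5) yields a posterior of roughly 0.83. Adding Sign 3 (0.80 against 0.10, a ratio of about 8) raises it to roughly 0.98, which is the arithmetic behind requiring several signs before a strong classification.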
---
Sign 2 — Default Inter font, no fallback tuning
Severity: 3/5
Inter is the default sans-serif of the modern frontend stack. It ships in Tailwind documentation, in Vercel templates, in shadcn/ui examples, and in every starter that v0 and Lovable produce. Its ubiquity is not a flaw; Inter is a well-engineered typeface designed by Rasmus Andersson with rigorous attention to legibility at small sizes.
The signal is not the use of Inter. The signal is the use of Inter without any companion typography choice — no display face for headlines, no monospace for code or labels, no italic alternate, no font-feature-settings tuning for tabular numerals. The site uses Inter for everything, in two or three weights, with default tracking, and stops there.
A human designer, even one who chooses Inter, makes companion choices. A serif for body text. A geometric sans for the logotype. A grotesque for the H1. A monospace pair like JetBrains Mono or IBM Plex Mono for technical labels. The pairing communicates thought.
A generator returns Inter. Period.
Verification method: Open DevTools. Inspect any text element. In the Computed panel, look at font-family. If the resolved value is "Inter, sans-serif" or "Inter, system-ui, ..." and every text node (H1, body, button, caption) resolves to the same family with no variation, flag the sign.
A more specific check: look at the head of the document for tags loading Google Fonts or next/font/google calls. If the only font requested is Inter, flag.
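The "every text node resolves to the same family" check can be written down as a comparison over the font-family values copied from the Computed panel. This is a sketch under assumptions: the function names are hypothetical, and hashed family names that some build tools emit are not handled.

```typescript
// Return the first family in a font-family list, lowercased and unquoted.
function primaryFamily(fontFamily: string): string {
  const first = fontFamily.split(",")[0] || "";
  return first.trim().replace(/^["']|["']$/g, "").toLowerCase();
}

// True when every sampled element (H1, body, button, caption, ...) leads
// with Inter. An empty sample proves nothing, so it returns false.
function usesOnlyInter(computedFamilies: string[]): boolean {
  return (
    computedFamilies.length > 0 &&
    computedFamilies.every((f) => primaryFamily(f) === "inter")
  );
}
```

Sampling four or five elements of different roles is enough; a single monospace label or display face in the sample makes the function return false, which matches the intent of the sign.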
Detection gain: Roughly 70 percent of generator outputs use Inter exclusively. About 30 percent of human-designed sites also use Inter exclusively, since it is genuinely popular. The sign is informative but should not stand alone.
---
Sign 3 — Three-card grid with `rounded-2xl` and `shadow-md`
Severity: 5/5
This is the most diagnostic structural sign in the catalog. Almost every AI-generated landing page contains, somewhere between the hero and the footer, a section consisting of exactly three cards arranged horizontally on desktop and stacked vertically on mobile. Each card has identical structure: an icon at the top, a heading, two or three lines of body copy, and optionally a small link. The card itself uses rounded-2xl for the corner radius, shadow-md for the elevation, and bg-white or bg-gray-50 for the surface.
The reason is structural. Generators are trained on the canonical landing-page anatomy: hero, three benefits, social proof, three steps, testimonials, pricing, FAQ, footer. The three-benefits section is almost never four, almost never two, and almost never seven. It is three. Three is the median across the training corpus, and the median is what the model returns.
A human designer might use three cards. A human designer might also use a horizontal scroll, an asymmetric grid, a single hero feature with three sub-points, or a comparison table. The decision reflects content. The generator decision reflects the median.
The corner radius and shadow are equally diagnostic. Tailwind's rounded-2xl sets border-radius: 1rem, a value that is neither sharp enough to feel architectural nor soft enough to feel friendly. It is the safe middle. shadow-md is similarly the safe middle of Tailwind's shadow scale: visible but not heavy.
Verification method: Scroll to the section after the hero. Count the cards. If the count is exactly three, inspect any card. Look at the class attribute. Search for rounded-2xl and shadow-md (or shadow-lg, the close variant). If both are present and the grid is clearly three-by-one on desktop, flag.
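The class-attribute part of this check is mechanical and can be sketched as follows. The helper names are hypothetical, and the sketch is stricter than the text: it requires every card, not just one, to carry the classes.

```typescript
// True when a class attribute contains the exact utility class.
function hasClassToken(classAttr: string, token: string): boolean {
  return classAttr.split(/\s+/).indexOf(token) !== -1;
}

// Input: the class attribute of each card in the section, copied from DevTools.
function looksLikeCanonicalCardGrid(cardClassAttrs: string[]): boolean {
  if (cardClassAttrs.length !== 3) return false;
  return cardClassAttrs.every(
    (c) =>
      hasClassToken(c, "rounded-2xl") &&
      (hasClassToken(c, "shadow-md") || hasClassToken(c, "shadow-lg"))
  );
}
```

Matching whole tokens rather than substrings avoids false hits on responsive or state variants such as `md:rounded-2xl`, which would need their own handling.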
A more granular check: look at the icons inside each card. If they are Lucide icons (cross-reference Sign 20), the diagnostic strengthens.
Detection gain: This sign correctly classifies roughly 80 percent of generator outputs. The false positive rate against human-designed sites is around 10 percent, since the three-card pattern is genuinely common in templates and starter kits. Combined with Sign 1, it is near-determinative.
---
Sign 4 — Hero with H1, paragraph, primary CTA, ghost CTA
Severity: 4/5
The canonical hero structure produced by v0, Lovable, and Bolt is identical to the point of being a fingerprint:
- An `<h1>` with a two-line headline, often containing a colored span for emphasis on three or four words.
- A `<p>` immediately below the headline, two or three sentences long, describing the value proposition in generic startup language.
- A primary CTA button — filled, with the dominant brand color as background, white text, sometimes an arrow icon to the right.
- A secondary CTA button — ghost or outline style, with no background fill, the brand color or gray as border, the same color as text, sometimes a play icon for "Watch demo."
The two buttons sit side by side on desktop, stacked on mobile, with consistent gap and equal height. The primary CTA reads "Get started" or "Try it free." The secondary reads "Learn more," "Watch demo," or "Book a call."
A human designer has more degrees of freedom. The hero might be split into two columns, with media on one side. It might have a single CTA. It might have a form embedded directly. It might lead with a video. The generator has one shape.
Verification method: Look at the hero region. Count the elements directly under the H1. If the sequence is paragraph, then two buttons in a flex row, with the first solid-filled and the second ghost-styled, flag the sign.
A copy check (cross-reference Sign 13): if the buttons read "Get started" and "Learn more" — exactly those phrases — the diagnostic strengthens.
Detection gain: 75 percent of generator outputs follow this exact structure. About 25 percent of human-designed B2B sites do too, since the structure is genuinely effective. Sign 4 is high-value but contextual.
---
Sign 5 — Tag pill above H1 with sparkle or rocket emoji
Severity: 4/5
A small rounded badge sits directly above the H1. It contains a sparkle emoji (✨), a rocket (🚀), or occasionally a fire (🔥), followed by text like "New: AI-powered insights" or "Now in beta" or "Just launched." The badge has a soft background — bg-purple-100 or bg-blue-50 — a colored border or none, and a subtle pill shape from rounded-full and small horizontal padding.
The pattern originated in startup landing pages around 2021 and was systematized by Vercel, Stripe, and Linear marketing teams. Generators learned it as canonical. The result is that nearly every AI-generated SaaS landing page now opens with a sparkle emoji.
The sign is diagnostic for two reasons. First, the emoji choice is almost always one of the three above — sparkle, rocket, fire — because those are the most frequent in the training distribution. Second, the microcopy follows a tight pattern: "New:" or "Now:" or "Introducing:" followed by a feature name in title case.
A human designer might use a similar badge. A human designer would more often write a specific sentence — a customer name, a release version, a specific date — rather than the generic "New: AI-powered X."
Verification method: Look at the area immediately above the H1. If a small pill-shaped element exists with an emoji and short text, inspect it. Note the emoji glyph. If it is sparkle, rocket, or fire, and the text follows the pattern "Word: phrase," flag the sign.
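The "emoji plus Word: phrase" test is regular enough to express as a small function. This is a sketch, assuming the emoji is the first character of the pill text; the function name is hypothetical.

```typescript
// The three glyphs named in this sign: sparkle, rocket, fire.
const PILL_EMOJI = ["✨", "🚀", "🔥"];

// True when the pill text starts with one of the three emoji and continues
// with a single word, a colon, and a phrase, e.g. "✨ New: AI-powered insights".
function isGenericPill(text: string): boolean {
  const trimmed = text.trim();
  const emoji = PILL_EMOJI.filter((e) => trimmed.indexOf(e) === 0)[0];
  if (emoji === undefined) return false;
  // Drop the emoji and an optional variation selector, then surrounding spaces.
  const rest = trimmed.slice(emoji.length).replace(/^\uFE0F/, "").trim();
  return /^[A-Za-z]+:\s+\S/.test(rest);
}
```

A specific sentence such as a release version or customer name has no "Word:" prefix and therefore does not match, which mirrors the human-versus-generator distinction drawn above.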
Detection gain: 55 percent of generator outputs include this pill. The false positive rate is around 8 percent. The sign is moderately specific and worth checking early.
---
Sign 6 — Gradient violet to pink, `bg-gradient-to-br from-purple-500 to-pink-500`
Severity: 5/5
The most visually loud signature in the catalog. A generated site frequently uses a violet-to-pink linear gradient as the dominant accent: as the background of a hero card, as the fill of a CTA button, as the background of a badge, or as a decorative blob behind the headline. The exact Tailwind class is bg-gradient-to-br from-purple-500 to-pink-500, which produces a diagonal gradient toward the bottom-right corner (roughly 135 degrees) from #a855f7 to #ec4899.
The gradient is so closely associated with v0 and Vercel marketing that it has become a meme in frontend Twitter. Variants exist — from-violet-600 to-fuchsia-600, from-indigo-500 to-purple-500 — but the family is unmistakable. The colors are bright, saturated, and identical across thousands of generated sites.
A human designer working on a serious brand does not pick this gradient. The reasons are practical: it clashes with most logos, it looks identical to every competitor, and it carries a strong "AI demo" association that signals lack of effort. A human designer who picks a gradient picks one that fits the brand: muted, custom, or genre-specific.
Verification method: Look for the brightest gradient on the page. Inspect it. Read the background-image value in the Computed panel. If it contains linear-gradient with stops at or near #a855f7 (or #8b5cf6 for the violet-500 variant) and #ec4899, meaning hex values within ten units of those, flag the sign.
A class-attribute check is faster: search the DOM for gradient-to-br, from-purple-500, to-pink-500, from-violet, to-fuchsia, or to-pink. Any of these in the same element flags the sign.
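"Within ten units" needs a concrete definition before it can be applied. The sketch below assumes it means each of the red, green, and blue channels differs by at most 10 on the 0 to 255 scale; that reading, and the function names, are assumptions rather than anything the method specifies.

```typescript
interface Rgb { r: number; g: number; b: number; }

// Parse "#rrggbb" (the leading "#" is optional). Returns null on bad input.
function parseHex(hex: string): Rgb | null {
  const s = hex.trim().replace(/^#/, "");
  if (!/^[0-9a-f]{6}$/i.test(s)) return null;
  const n = parseInt(s, 16);
  return { r: (n >> 16) & 255, g: (n >> 8) & 255, b: n & 255 };
}

// ASSUMPTION: "within ten units" is read as a per-channel tolerance of 10.
function withinTenUnits(observed: string, target: string): boolean {
  const a = parseHex(observed);
  const b = parseHex(target);
  if (a === null || b === null) return false;
  return (
    Math.abs(a.r - b.r) <= 10 &&
    Math.abs(a.g - b.g) <= 10 &&
    Math.abs(a.b - b.b) <= 10
  );
}
```

A lightly retouched stop such as #e84a9f still matches the pink target #ec4899, while a genuinely different hue does not.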
Detection gain: 65 percent of generator outputs include this exact gradient. The false positive rate is below 5 percent — human designers genuinely avoid it now because of the AI-demo association. Sign 6 is one of the strongest single indicators in the catalog.
---
Sign 7 — Three social-proof logos with "Trusted by"
Severity: 2/5
A horizontal strip below the hero contains three to five client or partner logos in grayscale, with a small label above reading "Trusted by leading teams" or "Used by companies like" or simply "Trusted by." The logos are evenly spaced, vertically centered, and rendered at low opacity to avoid color clash. On mobile, they wrap into a two-row grid.
The pattern is genuinely effective and predates AI generators by a decade. Stripe popularized it. Every B2B template includes it. The generator learned it as part of the canonical anatomy and inserts it whether or not the actual product has any clients.
The diagnostic value is lower than other signs because the pattern is so widespread among legitimate human-designed sites. What raises the suspicion is the specific phrasing — "Trusted by leading teams" or "Used by industry leaders" — combined with placeholder logos that may not correspond to actual customer logos. On a generated site, the logos are often Vercel sample SVGs or stock placeholder marks that the buyer is expected to replace.
Verification method: Look for the "Trusted by" strip. Note the count of logos and the phrasing of the label. If the label is exactly "Trusted by" or "Trusted by leading teams" with no specific customer name, and the logos look generic or unfamiliar, flag.
Detection gain: 50 percent of generator outputs include this pattern with default phrasing. The false positive rate is high — around 35 percent — because the pattern is a legitimate convention. Sign 7 is supportive but not strong.
---
Sign 8 — How it works, three steps with Lucide icons
Severity: 3/5
A section titled "How it works" or "Get started in three steps" contains exactly three numbered steps. Each step has a Lucide icon (cross-reference Sign 20), a short heading, and one or two lines of body copy. The numbering is visible — a circled "1," "2," "3" or large numerals to the left of each step.
Three is canonical. Two feels insufficient. Four feels like work. Generators consistently produce three.
The Lucide icons are typically generic: a sparkle for "Sign up," a wrench or settings gear for "Configure," a rocket or check for "Launch." The choice of icons follows the most-likely token in the training distribution.
Verification method: Scroll to the section after the benefits cards. If a "How it works" block exists with exactly three numbered steps and each step contains an SVG icon, inspect one icon. If the SVG markup contains lucide- in the class attribute or matches Lucide's open-source markup pattern, flag the sign.
Detection gain: 60 percent of generator outputs include exactly three steps with Lucide icons. The false positive rate is around 25 percent.
---
Sign 9 — Three pricing plans, Starter / Pro / Enterprise, `rounded-2xl`, green checks
Severity: 5/5
Pricing is the strongest section-level diagnostic in the catalog. A generated SaaS site contains a pricing section with exactly three plans. The plan names are some permutation of: Starter, Basic, Free, Pro, Plus, Business, Enterprise, Custom. The middle plan is highlighted with a colored border, a "Most popular" badge, or a slight scale-up. Each plan card uses rounded-2xl and shadow-md. Below the price, a list of features uses green checkmark icons (typically Lucide's Check or CheckCircle) on the left of each line.
The structure is so canonical that it has become impossible to use it sincerely without looking generated. Even Stripe's official pricing page has moved away from it to differentiate. Yet generators continue to produce it because the training distribution heavily weights this exact layout.
The diagnostic strength comes from the simultaneous presence of multiple sub-signs: three plans, the Starter/Pro/Enterprise naming, the rounded card shape, the green checks, the highlighted middle plan, and the standard "Get started" CTA on each. When all six are present, the probability of human design from scratch is below 8 percent.
Verification method: Scroll to the pricing section. Count the plans. Note the names. Inspect a plan card. If rounded-2xl is in the class attribute, if the feature list uses green checkmarks, if the middle plan has a "Most popular" or "Recommended" badge, and if the plan names match the canonical set, flag the sign.
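The six sub-signs listed for this section lend themselves to a simple tally. The sketch below counts how many are present; the `PricingObservation` shape and the function name are hypothetical, and the canonical names are the ones listed in this sign.

```typescript
// Plan names listed in this sign as the canonical set.
const CANONICAL_PLAN_NAMES = [
  "starter", "basic", "free", "pro", "plus", "business", "enterprise", "custom",
];

// What the auditor notes while looking at the pricing section.
interface PricingObservation {
  planNames: string[];
  roundedCards: boolean;          // rounded-2xl on the plan cards
  greenChecks: boolean;           // green checkmark feature list
  highlightedMiddlePlan: boolean; // border, badge, or scale-up
  getStartedCta: boolean;         // standard "Get started" button per plan
}

// Count how many of the six sub-signs are present (0 to 6).
function pricingSubSigns(o: PricingObservation): number {
  const namesCanonical =
    o.planNames.length > 0 &&
    o.planNames.every(
      (n) => CANONICAL_PLAN_NAMES.indexOf(n.trim().toLowerCase()) !== -1
    );
  const checks = [
    o.planNames.length === 3,
    namesCanonical,
    o.roundedCards,
    o.greenChecks,
    o.highlightedMiddlePlan,
    o.getStartedCta,
  ];
  return checks.filter((c) => c).length;
}
```

A result of 6 corresponds to the "all six are present" case described above; lower counts are a matter of judgment rather than a fixed rule.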
Detection gain: 78 percent of generator outputs follow this exact pricing pattern. The false positive rate is below 12 percent. Sign 9 is the second-strongest single indicator after Sign 6.
---
Sign 10 — Testimonial slider with round photos
Severity: 3/5
A section above the footer contains testimonials. Each testimonial has a quote, a name, a job title, a company, and a circular profile photo. The section is either a horizontal slider with arrow controls or a static three-column grid. The photos are typically headshots — sometimes real, sometimes stock, sometimes AI-generated faces.
The diagnostic specifics: the quotes are short (one to three sentences), end with a period, and rarely contain specific numbers or product details. The names are anglophone (a Sarah, a David, an Emma, a James). The job titles are generic ("CEO," "Head of Product," "Founder"). The companies, when present, are often invented.
A human-designed testimonial section more often has long quotes, specific use cases with numbers, real photos, and verifiable companies with linkable case studies.
Verification method: Scroll to the testimonial section. Read three quotes. If they are short, generic, and free of specific numbers, and if the photos are circular and possibly stock or AI-generated, flag.
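A reverse image search on one photo can show whether it is a stock image or a face reused across unrelated sites; tools like FaceCheck or Google Lens are enough for this. A photo with no matches anywhere is consistent with AI generation but does not prove it.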
Detection gain: 45 percent of generator outputs include this exact testimonial pattern. The false positive rate is around 30 percent.
---
Sign 11 — Plain accordion FAQ with chevron icon
Severity: 2/5
The FAQ section uses a vanilla accordion: each question is a clickable row with a chevron icon on the right that rotates 90 or 180 degrees on expand. The container has rounded-lg borders, gray dividers between rows, and no visual hierarchy beyond bold question text. The number of questions is six to eight.
The component is the default shadcn Accordion, used with its default subcomponents and with Radix UI primitives underneath. The chevron is Lucide's ChevronDown.
The pattern is so common among template sites that the detection value is moderate. Many human-designed sites also use shadcn for this exact reason — the component is well-engineered and accessible. The signal is weakened accordingly.
Verification method: Scroll to the FAQ. Click a question. If the open behavior is a smooth rotate-and-expand with no other animation, and the markup uses Radix primitives (visible in DevTools as data-state="open" or data-state="closed" on the trigger, alongside attributes such as data-radix-collection-item), flag.
Detection gain: 55 percent of generator outputs use this exact FAQ pattern. The false positive rate is around 40 percent. Sign 11 is informative but weak.
---
Sign 12 — Footer with four columns, Product / Company / Resources / Legal
Severity: 4/5
The footer is a structural fingerprint. A generated site has a footer with four columns. The columns are titled, in this order or close to it: Product, Company, Resources, Legal. Each column contains three to six links. Below the columns, a horizontal row contains the logo, the copyright notice, and small social icons (X, GitHub, LinkedIn, Discord).
The diagnostic specifics: the columns are evenly spaced, the link text is short ("Pricing," "Blog," "About," "Privacy"), the logo is the same as the header logo, the copyright reads "© 2026 [Company]. All rights reserved." or similar.
A human designer might use a footer with two columns, a single horizontal row, a large brand statement, an embedded newsletter form, or no footer columns at all. The four-column layout is a generator default.
Verification method: Scroll to the footer. Count the columns. Note the column titles. If the count is four and the titles match the canonical set (any subset of Product, Company, Resources, Legal, Solutions, Support, Developers), flag the sign.
Detection gain: 70 percent of generator outputs use this exact footer structure. The false positive rate is around 20 percent.
---
Sign 13 — Microcopy "Get started," "Learn more," "Watch demo," "Try it free"
Severity: 4/5
The microcopy across all CTAs follows a tight pattern. The primary CTA reads "Get started" or "Try it free" or "Start now." The secondary CTA reads "Learn more" or "Watch demo" or "Book a call." Section headings use language like "Everything you need to," "Built for," "Designed for," "The all-in-one platform for." Body copy uses phrases like "powered by AI," "in just a few clicks," "without writing a single line of code."
The vocabulary is so consistent across generators that it has become a sign in its own right. Three or more of the canonical phrases on a single page is strong evidence of generated copy.
A human copywriter has more verbal range. A human writes specific, branded, sometimes weird CTAs: "Plant your flag," "See it for yourself," "Take the tour." Specificity is the antidote to generation.
Verification method: Read the page. Count the canonical phrases: "Get started," "Learn more," "Watch demo," "Try it free," "Powered by AI," "in just a few clicks." If three or more appear, flag.
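Counting the canonical phrases is a plain substring search over the visible page text. A minimal sketch, with hypothetical function names and the phrase list taken from the verification step:

```typescript
// Phrases listed in the verification step of this sign.
const CANONICAL_PHRASES = [
  "get started",
  "learn more",
  "watch demo",
  "try it free",
  "powered by ai",
  "in just a few clicks",
];

// Count how many distinct canonical phrases occur, ignoring case.
function countCanonicalPhrases(pageText: string): number {
  const text = pageText.toLowerCase();
  return CANONICAL_PHRASES.filter((p) => text.indexOf(p) !== -1).length;
}

// The sign is flagged at three or more distinct phrases.
function flagsMicrocopySign(pageText: string): boolean {
  return countCanonicalPhrases(pageText) >= 3;
}
```

Each phrase counts once no matter how often it repeats, so a page with five "Get started" buttons and nothing else still scores 1.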
Detection gain: 65 percent of generator outputs use three or more canonical phrases. The false positive rate is around 18 percent.
---
Sign 14 — Auto-generated OG image, often violet gradient with site name
Severity: 3/5
The Open Graph image is the preview that appears when the URL is shared on social platforms. A human-designed site has a custom OG image: a designed visual, a screenshot, a brand asset. A generator either omits the OG image or auto-generates one.
The auto-generated OG image has a recognizable look: a violet or blue gradient background, the site name in large white sans-serif text, sometimes a subtitle or tagline. The dimensions are 1200 by 630, the canonical Open Graph size. The image is typically served from /api/og or /_next/image?url=... if the site generates it with Vercel's @vercel/og library (exposed in Next.js as next/og).
Verification method: View the page source. Look for the `<meta property="og:image">` tag. Open the URL from its content attribute in a new tab. If the image is a violet gradient with the site name in white text and no other content, flag.
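Finding the Open Graph image in a long page source is quicker with a small extractor for the og:image meta tag. The sketch below scans raw HTML with regular expressions, which is fragile and only suitable for a quick look; the function names are hypothetical, and the /api/og path check reflects the route mentioned above, not a guarantee about any given site.

```typescript
// Pull the og:image URL out of raw page source.
// A regex scan is a rough sketch; a careful audit would use an HTML parser.
function extractOgImage(html: string): string | null {
  const metaTags: string[] = html.match(/<meta\b[^>]*>/gi) || [];
  const ogTags = metaTags.filter((t) =>
    /property\s*=\s*["']og:image["']/i.test(t)
  );
  const first = ogTags[0];
  if (first === undefined) return null;
  const content = /content\s*=\s*["']([^"']*)["']/i.exec(first);
  return content === null ? null : content[1] || null;
}

// True when the image URL points at an /api/og style route.
function servedFromOgRoute(imageUrl: string): boolean {
  return imageUrl.indexOf("/api/og") !== -1;
}
```

The route check only says the image is generated on the fly. Whether it is the generic gradient-and-name card still has to be judged by opening it.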
A faster check: paste the page URL into Twitter, LinkedIn, or Slack. The preview that renders is the OG image. If it is the generic gradient-and-name pattern, flag.
Detection gain: 50 percent of generator outputs use the auto-generated OG image. The false positive rate is around 22 percent.
---
Sign 15 — No custom 404 page, Next.js or Vercel default
Severity: 3/5
Navigate to a URL that does not exist. Append /asdf-does-not-exist to the domain and load it. A human-designed site has a custom 404 page: a designed message, a search bar, a sitemap, a way to recover. A generator output usually has no custom 404. The page that loads is either:
- The Next.js default 404: a bare page reading "404 | This page could not be found." in small text, white on black when the browser is in dark mode and black on white otherwise.
- The Vercel default 404: also minimal, with a Vercel logo.
- A bare 404 page with the site's footer but no content.
The absence of a custom 404 is not damning on its own — many small human-designed sites also skip it. But combined with other signs in the list, it indicates a build that did not bother with edge cases, which is characteristic of generator output.
Verification method: In the address bar, append /random-string-12345 to the homepage URL. Press Enter. If the page is the Next.js default ("This page could not be found.") or the Vercel default, flag.
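For auditing several sites in a row, the probe can be reduced to two small helpers: one builds the nonexistent URL, the other recognizes the Next.js default message quoted above. Both names are hypothetical. Fetching the page is left to the reader's tool of choice, so the second helper only takes the status code and body text as inputs.

```typescript
// Build a URL that should not exist, tolerating stray slashes.
function probeUrl(homepage: string, slug: string): string {
  return homepage.replace(/\/+$/, "") + "/" + slug.replace(/^\/+/, "");
}

// True when the response looks like the unmodified Next.js 404 page,
// identified by the message string quoted in this sign.
function isNextDefault404(status: number, bodyText: string): boolean {
  return (
    status === 404 &&
    bodyText.indexOf("This page could not be found.") !== -1
  );
}
```

A site that returns status 200 for a nonexistent path is a separate problem and is deliberately not flagged by this helper.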
Detection gain: 55 percent of generator outputs have no custom 404. The false positive rate is around 40 percent.
---
Sign 16 — No custom favicon, Vercel globe or emoji
Severity: 2/5
The favicon is the small icon in the browser tab. A human-designed site has a custom favicon: a logo mark, a brand glyph, sometimes an SVG that adapts to dark mode. A generator output has either:
- The Vercel default favicon — a small dark globe.
- A Next.js default — a black square.
- A single emoji rendered as a favicon (this is an actual default behavior of some Next.js starters).
- No favicon at all, leading to a generic globe in most browsers.
The diagnostic value is low because favicons are easy to add and many human-designed sites also skip them. But combined with other signs, the absence is informative.
Verification method: Look at the browser tab. If the icon is a globe, a black square, or a single emoji glyph, flag.
Detection gain: 45 percent of generator outputs have no custom favicon. The false positive rate is around 38 percent.
---
Sign 17 — Fade-in-up animations everywhere, nothing else
Severity: 3/5
A generated site uses one animation primitive: fade-in-up. Every section enters the viewport with a 30-pixel translate-up combined with an opacity transition from 0 to 1 over 300 to 600 milliseconds. The animation is applied to the hero, the cards, the testimonials, the pricing, the FAQ: every section, identically. The library is typically Framer Motion, with the animation triggered by the whileInView prop and usually limited to a single run by viewport={{ once: true }}.
A human designer who values motion uses different animations for different content. A staggered list. A horizontal slide. A parallax effect. A microinteraction on hover. The variation reflects intent.
A generator returns one animation because variation is not a default. The result is a page that scrolls smoothly but mechanically: every section announces itself with the same gesture.
Verification method: Scroll through the page slowly. Watch each section as it enters. If every section uses the identical fade-and-translate-up entrance, with no other animation type visible, flag.
A code check: search the DOM or the bundled JavaScript for animate-fade-in-up, motion.div, whileInView, or Framer Motion patterns. The presence of one motion primitive applied uniformly is the signal.
Detection gain: 60 percent of generator outputs use this exact pattern. The false positive rate is around 25 percent.
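The uniformity test can be expressed as a small function once each section's entrance animation has been noted down. This is a sketch under stated assumptions: the descriptor fields (`fromOpacity`, `fromY`, `durationMs`) are a hypothetical recording format, not something Framer Motion exposes in this shape, and the 300 to 600 millisecond window is the range given above.

```javascript
// Hypothetical input: one descriptor per page section, recorded by hand
// while scrolling or extracted from component props.
function isUniformFadeInUp(sections) {
  if (sections.length < 2) return false; // one section cannot show uniformity
  const signature = (s) => `${s.fromOpacity}|${s.fromY}|${s.durationMs}`;
  const distinct = new Set(sections.map(signature));
  const first = sections[0];
  const looksLikeFadeUp =
    first.fromOpacity === 0 &&
    first.fromY > 0 &&
    first.durationMs >= 300 &&
    first.durationMs <= 600;
  return distinct.size === 1 && looksLikeFadeUp;
}

const uniformPage = [
  { name: "hero", fromOpacity: 0, fromY: 30, durationMs: 500 },
  { name: "cards", fromOpacity: 0, fromY: 30, durationMs: 500 },
  { name: "pricing", fromOpacity: 0, fromY: 30, durationMs: 500 },
];
const variedPage = [
  { name: "hero", fromOpacity: 0, fromY: 30, durationMs: 500 },
  { name: "logos", fromOpacity: 1, fromY: 0, durationMs: 0 },
];

console.log(isUniformFadeInUp(uniformPage), isUniformFadeInUp(variedPage));
```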
---
Sign 18 — shadcn/ui component naming in the DOM
Severity: 4/5
The single most reliable technical fingerprint of an AI-generated site in 2026 is the presence of shadcn/ui component patterns in the DOM. Open DevTools, look at any button. If the class attribute reads something like inline-flex items-center justify-center whitespace-nowrap rounded-md text-sm font-medium ring-offset-background transition-colors focus-visible:outline-none focus-visible:ring-2 ..., that is the shadcn variant string, copied from the shadcn/ui CLI output and pasted directly.
Look at the data attributes. If elements have data-radix- prefixes, the site uses Radix UI primitives, which are the foundation shadcn is built on. Look at the class soup on dialogs, dropdowns, toasts. Any of these confirm shadcn.
shadcn is not exclusive to AI generators. Many human developers use it because it is genuinely well-engineered. The signal is not the use of shadcn. The signal is the use of shadcn with no customization: the default variants, the default colors, the default border radii, the default shadows. A human who adopts shadcn typically tunes it. A generator returns it raw.
Verification method: Open DevTools. Inspect a button. Look at the class attribute. If it contains the canonical shadcn variant string (the giant utility-class soup with inline-flex items-center justify-center at the start), flag. Also check for data-radix-* attributes on any component.
Detection gain: 75 percent of generator outputs contain unmodified shadcn components. The false positive rate is around 28 percent because shadcn is genuinely popular among human developers. The sign is strong but contextual.
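The class-string comparison can be approximated by measuring how many of the utility classes quoted above appear on a button. This is a rough sketch: the token list is copied from the string shown in this section, while the function names and the 0.8 threshold are assumptions chosen for illustration.

```javascript
// Utility classes taken from the variant string quoted in this section.
const SHADCN_BUTTON_TOKENS = [
  "inline-flex", "items-center", "justify-center", "whitespace-nowrap",
  "rounded-md", "text-sm", "font-medium", "ring-offset-background",
  "transition-colors", "focus-visible:outline-none", "focus-visible:ring-2",
];

// Fraction of the reference tokens present in a class attribute (0 to 1).
function shadcnTokenOverlap(className) {
  const present = new Set(className.trim().split(/\s+/));
  const hits = SHADCN_BUTTON_TOKENS.filter((t) => present.has(t)).length;
  return hits / SHADCN_BUTTON_TOKENS.length;
}

// The threshold is an assumed cut-off, not a calibrated value.
function looksLikeRawShadcnButton(className, threshold = 0.8) {
  return shadcnTokenOverlap(className) >= threshold;
}

const rawButton = SHADCN_BUTTON_TOKENS.join(" ") + " bg-primary h-10 px-4 py-2";
console.log(looksLikeRawShadcnButton(rawButton));
console.log(looksLikeRawShadcnButton("btn btn-primary rounded-full"));
```

In a browser console the input could come from `[...document.querySelectorAll('button')].map(b => b.className)`. A high overlap only shows that the default variant string survived; whether the colors and radii behind it were tuned still needs a visual check.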
---
Sign 19 — `text-gray-500` and Tailwind defaults instead of named brand variables
Severity: 3/5
A human-built design system uses semantic, named CSS variables: --color-text-secondary, --color-surface-elevated, --color-accent-primary. The Tailwind config maps these to brand-specific values. Components reference the semantic name. The advantage is that the brand can rebrand later by changing one variable.
A generated site uses raw Tailwind defaults: text-gray-500, bg-white, border-gray-200, text-blue-600. There is no semantic layer. The classes refer directly to the Tailwind palette. If a rebrand is needed later, every file must be touched.
Verification method: Inspect any text element. Look at the class attribute. If the text color class is a raw Tailwind class like text-gray-500, text-gray-600, text-zinc-700, text-slate-500 — and not a semantic class like text-secondary, text-muted, text-content — flag.
A more thorough check: open the site's CSS file (typically in DevTools' Sources panel). Search for --color- or --brand- or --surface-. If the file contains no semantic custom properties beyond what shadcn ships with, flag.
Detection gain: 65 percent of generator outputs use raw Tailwind classes throughout. The false positive rate is around 32 percent. Sign 19 is moderately strong.
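A class attribute can be split into raw palette classes and everything else with a single pattern. The sketch below is illustrative only: the color names in the pattern are an assumed subset of the Tailwind palette, the function name is hypothetical, and shade-less classes such as bg-white are deliberately not matched.

```javascript
// Matches text-, bg- and border- utilities that point straight at a
// palette shade, for example text-gray-500. The color list is a subset.
const RAW_COLOR_CLASS =
  /^(?:text|bg|border)-(?:gray|zinc|slate|neutral|stone|blue|indigo|purple|violet|pink)-\d{2,3}$/;

// Returns the raw palette classes found in one class attribute.
function rawPaletteClasses(className) {
  return className
    .trim()
    .split(/\s+/)
    .map((token) => token.split(":").pop()) // drop variants such as hover: or md:
    .filter((token) => RAW_COLOR_CLASS.test(token));
}

const sample = "text-gray-500 hover:bg-blue-600 text-muted border-gray-200 bg-white";
console.log(rawPaletteClasses(sample));
```

Semantic names such as text-muted pass through untouched, so an element whose result is empty is using either a semantic layer or no color utilities at all.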
---
Sign 20 — Lucide icons in nav and CTAs
Severity: 2/5
Lucide is the default icon library of the modern frontend stack. It ships with shadcn/ui, it integrates trivially with Next.js, and its open-source license makes it the path of least resistance for any developer or generator producing an icon-heavy site.
The diagnostic specifics: every icon on the page is from Lucide. The arrow next to a CTA is Lucide's ArrowRight. The hamburger menu is Lucide's Menu. The check in the pricing list is Lucide's Check or CheckCircle. The chevron in the FAQ is Lucide's ChevronDown. There is no custom icon set, no Phosphor, no Heroicons mixed in, no SVG illustrations specific to the brand.
Lucide use is not a strong signal in isolation. Many human-designed sites also use Lucide. The signal is the absence of any other icon source.
Verification method: Inspect a few icons across the page. Look at the SVG markup. If every icon has the same authoring style (typically a 24x24 viewBox, stroke-based, class="lucide lucide-arrow-right"), flag.
Detection gain: 70 percent of generator outputs use Lucide exclusively. The false positive rate is around 50 percent because Lucide is genuinely popular. Sign 20 is weak in isolation.
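Because the signal is exclusivity rather than presence, a quick tally of icon classes is enough. This sketch assumes the class attributes of the page's SVG elements have already been collected into an array; the function name and the return shape are hypothetical.

```javascript
// Summarizes how many collected SVG class attributes carry the "lucide" class.
function iconSourceSummary(svgClassNames) {
  const isLucide = (cls) => cls.split(/\s+/).includes("lucide");
  const lucideCount = svgClassNames.filter(isLucide).length;
  return {
    total: svgClassNames.length,
    lucide: lucideCount,
    exclusivelyLucide:
      svgClassNames.length > 0 && lucideCount === svgClassNames.length,
  };
}

const onlyLucide = ["lucide lucide-arrow-right", "lucide lucide-menu", "lucide lucide-check"];
const mixedIcons = [...onlyLucide, "brand-illustration"];

console.log(iconSourceSummary(onlyLucide));
console.log(iconSourceSummary(mixedIcons));
```

In a browser console the array could be gathered with `[...document.querySelectorAll('svg')].map(s => s.getAttribute('class') ?? '')`.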
---
Sign 21 — Identical sitemap.xml `lastmod` dates across all pages
Severity: 3/5
A site with real activity has a sitemap where different pages have different lastmod dates: the blog index updated yesterday, the pricing page updated last month, the about page updated six months ago. The variance reflects actual editing.
A generated site has a sitemap where every page shares the same lastmod, namely the build date of the deployment. Every entry reads 2026-04-15T14:23:11.000Z or similar, identical to the second across the homepage, the pricing, the blog, the contact, the legal pages. This is the typical result when Next.js's app/sitemap.ts stamps every entry at build time (for example with lastModified: new Date()) rather than reading per-page metadata.
The signal is not the use of build-time sitemaps. The signal is the absence of any human curation: no manual lastmod overrides, no per-page edit timestamps, no exclusion of static legal pages from the rebuild.
Verification method: Append /sitemap.xml to the domain. Open it. Look at the lastmod values. If every page shares the same timestamp, flag.
A stronger check: compare the sitemap timestamp to the site's most recent blog post date or commit date. If the sitemap reads "today" but the most recent blog post is from 2025, the sitemap is auto-generated and the site has no real editorial activity.
Detection gain: 55 percent of generator outputs have identical timestamps. The false positive rate is around 30 percent.
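Counting distinct lastmod values takes only a few lines once the sitemap text is in hand. The following is a minimal sketch: it uses a regular expression rather than a real XML parser, the function name is hypothetical, and the sample sitemap is invented for illustration.

```javascript
// Extracts every <lastmod> value and reports how many distinct ones exist.
function lastmodSummary(sitemapXml) {
  const values = [...sitemapXml.matchAll(/<lastmod>\s*([^<]+?)\s*<\/lastmod>/g)]
    .map((match) => match[1]);
  const distinct = new Set(values);
  return {
    entries: values.length,
    distinct: distinct.size,
    allIdentical: values.length > 1 && distinct.size === 1,
  };
}

// Invented sample: three pages stamped with the same build time.
const buildStamped = `
<urlset>
  <url><loc>https://example.com/</loc><lastmod>2026-04-15T14:23:11.000Z</lastmod></url>
  <url><loc>https://example.com/pricing</loc><lastmod>2026-04-15T14:23:11.000Z</lastmod></url>
  <url><loc>https://example.com/legal</loc><lastmod>2026-04-15T14:23:11.000Z</lastmod></url>
</urlset>`;

console.log(lastmodSummary(buildStamped));
```

A sitemap with a single entry is not flagged, since one timestamp cannot show variance either way.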
---
Advanced technical indicators (DevTools required)
Beyond the twenty-one perceptual signs, a deeper inspection reveals technical fingerprints that are nearly impossible to fake without rebuilding the site from scratch. These checks require opening DevTools, but each takes under thirty seconds and produces a near-deterministic verdict.
Bundle composition
// In DevTools console:
performance.getEntriesByType('resource')
  .filter(r => r.name.includes('_next/static') || r.name.includes('.js'))
  .map(r => r.name);
A generated Next.js site loads a recognizable bundle pattern:
- next/font/google chunks for Inter (sometimes split into Inter_Fallback.css and Inter.css).
- Tailwind's compiled CSS in a single file at _next/static/css/.
- Radix UI primitives loaded as separate chunks: @radix-ui_react-dialog.js, @radix-ui_react-accordion.js, @radix-ui_react-dropdown-menu.js.
- A cn() utility function from shadcn, typically present as clsx plus tailwind-merge in the bundle.
The combination — Next.js, Tailwind, Inter via next/font, Radix primitives, shadcn cn() — is the modal stack of generators.
React tree introspection
If the site is built with v0 specifically, inspecting the rendered DOM sometimes reveals data-v0-* attributes or comments that trace the component hierarchy back to v0's internal naming. These are not always present, but when found, they are unambiguous.
// In DevTools console:
document.querySelectorAll('[data-v0-component], [data-v0-id]').length;
A return value greater than zero confirms v0 specifically.
HTTP headers signature
The deployment platform leaves headers that are diagnostic of the hosting choice, which in turn is correlated with the generator used.
| Header | Signature | Most common generator origin |
|--------|-----------|------------------------------|
| x-vercel-deployment-id | Present | v0, any Vercel-deployed site |
| x-vercel-id | Present | Vercel deployment |
| x-vercel-cache | HIT/MISS/STALE | Vercel ISR active |
| server: Vercel | Present | Vercel-hosted |
| x-nf-request-id | Present | Netlify deployment |
| cf-ray | Cloudflare worker ID | Cloudflare Pages |
| x-replit-id | Present | Replit deployment |
| x-bolt-deployment | Present | Bolt.new direct deploy |
A site that returns x-vercel-deployment-id and also uses the canonical shadcn-and-Inter stack is statistically very likely to have come from v0.
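The header table can be turned into a lookup for scripted audits. This is a sketch, not a verified fingerprint database: the header names are taken as given from the table above, while the function name, the input shape (a plain object of response headers), and the simplified platform labels are assumptions.

```javascript
// Header name to platform, following the table above.
const HEADER_PLATFORMS = [
  ["x-vercel-deployment-id", "Vercel"],
  ["x-vercel-id", "Vercel"],
  ["x-vercel-cache", "Vercel"],
  ["x-nf-request-id", "Netlify"],
  ["cf-ray", "Cloudflare"],
  ["x-replit-id", "Replit"],
  ["x-bolt-deployment", "Bolt.new"],
];

// Returns the platforms suggested by a set of response headers.
function hostingHints(headers) {
  const lower = Object.fromEntries(
    Object.entries(headers).map(([k, v]) => [k.toLowerCase(), String(v)])
  );
  const hints = new Set();
  for (const [name, platform] of HEADER_PLATFORMS) {
    if (name in lower) hints.add(platform);
  }
  if ((lower["server"] || "").toLowerCase() === "vercel") hints.add("Vercel");
  return [...hints];
}

console.log(hostingHints({ "X-Vercel-Id": "abc123", Server: "Vercel" }));
console.log(hostingHints({ "cf-ray": "8a1b2c3d4e5f-CDG" }));
```

Header names are compared case-insensitively because HTTP does not guarantee their casing. A hosting hint narrows the candidates only; plenty of hand-built sites deploy to the same platforms.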
# Command-line check:
curl -I https://example.com 2>&1 | grep -i "x-vercel\|x-nf\|cf-ray"
Sitemap timestamps
curl https://example.com/sitemap.xml | grep "<lastmod>"
If every line returned has the same date and time, the sitemap is auto-generated at build time with no per-page tuning, which strongly correlates with generator output.
Canonical link patterns
<link rel="canonical" href="https://example.com/" />
A generated site often has the canonical URL in the head, matching the page URL exactly with no parameters. When canonical tags are present on every page and the site has no actual content variation, the SEO setup is the Next.js default.
---
The AI score calculator
The full audit produces a number between 0 and 75. Each sign is scored: present (severity points), absent (zero). The total maps to an interpretation band.
| # | Sign | Severity | Present? (Y=Sev, N=0) | Score |
|---|------|----------|----------------------|-------|
| 1 | Tailwind blue-500 plus purple-500 palette | 4 | □ | __ |
| 2 | Default Inter font, no fallback | 3 | □ | __ |
| 3 | Three-card grid with rounded-2xl + shadow-md | 5 | □ | __ |
| 4 | Hero: H1 + p + 2 CTAs (filled + ghost) | 4 | □ | __ |
| 5 | Pill above H1 with sparkle/rocket emoji | 4 | □ | __ |
| 6 | Gradient violet-to-pink (purple-500 to pink-500) | 5 | □ | __ |
| 7 | Three social-proof logos with "Trusted by" | 2 | □ | __ |
| 8 | "How it works" three steps with Lucide icons | 3 | □ | __ |
| 9 | Three pricing plans (Starter/Pro/Enterprise), green checks | 5 | □ | __ |
| 10 | Testimonial slider with round photos | 3 | □ | __ |
| 11 | Plain accordion FAQ with chevron | 2 | □ | __ |
| 12 | Footer four columns (Product/Company/Resources/Legal) | 4 | □ | __ |
| 13 | Microcopy "Get started"/"Learn more"/"Watch demo" | 4 | □ | __ |
| 14 | Auto-generated OG image (gradient + site name) | 3 | □ | __ |
| 15 | No custom 404 page (Next.js or Vercel default) | 3 | □ | __ |
| 16 | No custom favicon (globe or emoji) | 2 | □ | __ |
| 17 | fade-in-up animations everywhere | 3 | □ | __ |
| 18 | shadcn/ui component naming in DOM | 4 | □ | __ |
| 19 | text-gray-500 / Tailwind defaults instead of named vars | 3 | □ | __ |
| 20 | Lucide icons in nav and CTAs | 2 | □ | __ |
| 21 | Identical sitemap.xml lastmod dates | 3 | □ | __ |
| | TOTAL (out of 75) | | | __ |
Interpretation bands
| Score range | Verdict | Description |
|-------------|---------|-------------|
| 0–15 | Human-designed | The site reflects deliberate choices. AI may have assisted in implementation, but the design vocabulary is human. |
| 16–30 | AI-assisted | Significant AI involvement with human direction. Acceptable for most contexts. The buyer should understand what they paid for. |
| 31–50 | AI-dominant | AI generated the foundation; minor human edits exist. Acceptable for internal tools, MVPs, and side projects. Misleading for premium freelance or agency deliverables. |
| 51–75 | Pure slop | Direct generator output with negligible human modification. The site has no differentiation from thousands of identical outputs. Buyer was misled if they paid for custom work. |
The bands are calibrated against an audit of one hundred random freelance portfolio sites and one hundred random AI-generated demos run in early 2026. The discrimination boundary at 30 separates human-led from AI-led with roughly 92 percent accuracy. The boundary at 51 separates AI-dominant from pure slop with roughly 87 percent accuracy.
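The scoring and banding rules reduce to a lookup and a sum. The sketch below encodes the severities from the calculator table and the band boundaries from the interpretation table; the function names `aiScore` and `verdict` are hypothetical.

```javascript
// Severity per sign number, copied from the calculator table.
const SEVERITY = {
  1: 4, 2: 3, 3: 5, 4: 4, 5: 4, 6: 5, 7: 2, 8: 3, 9: 5, 10: 3, 11: 2,
  12: 4, 13: 4, 14: 3, 15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 2, 21: 3,
};

// Sums the severity of every sign marked present (duplicates are ignored).
function aiScore(presentSigns) {
  return [...new Set(presentSigns)].reduce((sum, n) => {
    if (!(n in SEVERITY)) throw new RangeError(`unknown sign ${n}`);
    return sum + SEVERITY[n];
  }, 0);
}

// Maps a score to its interpretation band.
function verdict(score) {
  if (score <= 15) return "Human-designed";
  if (score <= 30) return "AI-assisted";
  if (score <= 50) return "AI-dominant";
  return "Pure slop";
}

const presentSigns = [1, 3, 4, 6, 9, 12, 13, 18, 19, 20];
const score = aiScore(presentSigns);
console.log(score, verdict(score));
```

With the severities as listed, all twenty-one signs together sum to 71, so that is the highest total this function can return.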
---
Five concrete case studies
The following five anonymized audits illustrate the method in practice. Each site was evaluated independently by two reviewers; scores reported are the average of both. Names have been changed; structural details are accurate.
Case A — "FlowSync" — B2B SaaS for project management
The site presented as a polished startup landing page. Background: a freelance designer charged $14,000 for what was described as "custom design system and bespoke implementation."
Audit results:
| # | Sign | Present | Points |
|---|------|---------|--------|
| 1 | Blue-500 + purple-500 | Yes | 4 |
| 3 | Three-card rounded-2xl grid | Yes | 5 |
| 4 | Canonical hero structure | Yes | 4 |
| 6 | Violet-to-pink gradient | Yes | 5 |
| 9 | Three pricing plans canonical | Yes | 5 |
| 12 | Four-column footer | Yes | 4 |
| 13 | Canonical microcopy | Yes | 4 |
| 18 | shadcn/ui in DOM | Yes | 4 |
| 19 | text-gray-500 throughout | Yes | 3 |
| 20 | Lucide icons exclusively | Yes | 2 |
| Other signs | (11 signs at zero) | No | 0 |
Total: 40/75 — AI-dominant.
The freelancer had used v0 for the initial generation, made cosmetic changes (a logo, two photos), and shipped the result. The buyer discovered this only when a developer auditing for an unrelated reason pointed out the shadcn class soup. The contract was renegotiated.
Case B — "Northcurrent" — Boutique financial advisory
A small firm contracted a one-person agency for a brand site. Quoted price was $8,500 for "complete custom design including motion graphics."
Audit results:
| # | Sign | Present | Points |
|---|------|---------|--------|
| 2 | Inter exclusively | Yes | 3 |
| 17 | One animation primitive | Yes | 3 |
| 18 | shadcn/ui visible | Yes | 4 |
| 20 | Lucide icons | Yes | 2 |
| Other signs | (17 signs at zero) | No | 0 |
Total: 12/75 — Human-designed.
The site used Inter and shadcn, but the palette was custom (deep navy and warm cream, not Tailwind defaults), the typography included a bespoke serif for headlines, the animation was a non-default parallax effect, and the layout broke the canonical anatomy with a vertical hero. The freelancer had used AI to scaffold but produced genuine design work on top.
Case C — "Helix Health" — Healthtech MVP
A two-person startup built their landing page in one weekend using Lovable. Their pitch deck explicitly named the tool. The site went up at no cost beyond Lovable's subscription.
Audit results:
| # | Sign | Present | Points |
|---|------|---------|--------|
| 1 | Blue-500 + purple-500 | Yes | 4 |
| 2 | Inter only | Yes | 3 |
| 3 | Three-card grid | Yes | 5 |
| 4 | Canonical hero | Yes | 4 |
| 5 | Sparkle pill | Yes | 4 |
| 6 | Violet-to-pink gradient | Yes | 5 |
| 7 | Trusted by logos | Yes | 2 |
| 8 | Three steps Lucide | Yes | 3 |
| 9 | Three pricing plans | Yes | 5 |
| 12 | Four-column footer | Yes | 4 |
| 13 | Canonical microcopy | Yes | 4 |
| 14 | Auto-OG image | Yes | 3 |
| 15 | Default 404 | Yes | 3 |
| 16 | No favicon | Yes | 2 |
| 17 | fade-in-up everywhere | Yes | 3 |
| 18 | shadcn raw | Yes | 4 |
| 19 | Tailwind defaults | Yes | 3 |
| 20 | Lucide everywhere | Yes | 2 |
| 21 | Identical sitemap dates | Yes | 3 |
Total: 66/75 — Pure slop.
This is the canonical pure-slop score. The startup did not misrepresent the work; they were transparent about Lovable. The score documents what unmodified Lovable output looks like in the calculator. The same score on a freelance deliverable would be a serious problem.
Case D — "Atrium Studio" — Boutique design agency portfolio
The agency claimed eight years of practice and showcased twelve client projects. Pricing for new engagements started at $25,000.
Audit results:
| # | Sign | Present | Points |
|---|------|---------|--------|
| 18 | shadcn visible (but heavily customized) | Partial | 0 |
| Other signs | (20 signs at zero) | No | 0 |
Total: 0/75 — Human-designed.
The portfolio site used Adobe Fonts for a custom typographic pairing, a hand-built CSS grid (no Tailwind), bespoke SVG illustrations, asymmetric layout, and case studies with detailed metrics. The presence of shadcn in one component (a contact dialog) did not register because the customization was deep — colors, radii, motion, and copy all overridden. The agency's claim of bespoke work was supported by the audit.
Case E — "Quantum Brew" — Coffee roaster e-commerce
A coffee roaster contracted a $4,200 site from a freelance Shopify expert. The expectation was a templated build with brand customization.
Audit results:
| # | Sign | Present | Points |
|---|------|---------|--------|
| 4 | Hero structure canonical | Yes | 4 |
| 7 | "As featured in" logos | Yes | 2 |
| 11 | Plain FAQ | Yes | 2 |
| 12 | Four-column footer | Yes | 4 |
| 13 | Some canonical microcopy | Yes | 4 |
| 16 | Generic favicon | Yes | 2 |
| 20 | Some Lucide icons | Yes | 2 |
| Other signs | (14 signs at zero) | No | 0 |
Total: 20/75 — AI-assisted.
The site was templated (a Shopify theme), not generated, but the score reflects that templates and generators share many surface features. The freelancer was transparent about the template; the buyer paid for customization on top of it. The score is consistent with the deliverable.
---
Generator-specific signatures
Each major AI generator leaves a distinct fingerprint. The following table identifies signatures unique to each tool, which can refine the diagnostic from "AI-generated" to "AI-generated by tool X."
| Generator | Unique signature | Verification |
|-----------|------------------|--------------|
| v0 (Vercel) | data-v0-* attributes occasionally; deployment to Vercel with x-vercel-deployment-id; canonical shadcn + Tailwind + Inter stack; gradient from-purple-500 to-pink-500 very common | DevTools + headers |
| Lovable | Footer often retains "Built with Lovable" link unless explicitly removed; Supabase client visible in network tab; deployment patterns to Lovable's hosting | View source + network |
| Bolt.new | StackBlitz container artifacts; bundle includes @stackblitz/sdk; sometimes leaves a bolt-new-config.json in public; Astro framework slightly more common than Next.js | Network tab + source |
| Replit (Agent) | .replit file occasionally exposed; repl.co or replit.app in canonical URLs unless custom domain; Vite + React more often than Next.js | URL + view source |
| Claude Artifacts | Single-file React, often inlined in HTML; no router, no API routes; Tailwind via CDN script tag in head; less likely to use shadcn (Claude prefers raw Tailwind) | View source |
| ChatGPT canvas | Similar to Claude Artifacts but more often vanilla JavaScript; Tailwind via CDN; less framework structure; more likely to be a single static page | View source |
The framework choice is itself diagnostic. v0 produces almost exclusively Next.js. Lovable produces React with Vite. Bolt favors Astro. Replit favors Vite. Claude Artifacts and ChatGPT canvas produce single-file output, React in the first case and more often vanilla JavaScript in the second. Identifying the framework narrows the candidate generator before any other check.
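Framework identification can be sketched as a scan of the page HTML for a few path markers. The three markers used here (`_next/`, `@vite/`, `astro-`) are assumptions about what each framework typically leaves in its markup, and the function name is hypothetical. The `@vite/` marker mainly shows up on development servers, so a production Vite build may not match it.

```javascript
// Returns the frameworks whose markers appear in the page HTML.
function frameworkHints(html) {
  const hints = [];
  if (html.includes("_next/")) hints.push("Next.js");
  if (html.includes("@vite/")) hints.push("Vite");
  if (html.includes("astro-")) hints.push("Astro");
  return hints;
}

const nextPage = '<script src="/_next/static/chunks/main.js"></script>';
const astroPage = '<astro-island uid="Z1x"></astro-island>';

console.log(frameworkHints(nextPage));
console.log(frameworkHints(astroPage));
```

An empty result means only that none of the three markers was found; single-file outputs with Tailwind loaded from a CDN fall into that bucket.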
# Detect framework from response headers and HTML:
curl -sI https://example.com | grep -i "x-powered-by\|server"
curl -s https://example.com | grep -E '(_next/|@vite/|astro-)'
---
Distribution charts
AI score distribution across 100 audited freelance sites (Q1 2026)
Sample: 100 portfolio and client sites delivered by self-described "freelance designers" charging $5,000 to $30,000 per project. Audited blind by two independent reviewers using the 21-sign method. Bin width: 5 points.
Score Count Distribution
───── ───── ───────────────────────────────────────
0– 5 11 ███████████
6–10 13 █████████████
11–15 9 █████████
16–20 14 ██████████████
21–25 11 ███████████
26–30 8 ████████
31–35 6 ██████
36–40 5 █████
41–45 4 ████
46–50 3 ███
51–55 5 █████
56–60 4 ████
61–65 4 ████
66–70 2 ██
71–75 1 █
Median: 22 (AI-assisted band)
Mean: 26.4
Pure slop (51+): 16% of sample
Human-designed (0–15): 33% of sample
The distribution is bimodal with peaks around 8 (genuine human work, often experienced practitioners) and around 18 (AI-assisted with human direction). The long tail above 50 represents work passed off as custom that is in fact direct generator output.
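The summary figures under the chart follow directly from the bin counts. The sketch below recomputes them; the counts are copied from the histogram, while the bin representation and the function names are assumptions made for illustration.

```javascript
// Bin counts copied from the histogram, in order from 0–5 up to 71–75.
const COUNTS = [11, 13, 9, 14, 11, 8, 6, 5, 4, 3, 5, 4, 4, 2, 1];
const BINS = COUNTS.map((count, i) => ({
  from: i === 0 ? 0 : 5 * i + 1,
  to: 5 * (i + 1),
  count,
}));

// Percentage of the sample whose bin lies entirely inside [min, max].
function percentInRange(bins, min, max) {
  const total = bins.reduce((sum, b) => sum + b.count, 0);
  const inRange = bins
    .filter((b) => b.from >= min && b.to <= max)
    .reduce((sum, b) => sum + b.count, 0);
  return Math.round((inRange / total) * 100);
}

// First bin at which the cumulative count reaches half the sample.
function medianBin(bins) {
  const total = bins.reduce((sum, b) => sum + b.count, 0);
  let cumulative = 0;
  for (const bin of bins) {
    cumulative += bin.count;
    if (cumulative >= total / 2) return bin;
  }
  return null;
}

console.log(percentInRange(BINS, 51, 75), percentInRange(BINS, 0, 15), medianBin(BINS));
```

The median bin comes out as 21–25, which is consistent with the reported median of 22.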
Detectability evolution 2023 to 2026
How easy is it to detect AI-generated frontends? The chart plots the average detection accuracy of trained reviewers over time.
Year Avg accuracy Trend
───── ──────────── ──────────────────────────────
2023 62% ███████████████████████
2024 Q1 71% ██████████████████████████
2024 Q3 78% █████████████████████████████
2025 Q1 84% ███████████████████████████████
2025 Q3 88% ████████████████████████████████
2026 Q1 91% █████████████████████████████████
Trend: Detection has become easier each year as generator
defaults converge. Counter-trend: Sophisticated users
increasingly customize generator output, raising
the floor of what slop looks like.
The convergence of generator defaults (the same shadcn, the same Inter, the same blue-purple gradient across v0, Lovable, and Bolt) has made detection simpler. The countertrend is that sophisticated developers increasingly add a customization layer, raising the difficulty of distinguishing AI-assisted from human-led.
---
"Should the reader worry about the score?" — flowchart
flowchart TD
A[Audit complete: score X/75] --> B{X ≤ 15?}
B -->|Yes| C[Human-designed.<br/>No concern.]
B -->|No| D{X ≤ 30?}
D -->|Yes| E{Was custom design<br/>explicitly promised?}
E -->|No| F[AI-assisted is fine.<br/>No concern.]
E -->|Yes| G[Ask freelancer about process.<br/>Likely acceptable but verify.]
D -->|No| H{X ≤ 50?}
H -->|Yes| I{Internal tool, MVP,<br/>or side project?}
I -->|Yes| J[AI-dominant is fine<br/>for this context.]
I -->|No| K[Material concern.<br/>Renegotiate or document.]
H -->|No| L{Was the work<br/>premium-priced?}
L -->|No| M[Pure slop but<br/>buyer aware. OK.]
L -->|Yes| N[Pure slop sold as custom.<br/>Refund or rework.]
The flowchart resolves the question of whether a score should produce action. The score alone is insufficient; the context (what was promised, what was paid, what the use case is) determines whether action is warranted.
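The same decision tree can be written as a function for use in an audit script. This is a direct transcription of the flowchart branches; the function name and the three boolean parameter names are hypothetical.

```javascript
// Transcribes the flowchart: the score picks the band, the context picks the outcome.
function recommendedAction({
  score,
  customPromised = false, // was custom design explicitly promised?
  lowStakes = false,      // internal tool, MVP, or side project?
  premiumPriced = false,  // was the work premium-priced?
}) {
  if (score <= 15) return "Human-designed. No concern.";
  if (score <= 30) {
    return customPromised
      ? "Ask freelancer about process. Likely acceptable but verify."
      : "AI-assisted is fine. No concern.";
  }
  if (score <= 50) {
    return lowStakes
      ? "AI-dominant is fine for this context."
      : "Material concern. Renegotiate or document.";
  }
  return premiumPriced
    ? "Pure slop sold as custom. Refund or rework."
    : "Pure slop but buyer aware. OK.";
}

console.log(recommendedAction({ score: 40 }));
console.log(recommendedAction({ score: 66, premiumPriced: false }));
```

Only the flag that matters for the score's band is read, which mirrors how the flowchart asks a single contextual question per branch.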
---
When generator output is acceptable
The method should not be wielded as a weapon against legitimate uses of AI tools. There are contexts in which generator output is the right choice and a high score is irrelevant.
Internal tools. A two-person startup needs an internal dashboard for their support team to triage tickets. Lovable produces a working interface in two hours. The interface is generic, the palette is the default, the score is 60. None of this matters. The tool is internal, the audience is three people, the cost saved is two weeks of frontend work. AI-dominant is the right choice.
MVPs and prototypes. A founder has an idea and needs a landing page to capture interest before committing to a full build. v0 generates the page. The page is generic, the score is 55. None of this matters. The page exists to validate demand. If it succeeds, the next version will be custom. AI-dominant is the right choice.
Side projects and personal sites. A developer builds a portfolio site for themselves on a weekend using Bolt. The site is generic, the score is 48. None of this matters. The audience is potential employers who care about the developer's actual work, not the design vocabulary of the portfolio frame. AI-dominant is the right choice.
Hackathon submissions. A team has 48 hours to build a project with a frontend. They use Claude Artifacts to scaffold. The frontend scores 70. The judges are evaluating the idea and the implementation, not the visual differentiation. AI-dominant is the right choice.
The pattern is consistent: when the audience is small, the cost of bespoke design exceeds the marginal value, and the buyer is the same person as the producer, generator output is correct.
When it is not acceptable
The method becomes load-bearing in three contexts.
Public freelance work claiming custom design. A freelance designer charges $15,000 for a "complete custom design system" and delivers a v0 output with a logo swapped. The buyer is a person or company who paid for differentiation and received a generic deliverable shared with thousands of competitors. The score is 60. The misrepresentation is material.
Premium agency deliverables. A client contracts an agency for $40,000 over three months for a "bespoke brand site." The agency uses Lovable internally, ships in three weeks, and bills the full retainer. The score is 55. The agency has captured value that does not match the work. The contract is breached.
Corporate sites at scale. A Fortune 500 company commissions a $200,000 marketing site rebuild. The vendor delivers a site that scores 50. The company ends up with the same visual frame as every other corporation that hired the same vendor. The differentiation premium it paid for has not been delivered.
In all three contexts, the audit is not about whether AI was used. It is about whether the buyer knew. If the contract specified custom design and the deliverable scores in the AI-dominant or pure-slop band, the contract is not being honored.
---
How to confront a freelancer or agency
The audit produces a number. The number does not produce an action. What follows is a practical guide for the conversation that comes after a high score.
Tone
The conversation is not accusatory. The reader does not know, with certainty, that AI was used. The audit produces a probability. The conversation is about evidence and process.
The wrong opening: "You used AI to generate this. I want a refund." This is brittle because the freelancer can plausibly deny, the buyer cannot prove, and the relationship deteriorates.
The right opening: "I would like to understand the process behind this deliverable in more detail. Can you walk me through how the design system was developed?" This is non-confrontational and produces information. A freelancer who genuinely built the site will be able to describe iteration, reference choices, and alternatives considered. A freelancer who shipped a generator output will not.
Specific questions
The following questions produce diagnostic information:
- "Can you share the moodboard and inspiration references that informed the palette?"
- "What alternative typography pairings did you consider before landing on Inter?"
- "How was the spacing system developed? Is it on a baseline grid or a modular scale?"
- "Can you show the design files in Figma or whatever tool was used?"
- "Were there any A/B variants of the hero section explored?"
- "How was the pricing card layout chosen over alternatives like a comparison table?"
- "Can you walk me through the source repository and explain the components you built?"
A genuine designer answers these questions in detail and with specific reasoning. A generator user struggles with the third question or later. The asymmetry is informative.
Contractual clauses
For future engagements, the following contractual additions reduce the risk of slop being delivered as custom work:
AI disclosure clause: "Vendor agrees to disclose any AI-generated outputs used in the deliverable, including but not limited to outputs from v0, Lovable, Bolt, Cursor, Replit, Claude, GPT-4, and similar tools. AI-generated outputs may be used as part of the development process but must be substantially modified before delivery."
Originality clause: "Vendor warrants that the design system, including but not limited to color palette, typography, layout grid, and component styles, is original work developed for the project and not direct output from any third-party tool or template."
Process documentation clause: "Vendor agrees to deliver, alongside the final code, a process document that includes design exploration files (in Figma, Sketch, or equivalent), a list of design decisions and their rationale, and a record of iteration on at least three major sections of the site."
Audit right clause: "Buyer reserves the right to commission an independent audit of the deliverable to verify the level of customization. If the audit reveals that the deliverable is substantially direct generator output, Vendor agrees to either rework the deliverable to meet originality standards or refund a proportional share of the engagement fee."
These clauses are not adversarial. They formalize what most buyers assume is implicit. Their presence in a contract changes incentives on the freelancer's side: there is no upside to passing off a generator output, since the audit clause exposes it.
---
Complementary tools
The 21-sign method is one input. A complete diagnostic combines it with adjacent tools, each addressing a different surface.
Sailop — A complete frontend audit toolkit installable via npm. Sailop scans a deployed URL or a local repository and returns a detailed report with score, signal-by-signal breakdown, suggested remediation, and uniqueness checks against a corpus of known generator outputs. Run via npx sailop scan https://example.com. The output is JSON, suitable for piping into a CI/CD pipeline or a Slack notification. Where this manual method takes thirty seconds, Sailop takes about ten and produces a more granular signal.
GPTZero — Statistical text classifier. Useful for analyzing the copy on a page (blog posts, marketing pages) but blind to visual generation. A site can score zero on GPTZero and 70 on the visual audit. The two tools answer different questions.
Wappalyzer — Browser extension that identifies the technology stack: framework, hosting, analytics, fonts, libraries. Wappalyzer does not detect AI generation directly, but it identifies the canonical stack (Next.js + Tailwind + Vercel + shadcn) that strongly correlates with generator output. A useful pre-filter.
Lighthouse — Google's performance and SEO auditor. Useful for technical health checks but not for slop detection. A generated site can score 100 on Lighthouse (because Vercel and Next.js are well-optimized by default) and still be pure slop. Performance does not equal differentiation.
Manual DevTools inspection — The deepest signal. A trained eye in DevTools for sixty seconds extracts more diagnostic information than any automated tool. The 21-sign method is a structured guide for that inspection.
The recommended workflow for a serious audit:
- Run the 21-sign manual scan first (30 seconds).
- If the score is in the gray zone (16–50), run Sailop for a deeper signal (~10 seconds).
- Use Wappalyzer to confirm the stack (~5 seconds).
- If text-on-page slop is suspected, run GPTZero on the longest blocks of copy.
- Lighthouse only as a sanity check on technical health, not as a slop signal.
---
Method limits
The 21-sign method produces probabilities, not certainties. The reader must understand the failure modes.
False positives — Human work that looks generated
A skilled developer who happens to use the canonical stack (Next.js, Tailwind, Inter, shadcn, Lucide) and who works from a popular template can produce a site that scores in the AI-assisted or even AI-dominant band. The defaults of the modern frontend stack are aligned with generator outputs because both draw from the same corpus.
A specific failure mode: a junior developer who builds genuinely from scratch but uses a starter kit. The starter kit may include Inter, shadcn, and the canonical hero structure. The developer's work scores high not because AI was used but because the starting point was a generator-aligned template.
The mitigation is to weight the contextual signs. A high score combined with a high-priced custom freelance contract is concerning. A high score on a hobby project of a junior developer is not.
False negatives — Generator output that looks custom
A sophisticated user can take a v0 output and customize it deeply: swap the palette, add a custom font, restructure the layout, write specific copy. The result scores low on the audit because the surface signals have been replaced. The underlying scaffolding is still AI-generated, but the audit cannot detect it.
The mitigation is that, at the level of customization required to defeat the audit, the user has effectively done the work that the buyer paid for. The line between AI-assisted and human-led is genuinely fuzzy at this point. The audit captures cases where the customization layer is missing, which is the majority of cases.
Drift over time
Generator defaults change. v0 in October 2025 had different defaults than v0 in April 2026. Specific signs in this catalog (the violet-to-pink gradient, the sparkle pill) may evolve as generators are tuned to be less obviously canonical. The method must be updated quarterly to track current defaults.
The framework of the method — palette, typography, structural anatomy, microcopy, technical signature — is durable. The specific instantiation of each sign is not. The reader should treat the catalog as a snapshot of the 2026 state and consult updated versions as the field evolves.
Probability not certainty
A score of 65 does not prove AI generation. It states that the deliverable shares most of the 21 visual and structural features with the modal output of current generators. That is strong evidence in a Bayesian sense but not courtroom-grade proof. The buyer and vendor in a dispute should treat the score as one input alongside other evidence (commit history, design files, vendor testimony) rather than as a verdict.
For internal use — a buyer deciding whether to contest an invoice, a recruiter screening agency portfolios, a founder evaluating a hire — the score is sufficient. For legal disputes or public accusations, the score should be supported by additional evidence.
---
A 15-minute audit protocol for recruiters and agency clients
For recruiters evaluating designer portfolios, agency clients reviewing pitch decks, and procurement teams vetting vendors, the following structured protocol produces a defensible audit in fifteen minutes.
Minutes 0 to 2 — Visual scan
Open the candidate's primary portfolio site. Run the 5-question fast scan. Note the count of yes answers.
If 4–5 yes, the candidate's own portfolio is in the AI-dominant or pure-slop band. This is informative regardless of what the candidate does for clients: their personal site reflects their default taste.
Minutes 2 to 5 — Full 21-sign audit
Apply the full method to the portfolio site. Compute the score. Document each sign on a notepad.
If the portfolio scores below 16, the candidate has demonstrated capacity for human-led design. Proceed.
If the portfolio scores above 30, the candidate's default is generator output. This is not disqualifying — many successful designers use AI tools heavily — but it must be understood explicitly.
Minutes 5 to 10 — Sample case audit
Pick the case study most relevant to the role. Open the linked client site. Apply the 21-sign audit to that client site.
Compare the case-study score to the portfolio score:
- Case-study score significantly lower than portfolio: the candidate produces more custom work for clients than for themselves. Common pattern; positive signal.
- Case-study score similar to portfolio: the candidate ships generator-aligned work consistently. Acceptable for some roles, not others.
- Case-study score significantly higher than portfolio: rare. Indicates the candidate may have inflated the portfolio's customization for marketing.
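The three-way comparison above can be sketched as a small function. The article does not define "significantly", so the 10-point margin below is an assumption, and the function name and return labels are illustrative only.

```javascript
// Hypothetical helper: classify the gap between a case-study score and a portfolio score.
// The default margin of 10 points is an assumed reading of "significantly".
function compareScores(portfolioScore, caseStudyScore, margin = 10) {
  const diff = caseStudyScore - portfolioScore;
  if (diff <= -margin) return 'case study lower: more custom work for clients';
  if (diff >= margin) return 'case study higher: portfolio may overstate customization';
  return 'similar: consistent generator-aligned output';
}

console.log(compareScores(35, 18)); // case study lower: more custom work for clients
console.log(compareScores(22, 25)); // similar: consistent generator-aligned output
```

A reviewer who prefers a stricter or looser reading can pass a different margin as the third argument.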
Minutes 10 to 13 — Process verification
In the candidate's case study, look for:
- Design file links (Figma, Sketch, etc.) that can be opened.
- Process screenshots showing iteration.
- Specific design decisions explained in writing.
- Specific results metrics (conversion, performance, engagement).
The presence of all four indicates a designer who does the work. The absence of all four, combined with a high score, indicates a designer who passes off generator output.
Minutes 13 to 15 — Documentation
Record the audit on a single page:
Candidate: [Name]
Portfolio URL: [URL]
Portfolio score: __/75
Portfolio band: [Human / AI-assisted / AI-dominant / Pure slop]
Case study reviewed: [Case URL]
Case study score: __/75
Case study band: [Human / AI-assisted / AI-dominant / Pure slop]
Process evidence:
[ ] Design files linked
[ ] Iteration shown
[ ] Decisions explained
[ ] Results documented
Recommendation: [Proceed / Discuss process / Decline]
Notes: [Free text]
The protocol produces a defensible record. If the candidate is hired and the work later disappoints, the audit serves as a baseline. If the candidate is rejected, the rejection is documented with specific evidence rather than vibes.
The same protocol adapts to agency vetting, with the case study replaced by a recent agency client deliverable.
---
FAQ
Could a freelancer have used AI for half the work and human design for half?
Yes, and this is by far the most common pattern in 2026. The audit handles this case naturally: a half-AI half-human deliverable typically scores 20–35, which sits in the AI-assisted band (16–30) or just above it. This is acceptable for most contracts and contexts. The audit only raises a concern when the score is high (51+) or when the contract specifically promised custom work.
A useful framing: AI assistance is a tool, like Figma or VS Code. The buyer does not object to the use of Figma, even though Figma "did" some of the work. The buyer objects when the deliverable is generic, regardless of which tools produced it. The audit measures genericness, which is the actual concern.
If the site is pretty but generic, is that bad?
It depends on what was promised. A pretty generic site delivered for $2,000 against a brief that said "make it look professional" is fine. A pretty generic site delivered for $30,000 against a brief that said "differentiate us from competitors with a bespoke visual identity" is not.
The audit measures genericness directly. The contract or brief determines whether genericness is acceptable. The two pieces combine to produce a verdict.
How can the reader be 100 percent sure AI was used?
The reader cannot, with the audit alone. The audit produces probability. For certainty:
- Ask the freelancer directly. Most will admit AI use when asked, especially under contractual disclosure obligations.
- Examine the source code commit history. If the first commit contains thousands of lines of code from a v0_initial_export or similar source, the origin is clear.
- Check the deployment artifacts. Some generators leave identifying artifacts in the build output that are difficult to remove without deliberate effort.
- Compare against known generator outputs. Some sites are nearly identical to public v0 demos; the comparison is the proof.
Certainty is rarely necessary. The audit is sufficient for most decision contexts.
Is using shadcn/ui a sign of AI?
Not in itself. shadcn is a popular, well-engineered component library used by many human developers. The signal is the use of shadcn without customization — default radii, default colors, default class names visible in the DOM. Customized shadcn is a different question.
A useful test: open the source code (if available). If the repository contains a components/ui/ directory with files copied directly from shadcn's CLI output and never touched, the use is uncustomized. If the files have been edited to override colors, structure, or behavior, the use is customized. The latter is human-led work; the former is generator-aligned.
Does this method work for sites built before AI generators existed?
Yes, with a caveat. Sites built before 2023, when generators became viable, will mostly score low because they predate the convergence on canonical defaults. A 2019 startup site that uses Inter and a three-card grid scores high on those individual signs but low on the technical signs (no Next.js, no shadcn, no Vercel headers). The combined score is typically in the 0–20 band.
The audit is most reliable for sites built between 2023 and now. For older sites, the verdict is "almost certainly human" by default.
How accurate is the audit on landing pages versus full apps?
The method was developed primarily on landing pages. It generalizes well to marketing sites, brochure sites, and product landing pages. It generalizes poorly to:
- Full SaaS dashboards: the design vocabulary is different (data tables, settings, complex navigation). The 21 signs do not directly apply.
- E-commerce stores: Shopify, WooCommerce, and similar platforms have their own conventions. The audit confuses platform defaults with AI generation.
- Editorial sites: blogs and content-heavy sites have a different anatomy. The audit is less reliable.
- Web apps with authentication: the public landing page is auditable; the authenticated experience is not, because the audit cannot access it.
For these contexts, an adapted version of the method is needed. The principle — measure convergence on generator defaults — remains the same; the specific signs change.
What if the site is genuinely beautiful but checks several boxes?
A site can be beautiful and generic at the same time. Genericness is about differentiation, not aesthetics. A polished, well-executed three-card grid with a violet gradient hero is beautiful, in the same way a polished default IKEA bookshelf is beautiful. It also looks like ten thousand other bookshelves.
The audit measures the second property — the resemblance to defaults — not the first. A high score and a beautiful site both being true is a coherent observation. The relevant question is whether the buyer paid for differentiation or for execution.
Can the audit be gamed?
Yes. A vendor who knows the audit can deliberately remove the highest-severity signs while keeping the underlying generator scaffolding. Replace blue-500 with #3a82f6 (one digit off, visually identical). Replace Inter with Manrope. Restructure the hero with two paragraphs instead of one. The score drops; the site is still substantially generator output.
This is the central limit of the method: it measures surface alignment with current defaults. A vendor who decouples the surface from the scaffolding can pass the audit while shipping AI-dominant work.
The mitigation is that, at the level of effort required to defeat the audit, the vendor has done a meaningful portion of the design work. The buyer is no worse off.
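One partial counter to the one-digit-off trick is to compare colors by distance rather than exact match. The following is a minimal sketch, assuming the color string comes from getComputedStyle in rgb() form; the helper names and the tolerance of 8 are illustrative choices, not part of the 21-sign method.

```javascript
// Parse "rgb(r, g, b)" or "rgba(r, g, b, a)" strings as returned by getComputedStyle.
function parseRgb(value) {
  const m = /rgba?\(\s*(\d+)[,\s]+(\d+)[,\s]+(\d+)/.exec(value);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// Euclidean distance in RGB space: crude, but adequate for near-identical colors.
function rgbDistance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1], a[2] - b[2]);
}

const BLUE_500 = [59, 130, 246]; // Tailwind blue-500, #3b82f6
const TOLERANCE = 8;             // assumed threshold, tune to taste

function isNearBlue500(cssColor) {
  const rgb = parseRgb(cssColor);
  return rgb !== null && rgbDistance(rgb, BLUE_500) <= TOLERANCE;
}

console.log(isNearBlue500('rgb(58, 130, 246)')); // true  (#3a82f6, one step from blue-500)
console.log(isNearBlue500('rgb(15, 118, 110)')); // false (a teal, far from blue-500)
```

This only narrows the gap. A vendor who moves the palette far enough to escape the tolerance has, in practice, chosen a different color.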
Do high scores correlate with bad SEO performance?
Loosely, yes. A site that scores high on the audit has features that correlate with weak SEO outcomes:
- Auto-generated OG images underperform custom ones in click-through.
- Identical sitemap timestamps signal lack of editorial activity to crawlers.
- Generic copy underperforms specific copy in long-tail search.
- Fade-in-up animations on every section can delay Largest Contentful Paint and, when they animate layout properties rather than transform and opacity, add Cumulative Layout Shift.
The correlation is not causation. Some high-scoring sites rank well because they target uncompetitive terms or because the brand has off-page authority. Some low-scoring sites rank poorly because the content is thin or the technical SEO is broken. The audit is not an SEO predictor; it overlaps with SEO concerns coincidentally.
How does Sailop relate to this manual method?
Sailop is the automated version of this audit. The npm package scans a URL or local repository and produces a score along with a per-sign breakdown, suggested fixes, and a uniqueness check against a corpus of known generator outputs. The manual method is the conceptual basis; Sailop operationalizes it.
The two are complementary. The manual method takes thirty seconds for a trained eye and is sufficient for most decisions. Sailop takes about ten seconds, runs in CI, and produces a deeper signal that includes checks the manual method cannot reasonably perform (full bundle analysis, corpus comparison, hash matching against known outputs).
For one-off audits, the manual method suffices. For repeated audits — a recruiter screening many portfolios, an agency reviewing many pitches, a founder evaluating multiple vendors — Sailop reduces the time per audit by an order of magnitude.
What is the difference between AI-assisted and AI-dominant?
AI-assisted means a human directed the design and AI helped implement it. The human chose the palette, the typography, the layout, the copy. AI generated boilerplate code, scaffolded components, and accelerated iteration.
AI-dominant means AI produced the design and a human modified it lightly. The AI chose the palette (defaults), the typography (Inter), the layout (canonical anatomy), the copy (canonical microcopy). The human swapped a logo and edited two headlines.
The score band reflects the distinction. 16–30 is AI-assisted. 31–50 is AI-dominant. The verbal labels matter because they map to different contractual expectations.
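The band boundaries can be written down as a small lookup. This is a sketch using the thresholds stated in this article (below 16, 16–30, 31–50, 51 and above); the function name is illustrative.

```javascript
// Map a raw 21-sign score (0-75) to the band labels used in this article.
function bandForScore(score) {
  if (!Number.isFinite(score) || score < 0 || score > 75) {
    throw new RangeError('score must be a number between 0 and 75');
  }
  if (score <= 15) return 'Human';
  if (score <= 30) return 'AI-assisted';
  if (score <= 50) return 'AI-dominant';
  return 'Pure slop';
}

console.log(bandForScore(28)); // AI-assisted
console.log(bandForScore(63)); // Pure slop
```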
Is it possible for a generator to produce truly custom work?
Not with current technology. Generators are statistical systems that return high-likelihood outputs for given prompts. The high-likelihood output is, by definition, the modal output. The modal output is the generic one. Asking a generator for "custom" produces the average idea of custom, which is itself generic.
A human can use a generator as a starting point and then customize aggressively. The result is custom, but the generator did not produce it; the human did. The generator was a fast scaffolding tool. This is AI-assisted work, not AI-generated custom work.
The proposition that generators produce custom work is, as of April 2026, false. This may change as generators incorporate per-project memory and stylistic conditioning, but the current generation does not.
How often should the method be updated?
Quarterly. Generator defaults change with model updates and platform releases. Specific signs (the exact gradient, the exact emoji, the exact pricing layout) drift on a three-month timescale. The method's framework is stable; the catalog of signs is not.
The reader should consult the most recent version of this catalog when running an audit. A sign that was diagnostic in Q2 2025 may be obsolete in Q1 2026, replaced by a different sign that captures the new default.
What if the score is borderline (around 30)?
Borderline scores require the contextual flowchart. A score of 30 on a $40,000 corporate site is concerning. A score of 30 on a $2,000 landing page is fine. A score of 30 on a freelance designer's own portfolio is informative but not damning.
For borderline scores, the most useful next step is the process conversation: ask the vendor to explain how the design was developed. The conversation typically resolves the borderline case in five to ten minutes.
Is this method anti-AI?
No. The method is anti-misrepresentation. AI tools are productivity multipliers and the field has benefited from their existence. The method exists because the gap between what some vendors deliver and what buyers expect has widened with generator availability. The audit is a corrective for asymmetric information, not a critique of AI.
In contexts where AI use is disclosed and accepted (Helix Health in Case C), the audit serves a documentation function rather than a corrective one. The high score is a fact, not a complaint.
---
Glossary
AI-assisted: A site where a human directed the design and used AI tools to accelerate implementation. Typically scores 16–30 on the 21-sign audit. Acceptable in most contexts.
AI-dominant: A site where AI produced the foundational design and a human made minor modifications. Typically scores 31–50. Acceptable for internal tools, MVPs, and side projects; questionable for premium custom work.
Pure slop: A site that is direct generator output with negligible human modification. Typically scores 51+. Acceptable when disclosed; problematic when sold as custom work.
Bolt.new: An AI tool from StackBlitz that generates full-stack applications. Tends to favor Astro and Vite. Leaves StackBlitz container artifacts in the bundle.
Claude Artifacts: Anthropic's interactive output mode in claude.ai, often used to produce single-file React applications served inline.
ChatGPT canvas: OpenAI's interactive output surface in the ChatGPT product. Produces similar single-file outputs to Claude Artifacts.
Generator: A tool that converts a natural-language prompt into a working website or component. Includes v0, Lovable, Bolt, Replit Agent, Claude Artifacts, ChatGPT canvas.
Inter: A neo-grotesque sans-serif typeface designed by Rasmus Andersson. The default font of Tailwind documentation, shadcn examples, and most generator outputs.
Lovable: An AI tool that generates full-stack applications with a Supabase backend by default. Prefers React with Vite.
Lucide: An open-source icon library, the default icon set of shadcn/ui and the most common icon source in generator outputs.
OG image: The Open Graph image, typically rendered when a URL is shared on social platforms. The 1200x630 dimension is canonical.
Radix UI: A library of low-level, accessible component primitives. The foundation that shadcn/ui builds on.
Replit Agent: Replit's AI coding assistant, capable of generating and deploying full applications. Sites typically deploy to replit.app subdomains unless a custom domain is configured.
shadcn/ui: A copy-paste component library built on Tailwind and Radix UI. Distributed via a CLI that copies source files into the consuming project, allowing local modification.
Slop: A pejorative term for generic, low-effort, AI-generated content shipped as if it were custom work. Originated in image generation and has spread to other domains.
Tailwind CSS: A utility-first CSS framework. Its default palette includes blue-500 and purple-500 as the most common primary colors in generator outputs.
v0 (v0.dev): Vercel's AI-powered UI generator. Produces Next.js components with Tailwind, shadcn/ui, and the canonical violet-pink gradient by default.
Vercel: A hosting platform tightly integrated with Next.js. Sites deployed to Vercel leave headers like x-vercel-deployment-id and x-vercel-id.
---
Sources cited by name only
The following organizations and tools are referenced in this article. No specific URLs or studies are cited. Readers wishing to verify claims should consult these sources directly:
- Vercel (hosting platform, deployment infrastructure, OG image library)
- Netlify (hosting platform, alternative deployment)
- Tailwind CSS (CSS framework, default palette and class names)
- Inter (typeface, designed by Rasmus Andersson)
- shadcn/ui (component library, the canonical AI-generation companion)
- Radix UI (primitive library, foundation of shadcn)
- Lucide (icon library, default in shadcn outputs)
- Anthropic (Claude Artifacts, DeepCheck for code detection)
- OpenAI (ChatGPT, canvas output mode)
- Google Search Central (SEO documentation, sitemap conventions)
- Cloudflare (alternative hosting, headers)
- Framer Motion (animation library, common in generator outputs)
---
Internal references
For deeper context on the patterns documented above, the following companion articles extend the analysis:
- /blog/ai-slop-2026-state-of-the-ai-generated-web — A broader survey of the AI-generated web in 2026, with quantitative measurements of slop prevalence across industries.
- /blog/anti-slop-prompt-template-2026 — A practical prompt template for users of v0, Lovable, and Claude that produces non-slop outputs by explicitly avoiding the canonical defaults documented here.
- /blog/de-ai-your-lovable-v0-bolt-site — A step-by-step remediation guide for sites that score in the AI-dominant or pure-slop band, with specific refactors for each of the 21 signs.
- /blog/tailwind-blue-purple-gradient-ai-signature-2026 — A focused study of Sign 6 (the violet-to-pink gradient), with historical analysis of how the pattern became the dominant AI fingerprint.
---
Hosting platform header signatures
The HTTP response headers of a deployed site reveal the hosting platform, which in turn correlates with the most likely generator origin. The following table catalogs the canonical headers per host, with a notation of which generators most commonly deploy to each.
| Hosting platform | Distinctive headers | Default for generator |
|------------------|---------------------|------------------------|
| Vercel | x-vercel-deployment-id, x-vercel-id, x-vercel-cache, server: Vercel, x-vercel-execution-region, x-matched-path | v0 (default), Bolt (occasional), Lovable (manual) |
| Netlify | x-nf-request-id, server: Netlify, x-powered-by (sometimes), etag (Netlify format) | Bolt (occasional), human deployments |
| Cloudflare Pages | cf-ray, cf-cache-status, server: cloudflare, cf-request-id, nel, report-to | Replit deploys, Bolt occasional |
| Replit | x-replit-id, x-replit-cluster, hosted on replit.app subdomain unless custom domain | Replit Agent (default) |
| Lovable | Custom internal headers, often Supabase x-supabase-* headers in API calls | Lovable (default) |
| GitHub Pages | server: GitHub.com, x-github-request-id, x-served-by | Mostly human deployments |
| AWS Amplify | x-amz-cf-id, x-amz-cf-pop, x-cache: Hit from cloudfront | Mostly human deployments |
| Render | x-render-origin-server, rndr-id | Mixed |
| Fly.io | fly-request-id, server: Fly/... | Mostly human, custom backends |
A pattern: when a buyer encounters x-vercel-deployment-id in headers, combined with shadcn class soup in the DOM and Inter as the only font, the joint probability of v0 origin is high enough to treat as near-certain.
# Standard header inspection workflow:
curl -sI https://example.com | head -30
# Filter for hosting signatures:
curl -sI https://example.com 2>&1 | grep -iE "vercel|netlify|cloudflare|replit|render|fly"
# Check for cache state (indicates ISR):
curl -sI https://example.com | grep -i "x-vercel-cache\|cf-cache-status"
The header signature is not the verdict. It narrows the candidate pool. A site on Vercel can be hand-built. A site on AWS can be generator output that was manually deployed. The header is one of the twenty-plus inputs to the diagnostic.
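The same header check can be run from the DevTools Console instead of curl. The sketch below uses a subset of the header names from the table above; the function name and data shape are hypothetical, and a match on cf-ray indicates Cloudflare in general rather than Cloudflare Pages specifically.

```javascript
// Hypothetical helper: guess hosting platforms from a list of response header names.
// Signatures are a subset of the table above.
const HOST_SIGNATURES = [
  { host: 'Vercel', headers: ['x-vercel-id', 'x-vercel-deployment-id', 'x-vercel-cache'] },
  { host: 'Netlify', headers: ['x-nf-request-id'] },
  { host: 'Cloudflare', headers: ['cf-ray', 'cf-cache-status'] },
  { host: 'Replit', headers: ['x-replit-id', 'x-replit-cluster'] },
  { host: 'GitHub Pages', headers: ['x-github-request-id'] },
  { host: 'AWS (CloudFront)', headers: ['x-amz-cf-id', 'x-amz-cf-pop'] },
  { host: 'Render', headers: ['x-render-origin-server', 'rndr-id'] },
  { host: 'Fly.io', headers: ['fly-request-id'] },
];

function guessHosts(headerNames) {
  // Header names are case-insensitive, so normalize before matching.
  const present = new Set(headerNames.map(name => name.toLowerCase()));
  return HOST_SIGNATURES
    .filter(sig => sig.headers.some(h => present.has(h)))
    .map(sig => sig.host);
}

// In the DevTools Console, on the audited page:
// fetch(location.href, { method: 'HEAD' })
//   .then(r => console.log(guessHosts([...r.headers.keys()])));

// Matches both Vercel and Cloudflare signatures:
console.log(guessHosts(['X-Vercel-Id', 'server', 'cf-ray']));
```

More than one platform can match, for example when a Vercel deployment sits behind a Cloudflare proxy, which is why the function returns a list rather than a single name.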
---
DevTools snippet library
The following snippets, runnable in the browser DevTools Console (F12, then Console tab), automate parts of the audit. Each snippet returns a result that maps to a specific sign.
Snippet 1 — Detect blue-purple palette
// Extract all unique computed colors on the page
const colors = new Set();
document.querySelectorAll('*').forEach(el => {
  const style = getComputedStyle(el);
  colors.add(style.color);
  colors.add(style.backgroundColor);
  colors.add(style.borderColor);
});
const targets = ['rgb(59, 130, 246)', 'rgb(139, 92, 246)', 'rgb(236, 72, 153)'];
const found = targets.filter(t => colors.has(t));
console.log('AI-signature colors detected:', found);
// Expected output if AI-generated: blue-500 and purple-500 both present
Snippet 2 — Verify Inter font dominance
// Count font-family usage across the document
const fontCounts = {};
document.querySelectorAll('*').forEach(el => {
  const ff = getComputedStyle(el).fontFamily;
  fontCounts[ff] = (fontCounts[ff] || 0) + 1;
});
const sorted = Object.entries(fontCounts).sort((a, b) => b[1] - a[1]);
console.table(sorted.slice(0, 5));
// AI-generated sites: top entry is "Inter, sans-serif" or similar with >90% of nodes
Snippet 3 — Detect three-card grid
// Search for any container with exactly three direct children using rounded-2xl or rounded-xl
const candidates = [...document.querySelectorAll('div, section')]
  .filter(el => el.children.length === 3)
  .filter(el => {
    // Read the class attribute directly: className is not a plain string on SVG children
    const childClasses = [...el.children].map(c => c.getAttribute('class') || '');
    return childClasses.every(c => c.includes('rounded-2xl') || c.includes('rounded-xl'));
  });
console.log('Three-card rounded grid candidates:', candidates.length);
candidates.forEach(c => console.log(c));
Snippet 4 — Detect canonical hero structure
// Look for an h1 immediately followed by p, then exactly two button-like elements
const h1 = document.querySelector('h1');
if (h1) {
  let next = h1.nextElementSibling;
  const sequence = [];
  while (next && sequence.length < 5) {
    sequence.push(next.tagName);
    next = next.nextElementSibling;
  }
  console.log('Post-H1 sequence:', sequence);
  // Canonical AI hero: ['P', 'DIV'] where DIV contains 2 buttons, or ['P', 'BUTTON', 'BUTTON']
}
Snippet 5 — Detect emoji pill above H1
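// Named heroHeading rather than h1 so this can be pasted after Snippet 4
// in the same Console session without a redeclaration error
const heroHeading = document.querySelector('h1');
if (heroHeading) {
  const prev = heroHeading.previousElementSibling;
  if (prev) {
    const text = prev.innerText || '';
    // Sparkles, rocket, fire
    const emojiPattern = /[✨\u{1F680}\u{1F525}]/u;
    if (emojiPattern.test(text)) {
      console.log('Emoji pill detected:', text);
    }
  }
}
Snippet 6 — Detect violet-to-pink gradient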
// Find any element with a gradient containing canonical AI stops
const gradients = [];
document.querySelectorAll('*').forEach(el => {
  const bg = getComputedStyle(el).backgroundImage;
  if (bg.includes('gradient') && (bg.includes('139, 92, 246') || bg.includes('236, 72, 153'))) {
    gradients.push({ el, bg });
  }
});
console.log('Violet-to-pink gradients found:', gradients.length);
Snippet 7 — Detect shadcn class signature
// shadcn buttons have a distinctive long class soup
const shadcnSignature = /inline-flex items-center justify-center.*ring-offset-background/;
const matches = [...document.querySelectorAll('button, a')].filter(el =>
  shadcnSignature.test(el.className)
);
console.log('Shadcn-pattern buttons:', matches.length);
Snippet 8 — Detect Radix UI primitives
// Radix attaches data-radix-* attributes
const radixAttrs = ['data-radix-collection-item', 'data-radix-popper-content-wrapper', 'data-state'];
const counts = {};
radixAttrs.forEach(attr => {
  counts[attr] = document.querySelectorAll(`[${attr}]`).length;
});
console.table(counts);
Snippet 9 — Detect Lucide icons
// Lucide SVGs have a recognizable class pattern
const lucideIcons = document.querySelectorAll('svg[class*="lucide"]');
const totalSvgs = document.querySelectorAll('svg').length;
console.log('Lucide icons on page:', lucideIcons.length);
console.log('Total SVGs:', totalSvgs);
// Guard against division by zero on pages with no SVGs
const lucideRatio = totalSvgs ? (lucideIcons.length / totalSvgs * 100).toFixed(1) + '%' : 'n/a';
console.log('Lucide ratio:', lucideRatio);
// AI sites: ratio above 80%
Snippet 10 — Check for Next.js framework
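// Next.js leaves identifiers in the global scope and DOM.
// Pages Router builds expose __NEXT_DATA__ and #__next; App Router builds usually
// expose neither, so also look for scripts served from the /_next/ path.
const isNext = !!(
  window.__NEXT_DATA__ ||
  document.getElementById('__next') ||
  document.querySelector('script[src*="/_next/"]')
);
console.log('Next.js detected:', isNext);
if (isNext) {
  console.log('Next.js build ID:', window.__NEXT_DATA__?.buildId);
}
Snippet 11 — Inspect sitemap timestamps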
// Fetch and analyze sitemap.xml from the same origin
fetch('/sitemap.xml')
  .then(r => r.text())
  .then(xml => {
    const matches = xml.match(/<lastmod>([^<]+)<\/lastmod>/g) || [];
    const dates = matches.map(m => m.replace(/<\/?lastmod>/g, ''));
    const unique = new Set(dates);
    console.log('Total <lastmod> entries:', dates.length);
    console.log('Unique timestamps:', unique.size);
    if (dates.length > 5 && unique.size === 1) {
      console.log('All timestamps identical:', [...unique][0]);
    }
  });
Snippet 12 — Detect default 404
// Try a guaranteed-404 path
fetch('/random-path-that-cannot-exist-' + Math.random().toString(36).slice(2))
  .then(r => r.text())
  .then(html => {
    const isNextDefault = html.includes('This page could not be found') || html.includes('404 | This page');
    const isVercelDefault = html.includes('NOT_FOUND') && html.includes('Vercel');
    console.log('Next.js default 404:', isNextDefault);
    console.log('Vercel default 404:', isVercelDefault);
  });
These snippets run in any modern browser without installation. The reader can paste them sequentially into the Console and accumulate evidence in under two minutes.
---
Score severity adjustments by context
The raw score is the starting point. Context modifies the interpretation. The following adjustments refine the verdict.
Adjustment 1 — Industry baseline
Some industries have higher legitimate use of canonical patterns than others. A B2B SaaS company is more likely to use the canonical pricing card and three-benefit grid because that anatomy is genuinely effective for that audience. A boutique agency, a creative studio, or an editorial brand has stronger expectations of differentiation.
| Industry | Score adjustment | Rationale |
|----------|------------------|-----------|
| B2B SaaS, mid-market | -10 from raw | Canonical anatomy genuinely effective; less expectation of bespoke |
| Creative agency or studio | +10 to raw | Differentiation is the product; high score is more concerning |
| Boutique consultancy | +5 to raw | Premium positioning expected |
| Indie hacker / solo founder MVP | -15 from raw | Speed is the priority; AI use expected |
| Enterprise / Fortune 500 | +10 to raw | Procurement expects custom work |
| Open-source project landing | -10 from raw | Functional concerns dominate |
| E-commerce shop | -5 from raw | Platform conventions inflate score |
| Personal portfolio | depends | If creative role: +10. If technical role: -5. |
| Nonprofit | -5 from raw | Budget constraints justify generic builds |
| Editorial / media | +5 to raw | Voice and identity are core |
A B2B SaaS site scoring 40 raw is, after adjustment, scoring 30 — borderline acceptable. A creative agency scoring 40 raw is, after adjustment, scoring 50 — concerning.
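The adjustment arithmetic can be sketched as a lookup plus an addition. The industry keys below are hypothetical slugs chosen for illustration; the offsets come from the table above, and "Personal portfolio" is left out because its adjustment depends on the role.

```javascript
// Industry offsets from the adjustment table. Keys are illustrative slugs.
const INDUSTRY_ADJUSTMENT = new Map([
  ['b2b-saas-mid-market', -10],
  ['creative-agency', 10],
  ['boutique-consultancy', 5],
  ['indie-mvp', -15],
  ['enterprise', 10],
  ['open-source-landing', -10],
  ['ecommerce', -5],
  ['nonprofit', -5],
  ['editorial-media', 5],
]);

function adjustScore(rawScore, industry) {
  if (!INDUSTRY_ADJUSTMENT.has(industry)) {
    throw new Error(`Unknown industry: ${industry}`);
  }
  return rawScore + INDUSTRY_ADJUSTMENT.get(industry);
}

console.log(adjustScore(40, 'b2b-saas-mid-market')); // 30
console.log(adjustScore(40, 'creative-agency'));     // 50
```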
Adjustment 2 — Promised level of customization
The contract or pitch sets expectations. Compare the score to what was promised.
| Promise level | Acceptable score ceiling |
|---------------|--------------------------|
| "Quick landing page" or "MVP" | 60 |
| "Professional design" | 40 |
| "Custom design system" | 25 |
| "Bespoke brand identity" | 15 |
| "From-scratch fully custom" | 10 |
A score above the ceiling indicates the deliverable does not match the promise. The conversation with the vendor should focus on closing the gap — either by reworking to lower the score or by renegotiating the price to match the actual customization level.
Adjustment 3 — Time and budget invested
A site with a scoped budget of $1,500 cannot reasonably achieve the lowest score band. The cost of escaping the canonical defaults — typography research, custom palette, bespoke illustrations, motion design — is itself a labor cost. Scores must be weighed against budget.
Budget vs. acceptable score:
$0 – 1,500 60 ceiling (template territory)
$1,500 – 5,000 45 ceiling (template-plus territory)
$5,000 – 15,000 30 ceiling (custom-light territory)
$15,000 – 50,000 20 ceiling (custom territory)
$50,000+ 10 ceiling (bespoke territory)
A $30,000 deliverable scoring 50 is a price-quality mismatch. A $1,000 deliverable scoring 50 is fine: generic was what the budget bought.
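The budget tiers translate into a range lookup. In this sketch, a budget that falls exactly on a tier boundary (for example $1,500) is assigned to the lower tier; the tier list does not say which side the boundary belongs to, so that choice is an assumption, as are the function names.

```javascript
// Budget tiers from the list above: upper bound in USD and the acceptable score ceiling.
// Assumption: a budget exactly on a boundary belongs to the lower (cheaper) tier.
const BUDGET_TIERS = [
  { upTo: 1500, ceiling: 60 },
  { upTo: 5000, ceiling: 45 },
  { upTo: 15000, ceiling: 30 },
  { upTo: 50000, ceiling: 20 },
  { upTo: Infinity, ceiling: 10 },
];

function ceilingForBudget(usd) {
  if (!Number.isFinite(usd) || usd < 0) {
    throw new RangeError('budget must be a non-negative number');
  }
  return BUDGET_TIERS.find(tier => usd <= tier.upTo).ceiling;
}

function exceedsBudgetCeiling(score, usd) {
  return score > ceilingForBudget(usd);
}

console.log(exceedsBudgetCeiling(50, 30000)); // true  (ceiling 20: price-quality mismatch)
console.log(exceedsBudgetCeiling(50, 1000));  // false (ceiling 60: generic is what the budget bought)
```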
---
Audit walk-through example
The following is a fully worked example of the 21-sign audit applied in real time to a hypothetical site. The site is "QuoteEngine.io," a fictional B2B SaaS for proposal automation, which the freelance vendor charged $18,000 to build with a brief that read "modern, custom, professional design system."
Phase 1 — Visual scan (10 seconds)
The reader opens the homepage. Above the fold, the following are immediately visible:
- A pill above the H1 reading "✨ New: AI-powered proposal scoring"
- An H1 reading "The proposal platform built for modern teams"
- A paragraph: "QuoteEngine helps sales teams send winning proposals in minutes, not hours. Powered by AI."
- Two buttons: "Get started free" (filled blue) and "Watch demo" (ghost gray)
- Below the hero, a horizontal strip of four grayscale logos under the text "Trusted by leading teams"
Initial sign flags: 1 (palette likely), 4 (canonical hero), 5 (sparkle pill), 7 (trusted by), 13 (canonical microcopy).
Phase 2 — Section walk (15 seconds)
The reader scrolls. The page reveals:
- Three benefit cards with icons, headings, and short descriptions, each in a rounded-2xl container with shadow
- A "How it works" section with three numbered steps and Lucide icons
- A pricing section with three plans (Starter $19, Pro $49, Enterprise "Contact us") in rounded-2xl cards with green checkmarks
- A testimonial slider with three round headshots and short quotes
- An accordion FAQ with eight questions, chevron icons, default styling
- A footer with four columns: Product, Company, Resources, Legal
Additional sign flags: 3 (three-card grid), 8 (how it works), 9 (three pricing plans), 10 (testimonial slider), 11 (plain FAQ), 12 (four-column footer).
Phase 3 — Technical inspect (5 seconds)
The reader presses F12. Inspects the primary CTA button. The class attribute reads:
inline-flex items-center justify-center whitespace-nowrap rounded-md text-sm font-medium ring-offset-background transition-colors focus-visible:outline-none focus-visible:ring-2 bg-blue-500 text-white hover:bg-blue-600 h-10 px-4 py-2
This is the canonical shadcn button variant string with a Tailwind blue-500 background.
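The class-attribute check can be made less subjective by measuring how many of the canonical tokens survive in the observed string. The following is a minimal sketch: the token set is copied from the class attribute quoted above, and treating that exact set as the shadcn fingerprint is an assumption, as are the names `CANONICAL_BUTTON_TOKENS` and `canonical_overlap`.

```python
# Base tokens copied from the class attribute quoted in the walk-through.
# Assumption: this set stands in for the generator's default button styling.
CANONICAL_BUTTON_TOKENS = frozenset(
    "inline-flex items-center justify-center whitespace-nowrap rounded-md "
    "text-sm font-medium ring-offset-background transition-colors "
    "focus-visible:outline-none focus-visible:ring-2".split()
)


def canonical_overlap(class_attr: str) -> float:
    """Return the fraction of canonical tokens present in a class attribute."""
    tokens = set(class_attr.split())
    return len(tokens & CANONICAL_BUTTON_TOKENS) / len(CANONICAL_BUTTON_TOKENS)


observed = (
    "inline-flex items-center justify-center whitespace-nowrap rounded-md "
    "text-sm font-medium ring-offset-background transition-colors "
    "focus-visible:outline-none focus-visible:ring-2 "
    "bg-blue-500 text-white hover:bg-blue-600 h-10 px-4 py-2"
)

print(canonical_overlap(observed))                      # every canonical token present
print(canonical_overlap("btn btn-primary rounded-md"))  # one token in common
```

A value near 1.0 suggests the button was left unmodified; a hand-tuned component would typically drop or replace several tokens.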
The reader checks the computed font-family on body text. It reads "Inter, sans-serif" — Inter only.
The reader checks the gradient on the hero background blob. The computed background-image contains linear-gradient(135deg, rgb(139, 92, 246), rgb(236, 72, 153)).
The reader views the response headers via the Network tab. The first request returns x-vercel-deployment-id: dpl_abc123 and server: Vercel.
The reader appends /sitemap.xml to the domain. All entries share the same lastmod value.
Additional sign flags: 1 (confirmed blue-500), 2 (Inter only), 6 (canonical gradient), 18 (shadcn class soup), 19 (Tailwind defaults), 21 (identical sitemap dates).
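The sitemap check (sign 21, identical sitemap dates) is easy to script. The sketch below assumes a standard sitemaps.org `urlset` document whose entries carry `lastmod` elements; the function names and the `min_entries` threshold are illustrative choices, not part of the audit as published.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_values(sitemap_xml: str) -> list[str]:
    """Collect the text of every <lastmod> element in the sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.findall(".//sm:lastmod", SITEMAP_NS) if el.text]


def has_identical_lastmod(sitemap_xml: str, min_entries: int = 3) -> bool:
    """True when there are enough entries and they all share one date."""
    values = lastmod_values(sitemap_xml)
    return len(values) >= min_entries and len(set(values)) == 1


SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-01-15</lastmod></url>
  <url><loc>https://example.com/pricing</loc><lastmod>2026-01-15</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>2026-01-15</lastmod></url>
</urlset>"""

print(has_identical_lastmod(SAMPLE))
```

The `min_entries` guard avoids flagging one-page sites, where a single date is unremarkable.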
Score computation
| # | Sign | Severity | Flagged | Score |
|---|------|----------|---------|-------|
| 1 | Blue-500 + purple-500 | 4 | Yes | 4 |
| 2 | Inter only | 3 | Yes | 3 |
| 3 | Three-card grid | 5 | Yes | 5 |
| 4 | Canonical hero | 4 | Yes | 4 |
| 5 | Sparkle pill | 4 | Yes | 4 |
| 6 | Violet-to-pink gradient | 5 | Yes | 5 |
| 7 | Trusted by | 2 | Yes | 2 |
| 8 | How it works (Lucide) | 3 | Yes | 3 |
| 9 | Three pricing plans | 5 | Yes | 5 |
| 10 | Testimonial slider | 3 | Yes | 3 |
| 11 | Plain FAQ | 2 | Yes | 2 |
| 12 | Four-column footer | 4 | Yes | 4 |
| 13 | Canonical microcopy | 4 | Yes | 4 |
| 14 | Auto OG image | 3 | (not checked) | 0 |
| 15 | Default 404 | 3 | (not checked) | 0 |
| 16 | No favicon | 2 | (not checked) | 0 |
| 17 | fade-in-up only | 3 | Yes | 3 |
| 18 | shadcn class soup | 4 | Yes | 4 |
| 19 | text-gray-500 | 3 | Yes | 3 |
| 20 | Lucide everywhere | 2 | Yes | 2 |
| 21 | Identical sitemap | 3 | Yes | 3 |
| | TOTAL | | | 63/75 |
Adjustments: B2B SaaS industry (-10). Promised "custom design system" (ceiling 25). Budget $18,000 (ceiling 20, per the $15,000 – 50,000 tier).
Adjusted score: 53. Verdict band remains Pure slop. Promise gap: 53 vs. 25 ceiling = 28 points over. Budget gap: 53 vs. 20 ceiling = 33 points over.
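The arithmetic behind the raw total, the industry adjustment, and a ceiling gap can be reproduced in a few lines. This is a sketch only: the severity weights are copied from the score table, the band boundaries follow the three-way split used elsewhere in the article (0–15, 16–50, 51–75), and the function names are hypothetical.

```python
# Severity weights copied from the score table (sign number -> severity).
SEVERITY = {
    1: 4, 2: 3, 3: 5, 4: 4, 5: 4, 6: 5, 7: 2, 8: 3, 9: 5, 10: 3, 11: 2,
    12: 4, 13: 4, 14: 3, 15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 2, 21: 3,
}


def raw_score(flagged: set[int]) -> int:
    """Sum the severity of every flagged sign."""
    return sum(SEVERITY[sign] for sign in flagged)


def band(score: int) -> str:
    """Map a score onto the three-way split (boundaries assumed as above)."""
    if score <= 15:
        return "human-designed"
    if score <= 50:
        return "gray zone"
    return "pure slop"


def points_over(score: int, ceiling: int) -> int:
    """How far a score exceeds a promise or budget ceiling (never negative)."""
    return max(0, score - ceiling)


not_checked = {14, 15, 16}                 # signs skipped in the walk-through
flagged = set(SEVERITY) - not_checked
raw = raw_score(flagged)                   # 63
adjusted = raw - 10                        # B2B SaaS industry adjustment
print(raw, adjusted, band(adjusted), points_over(adjusted, 25))
```

Passing a different ceiling to `points_over` gives the corresponding gap for any other promise or budget tier.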
Outcome
The buyer's options:
- Accept the deliverable as-is and acknowledge that the marketing site is generic. Discount the freelance fee accordingly.
- Request a rework focused on lowering the score: custom palette, custom typography, restructured hero, custom illustrations, removal of canonical pricing structure. The rework should target a score below 25.
- Cancel the engagement, retain the working site for interim use, and engage a different vendor for the brand rebuild.
The audit does not choose option 1, 2, or 3. It quantifies the gap between deliverable and expectation and provides the vocabulary to negotiate.
---
Multi-page audit considerations
The 21-sign method described above audits a single page — typically the homepage. Some sites have meaningful variation across pages, where the homepage is generic but interior pages show real customization, or vice versa. A multi-page audit captures the variance.
Page selection for audit
For a site with multiple pages, the recommended sample is:
- Homepage (always audit)
- Pricing page (separate audit if it exists)
- About page (separate audit)
- One blog post or content page (separate audit)
- Contact or signup page (separate audit)
Compute the score for each page independently. The five scores combine into a profile.
Variance interpretation
The variance across pages is itself a signal.
| Variance pattern | Interpretation |
|------------------|----------------|
| All pages score similarly (within ±5) | Consistent build; either all custom or all generated |
| Homepage low, interior pages high | Hand-tuned homepage with generated interior; common with limited budget |
| Homepage high, interior pages low | Generated homepage with curated content pages; rare but exists |
| Wildly inconsistent (variance > 30) | Multiple authors or mixed AI/human work |
| One page dramatically higher | Recently added section, possibly post-launch generator output |
A site where every page scores 60+ is fully generated with no human curation. A site where the homepage scores 10 and the pricing page scores 60 is a hand-built marketing front with a generated pricing extension. The latter is a common pattern when a startup adds pricing late and uses a generator for the addition.
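The per-page scores can be condensed into a few numbers before reading them against the variance patterns. The sketch below is one possible summary, not a classifier: it reports the spread, the homepage's distance from the interior average, and the highest-scoring page, and leaves the interpretation to the auditor. Apart from the homepage 10 / pricing 60 pair mentioned above, the page scores are invented for illustration.

```python
from statistics import mean


def page_profile(scores: dict[str, int], home: str = "homepage") -> dict[str, object]:
    """Summarize a multi-page audit; `scores` maps page name to audit score."""
    interior = [value for page, value in scores.items() if page != home]
    return {
        "spread": max(scores.values()) - min(scores.values()),
        "home_minus_interior": scores[home] - mean(interior),
        "highest_page": max(scores, key=scores.get),
    }


# Hypothetical profile: hand-built front with a generated pricing page.
example = {"homepage": 10, "pricing": 60, "about": 12, "blog": 15, "contact": 11}
print(page_profile(example))
```

A large spread with a single standout `highest_page` points toward the "one page dramatically higher" row; a strongly negative or positive `home_minus_interior` points toward the two homepage rows.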
Per-route signal
For Next.js sites, the per-route bundle can reveal which pages have custom code. Using DevTools' Network tab, navigate between pages and watch for:
- A new _next/static/chunks/pages/[route].js request, which indicates a code-split page with potentially custom logic.
- The same chunk reused, which indicates the page is part of the static export with shared component reuse.
A site where every route reuses identical chunks is more likely to be generated than a site with diverse per-route bundles.
---
Detection at scale: auditing many sites
Some readers — recruiters, agency procurement teams, talent directors — face the problem of auditing dozens of candidate sites. The manual 30-second method does not scale to one hundred sites in a single sitting. The following workflow handles volume.
Phase A — Triage with automation
Use Sailop or an equivalent automated scanner to compute scores for the entire batch. For a batch of one hundred sites, this takes roughly fifteen minutes of compute time and produces a ranked list.
# Hypothetical Sailop batch command:
sailop scan --batch urls.txt --output scores.json --format csv
Phase B — Filter to the gray zone
The automated batch produces three groups:
- Clearly human (score 0–15)
- Gray zone (score 16–50)
- Clearly slop (score 51–75)
Group 1 candidates pass without further audit. Group 3 candidates either fail or warrant a manual conversation about disclosure. Group 2 — the gray zone — receives the manual 30-second audit.
For a batch of one hundred typical freelance candidates, the distribution from the historical sample (in the chart above) suggests roughly 33 in group 1, 51 in group 2, and 16 in group 3.
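The three-group split is a simple bucketing step once scores exist. The sketch below assumes the scanner output has already been loaded into a mapping from URL to score; the real output format of any scanner is not specified here, and the URLs and function name are hypothetical.

```python
def triage(scores: dict[str, int]) -> dict[str, list[str]]:
    """Bucket sites into the three groups by score (0-15, 16-50, 51-75)."""
    groups: dict[str, list[str]] = {"human": [], "gray": [], "slop": []}
    for url, score in scores.items():
        if score <= 15:
            groups["human"].append(url)
        elif score <= 50:
            groups["gray"].append(url)
        else:
            groups["slop"].append(url)
    return groups


# Hypothetical batch of four candidate sites.
batch = {
    "https://a.example": 9,
    "https://b.example": 34,
    "https://c.example": 50,
    "https://d.example": 62,
}
groups = triage(batch)
print({name: len(urls) for name, urls in groups.items()})
```

Only the `gray` list moves on to the manual 30-second pass.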
Phase C — Manual gray-zone audit
Group 2 candidates each receive a manual 30-second pass. The pass produces:
- Confirmation of the automated score (or correction up/down)
- A qualitative note on dominant signs
- A three-way recommendation: proceed, decline, or request process documentation
For 51 gray-zone candidates at 30 seconds each, the manual phase takes roughly 25 minutes of focused attention.
Phase D — Deep dive on top candidates
The remaining short list of strong candidates (typically 5–15 names) receives the full 15-minute audit protocol from the recruiter section above. For 10 candidates at 15 minutes each, the deep dive takes 2.5 hours.
Total time budget
A complete audit of one hundred candidate sites:
- Phase A (automated): 15 minutes
- Phase B (filtering): 5 minutes
- Phase C (gray zone manual): 25 minutes
- Phase D (top candidates deep dive): 150 minutes
Total: roughly 3.25 hours (195 minutes) for a complete batch audit. This compares favorably to the alternative of full manual review (15 minutes × 100 = 25 hours) or no review at all (the prevailing default).
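For readers adapting the workflow to a different batch size, the time budget is a straightforward sum. The sketch below uses the phase estimates listed above; the variable names are illustrative.

```python
# Phase estimates in minutes, taken from the list above.
PHASE_MINUTES = {"A": 15, "B": 5, "C": 25, "D": 150}

total_minutes = sum(PHASE_MINUTES.values())
total_hours = total_minutes / 60

# Full manual review: 15 minutes for each of 100 sites.
full_manual_hours = 15 * 100 / 60

print(total_minutes, total_hours, full_manual_hours)
```

The phase estimates sum to 195 minutes, a little over three hours, against 25 hours for a fully manual review of the same batch.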
---
The economic logic of slop
Why does pure slop exist? The audit produces evidence; it does not produce a theory. A brief economic framing helps explain the prevalence.
Vendor incentives
A freelance designer or small agency faces a sharp tradeoff. Custom design takes time. Time is the binding constraint on revenue. A typical custom landing page from an experienced designer takes 40–80 hours over 4–8 weeks. A generated landing page takes 2–4 hours over 1–2 days.
If both are billed at $15,000, the generated work earns roughly $5,000/hour (at the 3-hour midpoint) while the custom work earns roughly $250/hour (at the 60-hour midpoint). The vendor's incentive to ship slop while charging custom rates is enormous.
The constraint on this incentive is detection. If buyers detect slop, the vendor loses repeat business and reputation. The audit shifts the detection probability upward, which shifts the equilibrium back toward honest pricing.
Buyer vulnerability
The typical buyer is not a designer. The typical buyer cannot, before learning the audit, distinguish a generic landing page from a custom one. The visual quality of generator output is high enough that, to an untrained eye, it looks competent and professional.
Buyers compensate by relying on referrals, portfolios, and testimonials. None of these are robust against vendor misrepresentation. A vendor with a strong portfolio (perhaps borrowed, perhaps from a few legitimate engagements) can ship slop on subsequent contracts without easy detection.
The audit is the buyer's response to this asymmetry. It transfers some of the diagnostic capacity from the trained designer to the lay buyer.
Market consequences
If audit adoption increases, three market effects follow:
- Honest pricing equilibrium: Vendors who use AI tools are forced to price accordingly. The $15,000 generated landing page becomes a $3,000 generated landing page, with the difference reflecting transparency rather than effort.
- Premium for differentiation: Vendors who do genuinely custom work can command a higher premium because the audit verifies their claim. The $15,000 custom landing page becomes a $25,000 custom landing page that buyers can confidently distinguish from generated work.
- Generator commoditization: Generators themselves become positioned as buyer-facing tools. A buyer who knows they need a generic site can use v0 directly for $20/month instead of paying a vendor to use v0 on their behalf. The intermediation premium collapses.
The end state is a market where buyers either pay for tools (generators, templates) or pay for differentiation (custom designers). The middle — generic work passed off as custom — disappears under audit pressure.
---
Edge case: hybrid sites
A particular pattern that confounds the simple score is the hybrid site. The pattern occurs when a vendor uses a generator for the marketing pages (homepage, pricing, about) and then writes custom code for the product or app section. The marketing pages score high; the product section scores low.
The hybrid pattern is increasingly common because:
- The marketing pages have less variance per industry; a generator output is "good enough" for most B2B marketing.
- The product or app needs domain-specific logic that generators cannot produce.
- The vendor's effort is concentrated where it matters technically, not visually.
For audit purposes, the hybrid pattern requires distinguishing the contractual scope. If the buyer paid for "the website," and the marketing pages are slop, the score reflects the deliverable. If the buyer paid for "the product," and the marketing pages are an afterthought, the marketing score may not be the relevant signal.
The audit should explicitly note when a hybrid pattern is detected:
Audit summary:
- Marketing pages: score 58 (Pure slop band)
- Product/app pages: score 14 (Human-designed band)
- Hybrid pattern confirmed
- Contractual scope: [buyer to determine]
- Recommendation: clarify whether the contract covered the marketing surface
The hybrid pattern is not necessarily bad faith on the vendor's side. It can be an honest division of effort that the buyer was not warned about. The audit makes the division visible.
---
Edge case: the deliberate generic
A small but genuine class of sites scores high on the audit because the designer deliberately chose to look generic. The intent might be:
- Communicating "we are like every other modern startup" as a positioning choice.
- Reducing visual friction with users accustomed to canonical patterns.
- Saving design effort to allocate elsewhere (product, customer success, content).
- Conforming to a corporate parent's brand guidelines that resemble modern startup defaults.
In these cases, the high score is an accurate measurement, but the interpretation differs. The site is generic, yes, but the buyer who wanted generic got what they paid for.
The audit cannot distinguish deliberate generic from accidental generic. The conversation with the vendor reveals the difference. Asking "was the canonical SaaS visual language a deliberate choice?" produces the answer. A deliberate choice will usually have been articulated in writing during the engagement; an accidental one will not have been discussed at all.
For deliberate generic, the audit serves a documentation purpose: the buyer should at least know they bought generic. The expectation alignment matters more than the score.
---
Edge case: rebrands and refreshes
When auditing a recently rebranded site, the score may be temporarily inflated. A rebrand often involves:
- A new visual identity created by humans (low score signal)
- Implementation by developers using current frameworks (high score signal)
- A mix of new content and legacy content (variance)
The post-rebrand site may score in the 30s on the audit, not because it is AI-generated, but because the implementation layer is canonical even when the design layer is custom.
The mitigation is to audit the design system separately from the implementation:
- Look at the brand book, if available. Was a custom palette specified? Custom typography? Custom illustration style?
- Compare the brand book to the deployed site. Does the deployment match the brand book?
- If the brand book specifies custom and the deployment is canonical, the implementation team may have shortcut the brand work.
The score is then weighted: brand work is what the buyer paid the brand agency for; implementation may have been delegated to a different team. A high score on a rebranded site can indicate execution debt rather than design debt.
---
The role of templates
A significant fraction of the gray-zone sites are not AI-generated but template-based. Templates from ThemeForest, Tailwind UI, Cruip, and similar marketplaces produce sites that share most of the canonical signs. The audit cannot distinguish a template from generator output.
This is intentional. From the buyer's perspective, the relevant question is not "did AI make this?" but "is this generic?" A template-based site is generic by construction; a generator-based site is generic by emergence. The economic and aesthetic outcome for the buyer is similar.
That said, some buyers care about the distinction. To probe template versus generator origin:
- Reverse-image-search the OG image. If it appears on multiple unrelated sites, a template is shared.
- Inspect the source for template-specific identifiers. Tailwind UI components have predictable class compositions. Cruip templates have characteristic file structures.
- Ask the vendor directly. Most will admit template use when asked, since templates are an accepted commercial reality.
Templates are not a problem unless misrepresented. A buyer who knew they were getting a template gets what they paid for. A buyer who was told "completely custom" and got a Tailwind UI template is in the same misrepresentation territory as one who got generator output.
---
Method calibration over time
The 21 signs catalogued above reflect the state of generator defaults in early 2026. The catalog should be re-validated quarterly. The following protocol describes the calibration process for readers who maintain their own audit framework.
Step 1 — Generate fresh samples
Each quarter, generate ten new sites from each major tool (v0, Lovable, Bolt, Replit Agent, Claude Artifacts, ChatGPT canvas) using a standard prompt: "Build a modern landing page for a B2B SaaS startup that helps sales teams write better proposals."
The standard prompt provides a consistent baseline. Variations in defaults across quarters reflect generator updates rather than prompt variation.
Step 2 — Tabulate sign frequency
For each sign in the catalog, count the frequency of presence across the fresh samples. A sign that was diagnostic in Q1 (present in 80 percent of samples) but is now present in only 30 percent of samples has lost diagnostic value and should be downweighted or replaced.
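The tabulation step can be sketched as follows. Each fresh sample is represented as the set of sign numbers found on it; the `min_drop` threshold for "lost diagnostic value" is a hypothetical parameter, since no cutoff is fixed above, and the sample data is invented to mirror the 80-percent-to-30-percent example.

```python
from collections import Counter


def sign_frequency(samples: list[set[int]], catalog: list[int]) -> dict[int, float]:
    """Fraction of samples in which each catalog sign is present."""
    counts = Counter(sign for sample in samples for sign in sample)
    return {sign: counts[sign] / len(samples) for sign in catalog}


def declined(previous: dict[int, float], current: dict[int, float],
             min_drop: float = 0.25) -> list[int]:
    """Signs whose frequency fell by at least `min_drop` since the last cycle."""
    return sorted(
        sign for sign, freq in previous.items()
        if freq - current.get(sign, 0.0) >= min_drop
    )


# Ten hypothetical fresh samples: sign 1 appears in 8, sign 5 in 3.
samples = [{1, 5}] * 3 + [{1}] * 5 + [set()] * 2
previous_quarter = {1: 0.8, 5: 0.8}
current_quarter = sign_frequency(samples, catalog=[1, 5])

print(current_quarter)
print(declined(previous_quarter, current_quarter))
```

Signs returned by `declined` are the candidates for downweighting or replacement in the next catalog version.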
Step 3 — Identify new defaults
Examine the fresh samples for patterns not in the existing catalog. New defaults emerge as generators update. Examples of recently emerged signs (hypothetical illustration only):
- Glassmorphism cards with backdrop-blur (sign of recent v0 updates)
- Pixel-art mascot in the hero (sign of certain Lovable templates)
- Variable font usage in headings (sign of new shadcn typography variants)
New signs are added to the catalog with a starting severity matching their observed diagnostic value.
Step 4 — Recalibrate severity scores
For each sign, recompute the diagnostic accuracy: precision (probability AI given sign present) and recall (probability sign present given AI). The severity score combines these into a single number scaled 1 to 5.
A sign with high precision but low recall is a strong but rare signal — high severity, low frequency. A sign with high recall but low precision is a common but weak signal — moderate severity, high frequency. The severity scale absorbs this tradeoff.
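Precision and recall for a single sign can be computed from three counts. The sketch below covers only that part: the formula that folds the two numbers into a 1-to-5 severity is not specified here, so it is left out rather than guessed. The counts in the example are invented, and precision measured this way depends on the mix of AI and human sites in the sample.

```python
def precision_recall(ai_with_sign: int, human_with_sign: int,
                     ai_total: int) -> tuple[float, float]:
    """Return (precision, recall) for one sign.

    precision = P(AI | sign present), recall = P(sign present | AI).
    """
    with_sign = ai_with_sign + human_with_sign
    precision = ai_with_sign / with_sign if with_sign else 0.0
    recall = ai_with_sign / ai_total if ai_total else 0.0
    return precision, recall


# Hypothetical counts: 60 generated samples, 48 show the sign;
# 6 sites in the human control sample also show it.
precision, recall = precision_recall(ai_with_sign=48, human_with_sign=6, ai_total=60)
print(round(precision, 3), round(recall, 3))
```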
Step 5 — Re-test against human samples
To prevent the catalog from drifting toward over-detection, the recalibrated method must be re-tested against a control sample of known human-designed sites. False positive rate should remain below 15 percent for signs above severity 3, and below 30 percent for signs at severity 2 or below.
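The control-sample test reduces to a per-sign false positive rate compared against the stated limits. In this sketch, severity 3 has no limit because the thresholds above cover only signs above 3 and signs at 2 or below; that gap is preserved rather than filled. Function names and the example counts are hypothetical.

```python
from typing import Optional


def fpr_limit(severity: int) -> Optional[float]:
    """Stated false-positive ceiling for a severity, or None if unspecified."""
    if severity > 3:
        return 0.15
    if severity <= 2:
        return 0.30
    return None  # severity 3: no limit is stated


def failing_signs(severity: dict[int, int], human_hits: dict[int, int],
                  human_total: int) -> list[int]:
    """Signs whose false positive rate on the human sample reaches the limit."""
    failures = []
    for sign, sev in severity.items():
        limit = fpr_limit(sev)
        rate = human_hits.get(sign, 0) / human_total
        if limit is not None and rate >= limit:
            failures.append(sign)
    return sorted(failures)


# Hypothetical control sample of 40 known human-designed sites.
severity = {3: 5, 7: 2, 2: 3}
human_hits = {3: 8, 7: 8, 2: 20}
print(failing_signs(severity, human_hits, human_total=40))
```

In the example, sign 3 fails (20 percent against a 15 percent limit), sign 7 passes (20 percent against 30 percent), and sign 2 is not judged.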
The full calibration cycle takes roughly four hours of focused work per quarter. The output is a versioned catalog (v1.0, v1.1, v1.2, etc.) with documented changes.
---
Cross-reference with content slop detection
The audit described above focuses on visual and structural slop. A complementary diagnostic targets content slop — the actual words on the page. The two are related but distinct.
Content slop signs include:
- Long paragraphs of generic startup language with no specific examples
- Repeated transitions like "Furthermore," "Moreover," "Additionally," "In conclusion"
- The phrase "in today's fast-paced world" or variants
- Lists of three benefits, each with a generic adjective ("Powerful," "Seamless," "Intuitive")
- Bullet points that all begin with verbs in the same tense
- A blog post grid where every post has a similar abstract image and a one-line summary that explains nothing
The content audit correlates moderately with the visual audit. A site that scores high on visual signs typically also has generic copy, since both originated from the same generator pass. But the correlation is not perfect: some teams use a generator for visuals and write copy by hand, or vice versa.
For a complete audit of a site, compute both scores and report them separately. A site might be 60 visual / 20 content (visual generator, hand-written copy) or 20 visual / 60 content (custom design, generic copy). The combined report is more informative than either alone.
The content audit is largely the domain of GPTZero and similar tools. A practical workflow:
- Run the 21-sign visual audit (this article) for visual score.
- Run GPTZero or equivalent on the longest text blocks for content score.
- Combine into a unified verdict.
A site scoring high on both is fully generated. A site scoring high on one and low on the other is partially generated, and the audit can identify which part.
---
Real-world evidence: a sample of public claims
Throughout 2025 and into 2026, several incidents emerged where freelance work was publicly accused of being AI-generated. The pattern in each was similar:
- A buyer received a site that looked competent on the surface.
- A more technically literate party (often another developer) noticed canonical signs.
- A public conversation followed, on Twitter, LinkedIn, or designer forums.
- The vendor either admitted AI use, denied it (and was usually unconvincing), or quietly disappeared from the platform.
The public examples are not cited by name in this article to avoid amplification. The pattern, however, supports the general method: in every documented case, the diagnostic signs raised by the technical observer matched a substantial subset of the 21 signs in this catalog.
The audit formalizes what experienced developers already do informally. It moves the diagnostic from intuition ("this looks generated") to evidence ("here are 14 of 21 signs present"). The shift is significant for public conversations because it changes the burden: a vendor cannot dismiss "this looks generated" but must address "here are the 14 specific features that match generator defaults."
---
Limitations of the comparative approach
The 21-sign method is comparative: it measures the deliverable's resemblance to known generator outputs. The approach has structural limitations.
Limitation 1 — New generators
When a new generator launches with substantially different defaults, the catalog does not capture it until the next calibration cycle. A site generated by a tool released last week, with different defaults, may score zero on the catalog while being entirely AI-generated.
The mitigation is the meta-signal: any site that looks too consistent — every section in the same style, every spacing identical, every choice the obvious default — is suspect even if the specific defaults are unfamiliar. A trained eye captures this even when the catalog does not.
Limitation 2 — Adversarial customization
A vendor who knows the audit can deliberately add noise. Replace blue-500 with #3b83f7 (two hex digits off, visually identical, but no longer matching the audit's exact hex). Use a custom font similar to Inter. Restructure the hero with three buttons instead of two. The score drops; the underlying generator scaffolding remains.
The mitigation is to score the structural patterns at a higher level of abstraction. The presence of a hero with a CTA pair (regardless of count) is a structural sign. The use of a single sans-serif throughout (regardless of whether it is Inter exactly) is a structural sign. The audit can be reformulated to capture these higher-level patterns, at the cost of higher false-positive rates.
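A complementary and narrower countermeasure, not part of the structural reformulation described above, is to match colors within a tolerance instead of by exact hex. The sketch below compares colors channel by channel; the tolerance of 8 is an arbitrary illustrative value and the helper names are hypothetical.

```python
def hex_to_rgb(value: str) -> tuple[int, int, int]:
    """Convert '#rrggbb' to an (r, g, b) tuple."""
    digits = value.lstrip("#")
    return (int(digits[0:2], 16), int(digits[2:4], 16), int(digits[4:6], 16))


def channel_distance(a: str, b: str) -> int:
    """Largest per-channel difference between two hex colors."""
    return max(abs(x - y) for x, y in zip(hex_to_rgb(a), hex_to_rgb(b)))


def near_default(color: str, default: str, tolerance: int = 8) -> bool:
    """True when a color is within `tolerance` of a known default."""
    return channel_distance(color, default) <= tolerance


BLUE_500 = "#3b82f6"  # Tailwind blue-500

print(channel_distance("#3b83f7", BLUE_500))  # differs by 1 in two channels
print(near_default("#3b83f7", BLUE_500))
print(near_default("#1e3a5f", BLUE_500))
```

A wider tolerance catches more disguised defaults but also more legitimately chosen blues, which is the same false-positive tradeoff the structural approach carries.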
Limitation 3 — Convergence with human design
The defaults that generators produce are not random. They are the modal patterns of human-designed sites in the training corpus. As humans converge on these patterns (because they are effective), the audit's discrimination capacity decreases. The catalog cannot distinguish a generator from a human who chooses generator-aligned defaults.
The mitigation is to focus the audit on signs that humans rarely produce: the technical fingerprints (shadcn class soup unmodified, identical sitemap timestamps, default 404s). These are less converged because they reflect platform laziness rather than design choice.
Limitation 4 — Single point in time
The audit captures the site at one moment. A site that was generated, then thoroughly customized over six months, may now score low. A site that was hand-built but recently extended with a generated section may now score high. The audit does not see the trajectory.
The mitigation is to ask about history. Vendors who hand-built typically have commit history, design files, and process documentation. The audit produces the score; the supporting evidence produces the trajectory.
---
Cross-tool comparison: what each generator gets right
A balanced view requires noting where each major generator differs from the modal output. The differences are diagnostic in their own right, but they also illustrate that "AI-generated" is not monolithic.
v0 (Vercel)
Strengths: Strong technical defaults. Next.js App Router. Server components used correctly. Good accessibility baseline. Forms wired with React Hook Form and Zod by default.
Tells: Heavy use of shadcn directly from the CLI. Violet-pink gradient as a near-mandatory accent. Inter font. Vercel hosting headers.
Departures from canonical: v0 occasionally produces non-three-card grids when prompted specifically. Custom palettes are possible if the prompt specifies hex values.
Lovable
Strengths: Includes a working Supabase backend by default. Authentication, database, file storage all wired. Easier to ship a full-stack MVP than v0.
Tells: "Built with Lovable" link in footer unless removed. Supabase client visible in network requests. Slightly different shadcn variant set than v0.
Departures from canonical: Lovable sometimes uses Mantine instead of shadcn for component primitives, which produces a different class signature.
Bolt.new
Strengths: Multiple framework support. Astro, Vite, Next.js, SvelteKit all available. Less Next.js-locked than v0.
Tells: StackBlitz container artifacts. Sometimes leaves bolt-new-config.json in public directories. Bundle includes @stackblitz/sdk.
Departures from canonical: Astro sites from Bolt have a different DOM signature than React sites from v0, even when the visual surface is similar.
Replit Agent
Strengths: Full development environment. Real backends, real databases. Closer to a full app than a marketing site.
Tells: replit.app subdomain unless custom domain is configured. .replit config file occasionally exposed in public. Vite + React more common than Next.js.
Departures from canonical: Replit sites are more often functional apps than marketing pages, so the 21-sign catalog (focused on marketing pages) applies less directly.
Claude Artifacts
Strengths: Single-file outputs. Often inlined in HTML. Self-contained with Tailwind via CDN.
Tells: No framework router. Tailwind via CDN script tag. Less likely to use shadcn (Claude prefers raw Tailwind).
Departures from canonical: The single-file constraint forces simpler structures. No code-splitting, no lazy loading, no API routes. The simplicity itself is a tell.
ChatGPT canvas
Strengths: Similar to Claude Artifacts but more often produces vanilla JavaScript without React.
Tells: Tailwind via CDN. Less framework structure than Claude. More likely to be a single static page with inline styles.
Departures from canonical: ChatGPT canvas outputs are smaller and simpler than v0 or Lovable, often closer to demo-grade than production-grade.
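Several of the tells listed above are simple string checks on the hostname, the page source, and the response headers. The sketch below covers four of them and is deliberately conservative: each match is a hint rather than an attribution, the function name is hypothetical, and the assumption that a Tailwind CDN script tag points at cdn.tailwindcss.com is mine rather than something the tells above state.

```python
def tool_hints(hostname: str, html: str, headers: dict[str, str]) -> list[str]:
    """Return human-readable hints about which generator may be involved."""
    lowered_headers = {key.lower(): value for key, value in headers.items()}
    page = html.lower()
    hints = []
    if lowered_headers.get("server", "").lower() == "vercel":
        hints.append("Vercel hosting (v0 is one possible source)")
    if "built with lovable" in page:
        hints.append("Lovable footer link")
    if hostname.lower().endswith(".replit.app"):
        hints.append("Replit subdomain")
    if "cdn.tailwindcss.com" in page:  # assumed CDN host for the script tag
        hints.append("Tailwind via CDN (Claude Artifacts or ChatGPT canvas)")
    return hints


print(tool_hints("quoteengine.io", "<html></html>", {"Server": "Vercel"}))
print(tool_hints(
    "demo.replit.app",
    '<script src="https://cdn.tailwindcss.com"></script>',
    {},
))
```

An empty result says nothing about origin; most of these tells disappear as soon as a custom domain is configured or the footer link is removed.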
---
Three-month-out predictions
The 2026 generator landscape will not stabilize. Predictions for the period through the end of 2026:
Convergence on a single stack: All major generators will converge on Next.js + Tailwind + shadcn + Inter as the de facto standard. The current variance (Bolt's Astro, Lovable's Vite) will diminish. Detection becomes simpler.
Counter-convergence on customization: Sophisticated users will increasingly add a customization layer to generator output. The middle tier of audit scores (30–50) will inflate; pure slop (51+) will become rarer because the worst offenders learn to add basic customization.
Audit tooling proliferation: Sailop and equivalent tools will become standard procurement tooling for agencies and recruiters. Public APIs will allow real-time scanning during freelance vetting calls.
Vendor transparency requirements: Industry norms will shift toward mandatory AI disclosure in client work. Contracts will routinely include AI-use clauses. The current ambiguity will be replaced by explicit terms.
Generator-specific countermeasures: Generator companies (Vercel, Lovable, Bolt) will respond to detection by deliberately varying their defaults. Random palette selection, varied component structures, different font defaults. The canonical signature will fragment.
Detection lag and recalibration: As generators vary their defaults, the audit catalog will require monthly rather than quarterly updates. The diagnostic capacity will compress to a smaller set of structural signs (hero anatomy, pricing structure, footer pattern) that change more slowly than surface details.
The trajectory is one of mutual escalation: better detection, better adversarial customization, better detection again. The equilibrium is not no-AI; it is transparent-AI, where buyers know what they are paying for and vendors price accordingly.
---
Concluding observations
The 21-sign method is a tool. Like any tool, its value depends on how it is used.
Used well, it gives a buyer the vocabulary to ask the right questions, the evidence to support a renegotiation, and the framework to evaluate vendors objectively. It transfers diagnostic capacity from a small group of experienced designers to anyone who reads this article.
Used poorly, it becomes a weapon. A buyer who treats the score as a verdict, who refuses to consider context, who weaponizes the 21 signs against legitimate AI-assisted work, undermines the productivity gains that AI tools provide to the entire field.
The intended use is the former. The article documents twenty-one observable features. It assigns severity weights. It produces a score. The score is a starting point for a conversation, not a substitute for one.
For the recruiter screening agency portfolios, the audit is preparation for the vetting call. For the buyer evaluating a deliverable, it is preparation for the project review meeting. For the founder considering vendors, it is preparation for the contract negotiation.
In all three cases, the audit closes the information asymmetry between buyer and vendor. That closure has value regardless of the eventual decision. A buyer who decides to accept generator output at generator pricing is fine. A buyer who decides to demand custom work and pays accordingly is fine. The buyer who is unaware of the distinction is the one the audit serves.
---
The 21-sign method has limits. It produces probabilities, not certainties. It measures alignment with current defaults, which drift over time. It can be defeated by sophisticated customization. None of these limits invalidate the audit. They define its scope.
What the audit does, reliably, is shift the burden of evidence. Before the audit, the buyer faced a black box: a deliverable that may or may not have been custom work. After the audit, the buyer has a number, a per-sign breakdown, and a vocabulary for the conversation that follows. The number is not a verdict, but it is a starting point.
For thirty seconds of inspection, that starting point is enough.