Sources & methodology

Where this material comes from.

Storgy is built on a small number of well-known public-domain and openly licensed corpora and a couple of AI services. This page documents what each one provides, the licence under which it appears, and where on the site it shows up.

Built on: Project Gutenberg, Wikidata, Wikipedia, Wikimedia Commons, Anthropic Claude

Editorial principle

A page is only as trustworthy as its citations.

A teacher cannot cite a study guide unless the study guide can cite its own sources. Every poem body, every birth year, every line of analysis on Storgy traces back to a public record (Project Gutenberg, Wikidata, or Wikipedia) or to an AI pass that is named for what it is.

The corpus

Four public-record sources sit behind the site.

01 Poem texts

Project Gutenberg

Every full poem text on Storgy is sourced from Project Gutenberg, a digital library of public-domain books operating since 1971. Each poem page links back to the source ebook so the transcription can be verified against the original.

Coverage: Poem bodies
Licence: Public domain (life + 70)


02 Poet metadata

Wikidata

Birth and death dates, nationality, period, and the structured links between poets come from Wikidata, the structured-data project that sits behind Wikipedia, where statements can carry their own references. When the Wikidata entry conflicts with Wikipedia, the Wikidata value wins.

Coverage: Dates, nationality, period, links
Licence: CC0 (public domain dedication)


03 Biography source text

Wikipedia

The first draft of every poet biography is built from public passages on Wikipedia under the CC BY-SA licence. Each biography is then rewritten in the Storgy editorial voice before publishing — the bios are not Wikipedia copies, but Wikipedia is the factual ground they work from.

Coverage: Biographical source text
Licence: CC BY-SA 4.0


04 Poet portraits

Wikimedia Commons

Where a poet has a public-domain or freely-licensed portrait on Wikimedia Commons, Storgy uses it. Images are loaded directly from upload.wikimedia.org so attribution and licence traceability live with the file.

Coverage: Poet portraits & historical photography
Licence: Public domain / free licences


AI methodology

How Claude and the humaniser produce every analysis.

Stage A

Anthropic Claude

Summaries, line-by-line analyses, theme essays, and the responses from the Poem Analyzer are produced by Anthropic's Claude Sonnet model, asked to emit a fixed JSON shape so every output carries the same seven sections.

Stage B

Zod schema validation

Every AI response is parsed through strict zod schemas server-side before it touches the database. Hallucinated keys, malformed arrays, or truncated output are rejected and re-queued — never silently rendered.
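
As a rough illustration of the strict check described above, here is a minimal, dependency-free sketch. Storgy uses zod; this hand-rolled version only mirrors what a strict object schema does and is not the site's code. The seven section names, `validateAnalysis`, and the result shape are all hypothetical, since this page does not name them.

```typescript
// Hypothetical section names: the page says there are seven sections
// but does not list them.
const SECTION_KEYS = [
  "summary", "context", "structure", "themes",
  "devices", "lineByLine", "takeaways",
] as const;

type SectionKey = (typeof SECTION_KEYS)[number];
type Analysis = Record<SectionKey, string>;

type ValidationResult =
  | { ok: true; value: Analysis }
  | { ok: false; reason: string };

// Accept raw model output only if it is valid JSON with exactly the
// expected keys, each holding a non-empty string.
function validateAnalysis(raw: string): ValidationResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw); // truncated output usually fails here
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, reason: "not an object" };
  }
  const record = parsed as Record<string, unknown>;
  const allowed = SECTION_KEYS as readonly string[];
  for (const key of Object.keys(record)) {
    // Strict mode: an unknown (possibly hallucinated) key rejects the payload.
    if (allowed.indexOf(key) === -1) {
      return { ok: false, reason: "unexpected key: " + key };
    }
  }
  for (const key of SECTION_KEYS) {
    const value = record[key];
    if (typeof value !== "string" || value.trim() === "") {
      return { ok: false, reason: "missing or empty section: " + key };
    }
  }
  return { ok: true, value: parsed as Analysis };
}
```

A caller would store the value only when `ok` is true and put the job back on the queue otherwise, so a rejected response is never rendered.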

Stage C

Humaniser pass

A second model rewrites the stiff, repetitive phrasing that LLM text falls into — the over-padded "In conclusion" cadence, the rule-of-three sentences, the AI vocabulary. The aim is short, concrete prose.

Stage D

Student input is not training data

Text pasted into the Poem Analyzer and the writer tools is processed for that single request. Anthropic's API terms specify that inputs and outputs are not used to train models, which covers the primary call, and the same policy applies on the humaniser pass.


Every poem records its public-domain status before going live.

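Storgy classifies public-domain status under the life + 70 rule, the copyright term used in the EU, UK, and Australia (the Berne Convention itself sets only a minimum of life + 50). Anything published in full on the site is therefore public domain in those jurisdictions, and almost always in the US too.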

Where a poet's work is in the US public domain but still under copyright in the UK or EU (the Eliot, Frost, and Frye gap), the full text is shown only to US visitors and Googlebot via an edge geo-gate. Readers outside the US see an excerpt and a note explaining why.
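
The gate decision described above could be sketched roughly as follows. This is an illustration under assumptions, not Storgy's edge code: the status labels, the `chooseView` function, and the choice to fall back to an excerpt when the country lookup fails are all hypothetical.

```typescript
// Hypothetical status labels; the page only says each poem records a status.
type PdStatus = "pd-everywhere" | "pd-us-only" | "in-copyright";

type View = "full" | "excerpt";

// Decide what to serve from the poem's recorded status and the visitor's
// ISO 3166-1 alpha-2 country code (null when the lookup fails).
function chooseView(
  status: PdStatus,
  country: string | null,
  isGooglebot: boolean,
): View {
  if (status === "pd-everywhere") return "full";
  if (status === "pd-us-only") {
    // Assumption: an unknown country is treated like a non-US visitor.
    return country === "US" || isGooglebot ? "full" : "excerpt";
  }
  return "excerpt"; // in-copyright poems never get full text
}
```

For example, `chooseView("pd-us-only", "GB", false)` yields `"excerpt"`, while the same poem requested from the US yields `"full"`.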

What is not used

01
No rights-managed scraping.
No crawled content from contemporary literary magazines, paywalled anthologies, or modern study-guide sites.
02
No verbatim Wikipedia.
Wikipedia is the factual ground for biographies, but every paragraph is rewritten before publishing. Source URLs are linked on each poet page.
03
No copyrighted full texts.
Living poets, and poets whose work is still under copyright globally, are not in the corpus as full text. The Poem Analyzer can still process anything a reader pastes in.

Questions

Things readers and reviewers ask about the corpus.

Is every poem on Storgy really public domain?

Every full-text poem on Storgy is sourced from Project Gutenberg and classified under the life + 70 rule, and that status is recorded in the database before publication. Anything still under copyright is either shown as an excerpt with a Project Gutenberg link or, for the US-PD / EU-copyright gap, gated so the full text is served only to US visitors.

Where does the line-by-line analysis come from?

The analysis is produced by Anthropic's Claude Sonnet model, asked to emit a fixed JSON shape with seven sections. The output is validated through a zod schema, then passed through a second model that rewrites the stiff, repetitive cadence AI text tends to fall into. No section is hand-written line by line — the editorial control sits in the schema, the humaniser pass, and the corpus-deflect check that prevents the analyser from re-billing on a text already in the database.
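
This page does not say how the corpus-deflect check works. One plausible shape, shown here purely as an assumption, is to normalise the pasted text and look it up before making a model call. The names `normalise`, `registerPoem`, and `deflect`, and the in-memory index standing in for the database, are hypothetical.

```typescript
// Collapse case, punctuation, and whitespace so trivial differences in a
// pasted poem do not defeat the lookup. Simplification: only ASCII letters
// and digits survive, so accented characters are dropped.
function normalise(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Stand-in for the database: normalised poem text -> stored analysis id.
const corpusIndex: Record<string, string> = Object.create(null);

function registerPoem(text: string, analysisId: string): void {
  corpusIndex[normalise(text)] = analysisId;
}

// Returns the stored analysis id when the text is already in the corpus,
// or null when a fresh (billed) model call would be needed.
function deflect(pasted: string): string | null {
  const key = normalise(pasted);
  return key in corpusIndex ? corpusIndex[key] : null;
}
```

With this sketch, a poem pasted with different line breaks or missing punctuation still resolves to the stored analysis, and only unseen text reaches the model.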

Do you train AI on what students paste into the analyser?

No. Text pasted into the Poem Analyzer and the writer tools is processed for that one request. Anthropic's API terms specify that inputs and outputs are not used to train models on the API endpoints Storgy uses, and the same policy applies on the humaniser pass.

What if a fact is wrong?

Most factual errors trace back to a stale Wikidata or Wikipedia entry. The fastest path is an email to hello@storgy.com — the correction is applied to Storgy and, where the upstream source is also wrong, fed back into Wikidata so the wider linked-data web benefits.

Why does the site sometimes show different content in Europe and the US?

Copyright is territorial. In the US, copyright in works published before 1978 lasts at most 95 years from publication, regardless of when the author died, while the EU and UK protect a work until 70 years after the author's death. For a poet who died in 1953, for example, EU and UK copyright ran until the end of 2023, even for poems whose US term had already expired. Rather than withhold the full text everywhere, Storgy uses an edge geo-gate (db-ip.com country lookup) to show the full poem to US visitors and an excerpt elsewhere.
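
The arithmetic behind the gap can be sketched as below. It is a simplification under stated assumptions, not legal logic: the function names are hypothetical, and the US calculation ignores notice and renewal rules and the shorter term that applied to works published before 1923, so it should be read as an upper bound.

```typescript
// First calendar year in which a work is public domain under life + 70:
// protection runs to the end of the 70th year after the author's death.
function firstPdYearLifePlus70(deathYear: number): number {
  return deathYear + 71;
}

// First public-domain year under the US 95-year publication term
// (simplified upper bound; see the assumptions above).
function firstPdYearUs95(publicationYear: number): number {
  return publicationYear + 96;
}

// True when a poem is public domain in the US but not yet under life + 70.
function isUsOnlyGap(
  publicationYear: number,
  deathYear: number,
  currentYear: number,
): boolean {
  return (
    currentYear >= firstPdYearUs95(publicationYear) &&
    currentYear < firstPdYearLifePlus70(deathYear)
  );
}
```

For a poet who died in 1953, `firstPdYearLifePlus70(1953)` gives 2024. A poem of theirs published in 1925 would fall in the gap during 2021, 2022, and 2023.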

Can I cite Storgy in an essay?

Cite the primary source — the Project Gutenberg ebook linked at the bottom of every poem page — for the poem itself. For background facts, the Wikidata QID is the canonical reference. The analysis sections are AI-assisted reference material and should not substitute close reading or a peer-reviewed source in academic work.

How do I report a misattribution or a broken link?

Email hello@storgy.com with the URL and what is wrong. Misattributed poems and broken Gutenberg links are the two most common reports; both are usually fixed within the same day.
