The Glossary

Plain English.

Every term we use, explained without jargon. If something here is still unclear, that’s our fault — tell us and we’ll rewrite it.

AI training license

The buyer can use your archive to train an AI model from scratch or fine-tune an existing one.

Training rights are what most AI labs ask for first. Your articles, posts, podcasts, or videos become part of the data the model learns from. Once training is done, the model retains a learned representation of the content, and the original text is generally not stored verbatim. A training license is typically time-bounded (e.g. 12 months) and grants rights only for that period; perpetual rights can be granted for a higher fee.

Example: Anthropic licenses your 5-year newsletter archive to train a new model. They use the content during training but don't redistribute the raw posts.

RAG / inference license

The buyer can pull from your archive at query time, not just during training.

RAG stands for Retrieval-Augmented Generation. Instead of training a model on your archive, the AI system fetches relevant passages from your archive in real time when a user asks a question, then uses those passages to ground its answer. RAG licenses are often more valuable per article — every query can hit your content, and many products credit the source. Pricing is usually per-query, per-month, or flat for a window.

Example: A vertical AI startup serving lawyers licenses your legal-news archive for RAG. When a lawyer asks the assistant a question, the system retrieves your articles to answer.
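
The retrieve-then-ground flow described above can be sketched in a few lines. This is a toy illustration, not how any particular buyer's system works: it ranks passages by simple word overlap, whereas production RAG systems usually use embedding search. The names `retrieve` and `build_prompt` and the sample archive are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, archive: list[dict], k: int = 2) -> list[dict]:
    """Return up to k passages that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc["text"])), doc) for doc in archive]
    # Drop passages with no overlap, then rank by overlap count (highest first).
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def build_prompt(query: str, passages: list[dict]) -> str:
    """Place the retrieved passages in front of the question, tagged by title."""
    sources = "\n".join(f"[{p['title']}] {p['text']}" for p in passages)
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"


# Hypothetical licensed archive (assumption: each item has a title and text).
archive = [
    {"title": "Privacy ruling", "text": "The appeals court issued a ruling on data privacy claims."},
    {"title": "Merger news", "text": "Two law firms announced a merger this week."},
    {"title": "Bar exam", "text": "The state bar changed its exam format."},
]

question = "What was the court ruling on data privacy?"
prompt = build_prompt(question, retrieve(question, archive))
```

Because the passage titles travel with the text into the prompt, the product can credit the source when it answers.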

Research-only license

Academic or non-commercial use only. No products, no revenue.

A research license restricts use to academic institutions, non-profits, or pre-product research teams. The buyer cannot use the archive to train a model that will be sold, deployed in a commercial product, or used to generate revenue. Useful when you want to support research without subsidizing for-profit AI.

Example: A university lab licenses your archive to publish a paper on media bias. They cannot use it to train a commercial model.

Perpetual license

One-time payment, the buyer keeps the rights forever.

A perpetual license is paid once and never expires. The buyer can use the archive under the agreed terms (training, RAG, or research) without re-licensing. Perpetual licenses carry the highest price because the creator earns no recurring revenue from that buyer. They are often combined with exclusivity for premium archives.

Example: An AI company pays a one-time fee for perpetual training rights on your archive. They never have to come back, but you also don't earn from them again.

Exclusive

Only one buyer can license your archive at a time.

Exclusivity means a single buyer holds the rights for the duration of the license. No other buyer can license the same archive while the exclusive license is active. Exclusivity commands a premium, typically 3–10× a non-exclusive price, but locks you out of additional sales until the license ends.

Non-exclusive

Multiple buyers can license the same archive in parallel.

Non-exclusive licenses are the default. Each buyer pays your set price; the archive can be licensed to OpenAI, Anthropic, and a vertical startup all at the same time. The per-deal price is lower, but total revenue can be far higher if the archive is in demand.
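
The difference between exclusive and non-exclusive can be sketched as an availability check. This is a minimal sketch under stated assumptions, not the platform's actual logic: the `License` fields and the `can_sell` helper are hypothetical, and it assumes a new exclusive license cannot start while any other license on the archive is still active.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class License:
    buyer: str
    exclusive: bool
    start: date
    end: date  # last day the license is active (use date.max for perpetual)


def periods_overlap(a: License, b: License) -> bool:
    """True if the two licenses are active on at least one common day."""
    return a.start <= b.end and b.start <= a.end


def can_sell(existing: list[License], new: License) -> bool:
    """Check whether a new license conflicts with licenses already sold."""
    for lic in existing:
        if not periods_overlap(lic, new):
            continue
        # Overlapping licenses are only allowed when both are non-exclusive.
        if lic.exclusive or new.exclusive:
            return False
    return True


exclusive_2025 = License("Buyer A", True, date(2025, 1, 1), date(2025, 12, 31))
mid_year = License("Buyer B", False, date(2025, 6, 1), date(2026, 5, 31))
next_year = License("Buyer B", False, date(2026, 1, 1), date(2026, 12, 31))

blocked = can_sell([exclusive_2025], mid_year)    # False: exclusive still active
allowed = can_sell([exclusive_2025], next_year)   # True: exclusive has ended
```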

Manifest hash

A cryptographic fingerprint of every piece of content in your archive.

When you connect an archive, we list every published item (title, link, publish date, word count, content fingerprint) and produce a single SHA-256 hash from that list. The list is the manifest; the hash is its fingerprint. If anything changes (an article is added, removed, or edited), the hash changes. Buyers and creators can both prove what was licensed by referencing the same hash, and the hash is committed to the Solana receipt at purchase.
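
For readers who want to see the idea in code, here is a minimal sketch of hashing an item list with SHA-256. The exact serialization the platform uses is not described here, so the canonical JSON encoding, the sort-by-link ordering, and the field names below are assumptions made for illustration.

```python
import hashlib
import json


def content_fingerprint(body: str) -> str:
    """SHA-256 of one item's text, as a hex string."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def make_item(title: str, link: str, published: str, body: str) -> dict:
    """Build one manifest entry (field names are hypothetical)."""
    return {
        "title": title,
        "link": link,
        "published": published,
        "word_count": len(body.split()),
        "fingerprint": content_fingerprint(body),
    }


def manifest_hash(items: list[dict]) -> str:
    """Hash the whole item list into one 64-character hex digest."""
    # Sort by link and serialize with fixed key order and separators so the
    # same archive always produces the same bytes, and therefore the same hash.
    ordered = sorted(items, key=lambda item: item["link"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


post_a = make_item("Launch notes", "https://example.com/a", "2024-01-01", "hello world")
post_b = make_item("Follow-up", "https://example.com/b", "2024-02-01", "another post here")

original = manifest_hash([post_a, post_b])
edited_a = make_item("Launch notes", "https://example.com/a", "2024-01-01", "hello world!")
after_edit = manifest_hash([edited_a, post_b])  # differs from `original`
```

Changing a single character in one article changes that item's fingerprint, which in turn changes the hash of the whole list.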

Ownership proof

How we confirm the archive really belongs to you and isn't just someone else's URL.

Each platform has a different proof method. Beehiiv and Ghost: you give us a personal API key, we verify it's tied to your publication. Substack: we generate a code, you publish a post containing it, we re-fetch your RSS to confirm. YouTube: Google OAuth proves channel ownership. Custom domains: a DNS record or a small file on your site. We never accept just a URL.
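
The publish-a-code check can be sketched as follows. This is an illustration under assumptions, not the platform's implementation: the code format, the helper names, and the sample feed are hypothetical, and the step that downloads the RSS feed is left out so the example stays self-contained.

```python
import secrets
import xml.etree.ElementTree as ET


def new_verification_code() -> str:
    """Generate a random, hard-to-guess code for the creator to publish."""
    return "verify-" + secrets.token_hex(8)


def feed_contains_code(rss_xml: str, code: str) -> bool:
    """Return True if any <item> in the RSS feed contains the code."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        # Join the text of every child element (title, description, body, ...).
        item_text = "".join(item.itertext())
        if code in item_text:
            return True
    return False


# Hypothetical feed content, as it might look after the creator publishes the code.
sample_feed = """<rss version="2.0"><channel>
  <title>My Publication</title>
  <item><title>Ownership check</title><description>Code: verify-abc123</description></item>
</channel></rss>"""

verified = feed_contains_code(sample_feed, "verify-abc123")
```

Only someone who can publish to the feed can make the code appear there, which is why a bare URL is not enough.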

Editorial firewall

Sources never pay journalists through us, and we never pay journalists ourselves.

The platform exists for one transaction: AI companies licensing content from creators who own it. Sources cannot pay journalists for coverage through us. The platform itself does not pay journalists: no onboarding bonuses, no referral fees, no editorial subsidies. This is what keeps the platform credible to readers and regulators, and consistent with ethical journalism standards.

Platform fee

1% of every license sale. The lowest in the industry.

ArchiveBay takes 1% of each sale; the creator keeps 99%. The split is enforced both in the database and on-chain: the Solana smart contract cannot send more than 1% to the platform treasury.
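
The arithmetic behind the split looks like this. It is a sketch of the calculation only, not the on-chain contract: the function names are hypothetical, and rounding the fee down (so the platform share can never exceed 1%) is an assumption. Amounts are handled as whole base units, since USDC on Solana uses 6 decimal places.

```python
from decimal import Decimal

PLATFORM_FEE_BPS = 100   # 100 basis points = 1%
USDC_DECIMALS = 6        # 1 USDC = 1_000_000 base units


def usdc_to_base_units(amount: str) -> int:
    """Convert a decimal USDC amount such as "500" or "12.34" to base units."""
    return int(Decimal(amount) * 10**USDC_DECIMALS)


def split_sale(amount_base_units: int) -> tuple[int, int]:
    """Return (creator_share, platform_share) in base units."""
    if amount_base_units < 0:
        raise ValueError("sale amount cannot be negative")
    # Integer division rounds the fee down, so it is at most 1% of the sale.
    platform_share = amount_base_units * PLATFORM_FEE_BPS // 10_000
    creator_share = amount_base_units - platform_share
    return creator_share, platform_share


creator, platform = split_sale(usdc_to_base_units("500"))
# creator == 495_000_000 (495 USDC), platform == 5_000_000 (5 USDC)
```

Using integers rather than floating-point numbers keeps the two shares adding up exactly to the sale amount.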

USDC

A stable digital dollar pegged 1:1 to the US dollar. You can withdraw to a bank.

USDC is a stablecoin — a token on Solana whose value tracks the US dollar. When a buyer licenses your archive, you receive USDC in your wallet. You can hold it, swap it, or convert it to USD via Coinbase, Kraken, or any major exchange and withdraw to your bank.

Wallet

Where your USDC arrives after a sale. A free Solana wallet takes 2 minutes to set up.

A wallet is your address on the Solana blockchain. We recommend Phantom (phantom.app) for first-time users — it's free, takes a couple of minutes to set up, and your private key never leaves your device. License payments arrive instantly.

Journalist account

For independent journalists licensing work they personally own.

Use this if you write a Substack, run a podcast, post videos, or otherwise produce journalism you own outright. NOT for bylines on outlets like CNBC, Bloomberg, or WSJ — that work belongs to the outlet, and only the outlet can license it.

Creator account

For independent newsletter operators, video creators, niche researchers.

Same eligibility model as a journalist account, but the framing is for non-news creators: business writers, technical bloggers, niche researchers, podcasters in non-journalism domains.

Publisher account

For outlets, magazines, and media companies licensing their archives.

Publishers represent the entity that owns the work, not the individual journalist. If a magazine wants to license its 20-year archive, it registers as a publisher. Publishers must provide a company name; jurisdiction and registration number are recommended. The standard ownership proof is verification of the publishing domain.