⚖️ The AI Document Capacity Guide: Matching Tools to Your Page Count

Matching AI tools to document size: your practical guide to better results

Adam Hobson
01 May

A short and sharp one this week, adding to your ever-expanding AI knowledge.

We've all been there. You feed a 300-page report into ChatGPT and get responses that trail off or miss key information. The AI suddenly can't recall what was on page 17 or summarizes only the first quarter of your document.

These AI tools have specific capacity limits. We’ve spoken before about how maxing out a context window causes a slow descent into the infuriating world of hallucination.

But now that many of these tools have larger context windows and can handle MUCH larger documents, where do we stand in terms of a simple page count?

Here's a straightforward guide to match your documents with the right AI tool.

The AI Document Capacity Cheat Sheet

AI Tool	Context Window	In A4 Pages
GPT-4 (original)	8,192 tokens	16 pages
GPT-4o/GPT-4.5/GPT-4-turbo	128,000 tokens	About 250 pages
Claude 3.7/3.5 Sonnet	200,000 tokens	Around 400 pages
Gemini 2.0 Flash	1,000,000 tokens	A hefty 2,000 pages
Gemini 1.5 Pro	2,097,152 tokens	Roughly 4,000 pages
Magic.dev LTM-2-Mini	100,000,000 tokens	A whopping 200,000 pages
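
If you want to see where the page column comes from, here is a minimal sketch. It assumes roughly 500 tokens per A4 page, which is not an official figure, just the ratio the table implies (for example 200,000 tokens / 400 pages). The function name `tokens_to_pages` is made up for illustration.

```python
# Assumption: ~500 tokens per A4 page, inferred from the cheat sheet above.
# Real documents vary with font size, layout, tables, and language.
TOKENS_PER_PAGE = 500


def tokens_to_pages(context_window_tokens: int,
                    tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Rough count of A4 pages that fit in a context window."""
    return context_window_tokens // tokens_per_page


# Context window sizes as listed in the cheat sheet.
CONTEXT_WINDOWS = {
    "GPT-4 (original)": 8_192,
    "GPT-4o / GPT-4.5 / GPT-4-turbo": 128_000,
    "Claude 3.7 / 3.5 Sonnet": 200_000,
    "Gemini 2.0 Flash": 1_000_000,
    "Gemini 1.5 Pro": 2_097_152,
    "Magic.dev LTM-2-Mini": 100_000_000,
}

if __name__ == "__main__":
    for name, window in CONTEXT_WINDOWS.items():
        print(f"{name}: about {tokens_to_pages(window):,} pages")
```

The results land close to the rounded figures in the table. Treat them as ballpark numbers, not hard limits.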

But what is a "Token"?

Think of tokens as AI's attention currency. Each model has a fixed "budget" (its context window):

1 token ≈ 4 characters or ¾ of a word
EVERYTHING counts: your uploaded documents PLUS the entire conversation
Each question you ask and every response consumes more tokens
Once the budget is spent, earlier information gets pushed out
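
The rule of thumb above can be turned into a quick back-of-the-envelope estimator. This is only a sketch: real tokenizers differ between models, so the result is an approximation, and the function name `estimate_tokens` is hypothetical. It applies both rules (4 characters per token, ¾ of a word per token) and keeps the larger figure to stay on the cautious side.

```python
def estimate_tokens(text: str) -> int:
    """Approximate token count using the rule of thumb:
    1 token is about 4 characters, or about 3/4 of a word.
    """
    by_characters = len(text) / 4
    by_words = len(text.split()) / 0.75
    # Take the larger of the two estimates as a conservative guess.
    return round(max(by_characters, by_words))


if __name__ == "__main__":
    sample = "Tokens are the attention currency of an AI model."
    print(estimate_tokens(sample))
```

For an exact count you would need the tokenizer that belongs to the specific model you are using.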

That 400-page limit for Claude? It's actually 400 pages MINUS whatever you've already "spent" on conversation. This is why complex back-and-forth about a large document can lead to the AI suddenly "forgetting" earlier details.
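
To make that "minus" concrete, here is a small sketch of the arithmetic. It assumes about 500 tokens per A4 page (the ratio implied by the cheat sheet, not an official number), and `remaining_pages` is an illustrative name, not part of any tool.

```python
# Assumption: ~500 tokens per A4 page, as implied by the cheat sheet.
TOKENS_PER_PAGE = 500


def remaining_pages(context_window_tokens: int,
                    conversation_tokens: int,
                    tokens_per_page: int = TOKENS_PER_PAGE) -> int:
    """Pages of document that still fit after the conversation so far."""
    remaining_tokens = max(context_window_tokens - conversation_tokens, 0)
    return remaining_tokens // tokens_per_page


if __name__ == "__main__":
    # A 200,000-token window with 30,000 tokens already spent on chat.
    print(remaining_pages(200_000, 30_000))
```

In that example, 30,000 tokens of back-and-forth leaves room for roughly 340 pages rather than 400.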

As we’ve discussed before, if this happens, it’s time to start a fresh chat.

What this means for your work

For the light stuff (under 16 pages): Basic GPT-4 or any other tool will do just fine. Short annual reports, academic papers, and contracts are well within range. At this size you’re staying within the free tiers from the major players, so there is no excuse not to use them.

For decent-sized documents (up to 250 pages): Consider GPT-4o or similar models. These handle most books, comprehensive reports, and legal documents.

For serious research (up to 400 pages): Claude 3.7 is a good choice. This covers full manuscripts, detailed technical manuals, or multiple research papers at once.

For the heavy hitters (thousands of pages): Gemini models become necessary. Multiple books, entire company archives, years of financial statements—they have the capacity.

For massive document collections (corporate archives): Magic.dev, with its 200,000-page capacity, stands alone. This is enterprise-level document processing most of us will likely never touch, but it’s good to know it exists.
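
The tiers above boil down to "pick the smallest tool that comfortably fits". A minimal sketch of that lookup follows. The page capacities are the approximate figures from the cheat sheet, while the 20% headroom reserved for the conversation itself is my own assumption, as is the function name `smallest_tool_that_fits`.

```python
from typing import Optional

# Approximate page capacities from the cheat sheet, smallest first.
TOOLS = [
    ("GPT-4 (original)", 16),
    ("GPT-4o / GPT-4.5 / GPT-4-turbo", 250),
    ("Claude 3.7 / 3.5 Sonnet", 400),
    ("Gemini 2.0 Flash", 2_000),
    ("Gemini 1.5 Pro", 4_000),
    ("Magic.dev LTM-2-Mini", 200_000),
]


def smallest_tool_that_fits(document_pages: int,
                            headroom: float = 0.8) -> Optional[str]:
    """Return the first tool whose capacity covers the document.

    headroom is an assumed fraction of the window to use for the
    document, leaving the rest for questions and answers.
    """
    for name, capacity_pages in TOOLS:
        if document_pages <= capacity_pages * headroom:
            return name
    return None  # Nothing in the list is large enough.


if __name__ == "__main__":
    for pages in (12, 300, 1_000):
        print(pages, "pages ->", smallest_tool_that_fits(pages))
```

Set `headroom=1.0` if you only plan to ask one short question. Lower it if you expect a long back-and-forth.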

The Bottom Line

If you notice your AI giving incomplete answers or missing key information, check your document size first. Even the most advanced, media-hyped AI can't analyze content beyond its context window.

Using the right AI for your document size is often the simplest way to improve performance.

IN PARTNERSHIP WITH THE RUNDOWN AI

Stay up-to-date with AI

The Rundown is the most trusted AI newsletter in the world, with 1,000,000+ readers and exclusive interviews with AI leaders like Mark Zuckerberg, Demis Hassabis, Mustafa Suleyman, and more.

Their expert research team spends all day learning what’s new in AI and talking with industry experts, then distills the most important developments into one free email every morning.

Plus, complete the quiz after signing up and they’ll recommend the best AI tools, guides, and courses – tailored to your needs.

From around the AI traps

The BBC released a course on writing by none other than Agatha Christie. In collaboration with the Agatha Christie Estate and her grandson, the BBC used restored audio recordings, licensed images, interviews, her own writings, and of course AI to make this all happen.
The Atlantic published an article citing artificial intelligence as one of the main contributors to a deteriorating job market and unusually high unemployment rates.
Duolingo launched 148 new language courses with the help of AI. It also announced that it will be “AI-first” and will gradually stop using contractors for work that AI can handle.

Next week: My work superpower, NotebookLM, and its new features.

See you then,

Adam

Before you go!

I'd love to know what you thought of this week's email. I'm always trying to improve to bring you the best AI newsletter possible.