Source Code
Jina Reader
Extract clean web content via Jina AI โ without exposing your server IP.
Read a URL
{baseDir}/scripts/reader.sh "https://example.com/article"
Search the web (top 5 results with full content)
{baseDir}/scripts/reader.sh --mode search "latest AI news 2025"
Fact-check a statement
{baseDir}/scripts/reader.sh --mode ground "OpenAI was founded in 2015"
Options
| Flag | Description | Default |
|---|---|---|
--mode |
read, search, ground |
read |
--selector |
CSS selector to extract specific region | โ |
--wait |
CSS selector to wait for before extraction | โ |
--remove |
CSS selectors to remove (comma-separated) | โ |
--proxy |
Country code for geo-proxy (br, us, etc.) |
โ |
--nocache |
Force fresh content (skip cache) | off |
--format |
markdown, html, text, screenshot |
markdown |
--json |
Raw JSON output | off |
Examples
# Extract article content
{baseDir}/scripts/reader.sh "https://blog.example.com/post"
# Extract specific section via CSS selector
{baseDir}/scripts/reader.sh --selector "article.main" "https://example.com"
# Remove nav and ads before extraction
{baseDir}/scripts/reader.sh --remove "nav,footer,.ads" "https://example.com"
# Search with JSON output
{baseDir}/scripts/reader.sh --mode search --json "AI enterprise trends"
# Read via Brazil proxy
{baseDir}/scripts/reader.sh --proxy br "https://example.com.br"
# Fact-check a claim
{baseDir}/scripts/reader.sh --mode ground "Tesla is the most valuable car company"
API Key
export JINA_API_KEY="jina_..."
Free tier: 10M tokens (no signup needed). Get key at https://jina.ai/reader/
Pricing
- Read: ~$0.005/page (standard) | 3x for ReaderLM-v2
- Search: 10K tokens fixed + variable per result
- Ground:
300K tokens/request (30s latency)
Why Jina Reader?
- IP protection โ requests route through Jina's infra, not your server
- Clean markdown โ readability extraction + optional ReaderLM-v2
- Dynamic content โ headless Chrome renders JavaScript
- Structured extraction โ JSON schema support for data extraction