Document Extractor
Extract text and tables from native PDF and DOCX. No AI, no hallucinations, 100% deterministic precision. 1 credit/page.
Upload your PDFs and Word files. Get an Excel file with exactly the fields you need. Processed on our server, never on a third-party cloud.
Copy-pasting fields from PDF to Excel. One by one. With error risk on every row.
Each module solves a specific data extraction problem.
Extract text and tables from native PDF and DOCX. No AI, no hallucinations, 100% deterministic precision. 1 credit/page.
Upload 3+ identical documents and automatically extract dates, amounts, percentages and IDs into columns. 1 credit/page.
Process scanned documents, photos and images with Gemini. Text + structured data in a single call. 3 credits/page.
Extract specific fields from invoices, payslips, contracts and bank statements with AI. Structured JSON output. 3 credits/page.
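To illustrate the idea behind extracting variable fields from identical documents, here is a minimal sketch. It is not idpura's actual implementation: it assumes every document has the same layout and splits into the same number of whitespace-separated tokens, and the function names `variable_positions` and `section` are hypothetical. Tokens that are the same in every document are treated as fixed template text, and tokens that differ become columns.

```python
def variable_positions(docs: list[str]) -> list[int]:
    """Return token positions whose value differs in at least one document."""
    tokenized = [doc.split() for doc in docs]
    # Assumption of this sketch: identical layout means identical token count.
    if len({len(tokens) for tokens in tokenized}) != 1:
        raise ValueError("this sketch assumes documents with the same token count")
    return [
        i for i, column in enumerate(zip(*tokenized))
        if len(set(column)) > 1
    ]


def section(docs: list[str]) -> list[list[str]]:
    """Build one row per document containing only the varying tokens."""
    positions = variable_positions(docs)
    return [[doc.split()[i] for i in positions] for doc in docs]


# Three documents with the same template and different values.
docs = [
    "Invoice B12345678 dated 15/01/2024 total 1243.88",
    "Invoice B55544433 dated 03/02/2024 total 2601.50",
    "Invoice A12398765 dated 10/02/2024 total 580.80",
]
print(variable_positions(docs))  # [1, 3, 5]
print(section(docs)[0])          # ['B12345678', '15/01/2024', '1243.88']
```

A real pipeline would also need to handle fields that span several tokens and layouts that shift between documents, which this sketch deliberately ignores.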
Drag & drop or full folder selection. PDF, DOCX, in batch. Up to 500 files per job.
Text and table extraction, data sectioning between documents, or AI extraction. The system calculates the cost before processing.
Excel, JSON or CSV ready to use. With source column per row and full traceability.
Simulated extraction flow with sample invoices.
| VAT ID | Date | Supplier | Base | Tax | Total |
|---|---|---|---|---|---|
| B12345678 | 15/01/2024 | Suministros Iberia SL | €1,028.00 | 21% | €1,243.88 |
| A98765432 | 22/01/2024 | Tech Solutions Spain SA | €735.50 | 21% | €889.96 |
| B55544433 | 03/02/2024 | Distribuciones Levante SL | €2,150.00 | 21% | €2,601.50 |
| A12398765 | 10/02/2024 | Servicios Digitales SL | €480.00 | 21% | €580.80 |
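
Once rows like these are exported, a quick consistency check is to confirm that Base plus Tax equals Total. The sketch below assumes the export keeps the column names and the `€1,028.00` / `21%` formatting shown in the sample table; the helper names are hypothetical and not part of idpura.

```python
from decimal import Decimal, ROUND_HALF_UP


def parse_eur(text: str) -> Decimal:
    """Parse an amount formatted like '€1,028.00' into a Decimal."""
    return Decimal(text.replace("€", "").replace(",", "").strip())


def parse_rate(text: str) -> Decimal:
    """Parse a percentage like '21%' into a fraction (0.21)."""
    return Decimal(text.rstrip("%").strip()) / Decimal(100)


def total_matches(base: str, tax: str, total: str) -> bool:
    """Check that base * (1 + tax rate), rounded to cents, equals total."""
    expected = (parse_eur(base) * (1 + parse_rate(tax))).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return expected == parse_eur(total)


# Two rows copied from the sample table above.
rows = [
    {"Base": "€1,028.00", "Tax": "21%", "Total": "€1,243.88"},
    {"Base": "€2,150.00", "Tax": "21%", "Total": "€2,601.50"},
]
for row in rows:
    print(total_matches(row["Base"], row["Tax"], row["Total"]))  # True, True
```

`Decimal` is used instead of `float` so that rounding to cents behaves predictably.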
A 200-page contract doesn't cost the same as a 2-page invoice. Here you pay for what you actually process.
Try idpura. Free forever.
For freelancers and small businesses.
For accounting firms, practices and teams.
For high-volume companies.
For large organizations. SLA and dedicated support.
Credits renew monthly based on your plan
How much does each tool cost?
| Tool | Credits per page |
|---|---|
| Document Extractor (text + tables) | 1 cr / page |
| Data Sectioning (multi-doc variance) | 1 cr / page |
| AI OCR (scanned docs & images) | 3 cr / page |
| AI Extractor (structured fields) | 3 cr / page |
No AWS. No GCP. No Azure. Dedicated server in Germany.
Hetzner Falkenstein, Germany. Your files never pass through third-party cloud services. All processing runs on dedicated hardware under German jurisdiction.
idpura processes your files and immediately deletes them from our servers. We never store your original documents under any circumstances. Extraction results are available for 24 hours for you to download, then automatically deleted. We only keep your usage history (credits used, dates, and tools) so you can review it in your dashboard.
Architecture designed to comply with GDPR. Coming soon: Clerk Organizations for team management with organization-level access control.
Currently: native (digital) PDF and DOCX (Word 2007 onwards). The .doc format (Word 97-2003) is not supported. Coming soon: AI OCR for scanned documents and images, and AI Extractor for structured fields.
A credit equals one unit of processing. Basic tools (Document Extractor, Data Sectioning) consume 1 credit per page. AI tools (OCR, AI Extractor) consume 3 credits per page. The system shows you the exact cost before confirming processing.
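As a rough illustration of that cost calculation, the sketch below multiplies the total page count of a job by the per-page rate of the chosen tool. The rates come from the pricing table; the tool keys and the `estimate_credits` function are hypothetical names used only for this example, not part of idpura.

```python
# Credits per page, taken from the pricing table.
CREDITS_PER_PAGE = {
    "document_extractor": 1,
    "data_sectioning": 1,
    "ai_ocr": 3,
    "ai_extractor": 3,
}


def estimate_credits(tool: str, page_counts: list[int]) -> int:
    """Return the credits for one job: total pages across files times the rate."""
    if tool not in CREDITS_PER_PAGE:
        raise ValueError(f"unknown tool: {tool}")
    if any(pages < 0 for pages in page_counts):
        raise ValueError("page counts must be non-negative")
    return CREDITS_PER_PAGE[tool] * sum(page_counts)


# A 200-page contract and a 2-page invoice in the same job.
print(estimate_credits("document_extractor", [200, 2]))  # 202
print(estimate_credits("ai_extractor", [200, 2]))        # 606
```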
Yes. All processing happens on a dedicated server in Germany (Hetzner). Your files are never sent to third-party cloud services. Original files are deleted immediately after processing, and extraction results are automatically deleted after 24 hours. Communication is always encrypted over HTTPS (TLS).
Currently it is an individual tool. Multi-user team support is on the roadmap for Q4 2026, with roles, shared credits and organization-level access control.
Coming soon. AI OCR (3 credits/page) will process scanned documents, photos and images using Gemini. Currently only native (digital) PDFs and DOCX are supported.
The public REST API is on the roadmap for Q3-Q4 2026, available from the Business plan. It will include API keys, OpenAPI documentation and webhooks. If you have an urgent use case, contact us directly.
What's ready and what's coming.
Text + tables from PDF and DOCX to Excel, JSON and CSV. Up to 500 files per job.
Automatic extraction of variable fields between identical documents.
Full navigation in Spanish and English.
Monthly and annual subscriptions with Stripe. Starter, Pro, Business plans.
Detailed plan comparison, tools and credit table.
Terms of service, privacy policy and GDPR compliance.
Scanned documents and images to structured data with Gemini. 3 cr/page.
Specific fields from invoices, payslips and contracts with AI. 3 cr/page.
REST API with API keys, OpenAPI docs and webhooks. Business plan+.
Admin/member roles, shared credits, access control. Pro plan+.
No minimum subscription. No templates. No setup. Open beta.