Show HN: Gemini Document Processor – Generate Thai Summaries from PDF/ePub with AI

10 points by kidpeterpan 2 days ago

Hello HN! I'd like to share Gemini Document Processor, an open-source tool I've developed.

This tool uses Google's Gemini AI (their latest API) to create high-quality Thai language summaries from PDF and EPUB files. Key features include:

- Support for both PDF and EPUB files
- Intelligent chunking for efficient Gemini API processing
- Automatic image extraction from documents
- Direct integration with Obsidian (export directly to vault)
- Smart retry system when errors occur (switches models/increases timeouts)
- Real-time progress tracking via web interface
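
To give a rough idea of what the chunking step involves, here is a minimal sketch. It is not the tool's actual code: the `chunk_text` name, the character-based limit, and the paragraph-boundary strategy are all assumptions made for illustration.

```python
def chunk_text(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring paragraph breaks.

    Hypothetical helper: the real tool may chunk by pages, tokens, or sections.
    """
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # A single paragraph longer than the limit is hard-split.
        while len(para) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        # Otherwise, pack paragraphs together until the limit is reached.
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to the API as a separate request, which keeps every request under a size limit of your choosing.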

I built this tool because I needed to read many English documents and wanted detailed summaries in Thai.

If you frequently read long documents or want to build a knowledge base from multiple sources, this tool could save you significant time.

The output is a well-formatted Markdown file with images and metadata, ideal for storing in Obsidian, Notion, or other PKM systems.

Try it by cloning the repo and running it with Python (requires a Google Gemini API key).

Feedback, suggestions, and contributions are very welcome!

badmonster a day ago

How does the Gemini Document Processor handle error recovery if a chunk fails during processing? Does it automatically retry, or does the user need to intervene manually?

kidpeterpan 5 hours ago

@badmonster If a chunk fails, it is retried immediately, and there is another retry after all chunks have been processed.
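
As a rough sketch of that two-stage retry (illustrative only, not the repo's code: `process_all`, its parameters, and the single immediate retry are assumptions, and the model-switching and timeout changes are left out):

```python
import time
from typing import Callable


def process_all(
    chunks: list[str],
    process: Callable[[str], str],
    retry_delay: float = 0.0,
) -> tuple[dict[int, str], list[int]]:
    """Process chunks with one immediate retry, then a final retry pass.

    Returns (results keyed by chunk index, indices that still failed).
    """
    results: dict[int, str] = {}
    failed: list[int] = []

    for i, chunk in enumerate(chunks):
        for attempt in range(2):  # first try plus one immediate retry
            try:
                results[i] = process(chunk)
                break
            except Exception:
                if attempt == 1:
                    failed.append(i)  # defer to the final pass
                else:
                    time.sleep(retry_delay)

    # Final pass: retry deferred chunks once all others are done.
    still_failed: list[int] = []
    for i in failed:
        try:
            results[i] = process(chunks[i])
        except Exception:
            still_failed.append(i)

    return results, still_failed
```

Anything left in `still_failed` after the final pass would need manual attention.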