Every month I download dozens of PDFs — invoices, contracts, pay slips, bank statements. They all land in Downloads with names like scan_001.pdf, document(3).pdf, or my personal favorite: Unbenannt.pdf.
I kept telling myself I'd organize them "later." Three years of "later" gave me 2,000+ files in one folder.
Sound familiar?
Why existing tools didn't work for me
I live in Germany, and privacy matters here — a lot. Sending my tax documents, rental contracts, and bank statements to some cloud AI service was not an option. Here's what I tried:
Claude Cowork — impressive, but requires a $20+/month subscription, sends data to the cloud, and you have to manually ask it each time. Not "set it and forget it."
macOS Smart Folders / Hazel — rule-based, but doesn't understand document content. Sorting by file extension puts all PDFs in one pile. That's not organizing, that's moving the mess.
AI File Sorter, Sorted, Sparkle — either cloud-dependent, subscription-based, macOS-only, or all three.
What I wanted was simple:
Watches my folders automatically in the background
Reads document content and classifies it (invoice, contract, pay slip...)
Suggests a rename and target folder
Runs 100% offline: nothing leaves my machine
Works on Windows, Mac, and Linux
Free
So I built it.
Meet Ablage
"Ablage" is German for "filing" — the tray on your desk where documents go to be sorted. That's exactly what this app does, but digitally.
Ablage sits in your system tray and watches folders you choose. When a new file appears, it:
Extracts text from PDFs and DOCX files
Classifies the document type using configurable rules (keywords or regex)
Suggests a new name like Rechnung_Telekom_2026-03-15.pdf
Suggests a target folder like Finanzen/Rechnungen/2026/
Shows a notification: you apply, customize, or skip
No ML models. No neural networks. No 500MB downloads. Just smart pattern matching that understands German documents surprisingly well.
The "no ML" bet that actually worked
Here's the thing nobody talks about: for domain-specific document classification, by my rough estimate regex beats ML in about 80% of cases if you know your domain.
German financial documents are incredibly predictable. An invoice always contains words like "Rechnungsnummer", "MwSt", "Gesamtbetrag". A rental contract always has "Kündigungsfrist" and "Mietvertrag". A pay slip always mentions "Bruttolohn" and "Steuerklasse".
Ablage supports two rule types:
Keyword rules — comma-separated terms matched against document text, with a configurable minimum match threshold. For example, a rule with keywords invoice, total, payment, due date and min 2 matches will fire when at least two of those terms appear.
Regex rules — full regular expression patterns for precise matching. Something like rechnung.*nr.*\d+ catches invoice numbers regardless of formatting.
Each rule maps to a document type, a target folder (with {YYYY} placeholder for year-based organization), and a filename template using {Sender}, {Date}, and {ext} placeholders.
Seven default rules ship out of the box covering the most common German document types. But the real power is in the rule editor — you can create your own rules through a step-by-step wizard or edit them inline.
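To make the rule structure concrete, here is a minimal sketch of what two rules might look like as data. The `Rule` interface and its field names are assumptions modeled on the simplified classifier in this post, not necessarily Ablage's actual schema. The keywords, folder paths, and the `countKeywordMatches` helper are illustrative.

```typescript
// Hypothetical rule shape; field names mirror the simplified classifier in this post.
interface Rule {
  type: 'keywords' | 'regex';
  pattern: string;       // comma-separated keywords, or a regex source string
  minMatches: number;    // only meaningful for keyword rules
  documentType: string;
  folder: string;        // may contain {YYYY}
  nameTemplate: string;  // may contain {Sender}, {Date}, {ext}
}

// Keyword rule: fires when at least two of the three terms appear.
const keywordRule: Rule = {
  type: 'keywords',
  pattern: 'Rechnungsnummer, MwSt, Gesamtbetrag',
  minMatches: 2,
  documentType: 'invoice',
  folder: 'Finanzen/Rechnungen/{YYYY}',
  nameTemplate: 'Rechnung_{Sender}_{Date}.{ext}',
};

// Regex rule: the pattern from the text, escaped for a string literal.
const regexRule: Rule = {
  type: 'regex',
  pattern: 'rechnung.*nr.*\\d+',
  minMatches: 0,
  documentType: 'invoice',
  folder: 'Finanzen/Rechnungen/{YYYY}',
  nameTemplate: 'Rechnung_{Sender}_{Date}.{ext}',
};

// Count how many keywords of a keyword rule occur in the text (case-insensitive).
function countKeywordMatches(rule: Rule, text: string): number {
  const haystack = text.toLowerCase();
  return rule.pattern
    .split(',')
    .map(k => k.trim().toLowerCase())
    .filter(k => k.length > 0 && haystack.includes(k)).length;
}
```

One detail worth knowing for regex rules: without the `s` flag, `.` does not match line breaks, so a pattern like the one above only matches when all parts sit on the same line of extracted text.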
Here's a simplified version of how the classifier works:
function classify(text: string, rules: Rule[]): ClassificationResult {
  let bestMatch: ClassificationResult | null = null;
  const lowerText = text.toLowerCase();
  // Extract fields once; both templates below need them.
  const fields = extractFields(text);
  for (const rule of rules) {
    let score = 0;
    if (rule.type === 'keywords') {
      const keywords = rule.pattern.split(',').map(k => k.trim().toLowerCase());
      const matched = keywords.filter(kw => lowerText.includes(kw));
      // Score is the fraction of keywords found, once the minimum is reached.
      score = matched.length >= rule.minMatches
        ? matched.length / keywords.length
        : 0;
    } else {
      // Regex rule: any match gets a fixed confidence of 0.9.
      score = new RegExp(rule.pattern, 'i').test(text) ? 0.9 : 0;
    }
    // Keep the highest-scoring rule seen so far.
    if (score > 0 && (!bestMatch || score > bestMatch.confidence)) {
      bestMatch = {
        type: rule.documentType,
        confidence: score,
        fields,
        suggestedName: applyTemplate(rule.nameTemplate, fields),
        suggestedFolder: applyTemplate(rule.folder, fields),
      };
    }
  }
  return bestMatch ?? { type: 'unknown', confidence: 0 };
}
No training data, no model files, no GPU. And it correctly classifies the vast majority of typical German household documents.
Architecture: keeping it simple
src/
  main/                   # Electron main process
    watcher.ts            # File system monitoring (chokidar)
    extractor.ts          # Text extraction (PDF, DOCX)
    classifier.ts         # Rule-based document classification
    pipeline.ts           # Orchestration: watch → extract → classify → notify
    mover.ts              # File rename + move with undo
    database.ts           # SQLite: rules, history, settings
    tray.ts               # System tray menu
    notifications.ts      # Native + in-app notifications
    ipc-handlers.ts       # IPC bridge to renderer
  renderer/               # React UI
    components/
      Settings.tsx        # Watch folders + language
      RuleEditor.tsx      # Rule list with inline editing
      RuleWizard.tsx      # Step-by-step rule creation
      History.tsx         # Operation log with undo
      Notification.tsx    # In-app suggestion cards
      Onboarding.tsx      # First-launch guide
  shared/
    types.ts              # Shared TypeScript interfaces
    i18n/                 # Locale files (en, de, uk, fr)
The stack:
Electron 41 + React 19 + TypeScript 5 — cross-platform desktop with system tray
Vite + vite-plugin-electron — fast dev experience with HMR
chokidar — native file system watcher
pdf-parse — PDF text extraction
mammoth — DOCX text extraction
better-sqlite3 — local database for rules, history, and settings
Everything runs in Electron's main process — no external services, no API calls, no internet required.
The file watcher: harder than it sounds
One thing I underestimated: file watching is full of edge cases.
The awaitWriteFinish option in chokidar is critical — without it, you'll try to read a PDF that Chrome is still downloading and get corrupted data. A 2-second stability threshold catches most cases.
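A minimal sketch of watcher options along these lines. `awaitWriteFinish` (with `stabilityThreshold` and `pollInterval`), `ignoreInitial`, and `ignored` are real chokidar options; the 100 ms poll interval, the helper name `isTempDownload`, and the names `downloadsDir` and `handleNewFile` in the usage comment are assumptions for illustration. The options are kept as plain data so the snippet runs without chokidar installed.

```typescript
// Extensions browsers use for in-progress downloads (Chrome, Firefox).
const TEMP_EXTENSIONS = ['.crdownload', '.part'];

// True if the path looks like a partial download that should be skipped.
function isTempDownload(filePath: string): boolean {
  const lower = filePath.toLowerCase();
  return TEMP_EXTENSIONS.some(ext => lower.endsWith(ext));
}

// Options in the shape chokidar.watch() accepts as its second argument.
const watchOptions = {
  ignoreInitial: true,          // do not fire 'add' for files already in the folder
  ignored: isTempDownload,      // skip partial downloads entirely
  awaitWriteFinish: {
    stabilityThreshold: 2000,   // file size must stay unchanged for 2 s before 'add' fires
    pollInterval: 100,          // how often the size is re-checked (assumed value)
  },
};

// Usage in the main process (requires the chokidar package; names are hypothetical):
//   import chokidar from 'chokidar';
//   chokidar.watch(downloadsDir, watchOptions).on('add', path => handleNewFile(path));
```

With the temp file ignored, the browser's final rename shows up as an ordinary `add` event for the finished file.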
Other gotchas I ran into:
Browser downloads create a .crdownload (Chrome) or .part (Firefox) file first, then rename. You need to ignore the temp file and catch the rename event.
File moves within watched folders trigger both unlink and add events. Without deduplication you'll process the same file twice.
Permission errors on Windows when antivirus locks a newly created file. Retry with backoff solves this.
Large files — a 50MB PDF takes time to extract. The pipeline runs async so the UI never blocks.
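Two of these gotchas lend themselves to small helpers. The sketch below shows one possible way to drop repeated events for the same path within a time window, and to retry a locked file with exponential backoff. All names, the window length, and the delay values are assumptions rather than Ablage's actual code.

```typescript
// Returns a function that reports whether a path was already seen within windowMs.
// `now` is injectable so the behavior can be checked without waiting.
function createDeduper(windowMs: number, now: () => number = Date.now) {
  const lastSeen = new Map<string, number>();
  return (filePath: string): boolean => {
    const current = now();
    const previous = lastSeen.get(filePath);
    lastSeen.set(filePath, current);
    return previous !== undefined && current - previous < windowMs;
  };
}

// Pause before retry number `attempt` (0-based): base, 2x base, 4x base, ...
function backoffDelay(attempt: number, baseMs: number): number {
  return baseMs * Math.pow(2, attempt);
}

// Runs `task`, retrying on failure with exponentially growing pauses.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  baseMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // for example a permission error while antivirus holds the file
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelay(attempt, baseMs);
        await new Promise<void>(resolve => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

Deduplicating by path covers repeated events for one file. For moves the app performs itself, registering the destination path with the deduper before moving would keep the resulting `add` event from being processed again.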
Smart renaming: the devil is in the date formats
Germans love dates. But they can't agree on a format:
15.03.2026 (DD.MM.YYYY — most common)
2026-03-15 (ISO — in modern documents)
15/03/2026 (DD/MM/YYYY — occasionally)
März 2026 or March 2026 (month name)
15. März 2026 (full German date)
My date extractor handles all of these and normalizes to YYYY-MM-DD for filenames. The priority order: date found in document content first, then date from the filename, and finally the file modification timestamp as a last resort.
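As a rough illustration of that normalization, here is a minimal sketch. The function names, the month table, and the choice to map a month-only date such as "März 2026" to the first of the month are assumptions. Range validation (for example rejecting month 13) is left out to keep it short.

```typescript
// Month names as they might appear in German or English documents (lowercase).
const MONTHS: Record<string, number> = {
  januar: 1, january: 1, februar: 2, february: 2, 'märz': 3, march: 3,
  april: 4, mai: 5, may: 5, juni: 6, june: 6, juli: 7, july: 7,
  august: 8, september: 9, oktober: 10, october: 10,
  november: 11, dezember: 12, december: 12,
};
const MONTH_PATTERN = Object.keys(MONTHS).join('|');

function toIso(year: number, month: number, day: number): string {
  const pad = (n: number): string => (n < 10 ? '0' : '') + n;
  return `${year}-${pad(month)}-${pad(day)}`;
}

// Tries the formats in a fixed order (ISO, numeric, month name) and
// returns YYYY-MM-DD for the first format that matches, or null.
function extractDate(text: string): string | null {
  // ISO: 2026-03-15
  const isoMatch = text.match(/(\d{4})-(\d{2})-(\d{2})/);
  if (isoMatch) return toIso(Number(isoMatch[1]), Number(isoMatch[2]), Number(isoMatch[3]));

  // DD.MM.YYYY or DD/MM/YYYY
  const numeric = text.match(/(\d{1,2})[./](\d{1,2})[./](\d{4})/);
  if (numeric) return toIso(Number(numeric[3]), Number(numeric[2]), Number(numeric[1]));

  // "15. März 2026"; the day is optional, "März 2026" falls back to day 1 (assumption)
  const named = text.match(
    new RegExp(`(?:(\\d{1,2})\\.?\\s+)?(${MONTH_PATTERN})\\s+(\\d{4})`, 'i'),
  );
  if (named) {
    const month = MONTHS[named[2].toLowerCase()];
    return toIso(Number(named[3]), month, named[1] ? Number(named[1]) : 1);
  }
  return null;
}

// Priority: document content, then filename, then modification time (as a UTC date).
function resolveDate(content: string, fileName: string, modified: Date): string {
  return extractDate(content) ?? extractDate(fileName) ?? modified.toISOString().slice(0, 10);
}
```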
Template placeholders make naming flexible:
| Placeholder | Description |
| --- | --- |
| {Sender} | Detected company or sender name |
| {Date} | Document date as YYYY-MM-DD |
| {YYYY} | Year (for folder paths) |
| {ext} | Original file extension |
So a rule with template Rechnung_{Sender}_{Date} turns scan_001.pdf into Rechnung_Telekom_2026-03-15.pdf.
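Placeholder substitution of this kind can be sketched in a few lines. The code below is an assumption about how it could work, not Ablage's implementation: `applyTemplate` shares its name with the helper used in the classifier, but the body, the `TemplateFields` type, the `sanitize` step, and the rule that a template without {ext} keeps the original extension are all illustrative.

```typescript
// Values extracted from a document; property names follow the placeholders above.
interface TemplateFields {
  Sender: string;
  Date: string;  // YYYY-MM-DD
  ext: string;   // without the leading dot
}

// Drop characters that are not allowed in Windows filenames; turn spaces into underscores.
function sanitize(part: string): string {
  return part.replace(/[<>:"/\\|?*]/g, '').trim().replace(/\s+/g, '_');
}

// Fill {Sender}, {Date}, {YYYY}, and {ext}; unknown placeholders are left untouched.
function applyTemplate(template: string, fields: TemplateFields): string {
  const values = new Map<string, string>([
    ['Sender', sanitize(fields.Sender)],
    ['Date', fields.Date],
    ['YYYY', fields.Date.slice(0, 4)],
    ['ext', fields.ext],
  ]);
  return template.replace(/\{(\w+)\}/g, (whole: string, key: string) => values.get(key) ?? whole);
}

// Assumption: a name template without {ext} keeps the original extension.
function buildFileName(template: string, fields: TemplateFields): string {
  const name = applyTemplate(template, fields);
  return template.includes('{ext}') ? name : `${name}.${fields.ext}`;
}
```

The same `applyTemplate` works for folder paths, where only {YYYY} is typically used.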
The UX principle: never act without asking
This was a hard rule from day one: Ablage suggests, the user decides.
When a new file is detected, an in-app notification card appears with the detected type, the proposed new filename, and the target folder. Three options: Apply, Customize (edit before applying), or Skip.
No automatic moves. No silent renames. The user has to confirm — or the suggestion stays in a pending queue.
Why? Because one wrong auto-move of an important file destroys all trust in the tool. Better to require a click than to lose someone's tax return.
Every operation goes into a history log with full undo support. Moved a file to the wrong place? One click to reverse it.
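The idea behind undo is simply to record each move so it can be replayed in reverse. Here is a minimal in-memory sketch under that assumption. The class and field names are hypothetical, and the real app stores history in SQLite rather than in an array. The actual file operation is injected so the log can be exercised without touching the disk.

```typescript
interface MoveRecord {
  id: number;
  from: string;    // original path
  to: string;      // path after rename/move
  undone: boolean;
}

type MoveFn = (from: string, to: string) => void;

class MoveHistory {
  private records: MoveRecord[] = [];
  private nextId = 1;
  private move: MoveFn;

  // `move` performs the actual file operation, for example a wrapper around fs.renameSync.
  constructor(move: MoveFn) {
    this.move = move;
  }

  // Perform a move and remember how to reverse it.
  apply(from: string, to: string): MoveRecord {
    this.move(from, to);
    const record: MoveRecord = { id: this.nextId++, from, to, undone: false };
    this.records.push(record);
    return record;
  }

  // Reverse a recorded move; returns false if the id is unknown or already undone.
  undo(id: number): boolean {
    const record = this.records.find(r => r.id === id);
    if (!record || record.undone) return false;
    this.move(record.to, record.from);
    record.undone = true;
    return true;
  }
}
```

A production version would also need to handle the case where the original location has been reused by another file in the meantime.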
First-launch onboarding
I added a simple onboarding flow that asks you to:
Set a target folder where organized files will go
Add at least one watched folder (usually Downloads)
Review the default rules
After that, drop any document into the watched folder and see Ablage in action within seconds. The onboarding made a big difference — without it, people had to dig through settings to figure out what to do first.
Localization
Since this is aimed at the German market but open source for everyone, Ablage supports four languages: English, Deutsch, Українська, and Français. The language selector lives in the Settings tab and persists across sessions.
Adding a new language is straightforward — just add a locale file to src/shared/i18n/ following the existing pattern.
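For readers curious what such a pattern might look like, here is a minimal sketch of a key lookup with an English fallback. The table layout, the keys, and the function name `t` are assumptions; the actual locale files in the repository may be structured differently.

```typescript
type Locale = 'en' | 'de' | 'uk' | 'fr';
type Messages = { [key: string]: string | undefined };

// Hypothetical locale tables with a few sample keys; missing entries fall back to English.
const locales: Record<Locale, Messages> = {
  en: { apply: 'Apply', skip: 'Skip', customize: 'Customize' },
  de: { apply: 'Anwenden', skip: 'Überspringen' },
  uk: {},
  fr: { apply: 'Appliquer' },
};

// Look up a key, falling back to English and finally to the key itself.
function t(locale: Locale, key: string): string {
  return locales[locale][key] ?? locales.en[key] ?? key;
}
```

With a fallback like this, a new language can ship partially translated and still render every label.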
What I'd do differently
A few lessons from building this:
Start with fewer rules. I initially had 7 default rules. For the first version, 3-4 would have been enough. Each rule needs testing with real documents, and more rules means more edge cases.
The rule wizard was worth the effort. The inline editor is powerful but intimidating. The step-by-step wizard that asks "What keywords should I look for?" → "Where should these files go?" → "What should they be named?" got much better feedback from testers.
SQLite was the right call. I considered JSON files or electron-store for simplicity, but once you need history with undo and queryable rule sets, SQLite pays for itself. better-sqlite3 is synchronous and fast — no async complexity in the main process.
What's next
The current version handles text-based PDFs and DOCX well. Here's what's on the roadmap:
OCR for scanned documents — right now, scanned PDFs are classified by filename only. Adding OCR would cover a much larger set of documents.
ML classification — for edge cases where keyword matching fails, a lightweight model could push accuracy further.
Privacy Shield — detect and highlight personal data (IBAN, addresses, tax numbers) before sharing documents.
DATEV export — integration with Germany's standard accounting software format.
But only if there's demand. I built the simplest thing that could work, and I'm watching whether people actually use it before investing more.
Try it
Ablage is free and open source under the MIT license.
GitHub: github.com/che1974/ablage
git clone https://github.com/che1974/ablage.git
cd ablage
npm install
npm run rebuild-sqlite
npm run dev
If you find it useful, a ⭐ on GitHub helps others discover it. Feature requests and bug reports are welcome — especially if you have document types or languages that should be supported.