Invoice Data Extraction

Understanding how EZ Cloud extracts invoice data helps you get the best results from the AI processing.


How Supplier Identification Works

When an invoice enters EZ Cloud, the system:

  1. Scans for supplier identifiers - Looks for known supplier names, addresses, phone numbers, and email addresses
  2. Matches against supplier database - Compares extracted identifiers with your supplier records
  3. Applies supplier-specific training - Uses machine learning models trained on previous invoices from that supplier
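The first two steps above can be pictured as a scoring pass over your supplier records. The following is a minimal sketch only, not EZ Cloud's actual implementation: the Supplier record, its field names, and the identify_supplier function are hypothetical, and the matching is reduced to simple substring checks on name, email, and phone number.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Supplier:
    """Hypothetical supplier record; the field names are illustrative only."""
    name: str
    emails: set[str] = field(default_factory=set)
    phones: set[str] = field(default_factory=set)


def normalize_phone(raw: str) -> str:
    """Keep digits only so '(555) 010-2000' and '555.010.2000' compare equal."""
    return re.sub(r"\D", "", raw)


def identify_supplier(invoice_text: str, suppliers: list[Supplier]) -> Optional[Supplier]:
    """Return the supplier with the most identifier hits, or None if nothing matches."""
    text = invoice_text.lower()
    # Simplification: all digits in the invoice are joined into one string,
    # which could produce false phone matches on a real document.
    digits = normalize_phone(invoice_text)

    best, best_score = None, 0
    for supplier in suppliers:
        score = 0
        if supplier.name.lower() in text:
            score += 1
        score += sum(1 for email in supplier.emails if email.lower() in text)
        for phone in supplier.phones:
            wanted = normalize_phone(phone)
            if wanted and wanted in digits:
                score += 1
        if score > best_score:
            best, best_score = supplier, score
    return best
```

Once a supplier is identified this way, step 3 would pick up whatever has been learned from that supplier's earlier invoices. An invoice that matches no record returns None, which corresponds to the new-supplier case where more manual correction is expected.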

Machine Learning Per Supplier

EZ Cloud trains extraction models on a per-supplier basis. This means:

  • First invoices from a new supplier may require more manual correction
  • Subsequent invoices improve as the system learns that supplier's invoice format
  • Consistent formats yield better results - Suppliers using the same invoice template get increasingly accurate extraction

Training the System

Every correction you make teaches EZ Cloud. The more invoices you process from a supplier, the more accurate extraction becomes for that supplier's format.
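To make the idea of per-supplier learning concrete, here is a toy sketch, assuming a simple frequency count stands in for the real extraction models. The SupplierFieldMemory class and its method names are hypothetical and are not part of EZ Cloud. It only shows why each reviewed invoice from a supplier makes the next suggestion for that supplier more reliable.

```python
from collections import Counter, defaultdict


class SupplierFieldMemory:
    """Toy per-supplier memory of which field each invoice label was mapped to."""

    def __init__(self):
        # supplier -> normalized label -> Counter of fields chosen by reviewers
        self._votes = defaultdict(lambda: defaultdict(Counter))

    def record_correction(self, supplier, label, field):
        """Store one reviewer decision (a correction or a confirmation)."""
        self._votes[supplier][label.strip().lower()][field] += 1

    def suggest(self, supplier, label):
        """Return the field most often chosen for this supplier's label, or None."""
        votes = self._votes.get(supplier, {}).get(label.strip().lower())
        if not votes:
            return None
        return votes.most_common(1)[0][0]


memory = SupplierFieldMemory()
memory.record_correction("Acme Supplies", "Price/ea", "Unit Price")
memory.record_correction("Acme Supplies", "Price/ea", "Unit Price")
print(memory.suggest("Acme Supplies", "Price/ea"))
print(memory.suggest("Globex", "Price/ea"))
```

In this sketch, nothing learned from one supplier carries over to another, which mirrors why the first invoices from a new supplier need more manual review.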


Field Mapping Best Practices

When you review extracted data, your corrections train the system. Follow these practices:

Practice                     Why It Matters
Be consistent                Always map the same data to the same field across invoices
Use the most specific field  Choose "Unit Price" over a generic "Amount" when applicable
Correct errors promptly      The sooner you fix extraction errors, the faster the system learns
Review all fields            Even correct extractions benefit from confirmation
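If you keep your own log of review decisions, the "Be consistent" practice can be spot-checked with a few lines of code. This is a sketch under assumptions: the (supplier, label, field) tuple layout and the find_inconsistent_mappings function are hypothetical and do not correspond to an EZ Cloud export format or API.

```python
from collections import defaultdict


def find_inconsistent_mappings(decisions):
    """Return {(supplier, label): sorted fields} for labels mapped to more than one field.

    `decisions` is an iterable of (supplier, label, field) tuples, where `label`
    is the text printed on the invoice and `field` is the field a reviewer chose.
    """
    seen = defaultdict(set)
    for supplier, label, field in decisions:
        # Normalize the label so "Price/ea" and "price/ea " count as the same data.
        seen[(supplier, label.strip().lower())].add(field)
    return {key: sorted(fields) for key, fields in seen.items() if len(fields) > 1}


decisions = [
    ("Acme Supplies", "Price/ea", "Unit Price"),
    ("Acme Supplies", "price/ea", "Amount"),
    ("Acme Supplies", "Qty", "Quantity"),
]
print(find_inconsistent_mappings(decisions))
```

Any entry in the result is a label that was mapped to different fields on different invoices from the same supplier, which is the pattern the first two practices are meant to prevent.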

Optimizing Invoice Quality

For best extraction results, advise suppliers to:

Recommendation                         Benefit
Send native PDFs (not scanned images)  Cleaner text extraction
Use consistent invoice templates       Faster machine learning
Include clear line item details        Better line-level extraction
Avoid handwritten notes                Reduces OCR errors
Send one invoice per PDF               Eliminates page selection confusion

Quality Over Quantity

A single clear, well-formatted invoice trains the system better than multiple low-quality scans.