Document Processing with SAP Document AI


Enterprises process vast volumes of business documents daily (invoices, purchase orders, receipts, delivery notes), often requiring manual data entry into enterprise systems. Intelligent document processing (IDP) reduces this operational burden by using AI to automatically extract, validate, and route document data to systems of record. SAP Document AI provides pre-trained models and generative AI capabilities to automate document processing, enabling organizations to reduce manual effort, accelerate cycle times, and improve data accuracy.

This reference architecture provides guidance for designing and implementing IDP solutions with SAP Document AI. From multi-channel document ingestion to AI-powered extraction, master data enrichment, and external system integrations, this guide covers architectural patterns, service selection criteria, and best practices for building document processing pipelines.

Architecture

An end-to-end document processing solution with SAP Document AI follows a three-layer architecture pattern separating document intake, extraction and enrichment, and posting:

image of solution diagram

The architecture centers on SAP Document AI as the core extraction engine and user interface, with optional enrichment and integration services on SAP BTP.

Flow

The reference architecture demonstrates how documents flow from capture through AI extraction to system posting:

  1. Ingestion Layer: For automatic processing, use the SAP Document AI inbound channels for Outlook and SharePoint. For manual uploads, SAP Document AI covers desktop and mobile scenarios with the Document AI workspace UI and the Joule Work mobile app. Optional pre-processing middleware handles other document channels (fax, messaging apps, ...) and complex pre-processing or routing requirements. See Document Ingestion Patterns.
  2. Extraction and Enrichment Layer: Classify the document and extract its data. For enrichment scenarios, you can use Integration Suite flows or a CAP application to augment the extracted data. See Data Extraction and Enrichment Patterns.
     • AI classification and extraction: Using SAP Document AI workflows, documents are split, classified, and extracted with the matching schema.
     • Enrichment and validation: SAP Document AI provides enrichment capabilities for business objects. For custom enrichment scenarios, you can use SAP Document AI outbound notifications together with Integration Suite flows or a CAP application to augment the extracted data.
     • Confidence-based auto-confirm: Documents whose critical fields all score above a confidence threshold (typically 90%) can be confirmed automatically and pushed to the next step.
     • Human in the loop: Users review and confirm low-confidence documents within the SAP Document AI workspace.
  3. Posting Layer: Use outbound notifications to post the results to external systems. See Document Posting and System Integration Patterns.
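
The confidence-based auto-confirm decision in the flow above can be illustrated with a minimal sketch. This is not the SAP Document AI API: the `ExtractedField` type, the `route_document` function, the field names, and the choice to treat a score equal to the threshold as passing are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical shape of one extracted field (not the real payload)."""
    name: str
    value: str
    confidence: float  # assumed to be normalized to the range 0.0-1.0


def route_document(fields, critical_fields, threshold=0.90):
    """Return "auto_confirm" if every critical field meets the threshold.

    A critical field that is missing from the extraction, or that scores
    below the threshold, sends the document to "human_review".
    """
    by_name = {field.name: field for field in fields}
    for name in critical_fields:
        field = by_name.get(name)
        if field is None or field.confidence < threshold:
            return "human_review"
    return "auto_confirm"


if __name__ == "__main__":
    invoice = [
        ExtractedField("documentNumber", "INV-1001", 0.98),
        ExtractedField("grossAmount", "119.00", 0.84),
    ]
    # grossAmount is below 0.90, so the document goes to a reviewer.
    print(route_document(invoice, ["documentNumber", "grossAmount"]))
```

Only the fields listed as critical influence the decision, so a low score on an optional field does not block straight-through processing.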

Characteristics

A document processing architecture with SAP Document AI can be characterized as follows:

  • AI-powered extraction: Generative AI models extract structured attributes from unstructured documents with 85-95% accuracy, reducing manual data entry.
  • Multi-channel intake: Documents enter from email, mobile apps, APIs, or web upload, providing flexibility for diverse business processes.
  • Confidence-based routing: Automated confidence scoring enables straight-through processing for high-confidence documents while routing ambiguous cases to human review.
  • Extensible enrichment: HTTP notification hooks enable custom post-processing logic for master data lookups, business rule validation, and system-specific transformations.
  • Hybrid integration: Support for multiple integration technologies (CAP, Integration Suite, Build Process Automation) accommodates varying complexity and governance requirements.
  • Human-in-the-loop: Built-in validation workspace allows users to review and correct extractions, with corrections feeding back to improve AI model accuracy.
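
As a rough illustration of the custom post-processing that the enrichment hooks make possible, the sketch below combines a master data lookup with one business rule check. Everything in it is hypothetical: the `enrich` function, the in-memory `SUPPLIERS` table, and the field names (`senderName`, `netAmount`, `taxAmount`, `grossAmount`) are assumptions, not the SAP Document AI notification payload. In practice the lookup would more likely call a master data service from a CAP application or an Integration Suite flow.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical master data: normalized supplier name -> supplier ID.
SUPPLIERS = {
    "acme gmbh": "100042",
    "globex corp": "100077",
}


def enrich(extracted, suppliers=SUPPLIERS):
    """Return a copy of the extracted data with a supplier ID and issue list.

    `extracted` is assumed to be a flat dict of field name -> string value.
    """
    enriched = dict(extracted)
    issues = []

    # Master data lookup: match the sender name case-insensitively.
    key = extracted.get("senderName", "").strip().lower()
    supplier_id = suppliers.get(key)
    if supplier_id is None:
        issues.append("supplier not found in master data")
    enriched["supplierId"] = supplier_id

    # Business rule validation: net + tax must equal gross.
    try:
        net = Decimal(extracted["netAmount"])
        tax = Decimal(extracted["taxAmount"])
        gross = Decimal(extracted["grossAmount"])
        if net + tax != gross:
            issues.append("net + tax does not equal gross")
    except (KeyError, TypeError, InvalidOperation):
        issues.append("amount missing or not numeric")

    enriched["issues"] = issues
    return enriched


if __name__ == "__main__":
    result = enrich({
        "senderName": " ACME GmbH ",
        "netAmount": "100.00",
        "taxAmount": "19.00",
        "grossAmount": "119.00",
    })
    print(result["supplierId"], result["issues"])
```

Collecting issues in a list instead of raising on the first failure lets a reviewer see every problem with a document at once.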

Examples in an SAP context

SAP Document AI enables automation across diverse document processing scenarios:

Services and Components

Resources