Hybrid invoice formats: PDF with embedded XML

Hybrid invoice formats combine a visual PDF with machine-readable XML. How do they work, which formats exist, and what are the complications?

A hybrid invoice combines two worlds in one file: a visual PDF that can be read by humans and a machine-readable XML attachment that can be processed automatically. This makes hybrid formats particularly suited for the transition phase towards full e-invoicing, where not all receivers can yet process documents automatically.

The principle

A hybrid invoice is technically a PDF/A-3 document. PDF/A-3 is an archiving variant of PDF that allows arbitrary files to be embedded as attachments. In the case of a hybrid invoice, that attachment is a structured XML file containing all invoice data.

The receiver can process the document in two ways:

Visually. Open the PDF and read, print or archive the invoice, exactly like a traditional PDF invoice.

Automatically. Read the embedded XML and process it automatically in the accounting or ERP system, without OCR or manual data entry.

This distinguishes hybrid formats from both a regular PDF (visual only, not structured) and a pure XML format such as UBL or CII (machine-readable only, not visual).
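
To make the "automatic" route concrete, here is a minimal sketch of reading a few header fields from a CII attachment once it has been extracted from the PDF. Extraction itself needs a PDF library and is not shown. The sample document is hypothetical and heavily shortened, and the function name `read_invoice_header` is an illustrative choice; the element paths follow the UN/CEFACT CII schema.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Namespaces of the UN/CEFACT Cross Industry Invoice (CII) schema.
NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:"
           "ReusableAggregateBusinessInformationEntity:100",
}

# Hypothetical, heavily shortened CII document (illustration only).
SAMPLE_XML = """<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100">
  <rsm:ExchangedDocument>
    <ram:ID>INV-2024-001</ram:ID>
  </rsm:ExchangedDocument>
  <rsm:SupplyChainTradeTransaction>
    <ram:ApplicableHeaderTradeSettlement>
      <ram:InvoiceCurrencyCode>EUR</ram:InvoiceCurrencyCode>
      <ram:SpecifiedTradeSettlementHeaderMonetarySummation>
        <ram:GrandTotalAmount>121.00</ram:GrandTotalAmount>
      </ram:SpecifiedTradeSettlementHeaderMonetarySummation>
    </ram:ApplicableHeaderTradeSettlement>
  </rsm:SupplyChainTradeTransaction>
</rsm:CrossIndustryInvoice>
"""

SETTLEMENT = "rsm:SupplyChainTradeTransaction/ram:ApplicableHeaderTradeSettlement"


def read_invoice_header(xml_text: str) -> dict:
    """Return invoice number, currency and grand total from a CII document."""
    root = ET.fromstring(xml_text)
    total = root.findtext(
        f"{SETTLEMENT}/ram:SpecifiedTradeSettlementHeaderMonetarySummation"
        "/ram:GrandTotalAmount",
        namespaces=NS,
    )
    return {
        "invoice_number": root.findtext("rsm:ExchangedDocument/ram:ID", namespaces=NS),
        "currency": root.findtext(f"{SETTLEMENT}/ram:InvoiceCurrencyCode", namespaces=NS),
        # Decimal avoids binary floating-point rounding on monetary values.
        "grand_total": Decimal(total) if total is not None else None,
    }


header = read_invoice_header(SAMPLE_XML)
print(header["invoice_number"], header["grand_total"], header["currency"])
```

No OCR is involved: the values come straight from named elements, which is the difference from processing a regular PDF.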

Which hybrid formats exist?

Several hybrid invoice formats exist, each with its own origin and technical choices. What they have in common is the PDF/A-3 principle; where they differ is the XML schema that is embedded.

Format              Region                          Embedded XML                EN 16931 compliant
------------------  ------------------------------  --------------------------  ------------------------
Factur-X / ZUGFeRD  France, Germany, growing in EU  CII (UN/CEFACT)             Yes (profile EN 16931+)
ISDOC.PDF           Czech Republic, Slovakia        ISDOC (proprietary schema)  No (conversion required)

Factur-X / ZUGFeRD

Factur-X (France) and ZUGFeRD (Germany) are two names for the same format. The embedded XML uses the CII schema from UN/CEFACT and is (from profile EN 16931 onwards) fully compliant with the European standard. Factur-X has five profiles (Minimum, Basic WL, Basic, EN 16931 and Extended) that vary in the amount of structured data, from minimal basic data to a fully extended model.

Factur-X is the dominant hybrid format in Europe. In France it is one of the three core formats under the CTC reform, alongside UBL and CII. In Germany and Austria, ZUGFeRD is widely adopted. The format also has its own Peppol DocumentTypeId, which means it can be sent directly via the Peppol network.

Read more: Factur-X / ZUGFeRD: the hybrid invoice format

ISDOC.PDF

ISDOC.PDF is the Czech variant of the hybrid concept. Instead of CII, it uses the national ISDOC XML as the embedded attachment. The format is not natively EN 16931 compliant and does not have its own Peppol DocumentTypeId. For cross-border use, conversion to UBL is required.

ISDOC.PDF is primarily relevant for the Czech and Slovak market, where it is widely supported by local accounting software. Outside this region it is largely unknown; trading partners elsewhere expect UBL or CII.

Read more: ISDOC: the Czech standard for e-invoicing

Why hybrid?

Hybrid formats solve a practical problem. The transition to fully structured e-invoicing (pure XML) is gradual. Not every receiver can process XML yet, but at the same time senders want to automate their invoicing process.

A hybrid invoice offers the best of both worlds: the sender can invoice in a fully structured way (the XML contains all data), while the receiver can simply read the invoice as a PDF if needed. Once the receiver can process invoices automatically, the data is immediately available without any changes on the sending side.

Technical structure

All hybrid formats follow the same technical layout:

┌─────────────────────────────────┐
│  PDF/A-3 container              │
│  ┌───────────────────────────┐  │
│  │  Visual PDF pages         │  │
│  │  (invoice layout)         │  │
│  └───────────────────────────┘  │
│  ┌───────────────────────────┐  │
│  │  XMP metadata             │  │
│  │  (describes attachment)   │  │
│  └───────────────────────────┘  │
│  ┌───────────────────────────┐  │
│  │  Embedded XML file        │  │
│  │  (CII, ISDOC, ...)        │  │
│  └───────────────────────────┘  │
└─────────────────────────────────┘

The three layers are:

  1. PDF/A-3 container: the outer shell, an ISO-standardized archiving format (ISO 19005-3).
  2. XMP metadata: describes the embedded attachment (type, version, conformance level).
  3. XML attachment: the structured invoice file in the schema of the respective format.
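
As an illustration of the XMP layer, the sketch below reads the properties that describe the attachment from a Factur-X style XMP packet. The namespace URI and the four property names are given as I understand them from the Factur-X specification and should be verified against the version you target; the sample packet and the function name `read_attachment_description` are hypothetical. Other hybrid formats use their own XMP schema.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: XMP extension namespace used by Factur-X for the attachment.
FX_NS = "urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#"
FX_FIELDS = ("DocumentType", "DocumentFileName", "Version", "ConformanceLevel")

# Hypothetical XMP packet, reduced to the part that describes the attachment.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:fx="urn:factur-x:pdfa:CrossIndustryDocument:invoice:1p0#">
      <fx:DocumentType>INVOICE</fx:DocumentType>
      <fx:DocumentFileName>factur-x.xml</fx:DocumentFileName>
      <fx:Version>1.0</fx:Version>
      <fx:ConformanceLevel>EN 16931</fx:ConformanceLevel>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
"""


def read_attachment_description(xmp_text: str) -> dict:
    """Collect the properties that describe the embedded XML attachment."""
    root = ET.fromstring(xmp_text)
    found = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        for field in FX_FIELDS:
            qualified = f"{{{FX_NS}}}{field}"
            # XMP allows a property as a child element or as an attribute.
            value = description.findtext(qualified) or description.get(qualified)
            if value:
                found[field] = value.strip()
    return found


print(read_attachment_description(SAMPLE_XMP))
```

A receiver can use these values to decide which parser to apply before opening the attachment itself.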

Complications with hybrid formats

Hybrid formats bring specific challenges that do not apply to pure XML formats.

PDF versus XML consistency. The PDF and the embedded XML must contain the same information. In practice, discrepancies can arise: an amount in the PDF may differ from the amount in the XML, for example due to rounding differences or errors in generation. The question then is which source is authoritative. In automated processing, that is always the XML; in disputes, the PDF version can cause confusion.
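
A receiver that wants to detect such discrepancies could compare the two totals before booking. The sketch below assumes the amount shown in the PDF has already been obtained as text by some other means (for example from the sender's rendering pipeline); the function name, the result keys and the tolerance parameter are illustrative assumptions.

```python
from decimal import Decimal


def check_amounts(pdf_amount: str, xml_amount: str,
                  tolerance: Decimal = Decimal("0.00")) -> dict:
    """Compare the total shown in the PDF with the total in the XML.

    The XML value is the one that gets booked, as in automated processing;
    a difference above the tolerance only flags the invoice for review.
    """
    pdf_value = Decimal(pdf_amount)
    xml_value = Decimal(xml_amount)
    difference = abs(pdf_value - xml_value)
    return {
        "booked_amount": xml_value,
        "difference": difference,
        "needs_review": difference > tolerance,
    }


# A one-cent rounding difference between the two representations.
result = check_amounts("121.01", "121.00")
print(result["needs_review"], result["difference"])
```

Using `Decimal` rather than `float` matters here: a check that itself introduces rounding error would produce false alarms.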

Profiles and processability. With Factur-X/ZUGFeRD, the profile determines how much information is in the XML. The lower profiles (Minimum, Basic WL) contain too little data for full automated processing. A receiver who only gets the Minimum profile cannot use the XML to fully book the invoice and must still consult the PDF.
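
A receiver can find out which profile it has been sent by reading the guideline identifier in the XML. The sketch below maps identifiers to profile names; the URN values are given as I understand them from Factur-X 1.0 and should be checked against the specification version in use. Treating Basic and higher as sufficient for booking is an assumption that follows the text above, and the function names and sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rsm": "urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100",
    "ram": "urn:un:unece:uncefact:data:standard:"
           "ReusableAggregateBusinessInformationEntity:100",
}

# Assumption: guideline identifiers of the five Factur-X 1.0 profiles.
PROFILE_BY_GUIDELINE = {
    "urn:factur-x.eu:1p0:minimum": "MINIMUM",
    "urn:factur-x.eu:1p0:basicwl": "BASIC WL",
    "urn:cen.eu:en16931:2017#compliant#urn:factur-x.eu:1p0:basic": "BASIC",
    "urn:cen.eu:en16931:2017": "EN 16931",
    "urn:cen.eu:en16931:2017#conformant#urn:factur-x.eu:1p0:extended": "EXTENDED",
}
# Assumption: profiles treated as carrying enough data to book from XML alone.
XML_ONLY_BOOKABLE = {"BASIC", "EN 16931", "EXTENDED"}

# Hypothetical document reduced to the context element.
SAMPLE_XML = """<rsm:CrossIndustryInvoice
    xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
    xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100">
  <rsm:ExchangedDocumentContext>
    <ram:GuidelineSpecifiedDocumentContextParameter>
      <ram:ID>urn:factur-x.eu:1p0:minimum</ram:ID>
    </ram:GuidelineSpecifiedDocumentContextParameter>
  </rsm:ExchangedDocumentContext>
</rsm:CrossIndustryInvoice>
"""


def detect_profile(xml_text: str) -> str:
    """Return the profile name, or "UNKNOWN" for an unrecognized identifier."""
    root = ET.fromstring(xml_text)
    guideline = root.findtext(
        "rsm:ExchangedDocumentContext"
        "/ram:GuidelineSpecifiedDocumentContextParameter/ram:ID",
        default="",
        namespaces=NS,
    )
    return PROFILE_BY_GUIDELINE.get(guideline.strip(), "UNKNOWN")


def needs_pdf_fallback(profile: str) -> bool:
    """True when the XML alone is not expected to be enough for booking."""
    return profile not in XML_ONLY_BOOKABLE


profile = detect_profile(SAMPLE_XML)
print(profile, needs_pdf_fallback(profile))
```

Unknown identifiers fall back to the PDF as well, which is the cautious default when a sender uses a profile the receiver has not seen before.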

Validation. In addition to XML schema validation, hybrid formats also require a correct PDF/A-3 structure. The XML file must be embedded in line with the PDF/A-3 specification, including correct XMP metadata and a relationship type for the attachment (the AFRelationship key). This adds an extra validation layer.

Interoperability with national formats. Formats like ISDOC.PDF that use their own national XML schema (instead of CII or UBL) are not directly exchangeable with European trading partners. Conversion is required, and that conversion takes place on the embedded XML, not on the PDF container. The result is typically a standalone UBL file, so the hybrid character is lost.

File size. A hybrid invoice is larger than a standalone XML file, because it contains both the visual PDF and the XML. For a single invoice this is negligible, but in batch processing of thousands of invoices per day, the difference can be relevant.

eConnect and hybrid formats

eConnect supports all common hybrid invoice formats. Received hybrid documents are automatically processed via the embedded XML, regardless of whether it is Factur-X, ZUGFeRD or ISDOC.PDF. The visual PDF is preserved as an attachment.

The PSB can also transform hybrid formats to Peppol BIS Billing V3 and other supported formats. The PSB extracts the embedded XML, transforms it to the desired output format and routes the document to the receiver. This also works in reverse: a UBL invoice can be transformed into a hybrid format if the receiver expects it.

Notably, the PSB API can receive and process these formats directly. Software vendors who already invoice in Factur-X or ISDOC can send their existing documents to the API without converting them first; the PSB handles the transformation automatically. This makes it easy for software partners in, for example, the Czech Republic or France to connect to the Peppol network via eConnect.
