Pushdown Automaton for the OpenPGP Message Format

See RFC 4880, §11.3 (OpenPGP Messages), for the formal definition.

A simulation of the automaton can be found here.

The graph below shows the Pushdown Automaton. An edge label a,X/Y means that the automaton reads input a, pops X from the top of the stack, and pushes Y:

graph LR
    start((start)) -- "ε,ε/m#" --> pgpmsg((OpenPGP Message))
    pgpmsg -- "Literal Data,m/ε" --> literal((Literal Message))
    literal -- "ε,#/ε" --> accept((Valid))
    literal -- "Signature,o/ε" --> sig4ops((Corresponding Signature))
    sig4ops -- "Signature,o/ε" --> sig4ops
    sig4ops -- "ε,#/ε" --> accept
    pgpmsg -- "OnePassSignature,m/o" --> ops((One-Pass-Signed Message))
    ops -- "ε,ε/m" --> pgpmsg
    pgpmsg -- "Signature,m/ε" --> signed((Signed Message))
    signed -- "ε,ε/m" --> pgpmsg
    pgpmsg -- "Compressed Data,m/ε" --> comp((Compressed Message))
    comp -. "ε,ε/m" .-> pgpmsg
    comp -- "ε,#/ε" --> accept
    comp -- "Signature,o/ε" --> sig4ops
    pgpmsg -- "SKESK|PKESK,m/k" --> esks((ESKs))
    pgpmsg -- "Sym. Enc. (Int. Prot.) Data,m/ε" --> enc
    esks -- "SKESK|PKESK,k/k" --> esks
    esks -- "Sym. Enc. (Int. Prot.) Data,k/ε" --> enc((Encrypted Message))
    enc -. "ε,ε/m" .-> pgpmsg
    enc -- "ε,#/ε" --> accept
    enc -- "Signature,o/ε" --> sig4ops

Formally, the PDA is defined as M = (\mathcal{Q}, \Sigma, \Upgamma, \delta, q_0, Z, F), where

  • \mathcal{Q} is a finite set of states
  • \Sigma is a finite set which is called the input alphabet
  • \Upgamma is a finite set which is called the stack alphabet
  • \delta is a finite subset of \mathcal{Q}\times(\Sigma\cup\{\epsilon\})\times\Upgamma\times\mathcal{Q}\times\Upgamma^*, the transition relation
  • q_0\in\mathcal{Q} is the start state
  • Z\in\Upgamma is the initial stack symbol
  • F\subseteq\mathcal{Q} is the set of accepting states

In our diagram, the initial state q_0 is called start. The initial stack symbol Z is ε (TODO: Make it #?). The set of accepting states is F=\{\text{Valid}\}. \delta is defined by the transitions shown in the graph diagram.
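The transition relation can also be written down as data and executed. The following is a minimal sketch, not PGPainless code: it assumes the input is the already decrypted and decompressed packet sequence given as plain strings, it uses the node ids from the graph as state names, and it writes the stack as a string with its top at the left. The names `RULES` and `accepts` are hypothetical.

```python
from collections import deque

EPS = None  # marks a transition that reads no packet

ESK = ("SKESK", "PKESK")
ENCRYPTED = ("Sym. Enc. Data", "Sym. Enc. Int. Prot. Data")

# Edges of the diagram: (state, inputs, pop, next_state, push).
# "pop" and "push" are strings whose first character is the stack top.
EDGES = [
    ("start",   (EPS,),                "",  "pgpmsg",  "m#"),
    ("pgpmsg",  ("Literal Data",),     "m", "literal", ""),
    ("literal", (EPS,),                "#", "accept",  ""),
    ("literal", ("Signature",),        "o", "sig4ops", ""),
    ("sig4ops", ("Signature",),        "o", "sig4ops", ""),
    ("sig4ops", (EPS,),                "#", "accept",  ""),
    ("pgpmsg",  ("OnePassSignature",), "m", "ops",     "o"),
    ("ops",     (EPS,),                "",  "pgpmsg",  "m"),
    ("pgpmsg",  ("Signature",),        "m", "signed",  ""),
    ("signed",  (EPS,),                "",  "pgpmsg",  "m"),
    ("pgpmsg",  ("Compressed Data",),  "m", "comp",    ""),
    ("comp",    (EPS,),                "",  "pgpmsg",  "m"),  # dotted edge
    ("comp",    (EPS,),                "#", "accept",  ""),
    ("comp",    ("Signature",),        "o", "sig4ops", ""),
    ("pgpmsg",  ESK,                   "m", "esks",    "k"),
    ("pgpmsg",  ENCRYPTED,             "m", "enc",     ""),
    ("esks",    ESK,                   "k", "esks",    "k"),
    ("esks",    ENCRYPTED,             "k", "enc",     ""),
    ("enc",     (EPS,),                "",  "pgpmsg",  "m"),  # dotted edge
    ("enc",     (EPS,),                "#", "accept",  ""),
    ("enc",     ("Signature",),        "o", "sig4ops", ""),
]

# Expand edges with several alternative inputs into one rule per input.
RULES = [(s, sym, pop, nxt, push)
         for s, syms, pop, nxt, push in EDGES for sym in syms]


def accepts(packets):
    """Breadth-first search over configurations (state, position, stack)."""
    packets = tuple(packets)
    start = ("start", 0, "")  # the initial stack is empty, as in the text
    seen = {start}
    queue = deque([start])
    while queue:
        state, pos, stack = queue.popleft()
        if state == "accept" and pos == len(packets):
            return True
        for src, symbol, pop, dst, push in RULES:
            if src != state or not stack.startswith(pop):
                continue
            if symbol is EPS:
                next_pos = pos
            elif pos < len(packets) and packets[pos] == symbol:
                next_pos = pos + 1
            else:
                continue
            config = (dst, next_pos, push + stack[len(pop):])
            if config not in seen:
                seen.add(config)
                queue.append(config)
    return False
```

With this sketch, `accepts(["OnePassSignature", "Literal Data", "Signature"])` is true, while `accepts(["Literal Data", "Signature"])` is false because no `o` is on the stack when the trailing signature arrives. The search terminates because every ε-transition into OpenPGP Message must be followed by a transition that consumes a packet.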

The input alphabet consists of the following OpenPGP packets:

  • Literal Data: Literal Data Packet
  • Signature: Signature Packet
  • OnePassSignature: One-Pass-Signature Packet
  • Compressed Data: Compressed Data Packet
  • SKESK: Symmetric-Key Encrypted Session Key Packet
  • PKESK: Public-Key Encrypted Session Key Packet
  • Sym. Enc. Data: Symmetrically Encrypted Data Packet
  • Sym. Enc. Int. Prot. Data: Symmetrically Encrypted Integrity Protected Data Packet

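In the diagram, SKESK|PKESK stands for either kind of encrypted session key packet, and Sym. Enc. (Int. Prot.) Data stands for either of the two encrypted data packets. Additionally, ε is used to transition without reading OpenPGP packets.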

The following stack alphabet is used:

  • m: OpenPGP Message
  • o: One-Pass-Signature Packet
  • k: Encrypted Session Key
  • #: Terminal for valid OpenPGP messages
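As a worked example of how the stack symbols are used, here is a sketch of the run for a one-pass-signed message, traced along the edges of the diagram. Each configuration is (state, remaining input, stack), with the top of the stack written leftmost; this assumes that pushing m# places m on top of #.

```latex
\begin{aligned}
(\text{start},\ \text{OnePassSignature}\cdot\text{Literal Data}\cdot\text{Signature},\ \epsilon)
&\vdash (\text{OpenPGP Message},\ \text{OnePassSignature}\cdot\text{Literal Data}\cdot\text{Signature},\ m\#) \\
&\vdash (\text{One-Pass-Signed Message},\ \text{Literal Data}\cdot\text{Signature},\ o\#) \\
&\vdash (\text{OpenPGP Message},\ \text{Literal Data}\cdot\text{Signature},\ mo\#) \\
&\vdash (\text{Literal Message},\ \text{Signature},\ o\#) \\
&\vdash (\text{Corresponding Signature},\ \epsilon,\ \#) \\
&\vdash (\text{Valid},\ \epsilon,\ \epsilon)
\end{aligned}
```

The o pushed for the One-Pass-Signature packet stays on the stack until the matching Signature packet is read, and # can only be popped once no o remains.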

Note: The standard states that Marker Packets shall be ignored. For the sake of readability, those transitions are omitted here.

A dotted line indicates a nested transition. For example, the transition (\text{Compressed Message}, \epsilon, \epsilon, \text{OpenPGP Message}, m) indicates that the content of the Compressed Data packet is itself an OpenPGP Message.
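One way to picture nesting, as a rough sketch: if a message is modeled as a tree in which a container packet holds its content, a depth-first walk yields the packet order in which the automaton reads the message, and each step into a container corresponds to a dotted edge. The tuple representation and the names `flatten` and `flatten_message` are hypothetical and not part of any library.

```python
def flatten(packet):
    """Yield packet names depth-first: the container first, then its content."""
    name, children = packet
    yield name
    for child in children:
        yield from flatten(child)


def flatten_message(packets):
    """Flatten a list of (name, children) packets into one input sequence."""
    return [name for packet in packets for name in flatten(packet)]


# A one-pass-signed message whose signed content is a Compressed Data packet
# that in turn contains a literal message.
message = [
    ("OnePassSignature", []),
    ("Compressed Data", [("Literal Data", [])]),
    ("Signature", []),
]

sequence = flatten_message(message)
# sequence is the packet order read by the automaton:
# OnePassSignature, Compressed Data, Literal Data, Signature
```

Reading this sequence, the automaton takes the dotted edge from Compressed Message back to OpenPGP Message right after the Compressed Data packet, so the Literal Data packet is validated as the nested message.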