9.9 KiB
Pushdown Automaton for the OpenPGP Message Format
See RFC4880 §11.3. OpenPGP Messages for the formal definition.
A simulation of the automaton can be found here.
RFC4880 defines the grammar of OpenPGP messages as follows:
OpenPGP Message :- Encrypted Message | Signed Message |
Compressed Message | Literal Message.
Compressed Message :- Compressed Data Packet.
Literal Message :- Literal Data Packet.
ESK :- Public-Key Encrypted Session Key Packet |
Symmetric-Key Encrypted Session Key Packet.
ESK Sequence :- ESK | ESK Sequence, ESK.
Encrypted Data :- Symmetrically Encrypted Data Packet |
Symmetrically Encrypted Integrity Protected Data Packet
Encrypted Message :- Encrypted Data | ESK Sequence, Encrypted Data.
One-Pass Signed Message :- One-Pass Signature Packet,
OpenPGP Message, Corresponding Signature Packet.
Signed Message :- Signature Packet, OpenPGP Message |
One-Pass Signed Message.
In addition, decrypting a Symmetrically Encrypted Data packet or a
Symmetrically Encrypted Integrity Protected Data packet as well as
decompressing a Compressed Data packet must yield a valid OpenPGP
Message.
This grammar can be translated into a pushdown automaton with the following graphical representation:
graph LR
start((start)) -- "ε,ε/m#" --> pgpmsg((OpenPGP Message))
pgpmsg -- "Literal Data,m/ε" --> literal((Literal Message))
literal -- "ε,#/ε" --> accept((Valid))
literal -- "Signature,o/ε" --> sig4ops((Corresponding Signature))
sig4ops -- "Signature,o/ε" --> sig4ops
sig4ops -- "ε,#/ε" --> accept
pgpmsg -- "OnePassSignature,m/o" --> ops((One-Pass-Signed Message))
ops -- "ε,ε/m" --> pgpmsg
pgpmsg -- "Signature,m/ε" --> signed((Signed Message))
signed -- "ε,ε/m" --> pgpmsg
pgpmsg -- "Compressed Data,m/ε" --> comp((Compressed Message))
comp -. "ε,ε/m" .-> pgpmsg
comp -- "ε,#/ε" --> accept
comp -- "Signature,o/ε" --> sig4ops
pgpmsg -- "SKESK|PKESK,m/k" --> esks((ESKs))
pgpmsg -- "Sym. Enc. (Int. Prot.) Data,m/ε" --> enc
esks -- "SKESK|PKESK,k/k" --> esks
esks -- "Sym. Enc. (Int. Prot.) Data,k/ε" --> enc((Encrypted Message))
enc -. "ε,ε/m" .-> pgpmsg
enc -- "ε,#/ε" --> accept
enc -- "Signature,o/ε" --> sig4ops
Formally, the PDA is defined as M = (\mathcal{Q}, \Sigma, \Upgamma, \delta, q_0, Z, F)
, where
\mathcal{Q}
is a finite set of states\Sigma
is a finite set which is called the input alphabet\Upgamma
is a finite set which is called the stack alphabet\delta
is a finite set of\mathcal{Q}\times(\Sigma\cup\textbraceleft\epsilon\textbraceright)\times\Upgamma\times\mathcal{Q}\times\Upgamma^*
, the transition relationq_0\in\mathcal{Q}
is the start stateZ\in\Upgamma
is the initial stack symbolF\subseteq\mathcal{Q}
is the set of accepting states
In our diagram, the initial state is q_0 = \text{start}
.
The initial stack symbol is Z=\epsilon
(TODO: Make it #
?).
The set of accepting states is F=\textbraceleft\text{valid}\textbraceright
.
\delta
is defined by the transitions shown in the graph diagram.
The input alphabet \Sigma
consists of the following OpenPGP packets:
\text{Literal Data}
: Literal Data Packet\text{Signature}
: Signature Packet\text{OnePassSignature}
: One-Pass-Signature Packet\text{Compressed Data}
: Compressed Data Packet\text{SKESK}
: Symmetric-Key Encrypted Session Key Packet\text{PKESK}
: Public-Key Encrypted Session Key Packet\text{Sym. Enc. Data}
: Symmetrically Encrypted Data Packet\text{Sym. Enc. Int. Prot. Data}
: Symmetrically Encrypted Integrity Protected Data Packet
Additionally, \epsilon
is used to transition without reading OpenPGP packets or popping the stack.
The following stack alphabet \Upgamma
is used:
m
: OpenPGP Messageo
: One-Pass-Signature packet.k
: Encrypted Session Key#
: Terminal for valid OpenPGP messages
Note: The standards document states, that Marker Packets shall be ignored as well. For the sake of readability, those transitions are omitted here.
The dotted line indicates a nested transition.
For example, the transition (\text{Compressed Message}, \epsilon, \epsilon, \text{OpenPGP Message}, m)
indicates, that the content of the
Compressed Data packet itself is an OpenPGP Message.