openpgp-notes/book/source/signing_data.md
2023-12-20 14:55:23 +01:00

<!--
SPDX-FileCopyrightText: 2023 The "Notes on OpenPGP" project
SPDX-License-Identifier: CC-BY-SA-4.0
-->
# Signatures over data
In OpenPGP, a *{term}`data signature`* guarantees the {term}`authenticity<Authentication>` and, implicitly, the integrity of certain data. Typical use cases of {term}`data signatures<Data Signature>` include the {term}`authentication` of software packages and emails.
"{term}`Authenticity<Authentication>`" in this context means that the {term}`data signature` was issued by {term}`the entity controlling the signing key material<Certificate Holder>`. However,
it does not automatically indicate whether the expected party indeed controls the {term}`signer` {term}`certificate<OpenPGP Certificate>`. OpenPGP does offer mechanisms for *strong {term}`authentication`*, connecting {term}`certificates<OpenPGP Certificate>` to specific {term}`identities<Identity>`. These mechanisms verify that the intended communication partner is indeed associated with the cryptographic {term}`identity` behind the {term}`signature<OpenPGP Signature Packet>`[^sign-auth].
[^sign-auth]: Other signing solutions, like [signify](https://flak.tedunangst.com/post/signify), focus on pure signing without strong {term}`authentication` of the {term}`signer`'s {term}`identity`.
{term}`Data signatures<Data Signature>` can only be issued by {term}`component keys<Component Key>` with the *{term}`signing<Signing Key Flag>`* [key flag](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-key-flags).
Note that {term}`data signatures<Data Signature>` are distinct from [](/signing_components), which are used to form and maintain {term}`certificates<OpenPGP Certificate>`, as well as to {term}`certify<Certification>` {term}`identities<Identity>` on {term}`certificates<OpenPGP Certificate>`.
(data-signature-types)=
## Signature types
{term}`OpenPGP data signatures<Data Signature>` use one of two [signature types](signature-types):
- [**Binary signature**](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#sigtype-binary) ({term}`type ID<Signature Type ID>` `0x00`): This is the standard {term}`signature type` for binary data and is typically used for files or data streams. {term}`Binary signatures<Binary Signature>` are calculated over the data without any modifications or transformations.
- [**Text signature**](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-signature-of-a-canonical-te) ({term}`type ID<Signature Type ID>` `0x01`): Used for textual data, such as email bodies. When calculating a {term}`text signature`, the data is first normalized by converting line endings into a canonical form (`<CR><LF>`). This approach mitigates issues caused by platform-specific text encodings. This is especially important for detached and {term}`cleartext signatures<Cleartext Signature>`, where the message file might undergo re-encoding between the creation and {term}`verification` of the {term}`signature<OpenPGP Signature Packet>`.
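
The line-ending normalization used for text signatures can be sketched in a few lines of Python. This is a minimal illustration, not taken from any OpenPGP implementation: the function name `canonicalize_line_endings` is hypothetical, and the sketch only handles `<LF>` and `<CR><LF>` endings (how to treat a bare `<CR>` is left out here).

```python
import re

# Matches a line feed, optionally preceded by a carriage return.
_LINE_ENDING = re.compile(rb"\r?\n")


def canonicalize_line_endings(data: bytes) -> bytes:
    """Return `data` with every LF or CRLF line ending rewritten as CRLF."""
    return _LINE_ENDING.sub(b"\r\n", data)


# The same text, saved on two platforms, normalizes to identical bytes,
# so a text signature made over one copy also matches the other.
unix_text = b"hello\nworld\n"
windows_text = b"hello\r\nworld\r\n"
assert canonicalize_line_endings(unix_text) == canonicalize_line_endings(windows_text)
```
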
{term}`Data signatures<Data Signature>` are generated by {term}`hashing<Hash Digest>` the message content along with the {term}`metadata` in the {term}`OpenPGP signature packet`, and calculating a {term}`cryptographic signature` over that {term}`hash<Hash Digest>`. The resulting {term}`cryptographic signature` is stored in the {term}`signature packet<OpenPGP Signature Packet>`.
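
As a rough sketch of this hashing step, the following Python snippet feeds the message and then the signature metadata into one hash context. The function name `signature_digest` and the `signature_metadata` byte string are placeholders for illustration: the exact serialization of the metadata (and any additional framing required by a given signature version) is defined by the specification and is not reproduced here. The final public-key operation over the digest is also omitted.

```python
import hashlib


def signature_digest(message: bytes, signature_metadata: bytes,
                     hash_name: str = "sha512") -> bytes:
    """Hash the message content followed by (placeholder) signature metadata."""
    hasher = hashlib.new(hash_name)
    hasher.update(message)             # the signed data itself
    hasher.update(signature_metadata)  # stand-in for the serialized packet metadata
    return hasher.digest()


digest = signature_digest(b"hello world", b"<placeholder metadata>")
# A cryptographic signature would now be calculated over `digest`
# and stored in the signature packet.
print(digest.hex())
```

Because the metadata is part of the hashed input, changing any of it (for example the creation time) invalidates the signature, just like changing the message does.
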
{term}`Data signatures<Data Signature>` manifest in three distinct forms, which will be detailed in the subsequent section.
(forms-of-data-signatures)=
## Forms of OpenPGP data signatures
{term}`OpenPGP data signatures<Data Signature>` can be applied in three distinct forms[^sign-modes-gpg]:
- **{term}`Detached<Detached Signature>`**: The OpenPGP signature exists as a separate entity, independent of the signed data.
- **{term}`Inline<Inline Signature>`**: Both the original data and its corresponding {term}`OpenPGP signature<OpenPGP Signature Packet>` are encapsulated within an {term}`OpenPGP message`.
- **{term}`Cleartext signature`**: A plaintext message and its {term}`OpenPGP signature<OpenPGP Signature Packet>` coexist in a combined text format, preserving the readability of the original message.
[^sign-modes-gpg]: These three forms of {term}`signature<OpenPGP Signature Packet>` application align with GnuPG's `--detach-sign`, `--sign`, and `--clearsign` command options.
## Detached signatures
A {term}`detached signature` is produced by calculating an {term}`OpenPGP signature<OpenPGP Signature Packet>` over the data intended for signing. The original data remains unchanged, and the {term}`OpenPGP signature<OpenPGP Signature Packet>` is stored separately, e.g. as a standalone file. A {term}`detached signature` file can be distributed alongside or independent of the original data. The {term}`authenticity<Authentication>` and integrity of the original data file can be {term}`verified<Verification>` by using the {term}`detached signature` file.
This {term}`signature<OpenPGP Signature Packet>` format is especially useful for signing software releases and other files where it is imperative that the content remains unaltered during the signing process.
(inline-signature)=
## Inline signatures
An {term}`inline signature` joins the signed data and its corresponding {term}`data signature` into a single {term}`OpenPGP message`.
This method is commonly used for signing or encrypting emails. Most email software capable of handling OpenPGP communications uses {term}`inline signatures<Inline Signature>`.
OpenPGP defines two variant forms of inline-signed messages:
1. **{term}`One-pass signed messages<One-pass signed Message>`**: This is the commonly used format for inline-signed messages. A signer can produce and a verifier can verify this format in one pass.
2. **{term}`Prefixed signed messages<Prefixed signed Message>`**: This format predates[^inline-signature-formats] {term}`one-pass signed messages<One-pass signed Message>` and is conceptually slightly simpler. However, it has no strong benefits and is now rarely used.
[^inline-signature-formats]: One-pass signing was first specified in RFC 2440. The format was not supported in PGP 2.6.x.
(one-pass-signature)=
### One-pass signed message
This is the commonly used format for inline signed messages.
#### Structure
A {term}`one-pass signed<One-pass signed Message>` {term}`OpenPGP message` consists of three segments:
1. [**One-pass signature packets**](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#one-pass-sig): These one or more {term}`packets<Packet>` precede the signed data and enable {term}`signature<OpenPGP Signature Packet>` computation (both creation and verification) in a single pass.
2. [**Literal data packet**](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#lit): This contains the original data (e.g., the body of a message), without additional interpretation or conversion.
3. **{term}`Data signature packets<OpenPGP Signature Packet>`**: These contain the {term}`cryptographic signature` corresponding to the original data.
```{figure} plain_svg/ops-signed-message.svg
:name: fig-ops-signed-message
:alt: Depicts the structure of a one-pass signed message. Two one-pass signatures lead the literal data packet, followed by two signature packets. Arrows show how the hash algorithm field of the one-pass signatures is inspected in order to initiate the hashing procedure.
The structure of a one-pass signed message.
```
```{note}
Despite its name, a {term}`one-pass signature packet` is not a type of {term}`signature packet<OpenPGP Signature Packet>`.
Instead, it's a type of auxiliary packet that can be used in conjunction with {term}`signature packets<OpenPGP Signature Packet>`. Its use allows storing the {term}`signature packets<OpenPGP Signature Packet>` after the message body.
```
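
The three-segment layout can be expressed as a simple check over a sequence of packet type IDs. The sketch below is a deliberate simplification: it uses the type IDs for signature (2), one-pass signature (4), and literal data (11) packets, and it ignores nesting, compression, and encryption layers that real messages may contain. The function name `is_one_pass_signed` is hypothetical.

```python
SIGNATURE = 2           # signature packet
ONE_PASS_SIGNATURE = 4  # one-pass signature packet
LITERAL_DATA = 11       # literal data packet


def is_one_pass_signed(tags: list[int]) -> bool:
    """Check for N one-pass signature packets, one literal data packet,
    and N signature packets, in that order (with N >= 1)."""
    count = 0
    while count < len(tags) and tags[count] == ONE_PASS_SIGNATURE:
        count += 1
    expected = [ONE_PASS_SIGNATURE] * count + [LITERAL_DATA] + [SIGNATURE] * count
    return count > 0 and tags == expected


# Two signers: two leading one-pass signature packets, two trailing signatures.
print(is_one_pass_signed([4, 4, 11, 2, 2]))
# Signature packet first: this is the prefixed form, not the one-pass form.
print(is_one_pass_signed([2, 11]))
```
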
#### The function of the one-pass signature packet
To understand the purpose of this packet, consider that without it, the position of signature packets within an inline-signed OpenPGP message constitutes a trade-off for efficient data processing, in particular when the plaintext data is large and exceeds the available memory.
The producer of a signed OpenPGP message wants to streamline the signature calculation so that a copy of the signed data can be emitted while the cryptographic signature is being calculated. On the signer's side, it is therefore easiest to store the signature packet after the signed data.
The verifier, on the other hand, needs some information from the signature packet in order to perform the signature verification process. In particular, the verifier needs to know which hash algorithm was used to calculate the signature, in order to perform the same hashing operation on the message data.
As a consequence, without a {term}`one-pass signature packet`, either:
- the producer would need to process the signed data twice:
  - once to calculate the signature, and
  - a second time to emit the signed data (the result is a prefixed signed message), or
- the verifier would need to process the OpenPGP message twice:
  - once to read the signature packets at the end in order to determine the hash algorithm, and
  - a second time to process the body of the message and calculate the hash needed to verify the signature.
The one-pass signature packet solves this issue by allowing both the creation and verification of a signed message in a single pass. It effectively contains a copy of the metadata found in the corresponding signature packet (such as the hash algorithm and an identifier for the signing key), but without the cryptographic signature data.
The signer can easily emit this metadata before processing the full message, and for the verifier, this metadata enables processing of the message body. Both signer and verifier can efficiently generate or check a one-pass signed message.
#### Creation
To produce an {term}`inline signature`, the {term}`signer` decides on a hash algorithm and emits a {term}`one-pass signature packet<One-pass Signature Packet>` into the destination {term}`OpenPGP message`. This packet contains essential information such as the {term}`fingerprint<OpenPGP Fingerprint>` of the {term}`signing key<OpenPGP Component Key>` and the {term}`hash<Hash Digest>` algorithm used for computing the {term}`signature<OpenPGP Signature Packet>`'s {term}`hash digest`. The signer then processes the entirety of the plaintext data, emitting it into the message as a {term}`literal data packet`. Once the data is processed, the {term}`signer` calculates a {term}`cryptographic signature` over the resulting hash value. Lastly, the result is emitted as a {term}`data signature packet` to the output message, and the whole packet sequence can be efficiently stored or transmitted.
For efficient {term}`verification`, an application must understand how to handle the {term}`literal data<Literal Data Packet>` prior to reading from it. This requirement is addressed by the {term}`one-pass signature packets<One-pass Signature Packet>` located at the beginning of {term}`inline-signed<Inline Signature>` messages. This setup enables the verifier to process the data correctly and efficiently in only a single pass.
Strictly speaking, knowing just the hash algorithm would be sufficient to begin the verification process. However, having efficient access to the signer's fingerprint or key ID upfront allows OpenPGP software to fetch the signer's certificate(s) before processing the entirety of the (potentially large) signed data. This may, for example, involve downloading the certificate from a keyserver. If fetching the signer's certificate(s) fails or requires additional input from the user, it is better to notify the user before processing the data.
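
The single-pass creation flow can be illustrated with a toy model in Python. Everything here is an assumption made for illustration: packets are represented as `(kind, payload)` tuples rather than real OpenPGP packet encodings, the signer label `"toy-key"` is made up, the signature metadata is not hashed, and an HMAC is swapped in for the public-key signature so that the example runs with the standard library alone.

```python
import hashlib
import hmac
import io


def sign_one_pass(source, key: bytes, hash_name: str = "sha512", chunk_size: int = 4):
    """Yield (kind, payload) tuples for a toy one-pass signed message."""
    # 1. Announce the hash algorithm and signer before touching the data.
    yield ("one-pass-signature", {"hash": hash_name, "signer": "toy-key"})

    # 2. Stream the data: each chunk is hashed, emitted, and then forgotten.
    hasher = hashlib.new(hash_name)
    while chunk := source.read(chunk_size):
        hasher.update(chunk)
        yield ("literal-data", chunk)

    # 3. Sign the finished digest (HMAC is only a stand-in for a real
    #    public-key signature) and emit the result last.
    tag = hmac.new(key, hasher.digest(), hash_name).digest()
    yield ("signature", {"hash": hash_name, "value": tag})


packets = list(sign_one_pass(io.BytesIO(b"hello world"), key=b"toy secret"))
print([kind for kind, _ in packets])
```

The signer never holds more than one chunk in memory and never rereads the source, which is the property the one-pass format is designed to provide.
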
#### Verification
{term}`Inline-signed<Inline Signature>` messages enable efficient {term}`verification` in *one pass*, structured as follows:
1. **Initiation with {term}`one-pass signature packets<One-pass Signature Packet>`**: These {term}`packets<Packet>` begin the {term}`verification` process. They include the {term}`signer`'s {term}`key ID`/{term}`fingerprint<OpenPGP Fingerprint>`, essential for identifying the appropriate {term}`public key<OpenPGP Certificate>` for signature {term}`validation`.
2. **Processing the {term}`literal data packet`**: This step involves {term}`hashing<Hash Digest>` the literal data, preparing it for {term}`signature<OpenPGP Signature Packet>` {term}`verification`.
3. **{term}`Verifying<Verification>` {term}`signature packets<OpenPGP Signature Packet>`**: Located at the end of the message, these {term}`packets<Packet>` are checked against the previously calculated {term}`hash digest`.
Note that the {term}`signer`'s {term}`public key<OpenPGP Certificate>`, which is critical for the final {term}`verification` step, is not embedded in the message. Verifiers must acquire this {term}`key` externally (e.g., from a {term}`key server`) to authenticate the {term}`signature<OpenPGP Signature Packet>` successfully.
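
The three verification steps can be sketched as a single loop over the incoming packets. This is a toy model under stated assumptions: packets are represented as `(kind, payload)` tuples instead of real OpenPGP encodings, only one signature is handled, and an HMAC with a shared key stands in for public-key signature verification. The key is passed in as an argument, mirroring the fact that it is not part of the message.

```python
import hashlib
import hmac


def verify_one_pass(packets, key: bytes) -> bool:
    """Verify a toy one-pass signed message in a single pass over `packets`."""
    hasher = None
    for kind, payload in packets:
        if kind == "one-pass-signature":
            # Step 1: learn the hash algorithm before any data arrives.
            hasher = hashlib.new(payload["hash"])
        elif kind == "literal-data":
            # Step 2: hash the data as it streams by.
            if hasher is None:
                raise ValueError("literal data arrived before a one-pass signature packet")
            hasher.update(payload)
        elif kind == "signature":
            # Step 3: check the trailing signature against the digest.
            if hasher is None or payload["hash"] != hasher.name:
                return False
            expected = hmac.new(key, hasher.digest(), hasher.name).digest()
            return hmac.compare_digest(expected, payload["value"])
    return False  # no signature packet found


key = b"toy secret"
digest = hashlib.sha512(b"hello world").digest()
message = [
    ("one-pass-signature", {"hash": "sha512", "signer": "toy-key"}),
    ("literal-data", b"hello "),
    ("literal-data", b"world"),
    ("signature", {"hash": "sha512", "value": hmac.new(key, digest, "sha512").digest()}),
]
assert verify_one_pass(message, key)
```
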
(prefixed-signature)=
### Prefixed signed message
A {term}`prefixed signed message` consists of {term}`signature packet(s)<signature packet>` followed by the message. For the verifier, processing a prefixed signed message is as convenient as processing a one-pass signed message. However, on the signer's side, it takes more resources to generate a {term}`prefixed signed message`.
#### Structure
In this format, the signature packets are stored ahead of the message itself:
1. **{term}`Data signature packets<OpenPGP Signature Packet>`**: These one or more packets contain the {term}`cryptographic signature` corresponding to the original data.
2. [**Literal data packet**](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#lit): This contains the original data (e.g., the body of a message), without additional interpretation or conversion.
```{figure} plain_svg/prefixed-signed-message.svg
:name: fig-prefixed-signed-message
:alt: Depicts the structure of a prefixed signed message. As an example, two signature packets lead a literal data packet. Arrows show how the signatures' hash algorithm field is inspected to start the hashing procedure.
Structure of a prefixed signed message.
```
Compared to a {term}`one-pass signed message`, there are no {term}`one-pass signature packets<One-pass Signature Packet>` in this format, and the (otherwise equivalent) {term}`signature packet(s)<signature packet>` are stored ahead of the signed data.
For verification, this form is as convenient as the one-pass signed message form.
However, when a signer creates a {term}`prefixed signed message`, the signed data must be processed twice:
- once reading it to calculate the cryptographic signature, and
- once more to store the data in the generated OpenPGP message, after the signature packet(s).
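
The extra cost on the signer's side can be seen in a toy Python model of prefixed signing. As assumptions for illustration, packets are represented as `(kind, payload)` tuples, an HMAC is swapped in for the public-key signature, and the signature metadata is not hashed. Note that the source must be seekable (or buffered in full), because it is read twice.

```python
import hashlib
import hmac
import io


def sign_prefixed(source, key: bytes, hash_name: str = "sha512", chunk_size: int = 4):
    """Yield (kind, payload) tuples for a toy prefixed signed message."""
    # Pass 1: read all of the data just to compute the digest.
    hasher = hashlib.new(hash_name)
    while chunk := source.read(chunk_size):
        hasher.update(chunk)
    tag = hmac.new(key, hasher.digest(), hash_name).digest()  # stand-in signature
    yield ("signature", {"hash": hash_name, "value": tag})

    # Pass 2: rewind and read the data again to emit it after the signature.
    source.seek(0)
    while chunk := source.read(chunk_size):
        yield ("literal-data", chunk)


packets = list(sign_prefixed(io.BytesIO(b"hello world"), key=b"toy secret"))
print([kind for kind, _ in packets])
```
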
(cleartext-signature)=
## Cleartext signatures
The *{term}`Cleartext Signature Framework`* (CSF) in OpenPGP accomplishes two primary objectives:
- maintaining the message in a human-readable cleartext format, accessible without OpenPGP-specific software
- incorporating an {term}`OpenPGP signature<OpenPGP Signature Packet>` for {term}`authentication` by users with OpenPGP-compatible software
### Example
The following is a detailed example of a {term}`cleartext signature`:
```text
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

hello world
-----BEGIN PGP SIGNATURE-----

wpgGARsKAAAAKQWCZT0vBCIhBtB7JOyRoU3SQKwtU+bIqeBUlJpBIi6nOFdu0Zyu
o9yZAAAAANqgIHAzoRTzu/7Zuxc8Izf4r3/qSCmBfDqWzTXqmVtsSBSHACka3qbN
eehqu8H6S0UK8V7yHbpVhExu9Hu72jWEzU/B0h9MR5gDhJPoWurx8YfyXBDsRS4y
r13/eqMN8kfCDw==
=Ks9w
-----END PGP SIGNATURE-----
```
This {term}`signature<Cleartext Signature>` consists of two parts: a message ("hello world") and an ASCII-armored {term}`OpenPGP signature<OpenPGP Signature Packet>`. The message is immediately comprehensible to a human reader, while the {term}`signature<OpenPGP Signature Packet>` block allows OpenPGP software to perform {term}`verification` of the message's {term}`authenticity<Authentication>`.
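
To make the two parts concrete, here is a minimal Python sketch that splits a cleartext signed document into its headers, message lines, and armored signature block. It assumes the layout described in the specification, in which an empty line separates the `Hash` header(s) from the text. The function name `split_cleartext` is hypothetical, the sample uses a placeholder instead of real signature data, and the sketch neither reverses dash-escaping nor verifies anything.

```python
BEGIN_MESSAGE = "-----BEGIN PGP SIGNED MESSAGE-----"
BEGIN_SIGNATURE = "-----BEGIN PGP SIGNATURE-----"


def split_cleartext(document: str):
    """Split a cleartext signed document into (headers, message_lines, armor_lines)."""
    lines = document.splitlines()
    if not lines or lines[0] != BEGIN_MESSAGE:
        raise ValueError("not a cleartext signed message")
    sig_start = lines.index(BEGIN_SIGNATURE)  # raises ValueError if missing
    blank = lines.index("")                   # the empty line ends the headers
    if blank > sig_start:
        raise ValueError("no empty line between headers and text")
    return lines[1:blank], lines[blank + 1:sig_start], lines[sig_start:]


sample = "\n".join([
    BEGIN_MESSAGE,
    "Hash: SHA512",
    "",
    "hello world",
    BEGIN_SIGNATURE,
    "",
    "(armored signature data)",
    "-----END PGP SIGNATURE-----",
])
headers, message, armor = split_cleartext(sample)
print(headers, message)
```
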
### Use case
{term}`Cleartext signatures<Cleartext Signature>` combine the advantages of both {term}`detached<Detached Signature>` and {term}`inline signatures<Inline Signature>`:
- **Self-contained format**: {term}`Cleartext signatures<Cleartext signature>` enable the message and its {term}`signature<OpenPGP Signature Packet>` to be stored as a single file.
- **Human readability**: The message within a {term}`cleartext signature` remains accessible in a plain text format. This eliminates the need for specialized software to read the message content.
These features are particularly beneficial in scenarios where signed messages are managed semi-manually and where existing system infrastructure offers limited or no native support for OpenPGP in the workflow[^arch-certifications].
[^arch-certifications]: An illustrative example is the workflow adopted by Arch Linux to {term}`certify<Certification>` {term}`User IDs<User ID>` of new packagers. This process relies on [cleartext signed statements from existing packagers](https://gitlab.archlinux.org/archlinux/archlinux-keyring/-/blob/master/.gitlab/issue_templates/New%20Packager%20Key.md?ref_type=heads&plain=1#L33-46). These signed statements are stored as attachments in an issue tracking system for later inspection. The advantage of this approach lies in the convenience of having the message and signature in a single file, which simplifies manual handling. Based on the vouches in these {term}`cleartext signed<Cleartext Signature>` messages and an [email confirmation from the new packager](https://gitlab.archlinux.org/archlinux/archlinux-keyring/-/wikis/workflows/verify-a-packager-key), the main key operators can issue {term}`OpenPGP third-party certifications<Third-party Identity Certification>`.
### Text transformations for cleartext signatures
The {term}`cleartext signature framework` includes specific text normalization procedures to ensure the integrity and clarity of the message:
- **Escaping dashes**: The framework implements a method of [dash-escaped text](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-dash-escaped-text) within the message. Dash-escaping ensures that the parser correctly distinguishes between the armor headers, which are part of the {term}`signature<OpenPGP Signature Packet>`'s structure, and any lines in the message that happen to start with a dash.
- **Normalization of line endings**: Consistent with the approach for any other [text signature](data-signature-types), a {term}`cleartext signature` is calculated on the text with normalized line endings (`<CR><LF>`). This ensures that the {term}`signature<OpenPGP Signature Packet>` remains valid regardless of the text format of the receiving {term}`implementation<OpenPGP Implementation>`.
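
Dash-escaping itself is a small transformation: every line that starts with a dash is prefixed with a dash and a space, and the prefix is stripped again when the message is read back. The following Python sketch shows both directions; the function names are hypothetical, and optional behaviors (such as escaping additional lines or warning about malformed input) are left out.

```python
def dash_escape(text: str) -> str:
    """Prefix every line that starts with '-' with '- '."""
    return "\n".join(
        "- " + line if line.startswith("-") else line
        for line in text.split("\n")
    )


def dash_unescape(text: str) -> str:
    """Strip a leading '- ' from every line that has one."""
    return "\n".join(
        line[2:] if line.startswith("- ") else line
        for line in text.split("\n")
    )


original = "-----BEGIN fake header-----\nplain line\n- list item"
escaped = dash_escape(original)
print(escaped)
assert dash_unescape(escaped) == original
```

After escaping, no line of the message body can be mistaken for an armor header line, because none of them starts with five dashes anymore.
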
### Pitfalls
Despite their widespread adoption, {term}`cleartext signatures<Cleartext Signature>` have their limitations and are sometimes viewed as a "legacy method"[^csf-gnupg]. The {term}`RFC` details the [pitfalls of cleartext signatures](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-issues-with-the-cleartext-s), such as incompatibility with semantically meaningful whitespace, challenges with large messages, and security vulnerabilities related to misleading Hash header manipulations. Given these issues, safer alternatives like {term}`inline<Inline Signature>` and {term}`detached signature` forms are advised.
[^csf-gnupg]: https://lists.gnupg.org/pipermail/gnupg-devel/2023-November/035428.html