diff --git a/book/source/07-signing_data.md b/book/source/07-signing_data.md
index 1741fb2..7331a3a 100644
--- a/book/source/07-signing_data.md
+++ b/book/source/07-signing_data.md
@@ -6,71 +6,88 @@ SPDX-License-Identifier: CC-BY-SA-4.0
 (signing_data)=
 # Signatures over data

-A data signature guarantees the authenticity (and implicitly also the integrity) of a message, e.g., an email or a file.
+A *data signature* guarantees the authenticity (and implicitly also the integrity) of some data. Typical use cases for data signatures in OpenPGP are signatures for software packages or emails.

-More specifically, when we say "authenticity", we mean that the signature guarantees that whoever controls the signing key material has issued that signature. The question of who controls that key material is a separate concern. We might independently want to verify that our intended communication partner uses the cryptographic identity in question.
+When we say "authenticity," here, we mean that the signature guarantees that whoever controls the signing key material has issued the signature.

-Note that signatures over data are different from {ref}`component_signatures_chapter`, which are used to attach metadata or subkeys to a certificate.
+It is a separate question if the party we expect indeed controls the signer certificate. OpenPGP does offer mechanisms for *strong authentication* of the connection between certificates and identities. So, if necessary, we can also verify that our intended communication partner really uses the cryptographic identity that issued the signature[^sign-auth].

-Typical use cases for signatures over data in OpenPGP are signatures for software packages or emails.
+[^sign-auth]: Other signing solutions, such as [signify](https://flak.tedunangst.com/post/signify), typically only offer a solution for pure signing, without offering a mechanism for strong authentication of the identity of the signer.

-When signing data, OpenPGP offers mechanisms for strong authentication, based on bindings between certificates and identities, and the option to verify those bindings.
+Data signatures can only be issued by component keys that carry the *signing* [key flag](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-key-flags).

-Other signing solutions, such as [signify](https://flak.tedunangst.com/post/signify), typically only offer a solution for pure signing, without offering
-a mechanism for authentication.
+Note that signatures over data are distinct from {ref}`component_signatures_chapter`, which are used to attach metadata or subkeys to a certificate.

 ## Signature types

-Two OpenPGP [signature types](signature_types) apply to data signatures:
+Data signatures use one of two OpenPGP [signature types](signature_types):

-- Signature of a binary document (*Binary Signature*, type ID `0x00`): a universal signature type for binary data. This signature type is typically used for files or data streams.
+- "Signature of a binary document" (*Binary Signature*, type ID `0x00`): A universal signature type for binary data. Binary signatures are typically used for files or data streams. Binary signatures are calculated over the data "as is", without performing any transformations.

-- Signature of a canonical text document (*Text Signature*, type ID `0x01`): used for textual data, such as email bodies. When calculating a text signature, the data is first normalized by converting line endings into a canonical form (`<CR><LF>`). This normalization mitigates issues caused by platform-specific default text encodings.
-  (This can be useful for detached signatures, when the message file may get re-encoded between signature generation and validation)
+- "Signature of a canonical text document" (*Text Signature*, type ID `0x01`): Used for textual data, such as email bodies. When calculating a text signature, the data is first normalized by converting line endings into a canonical form (`<CR><LF>`). The normalization mitigates issues caused by platform-specific text encodings, for example with detached signatures, where the message file may get re-encoded between signature generation and validation.

-Data signatures are always calculated by a component key that carries the *signing* key flag.
+Data signatures are generated by hashing the message content, plus the metadata in the signature packet, and calculating a cryptographic signature over that hash. The resulting cryptographic signature is stored in an OpenPGP signature packet.

-Data signatures are created by hashing the message content and calculating a cryptographic signature over the hash.
-The resulting cryptographic signature is stored in an OpenPGP signature packet, which can be used in different ways. We'll discuss these in the following sections.
+Data signature packets can be used in three different forms. We'll discuss these in the following section.

 ## Forms of OpenPGP data signatures

-OpenPGP signatures over data can be generated and distributed in three forms[^sign-modes-gpg]:
+OpenPGP signatures over data can be used in three different forms[^sign-modes-gpg]:

 - *Detached*: The signature is a standalone artifact, separate from the signed data.
-- *Inline*: The original data and the signature over the data are stored in an OpenPGP container.
-- *Cleartext signature*: A method to sign text while leaving the original message in a human-readable format.
+- *Inline*: The original data and the signature over the data are collectively stored in an OpenPGP container.
+- *Cleartext signature*: A message in text format and a signature over this message are stored in a combined text-format, which leaves the original message in a human-readable representation.

-[^sign-modes-gpg]: These signature forms correspond with GnuPG's `--detach-sign`, `--sign` and `--clear-sign` modes.
+[^sign-modes-gpg]: These three signature forms correspond with GnuPG's `--detach-sign`, `--sign` and `--clear-sign` modes.

 ### Detached signatures

-This method is especially useful for signing software releases and other files that must not be modified by the signing process.
+A detached signature is produced by calculating an OpenPGP signature over the signed data. The original data is left as is, while the OpenPGP signature is stored as a standalone file. A detached signature can be distributed alongside or independent of the original data. The authenticity and integrity of the original data file can be verified using the detached signature file.

-A detached signature is produced by calculating an OpenPGP signature over a piece of data.
-The resulting OpenPGP signature packet can then be distributed alongside or independent of the original data.
+This signature format is especially useful for signing software releases and other files that must not be modified by the signing process.

 ### Inline signatures

-This method is usually used with signed and/or encrypted emails.
+An inline signature joins the signed data and a signature over this data into one combined OpenPGP message.

-Most clients that support OpenPGP for encrypted and/or signed messages make use of inline-signatures.
-To produce a signature, the entirety of the data needs to be processed by the producer. This has the consequence that an application that efficiently emits signed data can only append the signature at the end of the data stream.
-On the other hand, an application that needs to efficiently verify signed data needs to know the signer's public key and used hash algorithm before processing the data.
-To solve this issue, so-called One-Pass Signature packets are prefixed to the signed data. Those are small packets containing the fingerprint of the signing key, as well as the used hash algorithm. This is all the information a receiving application needs to know to initiate the verification process.
+This method is usually used with signed and/or encrypted emails. Most software that supports OpenPGP for encrypted and/or signed messages uses inline-signatures.

-To produce an inline-signed message, the original data is first wrapped in a [Literal Data packet](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#lit), which is prefixed with one or more [One-Pass Signature packets](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#one-pass-sig), and affixed with the corresponding signature packets.
-The verifying application can read the One-Pass Signature packets and initiate the verification process.
-The literal data can then be processed, such that the signatures at the end of the message can be verified in *one pass*.
+#### Structure

-TODO: explain nesting of OPSs.
+An inline-signed OpenPGP message consists of three segments:
+
+- One or more [One-Pass Signature packets](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#one-pass-sig),
+- the original data, wrapped in a [Literal Data packet](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#lit),
+- the corresponding Data Signature packets.
+
+#### Creation
+
+To produce an inline signature, the signer processes the entirety of the data by reading from an input file and writing into an output OpenPGP message file. The signer calculates a cryptographic signature over the course of this process. Therefore, an efficient signer can only emit the resulting data signature packet at the end of this process, and thus store it at the end of the data stream.
+
+On the other hand, an efficient verifying application needs to know how to process the literal data before reading it. This is the purpose of the so-called One-Pass Signature packets in the first segment of inline-signed messages. One-Pass Signature packets contain the fingerprint of the signing key, as well as the hash algorithm used to calculate the hash digest for the signature.
+
+```{admonition} TODO
+:class: warning
+
+Is the signer keyid/fingerprint in the OPS important for the verifier to be able to verify the signature efficiently? Or is it (only?) there to be hashed and signed, along with the literal data?
+```
+
+#### Verification
+
+This structure allows verifying applications to verify inline-signed messages in *one pass*:
+
+- The One-Pass Signature packets initiate the verification process,
+- the literal data can then be processed (which means: it gets hashed),
+- the signature packets at the end of the message can be verified against the hash digest that the previous step calculated.
+
+Note that the final step of verifying the cryptographic signature requires access to the signer's public key material. This public key material is not included in the signed message. The verifier must obtain the signer's public key data out-of-band (e.g. by obtaining the signer's certificate from a key server).

 ### Cleartext signatures

-The *Cleartext Signature Framework* (CSF) is a mechanism that combines two goals:
+The *Cleartext Signature Framework* (CSF) is an OpenPGP mechanism that combines two goals:

 - It leaves the message in clear text format, so that it can be viewed directly by a human in a program that knows nothing about OpenPGP.
-- At the same time it adds an OpenPGP signature that allows verification of that message by users whose software supports OpenPGP.
+- At the same time, it adds an OpenPGP signature that allows verification of that message by users whose software supports OpenPGP.

 #### Example

@@ -91,21 +108,31 @@ r13/eqMN8kfCDw==
 -----END PGP SIGNATURE-----
 ```

-The cleartext signature consists of two blocks, which contain the message and a signature, respectively. In this case the message consists of the text "hello world".
+The cleartext signature consists of two blocks, which contain the message and a signature, respectively. In this case, the message consists of the text "hello world".

 Notice that this message is readable by a human reader, without requiring additional software tools, as long as the reader understands which elements to ignore.

-The message is followed by a block that contains an OpenPGP signature for the message, in ASCII armored form. Using OpenPGP software, this signature can be verified.
+The message is followed by a block that contains an ASCII-armored OpenPGP signature for the message. Using this signature, OpenPGP software can verify the authenticity of the message in the first block.

 #### Use-case

 One use-case for cleartext signatures is: Asking someone to sign some piece of data. The person who is asked to sign the data can easily inspect it with simple commandline tools, such as `cat`, and verify that they agree with the data they are asked to sign.

+```{admonition} TODO
+:class: warning
+
+(Ask David for details:)
+
 We use this for example to verify User ID and primary key of Arch Linux packagers before signing the User IDs on their keys with the main signing keys and to verify the data claims when introducing new packagers (i.e. already established packagers vouch for the data of a new packager).
+```

 #### Text transformations for cleartext signatures

-TODO: explain text transforms for cleartext signatures (LF->CRLF etc)
+```{admonition} TODO
+:class: warning
+
+explain text transformations for cleartext signatures (LF->CRLF and additional escaping)
+```

 #### Pitfalls

@@ -115,4 +142,14 @@ At the same time, they are considered a "legacy method"[^csf-gnupg] by some.

 [^csf-gnupg]: https://lists.gnupg.org/pipermail/gnupg-devel/2023-November/035428.html

-The RFC points out a number of [pitfalls of cleartext signatures](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-issues-with-the-cleartext-s), and advises that in many cases, the inline and detached signature forms are preferable.
+The RFC points out a number of specific [pitfalls of cleartext signatures](https://www.ietf.org/archive/id/draft-ietf-openpgp-crypto-refresh-12.html#name-issues-with-the-cleartext-s), and how to avoid them. It advises that in many cases, the inline and detached signature forms are preferable.
+
+## Advanced topics
+
+### Nesting of one-pass signatures
+
+```{admonition} TODO
+:class: warning
+
+Write
+```
\ No newline at end of file
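
The line-ending normalization that the text-signature bullet in this diff describes can be illustrated with a minimal Python sketch. It is not taken from the patch or from any OpenPGP library: the helper name `canonicalize_line_endings` is hypothetical, and the sketch assumes that only `LF` and `CRLF` line endings occur in the input (a lone `CR` is left untouched). It covers only the normalization step. A real text signature also hashes the signature packet metadata, which is omitted here.

```python
def canonicalize_line_endings(data: bytes) -> bytes:
    """Convert LF and CRLF line endings to CRLF (hypothetical helper)."""
    # Collapse existing CRLF pairs to LF first, so they are not doubled,
    # then expand every LF to CRLF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


unix_text = b"hello\nworld\n"
windows_text = b"hello\r\nworld\r\n"

# Both platform variants normalize to the same bytes, so a hash calculated
# over the normalized form does not depend on the platform's line endings.
assert canonicalize_line_endings(unix_text) == canonicalize_line_endings(windows_text)
```

Because both variants yield identical bytes after normalization, a detached text signature should still verify after the message file has been re-encoded between platforms, which is the situation the revised bullet mentions.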