<p>Andrea Corbellini, https://andrea.corbellini.name/, Sun, 19 Nov 2023 16:33:00 +0000</p>
<p><strong>Running the operating system that you're currently using in a virtual machine (with Secure Boot and TPM emulation)</strong>, https://andrea.corbellini.name/2023/11/19/running-current-os-inside-vm/</p>
<p>In this article I will show you how to start your current operating system
inside a virtual machine. That is: launching the operating system (with all
your settings, files, and everything), inside a virtual machine, while you’re
using it.</p>
<p>This article was written for Ubuntu, but it can be easily adapted to other
distributions, and with appropriate care it can be adapted to non-Linux kernels
and operating systems as well.</p>
<h1 id="motivation">Motivation</h1>
<p>Before we start, why would a sane person want to do this in the first place?
Well, here’s why I did it:</p>
<ul>
<li>
<p><strong>To test changes that affect Secure Boot without a reboot.</strong></p>
<p>Recently I was doing some experiments with Secure Boot and the Trusted
Platform Module (TPM) on a new laptop, and I got frustrated by how time
consuming it was to test changes to the boot chain. Every time I modified a
file involved during boot, I would need to reboot, then log in, then
re-open my terminal windows and files to make more modifications… Plus,
whenever I screwed up, I would need to manually recover my system, which
would be even more time consuming.</p>
<p>I thought that I could speed up my experiments by using a virtual machine
instead.</p>
</li>
<li>
<p><strong>To predict the future TPM state (in particular, the values of PCRs 4, 5,
8, and 9) after a change, without a reboot.</strong></p>
<p>I wanted to predict the values of my TPM PCR banks after making changes to
the bootloader, kernel, and initrd. Writing a script to calculate the PCR
values automatically is in principle not that hard (and I actually did it
before, in a different context), but I wanted a robust, generic solution
that would work on most systems and in most situations, and emulation was
the natural choice.</p>
</li>
<li>
<p>And, of course, <strong>just for the fun of it!</strong></p>
</li>
</ul>
<p>To be honest, I’m not a big fan of Secure Boot. The reason why I’ve been
working on it is simply that it’s the standard nowadays and so I have to stick
with it. Also, there are no real alternatives out there to achieve the same
goals. I’ll write an article about Secure Boot in the future to explain the
reasons why I don’t like it, and how to make it work better, but that’s another
story…</p>
<h1 id="procedure">Procedure</h1>
<p>The procedure that I’m going to describe has 3 main steps:</p>
<ol>
<li>create a copy of your drive</li>
<li>emulate a TPM device using swtpm</li>
<li>emulate the system with QEMU</li>
</ol>
<p>I’ve tested this procedure on Ubuntu 23.04 (Lunar) and 23.10 (Mantic), but it
should work on any Linux distribution with minimal adjustments. The general
approach can be used for any operating system, as long as appropriate
replacements for QEMU and swtpm exist.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>Before we can start, we need to install:</p>
<ul>
<li><a href="https://www.qemu.org/">QEMU</a>: a virtual machine emulator</li>
<li><a href="https://github.com/stefanberger/swtpm/wiki">swtpm</a>: a TPM emulator</li>
<li><a href="https://wiki.ubuntu.com/UEFI/OVMF">OVMF</a>: a UEFI firmware implementation</li>
</ul>
<p>On a recent version of Ubuntu, these can be installed with:</p>
<div class="highlight"><pre><span></span><code>sudo apt install qemu-system-x86 ovmf swtpm
</code></pre></div>
<p>Note that OVMF only supports the x86_64 architecture, so we can only emulate
that. If you run a different architecture, you’ll need to find another UEFI
implementation that is not OVMF (but I’m not aware of any freely available
ones).</p>
<h2 id="create-a-copy-of-your-drive">Create a copy of your drive</h2>
<p>We can decide to either:</p>
<ul>
<li>
<p><strong>Choice #1: <a href="#early-boot-components-only">run only the components involved early at
boot</a> (shim, bootloader, kernel, initrd).</strong> This
is useful if you, like me, only need to test those components and how they
affect Secure Boot and the TPM, and don’t really care about the rest (the
init process, login manager, …).</p>
</li>
<li>
<p><strong>Choice #2: <a href="#entire-system">run the entire operating system</a>.</strong> This can
give you a fully usable operating system running inside the virtual machine,
but may also result in some instability inside the guest (because we’re
giving it a filesystem that is in use), and may also lead to some data loss
if we’re not careful and make typos. Use with care!</p>
</li>
</ul>
<h3 id="early-boot-components-only">Choice #1: Early boot components only</h3>
<p>If we’re interested in the early boot components only, then we need to make a
copy of the following from our drive: the GPT partition table, the EFI partition,
and the <code>/boot</code> partition (if we have one). Usually all these 3 pieces are at
the “start” of the drive, but this is not always the case.</p>
<p>To figure out where the partitions are located, run:</p>
<div class="highlight"><pre><span></span><code>sudo parted -l
</code></pre></div>
<p>On my system, this is the output:</p>
<div class="highlight"><pre><span></span><code>Model: WD_BLACK SN750 2TB (nvme)
Disk /dev/nvme0n1: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name  Flags
 1      1049kB  525MB   524MB   fat32              boot, esp
 2      525MB   1599MB  1074MB  ext4
 3      1599MB  2000GB  1999GB                     lvm
</code></pre></div>
<p>In my case, the partition number 1 is the EFI partition, and the partition
number 2 is the <code>/boot</code> partition. If you’re not sure what partitions to look
for, run <code>mount | grep -e /boot -e /efi</code>. Note that, on some distributions
(most notably the ones that use <code>systemd-boot</code>), a <code>/boot</code> partition may not
exist, so you can leave that out in that case.</p>
<p>Anyway, in my case, I need to copy the first 1599 MB of my drive, because
that’s where the data I’m interested in ends: those first 1599 MB contain the
GPT partition table (which is always at the start of the drive), the EFI
partition, and the <code>/boot</code> partition.</p>
<p>Now that we have identified how many bytes to copy, we can copy them to a file
named <code>drive.img</code> with <code>dd</code> (maybe after running <code>sync</code> to make sure that all
changes have been committed):</p>
<div class="highlight"><pre><span></span><code># replace '/dev/nvme0n1' with your main drive (which may be '/dev/sda' instead),
# and 'count' with the number of MBs to copy
sync && sudo -g disk dd if=/dev/nvme0n1 of=drive.img bs=1M count=1599 conv=sparse
</code></pre></div>
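<p>One detail worth keeping in mind: <code>parted</code> prints sizes in decimal megabytes
(1 MB = 1,000,000 bytes), while <code>bs=1M</code> in <code>dd</code> means 1 MiB (1,048,576 bytes).
Using <code>count=1599</code> therefore copies about 1677 MB, slightly more than needed, which is
harmless. If you prefer to compute the smallest block count that still covers a
given byte offset, here is a minimal sketch. The function name is hypothetical, and
the exact byte offset is assumed to come from <code>sudo parted /dev/nvme0n1 unit B print</code>
(the default output is rounded):</p>

```shell
# Hypothetical helper (not from the article): number of 1 MiB blocks needed
# to cover the first $1 bytes of a drive, rounding up.
mib_blocks_for_bytes() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Example: a /boot partition ending at byte offset 1599000000.
# The result can be passed to dd as 'count=...' together with 'bs=1M'.
mib_blocks_for_bytes 1599000000
```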
<h3 id="entire-system">Choice #2: Entire system</h3>
<p>If we want to run our entire system in a virtual machine, then I would
recommend creating a QEMU copy-on-write (COW) file:</p>
<div class="highlight"><pre><span></span><code># replace '/dev/nvme0n1' with your main drive (which may be '/dev/sda' instead)
sudo -g disk qemu-img create -f qcow2 -b /dev/nvme0n1 -F raw drive.qcow2
</code></pre></div>
<p>This will create a new copy-on-write image using <code>/dev/nvme0n1</code> as its “backing
storage”. Be very careful when running this command: you don’t want to mess up
the order of the arguments, or you might end up writing to your storage device
(leading to data loss)!</p>
<p>The advantage of using a copy-on-write file, as opposed to copying the whole
drive, is that this is much faster. Also, if we had to copy the entire drive,
we might not even have enough space for it (even when using sparse files).</p>
<p>The big drawback of using a copy-on-write file is that, because our main drive
likely contains filesystems that are mounted read-write, any modification to
the filesystems on the host may be perceived as data corruption on the guest,
and that in turn may cause all sorts of bad consequences inside the guest,
including kernel panics.</p>
<p>Another drawback is that, with this solution, later we will need to give QEMU
permission to <em>read</em> our drive, and if we’re not careful enough with the
commands we type (e.g. we swap the order of some arguments, or make some
typos), we may potentially end up <em>writing</em> to the drive instead.</p>
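<p>To reduce the risk of swapping arguments, one option is to wrap the command in a
small guard that refuses to continue when the output path already exists (which is
always the case for a real drive). This is only a sketch: <code>cow_overlay_cmd</code> is a
hypothetical name, and the function prints the command for review instead of
running it.</p>

```shell
# Hypothetical guard (not part of QEMU): $1 is the backing drive, $2 is the
# overlay file to create. Nothing is executed; the command is only printed.
cow_overlay_cmd() {
    if [ -e "$2" ]; then
        echo "refusing: '$2' already exists" >&2
        return 1
    fi
    printf 'sudo -g disk qemu-img create -f qcow2 -b %s -F raw %s\n' "$1" "$2"
}

# Possible usage: review the printed command, then run it by hand.
#   cow_overlay_cmd /dev/nvme0n1 drive.qcow2
```

<p>If the two arguments are swapped by mistake, the overlay path points at an existing
device and the function returns an error instead of printing a command.</p>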
<h2 id="emulate-a-tpm-device-using-swtpm">Emulate a TPM device using swtpm</h2>
<p>There are various ways to run the swtpm emulator. Here I will use the “vTPM
proxy” way, which is not the easiest, but has the advantage that the emulated
device will look like a real TPM device not only to the guest, but also to the
host, so that we can inspect its PCR banks (among other things) from the host
using familiar tools like <code>tpm2_pcrread</code>.</p>
<p>First, enable the <code>tpm_vtpm_proxy</code> module (which is not enabled by default on
Ubuntu):</p>
<div class="highlight"><pre><span></span><code>sudo modprobe tpm_vtpm_proxy
</code></pre></div>
<p>If that worked, we should have a <code>/dev/vtpmx</code> device. We can verify its
presence with:</p>
<div class="highlight"><pre><span></span><code>ls /dev/vtpmx
</code></pre></div>
<p>swtpm in “vTPM proxy” mode will interact with <code>/dev/vtpmx</code>, but in order to do
so it needs the <code>sys_admin</code> capability. On Ubuntu, swtpm ships with this
capability explicitly disabled by AppArmor, but we can enable it with:</p>
<div class="highlight"><pre><span></span><code>sudo sh -c "echo ' capability sys_admin,' > /etc/apparmor.d/local/usr.bin.swtpm"
sudo systemctl reload apparmor
</code></pre></div>
<p>Now that <code>/dev/vtpmx</code> is present, and swtpm can talk to it, we can run swtpm
in “vTPM proxy” mode:</p>
<div class="highlight"><pre><span></span><code>sudo mkdir /tmp/swtpm-state
sudo swtpm chardev --tpmstate dir=/tmp/swtpm-state --vtpm-proxy --tpm2
</code></pre></div>
<p>Upon start, swtpm should create a new <code>/dev/tpmN</code> device and print its name on
the terminal. On my system, I already have a real TPM on <code>/dev/tpm0</code>, and
therefore swtpm allocates <code>/dev/tpm1</code>.</p>
<p>The emulated TPM device will need to be readable and writeable by QEMU, but the
emulated TPM device is by default accessible only by root, so either we run
QEMU as root (not recommended), or we relax the permissions on the device:</p>
<div class="highlight"><pre><span></span><code># replace '/dev/tpm1' with the device created by swtpm
sudo chmod a+rw /dev/tpm1
</code></pre></div>
<p>Make sure not to accidentally change the permissions of your real TPM device!</p>
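<p>One way to double check which device belongs to swtpm is to list the
<code>/dev/tpmN</code> devices before and after starting it, and look at the difference.
The helpers below are a sketch with hypothetical names; they are not part of swtpm.</p>

```shell
# Hypothetical helpers: snapshot the /dev/tpmN devices, then print only the
# entries that appear in the second snapshot but not in the first.
list_tpm_devices() {
    for dev in /dev/tpm[0-9]*; do
        [ -e "$dev" ] && echo "$dev"
    done
    return 0
}

new_devices() {
    # $1: newline-separated list taken before, $2: list taken after
    printf '%s\n' "$2" | grep -vxF -e "$1"
}

# Possible usage:
#   before=$(list_tpm_devices)
#   ... start swtpm in another terminal ...
#   after=$(list_tpm_devices)
#   new_devices "$before" "$after"
```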
<h2 id="emulate-the-system-with-qemu">Emulate the system with QEMU</h2>
<p>Inside the QEMU emulator, we will run the OVMF UEFI firmware. On Ubuntu, the
firmware comes in 2 flavors:</p>
<ul>
<li>with Secure Boot enabled (<code>/usr/share/OVMF/OVMF_CODE_4M.ms.fd</code>), and</li>
<li>with Secure Boot disabled (<code>/usr/share/OVMF/OVMF_CODE_4M.fd</code>)</li>
</ul>
<p>(There are actually even more flavors, see <a href="https://askubuntu.com/q/1409590">this AskUbuntu
question</a> for the details.)</p>
<p>In the commands that follow I’m going to use the Secure Boot flavor, but if you
need to disable Secure Boot in your guest, just replace <code>.ms.fd</code> with <code>.fd</code> in
all the commands below.</p>
<p>To use OVMF, first we need to copy the EFI variables to a file that can be read
& written by QEMU:</p>
<div class="highlight"><pre><span></span><code>cp /usr/share/OVMF/OVMF_VARS_4M.ms.fd /tmp/
</code></pre></div>
<p>This file (<code>/tmp/OVMF_VARS_4M.ms.fd</code>) will be the equivalent of the EFI flash
storage, and it’s where OVMF will read and store its configuration, which is
why we need to make a copy of it (to avoid modifications to the original file).</p>
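<p>If you switch between the two flavors often, the file names can be derived from a
single setting. A minimal sketch, assuming the Ubuntu paths mentioned above
(<code>ovmf_paths</code> is a hypothetical name):</p>

```shell
# Hypothetical helper: print the OVMF code path and the variable-store
# template path for the chosen flavor.
# $1 = "on" selects the Secure Boot flavor; anything else selects the other.
ovmf_paths() {
    if [ "$1" = on ]; then
        suffix=.ms.fd
    else
        suffix=.fd
    fi
    echo "/usr/share/OVMF/OVMF_CODE_4M$suffix"
    echo "/usr/share/OVMF/OVMF_VARS_4M$suffix"
}

# Possible usage: copy the variable store template for the chosen flavor.
#   cp "$(ovmf_paths on | sed -n 2p)" /tmp/
```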
<p>Now we’re ready to run QEMU:</p>
<ul>
<li>
<p>If you <a href="#early-boot-components-only">copied only the early boot files (choice
#1)</a>:</p>
<div class="highlight"><pre><span></span><code># replace '/dev/tpm1' with the device created by swtpm
qemu-system-x86_64 \
-accel kvm \
-machine q35,smm=on \
-cpu host \
-smp cores=4,threads=1 \
-m 4096 \
-vga virtio \
-global driver=cfi.pflash01,property=secure,value=on \
-drive if=pflash,unit=0,format=raw,file=/usr/share/OVMF/OVMF_CODE_4M.ms.fd,readonly=on \
-drive if=pflash,unit=1,format=raw,file=/tmp/OVMF_VARS_4M.ms.fd \
-drive if=virtio,format=raw,file=drive.img \
-tpmdev passthrough,id=tpm0,path=/dev/tpm1,cancel-path=/dev/null \
-device tpm-tis,tpmdev=tpm0
</code></pre></div>
</li>
<li>
<p>If you have <a href="#entire-system">a copy-on-write file for the entire system (choice
#2)</a>:</p>
<div class="highlight"><pre><span></span><code># replace '/dev/tpm1' with the device created by swtpm
sudo -g disk qemu-system-x86_64 \
-accel kvm \
-machine q35,smm=on \
-cpu host \
-smp cores=4,threads=1 \
-m 4096 \
-vga virtio \
-global driver=cfi.pflash01,property=secure,value=on \
-drive if=pflash,unit=0,format=raw,file=/usr/share/OVMF/OVMF_CODE_4M.ms.fd,readonly=on \
-drive if=pflash,unit=1,format=raw,file=/tmp/OVMF_VARS_4M.ms.fd \
-drive if=virtio,format=qcow2,file=drive.qcow2 \
-tpmdev passthrough,id=tpm0,path=/dev/tpm1,cancel-path=/dev/null \
-device tpm-tis,tpmdev=tpm0
</code></pre></div>
<p>Note that this last command makes QEMU run as the <code>disk</code> group: on Ubuntu,
this group has the permission to read <em>and write</em> all storage devices, so
be careful when running this command, or you risk losing your files
forever! If you want to add more safety, you may consider using an
<a href="https://manpages.ubuntu.com/manpages/mantic/en/man5/acl.5.html">ACL</a> to
give the user running QEMU read-only permission to your backing storage.</p>
</li>
</ul>
<p>In either case, after launching QEMU, our operating system should boot…
while running inside itself!</p>
<p>In some circumstances though it may happen that the wrong operating system is
booted, or that you end up at the EFI setup screen. This can happen if your
system is not configured to boot from the “first” EFI entry listed in the EFI
partition. Because the boot order is not recorded anywhere on the storage
device (it’s recorded in the EFI flash memory), of course OVMF won’t know which
operating system you intended to boot, and will just attempt to launch the
first one it finds. You can use the EFI setup screen provided by OVMF to change
the boot order in the way you like. After that, changes will be saved into the
<code>/tmp/OVMF_VARS_4M.ms.fd</code> file on the host: you should keep a copy of that file
so that, next time you launch QEMU, you’ll boot directly into your operating
system.</p>
<h2 id="reading-pcr-banks-after-boot">Reading PCR banks after boot</h2>
<p>Once our operating system has launched inside QEMU, and after the boot process
is complete, the PCR banks will be filled and recorded by swtpm.</p>
<p>If we choose to <a href="#early-boot-components-only">copy only the early boot files (choice
#1)</a>, then of course our operating system won’t be
<em>fully</em> booted: it’ll likely hang waiting for the root filesystem to appear,
and may eventually drop to the initrd shell. None of that really matters if all
we want is to see the PCR values stored by the bootloader.</p>
<p>Before we can extract those PCR values, we first need to stop QEMU (Ctrl-C is
fine), and then we can read them with <code>tpm2_pcrread</code>:</p>
<div class="highlight"><pre><span></span><code># replace '/dev/tpm1' with the device created by swtpm
tpm2_pcrread -T device:/dev/tpm1
</code></pre></div>
<p>Using the method described here in this article, PCRs 4, 5, 8, and 9 inside the
emulated TPM <em>should</em> match the PCRs in our real TPM. And here comes an
interesting application of this method: if we upgrade our bootloader or kernel,
and we want to know the <em>future</em> PCR values that our system will have after
reboot, we can simply follow this procedure and obtain those PCR values without
shutting down our system! This can be especially useful if we use TPM sealing:
we can reseal our secrets against the predicted values, so that they can still
be unsealed after the next reboot.</p>
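<p>To check whether the emulated PCRs really match the real ones, the two outputs can
be saved to files and compared. The sketch below makes a few assumptions: the
<code>sha256</code> bank is the one of interest, reading the real TPM requires
<code>sudo</code>, and <code>compare_pcr_dumps</code> is a hypothetical helper name.</p>

```shell
# Hypothetical helper: compare two saved tpm2_pcrread outputs byte by byte.
compare_pcr_dumps() {
    if cmp -s "$1" "$2"; then
        echo "PCR values match"
    else
        echo "PCR values differ"
        return 1
    fi
}

# Possible usage (the selection 'sha256:4,5,8,9' limits the output to the
# PCRs discussed in this article):
#   sudo tpm2_pcrread sha256:4,5,8,9 > real.txt
#   tpm2_pcrread -T device:/dev/tpm1 sha256:4,5,8,9 > emulated.txt
#   compare_pcr_dumps real.txt emulated.txt
```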
<h2 id="restarting-the-virtual-machine">Restarting the virtual machine</h2>
<p>If we want to restart the guest inside the virtual machine, and obtain a
consistent TPM state every time, we should start from a “clean” state every
time, which means:</p>
<ol>
<li>restart swtpm</li>
<li>recreate the <code>drive.img</code> or <code>drive.qcow2</code> file</li>
<li>launch QEMU again</li>
</ol>
<p>If we don’t restart swtpm, the virtual TPM state (and in particular the PCR
banks) won’t be cleared, and new PCR measurements will simply be added on top
of the existing state. If we don’t recreate the drive file, it’s possible that
some modifications to the filesystems will have an impact on the future PCR
measurements.</p>
<p>We don’t necessarily need to recreate the <code>/tmp/OVMF_VARS_4M.ms.fd</code> file every
time. In fact, if you need to modify any EFI setting to make your system
bootable, you might want to preserve it so that you don’t need to change EFI
settings at every boot.</p>
<h1 id="automating-the-entire-process">Automating the entire process</h1>
<p>I’m (very slowly) working on turning this entire procedure into a script, so
that everything can be automated. Once I find some time I’ll finish the script
and publish it, so if you liked this article, stay tuned, and let me know if
you have any comment/suggestion/improvement/critique!</p>
<p><strong>How to run Remark42 on Fly.io</strong>, https://andrea.corbellini.name/2023/09/19/running-remark42-on-flyio/</p>
<p>As I wrote in my <a href="https://andrea.corbellini.name/2023/09/05/disqus-to-remark42/">previous post</a>, I
recently switched from <a href="https://disqus.com/">Disqus</a> to
<a href="https://remark42.com/">Remark42</a> for the comments on my blog. Here I will
explain how I set it up on <a href="https://fly.io/">Fly.io</a>.</p>
<h1 id="overview">Overview</h1>
<p>The setup that I ended up with looks like the following:</p>
<figure>
<img src="https://andrea.corbellini.name/images/remark42-setup.svg" alt="Diagram of the components for the Remark42 setup">
</figure>
<p>Something to note about this setup is that the “machine” (more on that later)
and the storage volume are both a single instance. This is not a distributed
setup. This is because Remark42 stores comments in a single file and does not
make use of a distributed database. This is listed as a “feature” on the
<a href="https://remark42.com/">Remark42 website</a>. How one is supposed to implement
replication? I have no idea. Thankfully Fly.io seems to be fast to provision
machines, and the Remark42 daemon also seems fast to start, so hopefully if a
problem occurs (or when updates are required), the downtime will be minimal.</p>
<p>It is imperative however to understand that, because of the
non-distributed/non-replicated nature of this setup, backups should be made
periodically to avoid the risk of losing your comments forever.</p>
<h1 id="preliminaries">Preliminaries</h1>
<p>Before setting up Remark42, I had never used <a href="https://fly.io/">Fly.io</a>.
As a Fly.io newbie, I would describe it as a cloud provider focused on Docker
containers. Fly.io uses some concepts (like “apps” and “machines”) that make
sense after you practice a bit with them, but as a beginner they are not the
easiest to learn. Most of the complexity I think comes from the fact that the
Fly.io documentation is poorly written. On top of that, it appears that Fly.io
is migrating their offering from “V1 apps” to “V2 apps”, and today some
documentation applies only to “V1 apps”, other pieces apply only to “V2 apps”,
resulting in a big mess. The error messages you get are also far from clear.</p>
<p>But don’t get too scared: once you get to know Fly.io, it can actually be fun
to use.</p>
<p>Creating resources on Fly.io requires installing their command line client:
<code>flyctl</code>. Because I do not like to run unknown software unconfined, I <a href="https://snapcraft.io/andrea-flyctl">packaged
it as a snap</a> that you can install using:</p>
<div class="highlight"><pre><span></span><code>snap<span class="w"> </span>install<span class="w"> </span>andrea-flyctl
</code></pre></div>
<p>Another source of confusion that I had at the beginning was that, by reading the
documentation, it looked like a second command line tool named <code>fly</code> was needed
in addition to <code>flyctl</code>. It turns out that <code>fly</code> and <code>flyctl</code> are the same
thing, it’s just that they’re transitioning from one name to another. If you
installed the tool through the snap, you can set up these aliases so that you
can copy and paste commands without trouble:</p>
<div class="highlight"><pre><span></span><code><span class="nb">alias</span><span class="w"> </span><span class="nv">fly</span><span class="o">=</span>/snap/bin/andrea-flyctl.fly
<span class="nb">alias</span><span class="w"> </span><span class="nv">flyctl</span><span class="o">=</span>/snap/bin/andrea-flyctl.fly
</code></pre></div>
<p>According to the documentation (and assuming it’s up-to-date), <code>flyctl</code> does
not support everything that Fly.io supports, so sometimes <code>curl</code> is used to
interact directly with the Fly.io API. In order to use that, you’ll need to
download an authentication token from the Fly.io interface and store it in a
file (that I’ll call <code>~/fly-token</code> from now on).</p>
<p>I’m going to skip over the steps to create and configure a Fly.io account
and to obtain an authentication token, as those were easy steps in my opinion.</p>
<h1 id="creating-a-machine">Creating a machine</h1>
<p>A Fly.io “machine” is a virtual machine running a single Docker container with
a persistent volume attached to it. In order to create my Fly.io machine to run
Remark42 in it, I loosely followed this page from the Fly.io documentation:
<a href="https://fly.io/docs/machines/guides-examples/functions-with-machines/">Run User Code on Fly Machines
</a>.
“Loosely” because it turned out that some pieces on that page are not fully
correct, but anyway…</p>
<p>Before creating a machine, you first need to create an “app”. A Fly.io app is
basically an endpoint, which consists of a DNS name (in the form
<code>${app_name}.fly.dev</code>), and a set of IP addresses. Behind these IP addresses
there are Fly.io load balancers that will forward requests to the machines
inside the app.</p>
<p>You can create an app through the API like this:</p>
<div class="highlight"><pre><span></span><code>curl<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"Authorization: Bearer </span><span class="k">$(</span><~/fly-token<span class="k">)</span><span class="s2">"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s1">'Content-Type: application/json'</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span><span class="s1">'https://api.machines.dev/v1/apps'</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">'{ "app_name": "${app_name}", "org_slug": "personal" }'</span>
</code></pre></div>
<p>(Replace <code>${app_name}</code> with some identifier of your choice; I chose <code>remark42</code>
without knowing that this would have removed the possibility for other people
to register an app with the same name.)</p>
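<p>Because the request body above is wrapped in single quotes, the shell does not
expand <code>${app_name}</code>: the placeholder has to be replaced by hand. As an
alternative, the body can be built from variables. This is a minimal sketch:
<code>app_payload</code> is a hypothetical helper, and it assumes that the names contain
no characters that need JSON escaping.</p>

```shell
# Hypothetical helper: build the JSON body for the "create app" request.
# $1 = app name, $2 = organization slug (no JSON escaping is performed).
app_payload() {
    printf '{ "app_name": "%s", "org_slug": "%s" }\n' "$1" "$2"
}

# Possible usage:
#   curl -X POST \
#     -H "Authorization: Bearer $(cat ~/fly-token)" \
#     -H 'Content-Type: application/json' \
#     'https://api.machines.dev/v1/apps' \
#     -d "$(app_payload my-app personal)"
```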
<p>IP addresses need to be manually allocated:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>ips<span class="w"> </span>allocate-v4<span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span>--shared
fly<span class="w"> </span>ips<span class="w"> </span>allocate-v6<span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span>
</code></pre></div>
<p>The <code>--shared</code> option to <code>allocate-v4</code> tells Fly.io to allocate an IP address
that may be shared with other Fly.io apps, even outside of your
account/organization. Remove <code>--shared</code> if you want to use a dedicated IP, but
note that dedicated IPv4 addresses are a paid feature.</p>
<p>Allocating IPs is an important step: it can be done later, after creating the
machine, but it must be done, otherwise your machine will be unreachable and it
won’t be obvious why.</p>
<p>You should now create a persistent volume for your machine:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>volume<span class="w"> </span>create<span class="w"> </span>remark42_db_0<span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span>--size<span class="o">=</span><span class="m">1</span>
</code></pre></div>
<p>This will display a warning about replication, but you can ignore it because,
sadly, Remark42 does not support replication.</p>
<p>Remark42 needs to be given a secret key (I guess for the purpose of signing
<a href="https://en.wikipedia.org/wiki/JSON_Web_Token">JWT tokens</a>). Fly.io has a handy
feature to manage secrets, and make them available to machines, albeit poorly
documented. You can set the Remark42 secret like this:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>secrets<span class="w"> </span><span class="nb">set</span><span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span><span class="nv">SECRET</span><span class="o">=</span><span class="s1">'a very secret string'</span>
</code></pre></div>
<p>(You can generate a random secret string with a command like <code>cat /dev/urandom
| tr -Cd 'a-zA-Z0-9' | head -c64</code>, which means: get some random bytes, keep
only alphanumeric characters, get the first 64 characters.)</p>
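<p>The same idea can be wrapped in a function. The variant below is a sketch with a
hypothetical name (<code>gen_secret</code>): it reads a bounded number of random bytes
instead of an endless stream, on the assumption that 1024 bytes are more than
enough to yield 64 alphanumeric characters (around 248 are expected on average).</p>

```shell
# Hypothetical helper: print a 64-character alphanumeric secret.
gen_secret() {
    # Read 1024 random bytes and keep only the alphanumeric ones.
    _raw=$(head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'a-zA-Z0-9')
    # Keep the first 64 characters.
    printf '%.64s\n' "$_raw"
}

# Possible usage:
#   fly secrets set --app=${app_name} SECRET="$(gen_secret)"
```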
<p>You may be wondering: how is the container running inside the machine supposed
to access this secret? The Fly.io documentation doesn’t say a word about it,
but after experimenting I was able to find that all the app secrets are passed
as environment variables, which is great, because this is exactly what Remark42
expects.</p>
<p><strong>Note: it’s important to set <code>SECRET</code> before creating the machine, or Remark42
will refuse to start.</strong></p>
<p>Now you’re ready to spin up the machine: create a configuration file for it…</p>
<div class="highlight"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"remark42-0"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"config"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"image"</span><span class="p">:</span><span class="w"> </span><span class="s2">"umputun/remark42:latest"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"env"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"SITE"</span><span class="p">:</span><span class="w"> </span><span class="s2">"andrea.corbellini.name"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"REMARK_URL"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://${app_name}.fly.dev"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"ALLOWED_HOSTS"</span><span class="p">:</span><span class="w"> </span><span class="s2">"'self',https://andrea.corbellini.name"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"AUTH_SAME_SITE"</span><span class="p">:</span><span class="w"> </span><span class="s2">"none"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"AUTH_ANON"</span><span class="p">:</span><span class="w"> </span><span class="s2">"true"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"AUTH_EMAIL_ENABLE"</span><span class="p">:</span><span class="w"> </span><span class="s2">"true"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"AUTH_EMAIL_FROM"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Andrea's Blog &lt;hi@andrea.corbellini.name&gt;"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"AUTH_EMAIL_SUBJ"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Andrea's Blog - Email Confirmation"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"NOTIFY_USERS"</span><span class="p">:</span><span class="w"> </span><span class="s2">"email"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"NOTIFY_ADMINS"</span><span class="p">:</span><span class="w"> </span><span class="s2">"email"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"NOTIFY_EMAIL_FROM"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Andrea's Blog &lt;hi@andrea.corbellini.name&gt;"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"ADMIN_SHARED_EMAIL"</span><span class="p">:</span><span class="w"> </span><span class="s2">"corbellini.andrea@gmail.com"</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">"mounts"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"volume"</span><span class="p">:</span><span class="w"> </span><span class="s2">"${volume_id}"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/srv/var"</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">"services"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"ports"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"port"</span><span class="p">:</span><span class="w"> </span><span class="mi">443</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"handlers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="s2">"tls"</span><span class="p">,</span>
<span class="w"> </span><span class="s2">"http"</span>
<span class="w"> </span><span class="p">]</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"port"</span><span class="p">:</span><span class="w"> </span><span class="mi">80</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"handlers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="s2">"http"</span>
<span class="w"> </span><span class="p">]</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">"protocol"</span><span class="p">:</span><span class="w"> </span><span class="s2">"tcp"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"internal_port"</span><span class="p">:</span><span class="w"> </span><span class="mi">8080</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">"checks"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"httpget"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"http"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"port"</span><span class="p">:</span><span class="w"> </span><span class="mi">8080</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"method"</span><span class="p">:</span><span class="w"> </span><span class="s2">"GET"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"path"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/ping"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"interval"</span><span class="p">:</span><span class="w"> </span><span class="s2">"15s"</span><span class="p">,</span>
<span class="w"> </span><span class="nt">"timeout"</span><span class="p">:</span><span class="w"> </span><span class="s2">"10s"</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="nt">"metadata"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">"fly_platform_version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"v2"</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">}</span>
<span class="p">}</span>
</code></pre></div>
<p>…and give it to Fly.io:</p>
<div class="highlight"><pre><span></span><code>curl<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"Authorization: Bearer </span><span class="k">$(</span><~/fly-token<span class="k">)</span><span class="s2">"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s1">'Content-Type: application/json'</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span><span class="s2">"https://api.machines.dev/v1/apps/</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="s2">/machines"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span>@config.json
</code></pre></div>
<p>There’s a lot here, so let me break it down for you:</p>
<ul>
<li>
<p><code>"image": "umputun/remark42:latest"</code>: this is the Docker image for Remark42.</p>
</li>
<li>
<p><code>"env": { ... }</code>: these are all the environment variables to pass to our
container. They are briefly documented on the <a href="https://remark42.com/docs/configuration/parameters/">Remark42
website</a>, and here’s a
bit more detailed explanation of some of them:</p>
<ul>
<li>
<p><code>"SITE": "andrea.corbellini.name"</code>: this is the internal identifier for the
site. It can be an arbitrary string, it won’t be visible, and you can omit
it.</p>
</li>
<li>
<p><code>"REMARK_URL": "https://${app_name}.fly.dev"</code>: this is the URL where
Remark42 will be serving requests from. I set it to the Fly.io app
endpoint. It’s <strong>important that you do not put a trailing slash</strong>, or
Remark42 will error out later on. It’s also important that the protocol
(http or https) matches your blog’s protocol, or Remark42 will refuse to
display comments (this makes local testing a bit annoying).</p>
</li>
<li>
<p><code>"ALLOWED_HOSTS": "'self',https://andrea.corbellini.name"</code>: this is the
list of sources that will be put into the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy/frame-ancestors"><code>Content-Security-Policy:
frame-ancestors</code>
header</a>
of HTTP responses. Essentially, this defines where the Remark42 comments
can be displayed.</p>
</li>
<li>
<p><code>"AUTH_SAME_SITE": "none"</code>: this disables the <a href="https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie#samesitesamesite-value">“same site” policy for
cookies</a>.
Disabling it is necessary because, in my setup, comments are served from
one domain (<code>remark42.fly.dev</code>) to another domain
(<code>andrea.corbellini.name</code>).</p>
</li>
<li>
<p><code>"AUTH_ANON": "true"</code>: allows anonymous commenters. You may or may not
want it.</p>
</li>
<li>
<p><code>"AUTH_EMAIL_ENABLE": "true"</code> and friends: allows email-based
authentication of commenters.</p>
</li>
<li>
<p><code>"NOTIFY_USERS": "email"</code>: allows readers and commenters to be notified of
new comments via email.</p>
</li>
<li>
<p><code>"NOTIFY_ADMINS": "email"</code> and <code>"ADMIN_SHARED_EMAIL":
"corbellini.andrea@gmail.com"</code>: makes Remark42 send me an email every
time there’s a new comment.</p>
</li>
</ul>
</li>
<li>
<p><code>"mounts": [ ... ]</code>: this tells Fly.io to attach the volume that you created
earlier to the container at the path <code>/srv/var</code>, which is what Remark42 uses
to store its database as well as daily backups.</p>
</li>
<li>
<p><code>"services": [ ... ]</code>: this tells Fly.io what to expose through the load
balancer. With the configuration that I provided, the Fly.io endpoint
(<code>${app_name}.fly.dev</code>) will provide both HTTP and HTTPS to the internet.
However, the load balancer will talk to the machine over plain HTTP on port
8080 (meaning that TLS is terminated at the load balancer).</p>
<p>I think in the future I will set up <a href="https://certbot.eff.org/">certbot</a>
inside the container so that I can do TLS termination on the machine, but
not today.</p>
</li>
<li>
<p><code>"checks": { ... }</code>: this tells Fly.io to check if the Remark42 daemon is
healthy by using its <code>/ping</code> endpoint.</p>
</li>
<li>
<p><code>"metadata": { "fly_platform_version": "v2" }</code>: this tells Fly.io to use a
“V2 machine”, or something like that. <strong>Setting this metadata is very
important, or certain things won’t work later on.</strong> The Fly.io documentation
doesn’t tell you to do it, but this is needed if you need to update the
environment variables or the secrets inside the machine.</p>
</li>
</ul>
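<p>Since a trailing slash or a mismatched protocol in <code>REMARK_URL</code> only causes problems later on, it may be worth checking the value before creating the machine. Here’s a minimal sketch of such a check; <code>check_remark_url</code> is a hypothetical helper (it’s not part of Remark42 or <code>flyctl</code>), and it assumes that your blog is served over HTTPS:</p>
<div class="highlight"><pre><span></span><code># hypothetical helper: succeeds only if the URL looks usable as REMARK_URL
check_remark_url() {
    case "$1" in
        */)        echo "error: '$1' has a trailing slash" >&2; return 1 ;;
        https://*) echo "ok: $1" ;;
        *)         echo "error: '$1' does not use https" >&2; return 1 ;;
    esac
}
</code></pre></div>
<p>You would call it as <code>check_remark_url "https://${app_name}.fly.dev"</code> before writing the value to <code>config.json</code>.</p>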
<p>Note that all of this configuration can be changed at any time, so if you make
any mistakes or you just want to experiment, you don’t have to overly worry.
You can even destroy your machine and recreate it from scratch if you want.</p>
<p>To view the configuration of an existing machine use the following:</p>
<div class="highlight"><pre><span></span><code>curl<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"Authorization: Bearer </span><span class="k">$(</span><~/fly-token<span class="k">)</span><span class="s2">"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span><span class="s2">"https://api.machines.dev/v1/apps/</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="s2">/machines/</span><span class="si">${</span><span class="nv">machine_id</span><span class="si">}</span><span class="s2">"</span>
</code></pre></div>
<p>And to update it:</p>
<div class="highlight"><pre><span></span><code>curl<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"Authorization: Bearer </span><span class="k">$(</span><~/fly-token<span class="k">)</span><span class="s2">"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s1">'Content-Type: application/json'</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span><span class="s2">"https://api.machines.dev/v1/apps/</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="s2">/machines/</span><span class="si">${</span><span class="nv">machine_id</span><span class="si">}</span><span class="s2">"</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span>@new-config.json
</code></pre></div>
<p>I was also successful at changing configuration using <code>fly machines update</code>,
although it can’t be used for everything (for example: it can be used to <em>add</em>
or <em>change</em> environment variables, but not to <em>remove</em> them).</p>
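<p>To remove an environment variable, one option is to go through the API instead: download the current configuration, delete the variable, and post the result back. Here’s a rough sketch of the “delete” step. It assumes that <code>jq</code> is installed and that the machine object returned by the API keeps its settings under a <code>config</code> key with an <code>env</code> object inside; <code>drop_env_var</code> is a hypothetical helper name:</p>
<div class="highlight"><pre><span></span><code># hypothetical helper: reads a machine object on stdin and prints an update
# payload ({"config": ...}) without the given environment variable
drop_env_var() {
    jq --arg name "$1" '{config: (.config | del(.env[$name]))}'
}
</code></pre></div>
<p>It could then be combined with the two commands above, for example to get rid of <code>AUTH_ANON</code>:</p>
<div class="highlight"><pre><span></span><code>curl \
    -H "Authorization: Bearer $(<~/fly-token)" \
    "https://api.machines.dev/v1/apps/${app_name}/machines/${machine_id}" \
    | drop_env_var AUTH_ANON > new-config.json
</code></pre></div>
<p>It’s probably a good idea to review <code>new-config.json</code> by hand before posting it.</p>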
<h1 id="testing-the-setup">Testing the setup</h1>
<p>If everything went well, you should be able to interact with Remark42 at
<code>https://${app_name}.fly.dev/web</code>. This should let you read and post new
comments.</p>
<h1 id="configuring-remark42-to-send-emails">Configuring Remark42 to send emails</h1>
<p>For sending emails, I chose to use <a href="https://elasticemail.com/">Elastic Email</a>,
which is an email-delivery service that supports SMTP with STARTTLS. Creating
an Elastic Email account, setting up
<a href="https://en.wikipedia.org/wiki/DomainKeys_Identified_Mail">DKIM</a> and
<a href="https://en.wikipedia.org/wiki/Sender_Policy_Framework">SPF</a>, and obtaining
SMTP credentials was extremely easy with Elastic Email, so I won’t cover it
here.</p>
<p>Setting up email delivery with Remark42 is pretty easy once you have the SMTP
credentials. Set the necessary (non-secret) configuration like this:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>machines<span class="w"> </span>update<span class="w"> </span><span class="si">${</span><span class="nv">machine_id</span><span class="si">}</span><span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-e<span class="w"> </span><span class="nv">SMTP_HOST</span><span class="o">=</span>smtp.elasticemail.com<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-e<span class="w"> </span><span class="nv">SMTP_PORT</span><span class="o">=</span><span class="m">2525</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-e<span class="w"> </span><span class="nv">SMTP_STARTTLS</span><span class="o">=</span><span class="nb">true</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-e<span class="w"> </span><span class="nv">SMTP_USERNAME</span><span class="o">=</span>...
</code></pre></div>
<p>And then set the SMTP password as a Fly.io secret:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>secrets<span class="w"> </span><span class="nb">set</span><span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span><span class="nv">SMTP_PASSWORD</span><span class="o">=</span><span class="s1">'a very secret password'</span>
</code></pre></div>
<p>Both <code>machines update</code> and <code>secrets set</code> will automatically restart the
machine so that Remark42 can pick up the new configuration. Pretty neat, heh?</p>
<h1 id="configuring-authentication-providers-for-remark42">Configuring authentication providers for Remark42</h1>
<p>Remark42 can let your users log in from a variety of providers, including:
GitHub, Google, Facebook, Telegram, and more. There are specific instructions
for each provider in the <a href="https://remark42.com/docs/configuration/authorization/">Remark42
documentation</a>. There’s
really not much to add on top of what’s already written there. Just remember:
set non-secret environment variables with <code>fly machines update</code>, and set
secrets with <code>fly secrets set</code>.</p>
<h1 id="creating-an-administrator-account">Creating an administrator account</h1>
<p>If you want to be able to moderate comments, you’ll need an administrator
account. With Remark42, this is a 3-step process: first you create an account
(like any other user would do), then you copy the ID of the user you just
created, and lastly you add that user ID to the <code>ADMIN_SHARED_ID</code> environment
variable:</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>machines<span class="w"> </span>update<span class="w"> </span><span class="si">${</span><span class="nv">machine_id</span><span class="si">}</span><span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span>-e<span class="w"> </span><span class="nv">ADMIN_SHARED_ID</span><span class="o">=</span>...
</code></pre></div>
<p>A step-by-step guide is in the <a href="https://remark42.com/docs/manuals/admin-interface/">Remark42
documentation</a>.</p>
<h1 id="importing-comments-from-disqus-or-any-other-platform">Importing comments from Disqus (or any other platform)</h1>
<p>In order to import comments into Remark42, first you need to temporarily set an
“admin password” for Remark42 (here the word “admin” has nothing to do with the
administrator account you just created; it’s a totally separate concept):</p>
<div class="highlight"><pre><span></span><code>fly<span class="w"> </span>secrets<span class="w"> </span><span class="nb">set</span><span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span><span class="nv">ADMIN_PASSWD</span><span class="o">=</span><span class="s1">'this is super secret'</span>
</code></pre></div>
<p>You can now copy your Disqus (or equivalent) backup on the machine and import
it. I could not find an easy way to do it through <code>flyctl</code> (but I also did not
spend too much time looking for an option). I did, however, find a way to open a
console on the machine, so what I did was simply copy and paste the
base64-encoded backup:</p>
<div class="highlight"><pre><span></span><code><span class="c1"># on my laptop</span>
base64<span class="w"> </span><<span class="w"> </span>disqus-export.xml.gz<span class="w"> </span><span class="c1"># copy the output</span>
<span class="c1"># attach to the machine</span>
fly<span class="w"> </span>console<span class="w"> </span>--app<span class="o">=</span><span class="si">${</span><span class="nv">app_name</span><span class="si">}</span><span class="w"> </span>--machine<span class="o">=</span><span class="si">${</span><span class="nv">machine_id</span><span class="si">}</span>
<span class="c1"># on the machine</span>
<span class="nb">cd</span><span class="w"> </span>/srv/var
base64<span class="w"> </span>-d<span class="w"> </span>><span class="w"> </span>disqus-export.xml.gz<span class="w"> </span><span class="c1"># paste the output from earlier</span>
gunzip<span class="w"> </span>disqus-export.xml.gz
import<span class="w"> </span>--provider<span class="o">=</span>disqus<span class="w"> </span>--file<span class="o">=</span>/srv/var/disqus-export.xml<span class="w"> </span>--url<span class="o">=</span>http://localhost:8080
rm<span class="w"> </span>disqus-export.xml
</code></pre></div>
<p><strong>Note: importing comments will clear the Remark42 database.</strong> Any pre-existing
comment will be deleted. See also the <a href="https://remark42.com/docs/backup/migration/">Remark42
documentation</a> for more
information.</p>
<p>Another note: for some reason, my Disqus export referenced my blog posts using
<code>http://</code> URLs instead of <code>https://</code>. Because of that, Remark42 did correctly
import all the Disqus comments into its database, but would not display them
under my blog posts. Remember: Remark42 is very picky when it comes to URL
schemes. To fix this, I simply <a href="https://remark42.com/docs/backup/backup/">created a backup from
Remark42</a>, modified the backup to
change all <code>http</code> entries to <code>https</code>, and then <a href="https://remark42.com/docs/backup/restore/">restored the
backup</a>. This was quite trivial
given that the format used by the backups is extremely intuitive.</p>
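<p>For reference, the rewrite itself can be done with a one-line <code>sed</code>. This is only a sketch: it assumes that, once decompressed, the backup is plain text in which every post URL appears as a quoted string starting with <code>http://</code> followed by the host name. <code>fix_scheme</code> is a hypothetical helper name, so check what your backup actually looks like before using it:</p>
<div class="highlight"><pre><span></span><code># hypothetical helper: reads an uncompressed backup on stdin and rewrites the
# quoted http:// URLs of the given host to https://
fix_scheme() {
    sed "s#\"http://$1#\"https://$1#g"
}
</code></pre></div>
<p>With a gzipped backup, it could be used like this (the file names are placeholders):</p>
<div class="highlight"><pre><span></span><code>gunzip -c backup.gz | fix_scheme andrea.corbellini.name | gzip > backup-fixed.gz
</code></pre></div>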
<h1 id="final-remarks">Final remarks</h1>
<p>That was it!</p>
<p>Setting up Remark42 on Fly.io wasn’t particularly difficult, but it took me way
more time than expected due to the poor documentation of both Remark42 and
Fly.io. I had to resort to trial-and-error multiple times to make things work.</p>
<p>One big drawback of Remark42 is that it does not allow replication. This means
that:</p>
<ul>
<li>if the machine running my instance of Remark42 goes down, or becomes
unreachable for any reason, there will be downtime;</li>
<li>some people who are “far away” from the Remark42 instance may experience
higher latency than others;</li>
<li>I need to periodically take backups of my Remark42 database and copy it
somewhere, otherwise if my single storage volume is lost, I will lose all the
comments.</li>
</ul>
<p>Nonetheless I think both Remark42 and Fly.io are very interesting products. I
love Remark42’s features, and Fly.io is easy enough to use once you get
familiar with it. I think I’m gonna stick with them for a long time.</p>andreacorbelliniTue, 19 Sep 2023 02:12:00 +0000tag:andrea.corbellini.name,2023-09-19:/2023/09/19/running-remark42-on-flyio/miscMy journey from Disqus to Remark42https://andrea.corbellini.name/2023/09/05/disqus-to-remark42/<p>Readers of this blog might have noticed a few changes recently. For example,
I’ve been working on improving the look of the blog (maybe with questionable
results), as well as improving the experience on mobile. But one of the biggest
changes that perhaps some have noticed is that all of the comments on all of my
articles suddenly disappeared in February 2023. Now, almost 7 months
later, all comments have finally been restored.</p>
<p>The reason for this 7-month blackout of comments is that I decided to change
the platform that hosts comments: I got rid of <a href="https://disqus.com/">Disqus</a>,
and eventually replaced it with <a href="https://remark42.com/">Remark42</a>. Here I will
describe why I did it. There will be another (more technical) blog post about
my new setup.</p>
<h1 id="premise">Premise</h1>
<p>My blog is a static website that has been using Disqus as a commenting platform
for a long time: since at least 2015 (8 years ago), or maybe even more (back
when my blog was on WordPress). Disqus at that time was gaining a lot of
popularity, it was free, and it was very attractive to me because it was easy to
set up. I might be wrong, but at that time, Disqus did not look to me like the
data-hungry, privacy-invading, revenue-oriented company that it is today. Maybe
I was just naive, but I kept using Disqus all these years without paying too
much attention to it: after all, it worked, so why would I spend any time
thinking about it?</p>
<h1 id="advertisements-on-my-blog">Advertisements on my blog!?</h1>
<p>Fast-forward to February 2023: one day, a person very close to me, with the
utmost kindness that characterizes her, came to me and said: “the ads on your
blog suck! They’re the worst kind of ads!”</p>
<p>At the beginning I had no idea what she was talking about. I have never
intentionally run any sort of advertisements on my blog. I hate advertisements!</p>
<p>Then I realized what was going on: precisely because I hate advertisements, I
run ad-blockers on all my devices. Maybe <em>there were</em> ads on my blog, but I
never noticed because I block those ads. The only third-party service that I
used to run on my blog was Disqus, so I immediately turned my attention to it.
I disabled my ad-blockers, refreshed my blog, scrolled down to the comments
section, and… the sad truth was revealed: Disqus was showing ads to my
readers. And yes, those ads were some of the worst kinds of ads.</p>
<p>And I knew that, together with those ads, there was massive tracking,
collection of data, and maybe even data sharing with third parties. People who
know me, know that I deeply care about privacy, and having Disqus on my blog
tracking my readers was the complete opposite of what I wanted.</p>
<p>I was extremely disappointed.</p>
<h1 id="leaving-disqus">Leaving Disqus</h1>
<p>I did some quick research and I discovered that (1) I could not disable Disqus
ads without paying, and (2) Disqus was no longer that nice commenting platform
that I met in 2015. It had mutated into something obsessed with revenue, and
it was clear that their business model was completely based on ads. My fears
about tracking were <a href="https://techcrunch.com/2021/05/05/disqus-facing-3m-fine-in-norway-for-tracking-users-without-consent/">quickly
confirmed</a>.
Let’s just say that Disqus turned out to be something that does not really align
with my values.</p>
<p>I made the difficult decision to completely <a href="https://github.com/andreacorbellini/andreacorbellini.github.io/commit/4f0e450d31441cab387a0e70f884fb65f19693fd">remove
Disqus</a>
from my blog on the same day. But I firmly believe that <strong><a href="https://blog.codinghorror.com/a-blog-without-comments-is-not-a-blog/">a blog without
comments is not a
blog</a></strong>,
and so I <em>had</em> to find an alternative.</p>
<h1 id="looking-for-a-new-platform">Looking for a new platform</h1>
<p>I quickly started to look for new commenting platforms that could replace
Disqus. The basic criteria that this new platform had to meet were (in no
particular order):</p>
<ul>
<li>be free of charge</li>
<li>display only comments, no ads</li>
<li>respect the privacy of users</li>
<li>allow users to comment anonymously (at least to some extent)</li>
</ul>
<p>The last time that I searched for a commenting platform was in 2015. Back in
those days, there were not many solutions, and that’s one reason why I ended up
with Disqus. I thought: 8 years have passed since then, surely the space must
have improved, and alternatives must be proliferating, right? Well, no, not
really. I struggled to find a managed platform that met those criteria.</p>
<p>I did find some solutions that were using Mastodon or GitHub as a backend to
store comments, but I did not like at all the idea of forcing my readers to
have a Mastodon or GitHub account to comment on my blog.</p>
<h1 id="trying-cactus-comments">Trying Cactus Comments</h1>
<p>One platform that came up multiple times during my search was <a href="https://cactus.chat/">Cactus
Comments</a>. Quoting the homepage of the project:</p>
<blockquote>
<p>Cactus Comments is a federated comment system built on Matrix. It respects
your privacy, and puts you in control. The entire thing is completely free
and open source.</p>
</blockquote>
<p>That sounded interesting, although I did not really know what
<a href="https://matrix.org/">Matrix</a> was to begin with (if you, like me earlier this
year, do not know what Matrix is: it is a team communication platform, somewhat
similar to <a href="https://slack.com/">Slack</a>). I thought that I could give Cactus a
try. So, a few days after removing Disqus, I onboarded on Cactus Comments.</p>
<p>Onboarding was not hard, but it was not trivial either, mostly because I was
not familiar with Matrix. The frontend shown to readers was a bit
disappointing: even though Matrix supports threads, Cactus Comments does not.
Overall, the number of features that commenters could use was scarce: people
could only post a comment, and not much else; they had no ability to edit their
comments, or delete them. But it did allow people to post even without creating
a Matrix account, and that was great for me.</p>
<p>The “administrative interface” (if we can call it this way) was also
disappointing. All the administration and moderation had to be done through
Matrix, sometimes by communicating with a bot, and could not be done by
clicking buttons on my blog. Every blog post had to have its own Matrix channel
and I (the author) had to manually join each channel in order to get some sort
of notification for new comments.</p>
<p>I needed a Matrix client to spot new comments, and to perform moderation
actions, and I chose <a href="https://element.io/">Element</a> for that purpose. Sadly,
Element was totally unreadable on small displays like my phone. And apparently
there’s no web-based Matrix client that works well on mobile. I could have
installed an app for my phone, but I <em>hate</em> installing apps, especially for
activities that can in theory be done through a web browser.</p>
<p>Cactus Comments also did not support importing comments from Disqus, so moving
to this platform meant that all the conversations that happened over the years
on my blog were lost. But because Cactus Comments is free & open source
software, I thought that I could add support for importing comments from Disqus
if I decided to settle with Cactus Comments, so this was not a deal breaker.</p>
<p>Overall my experience with Cactus Comments was not great, but I was willing to
accept that in exchange for a platform that was free, managed by someone else,
and respecting the privacy of my readers.</p>
<p>There was however one big problem that eventually led me to remove Cactus
Comments from my blog: Cactus did not support sending email notifications. This
meant that if you left a comment on this blog, I would not get notified. And if
I responded to your comment, you would not get notified. In order to spot new
comments, I had to check the Matrix channels periodically, and readers had to
check my blog periodically. Maybe if I installed a Matrix app I could have
received push notifications on my phone, but that’s not what I wanted, and this
wouldn’t have solved the problem for my commenters anyway.</p>
<p>I was pretty bad at checking for new comments on Cactus. What happened multiple
times is that people would leave comments or questions on my blog, but I
wouldn’t notice until 2 weeks later. At that point, it was pointless for me to
respond because so much time had passed that those commenters surely wouldn’t
be checking my blog for a response…</p>
<p>I would say that with Cactus I had a blog that allowed <em>comments</em>, but did not
allow <em>conversations</em>. Not allowing conversations made the comments pointless
in my opinion. I might as well have had no comments at all: at least people
would stop leaving questions there that were destined to be unanswered, and
instead they would have emailed me directly.</p>
<h1 id="meet-remark42">Meet Remark42</h1>
<p>Between August and September 2023, I decided that I had to restart my quest for
a commenting platform. This time I knew that I had to look for a solution that
I had to install and manage myself. I was not super-excited about it, but from
my first search for a Disqus alternative, I couldn’t find any managed solution
that I really liked.</p>
<p>Initially I thought about writing my own commenting platform in Rust with a
key-value store, but then I figured that if I looked for software to install
instead of a managed platform, maybe I could find something I liked.</p>
<p>After some research, I decided to go with <a href="https://remark42.com/">Remark42</a>.
There were a few contenders, but Remark42 won because it looked like it had
all of the features I needed, and more:</p>
<ul>
<li>it supports sending of email notifications, both to me, and to my readers;</li>
<li>it supports various authentication mechanisms, including: email, GitHub,
Google, Facebook, etc. (it’s nice to give commenters a choice);</li>
<li>it supports leaving comments anonymously, without logging in or leaving an
email address;</li>
<li>commenters can edit and delete comments;</li>
<li>it supports importing comments from Disqus;</li>
<li>in fact, it supports importing comments from any platform: the format it uses
for restoring backups is JSON-based and very easy to replicate (in theory I
could import the comments from Cactus, even though I have not done that yet);</li>
<li>it’s privacy-focused, and it looks like it’s implemented with security in
mind.</li>
</ul>
<p>I decided to host it on <a href="https://fly.io/">Fly.io</a>, which offers some compute
and storage capacity for free. I was introduced to Fly.io on Mastodon, but I
had never used it before.</p>
<p>For sending emails, I chose <a href="https://elasticemail.com/">Elastic Email</a>, which
also offers the features I needed for free. I also had never used this service
before, and did not know much about it: it showed up while searching for a free
SMTP provider. Elastic Email describes itself as a marketing service, which
does not sound great from the point of view of privacy, but I figured that all
the emails being sent here contain only public information (all comments are
public after all), so there’s not much to protect besides email addresses. And
people are free to use temporary email providers like
<a href="https://www.mailinator.com/">Mailinator</a> if they don’t want to leave their
real email, or even leave no email address at all. (Should I be concerned about
Elastic Email, like I should have been concerned about Disqus? Let me know…
in the <a href="#comments">comments</a> below.)</p>
<p>Setting up Remark42 on Fly.io was relatively easy, but it took me way longer
than I had expected, mostly because the Fly.io documentation was quite
inconsistent and confusing, and also the Remark42 documentation was not fully
clear. In the end I managed to make everything work and I’m pretty happy with
the setup I ended up with. I’m going to publish details about my setup in a
future blog post, in case you’re interested (update: said blog post is now
<a href="https://andrea.corbellini.name/2023/09/19/running-remark42-on-flyio/">published</a>).</p>
<h1 id="conclusion">Conclusion</h1>
<p>That’s all I have to say for now! Remark42 has been running on my blog for a
few days, so it’s too early for me to say whether I’ll stick with it or I will
look for a new solution, but so far it looks very promising, and I’m very happy
with it. I hope this is the beginning of a long journey!</p>andreacorbelliniTue, 05 Sep 2023 08:30:00 +0000tag:andrea.corbellini.name,2023-09-05:/2023/09/05/disqus-to-remark42/miscOn ignoring mistakes, resilience, and the hidden dangers thereinhttps://andrea.corbellini.name/2023/03/18/mistakes/<p>As a scuba diver who often explores new places, I can say that I have found
myself in some dangerous situations, but I always made it back to the surface
without facing any negative consequences. Does this mean that I never made any
mistakes? Absolutely not: mistakes were made, and lessons were learned.</p>
<p>We can all agree that learning from mistakes is good, but sometimes, when
mistakes happen and consequences don’t manifest themselves immediately, we run
the risk of not noticing them, not learning from them, repeating them, and over
time developing a false sense of confidence, which can drive us to believe that
our repeated mistakes are actually good practices.</p>
<p>Why do we ignore mistakes? Because sometimes outcomes are positive even if we
make mistakes. “I made it out of the water this time too, this means that my dive
was executed perfectly.” This is a common way of reasoning, but in reality,
things are much more complex than that. There is a difference between <em>correct
execution</em> and <em>successful outcome</em>, and the two should not be confused. In
fact, everyone should know from experience that goals can be achieved even if
the execution was sloppy and full of mistakes. Catastrophic consequences may
happen if we fail to see that.</p>
<p>An example of the consequences of ignoring mistakes is given by the two space
shuttle disasters: the <a href="https://en.wikipedia.org/wiki/Space_Shuttle_Challenger_disaster">Challenger disaster of
1986</a>, and the
<a href="https://en.wikipedia.org/wiki/Space_Shuttle_Columbia_disaster">Columbia disaster of
2003</a>. Both
these instances were caused by NASA leadership ignoring the concerns from the
engineering teams. Problems that occurred in previous shuttle launches should
have been a wake-up call for NASA leadership. Instead, all the previous
successful launches and re-entries <em>despite</em> the problems were seen as
accomplishments, and nourished the leadership’s overconfidence. “We have made
it this time too, this means that all those concerns that engineers raised were
excessive.”</p>
<p>The tendency to deviate from proper procedures, dismiss valid concerns,
and ignore problems has a name: it’s called <a href="https://en.wikipedia.org/wiki/Normalization_of_deviance">normalization of
deviance</a>. The driving
force of normalization of deviance is overconfidence and the false belief that
positive outcomes are inherently caused by correct executions.</p>
<p>Overconfidence and normalization of deviance can spread like a virus in an
organization. It is important to be vigilant for signs of overconfidence in
individuals, before it infects other people. I once had to deal with a manager
who was a self-declared micromanager (and proud to be) but lacked technical
foundations and knowledge of the product. He would consistently and quickly
dismiss anything that he did not understand, and focus on short-term goals of
questionable usefulness. Whenever his team would accomplish a goal, he would
send a pumped-up announcement, often containing inaccuracies, and carefully
skipping over the shortcomings of the solutions implemented. Given the apparent
success of this management style, other managers started to follow his example.
Soon after (in less than a year), the entire organization became a toxic
environment where raising even the minimal concern was seen as an attack on the
“great new vision”.</p>
<p>I see many parallels between this manager story and what is happening with
‘Twitter 2.0’ right now (although, I must say, in my case engineers did not get
fired on the spot for speaking the truth). And with that manager, just like
with ‘Twitter 2.0’, whenever problems occurred, those problems would either be
ignored or blamed on the preexisting components built before the manager
joined, never on the new, careless developments.</p>
<p>The truth, however, was that the problems that occurred had been predicted weeks
or months before, but concerns around them had been promptly dismissed due to
being too challenging to address, and because “everything works right now, so
that’s not a concern”.</p>
<p>The idea that everything must be correct because everything works, goals are
achieved, and outcomes are successful, is a dangerous idea that can potentially
have catastrophic consequences. It’s important to be critical and analytical,
regardless of the outcome. This does not mean that success shouldn’t be
celebrated, but that mistakes should be captured so that lessons can be learned
from them, <em>even if</em> the final outcome was successful. Not learning from
mistakes does not allow us to advance, and on the contrary can only lead us to
repeat them. And if we keep repeating the same mistakes, sooner or later, those
will have some negative consequences.</p>
<p>A common practice in the aviation industry is to write reports on incidents,
close calls, and near misses, whenever they occur, even if the flight was
concluded successfully and no injuries or damages occurred. These reports are
collected in databases like the <a href="https://asrs.arc.nasa.gov/">Aviation Safety Reporting
System</a> (which can be freely consulted online), so
that flight safety experts and regulators can identify common failure scenarios
and eventually introduce mechanisms to improve safety in the aviation industry.
A key element of these reports is that they are not meant to put the blame on
certain people, but rather focus on what chain of events led to a certain
mistake. “Human mistake” is generally not a valid root cause: if a human was
able to make a mistake, it means that a mechanism is missing that can either
prevent the mistake or detect it before it causes any negative consequences.</p>
<p>Some companies in other industries have similar processes for writing reports
or retrospectives when a mistake happens (regardless of the outcome), with the
goal of finding proper root causes and preventing future mistakes. Amazon with
its <a href="https://aws.amazon.com/blogs/mt/why-you-should-develop-a-correction-of-error-coe/">Correction of
Error</a>
practice is a famous example.</p>
<p>I think introducing these practices in an organization can help to establish a
healthy culture where finding mistakes and raising concerns is encouraged,
rather than suppressed. However, these practices, by themselves, may not be
enough to ensure that such a culture can be maintained over time, because
people can always disagree on what is considered a ‘mistake’. Empathy is
probably the key to a truly healthy culture that allows people to learn and
advance.</p>
<p>There are also cases where we are aware of problems, and we see them as such,
but we deliberately choose not to do anything about them. This is where
resilience comes into play.</p>
<p>Resilience is generally a good quality to have. Resilience can give us the
strength to go through long-term hardships, and can have positive effects on
our tenacity and determination. But even resilience, when taken to the extreme,
can be dangerous. Resilience can lead us to ignore problems, and not react to
them. Resilience can make us tolerate a negative situation, without finding a
proper strategy to cope with it.</p>
<p>Poor planning forces you to consistently work extra hours? Resist and keep
going, until you burn out. The relationship with your partner doesn’t satisfy
you? Resist and think that things will get better, while the relationship
slowly deteriorates. Feel pain in your knee every time you run for more than 30
minutes? Resist and don’t go to see a doctor, the pain will go away, sooner or
later… until you cannot run anymore.</p>
<p>When we let resilience become an excuse to avoid solving problems, we can end
up in situations from which it’s difficult to recover.</p>
<p>It’s important to make a distinction between what is under our control and what
is not. We can fix problems that are under our control, but in situations where
we cannot directly change the course of things, finding an alternative strategy
is the only way. Resisting and hoping that things will get better often does
not give the expected outcome–on the contrary, it can be detrimental.</p>
<p>In the end, I think that the ‘practices’ of ignoring mistakes (because of the
overconfidence built from successful outcomes) and ignoring problems (because of
resilience taken to the extreme) are hidden time bombs, silently ticking,
waiting for the right conditions before exploding. We need to be aware that
just because things seem to work today, it doesn’t mean that we’re making the
right decisions, and this can have consequences in the future. Being critical,
analytical, empathetic, and honest is important to avoid these behaviors and
the dangers that come with them.</p>andreacorbelliniSat, 18 Mar 2023 08:20:00 +0000tag:andrea.corbellini.name,2023-03-18:/2023/03/18/mistakes/miscAuthenticated encryption: why you need it and how it workshttps://andrea.corbellini.name/2023/03/09/authenticated-encryption/<p>In this article I want to explore a common problem of modern cryptographic
ciphers: malleability. I will explain that problem with some hands-on examples,
and then look in detail at how that problem is solved through the use of
authenticated encryption. I will describe in particular two algorithms that
provide authenticated encryption: ChaCha20-Poly1305 and AES-GCM, and briefly
mention some of their variants.</p>
<h1 id="the-problem">The problem</h1>
<p>If we want to encrypt some data, a very common approach is to use a symmetric
cipher. When we use a symmetric cipher, we hold a <strong>secret key</strong>, which is
generally a sequence of bits chosen at random of some fixed length (nowadays
ranging from 128 to 256 bits). The symmetric cipher takes two inputs: the
secret key, and the message that we want to encrypt, and produces a single
output: a ciphertext. Decryption is the inverse process: it takes the secret
key and the ciphertext as the input and yields back the original message as an
output. With symmetric ciphers, we use the same secret key both to encrypt and
decrypt messages, and this is why they are called symmetric (this is in
contrast with public key cryptography, or asymmetric cryptography, where
encryption and decryption are performed using two different keys: a public key
and a private key).</p>
<p>Generally speaking, symmetric ciphers can be divided into two big families:</p>
<ul>
<li><strong>stream ciphers</strong>, which can encrypt data bit-by-bit;</li>
<li><strong>block ciphers</strong>, which can encrypt data block-by-block, where a block has a
fixed size (usually 128 bits).</li>
</ul>
<p>As we will discover soon, both families exhibit the same fundamental
problem, although they slightly differ in the way this problem manifests
itself. To understand this problem, let’s take a close look at how these two
families of algorithms work and how we can manipulate the ciphertexts they
produce.</p>
<h2 id="stream-ciphers">Stream ciphers</h2>
<p>A good way to think of a stream cipher is as a deterministic random number
generator that yields a sequence of random bits. The <strong>secret key</strong> can be
thought of as the <em>seed</em> for the random number generator. Every time we
initialize the random number generator with the same secret key, we will get
exactly the same sequence of random bits out of it.</p>
<p>The bits coming out of the random number generator can then be XOR-ed together
with the data that we want to encrypt: <code><span
style="color:#204a87">ciphertext</span> = <span style="color:#a40000">random
sequence</span> XOR <span style="color:#4e9a06">message</span></code>, like in
the following example:</p>
<pre>
<span style="color:#a40000">random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F</span>
⊕
<span style="color:#4e9a06">message: I would really like an ice cream right now</span>
=
<span style="color:#204a87">ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1</span>
</pre>
<p>The XOR operator acts as a toggle that can either flip bits or keep them
unchanged. Let me explain with an example:</p>
<ul>
<li><code>a XOR 0 = a</code></li>
<li><code>a XOR 1 = NOT a</code></li>
</ul>
<p>If we XOR “something” with a 0 bit, we get “something” out; if we XOR
“something” with a 1 bit, we get the opposite of “something”. And if we use the
same toggle twice, we return to the initial state:</p>
<ul>
<li><code>a XOR b XOR b = a</code></li>
</ul>
<p>This works for any <code>a</code> and any <code>b</code> and it’s due to the fact that <code>b XOR b</code> is
always equal to 0. In more technical terms, every value is its own inverse
under the XOR operator (it is <em>self-inverse</em>).</p>
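<p>The toggle rules and the self-inverse property above can be checked
exhaustively on single bytes. This is only a small sketch: the helper name
<code>xor_bytes</code> is made up for illustration and does not come from any
library.</p>

```python
# XOR on integers uses the ^ operator; here we check the toggle properties
# on every possible byte value.
for a in range(256):
    assert a ^ 0x00 == a            # XOR with 0 bits keeps every bit unchanged
    assert a ^ 0xFF == (~a & 0xFF)  # XOR with 1 bits flips every bit (NOT a)
    for b in range(256):
        assert b ^ b == 0           # any value XOR-ed with itself gives 0
        assert a ^ b ^ b == a       # using the same toggle twice is a no-op


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two byte strings of equal length, byte by byte."""
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(left, right))
```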
<p>The self-inverse property gives us a way to decrypt the message that we
encrypted above: all we have to do is to replay the random sequence and XOR it
together with the ciphertext:</p>
<pre>
<span style="color:#a40000">random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F</span>
⊕
<span style="color:#204a87">ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1</span>
=
<span style="color:#4e9a06">message: I would really like an ice cream right now</span>
</pre>
<p>This works because <code><span style="color:#204a87">ciphertext</span> = <span
style="color:#a40000">random sequence</span> XOR <span
style="color:#4e9a06">message</span></code>, therefore <code><span
style="color:#a40000">random sequence</span> XOR <span
style="color:#204a87">ciphertext</span> = <span style="color:#a40000">random
sequence</span> XOR <span style="color:#a40000">random sequence</span> XOR
<span style="color:#4e9a06">message</span></code>. The two <code><span
style="color:#a40000">random sequence</span></code> are the same, so they
cancel each other (self-inverse), leaving only <code><span
style="color:#4e9a06">message</span></code>:</p>
<pre>
<span style="color:#a40000">random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F</span>
⊕
<span style="color:#a40000">random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F</span>
⊕
<span style="color:#4e9a06">message: I would really like an ice cream right now</span>
=
<span style="color:#4e9a06">message: I would really like an ice cream right now</span>
</pre>
<p>Only the owner of the secret key will be able to generate the random sequence,
therefore only the owner of the secret key should, in theory, be able to
recover the message using this method.</p>
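<p>The whole scheme fits in a few lines of Python. The sketch below is a toy,
not a real cipher: it assumes that SHAKE-256 (from the standard
<code>hashlib</code> module) can stand in for the deterministic random number
generator, and the names <code>keystream</code> and
<code>xor_with_keystream</code> are invented for this example.</p>

```python
import hashlib


def keystream(secret_key: bytes, length: int) -> bytes:
    """Deterministically expand the key into `length` pseudo-random bytes.

    SHAKE-256 is only a stand-in here for a real stream cipher.
    """
    return hashlib.shake_256(secret_key).digest(length)


def xor_with_keystream(secret_key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: both are the same XOR with the same keystream."""
    stream = keystream(secret_key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


secret_key = bytes.fromhex("0123456789abcdef" * 4)  # example 256-bit key
message = b"I would really like an ice cream right now"

ciphertext = xor_with_keystream(secret_key, message)
recovered = xor_with_keystream(secret_key, ciphertext)
assert recovered == message  # replaying the keystream cancels it out
```

<p>Encryption and decryption are the same function, which is a direct
consequence of the self-inverse property.</p>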
<h3 id="playing-with-stream-ciphers">Playing with stream ciphers</h3>
<p>The self-inverse property not only allows us to recover the message from the
random sequence and the ciphertext, but it also allows us to recover the random
sequence if we can correctly guess the message:</p>
<pre>
<span style="color:#4e9a06">message: I would really like an ice cream right now</span>
⊕
<span style="color:#204a87">ciphertext: zB686Y0H46144HwT9RQQR6vZV1gU1779n390ZCqXV1</span>
=
<span style="color:#a40000">random sequence: 3bAWC5ThFSPXX1W8P94q3XV35TG6CRVTNAPW27Q69F</span>
</pre>
<p>This “feature” opens the door to at least two serious problems. If we are able
to correctly guess the message or a portion of it, then we can:</p>
<ol>
<li><strong>decrypt</strong> other ciphertexts produced by the same secret key (or at least
portions of them, depending on what portions of the random sequence we were
able to recover);</li>
<li><strong>modify</strong> ciphertexts.</li>
</ol>
<p>And we can do all of this without any knowledge of the secret key.</p>
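<p>Here is a sketch of the first problem, under the assumption of a toy stream
cipher that derives its random sequence from the key alone (SHAKE-256 is used
as a stand-in generator, and every name and message below is made up for
illustration). The attacker only uses the two ciphertexts and a correct guess
of the first message.</p>

```python
import hashlib


def toy_encrypt(secret_key: bytes, data: bytes) -> bytes:
    """Toy stream cipher with no nonce: same key, same keystream every time."""
    stream = hashlib.shake_256(secret_key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def xor_bytes(left: bytes, right: bytes) -> bytes:
    """XOR two byte strings, stopping at the end of the shorter one."""
    return bytes(x ^ y for x, y in zip(left, right))


# The victim encrypts two messages with the same key.
victim_key = b"known only to the victim"
first_ciphertext = toy_encrypt(victim_key, b"I would really like an ice cream right now")
second_ciphertext = toy_encrypt(victim_key, b"The safe combination is 4-8-15-16-23-42")

# The attacker guesses the first message and recovers the random sequence...
guessed_message = b"I would really like an ice cream right now"
recovered_keystream = xor_bytes(first_ciphertext, guessed_message)

# ...which decrypts the second ciphertext, up to the recovered length.
leaked_message = xor_bytes(second_ciphertext, recovered_keystream)
assert leaked_message == b"The safe combination is 4-8-15-16-23-42"
```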
<p>The first problem implies that key reuse is forbidden with stream ciphers.
Every time we want to encrypt something with a stream cipher, we need a new
key. This problem is easily solved by the use of a <em>nonce</em> (also known as
<em>initialization vector</em>, <em>IV</em>, or <em>starting variable</em>, <em>SV</em>): a random value
that is generated before every encryption, and that is combined in some way
with the secret key to produce a new value to initialize the random number
generator. If the nonce is unique per encryption, then we can be sufficiently
confident that the random sequence generated will also be unique. The nonce
value does not necessarily need to be kept secret, and needs to be known at
decryption time. Nonces are usually generated at random at encryption time and
stored alongside the ciphertext.</p>
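<p>A minimal sketch of the nonce idea follows. It assumes, purely for
illustration, that the key and the nonce are combined by concatenation and fed
to SHAKE-256 as a stand-in generator; real stream ciphers define their own way
of mixing the two. The function names are hypothetical.</p>

```python
import hashlib
import secrets


def encrypt(secret_key: bytes, message: bytes) -> tuple[bytes, bytes]:
    """Return (nonce, ciphertext); a fresh random nonce is drawn every time."""
    nonce = secrets.token_bytes(16)
    stream = hashlib.shake_256(secret_key + nonce).digest(len(message))
    ciphertext = bytes(m ^ s for m, s in zip(message, stream))
    return nonce, ciphertext


def decrypt(secret_key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    """Rebuild the same random sequence from key + nonce and XOR it away."""
    stream = hashlib.shake_256(secret_key + nonce).digest(len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


secret_key = bytes.fromhex("0123456789abcdef" * 4)  # fixed-length example key
message = b"I would really like an ice cream right now"

# Same key, same message, two encryptions: the nonces (and therefore the
# random sequences and the ciphertexts) differ.
first_nonce, first_ciphertext = encrypt(secret_key, message)
second_nonce, second_ciphertext = encrypt(secret_key, message)
```

<p>The nonce travels in the clear next to the ciphertext; only the key stays
secret.</p>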
<p>The second problem is a bit more subtle: if we have a ciphertext and we can
correctly guess the original message that produced it, we can modify it using
the XOR operator to “cancel” the original message and “insert” a new message,
like in this example:</p>
<pre>
<span style="color:#3465a4">ciphertext: zB686Y0H46144HwT9RQQ<strong style="color:#204a87">R6vZV1gU1779</strong>n390ZCqXV1</span>
⊕
<span style="color:#73d216">message: I would really like <strong style="color:#4e9a06">an ice cream</strong> right now</span>
⊕
<span style="color:#73d216">altered message: I would really like <strong style="color:#4e9a06">to go to bed</strong> right now</span>
=
<span style="color:#3465a4">tampered ciphertext: zB686Y0H46144HwT9RQQ<strong style="color:#204a87">G7vTZt3Yc030</strong>n390ZCqXV1</span>
</pre>
<p>This tampered ciphertext, when decrypted with the secret key, will yield the
altered message <em>without detection!</em></p>
<p>Note that I do not need to know the full message to carry out this technique:
in fact, the following example (where unknown parts have been replaced by
hyphens) produces the same result as the above one:</p>
<pre>
<span style="color:#3465a4">ciphertext: zB686Y0H46144HwT9RQQ<strong style="color:#204a87">R6vZV1gU1779</strong>n390ZCqXV1</span>
⊕
<span style="color:#73d216">message: --------------------<strong style="color:#4e9a06">an ice cream</strong>----------</span>
⊕
<span style="color:#73d216">altered message: --------------------<strong style="color:#4e9a06">to go to bed</strong>----------</span>
=
<span style="color:#3465a4">tampered ciphertext: zB686Y0H46144HwT9RQQ<strong style="color:#204a87">G7vTZt3Yc030</strong>n390ZCqXV1</span>
</pre>
<p>This problem is known as
<a href="https://en.wikipedia.org/wiki/Malleability_(cryptography)"><strong>malleability</strong></a>,
and it’s a serious issue in the real world because most of the messages that we
exchange are in practice relatively easy to guess.</p>
<p>Suppose for example that I have control over a WiFi network, and I can inspect
and alter the internet traffic that passes through it. Suppose that I know that
a person connected to my WiFi network is visiting an e-commerce website and
that they’re interested in a particular item. The traffic that their browser
exchanges with the e-commerce website may be encrypted, and therefore I won’t
be able to decrypt its contents, but I might be able to guess certain parts of
it, like the HTTP headers sent by the website, or some parts of the HTML that
are common to all pages on that website, or even the name and the price of the
item they want to buy. If I can guess that information (which is public
information, not a secret, and it’s generally easy to guess), then I might be
able to alter some parts of the web page, showing them false information, and
altering the price that they see in an attempt to trick them into buying that
item.</p>
<details>
<summary>An example of malleability using ChaCha20 with OpenSSL</summary>
<p>Here’s a practical example of how we can take the output of a stream cipher,
and alter it as we wish without knowledge of the secret key. I’m going to use
the OpenSSL command line interface to encrypt a message with a stream cipher:
<a href="https://en.wikipedia.org/wiki/ChaCha20">ChaCha20</a>. This is a modern, fast,
stream cipher with a good reputation:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-chacha20<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span><<span class="o">(</span><span class="nb">echo</span><span class="w"> </span><span class="s1">'I would really like an ice cream right now'</span><span class="o">)</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-out<span class="w"> </span>ciphertext.bin
</code></pre></div>
<p>The <code>-K</code> option specifies the key in hexadecimal format (256 bits, or 32 bytes,
or 64 hex characters), the <code>-iv</code> is the nonce, also known as <em>initialization
vector</em> (128 bits, or 16 bytes, or 32 hex characters).</p>
<p>This trivial Python script can tamper with the ciphertext:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'ciphertext.bin'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">ciphertext</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">guessed_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'--------------------an ice cream----------</span><span class="se">\n</span><span class="s1">'</span>
<span class="n">replacement_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'--------------------to go to bed----------</span><span class="se">\n</span><span class="s1">'</span>
<span class="n">tampered_ciphertext</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="n">y</span> <span class="o">^</span> <span class="n">z</span> <span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="ow">in</span>
<span class="nb">zip</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="n">guessed_message</span><span class="p">,</span> <span class="n">replacement_message</span><span class="p">))</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'tampered-ciphertext.bin'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
<span class="n">file</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">tampered_ciphertext</span><span class="p">)</span>
</code></pre></div>
<p>This script is using partial knowledge of the message. It knows (thanks to an
educated guess) that the original message contained the words “an ice cream” at
a specific offset, and uses that knowledge to replace those words with new ones
(“to go to bed”) which add up to the same length. Note that this technique
cannot be used to remove or add parts from the message, only to modify them
without changing their length.</p>
<p>Now if we run this script and we decrypt the <code>tampered-ciphertext.bin</code> file
with the same key and nonce as before, we get “to go to bed” instead of “an ice
cream”, without any error indicating that tampering occurred:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-d<span class="w"> </span>-chacha20<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span>tampered-ciphertext.bin
</code></pre></div>
</details>
<h2 id="block-ciphers">Block ciphers</h2>
<p>We have seen that stream ciphers alone have a serious problem (malleability)
that allows anyone to modify arbitrary portions of ciphertexts without
detection. Let’s take a look at the alternative: block ciphers. Will they have
the same problem?</p>
<p>While a stream cipher can encrypt variable amounts of data, a block cipher can
only take as input a block of data of a fixed size, and produce as output
another block of data. A good block cipher produces an output that is
indistinguishable from random.</p>
<p>The block size is generally small, usually 128 bits (16 bytes), so if we want
to encrypt larger amounts of data, we have to split the data into multiple
blocks, and encrypt each block individually. If the data is too short to fit in
a block, the data will also need to be padded.</p>
<pre>
message: <span style="color:#4e9a06">The cat is on th</span> <span style="color:#204a87">e table.........</span>
|______________| |______________|
<span style="color:#4e9a06">block #1</span> <span style="color:#204a87">block #2</span>
<span style="color:#204a87">(padded)</span>
ciphertext: <span style="color:#73d216">c2TNPW3r09hZ6f1P</span> <span style="color:#3465a4">Vc32VX41XSy579Y9</span>
</pre>
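<p>The splitting and padding step can be sketched as follows. The diagram above
uses dots as filler; this sketch assumes PKCS#7 padding instead, a common
scheme in which each padding byte stores the padding length so that it can be
removed unambiguously after decryption. The function names are made up for
this example.</p>

```python
BLOCK_SIZE = 16  # bytes, i.e. 128 bits


def pad(data: bytes) -> bytes:
    """PKCS#7: append N bytes of value N, with N between 1 and BLOCK_SIZE."""
    padding_length = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([padding_length]) * padding_length


def unpad(padded: bytes) -> bytes:
    """Remove PKCS#7 padding, rejecting malformed input."""
    padding_length = padded[-1] if padded else 0
    expected = bytes([padding_length]) * padding_length
    if not 1 <= padding_length <= BLOCK_SIZE or padded[-padding_length:] != expected:
        raise ValueError("invalid padding")
    return padded[:-padding_length]


def split_blocks(data: bytes) -> list[bytes]:
    """Split data (whose length is a multiple of BLOCK_SIZE) into blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


blocks = split_blocks(pad(b"The cat is on the table"))
assert blocks == [b"The cat is on th", b"e table" + b"\x09" * 9]
```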
<p>This approach however has a problem: if we encrypt multiple blocks with the
same secret key, then portions of messages that are repeated will produce
the same output. This gives an attacker the ability to analyze a ciphertext and find
patterns in it without knowledge of the secret key. This problem is famously
evident when encrypting pictures:</p>
<figure>
<img src="https://andrea.corbellini.name/images/lack-of-diffusion.webp" alt="Ubuntu logo before and after encryption with a block cipher">
<figcaption>Example of applying a block cipher to an uncompressed image. The original colors are lost, but the overall layout of the image is still understandable. That's because multiple blocks of the image (containing the RGB values of each pixel), for example from the white background, are repeated multiple times, yielding the same exact encrypted blocks. The inspiration for making this image came from <a href="https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation">Wikipedia</a>.</figcaption>
</figure>
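<p>Finding such patterns requires no key at all: counting identical ciphertext
blocks is enough. The sketch below shows the idea on a hand-written,
hypothetical ciphertext (the bytes are invented for the example and were not
produced by a real cipher); <code>repeated_blocks</code> is an illustrative
name.</p>

```python
from collections import Counter

BLOCK_SIZE = 16  # bytes


def repeated_blocks(ciphertext: bytes) -> dict[bytes, int]:
    """Return the blocks that occur more than once, with their counts."""
    blocks = [ciphertext[i:i + BLOCK_SIZE]
              for i in range(0, len(ciphertext), BLOCK_SIZE)]
    return {block: count for block, count in Counter(blocks).items() if count > 1}


# Hypothetical four-block ciphertext where blocks #1 and #3 are identical,
# as would happen if the corresponding plaintext blocks were identical.
example_ciphertext = bytes.fromhex(
    "00112233445566778899aabbccddeeff"
    "a1a2a3a4a5a6a7a8a9aaabacadaeafb0"
    "00112233445566778899aabbccddeeff"
    "0f1e2d3c4b5a69788796a5b4c3d2e1f0"
)
duplicates = repeated_blocks(example_ciphertext)
assert duplicates == {bytes.fromhex("00112233445566778899aabbccddeeff"): 2}
```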
<details>
<summary>How this image was generated</summary>
<p>Before jumping into how I encrypted the image, let me spend a few words on how
I did NOT encrypt the image: I did not use a modern image format. Modern image
formats are very sophisticated, they’re not a simple sequence of RGB values.
Instead, they have some control structures mixed in the image, they implement
compression to reduce the image size, etc. This complexity means that if I
simply take an image in any format and encrypt it, the result won’t be
visualizable by an image viewer: the image viewer would just throw an error
because it would find invalid data structures.</p>
<p>Note that this does not imply that encrypting modern image formats is more
secure: people can still analyze patterns in them, but it simply means that a
modern image format, once encrypted, would not produce the sensational
visualization that I showed above.</p>
<p>In order to produce this visualization I had to find an uncompressed image
format without too much metadata in it. Thankfully the Wikipedia article on
<a href="https://en.wikipedia.org/wiki/Image_file_format">image file formats</a> provided
a list, which included the <a href="https://en.wikipedia.org/wiki/Netpbm">Netpbm</a>
family of formats (something I had never heard of before). Among the formats in
this family, I chose PPM, because it’s the one that supports colors.</p>
<p>The PPM file format is very simple: it has 3 lines of metadata, followed by the
RGB values for each pixel. No compression. Definitely the right format for this
kind of experiment!</p>
<p>So here’s what I did: first of all I downloaded an image (the Ubuntu “Circle of
Friends” logo, obtained from Wikipedia) and converted it to PPM with
<a href="https://imagemagick.org/script/convert.php">ImageMagick</a>:</p>
<div class="highlight"><pre><span></span><code>convert<span class="w"> </span>UbuntuCoF.png<span class="w"> </span>img.ppm
</code></pre></div>
<p>I separated the header from the RGB values:</p>
<div class="highlight"><pre><span></span><code>head<span class="w"> </span>-n3<span class="w"> </span>img.ppm<span class="w"> </span>><span class="w"> </span>ppm-header
tail<span class="w"> </span>-n+4<span class="w"> </span>img.ppm<span class="w"> </span>><span class="w"> </span>ppm-image
</code></pre></div>
<p>The reason why I separated the header from RGB values is that I won’t encrypt
the header. If I did, then the image wouldn’t be visualizable by an image viewer,
just like if I used a modern image format. In a real-world scenario, a person
would be able to easily guess the header if it was encrypted.</p>
<p>I encrypted the RGB values with
<a href="https://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES-256</a>, a
modern, strong block cipher with a good reputation:</p>
<div class="highlight"><pre><span></span><code>openssl enc -aes-256-ecb \
-K 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef \
-in ppm-image \
-out ppm-image-encrypted
</code></pre></div>
<p>Then I joined the header and the encrypted RGB values in a PPM file:</p>
<div class="highlight"><pre><span></span><code>cat ppm-header ppm-image-encrypted > img-encrypted.ppm
</code></pre></div>
<p>This results in a randomized image that can be viewed without problems on an
image viewer. It’s interesting to see that, if you change the encryption key,
you will get a different randomized image!</p>
</details>
<p>The problem we have just seen is known as <strong>lack of diffusion</strong>. This is kinda
analogous to the first problem we identified with stream ciphers: at the root
of both problems there is key reuse. We solved this problem for stream ciphers
by combining a key with a random nonce. We could use the same strategy here,
but it would be an expensive approach, as initializing a block cipher with a
new key is a relatively expensive operation. It’s much cheaper to initialize
the block cipher once, and reuse it for every block encryption. We need a way
to “link” blocks to each other, so that if two linked blocks contain the same
plaintext, their encryption will give different results.</p>
<p>There are various strategies to do that. These strategies are known as <a href="https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation"><em>modes of
operation</em></a> of
block ciphers. Let’s take a look at two of them: <strong>Cipher Block Chaining
(CBC)</strong> and <strong>Counter Mode (CTR)</strong>.</p>
<h3 id="cipher-block-chaining-cbc">Cipher Block Chaining (CBC)</h3>
<p>This mode of operation, as the name suggests, ‘chains’ each block to the next
one. The way it works is by using the XOR operator in the following way:</p>
<ul>
<li>
<p>First of all, a random nonce is generated. The purpose of the nonce is the
same as before (with stream ciphers): ensuring that using the same secret
key to perform multiple encryptions yields different results each time, so
that secret information or patterns are not revealed.</p>
<p>The nonce does not need to be kept secret and is normally stored alongside
the ciphertext so that it can be easily used during decryption. It is
however important that the nonce is unique.</p>
</li>
<li>
<p>The first block of message <code>m[0]</code> is XOR-ed with the nonce, and then
encrypted with the block cipher, producing the first block of ciphertext: <code>c[0] =
block_encrypt(m[0] XOR nonce)</code></p>
</li>
<li>
<p>The second block of message <code>m[1]</code> is XOR-ed with <code>c[0]</code>, and then
encrypted with the block cipher: <code>c[1] = block_encrypt(m[1] XOR c[0])</code></p>
</li>
<li>
<p>…</p>
</li>
<li>
<p>The last block of message <code>m[n]</code> is XOR-ed together with <code>c[n-1]</code>, and then
encrypted with the block cipher: <code>c[n] = block_encrypt(m[n] XOR c[n-1])</code></p>
</li>
</ul>
<figure>
<img src="https://andrea.corbellini.name/images/cipher-block-chaining.svg" alt="Visualization of the Cipher Block Chaining (CBC) mode of operation">
</figure>
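<p>The chaining steps above can be sketched in Python. Since the standard
library has no AES, this example assumes a toy 4-round Feistel network built
on SHA-256 as the block cipher: it is invertible, which is all CBC needs, but
it is <em>not</em> secure and is not what OpenSSL uses. All function names are
hypothetical, and the message is assumed to be already padded to a multiple of
16 bytes.</p>

```python
import hashlib
import secrets

BLOCK_SIZE = 16
HALF = BLOCK_SIZE // 2


def _xor(left: bytes, right: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(left, right))


def _round(key: bytes, round_number: int, half: bytes) -> bytes:
    """Keyed round function of the toy Feistel network."""
    return hashlib.sha256(key + bytes([round_number]) + half).digest()[:HALF]


def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in range(4):
        left, right = right, _xor(left, _round(key, r, right))
    return left + right


def toy_block_decrypt(key: bytes, block: bytes) -> bytes:
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(4)):
        left, right = _xor(right, _round(key, r, left)), left
    return left + right


def cbc_encrypt(key: bytes, nonce: bytes, padded_message: bytes) -> bytes:
    previous = nonce  # the first block is chained to the nonce
    ciphertext_blocks = []
    for i in range(0, len(padded_message), BLOCK_SIZE):
        block = padded_message[i:i + BLOCK_SIZE]
        previous = toy_block_encrypt(key, _xor(block, previous))
        ciphertext_blocks.append(previous)
    return b"".join(ciphertext_blocks)


def cbc_decrypt(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    previous = nonce
    message_blocks = []
    for i in range(0, len(ciphertext), BLOCK_SIZE):
        block = ciphertext[i:i + BLOCK_SIZE]
        message_blocks.append(_xor(toy_block_decrypt(key, block), previous))
        previous = block  # each block is un-chained with the previous ciphertext block
    return b"".join(message_blocks)


key = bytes.fromhex("0123456789abcdef" * 4)
nonce = secrets.token_bytes(BLOCK_SIZE)
message = b"same 16B block!!" * 2  # two identical plaintext blocks

ciphertext = cbc_encrypt(key, nonce, message)
assert cbc_decrypt(key, nonce, ciphertext) == message
# Thanks to chaining, identical plaintext blocks encrypt to different blocks.
assert ciphertext[:BLOCK_SIZE] != ciphertext[BLOCK_SIZE:]
```

<p>Note how decryption XORs each decrypted block with the <em>previous</em>
ciphertext block (or with the nonce, for the first block).</p>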
<p>The XOR operator is back! With stream ciphers, the XOR operator was allowing us
to tamper with ciphertexts. Can we do the same thing here? Yes of course! The
approach is slightly different though: instead of acting directly on the block
that we want to change, we will act on the block that precedes it.</p>
<p>For example, if we want to change the sentence “I came home in the afternoon
and the cat was on the table” so that it reads ‘dog’ instead of ‘cat’, we would
need to change the block right before the one that contains the word ‘cat’. If
we want to change the very first block, for example to change the word ‘came’
to ‘left’, we would need to change the nonce instead.</p>
<pre>
<span style="color:#75507b">nonce</span>+<span style="color:#3465a4">ciphertext</span>: <span style="color:#75507b">yz<strong style="color:#5c3566">URZR</strong>bP6X1w3ZRL</span> <span style="color:#3465a4">XRDnPbEkx3JUP2Fv C2ZWt<strong style="color:#204a87">19E</strong>dAXDi76H pkbk8qTgaSdzerbF 8CWYqscBqE6cSLmx</span>
⊕
<span style="color:#73d216">message</span> (shifted 1 block to the left): <span style="color:#73d216">I <strong style="color:#4e9a06">came</strong> home in t he afternoon and the <strong>cat</strong> was on the table.......</span>
⊕
<span style="color:#73d216">altered message</span> (shifted 1 block to the left): <span style="color:#73d216">I <strong style="color:#4e9a06">left</strong> home in t he afternoon and the <strong>dog</strong> was on the table.......</span>
=
tampered <span style="color:#75507b">nonce</span>+<span style="color:#3465a4">ciphertext</span>: <span style="color:#75507b">yz<strong style="color:#5c3566">ZVQC</strong>bP6X1w3ZRL</span> <span style="color:#3465a4">XRDnPbEkx3JUP2Fv C2ZWt<strong style="color:#204a87">67V</strong>dAXDi76H pkbk8qTgaSdzerbF 8CWYqscBqE6cSLmx</span>
</pre>
<p>If we do the above, and then decrypt the tampered ciphertext, we will get
something like this:</p>
<div class="highlight"><pre><span></span><code>I left home in t���������������the dog was on the table
</code></pre></div>
<details>
<summary>How to get this result using AES-CBC with OpenSSL</summary>
<p>Here’s a step-by-step guide on how to tamper with a ciphertext encrypted with
AES-256 in CBC mode.</p>
<p>First, generate a valid ciphertext:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-aes-256-cbc<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span><<span class="o">(</span><span class="nb">echo</span><span class="w"> </span><span class="s1">'I came home in the afternoon and the cat was on the table'</span><span class="o">)</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-out<span class="w"> </span>ciphertext.bin
</code></pre></div>
<p>Like in the stream cipher example, <code>-K</code> is the key in hexadecimal format (256
bits), while <code>-iv</code> is the nonce (128 bits).</p>
<p>We can perform the tampering with this Python script:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'ciphertext.bin'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
    <span class="n">ciphertext</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">guessed_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'---------------------cat----------------------------------------'</span>
<span class="n">replacement_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'---------------------dog----------------------------------------'</span>
<span class="n">tampered_ciphertext</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="n">y</span> <span class="o">^</span> <span class="n">z</span> <span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="ow">in</span>
                            <span class="nb">zip</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="n">guessed_message</span><span class="p">,</span> <span class="n">replacement_message</span><span class="p">))</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'tampered-ciphertext.bin'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
    <span class="n">file</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">tampered_ciphertext</span><span class="p">)</span>
</code></pre></div>
<p>Note that OpenSSL does not store the nonce along with the ciphertext, but
instead expects it to be passed as a command line argument. We need to modify
it separately, so here’s another Python script just for the nonce:</p>
<div class="highlight"><pre><span></span><code><span class="n">nonce</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef0123456789abcdef'</span><span class="p">)</span>
<span class="n">guessed_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'--came----------'</span>
<span class="n">replacement_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'--left----------'</span>
<span class="n">tampered_nonce</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="n">y</span> <span class="o">^</span> <span class="n">z</span> <span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="ow">in</span>
<span class="nb">zip</span><span class="p">(</span><span class="n">nonce</span><span class="p">,</span> <span class="n">guessed_message</span><span class="p">,</span> <span class="n">replacement_message</span><span class="p">))</span>
<span class="nb">print</span><span class="p">(</span><span class="n">tampered_nonce</span><span class="o">.</span><span class="n">hex</span><span class="p">())</span>
</code></pre></div>
<p>If we run that script, we get: <code>01234a6382bacdef0123456789abcdef</code>.</p>
<p>Now to decrypt the tampered ciphertext with the tampered nonce:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-d<span class="w"> </span>-aes-256-cbc<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>01234a6382bacdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span>tampered-ciphertext.bin
</code></pre></div>
</details>
<p>It’s interesting to see that we were successful in changing the word ‘cat’ to
‘dog’ but in doing so we had to sacrifice a block, which, when decrypted,
resulted in random bytes.</p>
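<p>The same effect can be reproduced without OpenSSL. The following is a minimal
sketch, assuming a toy 4-round Feistel network built from SHA-256 as a stand-in
for a real block cipher such as AES (the names <code>block_encrypt</code>,
<code>block_decrypt</code>, <code>cbc_encrypt</code> and <code>cbc_decrypt</code> are
hypothetical helpers, and the toy cipher is not secure). It only illustrates
how flipping bits in one ciphertext block changes the next plaintext block and
garbles the current one:</p>

```python
import hashlib

BLOCK = 16  # block size in bytes (128 bits)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def round_function(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function built from SHA-256; for illustration only.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def block_encrypt(key: bytes, block: bytes) -> bytes:
    # Hypothetical 4-round Feistel network standing in for a real block cipher.
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, round_function(key, i, right))
    return left + right


def block_decrypt(key: bytes, block: bytes) -> bytes:
    # Undo the Feistel rounds in reverse order.
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, round_function(key, i, left)), left
    return left + right


def cbc_encrypt(key: bytes, nonce: bytes, message: bytes) -> bytes:
    prev, out = nonce, b''
    for i in range(0, len(message), BLOCK):
        prev = block_encrypt(key, xor(message[i:i + BLOCK], prev))
        out += prev
    return out


def cbc_decrypt(key: bytes, nonce: bytes, ciphertext: bytes) -> bytes:
    prev, out = nonce, b''
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out += xor(block_decrypt(key, block), prev)
        prev = block
    return out


key = bytes(range(16))
nonce = bytes(16)
# Pad with dots to a multiple of the block size, like in the diagram above.
message = b'I came home in the afternoon and the cat was on the table'.ljust(64, b'.')
ciphertext = bytearray(cbc_encrypt(key, nonce, message))

# To change 'cat', act on the ciphertext block that precedes it.
offset = message.index(b'cat') - BLOCK
for i, (old, new) in enumerate(zip(b'cat', b'dog')):
    ciphertext[offset + i] ^= old ^ new

tampered = cbc_decrypt(key, nonce, bytes(ciphertext))
print(tampered[:16])    # first block: untouched
print(tampered[16:32])  # second block: garbled
print(tampered[32:])    # rest: now reads 'dog' instead of 'cat'
```

<p>The attacker never needs the key: knowing (or guessing) the plaintext at a
given position is enough to compute the XOR difference to apply.</p>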
<p>In a real-world scenario, seeing some random bytes could raise suspicion,
and maybe cause errors in applications. However, that’s not always the
case (how many times have we seen garbled text on our monitors without ever
worrying that somebody was tampering with our communications?). Also, when
dealing with formats like HTML, one could conceal tampering attempts using
comment blocks, or using JavaScript. One example of what I’m describing is the
<a href="https://efail.de/">EFAIL vulnerability</a>: discovered in 2017, it affected some
popular email clients including Gmail. It targeted the use of AES in CBC mode
(as well as another mode very similar to it: Cipher Feedback, CFB), and allowed
the injection of malicious content in HTML emails.</p>
<p>We can conclude that block ciphers in CBC mode, just like stream ciphers, are
also malleable.</p>
<h3 id="counter-mode-ctr">Counter Mode (CTR)</h3>
<p>Are other modes of operation all malleable like CBC, or will they be different?
Let’s take a look at another, very common, mode of operation: Counter Mode
(CTR), so that we can get a better sense of how the problem of malleability can
affect the world of block ciphers.</p>
<p>The mechanism behind Counter Mode is very simple:</p>
<ol>
<li>
<p>A random nonce is generated. The purpose of the nonce is the usual one:
make sure that repeated encryptions using the same key produce different
results.</p>
</li>
<li>
<p>A counter (usually an integer) is initialized to 1 (or any other starting
value of your choice).</p>
</li>
<li>
<p>The nonce is concatenated with the counter, and encrypted using the block
cipher: <code>r[0] = block_encrypt(nonce || counter)</code>.</p>
<p>Because the block cipher can only accept as input a block of a fixed size,
it follows that the length of the nonce plus the length of the counter must
be equal to the block size. For example, for a 128-bit block cipher, a
common choice is to have a 96-bit nonce and a 32-bit counter.</p>
</li>
<li>
<p>The counter is incremented: <code>counter = counter + 1</code> (the increment does not
necessarily need to be by 1, but that’s a common choice). The nonce and the
new counter are concatenated again, and encrypted using the block cipher:
<code>r[1] = block_encrypt(nonce || counter)</code>.</p>
</li>
<li>
<p>The counter is incremented again (<code>counter = counter + 1</code>), and a new block
is encrypted, just like before: <code>r[2] = block_encrypt(nonce || counter)</code>.</p>
</li>
<li>
<p>…</p>
</li>
</ol>
<p>This mechanism produces a sequence of blocks <code>r[0]</code>, <code>r[1]</code>, <code>r[2]</code>, … which
are indistinguishable from random. This sequence of random blocks can be XOR-ed
with the message to produce the ciphertext.</p>
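<p>The steps above can be sketched in a few lines of Python. This is only an
illustration: <code>block_encrypt</code> here is a hypothetical stand-in based on
truncated SHA-256 (Counter Mode only ever uses the block cipher in the
encryption direction), and the 96-bit nonce with a 32-bit big-endian counter
starting at 1 is just one possible layout:</p>

```python
import hashlib


def block_encrypt(key: bytes, block: bytes) -> bytes:
    # Stand-in for a real 128-bit block cipher such as AES (assumption made to
    # keep this sketch dependency-free): SHA-256 truncated to 16 bytes.
    assert len(block) == 16
    return hashlib.sha256(key + block).digest()[:16]


def ctr_keystream(key: bytes, nonce: bytes, length: int, start: int = 1) -> bytes:
    assert len(nonce) == 12  # 96-bit nonce + 32-bit counter = 128-bit block
    stream = b''
    counter = start
    while len(stream) < length:
        # r[i] = block_encrypt(nonce || counter)
        # to_bytes() raises OverflowError if the counter exceeds 32 bits.
        stream += block_encrypt(key, nonce + counter.to_bytes(4, 'big'))
        counter += 1
    return stream[:length]


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same operation: XOR with the keystream.
    keystream = ctr_keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))


key = bytes.fromhex('0123456789abcdef' * 4)
nonce = bytes.fromhex('0123456789abcdef01234567')
message = b'Counter Mode turns a block cipher into a stream cipher'
ciphertext = ctr_xor(key, nonce, message)
print(ctr_xor(key, nonce, ciphertext) == message)
```

<p>Note that the message does not need any padding: the keystream is simply
truncated to the length of the message.</p>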
<p>It’s important that the values for the counter never repeat. If, for example,
we’re using a 32-bit counter, the counter will “reset” (go back to the starting
value) after 2<sup>32</sup> iterations, and will start repeating the same
sequence of random blocks as it did at the beginning. This introduces the
problem of lack of diffusion that we have seen before, just at a larger scale.
If we’re using a 32-bit counter with a 128-bit block cipher, we cannot encrypt
more than 128·2<sup>32</sup> bits = 64 GiB of data at once. This is a very
important detail: exceeding these limits may allow the decryption of portions
of ciphertext without knowledge of the secret key.</p>
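<p>The arithmetic behind that limit can be checked quickly. The helper name
<code>ctr_limit_bytes</code> below is made up for this sketch, and the 8-bit counter
at the end is just a small-scale illustration of the wrap-around:</p>

```python
def ctr_limit_bytes(block_bits: int, counter_bits: int) -> int:
    # One block of keystream per counter value, and 2**counter_bits values.
    return (block_bits * 2**counter_bits) // 8


limit = ctr_limit_bytes(block_bits=128, counter_bits=32)
print(limit // 2**30)  # 64 (GiB)

# With a tiny 8-bit counter, the 257th value repeats the 1st one, so the
# 257th keystream block would be identical to the 1st.
counters = [(1 + i) % 2**8 for i in range(257)]
print(counters[256] == counters[0])  # True
```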
<figure>
<img src="https://andrea.corbellini.name/images/counter-mode.svg" alt="Visualization of the Counter Mode (CTR) mode of operation">
</figure>
<p>What Counter Mode is doing is effectively <strong>turning a block cipher into a
stream cipher</strong>. As such, a block cipher in Counter Mode has exactly the same
malleability problems as the stream ciphers that we have seen before.</p>
<details>
<summary>An example of malleability using AES-CTR with OpenSSL</summary>
<p>This example is going to be very similar (almost identical) to the example with
ChaCha20 that I showed in the stream cipher section, just that this time I’m
going to use AES-256 in CTR mode.</p>
<p>Let’s produce a valid ciphertext:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-aes-256-ctr<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>0123456789abcdef0123456700000001<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span><<span class="o">(</span><span class="nb">echo</span><span class="w"> </span><span class="s1">'Can you give me a ride to the party?'</span><span class="o">)</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-out<span class="w"> </span>ciphertext.bin
</code></pre></div>
<p>Tamper it with Python:</p>
<div class="highlight"><pre><span></span><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'ciphertext.bin'</span><span class="p">,</span> <span class="s1">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
    <span class="n">ciphertext</span> <span class="o">=</span> <span class="n">file</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">guessed_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'------------------------------party-</span><span class="se">\n</span><span class="s1">'</span>
<span class="n">replacement_message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'------------------------------beach-</span><span class="se">\n</span><span class="s1">'</span>
<span class="n">tampered_ciphertext</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="n">y</span> <span class="o">^</span> <span class="n">z</span> <span class="k">for</span> <span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">z</span><span class="p">)</span> <span class="ow">in</span>
                            <span class="nb">zip</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="n">guessed_message</span><span class="p">,</span> <span class="n">replacement_message</span><span class="p">))</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s1">'tampered-ciphertext.bin'</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
    <span class="n">file</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">tampered_ciphertext</span><span class="p">)</span>
</code></pre></div>
<p>And now we can decrypt it:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>enc<span class="w"> </span>-d<span class="w"> </span>-aes-256-ctr<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-K<span class="w"> </span>0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-iv<span class="w"> </span>0123456789abcdef0123456700000001<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span>tampered-ciphertext.bin
</code></pre></div>
</details>
<h1 id="the-solution-authenticated-encryption-ae">The solution: Authenticated Encryption (AE)</h1>
<p>We have seen that stream ciphers and block ciphers (in their mode of operation)
both exhibit the same problem (in different flavors): malleability. I’ve shown
some examples of how this problem can be exploited with modern ciphers like
ChaCha20 and AES. These ciphers, alone, <strong>cannot guarantee the integrity or
authenticity of encrypted data</strong>.</p>
<p>In this context, <a href="https://en.wikipedia.org/wiki/Data_integrity"><em>integrity</em></a> is
the assurance that data is not corrupted or modified in any way.
<a href="https://en.wikipedia.org/wiki/Message_authentication"><em>Authenticity</em></a> can be
thought of as a stronger version of integrity, and it’s the assurance that a
given ciphertext was produced only with knowledge of the secret key.</p>
<p>Does this mean that modern ciphers, like ChaCha20 and AES, should be considered
insecure and avoided? Absolutely not! The correct answer is that those ciphers
cannot be used alone. You should think of them as basic building blocks, and
you need some additional building blocks in order to construct a complete and
secure cryptosystem. One of these additional building blocks, that we are going
to explore in this article, is an algorithm that provides integrity and
authentication: welcome <strong>Authenticated Encryption (AE)</strong>.</p>
<p>When using authenticated encryption, an adversary may still be able to modify a
ciphertext using the techniques described above, but such a modification would be
detected by the authentication algorithm, and decryption would fail with an
error. The decrypted message at that point should be discarded, preventing the
use of tampered data.</p>
<p>There are many different methods to implement authenticated encryption. The
most common approach is to use an authentication algorithm to authenticate the
ciphertext produced by a cipher. Here I will describe two very popular
authentication algorithms:</p>
<ul>
<li><strong>Poly1305</strong>, which is often used in conjunction with the stream cipher
ChaCha20 to form <strong>ChaCha20-Poly1305</strong>;</li>
<li><strong>Galois/Counter Mode (GCM)</strong>, which is often used with the block cipher AES
to form <strong>AES-GCM</strong>.</li>
</ul>
<p>These authentication algorithms work by computing a hash of the ciphertext,
which is then stored alongside the ciphertext. This hash is not a regular hash,
but it’s a <strong>keyed hash</strong>. A regular hash is a function that takes as input
some data and returns a fixed-size bit string:</p>
<p>$$\operatorname{hash}: data \rightarrow bits$$</p>
<p>A keyed hash instead takes two inputs: a secret key and some data, and produces
a fixed-size bit string:</p>
<p>$$\operatorname{keyed-hash}: (key, data) \rightarrow bits$$</p>
<p>The output of the keyed hash is more often called <em>Message Authentication Code
(MAC)</em>, or <em>authentication tag</em>, or even just <em>tag</em>.</p>
<p>During decryption, the same authentication algorithm is run again on the
ciphertext, and a new tag is produced. If the new tag matches the original tag
(that was stored alongside the ciphertext), then decryption succeeds. Otherwise, if
the tags don’t match, it means that the ciphertext was modified (or the stored
tag was modified), and decryption fails. This gives us a way to detect
tampering and the opportunity to reject ciphertexts that were not
produced with knowledge of the secret key.</p>
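<p>Here is a minimal sketch of that tag-then-verify flow. To keep it runnable
with the Python standard library only, it uses HMAC-SHA256 as the keyed hash,
which is a swapped-in technique and not one of the two algorithms described in
this article; the function names <code>keyed_hash</code>, <code>seal</code> and
<code>verify</code> are made up for illustration:</p>

```python
import hashlib
import hmac


def keyed_hash(key: bytes, data: bytes) -> bytes:
    # Stand-in keyed hash (HMAC-SHA256), used here instead of Poly1305 or GCM.
    return hmac.new(key, data, hashlib.sha256).digest()


def seal(mac_key: bytes, ciphertext: bytes) -> tuple[bytes, bytes]:
    # Compute the tag over the ciphertext and store it alongside.
    return ciphertext, keyed_hash(mac_key, ciphertext)


def verify(mac_key: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # Recompute the tag and compare it in constant time.
    return hmac.compare_digest(keyed_hash(mac_key, ciphertext), tag)


mac_key = bytes.fromhex('0123456789abcdef' * 4)
ciphertext, tag = seal(mac_key, b'pretend these bytes came out of a cipher')

tampered = bytearray(ciphertext)
tampered[0] ^= 0x01  # flip a single bit

print(verify(mac_key, ciphertext, tag))       # True: accept and decrypt
print(verify(mac_key, bytes(tampered), tag))  # False: reject, do not decrypt
```

<p>The comparison uses <code>hmac.compare_digest</code> rather than <code>==</code> so
that the time it takes does not depend on how many bytes of the tag match.</p>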
<p>The secret key passed to the keyed hash function is not necessarily the same
secret key used for the encryption. In fact, both ChaCha20-Poly1305 and AES-GCM
operate on a <strong>subkey</strong> derived from the key used for encryption.</p>
<h2 id="poly1305">Poly1305</h2>
<p>Poly1305 is a keyed hash function proposed by <a href="https://en.wikipedia.org/wiki/Daniel_J._Bernstein">Daniel J.
Bernstein</a> in 2004, who is
also the author of ChaCha20. It works by using <strong>polynomials evaluated modulo
the prime 2<sup>130</sup> - 5</strong>, hence the name.</p>
<p>The key to Poly1305 is a 256-bit string, and it’s split into two halves:</p>
<ul>
<li>the first half (128 bits) is called $r$;</li>
<li>the second half (128 bits) is called $s$.</li>
</ul>
<p>We’ll see later how this key is generated when Poly1305 is used to implement
authenticated encryption. For now, let’s assume that the key is a random
(unpredictable) bit string provided as an input.</p>
<p>The first half $r$ is also <em>clamped</em> by setting some of its bits to 0. This is
a performance-related optimization that some Poly1305 implementations can take
advantage of when doing multiplication using 64-bit registers. Clamping is
performed by applying the following hexadecimal bitmask:</p>
<div class="highlight"><pre><span></span><code>0ffffffc0ffffffc0ffffffc0fffffff
</code></pre></div>
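<p>To see which bits the mask clears, here is a small sketch (the helper name
<code>clamp</code> is an assumption of this example). After clamping, bytes 3, 7, 11
and 15 of $r$ have their top four bits clear, and bytes 4, 8 and 12 have their
bottom two bits clear:</p>

```python
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff


def clamp(r_bytes: bytes) -> bytes:
    # r is interpreted as a 128-bit little-endian integer before masking.
    r = int.from_bytes(r_bytes, 'little') & CLAMP_MASK
    return r.to_bytes(16, 'little')


clamped = clamp(b'\xff' * 16)
print(clamped.hex())               # ffffff0ffcffff0ffcffff0ffcffff0f
print(bin(CLAMP_MASK).count('1'))  # 106 bits of r are left free
```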
<p>The message to authenticate is split into chunks of 128 bits each: $m_1$,
$m_2$, $m_3$, … $m_n$. If the length of the message is not a multiple of 128
bits, then the last block may be shorter. The authentication tag is then
calculated as follows:</p>
<ul>
<li>
<p>Interpret $r$ and $s$ as two 128-bit little-endian integers.</p>
</li>
<li>
<p>Initialize the Poly1305 state $a_0$ to the integer 0. As we shall see
later, this state will need to hold at most 131 bits.</p>
</li>
<li>
<p>For each block $m_i$:</p>
<ul>
<li>
<p>Interpret the block $m_i$ as a little-endian integer.</p>
</li>
<li>
<p>Compute $\overline{m}_i$ by appending a 1-bit to the end of the block
$m_i$. If $m_i$ is 128 bits long, then this is equivalent to computing
$\overline{m}_i = 2^{128} + m_i$. In general, if the length of the
block $m_i$ in bits is $\operatorname{len}(m_i)$, then this is
equivalent to $\overline{m}_i = 2^{\operatorname{len}(m_i)} + m_i$.</p>
<p>This step ensures that the resulting block $\overline{m}_i$ is always
non-zero, even if the original block $m_i$ is zero. This is
important for the security of the algorithm, as explained later.</p>
</li>
<li>
<p>Compute the new state $a_i = (a_{i-1} + \overline{m}_i) \cdot r
\pmod{2^{130} - 5}$. Note that, because the operation is modulo
$2^{130} - 5$, the result will always fit in 130 bits.</p>
</li>
</ul>
</li>
<li>
<p>Once each block has been processed, compute the final state
$a_{n+1} = a_n + s$. Note that the state $a_n$ is at most 130 bits long,
and $s$ is at most 128 bits long, hence the result will be at most 131 bits
long.</p>
</li>
<li>
<p>Truncate the final state $a_{n+1}$ to 128 bits by removing the most
significant bits.</p>
</li>
<li>
<p>Return the truncated final state $a_{n+1}$ as a little-endian byte string.</p>
</li>
</ul>
<p>What this method is doing is computing the following polynomial in $r$ and $s$:</p>
<p>$$\begin{align*}
tag
& = ((((((\overline{m}_1 \cdot r) + \overline{m}_2) \cdot r) + \cdots + \overline{m}_n) \cdot r) \bmod{(2^{130} - 5)}) + s \\
& = (\overline{m}_1 r^n + \overline{m}_2 r^{n-1} + \cdots + \overline{m}_n r) \bmod{(2^{130} - 5)} + s
\end{align*}$$</p>
<p>$r$ and $s$ are secrets, and they come from the Poly1305 key. Note that if we
didn’t add $s$ at the end, then the resulting polynomial would be a polynomial
in $r$, and one could use polynomial root-finding methods to figure out $r$
from the authentication tag, without knowledge of the key. Therefore it’s
important that $s$ is non-zero.</p>
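<p>As a quick sanity check, the block-by-block update and the expanded polynomial
can be compared numerically. This sketch uses an arbitrary clamped $r$ and an
arbitrary message, both chosen only for illustration, and leaves out the final
addition of $s$:</p>

```python
p = (1 << 130) - 5
r = 0x0123456789abcdef0123456789abcdef & 0x0ffffffc0ffffffc0ffffffc0fffffff

message = b'I had a very nice day today at the beach'
# Blocks with the extra 1-bit appended, as little-endian integers.
blocks = [int.from_bytes(message[i:i + 16] + b'\x01', 'little')
          for i in range(0, len(message), 16)]

# Iterative form: a_i = (a_{i-1} + m_i) * r mod p
a = 0
for m in blocks:
    a = ((a + m) * r) % p

# Expanded form: m_1 * r^n + m_2 * r^(n-1) + ... + m_n * r mod p
n = len(blocks)
expanded = sum(m * pow(r, n - i, p) for i, m in enumerate(blocks)) % p

print(a == expanded)  # True
```

<p>The iterative form is what implementations use in practice, because it needs
only one multiplication per block and a fixed-size state.</p>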
<p>In Python, this is what a Poly1305 implementation could look like (disclaimer:
this is for learning purposes, and not necessarily secure or optimized for
performance):</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">iter_blocks</span><span class="p">(</span><span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Splits a message in blocks of 16 bytes (128 bits) each, except for the last</span>
<span class="sd">    block, which may be shorter.</span>
<span class="sd">    """</span>
    <span class="n">start</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">while</span> <span class="n">start</span> <span class="o"><</span> <span class="nb">len</span><span class="p">(</span><span class="n">message</span><span class="p">):</span>
        <span class="k">yield</span> <span class="n">message</span><span class="p">[</span><span class="n">start</span><span class="p">:</span><span class="n">start</span><span class="o">+</span><span class="mi">16</span><span class="p">]</span>
        <span class="n">start</span> <span class="o">+=</span> <span class="mi">16</span>


<span class="k">def</span> <span class="nf">poly1305</span><span class="p">(</span><span class="n">key</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="o">==</span> <span class="mi">32</span> <span class="c1"># 256 bits</span>
    <span class="c1"># Prime for the evaluation of the polynomial</span>
    <span class="n">p</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="mi">130</span><span class="p">)</span> <span class="o">-</span> <span class="mi">5</span>
    <span class="c1"># Split the key into two parts r and s</span>
    <span class="n">r</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">key</span><span class="p">[:</span><span class="mi">16</span><span class="p">],</span> <span class="s1">'little'</span><span class="p">)</span> <span class="c1"># 128 bits</span>
    <span class="n">s</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">key</span><span class="p">[</span><span class="mi">16</span><span class="p">:],</span> <span class="s1">'little'</span><span class="p">)</span> <span class="c1"># 128 bits</span>
    <span class="c1"># Clamp r</span>
    <span class="n">r</span> <span class="o">=</span> <span class="n">r</span> <span class="o">&</span> <span class="mh">0x0ffffffc0ffffffc0ffffffc0fffffff</span>
    <span class="c1"># Initialize the state</span>
    <span class="n">a</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="c1"># Update the state with every block</span>
    <span class="k">for</span> <span class="n">block</span> <span class="ow">in</span> <span class="n">iter_blocks</span><span class="p">(</span><span class="n">message</span><span class="p">):</span>
        <span class="c1"># Append a 1-bit to the end of each block</span>
        <span class="n">block</span> <span class="o">=</span> <span class="n">block</span> <span class="o">+</span> <span class="sa">b</span><span class="s1">'</span><span class="se">\1</span><span class="s1">'</span>
        <span class="c1"># Convert the block to an integer</span>
        <span class="n">c</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">)</span>
        <span class="c1"># Update the state</span>
        <span class="n">a</span> <span class="o">=</span> <span class="p">((</span><span class="n">a</span> <span class="o">+</span> <span class="n">c</span><span class="p">)</span> <span class="o">*</span> <span class="n">r</span><span class="p">)</span> <span class="o">%</span> <span class="n">p</span>
    <span class="c1"># Add s to the state and truncate it to 128 bits, removing the most</span>
    <span class="c1"># significant bits and keeping only the least significant 128 bits</span>
    <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">+</span> <span class="n">s</span><span class="p">)</span> <span class="o">&</span> <span class="p">((</span><span class="mi">1</span> <span class="o"><<</span> <span class="mi">128</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
    <span class="c1"># Convert the state from an integer to a 16-byte string (128 bits)</span>
    <span class="k">return</span> <span class="n">a</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">)</span>
</code></pre></div>
<p>And here is an example of how that code could be used:</p>
<div class="highlight"><pre><span></span><code>key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef')
msg = b'I had a very nice day today at the beach'
print(poly1305(key, msg).hex())
</code></pre></div>
<p>This returns <code>b0c4cb74b3089e9a982e3baa90c1bb5f</code>, which is the same result that
we would get using OpenSSL:</p>
<div class="highlight"><pre><span></span><code>openssl<span class="w"> </span>mac<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-macopt<span class="w"> </span>hexkey:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-in<span class="w"> </span><<span class="o">(</span><span class="nb">echo</span><span class="w"> </span>-n<span class="w"> </span><span class="s1">'I had a very nice day today at the beach'</span><span class="o">)</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>poly1305
</code></pre></div>
<p>A few things to note:</p>
<ul>
<li>
<p>The same key cannot be reused to construct two distinct tags. In fact,
suppose that we use the same hash key to compute <code>tag1 = Poly1305(key,
msg1)</code> and <code>tag2 = Poly1305(key, msg2)</code>. Then, because $s$ is the same for
both, we could subtract the two tags (<code>tag1 - tag2</code>) to remove the $s$ part
and obtain a polynomial in $r$. From there, we could use algebraic methods
to figure out $r$. Once we have $r$, we can use either one of the tags and
compute $s$, therefore recovering the full secret key.</p>
<p>Similarly, if the keys were generated using a predictable algorithm (for
example, incrementally: <code>key[i+1] = key[i] + 1</code>), it would still be
possible to use a similar approach to figure out the secret key.</p>
<p>For this reason, <strong>Poly1305 keys must be unique and unpredictable</strong>.
Generating Poly1305 keys randomly or pseudo-randomly is an acceptable
approach. Authentication functions like Poly1305 are called <strong>one-time
authenticators</strong> because they can be used only one time with the same key.</p>
</li>
<li>
<p>If we didn’t add the 1-bits at the end of each block (in other words, if we
used the $m_i$ blocks instead of $\overline{m}_i$), then authenticating a
message full of zero bits would be the equivalent of authenticating an empty
message. Adding the 1-bits is a way to ensure that the length of the
message always has an effect on the output.</p>
</li>
</ul>
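<p>Both notes can be checked with a short experiment. The sketch below is a
compact variant of the Poly1305 code shown earlier that returns integers
instead of bytes, with an extra <code>append_one</code> switch added purely for this
demonstration (it is not part of Poly1305, and the function names
<code>polynomial_part</code> and <code>tag_as_int</code> are made up):</p>

```python
P = (1 << 130) - 5
MASK128 = (1 << 128) - 1


def polynomial_part(key: bytes, message: bytes, append_one: bool = True) -> int:
    # The part of the tag that depends only on r and the message.
    r = int.from_bytes(key[:16], 'little') & 0x0ffffffc0ffffffc0ffffffc0fffffff
    a = 0
    for i in range(0, len(message), 16):
        block = message[i:i + 16] + (b'\x01' if append_one else b'')
        a = ((a + int.from_bytes(block, 'little')) * r) % P
    return a


def tag_as_int(key: bytes, message: bytes, append_one: bool = True) -> int:
    s = int.from_bytes(key[16:], 'little')
    return (polynomial_part(key, message, append_one) + s) & MASK128


key = bytes.fromhex('0123456789abcdef' * 4)

# 1. Key reuse: subtracting two tags cancels s, leaving a quantity that
#    depends only on r and on the two (known) messages.
tag1 = tag_as_int(key, b'first message')
tag2 = tag_as_int(key, b'second message')
poly1 = polynomial_part(key, b'first message')
poly2 = polynomial_part(key, b'second message')
print((tag1 - tag2) & MASK128 == (poly1 - poly2) & MASK128)  # True

# 2. Without the 1-bits, a message of zeros and an empty message collide.
zeros = bytes(32)
print(tag_as_int(key, zeros, append_one=False)
      == tag_as_int(key, b'', append_one=False))              # True
print(tag_as_int(key, zeros) == tag_as_int(key, b''))         # False
```

<p>This sketch only shows that $s$ disappears from the difference of two tags;
actually solving for $r$ from that difference is left out.</p>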
<h3 id="use-of-poly1305-with-chacha20-chacha20-poly1305">Use of Poly1305 with ChaCha20 (ChaCha20-Poly1305)</h3>
<p>Let’s see how we can combine ChaCha20 and Poly1305 to construct an authenticated cipher. To recap:</p>
<ul>
<li>ChaCha20 is a stream cipher;</li>
<li>Poly1305 is a one-time authenticator;</li>
<li>ChaCha20, like most ciphers, requires the use of a unique nonce to allow key
reuse.</li>
</ul>
<p>Putting the two together gives birth to ChaCha20-Poly1305. Here I’m going to
describe how to implement it as standardized in <a href="https://www.rfc-editor.org/rfc/rfc8439">RFC
8439</a>.</p>
<p>The <strong>inputs to the ChaCha20-Poly1305 encryption function</strong> are:</p>
<ul>
<li>a 256-bit secret key;</li>
<li>a 96-bit nonce;</li>
<li>a variable-length plaintext message.</li>
</ul>
<p>The <strong>outputs from the ChaCha20-Poly1305 encryption function</strong> are:</p>
<ul>
<li>a variable-length ciphertext (same length as the input plaintext);</li>
<li>a 128-bit authentication tag.</li>
</ul>
<p>The <strong>ChaCha20-Poly1305 decryption function</strong> will accept the same secret key,
nonce, ciphertext, and authentication tag as the input, and produce either the
plaintext or an error as the output. The error is returned in case the
authentication fails.</p>
<figure>
<img src="https://andrea.corbellini.name/images/chacha20-poly1305-encryption.svg" alt="Diagram of data flow during encryption with ChaCha20-Poly1305">
<figcaption>Data flow during a ChaCha20-Poly1305 encryption. This shows the inputs in <span style="color:#3465a4;font-weight:bold">blue</span>, the outputs in <span style="color:#73d216;font-weight:bold">green</span>, and the intermediate objects in <span style="color:#cc0000;font-weight:bold">red</span>.</figcaption>
</figure>
<p>ChaCha20-Poly1305 works in the following way:</p>
<ol>
<li>
<p>The <strong>ChaCha20</strong> stream cipher is <strong>initialized</strong> with the 256-bit secret key and
the 96-bit nonce.</p>
</li>
<li>
<p>The stream cipher is used to encrypt a 256-bit string of all zeros. The
result is the <strong>Poly1305 subkey</strong>.</p>
<p>If you recall how a stream cipher works, you should know that encrypting
using a stream cipher is equivalent to performing the XOR of a random bit
stream with the plaintext. Here the plaintext is all zeros, so the process
of generating the Poly1305 subkey is equivalent to grabbing the first 256
bits from the ChaCha20 bit stream.</p>
<p>We previously saw that the Poly1305 subkey must be unpredictable and unique
in order for Poly1305 to be secure. The use of ChaCha20 with a unique nonce
ensures that: because ChaCha20 is a stream cipher, its output will be
random and unpredictable. Therefore, with this construction, the subkey
will be unpredictable even if the nonce is predictable.</p>
</li>
<li>
<p>The stream cipher is used to encrypt another 256-bit string. The result is
discarded. This is equivalent to advancing the stream cipher state by 256
bits.</p>
<p>This step may seem weird, and in fact it is not needed for security
purposes: it’s a mere implementation detail. This step is here because
ChaCha20 has an internal state of 512 bits. In the previous step we obtained
the first 256 bits of the state, and this next step discards the rest of the
state so that we start with a fresh one. There is no particular reason for
requiring a fresh state. RFC 8439 does that because… spoiler alert: ChaCha20
is a block cipher under the hood. Its block size is 512 bits. If you read the
RFC, you’ll see that it asks to call the ChaCha20 block encryption function
once, grab the first 256 bits, and discard the rest. Here I’m treating
ChaCha20 as a stream cipher, so I have to include this extra step to discard
the bits.</p>
</li>
<li>
<p>The <strong>plaintext</strong> is <strong>encrypted</strong> using the stream cipher.</p>
<p>Note that this is done without resetting the state of the cipher. We are
continuing to use the same stream cipher instance that was used to generate
the Poly1305 subkey.</p>
</li>
<li>
<p>The <strong>ciphertext</strong> is <strong>padded</strong> with zeros to make its length a multiple
of 16 bytes (128 bits) and is <strong>authenticated using Poly1305</strong>, via the
subkey generated in step 2.</p>
<p>This step may be done in parallel to the previous one, that is: every time
we generate a chunk of ciphertext, we feed it to the Poly1305
authentication function.</p>
<p>Why pad the ciphertext before passing it to Poly1305? After all, ChaCha20
is a stream cipher, and Poly1305 can accept arbitrary-sized messages.
Again, this is a detail of RFC 8439 and padding does not serve any
specific purpose.</p>
</li>
<li>
<p>The <strong>length</strong> of the <strong>ciphertext</strong> (in bytes) is fed into the <strong>Poly1305
authenticator</strong>. This length is represented as a 64-bit little-endian
integer preceded by 64 zero bits of padding.</p>
<p>The reason why the length is represented as 64 bits and padded (instead of
representing it as 128 bits) will be clearer later: what I have given you
so far is a simplified view of ChaCha20-Poly1305 and authenticated
encryption in general. I will give you the full picture when talking about
<a href="#authenticated-encryption-with-associated-data-aead">associated data</a>
later on, and at that point this step will be slightly modified.</p>
</li>
<li>
<p>The ciphertext from ChaCha20 and the authentication tag from Poly1305 are
returned.</p>
</li>
</ol>
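<p>To make steps 2, 3, 5, and 6 more concrete, here is a minimal pure-Python sketch of the subkey derivation and of the byte string that gets fed to Poly1305. The helper names (<code>chacha20_block</code>, <code>poly1305_subkey</code>, <code>poly1305_input</code>) are made up for this illustration, the block function follows my reading of RFC 8439, and the code assumes there is no associated data. Like the rest of the code in this article, it is for educational purposes only.</p>

```python
def rotl32(value, bits):
    # Rotate a 32-bit word left, using arithmetic modulo 2**32
    return (value * 2**bits % 2**32) | (value >> (32 - bits))


def quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) % 2**32
    s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) % 2**32
    s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) % 2**32
    s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) % 2**32
    s[b] = rotl32(s[b] ^ s[c], 7)


def to_words(data):
    # Split a byte string into 32-bit little-endian words
    return [int.from_bytes(data[i:i + 4], 'little') for i in range(0, len(data), 4)]


def chacha20_block(key, counter, nonce):
    # 16 words: 4 constants, 8 key words, 1 block counter, 3 nonce words
    initial = to_words(b'expand 32-byte k') + to_words(key) + [counter] + to_words(nonce)
    state = list(initial)
    for _ in range(10):  # 10 double rounds = 20 rounds
        quarter_round(state, 0, 4, 8, 12)   # columns
        quarter_round(state, 1, 5, 9, 13)
        quarter_round(state, 2, 6, 10, 14)
        quarter_round(state, 3, 7, 11, 15)
        quarter_round(state, 0, 5, 10, 15)  # diagonals
        quarter_round(state, 1, 6, 11, 12)
        quarter_round(state, 2, 7, 8, 13)
        quarter_round(state, 3, 4, 9, 14)
    words = [(x + y) % 2**32 for x, y in zip(state, initial)]
    return b''.join(word.to_bytes(4, 'little') for word in words)


def poly1305_subkey(key, nonce):
    # Steps 2 and 3: keep the first 256 bits of block 0, discard the other 256
    return chacha20_block(key, 0, nonce)[:32]


def poly1305_input(ciphertext):
    # Steps 5 and 6 (no associated data): zero-pad to a multiple of 16 bytes,
    # then append 64 zero bits and the 64-bit little-endian ciphertext length
    padding = bytes(-len(ciphertext) % 16)
    lengths = (0).to_bytes(8, 'little') + len(ciphertext).to_bytes(8, 'little')
    return ciphertext + padding + lengths
```

<p>In this sketch the subkey comes from the block with counter 0, so the encryption of the plaintext in step 4 would use the blocks with counter 1, 2, and so on. With an all-zero key and an all-zero nonce, <code>poly1305_subkey</code> should return a value starting with <code>76b8e0ad</code>, which is the well-known first ChaCha20 key stream block for those inputs.</p>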
<p>The decryption algorithm works in a very similar way: ChaCha20 is initialized
in the same way, the subkey is generated in the same way, the Poly1305
authentication tag is calculated from the ciphertext in the same way. The only
difference is that ChaCha20 is used to decrypt the ciphertext (instead of
encrypting the plaintext) and that the input authentication tag is compared to
the calculated authentication tag before returning.</p>
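<p>A note on the tag comparison: implementations meant for real use usually compare the two tags in constant time, so that the time taken does not reveal how many leading bytes matched. A minimal sketch, assuming Python’s standard <code>hmac.compare_digest</code> function (the helper name <code>tags_match</code> is made up for this example):</p>

```python
import hmac


def tags_match(received_tag, expected_tag):
    # compare_digest avoids short-circuiting on the first differing byte
    return hmac.compare_digest(received_tag, expected_tag)


print(tags_match(b'\x01' * 16, b'\x01' * 16))
print(tags_match(b'\x01' * 16, b'\x02' * 16))
```

<p>The first call reports a match and the second a mismatch, exactly like a plain equality check would; only the timing behavior differs.</p>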
<p>Here is a Python implementation of ChaCha20-Poly1305, based on the
implementations of ChaCha20 and Poly1305 from
<a href="https://pypi.org/project/pycryptodome/">pycryptodome</a> (usual disclaimer: this
code is for educational purposes, and is not necessarily secure or optimized
for performance):</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">Crypto.Cipher</span> <span class="kn">import</span> <span class="n">ChaCha20</span>
<span class="kn">from</span> <span class="nn">Crypto.Hash</span> <span class="kn">import</span> <span class="n">Poly1305</span>
<span class="k">def</span> <span class="nf">chacha20poly1305_encrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">message</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="o">==</span> <span class="mi">32</span> <span class="c1"># 256 bits</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">nonce</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span> <span class="c1"># 96 bits</span>
    <span class="c1"># Initialize the ChaCha20 cipher with the key and nonce</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">ChaCha20</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">)</span>
    <span class="c1"># Derive the Poly1305 subkey using the ChaCha20 cipher</span>
    <span class="n">subkey</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">32</span><span class="p">)</span> <span class="c1"># 256 bits</span>
    <span class="n">subkey_r</span> <span class="o">=</span> <span class="n">subkey</span><span class="p">[:</span><span class="mi">16</span><span class="p">]</span>
    <span class="n">subkey_s</span> <span class="o">=</span> <span class="n">subkey</span><span class="p">[</span><span class="mi">16</span><span class="p">:]</span>
    <span class="c1"># Initialize the Poly1305 authenticator with the subkey</span>
    <span class="n">authenticator</span> <span class="o">=</span> <span class="n">Poly1305</span><span class="o">.</span><span class="n">Poly1305_MAC</span><span class="p">(</span><span class="n">r</span><span class="o">=</span><span class="n">subkey_r</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">subkey_s</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span>
    <span class="c1"># Discard the rest of the internal ChaCha20 state</span>
    <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">32</span><span class="p">)</span> <span class="c1"># 256 bits</span>
    <span class="c1"># Encrypt the message</span>
    <span class="n">ciphertext</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
    <span class="c1"># Authenticate the ciphertext</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span>
    <span class="c1"># Pad the ciphertext with zeros (to make it a multiple of 16 bytes)</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="p">(</span><span class="mi">16</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span><span class="p">))</span>
    <span class="c1"># Authenticate the length of the associated data (0 for simplicity)</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">((</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">))</span> <span class="c1"># 64 bits</span>
    <span class="c1"># Authenticate the length of the ciphertext</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">))</span> <span class="c1"># 64 bits</span>
    <span class="c1"># Generate the authentication tag</span>
    <span class="n">tag</span> <span class="o">=</span> <span class="n">authenticator</span><span class="o">.</span><span class="n">digest</span><span class="p">()</span>
    <span class="k">return</span> <span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">chacha20poly1305_decrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="o">==</span> <span class="mi">32</span> <span class="c1"># 256 bits</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">nonce</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span> <span class="c1"># 96 bits</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">tag</span><span class="p">)</span> <span class="o">==</span> <span class="mi">16</span> <span class="c1"># 128 bits</span>
    <span class="c1"># Initialize the ChaCha20 cipher and the Poly1305 authenticator, in the</span>
    <span class="c1"># same exact way as it was done during encryption</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">ChaCha20</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">)</span>
    <span class="n">subkey</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">32</span><span class="p">)</span>
    <span class="n">subkey_r</span> <span class="o">=</span> <span class="n">subkey</span><span class="p">[:</span><span class="mi">16</span><span class="p">]</span>
    <span class="n">subkey_s</span> <span class="o">=</span> <span class="n">subkey</span><span class="p">[</span><span class="mi">16</span><span class="p">:]</span>
    <span class="n">authenticator</span> <span class="o">=</span> <span class="n">Poly1305</span><span class="o">.</span><span class="n">Poly1305_MAC</span><span class="p">(</span><span class="n">r</span><span class="o">=</span><span class="n">subkey_r</span><span class="p">,</span> <span class="n">s</span><span class="o">=</span><span class="n">subkey_s</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span>
    <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">32</span><span class="p">)</span>
    <span class="c1"># Generate the authentication tag, like during encryption</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span><span class="p">:</span>
        <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="p">(</span><span class="mi">16</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span><span class="p">))</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">((</span><span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">))</span>
    <span class="n">authenticator</span><span class="o">.</span><span class="n">update</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">8</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">))</span>
    <span class="n">expected_tag</span> <span class="o">=</span> <span class="n">authenticator</span><span class="o">.</span><span class="n">digest</span><span class="p">()</span>
    <span class="c1"># Compare the input tag with the generated tag. If they're different, the</span>
    <span class="c1"># plaintext must not be returned to the caller</span>
    <span class="k">if</span> <span class="n">tag</span> <span class="o">!=</span> <span class="n">expected_tag</span><span class="p">:</span>
        <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s1">'authentication failed'</span><span class="p">)</span>
    <span class="c1"># The two tags match; decrypt the plaintext and return it to the caller</span>
    <span class="c1"># Note that, because ChaCha20 is a symmetric cipher, there is no difference</span>
    <span class="c1"># between the encrypt and decrypt method: here we are reusing the same</span>
    <span class="c1"># exact code used during encryption</span>
    <span class="n">message</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">message</span>
</code></pre></div>
<p>And here is how it can be used:</p>
<div class="highlight"><pre><span></span><code><span class="n">key</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef'</span><span class="p">)</span>
<span class="n">nonce</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef01234567'</span><span class="p">)</span>
<span class="n">message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'I wanted to go to the beach, but now I changed my mind'</span>
<span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span> <span class="o">=</span> <span class="n">chacha20poly1305_encrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">message</span><span class="p">)</span>
<span class="n">decrypted_message</span> <span class="o">=</span> <span class="n">chacha20poly1305_decrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">message</span> <span class="o">==</span> <span class="n">decrypted_message</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'ciphertext: </span><span class="si">{</span><span class="n">ciphertext</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'       tag: </span><span class="si">{</span><span class="n">tag</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">' plaintext: </span><span class="si">{</span><span class="n">decrypted_message</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
</code></pre></div>
<p>Running it produces the following output:</p>
<div class="highlight"><pre><span></span><code>ciphertext: 5d9b09cc5d90ca9ddff2d3470cfd6b563c5158e952bfae6acf1ebf9a3b968a488a41969567ef5ccfe05dcf9e548567028ff374a754af
       tag: dac3c05d261920e278ceb22e2800aa95
 plaintext: b'I wanted to go to the beach, but now I changed my mind'
</code></pre></div>
<p>This is the same output we would obtain by using the ChaCha20-Poly1305
implementation from pycryptodome directly:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">Crypto.Cipher</span> <span class="kn">import</span> <span class="n">ChaCha20_Poly1305</span>
<span class="n">key</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef'</span><span class="p">)</span>
<span class="n">nonce</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef01234567'</span><span class="p">)</span>
<span class="n">message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'I wanted to go to the beach, but now I changed my mind'</span>
<span class="n">cipher</span> <span class="o">=</span> <span class="n">ChaCha20_Poly1305</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">)</span>
<span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt_and_digest</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'ciphertext: </span><span class="si">{</span><span class="n">ciphertext</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'       tag: </span><span class="si">{</span><span class="n">tag</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
</code></pre></div>
<p>As already stated, it is extremely important that the nonce passed to
ChaCha20-Poly1305 is unique. It may be predictable, but it must be unique. If
the same nonce is reused twice or more, we can:</p>
<ul>
<li>
<p>Decrypt arbitrary messages without using the secret key, if we can guess at
least one message from its ciphertext.</p>
<p>This can be done using the techniques described at the beginning of this
article: by recovering the random bit string from the XOR of the ciphertext
with the guessed message.</p>
</li>
<li>
<p>Recover the Poly1305 subkey, and, at that point, tamper with ciphertexts
and forge new, valid authentication tags.</p>
<p>This can be done by using algebraic methods on the polynomial of the
authentication tag.</p>
</li>
</ul>
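<p>The first attack can be sketched in a few lines. The key stream below is a stand-in produced with SHAKE-256, <em>not</em> ChaCha20, and the two messages are made up; the only assumption that matters is that both messages were XORed with the same key stream, which is what happens when a key and nonce pair is reused.</p>

```python
import hashlib


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in key stream (NOT ChaCha20): with a stream cipher, the key stream is
# fully determined by the key and the nonce, so reusing both repeats it
keystream = hashlib.shake_256(b'same key, same nonce').digest(64)

known = b'attack at dawn, bring snacks'
secret = b'the password is hunter2 ok?!'
ciphertext_known = xor(known, keystream)
ciphertext_secret = xor(secret, keystream)

# The attacker guesses `known`, recovers the key stream from its ciphertext,
# and uses it to decrypt the other message without ever seeing the key
recovered_keystream = xor(ciphertext_known, known)
recovered = xor(ciphertext_secret, recovered_keystream)
print(recovered)
```

<p>Only as many bytes as the guessed message covers can be recovered this way, which is why longer known messages are more damaging.</p>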
<p>There is also a variant of ChaCha20-Poly1305, called XChaCha20-Poly1305, that
features an extended 192-bit nonce (the X stands for ‘extended’). This is
described in an <a href="https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-xchacha">RFC
draft</a>, but
it hasn’t been accepted as a standard yet. I won’t cover XChaCha20 in
detail here, because it’s slightly more complex and does not add much to the
topic of this article, but XChaCha20-Poly1305 has better security properties
than ChaCha20-Poly1305, so you should prefer it in your applications if you can
use it. The reason why XChaCha20-Poly1305 has better properties than
ChaCha20-Poly1305 is that, with a longer nonce, the probability of generating
two random nonces with the same value is much lower.</p>
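<p>To get a feeling for the difference, here is a rough estimate based on the birthday approximation. It assumes that nonces are drawn uniformly at random and independently; the function name and the figure of 2<sup>32</sup> messages are arbitrary choices for this example.</p>

```python
import math


def nonce_collision_probability(num_messages, nonce_bits):
    # Birthday approximation: p = 1 - exp(-n(n-1) / 2^(bits+1))
    exponent = num_messages * (num_messages - 1) / 2 ** (nonce_bits + 1)
    return -math.expm1(-exponent)


for bits in (96, 192):
    p = nonce_collision_probability(2 ** 32, bits)
    print(f'{bits}-bit random nonces, 2^32 messages: p = {p:.3g}')
```

<p>Under these assumptions, the probability of a repeated nonce is on the order of 10<sup>−10</sup> with 96-bit nonces and on the order of 10<sup>−39</sup> with 192-bit nonces.</p>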
<h2 id="galoiscounter-mode-gcm">Galois/Counter Mode (GCM)</h2>
<p>Let’s now take a look at Galois/Counter Mode (GCM). This is commonly used with
the Advanced Encryption Standard (AES), to construct the authenticated cipher
AES-GCM. One main difference between Poly1305 and GCM is that Poly1305 can work
with any stream or block cipher, while GCM is designed to work with block
ciphers with a block size of 128 bits.</p>
<p>GCM was proposed by David McGrew and John Viega in 2004 and is standardized in
<a href="https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-38d.pdf">NIST Special Publication
800-38D</a>
as well as <a href="https://www.rfc-editor.org/rfc/rfc5288">RFC 5288</a>. It takes its
name from Galois fields, also known as <a href="https://en.wikipedia.org/wiki/Finite_field">finite
fields</a>, which in turn get their
name from the French mathematician <a href="https://en.wikipedia.org/wiki/%C3%89variste_Galois">Évariste
Galois</a>, who introduced the
concept of finite fields as we know them today.</p>
<p>As we did before with Poly1305, we are going to first see how the keyed hash
function used by GCM works, and then we will see how to use it to construct an
authenticated cipher like AES-GCM on top of it. Before we can do that though,
we need to understand what finite fields are, and what specific types of finite
fields are used in GCM.</p>
<h3 id="finite-fields-galois-fields">Finite Fields (Galois Fields)</h3>
<p>What is a field? A field is a mathematical structure that contains a bunch of
elements, and those elements can interact with each other using addition and
multiplication. For both these operations there’s an identity element and an
inverse element. Addition and multiplication in a field must obey the usual
properties that we’re used to: commutativity, associativity, and
distributivity.</p>
<p>A well-known example of a field is the field of fractions. Here is why
fractions form a field:</p>
<ul>
<li>the <strong>elements</strong> of the field are the fractions;</li>
<li><strong>addition</strong> is well-defined: if we add two fractions, we get a fraction out
(example: $5/3 + 3/2 = 19/6$);</li>
<li><strong>multiplication</strong> is also well-defined: if we multiply two fractions, we get
a fraction out (example: $1/2 \cdot 8/3 = 4/3$);</li>
<li>the <strong>additive identity element</strong> is 0: if we add 0 to any fraction, we get
the same fraction back;</li>
<li>the <strong>additive inverse element</strong> is the negated fraction (example: $5/1$ is
the additive inverse of $-5/1$ because $5/1 + (-5/1) = 0$);</li>
<li>the <strong>multiplicative identity element</strong> is 1: multiplying any fraction by 1
yields the same fraction back;</li>
<li>the <strong>multiplicative inverse element</strong> is what we get if we swap the
numerator with the denominator (example: $3/2$ is the multiplicative inverse
of $2/3$ because $3/2 \cdot 2/3 = 1$)—except for 0, which does not have a
multiplicative inverse;</li>
<li>and so on…</li>
</ul>
<p>On top of addition, multiplication, and inverse elements, we can define derived
operations like subtraction and division. Subtracting $b$ from $a$ is
equivalent to adding $a$ to the additive inverse of $b$: $a - b = a + (-b)$.
Similarly, division can be defined in terms of multiplication with
multiplicative inverses ($a / b = a b^{-1}$).</p>
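<p>The examples above, together with the derived operations, can be checked using Python’s standard <code>fractions</code> module. This is just a sanity check of the specific numbers used in this section, not a proof of the field properties.</p>

```python
from fractions import Fraction

# Adding or multiplying two fractions gives another fraction
total = Fraction(5, 3) + Fraction(3, 2)
product = Fraction(1, 2) * Fraction(8, 3)
print(total, product)

# Identity elements
x = Fraction(7, 4)
assert x + 0 == x and x * 1 == x

# Inverse elements
assert Fraction(5, 1) + Fraction(-5, 1) == 0
assert Fraction(3, 2) * Fraction(2, 3) == 1

# Derived operations: a - b = a + (-b) and a / b = a * b^-1
a, b = Fraction(5, 3), Fraction(3, 2)
assert a - b == a + (-b)
assert a / b == a * (1 / b)
```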
<p>Fields are a generalization of structures where addition, multiplication,
subtraction, and division behave according to the rules that we’re used to.
Field elements do not necessarily need to be numbers.</p>
<p>An example of something that is <em>not</em> a field is the integers. That’s because
most integers don’t have multiplicative inverses (for example, there’s no integer
that multiplied by 5 makes the result equal to 1). However, there is a way to
turn the integers into a field: if we take the integers and a prime number <em>p</em>,
then we can construct the <strong>field of integers modulo <em>p</em></strong>.</p>
<p>When we work with the integers modulo a prime <em>p</em>, whenever we see <em>p</em> appear
in any of our expressions, we can replace it with 0. In other words, in such a
field, <em>p</em> and 0 are two different ways to write the same element–they are two
different <em>representations</em> of the same element.</p>
<p>Here is an example: in the field of integers modulo 7, the expression 5 + 3
equals 1, because:</p>
<ul>
<li>5 + 3 evaluates to 8;</li>
<li>8, by definition, is 7 + 1;</li>
<li>if 7 and 0 are the same element, then 7 + 1 is equal to 0 + 1;</li>
<li>0 + 1 evaluates to 1.</li>
</ul>
<p>What we have just seen is that 8 is just a different representation of 1, just
like 7 is a different representation of 0. Different symbols, same object.
Just like, in programming languages, we can have multiple variables pointing to
the same memory location: here the numbers are like variables, and what they
point to is what really matters.</p>
<p>In the field of integers modulo 7, the additive inverse for 5 is 2, because 5
+ 2 = 7 = 0. If we manipulate the equation, we get that 5 = −2. In other words,
5 and −2 are two different representations for the same element, and similarly
2 and −5 are also two different representations of the same element. A similar
story holds for multiplication: the multiplicative inverse for 5 is 3 because:
5 · 3 = 15 = 7 + 7 + 1 = 1, so we can write 5 = 3<sup>−1</sup> as well as 3 =
5<sup>−1</sup>.</p>
<p>What we have just seen is an example of a <strong>finite field</strong>. It’s different from
a general field because it contains a finite number of elements (unlike
fractions, which do not have a limit). In the case of the integers modulo 7,
the number of elements is 7, and the list of elements is: {0, 1, 2, 3, 4, 5,
6}, or {−3, −2, −1, 0, 1, 2, 3}, or {0, 1, 2, 3, 2<sup>−1</sup>,
3<sup>−1</sup>, 6}, depending on what representation we like the most.</p>
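<p>The arithmetic modulo 7 shown above maps directly to Python’s <code>%</code> operator, and modular inverses can be computed with the three-argument form of <code>pow</code> (this assumes Python 3.8 or later, where negative exponents are accepted). The variable names are only for illustration.</p>

```python
p = 7

assert (5 + 3) % p == 1      # 8 is another representation of 1
assert (5 + 2) % p == 0      # so 2 is the additive inverse of 5
assert -5 % p == 2           # -5 and 2 represent the same element
assert 5 * 3 % p == 1        # so 3 is the multiplicative inverse of 5
assert pow(5, -1, p) == 3    # modular inverse computed by Python

# The three ways of listing the elements describe the same set
canonical = set(range(p))
centered = {x % p for x in range(-3, 4)}
with_inverses = {0, 1, 2, 3, pow(2, -1, p), pow(3, -1, p), 6}
assert canonical == centered == with_inverses
```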
<details>
<summary>A few words about terminology, notation, and equivalences of finite fields</summary>
<p>There can be many ways to construct a finite field (or even a general field). I
have given an example using numbers, but a field does not necessarily need to
be formed from numbers. We can also use vectors, matrices, polynomials, and
anything you would like. As long as addition, multiplication, identity
elements, and inverse elements are well-defined, you can get a field. Using
programming terms, you can think of a field as an interface or a trait that can
have arbitrary implementations.</p>
<p>An important result in algebra is that finite fields with the same number of
elements are “unique up to isomorphism”. This means that if two finite fields
have the same number of elements, then there is a one-to-one mapping between
their elements that preserves addition and multiplication. The number of
elements of a field is therefore enough to define a
field. It’s not enough to tell us what the elements of the field look like, or
how they can be represented, but it’s enough to know how it behaves. To denote
a field with $n$ elements, there are two major notations: $GF(n)$ and
$\mathbb{F}_{n}$.</p>
<p>Another important result in algebra is that $n$ must be either a prime number,
or a power of a prime. For example, we can have finite fields with 2 elements,
or with 9 (= 3<sup>2</sup>) elements, but we cannot have a field with 6 (= 2·3)
elements. For this reason, you will often find finite fields denoted as
$GF(p^k)$ or $\mathbb{F}_{p^k}$, where $p$ is a prime and $k$ is an integer
greater than 0. The prime $p$ is called the <em>characteristic</em> of the field, while $n
= p^k$ is called the <em>order</em> of the field.</p>
<p>Some common fields also have their own notation: in particular, the field of
integers modulo a prime $p$ is denoted as $Z/pZ$. This notation encodes the
“building instructions” to construct the field, in fact:</p>
<ul>
<li>$Z$ denotes the integers: $Z = \{\dots, -2, -1, 0, 1, 2, \dots\}$;</li>
<li>$pZ$ denotes the integers multiplied by $p$: $pZ = \{\dots, -2p, -p, 0, p,
2p, \dots\}$ (example: $2Z = \{\dots, -4, -2, 0, 2, 4, \dots\}$);</li>
<li>$A/B$ is a <em>quotient</em>. This is a way to define an equivalence relation
between elements, and its meaning is: within $A/B$, all the elements of $B$
are equivalent to 0. In the case of $Z/pZ$, all the multiples of $p$ are
equivalent to 0, which is indeed what happens with the integers modulo $p$.
The way I described this equivalence relation earlier is by saying that
multiples of $p$ are different representations for 0.</li>
</ul>
<p>Note that the integers modulo a power of a prime ($Z/p^kZ$, with $k$ greater
than 1) do not form a field. The problem is that elements in $Z/p^kZ$ sometimes
do not have a multiplicative inverse. For example, in $Z/4Z$, the number 2 does
not have a multiplicative inverse (there is no element that multiplied by 2
gives 1). A field $GF(p^k)$ with $k$ greater than 1 needs to be constructed in
a different way. One such way is to use polynomials, as described in the next
section.</p>
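<p>A brute-force check makes this easy to see. The sketch below lists which elements of $Z/nZ$ have a multiplicative inverse; the helper names <code>units</code> and <code>is_field</code> are made up for this example, and the search is only practical for small $n$.</p>

```python
def units(n):
    # Elements of Z/nZ that have a multiplicative inverse
    return [a for a in range(1, n) if any(a * b % n == 1 for b in range(1, n))]


def is_field(n):
    # Z/nZ is a field when every non-zero element has an inverse
    return len(units(n)) == n - 1


print(units(4))
print([n for n in range(2, 12) if is_field(n)])
```

<p>For $Z/4Z$ only 1 and 3 are invertible, so 2 is left without an inverse, and the values of $n$ between 2 and 11 for which $Z/nZ$ turns out to be a field are exactly the primes.</p>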
</details>
<h4 id="polynomial-fields">Polynomial fields</h4>
<p>Let’s now move our attention from integers to polynomials, like this one:</p>
<p>$$x^7 + 5x^3 - 9x^2 + 2x + 1$$</p>
<p>Polynomials are a sum of coefficients multiplied by a variable (usually denoted
by the letter <em>x</em>) raised to a non-negative integer power.</p>
<p>Let’s restrict our view to polynomials that have integer coefficients, like the
one shown above. Something that is <em>not</em> a polynomial with integer coefficients
is $1/2 x^2 + x$, because it has a fraction in it.</p>
<p>Integers and polynomials with integer coefficients are somewhat similar to each
other. They kinda behave the same in many aspects. One important property of
integers is the <a href="https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic">unique factorization
theorem</a>: if
we have an integer greater than 1, there’s a way to write it as a product of
prime factors. For example, the integer 350 can be factored as the
product of 2, 5, 5, and 7.</p>
<p>$$350 = 7 \cdot 5 \cdot 5 \cdot 2$$</p>
<p>This factorization is <em>unique</em>: we can change the order of the factors, but
it’s not possible to obtain a different set of factors (there’s no way to make
the number 3 appear in the factorization of 350, or to make the number 7
disappear).</p>
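<p>To make the example concrete, here is a small trial-division sketch (the
function name <code>prime_factors</code> is just an illustrative choice, and
this approach is only practical for small numbers). It always returns the same
factors for a given integer, no matter how we go about finding them:</p>

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in non-decreasing order."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out every occurrence of d before moving on
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # Whatever is left is itself a prime factor
        factors.append(n)
    return factors


print(prime_factors(350))  # [2, 5, 5, 7]
print(prime_factors(42))   # [2, 3, 7]
```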
<p>Polynomials with integer coefficients also have a unique factorization. In the
case of integers, we call the unique factors “prime numbers”; in the case of
polynomials we have “irreducible polynomials”. And just like we can have a
field of integers modulo a prime, we can have a <strong>field of polynomials modulo
an irreducible polynomial</strong>.</p>
<table>
<thead>
<tr>
<th>Integers</th>
<th>Polynomials (with integer coefficients)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Unique factorization: $42 = 7 \cdot 3 \cdot 2$</td>
<td>Unique factorization: $x^3 - 1 = (x^2 + x + 1)(x - 1)$</td>
</tr>
<tr>
<td>Prime numbers: 2, 3, 5, 7, 11, …</td>
<td>Irreducible polynomials: $x + 1$, $x^2 - 2$, $x^2 + x + 1$, …</td>
</tr>
<tr>
<td>Integers modulo a prime number</td>
<td>Polynomials modulo an irreducible polynomial</td>
</tr>
</tbody>
</table>
<p>Let’s take a look at how arithmetic in polynomial fields works. Let’s take, for
example, the field of polynomials with integer coefficients modulo $x^3 + x +
1$, and try to compute the result of $(x^2 + 1)(x^2 + 2)$. If we expand the
expression, we get:</p>
<p>$$(x^2 + 1)(x^2 + 2) = x^4 + 3x^2 + 2$$</p>
<p>This expression can be <em>reduced</em>. Reducing a polynomial expression is the
equivalent of what we were doing with the integers modulo a prime, when we were
saying that 8 = 7 + 1 = 1 (mod 7). That “conversion” from 8 to 1 is the
equivalent of the reduction that we’re talking about here.</p>
<p>To reduce $x^4 + 3x^2 + 2$, first note that $x^4 = x \cdot x^3$. Also note that
$x^3 = x^3 + x + 1 - x - 1$. Here we have just added and removed the term $x +
1$: the result hasn’t changed, but now the irreducible polynomial $x^3 + x + 1$
appears in the expression, and so we can substitute it with 0. Putting
everything together, we get:</p>
<p>$$\begin{align*}
(x^2 + 1)(x^2 + 2) & = x^4 + 3x^2 + 2 \\
& = x \cdot x^3 + 3x^2 + 2 \\
& = x \cdot (x^3 + x + 1 - x - 1) + 3x^2 + 2 \\
& = x \cdot (0 - x - 1) + 3x^2 + 2 \\
& = -x^2 - x + 3x^2 + 2 \\
& = 2x^2 - x + 2
\end{align*}$$</p>
<p>It’s interesting to note that, if the polynomial field is formed by an
irreducible polynomial with degree $n$, then all the polynomials in that field
will have degree less than $n$. That’s because if any $x^n$ (or higher)
appears in a polynomial expression, then we can use the substitution trick I
just showed to reduce its degree.</p>
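<p>The reduction above can be reproduced with a short sketch. It assumes a
representation that is not used elsewhere in this article: a polynomial is a
list of integer coefficients, with the index equal to the power (lowest power
first). The names <code>poly_mul</code> and <code>poly_reduce</code> are made
up for this example, and the modulus is assumed to have leading coefficient
1:</p>

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power)."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            result[i + j] += ca * cb
    return result


def poly_reduce(a, m):
    """Reduce a modulo m, where m has leading coefficient 1."""
    assert m[-1] == 1
    a = list(a)
    n = len(m) - 1  # degree of the modulus
    # Eliminate terms from the highest power down: subtracting c * x^k * m
    # removes the term c * x^(n+k) without changing the value modulo m
    for power in range(len(a) - 1, n - 1, -1):
        c = a[power]
        for k in range(n + 1):
            a[power - n + k] -= c * m[k]
    return a[:n]


modulus = [1, 1, 0, 1]                    # x^3 + x + 1
product = poly_mul([1, 0, 1], [2, 0, 1])  # (x^2 + 1)(x^2 + 2)
print(product)                            # [2, 0, 3, 0, 1], i.e. x^4 + 3x^2 + 2
print(poly_reduce(product, modulus))      # [2, -1, 2], i.e. 2x^2 - x + 2
```

<p>The result always has at most $n$ coefficients, which matches the
observation that every element has degree less than $n$.</p>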
<h4 id="binary-fields">Binary fields</h4>
<p>Let’s now look at polynomials where <strong>coefficients are from the field of
integers modulo 2</strong>, meaning that they can be either 0 or 1. This is an example
of such a polynomial:</p>
<p>$$x^7 + x^4 + x^2 + 1$$</p>
<p>or, in a more explicit form, where we can clearly see all the coefficients:</p>
<p>$$1 x^7 + 0 x^6 + 0 x^5 + 1 x^4 + 0 x^3 + 1 x^2 + 0 x^1 + 1 x^0$$</p>
<p>These are called <strong>binary polynomials</strong>. It’s interesting to note that if we
ignore the variables and the powers, and keep only the coefficients, then what
we get is a <strong>bit string</strong>:</p>
<p>$$(1 0 0 1 0 1 0 1)$$</p>
<p>This suggests that there’s an interesting duality between binary polynomials
and bit strings. This means, in particular, that binary polynomials can be
represented in a very compact and natural way on computers.</p>
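<p>As an illustration of this correspondence, here is a small sketch that
renders an <code>int</code> as the binary polynomial it represents, with the
most significant bit standing for the highest power. The helper name
<code>to_polynomial</code> is an assumption made for this example:</p>

```python
def to_polynomial(bits: int) -> str:
    """Render an integer as the binary polynomial it represents."""
    terms = []
    for power in range(bits.bit_length() - 1, -1, -1):
        if (bits >> power) & 1:
            if power == 0:
                terms.append('1')
            elif power == 1:
                terms.append('x')
            else:
                terms.append(f'x^{power}')
    return ' + '.join(terms) or '0'


print(to_polynomial(0b10010101))  # x^7 + x^4 + x^2 + 1
```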
<p>The duality between binary polynomials and bit strings also suggests that
perhaps we can use bitwise operations to perform arithmetic on binary
polynomials. And this turns out to be true, in fact:</p>
<ul>
<li>binary polynomial addition can be computed using the XOR operator on the two
corresponding bit strings;</li>
<li>binary polynomial multiplication can be computed using XOR, AND and
bit-shifting.</li>
</ul>
<p>Computers are pretty fast at performing these bitwise operations, and this
makes binary polynomials quite attractive for use in computer algorithms and
cryptography.</p>
<details>
<summary>Arithmetic with binary polynomials</summary>
<p>The arithmetic of such polynomials is quite interesting: because $1 +
1 = 0$ (modulo 2), we also have $x^k + x^k = 0$. In fact:</p>
<p>$$1 \cdot x^k + 1 \cdot x^k = (1 + 1) x^k = 0 \cdot x^k = 0$$</p>
<p>It’s easy to see that addition modulo 2 is equivalent to the XOR binary
operator. And addition of two binary polynomials is equivalent to the bitwise
XOR of their corresponding bit strings:</p>
<p>$$\begin{array}{ccccc}
(x^3 + x^2 + 1) & + & (x^2 + x) & = & x^3 + x + 1 \\
\updownarrow & & \updownarrow & & \updownarrow \\
(1101) & \oplus & (0110) & = & (1011)
\end{array}$$</p>
<p>Multiplication of binary polynomials can also be implemented as a bitwise
operation on bit strings. First, note that multiplying a polynomial by a
monomial is equivalent to bit-shifting:</p>
<p>$$\begin{array}{ccccc}
(x^3 + x + 1) & \cdot & x^2 & = & x^5 + x^3 + x^2 \\
\updownarrow & & \updownarrow & & \updownarrow \\
(1011) & \ll & 2 & = & (101100)
\end{array}$$</p>
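<p>The two correspondences above can be checked directly in Python, using
plain <code>int</code> values as bit strings (a minimal sketch; the variable
names are arbitrary):</p>

```python
a = 0b1101  # x^3 + x^2 + 1
b = 0b0110  # x^2 + x

# Addition of binary polynomials is a bitwise XOR
total = a ^ b
print(f'{total:04b}')  # 1011, i.e. x^3 + x + 1

# Multiplying by the monomial x^2 is a left shift by 2
shifted = 0b1011 << 2
print(f'{shifted:b}')  # 101100, i.e. x^5 + x^3 + x^2

# Adding the same polynomial twice cancels out, because 1 + 1 = 0 (modulo 2)
assert total ^ b == a
```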
<p>Then note that multiplication of two polynomials can be expressed as the sum of
multiplications by monomials:</p>
<p>$$(x^3 + 1)(x^2 + x + 1) = (x^3 + 1) \cdot x^2 + (x^3 + 1) \cdot x^1 + (x^3 + 1) \cdot x^0$$</p>
<p>Putting everything together, we have multiplications by monomials (equivalent
to bit-shifts) and sums (equivalent to bitwise XOR). This suggests that
multiplication can be implemented on top of bitwise XOR and bit-shifting.</p>
<p>Here is some Python code to implement binary polynomial multiplication, where
each polynomial is represented compactly as an <code>int</code>:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">multiply</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Compute a*b, where a and b are two integers representing binary</span>
<span class="sd">    polynomials.</span>
<span class="sd">    a and b are expected to have their most significant bit set to</span>
<span class="sd">    the monomial with the highest power. For example, the polynomial</span>
<span class="sd">    x^4 is represented as the integer 0b10000.</span>
<span class="sd">    """</span>
    <span class="k">assert</span> <span class="n">a</span> <span class="o">>=</span> <span class="mi">0</span>
    <span class="k">assert</span> <span class="n">b</span> <span class="o">>=</span> <span class="mi">0</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">while</span> <span class="n">b</span><span class="p">:</span>
        <span class="n">result</span> <span class="o">^=</span> <span class="n">a</span> <span class="o">*</span> <span class="p">(</span><span class="n">b</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">a</span> <span class="o"><<=</span> <span class="mi">1</span>
        <span class="n">b</span> <span class="o">>>=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">result</span>
</code></pre></div>
<p>Other than XOR and bit-shifting, this code also uses AND to “query” whether a
certain monomial is present or not.</p>
<p>Here is an example of how to use the code:</p>
<div class="highlight"><pre><span></span><code><span class="n">a</span> <span class="o">=</span> <span class="mb">0b0101_0111</span> <span class="c1"># x^6 + x^4 + x^2 + x + 1</span>
<span class="n">b</span> <span class="o">=</span> <span class="mb">0b0001_1010</span> <span class="c1"># x^4 + x^3 + x</span>
<span class="n">c</span> <span class="o">=</span> <span class="n">multiply</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">c</span> <span class="o">==</span> <span class="mb">0b0111_0110_0110</span> <span class="c1"># x^10 + x^9 + x^8 + x^6 + x^5 + x^2 + x</span>
</code></pre></div>
</details>
<p>Now that we have introduced binary polynomials, we can of course form <strong>binary
polynomials modulo a binary irreducible polynomial</strong>. These form a finite
field, which is more concisely called a <strong>binary field</strong>.</p>
<p>Note that in a binary field where the modulus is an irreducible polynomial of
degree $n$, all polynomials in the field can be represented as $n$-bit strings,
and all $n$-bit strings have a corresponding binary polynomial in the field.</p>
<details>
<summary>Arithmetic in binary fields</summary>
<p>If we have three integers $a$, $b$, and $p$, we can compute $(a + b) \bmod{p}$
or $a \cdot b \bmod{p}$ by performing the binary operation (addition or
multiplication) and then taking the remainder of the division by $p$. This is a
method that returns the results of addition or multiplication using a
representation with the lowest number of digits possible.</p>
<p>What if instead of having three integers we have three binary polynomials $A$, $B$,
and $P$ and we want to compute $(A + B) \bmod{P}$ or $A \cdot B \bmod{P}$? It
turns out that these operations can be implemented with code that is even
easier than the integer counterpart: no division needs to be involved!</p>
<p>Let’s start with addition: we have already seen that addition with binary
polynomials can be implemented with a simple XOR operation. This means that if
the degrees of $A$ and $B$ are lower than the degree of $P$, then the result of
$A + B$ is also going to have degree lower than that of $P$, hence no reduction is
needed. We can use the result as-is, without any transformation: <strong>adding two
binary field elements can be implemented with a single XOR operation</strong>.</p>
<p>With multiplication the story is different: the product $A \cdot B$ may have
degree equal to or higher than that of $P$. For example, if $A = B = x$ and $P = x^2 +
1$, the product $A \cdot B$ is equal to $x^2$, which has the same degree as
$P$. We need to find a way to efficiently reduce the higher-degree terms of
this product. To see one way to do that, note that we can write $P$ like this:</p>
<p>$$P = x^n + Q$$</p>
<p>where $n$ is the degree of $P$ (the maximum power of $P$) and $Q$ is another
binary polynomial, with degree strictly lower than $n$. Rearranging the
equation, we get:</p>
<p>$$x^n = P + Q$$</p>
<p>Note that subtraction and addition are the same operations in a binary field.
Because $P$ equals 0, we can write:</p>
<p>$$x^n = Q$$</p>
<p>This equivalence gives us a way to eliminate higher-degree terms that appear
during multiplication: whenever we see an $x^n$ appearing in the result, we can
remove that term and add $Q$ instead. One way to do that, using binary strings,
is to discard the highest bit (the one corresponding to $x^n$) and XOR with
the binary string corresponding to $Q$.</p>
<p>Another way to do it is to just add $P$ (XOR by the binary string corresponding
to $P$). This is equivalent to adding 0, and it results in the more compact
representation that we’re interested in.</p>
<p>We could use similar tricks to eliminate terms like $x^{n+1}$, but these tricks
are not necessary if we eliminate $x^n$ terms as soon as they appear in an
iterative way.</p>
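<p>The two ways of removing an $x^n$ term described above can be sketched side
by side. The function names <code>reduce_with_q</code> and
<code>reduce_with_p</code> are made up for this illustration, and both assume
that the input has degree at most $n$ (so that a single step is enough):</p>

```python
def reduce_with_q(value: int, p: int) -> int:
    """Replace x^n with Q: drop the x^n bit, then XOR with Q."""
    n = p.bit_length() - 1
    q = p ^ (1 << n)  # P without its leading x^n term
    if (value >> n) & 1:
        value = (value ^ (1 << n)) ^ q
    return value


def reduce_with_p(value: int, p: int) -> int:
    """Add P (which equals 0 in the field) whenever the x^n bit is set."""
    n = p.bit_length() - 1
    if (value >> n) & 1:
        value ^= p
    return value


p = 0b1011      # x^3 + x + 1
value = 0b1100  # x^3 + x^2
print(bin(reduce_with_q(value, p)))  # 0b111, i.e. x^2 + x + 1
print(bin(reduce_with_p(value, p)))  # 0b111, the same result
```

<p>Both functions perform the same XORs, just grouped differently, which is
why they always agree.</p>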
<p>Here is some Python code for multiplication in binary fields that uses the “add
$P$” trick just described:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">multiply</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Compute a*b modulo p, where a, b and p are three integers representing</span>
<span class="sd">    binary polynomials.</span>
<span class="sd">    a, b and p are expected to have their most significant bit set to the</span>
<span class="sd">    highest power monomial. For example, the polynomial x^4 is represented as</span>
<span class="sd">    0b10000.</span>
<span class="sd">    """</span>
    <span class="n">bit_length</span> <span class="o">=</span> <span class="n">p</span><span class="o">.</span><span class="n">bit_length</span><span class="p">()</span> <span class="o">-</span> <span class="mi">1</span>
    <span class="k">assert</span> <span class="n">a</span> <span class="o">>=</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">a</span> <span class="o"><</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="n">bit_length</span><span class="p">)</span>
    <span class="k">assert</span> <span class="n">b</span> <span class="o">>=</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">b</span> <span class="o"><</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="n">bit_length</span><span class="p">)</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">bit_length</span><span class="p">):</span>
        <span class="n">result</span> <span class="o">^=</span> <span class="n">a</span> <span class="o">*</span> <span class="p">(</span><span class="n">b</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">a</span> <span class="o"><<=</span> <span class="mi">1</span>
        <span class="n">a</span> <span class="o">^=</span> <span class="n">p</span> <span class="o">*</span> <span class="p">((</span><span class="n">a</span> <span class="o">>></span> <span class="n">bit_length</span><span class="p">)</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">b</span> <span class="o">>>=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">result</span>
</code></pre></div>
<p>This code is essentially the same as the binary polynomial multiplication code we had before, except for this line in the <code>for</code> loop:</p>
<div class="highlight"><pre><span></span><code><span class="n">a</span> <span class="o">^=</span> <span class="n">p</span> <span class="o">*</span> <span class="p">((</span><span class="n">a</span> <span class="o">>></span> <span class="n">bit_length</span><span class="p">)</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span>
</code></pre></div>
<p>This line is what “adds $P$” whenever shifting $A$ causes an $x^n$ term to
appear.</p>
<p>Once again, we have implemented <strong>multiplication using only XOR, AND and
bit-shifting</strong>.</p>
<p>Note that the binary polynomial $P$ here does not necessarily need to be an
irreducible polynomial for this algorithm to work. However, the resulting
algebraic structure won’t be a field unless $P$ is irreducible. A similar story
holds for integers: we can have integers modulo a non-prime number, but that’s
not a field.</p>
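<p>To see the difference between an irreducible and a reducible modulus, here
is a brute-force sketch that lists the nonzero elements without a
multiplicative inverse. It restates the multiplication algorithm shown above
so that it runs on its own; the helper name
<code>elements_without_inverse</code> is made up for this example:</p>

```python
def multiply(a, b, p):
    """Compute a*b modulo p on binary polynomials (same algorithm as above)."""
    n = p.bit_length() - 1
    result = 0
    for _ in range(n):
        result ^= a * (b & 1)
        a <<= 1
        a ^= p * ((a >> n) & 1)
        b >>= 1
    return result


def elements_without_inverse(p):
    """List the nonzero elements that have no multiplicative inverse modulo p."""
    size = 1 << (p.bit_length() - 1)
    return [
        a for a in range(1, size)
        if all(multiply(a, b, p) != 1 for b in range(1, size))
    ]


print(elements_without_inverse(0b111))  # []: x^2 + x + 1 is irreducible
print(elements_without_inverse(0b101))  # [3]: x^2 + 1 = (x + 1)^2, so x + 1 has no inverse
```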
</details>
<h3 id="the-ghash-keyed-hash-function">The GHASH keyed hash function</h3>
<p>GCM uses a binary field. The irreducible binary polynomial that defines the
binary field used by GCM is:</p>
<p>$$P = x^{128} + x^7 + x^2 + x + 1$$</p>
<p>We will call this field the <em>GCM field</em>. Note that this polynomial has degree
128, hence the GCM field elements can be represented as 128-bit strings, and
each 128-bit string has a corresponding element in the GCM field.</p>
<p>The keyed hash function used by GCM is called GHASH and takes as input a
128-bit key. We will call this key $H$. This key is interpreted as an element
of the GCM field.</p>
<p>The message to authenticate is split into blocks of 128 bits each: $M_1$,
$M_2$, $M_3$, … $M_n$. If the length of the message is not a multiple of 128
bits, then the last block is padded with zeros. Each block of message is also
interpreted as an element of the GCM field.</p>
<p>Here is how the authentication tag is computed from $H$ and the padded message
blocks $M_1$, …, $M_n$:</p>
<ul>
<li>The initial state (a GCM field element) is initialized to 0: $A_0 = 0$.</li>
<li>For every block of message $M_i$, the next state $A_i$ is computed as $A_i =
(A_{i-1} + M_i) \cdot H \bmod{P}$.</li>
<li>The final state $A_n$ is returned as a 128-bit string.</li>
</ul>
<p>What this function is doing is computing the following polynomial in $H$:</p>
<p>$$\begin{align*}
Tag
& = (((M_1 \cdot H + M_2) \cdot H + \cdots + M_n) \cdot H) \bmod{P} \\
& = (M_1 H^n + M_2 H^{n-1} + \cdots + M_n H) \bmod{P}
\end{align*}$$</p>
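<p>The equivalence between the iterative computation and the expanded
polynomial can be checked with a small sketch. To keep the numbers readable it
does <em>not</em> use the GCM field: it assumes a toy 8-bit binary field
defined by $x^8 + x^4 + x^3 + x + 1$, with the most significant bit standing
for the highest power. All function names and the sample values are
illustrative:</p>

```python
P = 0b1_0001_1011  # x^8 + x^4 + x^3 + x + 1: toy 8-bit field, NOT the GCM field


def multiply(a, b, p=P):
    """Multiply two binary field elements modulo p."""
    n = p.bit_length() - 1
    result = 0
    for _ in range(n):
        result ^= a * (b & 1)
        a <<= 1
        a ^= p * ((a >> n) & 1)
        b >>= 1
    return result


def power(h, e):
    """Compute h^e by repeated multiplication."""
    result = 1
    for _ in range(e):
        result = multiply(result, h)
    return result


def tag_iterative(h, blocks):
    """A_i = (A_{i-1} + M_i) * H, starting from A_0 = 0."""
    state = 0
    for m in blocks:
        state = multiply(state ^ m, h)
    return state


def tag_expanded(h, blocks):
    """M_1 * H^n + M_2 * H^(n-1) + ... + M_n * H."""
    n = len(blocks)
    total = 0
    for i, m in enumerate(blocks, start=1):
        total ^= multiply(m, power(h, n - i + 1))
    return total


h = 0x57
blocks = [0x83, 0x1C, 0xF0]
assert tag_iterative(h, blocks) == tag_expanded(h, blocks)
```

<p>The iterative form is the one worth implementing: it needs one
multiplication per block, while the expanded form recomputes powers of $H$.</p>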
<p>This construction is somewhat similar to the one from Poly1305, although there
are important differences:</p>
<ul>
<li>In Poly1305, the elements of the tag polynomial are integers modulo a prime;
in GHASH they are elements of a binary field.</li>
<li>GHASH does not perform any step to encode the length of the message, hence
the tag for an empty message will be the same as the tag for a sequence of
zero blocks. We will see later that GCM fixes this problem by appending the
length of the message to the end of the input passed to GHASH.</li>
<li>Most importantly, the final $Tag$ polynomial is a polynomial in one unknown,
and as such $H$ may be easily recoverable using algebraic methods. For this
reason, <strong>GHASH is not suitable as a secure one-time authenticator</strong>. We will
see that GCM fixes this problem by encrypting the output of GHASH.</li>
</ul>
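<p>The second point in the list above can be illustrated with a toy sketch.
It assumes the same kind of simplification as before: an 8-bit binary field
defined by $x^8 + x^4 + x^3 + x + 1$ instead of the GCM field, 8-bit blocks,
and a hypothetical 8-bit length block. The name <code>toy_ghash</code> is made
up for this example:</p>

```python
P = 0b1_0001_1011  # x^8 + x^4 + x^3 + x + 1: toy 8-bit field, NOT the GCM field


def multiply(a, b, p=P):
    """Multiply two binary field elements modulo p."""
    n = p.bit_length() - 1
    result = 0
    for _ in range(n):
        result ^= a * (b & 1)
        a <<= 1
        a ^= p * ((a >> n) & 1)
        b >>= 1
    return result


def toy_ghash(h, blocks):
    """Same iteration as GHASH, but on toy 8-bit blocks."""
    state = 0
    for m in blocks:
        state = multiply(state ^ m, h)
    return state


h = 0x57

# Without a length block, any number of zero blocks hashes to 0, like the
# empty message
assert toy_ghash(h, []) == toy_ghash(h, [0]) == toy_ghash(h, [0, 0, 0]) == 0

# Appending the message length in bits as a final block tells them apart
empty = toy_ghash(h, [] + [0])            # empty message, length 0
one_zero_block = toy_ghash(h, [0] + [8])  # one 8-bit zero block, length 8
assert empty != one_zero_block
```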
<h3 id="use-of-gcm-with-aes-aes-gcm">Use of GCM with AES (AES-GCM)</h3>
<p>GCM is the combination of a block cipher, Counter Mode (CTR), and the GHASH
function that we have just seen. The block cipher is often AES. When we combine
AES with GCM, what we get is AES-GCM, which is described below. However, the
block cipher does not necessarily need to be AES: what is important is that the
block size of the cipher is 128 bits, and that’s because GHASH only works on
128-bit blocks.</p>
<p>The <strong>inputs to the AES-GCM encryption function</strong> are:</p>
<ul>
<li>a secret key (the length of the key depends on the variant of AES used: if
AES-128, this will be 128 bits);</li>
<li>a 96-bit nonce;</li>
<li>a variable-length plaintext message.</li>
</ul>
<p>The <strong>outputs of the AES-GCM encryption function</strong> are:</p>
<ul>
<li>a variable-length ciphertext (same length as the input plaintext);</li>
<li>a 128-bit authentication tag.</li>
</ul>
<p>The <strong>AES-GCM decryption function</strong> will accept the same secret key, nonce,
ciphertext, and authentication tag as the input, and produce either the
plaintext or an error as the output. The error is returned in case the
authentication fails.</p>
<figure>
<img src="https://andrea.corbellini.name/images/aes-gcm-encryption.svg" alt="Diagram of data flow during encryption with AES-GCM">
<figcaption>Data flow during an AES-GCM encryption. This shows the inputs in <span style="color:#3465a4;font-weight:bold">blue</span>, the outputs in <span style="color:#73d216;font-weight:bold">green</span>, and the intermediate objects in <span style="color:#cc0000;font-weight:bold">red</span>.</figcaption>
</figure>
<p>AES-GCM works in the following way:</p>
<ol>
<li>
<p>The <strong>GHASH subkey</strong> $H$ is generated by encrypting a zero-block: $H =
\operatorname{Encrypt}(key, \underbrace{000\dots0}_\text{128 bits})$.</p>
</li>
<li>
<p>The block cipher <strong>AES</strong> is <strong>initialized</strong> in Counter Mode (AES-CTR) with
the key, the nonce, and a 32-bit, big-endian counter starting at <strong>2</strong>.</p>
</li>
<li>
<p>The <strong>plaintext</strong> is <strong>encrypted</strong> using the instance of AES-CTR just
created.</p>
</li>
<li>
<p>The <strong>GHASH</strong> function is run with the following inputs:</p>
<ul>
<li>the subkey $H$, computed in step 1;</li>
<li>the ciphertext padded with zeros to make its length a multiple of 16
bytes (128 bits), concatenated to the length (in bits) of the ciphertext
represented as a 128-bit big-endian integer.</li>
</ul>
<p>The result is a 128-bit block $S = \operatorname{GHASH}(H, ciphertext || padding || length)$.</p>
</li>
<li>
<p>The AES-CTR <strong>counter</strong> is set to <strong>1</strong>.</p>
</li>
<li>
<p>The block $S$ is then <strong>encrypted</strong> using AES-CTR. The result of the
encryption is the <strong>authentication tag</strong>.</p>
<p>Note that, because $S$ matches the block size of the cipher, this
encryption won’t cause the counter value 2 to be reused.</p>
</li>
<li>
<p>The ciphertext and authentication tag are returned.</p>
</li>
</ol>
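<p>The counter values used in the steps above can be made concrete with a
short sketch that builds the 16-byte blocks fed to AES in Counter Mode: the
96-bit nonce followed by the 32-bit big-endian counter. The helper name
<code>counter_block</code> and the sample nonce are assumptions made for this
illustration:</p>

```python
def counter_block(nonce: bytes, counter: int) -> bytes:
    """Build the 16-byte AES-CTR input block: 96-bit nonce || 32-bit counter."""
    assert len(nonce) == 12
    return nonce + counter.to_bytes(4, 'big')


nonce = bytes(range(12))  # sample value; a real nonce must never repeat for a key

tag_block = counter_block(nonce, 1)         # used to encrypt S (step 6)
first_data_block = counter_block(nonce, 2)  # first block of the plaintext (step 3)

print(tag_block.hex())         # 000102030405060708090a0b00000001
print(first_data_block.hex())  # 000102030405060708090a0b00000002
```

<p>Because the plaintext starts at counter 2 and $S$ is exactly one block
long, the block built with counter 1 is never used for the plaintext.</p>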
<p>Here is how AES-GCM and GHASH can be implemented in Python, using the AES
implementation from <a href="https://pypi.org/project/pycryptodome/">pycryptodome</a>
(usual disclaimer: this code is for educational purposes, and it’s not
necessarily secure or optimized for performance):</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">Crypto.Cipher</span> <span class="kn">import</span> <span class="n">AES</span>


<span class="k">def</span> <span class="nf">multiply</span><span class="p">(</span><span class="n">a</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">b</span><span class="p">:</span> <span class="nb">int</span><span class="p">)</span> <span class="o">-></span> <span class="nb">int</span><span class="p">:</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Compute a*b in the GCM field, where a and b are two integers representing</span>
<span class="sd">    elements of the GCM field.</span>
<span class="sd">    a and b are expected to have their least significant bit set to the highest</span>
<span class="sd">    power monomial. For example, the polynomial x^125 is represented as 0b100.</span>
<span class="sd">    """</span>
    <span class="n">bit_length</span> <span class="o">=</span> <span class="mi">128</span>
    <span class="n">q</span> <span class="o">=</span> <span class="mh">0xe1000000000000000000000000000000</span>
    <span class="k">assert</span> <span class="n">a</span> <span class="o">>=</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">a</span> <span class="o"><</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="n">bit_length</span><span class="p">)</span>
    <span class="k">assert</span> <span class="n">b</span> <span class="o">>=</span> <span class="mi">0</span> <span class="ow">and</span> <span class="n">b</span> <span class="o"><</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="n">bit_length</span><span class="p">)</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">bit_length</span><span class="p">):</span>
        <span class="n">result</span> <span class="o">^=</span> <span class="n">a</span> <span class="o">*</span> <span class="p">((</span><span class="n">b</span> <span class="o">>></span> <span class="mi">127</span><span class="p">)</span> <span class="o">&</span> <span class="mi">1</span><span class="p">)</span>
        <span class="n">a</span> <span class="o">=</span> <span class="p">(</span><span class="n">a</span> <span class="o">>></span> <span class="mi">1</span><span class="p">)</span> <span class="o">^</span> <span class="p">(</span><span class="n">q</span> <span class="o">*</span> <span class="p">(</span><span class="n">a</span> <span class="o">&</span> <span class="mi">1</span><span class="p">))</span>
        <span class="n">b</span> <span class="o"><<=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">result</span>


<span class="k">def</span> <span class="nf">pad_block</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bytes</span><span class="p">:</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Pad data with zero bytes so that the resulting length is a multiple of 16</span>
<span class="sd">    bytes (128 bits).</span>
<span class="sd">    """</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">data</span> <span class="o">+=</span> <span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="p">(</span><span class="mi">16</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)</span> <span class="o">%</span> <span class="mi">16</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">data</span>


<span class="k">def</span> <span class="nf">iter_blocks_padded</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Split the given data into blocks of 16 bytes (128 bits) each, padding the</span>
<span class="sd">    last block with zeros if necessary.</span>
<span class="sd">    """</span>
    <span class="n">start</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">while</span> <span class="n">start</span> <span class="o"><</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">):</span>
        <span class="k">yield</span> <span class="n">pad_block</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">start</span><span class="p">:</span><span class="n">start</span><span class="o">+</span><span class="mi">16</span><span class="p">])</span>
        <span class="n">start</span> <span class="o">+=</span> <span class="mi">16</span>


<span class="k">def</span> <span class="nf">ghash</span><span class="p">(</span><span class="n">subkey</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bytes</span><span class="p">:</span>
    <span class="n">subkey</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">subkey</span><span class="p">,</span> <span class="s1">'big'</span><span class="p">)</span>
    <span class="k">assert</span> <span class="n">subkey</span> <span class="o"><</span> <span class="p">(</span><span class="mi">1</span> <span class="o"><<</span> <span class="mi">128</span><span class="p">)</span>
    <span class="n">state</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">block</span> <span class="ow">in</span> <span class="n">iter_blocks_padded</span><span class="p">(</span><span class="n">message</span><span class="p">):</span>
        <span class="n">block</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">block</span><span class="p">,</span> <span class="s1">'big'</span><span class="p">)</span>
        <span class="n">state</span> <span class="o">=</span> <span class="n">multiply</span><span class="p">(</span><span class="n">state</span> <span class="o">^</span> <span class="n">block</span><span class="p">,</span> <span class="n">subkey</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">state</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="s1">'big'</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">aes_gcm_encrypt</span><span class="p">(</span><span class="n">key</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">nonce</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">):</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="ow">in</span> <span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
    <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">nonce</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span>

    <span class="c1"># Initialize a raw AES instance and encrypt a 16-byte block of all zeros to</span>
    <span class="c1"># derive the GHASH subkey H</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_ECB</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">)</span>
    <span class="n">h</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">16</span><span class="p">)</span>

    <span class="c1"># Encrypt the message with AES in CTR mode, with the counter composed by</span>
    <span class="c1"># the concatenation of the 12 byte (96 bits) nonce and a 4 byte (32 bits)</span>
    <span class="c1"># integer, starting from 2</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_CTR</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">,</span> <span class="n">initial_value</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
    <span class="n">ciphertext</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>

    <span class="c1"># Compute the GHASH of the ciphertext plus the ciphertext length in bits</span>
    <span class="n">s</span> <span class="o">=</span> <span class="n">ghash</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">pad_block</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">*</span> <span class="mi">8</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="s1">'big'</span><span class="p">))</span>

    <span class="c1"># Encrypt the GHASH value using AES in CTR mode, with the counter composed</span>
    <span class="c1"># by the concatenation of the 12 byte (96 bits) nonce and a 4 byte (32</span>
    <span class="c1"># bits) integer set at 1. The GHASH value fits in one block, so the counter</span>
    <span class="c1"># won't be increased during this round of encryption</span>
    <span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_CTR</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">,</span> <span class="n">initial_value</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
    <span class="n">tag</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>

    <span class="k">return</span> <span class="p">(</span><span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">aes_gcm_decrypt</span><span class="p">(</span><span class="n">key</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">nonce</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">ciphertext</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">,</span> <span class="n">tag</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">):</span>
<span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="ow">in</span> <span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="mi">24</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
<span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">nonce</span><span class="p">)</span> <span class="o">==</span> <span class="mi">12</span>
<span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">tag</span><span class="p">)</span> <span class="o">==</span> <span class="mi">16</span>
<span class="c1"># Compute the GHASH subkey, the GHASH value, and the authentication tag, in</span>
<span class="c1"># the same exact way as it was done during encryption</span>
<span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_ECB</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">)</span>
<span class="n">h</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="mi">16</span><span class="p">)</span>
<span class="n">s</span> <span class="o">=</span> <span class="n">ghash</span><span class="p">(</span><span class="n">h</span><span class="p">,</span> <span class="n">pad_block</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">+</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">*</span> <span class="mi">8</span><span class="p">)</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="mi">16</span><span class="p">,</span> <span class="s1">'big'</span><span class="p">))</span>
<span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_CTR</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">,</span> <span class="n">initial_value</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">expected_tag</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="c1"># Compare the input tag with the generated tag. If they're different, the</span>
<span class="c1"># plaintext must not be returned to the caller</span>
<span class="k">if</span> <span class="n">tag</span> <span class="o">!=</span> <span class="n">expected_tag</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s1">'authentication failed'</span><span class="p">)</span>
<span class="c1"># The two tags match; decrypt the ciphertext and return the plaintext to the</span>
<span class="c1"># caller. Note that, because AES-CTR is a symmetric cipher, there is no</span>
<span class="c1"># difference between the encrypt and decrypt method: here we are reusing the</span>
<span class="c1"># same exact code used during encryption</span>
<span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_CTR</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">,</span> <span class="n">initial_value</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
<span class="n">message</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span>
<span class="k">return</span> <span class="n">message</span>
</code></pre></div>
<p>And here is how the code can be used:</p>
<div class="highlight"><pre><span></span><code><span class="n">key</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef'</span><span class="p">)</span>
<span class="n">nonce</span> <span class="o">=</span> <span class="nb">bytes</span><span class="o">.</span><span class="n">fromhex</span><span class="p">(</span><span class="s1">'0123456789abcdef01234567'</span><span class="p">)</span>
<span class="n">message</span> <span class="o">=</span> <span class="sa">b</span><span class="s1">'I went to the zoo yesterday but not today'</span>
<span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span> <span class="o">=</span> <span class="n">aes_gcm_encrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">message</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'ciphertext: </span><span class="si">{</span><span class="n">ciphertext</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">' tag: </span><span class="si">{</span><span class="n">tag</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="n">decrypted_message</span> <span class="o">=</span> <span class="n">aes_gcm_decrypt</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="p">,</span> <span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">' plaintext: </span><span class="si">{</span><span class="n">decrypted_message</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="k">assert</span> <span class="n">message</span> <span class="o">==</span> <span class="n">decrypted_message</span>
</code></pre></div>
<p>This snippet produces the following output:</p>
<div class="highlight"><pre><span></span><code>ciphertext: e0c32db2962f9b729c69028d9a1fdfb2c93839fc1188f314c58ee97fd6a242404953bb208df609a33c
tag: 9fa6fe2f77a0c98282868924ace0e4ec
plaintext: b'I went to the zoo yesterday but not today'
</code></pre></div>
<p>This is the same output we would obtain by using the AES-GCM implementation
from pycryptodome directly:</p>
<div class="highlight"><pre><span></span><code><span class="kn">from</span> <span class="nn">Crypto.Cipher</span> <span class="kn">import</span> <span class="n">AES</span>
<span class="n">cipher</span> <span class="o">=</span> <span class="n">AES</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="n">mode</span><span class="o">=</span><span class="n">AES</span><span class="o">.</span><span class="n">MODE_GCM</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="n">key</span><span class="p">,</span> <span class="n">nonce</span><span class="o">=</span><span class="n">nonce</span><span class="p">)</span>
<span class="n">ciphertext</span><span class="p">,</span> <span class="n">tag</span> <span class="o">=</span> <span class="n">cipher</span><span class="o">.</span><span class="n">encrypt_and_digest</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">'ciphertext: </span><span class="si">{</span><span class="n">ciphertext</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s1">' tag: </span><span class="si">{</span><span class="n">tag</span><span class="o">.</span><span class="n">hex</span><span class="p">()</span><span class="si">}</span><span class="s1">'</span><span class="p">)</span>
</code></pre></div>
<p>Nonce reuse is catastrophic for AES-GCM in two ways:</p>
<ul>
<li>
<p>Because the ciphertext produced by AES-GCM is just a variant of AES-CTR,
nonce reuse with GCM can have the same consequences as nonce reuse with
AES-CTR, or any other stream cipher: if someone is able to guess the
plaintext of one message, they can recover the keystream, and use that to
decrypt other messages (or portions of them) encrypted with the same nonce.</p>
</li>
<li>
<p>The GHASH subkey $H$ depends only on the key, so it is the same for every
message. If the same nonce is used twice or more, the block that encrypts the
output of GHASH in step 7 is the same too, so we can use the XOR of two
authentication tags to “cancel” the encryption and obtain a polynomial in $H$.
From there, we can use algebraic methods to recover $H$. This gives us the
ability to forge new, valid authentication tags.</p>
</li>
</ul>
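<p>The first consequence can be illustrated with a few lines of Python. This is
only a sketch: <code>toy_keystream</code> is a hypothetical stand-in for a
stream cipher built from SHA-256 (it is not AES-CTR and is not meant for real
use), but the XOR arithmetic is the same for any stream cipher whose keystream
depends only on the key and the nonce.</p>

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Hypothetical stand-in for a stream cipher: hash key + nonce + block
    # counter to produce as many keystream bytes as needed
    out = b''
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, 'big')).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    # XOR two byte strings, truncating to the shorter one
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, message: bytes) -> bytes:
    return xor(message, toy_keystream(key, nonce, len(message)))


key = bytes.fromhex('0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef')
nonce = bytes.fromhex('0123456789abcdef01234567')

known_message = b'I went to the zoo yesterday but not today'
secret_message = b'meet me at noon'

# Both messages are encrypted with the same key AND the same nonce
known_ciphertext = toy_encrypt(key, nonce, known_message)
secret_ciphertext = toy_encrypt(key, nonce, secret_message)

# The attacker does not know the key: they only know (or guess) one plaintext
recovered_keystream = xor(known_ciphertext, known_message)
recovered_message = xor(secret_ciphertext, recovered_keystream)
assert recovered_message == secret_message
```

<p>Note that the attacker never learns the key: knowing one plaintext and its
ciphertext is enough to decrypt the overlapping portion of every other message
encrypted with the same nonce.</p>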
<p>It’s worth mentioning that there’s a variant of AES-GCM, called AES-GCM-SIV
(Synthetic Initialization Vector), specified in <a href="https://www.rfc-editor.org/rfc/rfc8452">RFC
8452</a>. This differs from AES-GCM in
that it uses a little-endian version of GHASH called POLYVAL (which is faster
on modern CPUs), and in that it allows nonce reuse without the two catastrophic
consequences that I mentioned above.</p>
<p>(Nonce reuse with AES-GCM-SIV still presents a problem, however, just not as
serious as the two above: specifically, it breaks <a href="https://en.wikipedia.org/wiki/Ciphertext_indistinguishability">ciphertext
indistinguishability</a>.)</p>
<h1 id="authenticated-encryption-with-associated-data-aead">Authenticated Encryption with Associated Data (AEAD)</h1>
<p>The way I have described authenticated encryption, and in particular the
constructions ChaCha20-Poly1305 and AES-GCM, is accurate, but incomplete. What
I have told you is that when you use an authenticated encryption cipher, the
ciphertext is checked for integrity and authenticity. But we can use the same
technique to authenticate <em>anything</em>, not just ciphertexts: we can, for example,
authenticate some plaintext data, or authenticate a piece of plaintext data and
a piece of ciphertext altogether.</p>
<p>When we use a method to authenticate a plaintext message only, what we get is a
Message Authentication Code (MAC). We don’t use the word “encryption” in this
context, because the confidentiality of the message is not ensured (only its
authenticity).</p>
<p>When we use a method to authenticate both a ciphertext and a plaintext message,
what we get is <strong>Authenticated Encryption with Associated Data (AEAD)</strong>. In
this construction, there are two messages involved: one to be encrypted
(resulting in a ciphertext), and one to be kept in plaintext. The plaintext
message is called “associated data” (AD) or “additional authenticated data”
(AAD). Both the ciphertext and the associated data are authenticated at
encryption time, so their integrity and authenticity will be enforced.</p>
<p>The <strong>inputs to the encryption function</strong> of an AEAD cipher are, generally
speaking:</p>
<ul>
<li>a key;</li>
<li>a nonce;</li>
<li>the additional data;</li>
<li>the message to encrypt.</li>
</ul>
<p>The <strong>outputs of the encryption</strong> are:</p>
<ul>
<li>the ciphertext;</li>
<li>the authentication tag.</li>
</ul>
<p>Note that there’s only one authentication tag that covers both the additional
data and the ciphertext.</p>
<p>The <strong>inputs to the decryption</strong> function are:</p>
<ul>
<li>the key used for encryption;</li>
<li>the nonce used for encryption;</li>
<li>the additional data used for encryption;</li>
<li>the ciphertext.</li>
</ul>
<p>And the <strong>output of the decryption</strong> is either an error or the decrypted
message.</p>
<p>It’s important to note that the same associated data must be provided both at
encryption time and at decryption time. Changing a single bit of it will make
the entire decryption operation fail.</p>
<p>Both ChaCha20-Poly1305 and AES-GCM (and their variants, XChaCha20-Poly1305 and
AES-GCM-SIV) are AEAD ciphers. Here’s how they implement AEAD:</p>
<ul>
<li>When the Poly1305 or GHASH authenticator is first initialized, it is fed
the additional data, padded with zeros to make its size a multiple of 16
bytes (128 bits).</li>
<li>Then the padded ciphertext is fed into the authenticator.</li>
<li>The length of the additional data and the length of the ciphertext are
represented as two 64-bit integers, concatenated, and fed into the
authenticator.</li>
</ul>
<figure>
<img src="https://andrea.corbellini.name/images/chacha20-poly1305-aead-encryption.svg" alt="Diagram of data flow during encryption with ChaCha20-Poly1305, including the Associated Data (AD)">
<figcaption>Updated data flow during a ChaCha20-Poly1305 encryption which shows where the Associated Data (AD) is placed.</figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/aes-gcm-aead-encryption.svg" alt="Diagram of data flow during encryption with AES-GCM, including the Associated Data (AD)">
<figcaption>Updated data flow during an AES-GCM encryption which shows where the Associated Data (AD) is placed.</figcaption>
</figure>
<p>If the additional data is empty, then what you get are exactly the
constructions that I described earlier in this article.</p>
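<p>The layout of the data fed into the authenticator can be sketched as follows.
The function names and the <code>pad16</code> helper are made up for this
illustration. The two ciphers differ in how they encode the lengths: GHASH (as
used by AES-GCM) takes them in bits as big-endian integers, while Poly1305 (as
used by ChaCha20-Poly1305 in RFC 8439) takes them in bytes as little-endian
integers.</p>

```python
def pad16(data: bytes) -> bytes:
    # Pad with zeros so that the size is a multiple of 16 bytes (128 bits)
    return data + b'\0' * (-len(data) % 16)


def gcm_authenticator_input(associated_data: bytes, ciphertext: bytes) -> bytes:
    # AES-GCM: lengths are expressed in bits, as 64-bit big-endian integers
    return (
        pad16(associated_data)
        + pad16(ciphertext)
        + (len(associated_data) * 8).to_bytes(8, 'big')
        + (len(ciphertext) * 8).to_bytes(8, 'big')
    )


def chacha20_poly1305_authenticator_input(associated_data: bytes, ciphertext: bytes) -> bytes:
    # ChaCha20-Poly1305: lengths are expressed in bytes, as 64-bit
    # little-endian integers
    return (
        pad16(associated_data)
        + pad16(ciphertext)
        + len(associated_data).to_bytes(8, 'little')
        + len(ciphertext).to_bytes(8, 'little')
    )


ciphertext = bytes(range(41))

# With empty associated data, the AES-GCM input reduces to the padded
# ciphertext followed by its length in bits as a 16-byte big-endian integer
without_ad = gcm_authenticator_input(b'', ciphertext)
assert without_ad == pad16(ciphertext) + (len(ciphertext) * 8).to_bytes(16, 'big')

# Changing a single bit of the associated data changes the authenticator input
assert gcm_authenticator_input(b'video', ciphertext) != gcm_authenticator_input(b'vidfo', ciphertext)
```

<p>Padding each part separately and appending both lengths ensures that bytes
cannot be moved between the associated data and the ciphertext without changing
the authenticator input.</p>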
<p>Authenticated Encryption with Associated Data is useful in situations where you
want to encode some metadata along with your encrypted data. For example: an
identifier for the resource that is encrypted, or the type of data encrypted
(text, image, video, …), or some information that indicates what key and
algorithm was used to encrypt the resource, or maybe the expiration of the
data. The associated data is in plaintext so systems that do not have access to
the secret key can gather some properties about the encrypted resource. It must
however be understood that the associated data cannot be trusted until it’s
verified using the secret key. Systems that analyze the associated data must be
designed in such a way that, if the associated data is tampered with, nothing
bad will happen, and such a tampering attempt will be detected sooner or later.</p>
<h1 id="a-word-of-caution">A word of caution</h1>
<p>Something very important to understand is that when using authenticated
encryption ciphers like ChaCha20-Poly1305 or AES-GCM, <strong>decryption can in
theory succeed even if the verification of the authentication tag fails</strong>.</p>
<p>For example, we can decrypt a ciphertext encrypted with ChaCha20-Poly1305 by
using ChaCha20 and ignoring the authentication tag. Similarly, we can decrypt a
ciphertext encrypted with AES-GCM by using AES-CTR and, again, ignoring the
authentication tag. This possibility opens the doors to all the nasty scenarios
that we have seen at the beginning of this article, removing all the benefits
of authenticated encryption.</p>
<p>Perhaps the most important thing to remember when using authenticated
encryption is: <strong>never use decrypted data until you have verified its
authenticity</strong>.</p>
<p>Why am I emphasizing this? Because some AE or AEAD implementations do return
plaintext bytes <em>before</em> verifying their authenticity.</p>
<p>The code samples that I have provided do the following: they first calculate
the authentication tag, compare it to the input tag, and only if the comparison
succeeds they perform the decryption. This is a simple approach, but it may be
expensive when decrypting large amounts of data (for example: several
gigabytes). The reason why this approach is expensive is that, if the
ciphertext is too large, it may not fit all in memory, and the ciphertext would
have to be read from the storage device twice: once for calculating the tag,
and once for decrypting the ciphertext. Also, there is a risk that, between
the first read (to compute the tag) and the second read (to decrypt), the
underlying ciphertext has changed without detection.</p>
<p>What some authenticated encryption implementations do when dealing with large
amounts of data is that they calculate the tag <em>and</em> perform the decryption in
parallel. They read the ciphertext chunk-by-chunk, and pass each chunk to both
the authenticator and the decryption function, returning a chunk of decrypted
bytes to the caller at each iteration. Only at the end, when the full
ciphertext has been read, is the authenticity checked, and only at that point
may the application return an error. With such implementations, it is imperative
that the exit status of the application is checked before using any of the
decrypted bytes.</p>
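<p>One defensive pattern for chunk-by-chunk processing is to stage the decrypted
bytes and release them only once the tag has been verified. The following is a
minimal sketch under a few assumptions: HMAC-SHA256 stands in for the
authenticator (it is neither Poly1305 nor GHASH), <code>decrypt_chunk</code> is
a hypothetical callable supplied by the caller, and the staging area is an
in-memory buffer (for very large inputs it could be a temporary file
instead).</p>

```python
import hashlib
import hmac
import io
from typing import Callable, Iterable


def decrypt_staged(
    chunks: Iterable[bytes],
    mac_key: bytes,
    tag: bytes,
    decrypt_chunk: Callable[[bytes], bytes],
) -> bytes:
    # HMAC-SHA256 is used here only as a stand-in authenticator
    authenticator = hmac.new(mac_key, digestmod=hashlib.sha256)
    staged = io.BytesIO()

    for chunk in chunks:
        # Each ciphertext chunk is read once and passed to both the
        # authenticator and the decryption function
        authenticator.update(chunk)
        staged.write(decrypt_chunk(chunk))

    # Constant-time comparison, to avoid leaking how many bytes of the tag
    # matched
    if not hmac.compare_digest(authenticator.digest(), tag):
        # The staged plaintext is discarded and never reaches the caller
        raise ValueError('authentication failed')

    return staged.getvalue()
```

<p>The ciphertext is still read only once, but no decrypted byte leaves the
function before the tag has been checked.</p>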
<p>An implementation that works like that (returning decrypted bytes before
authentication is complete) is GPG. Here is an example of the output that GPG
produces when decrypting a tampered message:</p>
<div class="highlight"><pre><span></span><code>gpg: AES256.CFB encrypted data
gpg: encrypted with 1 passphrase
This is a very long message.
gpg: WARNING: encrypted message has been manipulated!
</code></pre></div>
<p>The decrypted message (“This is a very long message”) got printed, together
with a warning, and the exit status is 2, indicating that an error occurred. It
is important in this case that the decrypted message is not used in any way.</p>
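<p>When driving such a tool from a script, the check on the exit status can be
sketched like this. The helper name <code>run_and_require_success</code> is made
up for this example, and the GPG invocation in the comment uses a hypothetical
file name; the demonstration at the bottom uses the Python interpreter itself as
a stand-in for a tool that prints output and then exits with status 2.</p>

```python
import subprocess
import sys


def run_and_require_success(command: list[str]) -> bytes:
    # Capture the output instead of streaming it, so that nothing is used
    # before the exit status is known
    result = subprocess.run(command, capture_output=True)
    if result.returncode != 0:
        # The captured output is deliberately dropped: it may be tampered
        raise ValueError(f'command failed with exit status {result.returncode}')
    return result.stdout


# With GPG this could look like (hypothetical file name):
#   plaintext = run_and_require_success(['gpg', '--decrypt', 'message.gpg'])

# Stand-in for a tool that prints some output and then reports an error
failing_tool = [sys.executable, '-c', 'import sys; print("some output"); sys.exit(2)']
try:
    run_and_require_success(failing_tool)
except ValueError as exc:
    print(f'output discarded: {exc}')
```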
<p>Other implementations avoid this problem by simply not encrypting large amounts
of data. If given a large file to encrypt, the file is first split into
multiple chunks of a few KiB, then each chunk is encrypted independently, with
its own nonce and authentication tag. Because each chunk is small,
authentication and decryption can happen in memory, one before the other. If a
chunk was tampered with, decryption would stop, returning truncated output, but
never tampered output. It’s still important to check the exit status of such an
implementation, but the consequences are less catastrophic than before. The
drawback of this approach is that the total size of the ciphertext increases,
because each chunk requires a nonce, an authentication tag, and some
information about the position of the chunk (to prevent the chunks from being
reordered). Storing the nonces or the positions can be avoided by using an
algorithm to generate them on the fly, but storing the tag cannot be avoided.</p>
<p>The method that I have just described (splitting long messages into chunks
that are individually encrypted and authenticated) is used, for example, in
<a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a>, as
well as in the command line tool <a href="https://github.com/FiloSottile/age">age</a>.</p>
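<p>As an illustration of generating nonces on the fly, here is a sketch of one
possible layout: a 12-byte nonce made of an 11-byte chunk counter followed by a
1-byte “last chunk” flag. The layout and the function names are assumptions
made for this example, not a description of what any specific tool does. A
counter-based nonce like this is only safe if every message is encrypted with
its own fresh key; otherwise the same (key, nonce) pair would be reused across
messages.</p>

```python
from typing import Iterator


def chunk_nonce(index: int, is_last: bool) -> bytes:
    # 11-byte big-endian chunk counter + 1-byte last-chunk flag = 12 bytes
    # (96 bits), the nonce size of ChaCha20-Poly1305 and AES-GCM
    return index.to_bytes(11, 'big') + (b'\x01' if is_last else b'\x00')


def split_into_chunks(message: bytes, chunk_size: int = 64 * 1024) -> Iterator[tuple[bytes, bytes]]:
    # An empty message still produces one (empty) chunk, so that there is
    # always a chunk carrying the last-chunk flag
    chunks = [message[i:i + chunk_size] for i in range(0, len(message), chunk_size)] or [b'']
    for index, chunk in enumerate(chunks):
        # Each (nonce, chunk) pair would then be passed to the AEAD cipher
        yield chunk_nonce(index, index == len(chunks) - 1), chunk


for nonce, chunk in split_into_chunks(b'a' * 10, chunk_size=4):
    print(nonce.hex(), chunk)
```

<p>Because the position is encoded in the nonce, a reordered chunk fails
authentication, and the last-chunk flag lets the receiver tell a complete
message apart from one that was cut short.</p>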
<h1 id="summary-and-final-considerations">Summary and final considerations</h1>
<p>At the beginning of this article we have seen some risks of using bare
encryption ciphers: one of them in particular was malleability, that is: the
property that ciphertexts may be modified without detection.</p>
<p>This problem was addressed by using Authenticated Encryption (AE) or
Authenticated Encryption with Associated Data (AEAD), which are methods to
provide integrity and authenticity in addition to confidentiality when
encrypting data.</p>
<p>We have seen the details of the two most popular authenticated encryption
ciphers and briefly mentioned some of their variants. Their features are
summarized here:</p>
<table>
<thead>
<tr>
<th>Cipher</th>
<th>Cipher Type</th>
<th>Key Size</th>
<th>Nonce Size</th>
<th>Nonce Reuse</th>
<th>Tag Size</th>
</tr>
</thead>
<tbody>
<tr>
<td>ChaCha20-Poly1305</td>
<td>Stream, AEAD</td>
<td>256 bits</td>
<td>96 bits</td>
<td>Catastrophic</td>
<td>128 bits</td>
</tr>
<tr>
<td>XChaCha20-Poly1305</td>
<td>Stream, AEAD</td>
<td>256 bits</td>
<td>192 bits</td>
<td>Catastrophic</td>
<td>128 bits</td>
</tr>
<tr>
<td>AES-GCM</td>
<td>Stream, AEAD</td>
<td>128, 192, 256 bits</td>
<td>96 bits</td>
<td>Catastrophic</td>
<td>128 bits</td>
</tr>
<tr>
<td>AES-GCM-SIV</td>
<td>Stream, AEAD</td>
<td>128 or 256 bits</td>
<td>96 bits</td>
<td>Reduced risk</td>
<td>128 bits</td>
</tr>
</tbody>
</table>
<p>Authenticated encryption is used in most of our modern protocols, including
TLS, S/MIME, PGP/GPG, and many more. Failure to implement authenticated
encryption correctly has led to some serious issues in the past.</p>
<p>Whenever you’re using encryption, ask yourself: how are integrity and
authenticity verified? And remember: it’s essential to verify the
authenticity of data <em>before</em> using it.</p>
<p>I hope you enjoyed this article! As usual, if you have any suggestions or
spotted some mistakes, let me know in the comments or by contacting me!</p>andreacorbelliniThu, 09 Mar 2023 18:35:00 +0000tag:andrea.corbellini.name,2023-03-09:/2023/03/09/authenticated-encryption/miscWhat time is it? A simple question with a complex answer. How computers synchronize timehttps://andrea.corbellini.name/2023/01/23/what-time-is-it/<p>Ever wondered how your computer or your phone displays the current date and
time accurately? What keeps all the devices in the world (and in space) in
agreement on what time it is? What makes applications that require precise
timing possible?</p>
<p>In this article, I will explain some of the challenges with time
synchronization and explore two of the most popular protocols that devices use
to keep their time in sync: the Network Time Protocol (NTP) and the Precision
Time Protocol (PTP).</p>
<h1 id="what-is-time">What is time?</h1>
<p>It wouldn’t be a good article about time synchronization without spending a few
words about time. We all have an intuitive concept of time since childhood, but
stating precisely what ‘time’ is can be quite a challenge. I’m going to give
you my idea of it.</p>
<p>Here is a simple definition to start with: <strong>time is how we measure changes</strong>.
If the objects in the universe didn’t change and appeared to be fixed, without
ever moving or mutating, I think we could all agree that time wouldn’t be
flowing. Here by ‘change’ I mean any kind of change: from objects falling or
changing shape, to light diffusing through space, or our memories building up
in our mind.</p>
<p>This definition may be a starting point but does not capture all we know about
time. Something that it does not capture is our concept of past, present, and
future. From our day-to-day experience, we know in fact that an apple would
fall off the tree due to gravity, under the normal flow of time. If we observed
an apple rising from the ground, attaching itself to the tree (without the
action of external forces), we could perhaps agree that what we’re observing is
time flowing backward. And yet, both the apple falling off the tree and the
apple rising from the ground are two valid <em>changes</em> from an initial state.
This is where causality comes into place: <strong>time flows in such a way that the
cause must precede the effect</strong>.</p>
<p>We can now refine our definition of time as an <strong>ordered sequence of changes,
where each change is linked to the previous one by causality</strong>.</p>
<h1 id="how-do-we-measure-time">How do we measure time?</h1>
<p>Now we have a more precise definition of time, but we still don’t have enough
tools to define what is a second, an hour, or a day. This is where things get
more complicated.</p>
<p>If we look at the definition of ‘second’ from the international standard, we
can see that it is currently defined from the emission frequency of caesium-133
(<sup>133</sup>Cs) atoms. If you irradiate caesium-133 atoms with some light
having sufficient energy, the atoms will absorb the light, get excited, and
release the energy back in the form of light at a specific frequency. That
frequency of emission is defined as <span>9<span
style="margin-left:0.2em">192</span><span
style="margin-left:0.2em">631</span><span
style="margin-left:0.2em">770</span></span> Hz, and the second is defined as
the inverse of that frequency. This definition is known as the <a href="https://en.wikipedia.org/wiki/Caesium_standard">caesium
standard</a>.</p>
<p>Here’s a problem to think about: how do we know that a caesium-133 atom, after
getting excited, really emits light at a fixed frequency? The definition of
second is implying that the frequency is constant and the same all over the
world, but how do we know it’s really the case? This assumption is supported by
quantum physics, according to which atoms can only transition between discrete
(quantized) energy states. When an atom gets excited, it transitions from an
energy state $E_1$ to an energy state $E_2$. Atoms like to be in the lowest
energy state, so the atom will not stay in the state $E_2$ for long, and will
want to go back to $E_1$. When doing that, it will release an amount of energy
of exactly $E_2 - E_1$ in the form of a photon. According to the <a href="https://en.wikipedia.org/wiki/Planck_relation">Planck
formula</a>, the photon will have
frequency $f = (E_2 - E_1) / h$ where $h$ is the Planck constant. Because the
energy levels are fixed, the resulting emission frequency is fixed as well.</p>
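<p>To get a feeling for the numbers involved, the Planck relation can be applied
to the caesium-133 frequency quoted above. This is just a sketch: the constant
and function names are chosen for this example, and the value of the Planck
constant is the one fixed by the SI definition.</p>

```python
PLANCK_CONSTANT = 6.62607015e-34    # h, in joule-seconds (exact by definition)
CAESIUM_FREQUENCY = 9_192_631_770   # f, in hertz (exact by definition)


def transition_energy(frequency_hz: float) -> float:
    # E2 - E1 = h * f
    return PLANCK_CONSTANT * frequency_hz


def emission_frequency(energy_gap_joules: float) -> float:
    # f = (E2 - E1) / h
    return energy_gap_joules / PLANCK_CONSTANT


energy_gap = transition_energy(CAESIUM_FREQUENCY)
period = 1 / CAESIUM_FREQUENCY

print(f'E2 - E1 = {energy_gap:.3e} J')
print(f'duration of one oscillation = {period:.3e} s')
```

<p>The energy gap is tiny (on the order of $10^{-24}$ joules), and one second
corresponds to a little over 9 billion oscillations of the emitted
radiation.</p>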
<p>By the way, this process of absorption and emission of photons is the same
process that causes fluorescence.</p>
<figure>
<img src="https://andrea.corbellini.name/images/atomic-clock-emission.svg" alt="Visualization of the absorption and emission process for an atom transitioning between two energy states">
<figcaption>Visualization of the absorption and emission process for an atom transitioning between a ground state $E_1$ and an excited state $E_2$.</figcaption>
</figure>
<p>Assuming that caesium-133 atoms emit light at a single, fixed frequency, we can
now build <em>extremely</em> accurate caesium atomic clocks and measure spans of time
with them. Existing caesium atomic clocks are estimated to be so precise that
they may lose one second every 100 million years.</p>
<p>The same approach can be applied to other substances as well: atomic clocks
have been constructed using rubidium (Rb), strontium (Sr), hydrogen (H),
krypton (Kr), ammonia (NH<sub>3</sub>), and ytterbium (Yb), each with its own
emission frequency and its own accuracy. The <a href="https://www.theverge.com/2015/4/22/8466681/most-accurate-atomic-clock-optical-lattice-strontium">most accurate clock ever
built</a>
is a strontium clock which may lose one second every 15 billion years.</p>
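<p>Figures like “one second every 100 million years” can be converted into a
fractional error, which makes clocks easier to compare. Here is a small sketch
of that arithmetic; the function names are made up for this example, and a year
is assumed to last 365.25 days.</p>

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def fractional_error(years_per_second_lost: float) -> float:
    # Losing 1 second over N years means an error of 1 part in
    # (N * seconds per year)
    return 1 / (years_per_second_lost * SECONDS_PER_YEAR)


def drift_per_day(years_per_second_lost: float) -> float:
    # Seconds lost (or gained) over a single day
    return fractional_error(years_per_second_lost) * SECONDS_PER_DAY


caesium = fractional_error(100e6)    # one second every 100 million years
strontium = fractional_error(15e9)   # one second every 15 billion years

print(f'caesium:   {caesium:.2e} (about {drift_per_day(100e6):.2e} s per day)')
print(f'strontium: {strontium:.2e} (about {drift_per_day(15e9):.2e} s per day)')
```

<p>With these figures, the caesium clock is off by a few parts in $10^{16}$,
and the strontium clock by a few parts in $10^{18}$.</p>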
<h1 id="time-dilation">Time dilation</h1>
<p>If we have two atomic clocks and we let them run for a while, will they show
the same time? This might sound like a rhetorical question: we just established
that the frequencies of emission of atoms are fixed, so why would two identical
atomic clocks ever get out of sync? Well, as a matter of fact, two identical
atomic clocks may get out of sync, and this problem is not due to the clocks,
but to time itself: it appears that time does not always flow in the same way
everywhere.</p>
<p>Many experiments have shown this effect on our planet, the most famous one
probably being the <a href="https://en.wikipedia.org/wiki/Hafele%E2%80%93Keating_experiment">Hafele-Keating
experiment</a>.
In this experiment, a set of caesium clocks was placed on an airplane flying
around the world west-to-east, another set was placed on an airplane flying
east-to-west, and another set remained on the ground. The three sets of clocks, which
were initially in sync before the planes took off, were showing different times
once reunited after the trip. This experiment and similar ones have been
repeated and refined multiple times, and they all showed consistent results.</p>
<p>These effects were due to <a href="https://en.wikipedia.org/wiki/Time_dilation">time
dilation</a>, and the results were
consistent with the predictions of <a href="https://en.wikipedia.org/wiki/Special_relativity">special
relativity</a> and <a href="https://en.wikipedia.org/wiki/General_relativity">general
relativity</a>.</p>
<h2 id="time-dilation-due-to-special-relativity">Time dilation due to special relativity</h2>
<p>Special relativity predicts that if two clocks are moving with two different
velocities, they are going to measure different spans of time.</p>
<p>Special relativity is based on two principles:</p>
<ul>
<li>the speed of light is constant;</li>
<li>there are no privileged reference frames.</li>
</ul>
<p>To understand how these principles affect the flow of time, it’s best to look
at an example: imagine that a passenger is sitting on a train with a laser and
a mirror in front of them. Another person is standing on the ground next to the
railroad and observing the train passing. The passenger points the laser
perpendicular to the mirror and turns it on.</p>
<p>The passenger will observe the beam of light from the laser hit the mirror
and come back in a straight line:</p>
<figure>
<img src="https://andrea.corbellini.name/images/special-relativity-train-reference-frame.webp" alt="Beam of light in the train reference frame">
<figcaption>Portion of the beam of light in the train reference frame, emitted from the laser (bottom) and bouncing from the mirror (top). Note how it follows a vertical path.</figcaption>
</figure>
<p>From the observer’s perspective, however, things are quite different. Because the
train is moving relative to the observer, the beam looks like it’s taking a
different, slightly longer path:</p>
<figure>
<img src="https://andrea.corbellini.name/images/special-relativity-observer-reference-frame.webp" alt="Beam of light in the observer reference frame">
<figcaption>The same portion of light beam as before, but this time in the observer reference frame. Note how it follows a diagonal path, longer than the vertical path in the train reference frame.</figcaption>
</figure>
<p>If both the passenger and the observer measure how long it took for the light
beam to return to the source, and if the principles of special relativity
hold, then the two will record different measurements. If the speed of
light is constant, and there is no privileged reference frame, then the speed
of light $c$ must be the same in both reference frames. From the passenger’s
perspective, the beam has traveled a distance of $2 L$, taking a time $2 L /
c$. From the observer’s perspective, the beam has traveled a longer distance $2
M$, with $M > L$, taking a longer time $2 M / c$.</p>
<figure>
<img src="https://andrea.corbellini.name/images/special-relativity-reference-frame-comparison.webp" alt="Beam of light in the observer reference frame">
<figcaption>Comparison of the light beams as seen from the two reference frames. In the train reference frame, the light beam is a vertical line of length $L$ (therefore traveling a path of length $2 L$ after bouncing from the mirror). In the observer reference frame, the light beam is distorted due to the velocity of the train. If the train moves at speed $v$, then the light beam travels a total length of $2 M = 2 L c / \sqrt{c^2 - v^2}$.</figcaption>
</figure>
<p>How can we reconcile these counterintuitive measurements? Special relativity
does it by stating that time flows differently in the two reference frames.
Time runs “slower” inside the train and runs “faster” for the observer. One
consequence of that is that the passenger ages less than the observer.</p>
<p>Time dilation due to special relativity is not easily detectable in our
day-to-day life, but it can still cause problems with high-precision clocks.
This time dilation may in fact cause clock drifts in the order of hundreds of
nanoseconds per day.</p>
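<p>To get a feel for the magnitudes involved, here is a minimal Python sketch
based on the train example above: the ratio between the two measured durations
is $M / L = c / \sqrt{c^2 - v^2}$, and the drift is how much that ratio exceeds 1
over one day. The function name and the sample speed (250 m/s, roughly a
cruising airliner) are assumptions made for illustration.</p>

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact by definition)
SECONDS_PER_DAY = 86_400


def drift_per_day(v: float) -> float:
    """Seconds per day that a clock moving at speed v (m/s) falls behind
    a clock at rest, according to special relativity alone."""
    x = (v / C) ** 2
    # M / L - 1 = 1 / sqrt(1 - x) - 1, computed via log1p/expm1 so that
    # precision is not lost when x is tiny.
    excess = math.expm1(-0.5 * math.log1p(-x))
    return excess * SECONDS_PER_DAY


# Assumed example speed: 250 m/s, roughly a cruising airliner.
print(f"{drift_per_day(250.0) * 1e9:.1f} ns/day")
```

<p>At airliner speeds the drift comes out at roughly 30 nanoseconds per day;
faster objects, such as satellites, accumulate considerably more.</p>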
<h2 id="time-dilation-due-to-general-relativity">Time dilation due to general relativity</h2>
<p>Experimental data shows that clocks in a gravitational field do not follow
(solely) the rules of special relativity. This does not mean that special
relativity is wrong, but it’s a sign that it is incomplete. This is where
general relativity comes into play. In general relativity, <strong>gravity</strong> is not
seen as a <em>force</em>, like in classical physics, but rather as a deformation of
<a href="https://en.wikipedia.org/wiki/Spacetime">spacetime</a>. All objects that have
mass bend spacetime, and the path of objects traveling through spacetime is
affected by its curvature.</p>
<p>An apple falling from a tree is not going towards the ground because there’s a
force “pushing” it down, but rather because that’s the shortest <a href="https://en.wikipedia.org/wiki/World_line">path in
spacetime</a> (a straight line in bent
spacetime).</p>
<figure>
<img src="https://andrea.corbellini.name/images/apple-falling-classical-physics.webp" alt="Apple falling according to classical physics, following a parabolic motion">
<figcaption>Apple falling according to classical physics, following a parabolic motion.</figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/apple-falling-general-relativity.webp" alt="Apple falling according to general relativity, following a straight path in distorted spacetime">
<figcaption>Apple falling according to general relativity, following a straight path in distorted spacetime.</figcaption>
</figure>
<p>The larger the mass of objects, the larger the curvature of spacetime they
produce. Time flows “slower” near large masses, and “faster” away from them.
Two interesting consequences: people on a mountain age faster than people at sea
level, and it has been
<a href="https://phys.org/news/2016-05-earth-core-younger-thought.html">calculated</a>
that the core of the Earth is 2.5 years younger than the crust.</p>
<p>The time dilation caused by gravity on the surface of the Earth may amount to
clock drifts in the order of hundreds of nanoseconds per day, just like special
relativity.</p>
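<p>As a rough sanity check of that order of magnitude, the sketch below uses the
standard weak-field approximation, which is not derived in this article: near
the Earth’s surface, a clock raised by a height $h$ runs faster by a fraction of
about $g h / c^2$. The constants, the function name, and the sample altitudes
are assumptions made for illustration.</p>

```python
C = 299_792_458.0  # speed of light in m/s
G_SURFACE = 9.80665  # standard gravity in m/s^2 (assumed constant with height)
SECONDS_PER_DAY = 86_400


def gravitational_gain_per_day(height_m: float) -> float:
    """Seconds per day gained by a clock raised by height_m metres,
    using the weak-field approximation g * h / c^2."""
    return G_SURFACE * height_m / C**2 * SECONDS_PER_DAY


for h in (1_000, 10_000):  # a mountain, a cruising airliner
    print(f"{h} m: {gravitational_gain_per_day(h) * 1e9:.1f} ns/day")
```

<p>Under this approximation, one kilometre of altitude is worth a bit less than
10 nanoseconds per day, and ten kilometres a bit less than 100.</p>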
<h1 id="can-we-actually-synchronize-clocks">Can we actually synchronize clocks?</h1>
<p>Given what we have seen about time dilation, and that we may experience time
differently, does it even make sense to talk about time synchronization? Can we
agree on time if time flows differently for us?</p>
<p>The short answer is yes: the trick is to restrict our view to a closed system,
like the surface of our planet. If we place some clocks scattered across the
system, they will almost certainly experience different flows of time, due to
different velocities, different altitudes, and other time dilation phenomena.
We cannot make those clocks agree on how much time has passed since a specific
event; what we can do is aggregate all the time measurements from the clocks
and average them out. This way we end up with a value that is representative of
how much time has passed on the entire system—in other words, we get an
“overall time” for the system.</p>
<p>Very often, the system that we consider is not restricted to just the surface
of our planet, but involves the Sun, and sometimes the moon as well. In fact,
what we call one <em>year</em> is roughly the time it takes for the Earth to complete
an orbit around the Sun; one <em>day</em> is roughly the time it takes for the Earth
to spin around itself once and face the Sun in the same position again.
Including the Sun (or the moon) in our time measurements is complicated: in
part this complexity comes from the fact that precise measurements of the
Earth’s position are difficult, and in part from the fact that the Earth’s
rotation is not regular, not fully predictable, and it’s slowing down. It’s
worth noting that climate and geological events affect the Earth’s rotation in
a measurable way, and such events are very hard to model accurately.</p>
<p>What is important to understand here is that the word ‘time’ is often used to
mean different things. Depending on how we measure it, we can end up with
<strong>different definitions of time</strong>. To avoid ambiguity, I will classify ‘time’
into two big categories:</p>
<ul>
<li>
<p><strong>Elapsed time</strong>: this is the time measured directly by a clock, without
using any extra information about the system in which the clock is located or
about other clocks.</p>
<p>We can use elapsed time to measure durations, latencies, frequencies, as
well as lengths.</p>
</li>
<li>
<p><strong>Coordinated time</strong>: this is the time measured by using a clock, paired with
information about the system where it’s located (like position, velocity, and
gravity), and/or information from other clocks.</p>
<p>This notion of time is mostly useful for coordinating events across the
system. Some practical examples: scheduling the execution of tasks in the
future, checking the expiration of certificates, real-time communication.</p>
</li>
</ul>
<h1 id="time-standards">Time standards</h1>
<p>Over the centuries several <a href="https://en.wikipedia.org/wiki/Time_standard">time
standards</a> have been introduced to
measure <em>coordinated time</em>. Nowadays there are three major standards in use:
TAI, UTC, and GNSS. Let’s take a brief look at them.</p>
<h2 id="tai">TAI</h2>
<p><a href="https://en.wikipedia.org/wiki/International_Atomic_Time">International Atomic Time
(TAI)</a> is based on the
weighted average of the <em>elapsed time</em> measured by several atomic clocks spread
across the world. The more precise a clock is, the more it contributes
to the weighted average. The fact that the clocks are spread in multiple
locations, and the use of an average, mitigates relativistic effects and yields
a value that we can think of as the overall time flow experienced by the
surface of the Earth.</p>
<p>Note that the calculations for TAI do not include the Earth’s position with
respect to the Sun.</p>
<figure>
<img src="https://andrea.corbellini.name/images/tai-equipment-distribution.webp" alt="Distribution of the laboratories that contribute to TAI all over the world">
<figcaption>Distribution of the laboratories that contribute to International Atomic Time (TAI) all over the world as of 2020. Map taken from the <a href="https://webtai.bipm.org/ftp/pub/tai/annual-reports/bipm-annual-report/annual_report_2020.pdf">BIPM Annual Report on Time Activities</a>.</figcaption>
</figure>
<h2 id="utc">UTC</h2>
<p><a href="https://en.wikipedia.org/wiki/Coordinated_Universal_Time">Coordinated Universal Time
(UTC)</a> is built upon
TAI. UTC, unlike TAI, is periodically adjusted to synchronize it with the
Earth’s rotation around itself and the Sun. The goal is to make sure that 24
UTC hours are equivalent to a solar day (within a certain degree of precision).
Because, as explained earlier, the Earth’s rotation is irregular, not fully
predictable, and slowing down, periodic adjustments have to be made to UTC at
irregular intervals.</p>
<p>The adjustments are performed by inserting <a href="https://en.wikipedia.org/wiki/Leap_second">leap
seconds</a>: these are extra seconds
that are added to the UTC time to “slow down” the UTC time flow and keep it in
sync with Earth’s rotation. On days when a leap second is inserted, UTC clocks
go from 23:59:<strong>59</strong> to 23:59:<strong>60</strong>.</p>
<figure>
<img src="https://andrea.corbellini.name/images/leap-seconds-timeline.svg" alt="Visualization of leap seconds inserted into UTC, and a comparison with TAI">
<figcaption>A visualization of leap seconds inserted into UTC until the end of 2022. Each orange dot represents a leap second (not to scale). When UTC was started in 1972, it started with 10 seconds of offset from TAI. As you can see, the insertion of leap seconds is very irregular: some decades have seen many leap seconds, others have seen far fewer.</figcaption>
</figure>
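<p>Leap seconds are a good example of why timekeeping code is tricky. As a small
illustration using only the Python standard library: the <code>datetime</code>
type accepts seconds in the range 0 to 59 only, so the leap second inserted at
the end of 2016 cannot be represented directly. The helper name below is made
up for this example.</p>

```python
from datetime import datetime


def is_representable(year, month, day, hour, minute, second):
    """Return True if Python's datetime can represent the given UTC time."""
    try:
        datetime(year, month, day, hour, minute, second)
        return True
    except ValueError:
        return False


# 2016-12-31 23:59:60 UTC was a real instant (a leap second was inserted),
# yet datetime rejects it because it requires 0 <= second <= 59.
print(is_representable(2016, 12, 31, 23, 59, 59))  # True
print(is_representable(2016, 12, 31, 23, 59, 60))  # False
```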
<p>It’s worth noting that the practice of inserting leap seconds is most likely
<a href="https://en.wikipedia.org/wiki/Leap_second#Future_of_leap_seconds">going to be
discontinued</a>
in the future. The main reason is that leap seconds have been the source of
complexity and bugs in computer systems, and the benefit-to-pain ratio of leap
seconds is not considered high enough to keep adding them. If leap seconds are
discontinued, UTC will become effectively equivalent to TAI, with an offset:
UTC will always differ from TAI by a few seconds, but this difference will
always be constant, if no more leap seconds are inserted.</p>
<h2 id="gnss">GNSS</h2>
<p><a href="https://en.wikipedia.org/wiki/GNSS">Global Navigation Satellite System (GNSS)</a>
is based on a mix of accurate atomic clocks on ground and less accurate atomic
clocks on artificial satellites orbiting around the Earth. The clocks on the
satellites, being less accurate and subject to a variety of relativistic
effects, are updated about twice a day from ground stations to correct clock
drifts. Nowadays there are several implementations of GNSS around the world,
including:</p>
<ul>
<li>the United States’ <a href="https://en.wikipedia.org/wiki/Global_Positioning_System">Global Positioning System (GPS)</a>;</li>
<li>the European <a href="https://en.wikipedia.org/wiki/Galileo_(satellite_navigation)">Galileo</a> system;</li>
<li>China’s <a href="https://en.wikipedia.org/wiki/BeiDou">BeiDou (BDS)</a>;</li>
<li>the Russian <a href="https://en.wikipedia.org/wiki/GLONASS">GLONASS</a>.</li>
</ul>
<p>When GPS was launched, it was synchronized with UTC. However, GPS, unlike UTC,
is not adjusted to follow the Earth’s rotation, and due to that, GPS today
differs from UTC by 18 seconds (because 18 leap seconds have been inserted
since GPS was launched in 1980). Galileo and BeiDou also do not implement leap
seconds. GPS, Galileo, and BeiDou are therefore compatible with TAI, each with
a constant offset.</p>
<p>GLONASS, on the other hand, does implement leap seconds and is
therefore compatible with UTC.</p>
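<p>In practice, the relationships between these time scales boil down to fixed
offsets, at least for as long as no new leap second is inserted. The sketch
below hard-codes the offsets that have been valid since the leap second at the
end of 2016 (TAI is 37 seconds ahead of UTC: the initial 10 seconds plus 27 leap
seconds). The function names are made up for illustration, and the conversion
is only correct for instants after that date.</p>

```python
from datetime import datetime, timedelta

# Offsets valid from 2017-01-01 until the next leap second (if any).
TAI_MINUS_UTC = timedelta(seconds=37)  # 10 s initial offset + 27 leap seconds
GPS_MINUS_UTC = timedelta(seconds=18)  # leap seconds inserted since 1980


def utc_to_tai(utc: datetime) -> datetime:
    """Label a UTC instant with the corresponding TAI reading."""
    return utc + TAI_MINUS_UTC


def utc_to_gps(utc: datetime) -> datetime:
    """Label a UTC instant with the corresponding GPS time reading."""
    return utc + GPS_MINUS_UTC


now_utc = datetime(2023, 1, 23, 19, 15, 0)
print("UTC:", now_utc)
print("TAI:", utc_to_tai(now_utc))
print("GPS:", utc_to_gps(now_utc))
# GPS time lags TAI by a constant 19 seconds.
print("TAI - GPS:", utc_to_tai(now_utc) - utc_to_gps(now_utc))
```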
<h1 id="time-synchronization-protocols">Time synchronization protocols</h1>
<p>Dealing with <em>coordinated time</em> is not trivial. Different ways to deal with
relativistic effects and Earth’s irregular rotation result in different time
standards that are not always immediately compatible with each other.
Nonetheless, once we agree on a well-defined time standard, we have a way to
ask the question “what time is it?” and receive an accurate answer all around
the world (within a certain degree of precision).</p>
<p>Let’s now take a look at how computers on a network can obtain an accurate
value for the coordinated time given by a time standard. I will describe two
popular protocols: NTP and PTP. The two use similar algorithms, but offer
different precision: milliseconds (NTP) and nanoseconds (PTP). Both use UDP/IP
as the transport protocol.</p>
<h2 id="network-time-protocol-ntp">Network Time Protocol (NTP)</h2>
<p>The way time synchronization works with NTP is the following: a computer that
wants to synchronize its time periodically queries an NTP server (or multiple
servers) to get the current coordinated time. The server that provides the
current coordinated time may have obtained the time from an accurate source
clock connected to the server (like an atomic clock synchronized with TAI or
UTC, or a GNSS receiver), or from a previous synchronization from another NTP
server.</p>
<p>To record how “fresh” the coordinated time from an NTP server is (how distant
the NTP server is from the source clock), NTP has a concept of <strong>stratum</strong>:
this is a number that indicates the number of ‘hops’ from the accurate clock
source:</p>
<ul>
<li>stratum <strong>0</strong> is used to indicate an accurate clock;</li>
<li>stratum <strong>1</strong> is a server that is directly connected to a stratum <strong>0</strong> clock;</li>
<li>stratum <strong>2</strong> is a server that is synchronized from a stratum <strong>1</strong> server;</li>
<li>stratum <strong>3</strong> is a server that is synchronized from a stratum <strong>2</strong> server;</li>
<li>and so on…</li>
</ul>
<p>The maximum stratum allowed is 15. There’s also a special stratum 16: this is
not a real stratum, but a special value used by clients to indicate that time
synchronization is not happening (most likely because the NTP servers are
unreachable).</p>
<figure>
<img src="https://andrea.corbellini.name/images/ntp-strata.svg" alt="Visualization of NTP strata in a distributed network">
<figcaption>Examples of different NTP strata in a distributed network. A stratum <em>n</em> server obtains its time from stratum <em>n</em> - 1 servers.</figcaption>
</figure>
<p>The major problem with synchronizing time over a network is latency. Networks
can be composed of multiple links, some of which may be slow or overloaded.
Simply requesting the current time from an NTP server without taking latency
into account would lead to an imprecise response. Here is how NTP deals with
this problem:</p>
<ol>
<li>The NTP client sends a request via a UDP packet to an NTP server. The packet
includes an <strong>originate timestamp</strong> $t_0$ that indicates the local time of
the client when the packet was sent.</li>
<li>The NTP server receives the request and records the <strong>receive timestamp</strong>
$t_1$, which indicates the local time of the server when the request was
received.</li>
<li>The NTP server processes the request, prepares a response, and records the
<strong>transmit timestamp</strong> $t_2$, which indicates the local time of the server
when the response was sent. The timestamps $t_0$, $t_1$ and $t_2$ are all
included in the response.</li>
<li>The NTP client receives the response and records the timestamp $t_3$, which
indicates the local time of the client when the response was received.</li>
</ol>
<figure>
<img src="https://andrea.corbellini.name/images/ntp-sync-algorithm.svg" alt="Visualization of the NTP time synchronization algorithm">
<figcaption>The NTP synchronization algorithm.</figcaption>
</figure>
<p>Our goal is now to calculate an estimate for the network latency and processing
delay and use that information to calculate, in the most accurate way possible,
the offset between the NTP client clock and the NTP server clock.</p>
<p>The difference $t_3 - t_0$ is the duration of the overall exchange. The
difference $t_2 - t_1$ is the duration of the NTP server processing delay. If
we subtract these two durations, we get the total network latency experienced,
also known as <strong>round-trip delay</strong>:</p>
<p>$$\delta = (t_3 - t_0) - (t_2 - t_1)$$</p>
<p>If we assume that the transmit delay and the receive delay are the same, then
$\delta / 2$ is the <strong>average network latency</strong> (this assumption may not be
true in a general network, but that’s the assumption that NTP makes).</p>
<p>Under this assumption, the time $t_0 + \delta/2$ is the time on the client’s
clock that corresponds to $t_1$ on the server’s clock. Similarly, $t_3 -
\delta/2$ on the client’s clock corresponds to $t_2$ on the server’s clock.
These correspondences let us calculate two estimates for the offset between the
client’s clock and the server’s clock:</p>
<p>$$\begin{align*}
\theta_1 & = t_1 - (t_0 + \delta/2) \\
\theta_2 & = t_2 - (t_3 - \delta/2)
\end{align*}$$</p>
<p>We can now calculate the client-server <strong>offset</strong> $\theta$ as an average of
those two estimates:</p>
<p>$$\begin{align*}
\theta & = \frac{\theta_1 + \theta_2}2 \\
& = \frac{t_1 - (t_0 + \delta/2) + t_2 - (t_3 - \delta/2)}2 \\
& = \frac{t_1 - t_0 - \delta/2 + t_2 - t_3 + \delta/2}2 \\
& = \frac{(t_1 - t_0) + (t_2 - t_3)}2 \\
\end{align*}$$</p>
<p>Note that the offset $\theta$ may be a positive duration (meaning that the
client clock is behind the server clock), a negative duration (meaning that the
client clock is ahead of the server clock) or zero (meaning that the client
clock agrees with the server clock, which is unlikely).</p>
<p>After calculating the offset $\theta$, the client can update its local clock by
shifting it by $\theta$ and from that point the client will be in sync with the
server (within a certain degree of precision).</p>
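<p>The calculation above fits in a few lines of code. The following is a minimal
sketch of the arithmetic only, not a real NTP client: the function name and the
sample timestamps (expressed in milliseconds) are made up for illustration.</p>

```python
def ntp_delay_and_offset(t0, t1, t2, t3):
    """Return (round-trip delay, offset) from the four NTP timestamps.

    t0 and t3 are read from the client's clock; t1 and t2 from the server's.
    """
    delay = (t3 - t0) - (t2 - t1)
    offset = ((t1 - t0) + (t2 - t3)) / 2
    return delay, offset


# Made-up scenario: the client is 100 ms behind the server, each network
# leg takes 20 ms, and the server needs 10 ms to process the request.
t0 = 10_000  # client clock when the request is sent
t1 = 10_120  # server clock when the request arrives
t2 = 10_130  # server clock when the response is sent
t3 = 10_050  # client clock when the response arrives

delay, offset = ntp_delay_and_offset(t0, t1, t2, t3)
print(delay, offset)  # 40 100.0
```

<p>The positive offset of 100 ms tells the client to move its clock forward by
that amount.</p>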
<p>Once the synchronization is done, it is expected that the client’s clock will
start drifting away from the server’s clock. This may happen due to
relativistic effects and more importantly because often clients do not use
high-precision clocks. For this reason, it is important that NTP clients
synchronize their time periodically. Usually NTP clients start by synchronizing
time every minute or so when they are started, and then progressively slow down
until they synchronize time once every half an hour or every hour.</p>
<p>There are some drawbacks with this synchronization method:</p>
<ul>
<li>The request and response delays may not be perfectly symmetric, resulting in
inaccuracies in the calculations of the offset $\theta$. Network
instabilities, packet retransmissions, change of routes, queuing may all
cause unpredictable and inconsistent delays.</li>
<li>The timestamps $t_1$ and $t_3$ must be set <em>as soon as possible</em> (as soon as
the packets are received), and similarly $t_0$ and $t_2$ must be set <em>as late
as possible</em>. Because NTP is implemented at the software level, there may be
non-negligible delays in acquiring and recording these timestamps. These
delays may be exacerbated if the NTP implementation is not very performant,
or if the client or server are under high load.</li>
<li>Errors propagate and add up when increasing the number of strata.</li>
</ul>
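<p>The first drawback can be made concrete with a small simulation. In the sketch
below (an illustration with made-up names and numbers, not part of any NTP
implementation), the request takes longer than the response; the estimated
offset is then wrong by half the difference between the two one-way delays.</p>

```python
def simulate_exchange(true_offset, forward_delay, backward_delay, processing):
    """Simulate one NTP exchange and return the offset the client estimates.

    true_offset is the server clock minus the client clock; all values are
    in milliseconds.
    """
    t0 = 10_000                             # client clock, request sent
    t1 = t0 + true_offset + forward_delay   # server clock, request received
    t2 = t1 + processing                    # server clock, response sent
    t3 = t2 - true_offset + backward_delay  # client clock, response received
    return ((t1 - t0) + (t2 - t3)) / 2


# Symmetric path: the estimate matches the true offset of 100 ms.
print(simulate_exchange(100, 20, 20, 10))  # 100.0
# Asymmetric path (30 ms out, 10 ms back): the estimate is off by
# (30 - 10) / 2 = 10 ms.
print(simulate_exchange(100, 30, 10, 10))  # 110.0
```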
<p>For all these reasons, NTP clients do not synchronize time just from a single
NTP server, but from multiple ones. NTP clients take into account the
round-trip delays, stratum, and jitter (the variance in round-trip delays) to
decide the best NTP server to get their time from. Under ideal network
conditions, an NTP client will always prefer a server with a low stratum.
However, an NTP client may prefer an NTP server with high stratum and more
reliable connectivity over an NTP server with low stratum but a very unstable
network connection.</p>
<p>The precision offered by NTP is in the order of a few milliseconds.</p>
<h2 id="precision-time-protocol-ptp">Precision Time Protocol (PTP)</h2>
<p>PTP is a time synchronization protocol for applications that require more
accuracy than NTP provides. The main differences between PTP and NTP
are:</p>
<ul>
<li><strong>Precision:</strong> NTP offers millisecond precision, while PTP offers nanosecond
precision.</li>
<li><strong>Time standard:</strong> NTP transmits UTC time, while PTP transmits TAI time and
the difference between TAI and UTC.</li>
<li><strong>Scope:</strong> NTP is designed to be used over large networks, including the
internet, while PTP is designed to be used in local area networks.</li>
<li><strong>Implementation:</strong> NTP is mainly software based, while PTP can be
implemented both via software and on specialized hardware. The use of
specialized hardware considerably reduces delays and jitter introduced by
software.</li>
</ul>
<figure>
<img src="https://andrea.corbellini.name/images/timecard.jpg" alt="Picture of a Time Card device">
<figcaption><a href="http://www.timingcard.com/">Time Card</a>: an open-source hardware card with a PCIe interface that can be plugged into a computer to make it serve as a PTP master. It can be optionally connected to a GNSS receiver and contains a rubidium (Rb) clock.</figcaption>
</figure>
<ul>
<li><strong>Hierarchy:</strong> NTP can support a complex hierarchy of NTP servers, organized
via strata. While PTP does not put a limitation on the number of nodes
involved, the hierarchy is usually only composed of <strong>master</strong> clocks (the
source of time information) and <strong>slave</strong> clocks (the receivers of time
information). Sometimes <strong>boundary</strong> clocks are used to relay time
information to network segments that are unreachable by the master clocks.</li>
<li><strong>Clock selection:</strong> in NTP, clients select the best NTP server to use based
on the NTP server clock quality and the network connection quality. In PTP,
slaves do not select the best master clock to use. Instead, master clocks
perform a selection between themselves using a method called <em>best master
clock algorithm</em>. This algorithm takes into account the clock’s quality and
input from system administrators, and does not factor network quality at all.
The master clock selected by the algorithm is called <strong>grandmaster</strong> clock.</li>
<li>
<p><strong>Algorithm:</strong> in NTP, clients poll the time information from servers
periodically and calculate the clock offset using the algorithm described
above (based on the timestamps $t_0$, $t_1$, $t_2$ and $t_3$). With PTP, the
algorithm used by slaves to calculate the offset from the grandmaster clock
is somewhat similar to the one used in NTP, but the order of operations is
different:</p>
<ol>
<li>the grandmaster periodically broadcasts its time information $T_0$ over
the network;</li>
<li>each slave records the time $T_1$ when the broadcast time was
received;</li>
<li>each slave sends a packet to the grandmaster at time $T_2$;</li>
<li>the grandmaster receives the packet at time $T_3$ and sends that value
back to the slave.</li>
</ol>
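<p>The average network delay can be calculated as $\delta = ((T_3 - T_0) -
(T_2 - T_1)) / 2$. The clock offset can be calculated as $\theta = ((T_1 -
T_0) + (T_2 - T_3)) / 2$; with these definitions, a positive $\theta$ means
that the slave clock is ahead of the grandmaster clock.</p>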
</li>
</ul>
<figure>
<img src="https://andrea.corbellini.name/images/ptp-sync-algorithm.svg" alt="Visualization of the PTP time synchronization algorithm">
<figcaption>The PTP time synchronization algorithm.</figcaption>
</figure>
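<p>The two PTP formulas above translate directly into code. This is only a
sketch of the arithmetic, with a made-up function name and made-up timestamps
in nanoseconds; note that with these formulas the offset is the slave clock
minus the grandmaster clock.</p>

```python
def ptp_delay_and_offset(T0, T1, T2, T3):
    """Return (one-way delay, offset) from the four PTP timestamps.

    T0 and T3 are read from the grandmaster's clock; T1 and T2 from the
    slave's. The offset is the slave clock minus the grandmaster clock.
    """
    delay = ((T3 - T0) - (T2 - T1)) / 2
    offset = ((T1 - T0) + (T2 - T3)) / 2
    return delay, offset


# Made-up scenario: the slave is 500 ns ahead of the grandmaster and each
# network leg takes 1000 ns.
T0 = 1_000_000  # grandmaster clock when the broadcast is sent
T1 = 1_001_500  # slave clock when the broadcast is received
T2 = 1_002_000  # slave clock when the slave sends its packet
T3 = 1_002_500  # grandmaster clock when that packet is received

delay, offset = ptp_delay_and_offset(T0, T1, T2, T3)
print(delay, offset)  # 1000.0 500.0
```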
<h1 id="summary">Summary</h1>
<ul>
<li>Synchronizing time across a computer network is not an easy task, and first
of all requires agreeing on a definition of ‘time’ and on a time standard.</li>
<li>Relativistic effects make it so that time may not flow at the same speed all
over the globe, and this means that time has to be measured and aggregated
across the planet in order to get a suitable value that can be agreed on.</li>
<li>Atomic clocks and GNSS are the clock sources used for most applications
nowadays.</li>
<li>NTP is a time synchronization protocol that can be used on large and
distributed networks like the internet and provides millisecond precision.</li>
<li>PTP is a time synchronization protocol for local area networks and provides
nanosecond precision.</li>
</ul>andreacorbelliniMon, 23 Jan 2023 19:15:00 +0000tag:andrea.corbellini.name,2023-01-23:/2023/01/23/what-time-is-it/timetimerelativityperformanceclocksntpptpCan we encrypt data using Elliptic Curves?https://andrea.corbellini.name/2023/01/02/ec-encryption/<p>From time to time, I hear people saying that Elliptic Curve Cryptography (ECC)
cannot be used to directly encrypt data, and you can only do key agreement and
digital signatures with it. This is a common misconception: you can indeed use
elliptic curve keys to encrypt arbitrary
data. And I’m not talking about hybrid-encryption schemes (like
<a href="https://en.wikipedia.org/wiki/Integrated_Encryption_Scheme">ECIES</a> or
<a href="https://datatracker.ietf.org/doc/rfc9180/">HPKE</a>): I’m talking about pure
elliptic curve encryption, and I’m going to show an example of it in this
article. It’s true however that pure elliptic curve encryption is not widely
used or standardized because, as I will explain at the end of the article, key
agreement is more convenient for most applications.</p>
<h1 id="quick-recap-on-elliptic-curve-cryptography">Quick recap on Elliptic Curve Cryptography</h1>
<p>I wrote an <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">in-depth article about elliptic curve
cryptography</a> in the past on this blog, and
here is a quick recap: points on an elliptic curve form an interesting
algebraic structure: a <em>cyclic group</em>. This group lets us do some algebra with
the points of the elliptic curve: if we have two points $A$ and $B$, we can
<strong>add</strong> them ($A + B$) or <strong>subtract</strong> them ($A - B$). We can also <strong>multiply</strong>
a point by an integer, which is the same as doing repeated addition ($n A$ = $A
+ A + \cdots + A$, $n$ times).</p>
<p>We know some efficient algorithms for doing multiplication, but the reverse of
multiplication is believed to be a “hard” problem for certain elliptic curves,
in the sense that we know efficient methods for computing $B = n A$ given $n$
and $A$, but we do not know very efficient methods to figure out $n$ given $A$
and $B$. This problem of reversing a multiplication is known as Elliptic
Curve Discrete Logarithm Problem (ECDLP).</p>
<p>Elliptic Curve Cryptography is based on multiplication of elliptic curve points
by integers and its security is given mainly by the difficulty of solving the
ECDLP.</p>
<p>In order to use Elliptic Curve Cryptography, we first have to generate a
<strong>private-public key pair</strong>:</p>
<ul>
<li>the <strong>private key</strong> is a random integer $s$;</li>
<li>the <strong>public key</strong> is the result of multiplying the integer $s$ with the
generator $G$ of the elliptic curve group: $P = s G$.</li>
</ul>
<p>Let’s now see a method to use Elliptic Curve Cryptography to encrypt arbitrary
data, so that we can dispel the common belief that elliptic curves cannot be
used to encrypt.</p>
<h1 id="elliptic-curve-elgamal">Elliptic Curve ElGamal</h1>
<p>One method to encrypt data with elliptic curve keys is
<strong><a href="https://en.wikipedia.org/wiki/ElGamal_encryption">ElGamal</a></strong>. This is not
the only method, of course, but it’s the one that I chose because it’s well
known and simple enough. ElGamal is a cryptosystem that takes its name from
<a href="https://en.wikipedia.org/wiki/Taher_Elgamal">its author</a> and works on any
cyclic group, not just elliptic curve groups.</p>
<p>If we want to <strong>encrypt</strong> a message using the public key $P$ via ElGamal, we
can do the following:</p>
<ol>
<li>map the message to a point $M$ on the elliptic curve</li>
<li>generate a random integer $t$</li>
<li>compute $C_1 = t G$</li>
<li>compute $C_2 = t P + M$</li>
<li>return the tuple $(C_1, C_2)$</li>
</ol>
<p>To <strong>decrypt</strong> an encrypted tuple $(C_1, C_2)$ using the private key $s$, we
can do the following:</p>
<ol>
<li>compute $M = C_2 - s C_1$</li>
<li>map the point $M$ back to a message</li>
</ol>
<p>The scheme works because:
$$\begin{align*}
s C_1 & = s (t G) \\
& = t (s G) \\
& = t P
\end{align*}$$
therefore:
$$\begin{align*}
C_2 - s C_1 & = (t P + M) - (t P) \\
& = M
\end{align*}$$</p>
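<p>To see the scheme in action without worrying yet about how messages become
points, here is a self-contained sketch on a deliberately tiny textbook curve,
$y^2 = x^3 + 2x + 2$ over the integers modulo 17, with generator $G = (5, 1)$ of
order 19. These parameters and all the helper names are assumptions chosen for
illustration: a curve this small offers no security whatsoever. The code needs
Python 3.8 or later for modular inverses via <code>pow</code>.</p>

```python
import secrets

# Toy curve y^2 = x^3 + A*x + B over the integers modulo P_MOD (NOT secure).
P_MOD, A, B = 17, 2, 2
G = (5, 1)   # generator of the group
ORDER = 19   # number of points in the group, including the point at infinity


def add(p1, p2):
    """Add two points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p2 is the negation of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def neg(p):
    """Negate a point."""
    return None if p is None else (p[0], -p[1] % P_MOD)


def mul(n, p):
    """Multiply the point p by the integer n using double-and-add."""
    result = None
    while n > 0:
        if n & 1:
            result = add(result, p)
        p = add(p, p)
        n >>= 1
    return result


def encrypt(public_key, m):
    t = secrets.randbelow(ORDER - 1) + 1          # random integer in [1, ORDER - 1]
    return mul(t, G), add(mul(t, public_key), m)  # (C1, C2)


def decrypt(private_key, c1, c2):
    return add(c2, neg(mul(private_key, c1)))     # M = C2 - s * C1


s = secrets.randbelow(ORDER - 1) + 1  # private key
P = mul(s, G)                         # public key
M = mul(3, G)                         # pretend the message maps to this point
C1, C2 = encrypt(P, M)
print(decrypt(s, C1, C2) == M)        # True
```

<p>Because $t$ is chosen at random, encrypting the same point twice generally
produces different tuples $(C_1, C_2)$, yet both decrypt to the same $M$.</p>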
<p>There’s however a big problem with this scheme: how do we map a message to a
point, and vice versa? How can we perform step 1 of the encryption algorithm,
or step 2 of the decryption algorithm?</p>
<h1 id="mapping-a-message-to-a-point">Mapping a message to a point</h1>
<p>A message can be an arbitrary byte string. An elliptic curve point is,
generally speaking, a pair of integers $(x, y)$ belonging to the elliptic curve
field. How can we transform a byte string into a pair of field integers?</p>
<p>Well, as far as computers are concerned, both byte strings and integers have
the same nature: they are just sequences of bits, so there’s a natural map
between the two. We could take the message, split it into two parts, and
interpret the first part as an integer $x$ and the second part as an integer
$y$. This would work for obtaining two arbitrary integers, but there’s a
problem: the coordinates $x$ and $y$ of an elliptic curve point are related by
a mathematical equation (the curve equation), so we cannot choose two arbitrary
$x$ and $y$ and expect them to identify a valid point on the curve. In fact,
for curves in Weierstrass form, given $x$ there are at most two possible
choices for $y$, so it’s <em>very</em> unlikely that this splitting method will yield
a valid point.</p>
<p>Let’s change our strategy a little bit: instead of transforming the message to
a pair $(x, y)$, we transform it to $x$ and then we compute a valid $y$ from
the curve equation. This is a much better method, but there’s still a problem:
generally speaking, not every $x$ will have a corresponding $y$. Not every $x$
can satisfy the curve equation.</p>
<p>Luckily, most of the popular elliptic curves used in cryptography have an
interesting property: about half of the possible field integers are valid
$x$-coordinates. To see this, let’s take a look at an example: the curve
<code>secp384r1</code>. This is a Weierstrass curve that has the following order:</p>
<div class="highlight"><pre><span></span><code>0xffffffffffffffffffffffffffffffffffffffffffffffffc7634d81f4372ddf581a0db248b0a77aecec196accc52973
</code></pre></div>
<p>I remind you that the order is the number of valid points that belong to the
elliptic curve group. Because this is a Weierstrass curve, for each $x$ there
are 2 possible points, so the number of valid $x$-coordinates is <code>order / 2</code>.
Given an arbitrary 384-bit integer, what are the chances that this is a valid
$x$-coordinate? The answer is <code>(order / 2) / (2 ** 384)</code> which is approximately
0.5 or 50%.</p>
<p>OK, but how does this help with our goal: mapping an arbitrary message to a
valid $x$-coordinate? It’s simple: we can <em>append</em> a random byte (or multiple
bytes) to the message. We call this extra byte (or bytes): <strong>padding</strong>. If the
resulting padded message does not translate to a valid $x$-coordinate, we
choose another random padding and try again, until we find one that works.
Given that there’s a 50% chance of finding a valid $x$-coordinate, this method
will find one very quickly: on average, it takes about two tries.</p>
<figure>
<img src="https://andrea.corbellini.name/images/ec-elgamal-padding.svg" alt="Padding a message to obtain a valid elliptic curve point" width="500" height="120">
<figcaption>Example of how to use padding to obtain a valid elliptic curve point from an arbitrary message.</figcaption>
</figure>
<p>This operation can be easily <strong>reversed</strong>: if you have a point $(x, y)$, in
order to recover the message that generated it, just take the $x$ coordinate
and remove the padding. That’s it!</p>
<p>It’s worth noting that there are some standard curves where all the possible
byte strings (of the proper size) can be translated to elliptic curve points,
without any random padding needed. For example, with
<a href="https://en.wikipedia.org/wiki/Curve25519">Curve25519</a>, every 32-byte string is
a valid elliptic curve point. Another curve like that is
<a href="https://en.wikipedia.org/wiki/Curve448">Curve448</a>.</p>
<p>It’s also important to note that the padding does not need to be truly random.
In the image above I show a padding that is simply a constantly increasing
sequence of numbers: 1, 2, 3, … That’s enough to find a valid point.</p>
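<p>To get a feeling for how quickly a counter-based padding finds a valid point, here is a small experiment. It is only a sketch under my own assumptions: the toy curve $y^2 = x^3 + 2x + 3$ over a small prime field is arbitrary (and far too small to be secure), the helper names are hypothetical, and the padding is placed in the low byte of $x$ for simplicity.</p>

```python
# Toy curve y^2 = x^3 + 2x + 3 over the prime field of size P.
# These parameters are arbitrary and far too small to be secure.
P = 1000003
A = 2
B = 3


def is_valid_x(x: int) -> bool:
    """Return True if some y satisfies the curve equation for this x."""
    rhs = (x ** 3 + A * x + B) % P
    # Euler's criterion: a non-zero rhs is a square modulo P if and only if
    # rhs ** ((P - 1) / 2) == 1 (mod P)
    return rhs == 0 or pow(rhs, (P - 1) // 2, P) == 1


def tries_needed(message: int) -> int:
    """Count the attempts made by a 1-byte counter padding: 0, 1, 2, ..."""
    for counter in range(256):
        if is_valid_x(message * 256 + counter):
            return counter + 1
    raise ValueError('No valid padding found')


# Every message that still fits in the field after appending 1 padding byte
messages = range(P // 256)
valid_fraction = sum(is_valid_x(x) for x in range(P)) / P
average_tries = sum(tries_needed(m) for m in messages) / len(messages)

print(f'Fraction of valid x-coordinates: {valid_fraction:.3f}')
print(f'Average number of tries: {average_tries:.2f}')
```

<p>On a curve like this, I would expect the fraction to be close to one half and the average number of tries to be close to two, in line with the reasoning above.</p>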
<h1 id="putting-everything-together">Putting everything together</h1>
<p>We have seen how to map a message to a point and how ElGamal works, so now we
have all the elements to write some working code. I’m choosing
<a href="https://www.python.org/">Python</a> and the
<a href="https://github.com/cslashm/ECPy">ECPy</a> package to work with elliptic curves,
which you can install with <code>pip install ecpy</code>.</p>
<div class="highlight"><pre><span></span><code><span class="kn">import</span> <span class="nn">random</span>
<span class="kn">from</span> <span class="nn">ecpy.curves</span> <span class="kn">import</span> <span class="n">Curve</span><span class="p">,</span> <span class="n">Point</span>


<span class="k">def</span> <span class="nf">message_to_point</span><span class="p">(</span><span class="n">curve</span><span class="p">:</span> <span class="n">Curve</span><span class="p">,</span> <span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">)</span> <span class="o">-></span> <span class="n">Point</span><span class="p">:</span>
    <span class="c1"># Number of bytes to represent a coordinate of a point</span>
    <span class="n">coordinate_size</span> <span class="o">=</span> <span class="n">curve</span><span class="o">.</span><span class="n">size</span> <span class="o">//</span> <span class="mi">8</span>
    <span class="c1"># Minimum number of bytes for the padding. We need at least 1 byte so that</span>
    <span class="c1"># we can try different values and find a valid point. We also add an extra</span>
    <span class="c1"># byte as a delimiter between the message and the padding (see below)</span>
    <span class="n">min_padding_size</span> <span class="o">=</span> <span class="mi">2</span>
    <span class="c1"># Maximum number of bytes that we can encode</span>
    <span class="n">max_message_size</span> <span class="o">=</span> <span class="n">coordinate_size</span> <span class="o">-</span> <span class="n">min_padding_size</span>
    <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">message</span><span class="p">)</span> <span class="o">></span> <span class="n">max_message_size</span><span class="p">:</span>
        <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span><span class="s1">'Message too long'</span><span class="p">)</span>
    <span class="c1"># Add a padding long enough to ensure that the resulting padded message has</span>
    <span class="c1"># the same size as a point coordinate. Initially the padding is all 0</span>
    <span class="n">padding_size</span> <span class="o">=</span> <span class="n">coordinate_size</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
    <span class="n">padded_message</span> <span class="o">=</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">message</span><span class="p">)</span> <span class="o">+</span> <span class="sa">b</span><span class="s1">'</span><span class="se">\0</span><span class="s1">'</span> <span class="o">*</span> <span class="n">padding_size</span>
    <span class="c1"># Put a delimiter between the message and the padding, so that we can</span>
    <span class="c1"># properly remove the padding at decrypt time</span>
    <span class="n">padded_message</span><span class="p">[</span><span class="nb">len</span><span class="p">(</span><span class="n">message</span><span class="p">)]</span> <span class="o">=</span> <span class="mh">0xff</span>
    <span class="k">while</span> <span class="kc">True</span><span class="p">:</span>
        <span class="c1"># Convert the padded message to an integer, which may or may not be a</span>
        <span class="c1"># valid x-coordinate</span>
        <span class="n">x</span> <span class="o">=</span> <span class="nb">int</span><span class="o">.</span><span class="n">from_bytes</span><span class="p">(</span><span class="n">padded_message</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">)</span>
        <span class="c1"># Calculate the corresponding y-coordinate (if it exists)</span>
        <span class="n">y</span> <span class="o">=</span> <span class="n">curve</span><span class="o">.</span><span class="n">y_recover</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">y</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">:</span>
            <span class="c1"># x was not a valid coordinate; increment the padding and try again</span>
            <span class="n">padded_message</span><span class="p">[</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span> <span class="o">+=</span> <span class="mi">1</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="c1"># x was a valid coordinate; return the point (x, y)</span>
            <span class="k">return</span> <span class="n">Point</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">curve</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">encrypt</span><span class="p">(</span><span class="n">public_key</span><span class="p">:</span> <span class="n">Point</span><span class="p">,</span> <span class="n">message</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bytes</span><span class="p">:</span>
    <span class="n">curve</span> <span class="o">=</span> <span class="n">public_key</span><span class="o">.</span><span class="n">curve</span>
    <span class="c1"># Map the message to an elliptic curve point</span>
    <span class="n">message_point</span> <span class="o">=</span> <span class="n">message_to_point</span><span class="p">(</span><span class="n">curve</span><span class="p">,</span> <span class="n">message</span><span class="p">)</span>
    <span class="c1"># Generate a random non-zero scalar smaller than the group order. For</span>
    <span class="c1"># real-world use, this should come from a CSPRNG (e.g. the secrets module)</span>
    <span class="n">seed</span> <span class="o">=</span> <span class="n">random</span><span class="o">.</span><span class="n">randrange</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">curve</span><span class="o">.</span><span class="n">order</span><span class="p">)</span>
    <span class="c1"># Calculate c1 and c2 according to the ElGamal algorithm</span>
    <span class="n">c1</span> <span class="o">=</span> <span class="n">seed</span> <span class="o">*</span> <span class="n">curve</span><span class="o">.</span><span class="n">generator</span>
    <span class="n">c2</span> <span class="o">=</span> <span class="n">seed</span> <span class="o">*</span> <span class="n">public_key</span> <span class="o">+</span> <span class="n">message_point</span>
    <span class="c1"># Encode c1 and c2 and return them</span>
    <span class="k">return</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">curve</span><span class="o">.</span><span class="n">encode_point</span><span class="p">(</span><span class="n">c1</span><span class="p">)</span> <span class="o">+</span> <span class="n">curve</span><span class="o">.</span><span class="n">encode_point</span><span class="p">(</span><span class="n">c2</span><span class="p">))</span>


<span class="k">def</span> <span class="nf">point_to_message</span><span class="p">(</span><span class="n">point</span><span class="p">:</span> <span class="n">Point</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bytes</span><span class="p">:</span>
    <span class="c1"># Number of bytes to represent a coordinate of a point</span>
    <span class="n">coordinate_size</span> <span class="o">=</span> <span class="n">point</span><span class="o">.</span><span class="n">curve</span><span class="o">.</span><span class="n">size</span> <span class="o">//</span> <span class="mi">8</span>
    <span class="c1"># Convert the x-coordinate of the point to a byte string</span>
    <span class="n">padded_message</span> <span class="o">=</span> <span class="n">point</span><span class="o">.</span><span class="n">x</span><span class="o">.</span><span class="n">to_bytes</span><span class="p">(</span><span class="n">coordinate_size</span><span class="p">,</span> <span class="s1">'little'</span><span class="p">)</span>
    <span class="c1"># Find the padding delimiter</span>
    <span class="n">message_size</span> <span class="o">=</span> <span class="n">padded_message</span><span class="o">.</span><span class="n">rfind</span><span class="p">(</span><span class="mh">0xff</span><span class="p">)</span>
    <span class="c1"># Remove the padding and return the resulting message</span>
    <span class="n">message</span> <span class="o">=</span> <span class="n">padded_message</span><span class="p">[:</span><span class="n">message_size</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">message</span>


<span class="k">def</span> <span class="nf">decrypt</span><span class="p">(</span><span class="n">curve</span><span class="p">:</span> <span class="n">Curve</span><span class="p">,</span> <span class="n">secret_key</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span> <span class="n">ciphertext</span><span class="p">:</span> <span class="nb">bytes</span><span class="p">)</span> <span class="o">-></span> <span class="nb">bytes</span><span class="p">:</span>
    <span class="c1"># Decode c1 and c2 and convert them to elliptic curve points</span>
    <span class="n">c1_bytes</span> <span class="o">=</span> <span class="n">ciphertext</span><span class="p">[:</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span><span class="p">]</span>
    <span class="n">c2_bytes</span> <span class="o">=</span> <span class="n">ciphertext</span><span class="p">[</span><span class="nb">len</span><span class="p">(</span><span class="n">ciphertext</span><span class="p">)</span> <span class="o">//</span> <span class="mi">2</span><span class="p">:]</span>
    <span class="n">c1</span> <span class="o">=</span> <span class="n">curve</span><span class="o">.</span><span class="n">decode_point</span><span class="p">(</span><span class="n">c1_bytes</span><span class="p">)</span>
    <span class="n">c2</span> <span class="o">=</span> <span class="n">curve</span><span class="o">.</span><span class="n">decode_point</span><span class="p">(</span><span class="n">c2_bytes</span><span class="p">)</span>
    <span class="c1"># Calculate the message point according to the ElGamal algorithm</span>
    <span class="n">message_point</span> <span class="o">=</span> <span class="n">c2</span> <span class="o">-</span> <span class="n">secret_key</span> <span class="o">*</span> <span class="n">c1</span>
    <span class="c1"># Convert the message point to a message and return it</span>
    <span class="k">return</span> <span class="n">point_to_message</span><span class="p">(</span><span class="n">message_point</span><span class="p">)</span>
</code></pre></div>
<p>And here is a usage example:</p>
<div class="highlight"><pre><span></span><code><span class="n">curve</span> <span class="o">=</span> <span class="n">Curve</span><span class="o">.</span><span class="n">get_curve</span><span class="p">(</span><span class="s1">'secp384r1'</span><span class="p">)</span>
<span class="n">secret_key</span> <span class="o">=</span> <span class="mh">0x123456789abcdef</span>
<span class="n">public_key</span> <span class="o">=</span> <span class="n">secret_key</span> <span class="o">*</span> <span class="n">curve</span><span class="o">.</span><span class="n">generator</span>
<span class="n">message</span> <span class="o">=</span> <span class="s1">'hello'</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">' Message:'</span><span class="p">,</span> <span class="n">message</span><span class="p">)</span>
<span class="n">encrypted</span> <span class="o">=</span> <span class="n">encrypt</span><span class="p">(</span><span class="n">public_key</span><span class="p">,</span> <span class="n">message</span><span class="o">.</span><span class="n">encode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">))</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">'Encrypted:'</span><span class="p">,</span> <span class="n">encrypted</span><span class="o">.</span><span class="n">hex</span><span class="p">())</span>
<span class="n">decrypted</span> <span class="o">=</span> <span class="n">decrypt</span><span class="p">(</span><span class="n">curve</span><span class="p">,</span> <span class="n">secret_key</span><span class="p">,</span> <span class="n">encrypted</span><span class="p">)</span><span class="o">.</span><span class="n">decode</span><span class="p">(</span><span class="s1">'utf-8'</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="s1">'Decrypted:'</span><span class="p">,</span> <span class="n">decrypted</span><span class="p">)</span>
</code></pre></div>
<p>Which produces the following output:</p>
<div class="highlight"><pre><span></span><code> Message: hello
Encrypted: 04fa333c6a03994c5bce4627de4447c5cdd358415f8db2745b67836932a0d5e81f19...
Decrypted: hello
</code></pre></div>
<h1 id="some-considerations-on-padding-and-security">Some considerations on padding and security</h1>
<p>It’s important to note that padding is a very delicate problem in cryptography.
There exist many <a href="https://en.wikipedia.org/wiki/Padding_(cryptography)">padding
schemes</a>, and <strong>not all
of them are secure</strong>. The padding scheme that I wrote in this article was just
for demonstration purposes and may not be the most secure, so don’t use it in
production systems. Take a look at
<a href="https://en.wikipedia.org/wiki/Optimal_asymmetric_encryption_padding">OAEP</a> if
you’re looking for a modern and secure padding scheme.</p>
<p>Another thing to note is that the decryption method that I wrote does not check
if the decryption was successful. If you try to decrypt an invalid ciphertext,
or use the wrong key, you won’t get an error but instead a random result, which
is not desirable. A good padding scheme like OAEP will instead throw an error
if decryption was unsuccessful.</p>
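<p>As an illustration, here is a sketch of a stricter padding removal for the demonstration scheme used in this article. The function name <code>remove_padding</code> is hypothetical, and the layout it assumes (message, <code>0xff</code> delimiter, zero bytes, 1 counter byte) is the one produced by <code>message_to_point</code>. Keep in mind that a check like this only catches <em>some</em> invalid ciphertexts: it is not a replacement for a proper padding scheme or for authenticated encryption.</p>

```python
def remove_padding(padded_message: bytes) -> bytes:
    # The last byte is the counter, so the delimiter is searched only in the
    # bytes that come before it
    delimiter_index = padded_message.rfind(0xff, 0, len(padded_message) - 1)
    if delimiter_index == -1:
        raise ValueError('Padding delimiter not found')
    # Everything between the delimiter and the counter must be zero
    if any(padded_message[delimiter_index + 1:-1]):
        raise ValueError('Malformed padding')
    return padded_message[:delimiter_index]


# A well-formed 48-byte padded message: 'hello', delimiter, zeros, counter
valid = b'hello' + b'\xff' + b'\0' * 41 + b'\x03'
print(remove_padding(valid))

# A byte string without any delimiter is rejected instead of being truncated
try:
    remove_padding(b'\x01' * 48)
except ValueError as exc:
    print('Rejected:', exc)
```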
<p>(Receiving an error when decryption is not successful is very important due to
the fact that schemes like ElGamal are
<a href="https://en.wikipedia.org/wiki/Malleability_(cryptography)">malleable</a>. Check
out my post about <a href="https://andrea.corbellini.name/2023/03/09/authenticated-encryption/">authenticated
encryption</a> for examples and
details about why this is important.)</p>
<h1 id="cost-of-elliptic-curve-encryption">Cost of elliptic curve encryption</h1>
<p>With Elliptic Curve ElGamal, if we are using an <em>n</em>-bit elliptic curve, we can
encrypt messages that are at most <em>n</em> bits long (actually less than that,
if we’re using padding), and the output is at least <em>2n</em> bits long (if the
resulting points $C_1$ and $C_2$ are encoded using point compression). This
means that encryption using Elliptic Curve ElGamal doubles the size of the data
that we want to encrypt. It also requires a fair amount of compute resources,
because it involves a random number generation and 2 point multiplications.</p>
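<p>To make the space overhead concrete, here is a rough calculation of the sizes involved. It is a sketch under a few assumptions of mine: the function name is hypothetical, the padding is the 2-byte minimum used in the code above, and points are encoded with 1 prefix byte followed by $x$ (compressed) or by $x$ and $y$ (uncompressed).</p>

```python
def elgamal_sizes(curve_bits: int, padding_bytes: int = 2, compressed: bool = True):
    """Return (maximum message size, ciphertext size), both in bytes."""
    coordinate_size = (curve_bits + 7) // 8
    max_message_size = coordinate_size - padding_bytes
    # 1 prefix byte, followed by x only (compressed) or x and y (uncompressed)
    point_size = 1 + coordinate_size * (1 if compressed else 2)
    # The ciphertext is made of two points: c1 and c2
    ciphertext_size = 2 * point_size
    return max_message_size, ciphertext_size


for compressed in (True, False):
    message_size, ciphertext_size = elgamal_sizes(384, compressed=compressed)
    print(f'compressed={compressed}: up to {message_size} message bytes, '
          f'{ciphertext_size} ciphertext bytes')
```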
<p>In short, Elliptic Curve ElGamal is expensive both in terms of space and in
terms of time and compute power, and this makes it unattractive in applications
like <a href="https://en.wikipedia.org/wiki/Transport_Layer_Security">TLS</a> or general
purpose encryption.</p>
<p>So what can we use Elliptic Curve ElGamal for? We can use it to encrypt
<em>symmetric keys</em>, such as
<a href="https://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES</a> keys or
<a href="https://en.wikipedia.org/wiki/ChaCha20">ChaCha20</a> keys, and then use these
symmetric keys to encrypt our arbitrary data. Symmetric keys are relatively
short (ranging from 128 to 256 bits nowadays), so they can be encrypted with
one round of Elliptic Curve ElGamal with most curves. It’s worth noting that
this is the same approach that we use with
<a href="https://en.wikipedia.org/wiki/RSA_(cryptosystem)">RSA</a> encryption: for most
applications, we don’t use RSA to encrypt data directly, but rather we use RSA
to encrypt symmetric keys which are later used for encrypting data.</p>
<p>These are the reasons why schemes like Elliptic Curve ElGamal, or other methods of
encryption with elliptic curves, are not used in practice:</p>
<ul>
<li>elliptic curve encryption is more expensive than hybrid encryption;</li>
<li>hybrid encryption scales better and is more performant;</li>
<li>elliptic curve key exchange is simpler and has fewer pitfalls than
encryption.</li>
</ul>
<p>In conclusion, there are no practical benefits from elliptic curve encryption
compared to hybrid encryption with key agreement, and that’s why we don’t use
it. However, the idea that elliptic curves cannot be used for encryption is a
myth, and I hope this article will help clarify that confusion.</p>andreacorbelliniMon, 02 Jan 2023 06:30:00 +0000tag:andrea.corbellini.name,2023-01-02:/2023/01/02/ec-encryption/cryptographyeccencryptionelgamalThe curious case of bad blocks on an SSD, and how I got rid of themhttps://andrea.corbellini.name/2022/12/29/curious-ssd-badblocks/<p>I recently inherited a laptop that was broken by pouring some hot coffee on it.
When I dissected it, it was pretty clear that most of it was unrecoverable: the
CPU was completely fried, and its thermal paste splashed everywhere on the
motherboard. (I wish I had taken a picture of it that I could share.) There were
however a few pieces that looked to be in good shape. One of those components was an
NVMe Solid State Drive (SSD). I decided to take this SSD and recycle it in my
own laptop, maybe to join my LVM pool.</p>
<p>When I plugged the SSD into my laptop and tried to navigate the
filesystem, however, it appeared to be working quite slowly. Opening certain files
sometimes would hang indefinitely. Upon inspection of the SMART data and the
kernel logs, it was clear that the drive was returning plenty of <strong>read
errors</strong>.</p>
<p>Here is a sample of the kernel logs:</p>
<div class="highlight"><pre><span></span><code>$ dmesg
...
[ 860.465707] ata2.00: exception Emask 0x0 SAct 0x8 SErr 0x0 action 0x0
[ 860.465726] ata2.00: irq_stat 0x40000008
[ 860.465733] ata2.00: failed command: READ FPDMA QUEUED
[ 860.465737] ata2.00: cmd 60/08:18:58:c5:28/00:00:00:00:00/40 tag 3 ncq dma 4096 in
[ 860.465737] res 41/40:08:58:c5:28/00:00:00:00:00/00 Emask 0x409 (media error) <F>
[ 860.465750] ata2.00: status: { DRDY ERR }
[ 860.465754] ata2.00: error: { UNC }
[ 860.467010] ata2.00: configured for UDMA/133
[ 860.467046] sd 1:0:0:0: [sda] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 860.467054] sd 1:0:0:0: [sda] tag#3 Sense Key : Medium Error [current]
[ 860.467060] sd 1:0:0:0: [sda] tag#3 Add. Sense: Unrecovered read error - auto reallocate failed
[ 860.467066] sd 1:0:0:0: [sda] tag#3 CDB: Read(10) 28 00 00 28 c5 58 00 00 08 00
[ 860.467069] I/O error, dev sda, sector 2671960 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
...
[ 1057.914608] ata2: softreset failed (device not ready)
[ 1057.914623] ata2: hard resetting link
[ 1063.230631] ata2: found unknown device (class 0)
[ 1067.934891] ata2: softreset failed (device not ready)
[ 1067.934911] ata2: hard resetting link
[ 1073.270826] ata2: found unknown device (class 0)
[ 1078.486604] ata2: link is slow to respond, please be patient (ready=0)
[ 1102.970841] ata2: softreset failed (device not ready)
[ 1102.970860] ata2: limiting SATA link speed to 1.5 Gbps
[ 1102.970865] ata2: hard resetting link
[ 1108.034602] ata2: found unknown device (class 0)
[ 1108.194622] ata2: softreset failed (device not ready)
[ 1108.194638] ata2: reset failed, giving up
[ 1108.194642] ata2.00: disable device
[ 1108.194677] ata2: EH complete
[ 1108.194726] sd 1:0:0:0: [sda] tag#6 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK cmd_age=232s
[ 1108.194740] sd 1:0:0:0: [sda] tag#6 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[ 1108.194748] I/O error, dev sda, sector 0 op 0x1:(WRITE) flags 0x800 phys_seg 0 prio class 0
...
</code></pre></div>
<p>These logs show that the SSD was returning errors (exceptions) to the operating
system, and also that the SSD would sometimes become so slow to respond that
the kernel would attempt to reset it (which didn’t really work, I can tell
you).</p>
<p>Here is an excerpt of the SMART data:</p>
<div class="highlight"><pre><span></span><code>$ smartctl -a /dev/sda
...
SMART Attributes Data Structure revision number: 0
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 166 001 006 Pre-fail Always In_the_past 0
5 Retired_Block_Count 0x0032 100 100 036 Old_age Always - 76
9 Power_On_Hours 0x0032 099 099 000 Old_age Always - 1740
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 2247
100 Total_Erase_Count 0x0032 100 100 000 Old_age Always - 7654272
168 Min_Erase_Count 0x0032 253 096 000 Old_age Always - 0
169 Max_Erase_Count 0x0032 083 083 000 Old_age Always - 181
171 Program_Fail_Count 0x0032 253 253 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 253 253 000 Old_age Always - 0
174 Unexpect_Power_Loss_Ct 0x0030 100 100 000 Old_age Offline - 14
175 Program_Fail_Count_Chip 0x0032 253 253 000 Old_age Always - 0
176 Unused_Rsvd_Blk_Cnt_Tot 0x0032 253 253 000 Old_age Always - 0
177 Wear_Leveling_Count 0x0032 090 090 000 Old_age Always - 116
178 Used_Rsvd_Blk_Cnt_Chip 0x0032 100 100 000 Old_age Always - 399
179 Used_Rsvd_Blk_Cnt_Tot 0x0032 100 100 000 Old_age Always - 2460
180 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 2980
184 End-to-End_Error 0x0032 100 100 000 Old_age Always - 9919
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 10051
188 Command_Timeout 0x0032 253 253 000 Old_age Always - 0
194 Temperature_Celsius 0x0002 038 000 000 Old_age Always - 38 (Min/Max 16/48)
195 Hardware_ECC_Recovered 0x0032 100 085 000 Old_age Always - 715203
196 Reallocated_Event_Count 0x0032 100 100 036 Old_age Always - 76
198 Offline_Uncorrectable 0x0032 253 253 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x0032 253 253 000 Old_age Always - 0
204 Soft_ECC_Correction 0x000e 100 001 000 Old_age Always - 13
212 Phy_Error_Count 0x0032 253 253 000 Old_age Always - 0
234 Unknown_SK_hynix_Attrib 0x0032 100 100 000 Old_age Always - 32297
241 Total_Writes_GB 0x0032 100 100 000 Old_age Always - 3715
242 Total_Reads_GB 0x0032 100 100 000 Old_age Always - 3680
250 Read_Retry_Count 0x0032 096 096 000 Old_age Always - 176835377
...
</code></pre></div>
<p>This table shows various attributes for the operational status of the SSD. The
meaning of the numeric values is pretty much vendor-specific, so trying to
understand those numbers exactly is quite a challenge, but what matters is that
the numbers under the <code>VALUE</code> column are higher than the <code>THRESH</code> (threshold)
column. The <code>WORST</code> column indicates the lowest <code>VALUE</code> that has ever been
observed.</p>
<p>To my surprise, despite all the errors and hangs that the SSD was experiencing,
the SMART values looked pretty good. Sure, there’s a very low <code>WORST</code> value for
<code>Raw_Read_Error_Rate</code> (001, much lower than the threshold 006), and there is
also an indication that this attribute failed in the past, but besides that
everything looked acceptable enough.</p>
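<p>The comparison between <code>VALUE</code>, <code>WORST</code> and <code>THRESH</code> can also be automated. The following is a minimal sketch that works on a few rows copied from the table above; the function name and the status strings are my own, and treating a threshold of 0 as "no threshold" is an assumption on my part.</p>

```python
SAMPLE_TABLE = """\
  1 Raw_Read_Error_Rate 0x000f 166 001 006 Pre-fail Always In_the_past 0
  5 Retired_Block_Count 0x0032 100 100 036 Old_age Always - 76
194 Temperature_Celsius 0x0002 038 000 000 Old_age Always - 38 (Min/Max 16/48)
"""


def check_attributes(table: str) -> dict:
    """Map each attribute name to a status derived from VALUE/WORST/THRESH."""
    statuses = {}
    for line in table.splitlines():
        # Columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw value (which may itself contain spaces)
        fields = line.split(maxsplit=9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name = fields[1]
        value, worst, thresh = (int(field) for field in fields[3:6])
        if thresh == 0:
            # Assumption: a threshold of 0 means that there is no threshold
            continue
        if value <= thresh:
            statuses[name] = 'failing now'
        elif worst <= thresh:
            statuses[name] = 'failed in the past'
        else:
            statuses[name] = 'ok'
    return statuses


for name, status in check_attributes(SAMPLE_TABLE).items():
    print(f'{name}: {status}')
```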
<p>Of course the SMART log was recording the read errors as well. Here’s another
excerpt from the output:</p>
<div class="highlight"><pre><span></span><code>$ smartctl -a /dev/sda
...
SMART Error Log Version: 1
ATA Error Count: 1875 (device log contains only the most recent five errors)
...
Error 1875 occurred at disk power-on lifetime: 1737 hours (72 days + 9 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 41 00 00 00 00 00 Error: UNC at LBA = 0x00000000 = 0
Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 08 70 98 31 af 40 40 00:02:32.920 READ FPDMA QUEUED
47 00 01 30 08 00 a0 a0 00:02:32.920 READ LOG DMA EXT
47 00 01 30 00 00 a0 a0 00:02:32.920 READ LOG DMA EXT
47 00 01 00 00 00 a0 a0 00:02:32.920 READ LOG DMA EXT
ef 10 02 00 00 00 a0 a0 00:02:32.920 SET FEATURES [Enable SATA feature]
...
</code></pre></div>
<p>Given the lack of concrete signs of old age or extended damage to the SSD, I
wondered if it could be a link problem: maybe I did not insert the drive
correctly, or maybe a pin was dirty. But no: upon inspection I did not find any
issue, and after carefully reseating the drive, the problem persisted.</p>
<p>I proceeded to run a SMART self test, here are the results (from most recent to oldest):</p>
<div class="highlight"><pre><span></span><code>SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short captive Completed: read failure 90% 1736 5712
# 2 Short offline Completed: read failure 90% 1736 5712
# 3 Extended offline Completed: read failure 90% 1733 50117792
# 4 Extended captive Interrupted (host reset) 90% 1730 -
# 5 Short captive Interrupted (host reset) 90% 1730 -
</code></pre></div>
<p>The first two tests that I ran (entries 4 and 5 in the log) were interrupted by
Linux, which tried to reset the device while the tests were running. A self-test
(as the name suggests) is completely
self contained and does not involve sharing of data between the SSD and the
operating system in the process. The fact that the self-test was failing due to
bad blocks was therefore a sign that this was not a link error, but that the
blocks were really damaged.</p>
<p>I decided therefore to give up on trying to fix the SSD, but I still wanted to
use it. After all, it was working for the most part: as long as you didn’t
access the bad blocks, the SSD would behave fine. So here was my plan: I would
format the SSD and create an ext4 filesystem on it, using <code>mkfs.ext4 -c</code>, which
would scan for and exclude bad blocks so that they wouldn’t be used. The
resulting filesystem would have less storage available than the advertised
capacity of the SSD, but that was an acceptable trade-off for me.</p>
<p>And here is the most interesting part: <code>mkfs.ext4 -c</code> <strong>discarded all blocks
before creating the filesystem</strong>. After that, it <strong>scanned for bad blocks and,
shockingly, it found none!</strong></p>
<p>SMART self-tests also did not report any error:</p>
<div class="highlight"><pre><span></span><code>SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1740 -
# 2 Short offline Completed without error 00% 1738 -
</code></pre></div>
<p>All the read errors, exceptions and the hanging problem that kept appearing
before disappeared!</p>
<p>I’m not fully sure how to explain how this happened, but I did some research
and the general consensus is that discarding bad blocks won’t recover them. My
theory is that, when the coffee was poured on the laptop, a voltage spike
caused incorrect values to be written to a few blocks that were in use at that
time, causing uncorrectable discrepancies between the data and the
error-correcting codes of the SSD. Discarding the blocks reset both the data
cells and the ECC cells, removing all the inconsistencies.</p>
<p>Do you have a better explanation? Let me know in the comments!</p>andreacorbelliniThu, 29 Dec 2022 04:00:00 +0000tag:andrea.corbellini.name,2022-12-29:/2022/12/29/curious-ssd-badblocks/information-technologyubuntustoragessdbadblocksHow to use the same DNS for all connections in Ubuntu (and other network privacy tricks)https://andrea.corbellini.name/2020/04/28/ubuntu-global-dns/<h1 id="problem">Problem</h1>
<p>Currently Ubuntu does not offer an easy way to set up a “global” DNS for all network connections: whenever you connect to a new WiFi network, if you don’t want to use the DNS server provided by the WiFi, you are forced to go to the network settings and manually set your preferred DNS server.</p>
<p>With this brief guide I want to show how you can set up a global DNS to be used for <em>all</em> the WiFi and network connections, both old and new ones. I will also show you how to use DNSSEC, DNS-over-TLS and randomized MAC addresses for all connections.</p>
<p>This guide is written for Ubuntu 20.04, but in general it will work on every distribution using systemd-resolved and NetworkManager.</p>
<h1 id="step-1-setup-the-global-dns-in-resolved">Step 1: Set up the global DNS in resolved</h1>
<p>In Ubuntu (as well as many other distributions), DNS is managed by systemd-resolved. Its configuration is in <code>/etc/systemd/resolved.conf</code>. Open that file and add a <code>DNS=</code> line inside the <code>[Resolve]</code> section listing your preferred DNS servers. For example, if you want to use <a href="https://1.1.1.1/"><code>1.1.1.1</code></a>, your <code>resolved.conf</code> should look like this:</p>
<div class="highlight"><pre><span></span><code>[Resolve]
DNS=1.1.1.1 1.0.0.1 2606:4700:4700::1111 2606:4700:4700::1001
#FallbackDNS=
#Domains=
#LLMNR=no
#MulticastDNS=no
#Cache=yes
#DNSStubListener=yes
#ReadEtcHosts=yes
</code></pre></div>
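<p>Before restarting the service, it can be useful to check that the <code>DNS=</code> line contains what you expect. Here is a small sketch that does so with the Python standard library. It works on a sample string rather than on the real file, the function name is hypothetical, and it assumes that every entry is a plain IP address (without any port or server name suffix).</p>

```python
import configparser
import ipaddress

SAMPLE_CONFIG = """\
[Resolve]
DNS=1.1.1.1 1.0.0.1 2606:4700:4700::1111 2606:4700:4700::1001
#FallbackDNS=
#Domains=
"""


def parse_global_dns(config_text: str) -> list:
    """Return the addresses listed in the DNS= option of the [Resolve] section."""
    parser = configparser.ConfigParser(
        delimiters=('=',), interpolation=None, strict=False)
    # Keep option names case-sensitive ('DNS' rather than 'dns')
    parser.optionxform = str
    parser.read_string(config_text)
    servers = parser.get('Resolve', 'DNS', fallback='').split()
    # ip_address() raises ValueError if an entry is not a valid IP address
    return [ipaddress.ip_address(server) for server in servers]


for address in parse_global_dns(SAMPLE_CONFIG):
    print(f'IPv{address.version}: {address}')
```

<p>To check the real configuration, the contents of <code>/etc/systemd/resolved.conf</code> can be read and passed to the function in place of the sample string.</p>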
<p>Once you are done with the changes, restart systemd-resolved:</p>
<div class="highlight"><pre><span></span><code>sudo systemctl restart systemd-resolved.service
</code></pre></div>
<p>You can check your changes with <code>resolvectl status</code>: you should see your DNS servers on top of the output, under the <em>Global</em> section:</p>
<div class="highlight"><pre><span></span><code>$ resolvectl status
Global
LLMNR setting: no
MulticastDNS setting: no
DNSOverTLS setting: opportunistic
DNSSEC setting: allow-downgrade
DNSSEC supported: no
Current DNS Server: 1.1.1.1
DNS Servers: 1.1.1.1
1.0.0.1
2606:4700:4700::1111
2606:4700:4700::1001
...
</code></pre></div>
<p>This however won’t be enough to use that DNS! In fact, the <em>Global</em> DNS of systemd-resolved is just a default option that is used whenever no DNS servers are configured for an interface. When you connect to a WiFi network, NetworkManager will ask the access point for a list of DNS servers and will communicate that list to systemd-resolved, effectively overriding the settings that we just edited. If you scroll down the output of <code>resolvectl status</code>, you will see the DNS servers added by NetworkManager. We have to tell NetworkManager to stop doing that.</p>
<h1 id="step-2-disable-dns-processing-in-networkmanager">Step 2: Disable DNS processing in NetworkManager</h1>
<p>In order for systemd-resolved to consider our global DNS, we need to tell NetworkManager not to provide any DNS information for new connections. Doing that is easy: just create a new file <code>/etc/NetworkManager/conf.d/dns.conf</code> (or any name you like) with this content:</p>
<div class="highlight"><pre><span></span><code>[main]
# do not use the dhcp-provided dns servers, but rather use the global
# ones specified in /etc/systemd/resolved.conf
dns=none
systemd-resolved=false
</code></pre></div>
<p>To apply the settings either restart your computer or run:</p>
<div class="highlight"><pre><span></span><code>sudo systemctl reload NetworkManager.service
</code></pre></div>
<p>Now, when you connect to a new network, NetworkManager won’t push the list of DNS servers to systemd-resolved and only the global ones will be used. If you check <code>resolvectl status</code>, you should see that, for every interface, there is <em>no</em> DNS server specified. If you specified <code>1.1.1.1</code> as your DNS server, then you can also head over to <a href="https://1.1.1.1/help">https://1.1.1.1/help</a> to verify that it’s been correctly set up.</p>
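<p>If you would rather not scroll through the output by hand, the check can be scripted. What follows is only a minimal sketch: the function names <code>global_dns_servers</code> and <code>links_with_dns</code> are made up for this example, and the parsing assumes the general layout of <code>resolvectl status</code> shown earlier (a <em>Global</em> section followed by <em>Link</em> sections), which may vary between systemd versions.</p>

```shell
# Both helpers read the output of `resolvectl status` on stdin.
# The names are hypothetical; the parsing assumes a "Global" header line
# followed by "Link N (name)" header lines, as in the output shown earlier.

# Print the DNS servers of the Global section, one per line.
global_dns_servers() {
    awk '
        /^Global/ { in_global = 1; next }
        /^Link /  { in_global = 0 }
        !in_global { next }
        /Fallback DNS Servers:/ { in_list = 0; next }
        /DNS Servers:/ {
            # Servers on the same line as the key (newer output format).
            sub(/^.*DNS Servers:[ \t]*/, "")
            n = split($0, parts, /[ \t]+/)
            for (i = 1; i <= n; i++) if (parts[i] != "") print parts[i]
            in_list = 1
            next
        }
        in_list && /^[ \t]*[0-9A-Fa-f:.]+[ \t]*$/ {
            # Continuation line holding a bare IPv4/IPv6 address.
            gsub(/[ \t]/, "")
            print
            next
        }
        { in_list = 0 }
    '
}

# Print the header line of every link that still has DNS servers configured.
links_with_dns() {
    awk '
        /^Global/ { link = ""; next }
        /^Link /  { link = $0; next }
        link != "" && !/Fallback/ && /DNS Servers:[ \t]*[^ \t]/ {
            print link
            link = ""
        }
    '
}

# Example:
#   resolvectl status | global_dns_servers
#   resolvectl status | links_with_dns    # expected to print nothing after Step 2
```

<p>If <code>links_with_dns</code> still prints an interface after Step 2, that interface is probably using a connection that was established before NetworkManager was reloaded; reconnecting should clear it.</p>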
<h1 id="dnssec-and-dns-over-tls">DNSSEC and DNS-over-TLS</h1>
<p>If you would like to enable DNSSEC and/or DNS-over-TLS, the file to edit is <code>/etc/systemd/resolved.conf</code>. You can add the following options:</p>
<ul>
<li><code>DNSSEC=true</code> if you want all queries to be DNSSEC-validated. The default is <code>DNSSEC=allow-downgrade</code>, which attempts to use DNSSEC if it works properly, and falls back to disabling validation otherwise.</li>
<li><code>DNSOverTLS=true</code> if you want all queries to go through TLS. You can also specify <code>DNSOverTLS=opportunistic</code> to attempt to use TLS if it is supported, and fall back to the plaintext DNS protocol if it’s not.</li>
</ul>
<p>With those options, my <code>/etc/systemd/resolved.conf</code> looks like this:</p>
<div class="highlight"><pre><span></span><code>[Resolve]
DNS=1.1.1.1 1.0.0.1 2606:4700:4700::1111 2606:4700:4700::1001
#FallbackDNS=
#Domains=
#LLMNR=no
#MulticastDNS=no
DNSSEC=true
DNSOverTLS=opportunistic
#Cache=yes
#DNSStubListener=yes
#ReadEtcHosts=yes
</code></pre></div>
<p>Note that I’m using <code>DNSOverTLS=opportunistic</code> because I found that some access points with captive portals don’t work properly when using <code>DNSOverTLS=true</code>. Also note that <code>DNSSEC=true</code> may cause some pain because there are still many misconfigured domain records out there that will make DNSSEC validation fail.</p>
<p>Like before, to apply the changes, run:</p>
<div class="highlight"><pre><span></span><code>sudo systemctl restart systemd-resolved.service
</code></pre></div>
<p>And to verify the changes:</p>
<div class="highlight"><pre><span></span><code>resolvectl status
</code></pre></div>
<p>If you’re using <code>1.1.1.1</code>, you can also go to <a href="https://1.1.1.1/help">https://1.1.1.1/help</a> to verify DNS-over-TLS.</p>
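<p>To double check what is actually written in the file (as opposed to what is commented out), a small helper can read a single key from the <code>[Resolve]</code> section. This is just a sketch: <code>resolved_conf_get</code> is a made-up name, and it only looks at the one file you pass to it, so it says nothing about drop-in files or about the settings systemd-resolved is really using (for that, <code>resolvectl status</code> remains the reference).</p>

```shell
# Print the value of KEY from the [Resolve] section of a resolved.conf-style
# file, skipping commented-out lines. Returns non-zero if the key is not set.
# Usage: resolved_conf_get FILE KEY
# (resolved_conf_get is a hypothetical helper name.)
resolved_conf_get() {
    awk -v key="$2" '
        /^[ \t]*\[/ { in_resolve = ($0 ~ /^[ \t]*\[Resolve\][ \t]*$/); next }
        /^[ \t]*[#;]/ { next }
        in_resolve {
            eq = index($0, "=")
            if (eq == 0) next
            name = substr($0, 1, eq - 1)
            gsub(/[ \t]/, "", name)
            if (name == key) {
                # Keep the last assignment found in the file.
                value = substr($0, eq + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                found = 1
            }
        }
        END {
            if (!found) exit 1
            print value
        }
    ' "$1"
}

# Example:
#   resolved_conf_get /etc/systemd/resolved.conf DNSOverTLS
```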
<h1 id="random-mac-address">Random MAC address</h1>
<p>NetworkManager supports 3 options to have a random MAC address (also known as “cloned” or “spoofed” MAC address):</p>
<ol>
<li><code>wifi.scan-rand-mac-address</code> controls the MAC address used when scanning for WiFi devices. This goes into the <code>[device]</code> section.</li>
<li><code>wifi.cloned-mac-address</code> controls the MAC address for WiFi connections. This goes into the <code>[connection]</code> section.</li>
<li><code>ethernet.cloned-mac-address</code> controls the MAC address for Ethernet connections. This goes into the <code>[connection]</code> section.</li>
</ol>
<p>The first option can take either <code>yes</code> or <code>no</code>. The last two can take various values, but if you want a randomized MAC address you are interested in these two:</p>
<ul>
<li><code>random</code>: generate a new random MAC address each time you establish a connection</li>
<li><code>stable</code>: this generates a MAC address that is kinda random (it’s a hash), but will be reused when you connect to the same network again.</li>
</ul>
<p><code>random</code> is better if you don’t want to be tracked, but it has the disadvantage that captive portals won’t remember you. <code>stable</code>, instead, allows captive portals to remember you, so they won’t show up every time you reconnect.</p>
<p>Whatever options you want to go with, put them into a file <code>/etc/NetworkManager/conf.d/mac.conf</code> (or any other name you like). Mine looks like this:</p>
<div class="highlight"><pre><span></span><code>[device]
# use a random mac address when scanning for wifi networks
wifi.scan-rand-mac-address=yes
[connection]
# use a random mac address when connecting to a network
ethernet.cloned-mac-address=random
wifi.cloned-mac-address=random
</code></pre></div>
<p>To apply the settings either restart your computer or run:</p>
<div class="highlight"><pre><span></span><code>sudo systemctl reload NetworkManager.service
</code></pre></div>
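<p>Once the settings are applied, one way to tell a generated address from the one burned into the hardware is the “locally administered” bit (the second least significant bit of the first octet). As far as I know, NetworkManager sets that bit by default on the addresses it generates, but treat this as an assumption and not as a guarantee. The helper below is a minimal sketch, and <code>is_locally_administered</code> is a made-up name:</p>

```shell
# Succeed if the given MAC address has the "locally administered" bit set
# (value 0x02 in the first octet).
# (is_locally_administered is a hypothetical helper name.)
is_locally_administered() (
    first_octet="${1%%:*}"
    [ $(( 0x$first_octet & 2 )) -ne 0 ]
)

# Example, with an address copied from the output of `ip link`:
#   is_locally_administered "a6:5e:60:12:34:56" && echo "looks generated"
```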
<p>You can test your changes with:</p>
<div class="highlight"><pre><span></span><code>ip link
</code></pre></div>andreacorbelliniTue, 28 Apr 2020 06:30:00 +0000tag:andrea.corbellini.name,2020-04-28:/2020/04/28/ubuntu-global-dns/ubuntuubuntusystemdresolvednetwork-managerdnsmac-addressprivacy11 years of Ubuntu membershiphttps://andrea.corbellini.name/2018/05/12/11-years-of-ubuntu-membership/<p>It’s been 11 years and 1 month since I was awarded with <a href="https://wiki.ubuntu.com/Membership">official Ubuntu membership</a>. I will never forget that day: as a kid I had to write about myself on <a href="https://en.wikipedia.org/wiki/Internet_Relay_Chat">IRC</a>, in front of the Community Council members and answer their questions in a language that was not my primary one. I must confess that I was a bit scared that evening, but once I made it, it felt <em>so</em> good. It felt good not just because of the award itself, but rather because that was the recognition that I did <em>something</em> that <em>mattered</em>. I did something useful that other people could benefit from. And for me, that meant a lot.</p>
<p>So much time has passed since then. So many things have changed both in my life and around me, for better or worse. So many that I cannot even <a href="https://en.wikipedia.org/wiki/Cantor%27s_diagonal_argument">enumerate</a> all of them. Nonetheless, deep inside of me, I still feel like that young kid: curious, always ready to experiment, full of hopes and uncertain (but never scared) about the future.</p>
<p>Through the years I received the support of a bunch of people who believed in me, and I thank them all. But if today I feel so hopeful it’s undoubtedly thanks to one person in particular, a person who holds a special place in my life. A big thank you goes to you.</p>andreacorbelliniSat, 12 May 2018 21:30:00 +0000tag:andrea.corbellini.name,2018-05-12:/2018/05/12/11-years-of-ubuntu-membership/ubuntuubuntuRunning Docker Swarm inside LXC (outdated)https://andrea.corbellini.name/2016/04/13/docker-swarm-inside-lxc/<p><strong>UPDATE:</strong> This article was written in 2016 and refers to a version of Docker Swarm that is now known as “legacy Swarm”. The newer Docker Swarm won’t work in LXC as described in this article.</p>
<p>I’ve been using <a href="https://docs.docker.com/swarm/">Docker Swarm</a> inside <a href="https://linuxcontainers.org/lxc/introduction/">LXC</a> containers for a while now, and I thought that I could share my experience with you. Due to their nature, LXC containers are pretty lightweight and require very few resources if compared to virtual machines. This makes LXC ideal for development and simulation purposes. Running Docker Swarm inside LXC requires a few steps that I’m going to show you in this tutorial.</p>
<p>Before we begin, a quick premise: LXC, Docker and Swarm can be configured in many different ways. Here I’m showing just my preferred setup: LXC with AppArmor disabled, Docker with the OverlayFS storage driver, Swarm with etcd discovery. There are many other kinds of configurations that can work under LXC; leave a comment if you want to know more.</p>
<p><strong>Overview:</strong></p>
<ol>
<li><a href="#step-1">Create the Swarm Manager container</a></li>
<li><a href="#step-2">Modify the configuration for the Swarm Manager container</a></li>
<li><a href="#step-3">Load the OverlayFS module</a></li>
<li><a href="#step-4">Start the container and install Docker</a></li>
<li><a href="#step-5">Check if Docker is working</a></li>
<li><a href="#step-6">Set up the Swarm Manager</a></li>
<li><a href="#step-7">Create the Swarm Agents</a></li>
<li><a href="#step-8">Play with the Swarm</a></li>
</ol>
<p><strong>Terminology:</strong></p>
<ul>
<li>the <em>host</em> is the system that will create and start the LXC containers (e.g. your laptop);</li>
<li>the <em>manager</em> is the LXC container that will run the Swarm manager (it’ll run the <code>swarm manage</code> command);</li>
<li>an <em>agent</em> is one of the many LXC containers that will run a Swarm agent node (it’ll run the <code>swarm join</code> command).</li>
</ul>
<p>To avoid ambiguity, all commands will be prefixed with a prompt such as <code>root@host:~#</code>, <code>root@swarm-manager:~#</code> and <code>root@swarm-agent-1:~#</code>.</p>
<p><strong>Prerequisites:</strong></p>
<p>This tutorial assumes that you have at least a vague idea of what Docker and Docker Swarm are. You should also be familiar with the shell.</p>
<p>This tutorial has been successfully tested on Ubuntu 15.10 (which ships with Docker 1.6) and Ubuntu 16.04 LTS (Docker 1.10), but it may work on other distributions and Docker versions as well.</p>
<h1 id="step-1">Step 1: Create the Swarm Manager container</h1>
<p>Create a new LXC container with:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@host:~# </span>lxc-create<span class="w"> </span>-t<span class="w"> </span>download<span class="w"> </span>-n<span class="w"> </span>swarm-manager
</code></pre></div>
<p>When prompted, choose your favorite distribution and architecture. I chose <code>ubuntu</code> / <code>xenial</code> / <code>amd64</code>.</p>
<p><code>lxc-create</code> needs to run as root: <a href="https://www.stgraber.org/2014/01/17/lxc-1-0-unprivileged-containers/">unprivileged containers</a> won’t work. We could actually make Docker start inside an unprivileged container, but we wouldn’t be allowed to create block and character devices, and many Docker containers need this ability.</p>
<h1 id="step-2">Step 2: Modify the configuration for the Swarm Manager container</h1>
<p>Before starting the LXC container, open the file <code>/var/lib/lxc/swarm-manager/config</code> on the host and add the following configuration to the bottom of the file:</p>
<div class="highlight"><pre><span></span><code><span class="c1"># Distribution configuration</span>
<span class="c1"># ...</span>
<span class="c1"># Container specific configuration</span>
<span class="c1"># ...</span>
<span class="c1"># Network configuration</span>
<span class="c1"># ...</span>
<span class="c1"># Allow running Docker inside LXC</span>
lxc.aa_profile<span class="w"> </span><span class="o">=</span><span class="w"> </span>unconfined
lxc.cap.drop<span class="w"> </span><span class="o">=</span>
</code></pre></div>
<p>The first rule (<code>lxc.aa_profile = unconfined</code>) disables AppArmor confinement. The second one (<code>lxc.cap.drop =</code>) gives all capabilities to the processes in the LXC container.</p>
<p>These two rules may seem harmful from a security standpoint, and in fact they are. However we must remember that we will be running Docker inside the LXC container. Docker already ships with its own AppArmor profile and the two rules above are needed exactly for the purposes of letting Docker talk to AppArmor.</p>
<p>So, while Docker itself won’t be confined, <strong>Docker containers will be confined</strong>, and this is an encouraging fact.</p>
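<p>If you plan to apply this configuration from a script, it helps to make the change idempotent, so that running the script twice does not append the two rules twice. Here is a minimal sketch of that idea; <code>allow_docker_in_lxc</code> is a made-up name, and the function assumes the same <code>lxc.aa_profile</code> and <code>lxc.cap.drop</code> keys used above.</p>

```shell
# Append the two rules shown above to an LXC config file, unless an
# lxc.aa_profile rule is already present.
# Usage: allow_docker_in_lxc /var/lib/lxc/NAME/config
# (allow_docker_in_lxc is a hypothetical helper name.)
allow_docker_in_lxc() (
    config="$1"
    if grep -q '^[[:space:]]*lxc\.aa_profile[[:space:]]*=' "$config"; then
        echo "already configured: $config"
    else
        cat >> "$config" <<'EOF'

# Allow running Docker inside LXC
lxc.aa_profile = unconfined
lxc.cap.drop =
EOF
        echo "updated: $config"
    fi
)

# Example (on the host, as root):
#   allow_docker_in_lxc /var/lib/lxc/swarm-manager/config
```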
<h1 id="step-3">Step 3: Load the OverlayFS module</h1>
<p>OverlayFS is shipped with Ubuntu, but not enabled by default. To enable it:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@host:~# </span>modprobe<span class="w"> </span>overlay
</code></pre></div>
<p>It is important to do this step before installing Docker. Docker supports various storage drivers, and when Docker is installed for the first time it tries to detect the most appropriate one for the system. If Docker detects that OverlayFS is not loaded, it’ll fall back to the device mapper. There’s nothing wrong with the device mapper, and we could make it work; however, as I said at the beginning, in this tutorial I’m focusing only on OverlayFS.</p>
<p>If you want to load OverlayFS at boot, instead of doing it manually after every reboot, add it to <code>/etc/modules-load.d/modules.conf</code>:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@host:~# </span><span class="nb">echo</span><span class="w"> </span>overlay<span class="w"> </span>>><span class="w"> </span>/etc/modules-load.d/modules.conf
</code></pre></div>
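<p>Before moving on, you may want to confirm that the kernel really knows about OverlayFS. One way is to look for <code>overlay</code> in <code>/proc/filesystems</code>, which lists the filesystems the running kernel currently supports. The function below is a small sketch of that check (<code>overlayfs_available</code> is a made-up name; the optional argument exists only to make it easy to try on a sample file):</p>

```shell
# Succeed if "overlay" is listed among the filesystems known to the kernel.
# Usage: overlayfs_available [FILE]    (FILE defaults to /proc/filesystems)
# (overlayfs_available is a hypothetical helper name.)
overlayfs_available() (
    filesystems="${1:-/proc/filesystems}"
    awk '$NF == "overlay" { found = 1 } END { exit !found }' "$filesystems"
)

# Example (on the host):
#   overlayfs_available && echo "OverlayFS is ready" || echo "run: modprobe overlay"
```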
<h1 id="step-4">Step 4: Start the container and install Docker</h1>
<p>It’s time to see if we did everything right!</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@host:~# </span>lxc-start<span class="w"> </span>-n<span class="w"> </span>swarm-manager
<span class="gp">root@host:~# </span>lxc-attach<span class="w"> </span>-n<span class="w"> </span>swarm-manager
<span class="gp">root@swarm-manager:~# </span>apt<span class="w"> </span>update
<span class="gp">root@swarm-manager:~# </span>apt<span class="w"> </span>install<span class="w"> </span>docker.io
</code></pre></div>
<p>Installation should complete without any problem. If you get an error like this:</p>
<div class="highlight"><pre><span></span><code>Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journalctl -xe" for details.
invoke-rc.d: initscript docker, action "start" failed.
dpkg: error processing package docker.io (--configure):
subprocess installed post-installation script returned error exit status 1
</code></pre></div>
<p>It means that Docker failed to start. Try checking <code>systemctl status docker</code> as suggested, or run <code>docker daemon</code> manually. You might get an error like this:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>daemon
<span class="go">WARN[0000] devmapper: Udev sync is not supported. This will lead to unexpected behavior, data loss and errors. For more information, see https://docs.docker.com/reference/commandline/daemon/#daemon-storage-driver-option</span>
<span class="go">ERRO[0000] There are no more loopback devices available.</span>
<span class="go">ERRO[0000] [graphdriver] prior storage driver "devicemapper" failed: loopback attach failed</span>
<span class="go">FATA[0000] Error starting daemon: error initializing graphdriver: loopback attach failed</span>
</code></pre></div>
<p>In this case, Docker is using the devicemapper storage driver and is complaining about the lack of loopback devices. If that’s the case, check whether OverlayFS is loaded and reinstall Docker.</p>
<p>Or you might get an error like this:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>daemon
<span class="go">...</span>
<span class="go">FATA[0000] Error starting daemon: AppArmor enabled on system but the docker-default profile could not be loaded.</span>
</code></pre></div>
<p>In this other case, Docker is complaining about the fact that it can’t talk to AppArmor. Check the configuration for the LXC container.</p>
<h1 id="step-5">Step 5: Check if Docker is working</h1>
<p>Once you are all set, you should be able to use Docker: try running <code>docker info</code>, <code>docker ps</code> or launch a container:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>run<span class="w"> </span>--rm<span class="w"> </span>docker/whalesay<span class="w"> </span>cowsay<span class="w"> </span>burp!
<span class="go">Unable to find image 'docker/whalesay:latest' locally</span>
<span class="go">latest: Pulling from docker/whalesay</span>
<span class="go">...</span>
<span class="go">Status: Downloaded newer image for docker/whalesay:latest</span>
<span class="go"> _______</span>
<span class="go">< burp! ></span>
<span class="go"> -------</span>
<span class="go"> \</span>
<span class="go"> \</span>
<span class="go"> \</span>
<span class="gp"> #</span><span class="c1"># .</span>
<span class="gp"> #</span><span class="c1"># ## ## ==</span>
<span class="gp"> #</span><span class="c1"># ## ## ## ===</span>
<span class="go"> /""""""""""""""""___/ ===</span>
<span class="go"> ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ / ===- ~~~</span>
<span class="go"> \______ o __/</span>
<span class="go"> \ \ __/</span>
<span class="go"> \____\______/</span>
</code></pre></div>
<p>It appears to be working. By the way, we can check whether Docker is correctly confining containers. Try running a Docker container and check on the host the output of <code>aa-status</code>: you should see a process running with the <code>docker-default</code> profile. For example:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>run<span class="w"> </span>--rm<span class="w"> </span>ubuntu<span class="w"> </span>bash<span class="w"> </span>-c<span class="w"> </span><span class="s1">'while true; do sleep 1; echo -n zZ; done'</span>
<span class="go">zZzZzZzZzZzZzZzZ...</span>
<span class="gp"># </span>On<span class="w"> </span>another<span class="w"> </span>shell
<span class="gp">root@host:~# </span>aa-status
<span class="go">apparmor module is loaded.</span>
<span class="go">5 profiles are loaded.</span>
<span class="go">5 profiles are in enforce mode.</span>
<span class="go"> /sbin/dhclient</span>
<span class="go"> /usr/lib/NetworkManager/nm-dhcp-client.action</span>
<span class="go"> /usr/lib/NetworkManager/nm-dhcp-helper</span>
<span class="go"> /usr/lib/connman/scripts/dhclient-script</span>
<span class="go"> docker-default</span>
<span class="go">0 profiles are in complain mode.</span>
<span class="go">4 processes have profiles defined.</span>
<span class="go">4 processes are in enforce mode.</span>
<span class="go"> /sbin/dhclient (797)</span>
<span class="go"> /sbin/dhclient (2832)</span>
<span class="go"> docker-default (6956)</span>
<span class="go"> docker-default (6973)</span>
<span class="go">0 processes are in complain mode.</span>
<span class="go">0 processes are unconfined but have a profile defined.</span>
<span class="gp">root@host:~# </span>ps<span class="w"> </span>-ef<span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span><span class="m">6956</span>
<span class="go">root 6956 4982 0 17:17 ? 00:00:00 bash -c while true; do sleep 1; echo -n zZ; done</span>
<span class="go">root 6973 6956 0 17:17 ? 00:00:00 sleep 1</span>
<span class="go">root 6982 6808 0 17:17 pts/3 00:00:00 grep --color=auto 6956</span>
</code></pre></div>
<p>Yay! Everything is running as expected: we launched a process inside a Docker container, and that process is running with the <code>docker-default</code> AppArmor profile. Once again: even if LXC is running unconfined, our Docker containers are not.</p>
<h1 id="step-6">Step 6: Set up the Swarm Manager</h1>
<p>That was the hardest part. Now we can proceed to set up Swarm as we would usually do.</p>
<p>As I said at the beginning, Swarm can be configured in many ways. In this tutorial I’ll show how to set it up with etcd discovery. First of all, we need the IP address of the LXC container:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>ifconfig<span class="w"> </span>eth0
<span class="go">eth0 Link encap:Ethernet HWaddr 00:16:3e:8e:cb:43</span>
<span class="go"> inet addr:10.0.3.154 Bcast:10.0.3.255 Mask:255.255.255.0</span>
<span class="go"> inet6 addr: fe80::216:3eff:fe8e:cb43/64 Scope:Link</span>
<span class="go"> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1</span>
<span class="go"> RX packets:23177 errors:0 dropped:0 overruns:0 frame:0</span>
<span class="go"> TX packets:20859 errors:0 dropped:0 overruns:0 carrier:0</span>
<span class="go"> collisions:0 txqueuelen:1000</span>
<span class="go"> RX bytes:147652946 (147.6 MB) TX bytes:1455613 (1.4 MB)</span>
</code></pre></div>
<p><code>10.0.3.154</code> is my IP address. Let’s start etcd:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span><span class="nv">SWARM_MANAGER_IP</span><span class="o">=</span><span class="m">10</span>.0.3.154
<span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>run<span class="w"> </span>-d<span class="w"> </span>--restart<span class="o">=</span>always<span class="w"> </span>--name<span class="o">=</span>etcd<span class="w"> </span>-p<span class="w"> </span><span class="m">4001</span>:4001<span class="w"> </span>-p<span class="w"> </span><span class="m">2380</span>:2380<span class="w"> </span>-p<span class="w"> </span><span class="m">2379</span>:2379<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>quay.io/coreos/etcd<span class="w"> </span>-name<span class="w"> </span>etcd0<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-advertise-client-urls<span class="w"> </span>http://<span class="nv">$SWARM_MANAGER_IP</span>:2379,http://<span class="nv">$SWARM_MANAGER_IP</span>:4001<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-listen-client-urls<span class="w"> </span>http://0.0.0.0:2379,http://0.0.0.0:4001<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-initial-advertise-peer-urls<span class="w"> </span>http://<span class="nv">$SWARM_MANAGER_IP</span>:2380<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-listen-peer-urls<span class="w"> </span>http://0.0.0.0:2380<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-initial-cluster-token<span class="w"> </span>etcd-cluster-1<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-initial-cluster<span class="w"> </span><span class="nv">etcd0</span><span class="o">=</span>http://<span class="nv">$SWARM_MANAGER_IP</span>:2380<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-initial-cluster-state<span class="w"> </span>new
<span class="go">Unable to find image 'quay.io/coreos/etcd:latest' locally</span>
<span class="go">latest: Pulling from coreos/etcd</span>
<span class="go">...</span>
<span class="go">Status: Downloaded newer image for quay.io/coreos/etcd:latest</span>
<span class="go">e742278a97d2ad3f88658aa871903d20b4094e551969a03aa8332d3876fe5d0d</span>
<span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>ps
<span class="go">CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES</span>
<span class="go">e742278a97d2 quay.io/coreos/etcd "/etcd -name etcd0 -a" 32 seconds ago Up 31 seconds 0.0.0.0:2379-2380->2379-2380/tcp, 0.0.0.0:4001->4001/tcp, 7001/tcp etcd</span>
</code></pre></div>
<p>Replace <code>10.0.3.154</code> with the IP address of your LXC container.</p>
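<p>If you don’t want to copy the address by hand, it can also be extracted automatically. The sketch below parses the one-line output of <code>ip -4 -o addr show</code> and not that of <code>ifconfig</code>, whose format differs between versions; it assumes that the <code>ip</code> tool is available in the container and that the interface is called <code>eth0</code>. <code>first_ipv4</code> is a made-up name.</p>

```shell
# Read the output of `ip -4 -o addr show dev IFACE` on stdin and print the
# first IPv4 address found, without the prefix length.
# (first_ipv4 is a hypothetical helper name.)
first_ipv4() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "inet") {
                split($(i + 1), addr, "/")
                print addr[1]
                exit
            }
        }
    }'
}

# Example (inside the LXC container):
#   SWARM_MANAGER_IP="$(ip -4 -o addr show dev eth0 | first_ipv4)"
```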
<p>Note that I’ve started etcd with <code>--restart=always</code>, so that etcd is automatically started every time the LXC container starts. With this option, etcd will restart even if you explicitly stop it. Drop <code>--restart=always</code> if that’s not what you want.</p>
<p>Now we can start the Swarm manager:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>run<span class="w"> </span>-d<span class="w"> </span>--restart<span class="o">=</span>always<span class="w"> </span>--name<span class="o">=</span>swarm<span class="w"> </span>-p<span class="w"> </span><span class="m">3375</span>:3375<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>swarm<span class="w"> </span>manage<span class="w"> </span>-H<span class="w"> </span><span class="m">0</span>.0.0.0:3375<span class="w"> </span>etcd://<span class="nv">$SWARM_MANAGER_IP</span>:2379
<span class="go">Unable to find image 'swarm:latest' locally</span>
<span class="go">latest: Pulling from library/swarm</span>
<span class="go">...</span>
<span class="go">Status: Downloaded newer image for swarm:latest</span>
<span class="go">8080c93c544ff92cc2cf682ff0bbc82e0d2dfb01e1f98f202c3a0801d3427330</span>
<span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>ps
<span class="go">CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES</span>
<span class="go">46b556e73e87 swarm "/swarm manage -H 0.0" 3 seconds ago Up 2 seconds 2375/tcp, 0.0.0.0:3375->3375/tcp swarm</span>
<span class="go">e742278a97d2 quay.io/coreos/etcd "/etcd -name etcd0 -a" 7 minutes ago Up 7 minutes 0.0.0.0:2379-2380->2379-2380/tcp, 0.0.0.0:4001->4001/tcp, 7001/tcp etcd</span>
</code></pre></div>
<p>Our Swarm manager is up and running. We can connect to it and issue a few commands:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>-H<span class="w"> </span>localhost:3375<span class="w"> </span>info
<span class="go">Containers: 0</span>
<span class="go"> Running: 0</span>
<span class="go"> Paused: 0</span>
<span class="go"> Stopped: 0</span>
<span class="go">Images: 0</span>
<span class="go">Server Version: swarm/1.1.3</span>
<span class="go">Role: primary</span>
<span class="go">Strategy: spread</span>
<span class="go">Filters: health, port, dependency, affinity, constraint</span>
<span class="go">Nodes: 0</span>
<span class="go">Plugins:</span>
<span class="go"> Volume:</span>
<span class="go"> Network:</span>
<span class="go">Kernel Version: 4.4.0-15-generic</span>
<span class="go">Operating System: linux</span>
<span class="go">Architecture: amd64</span>
<span class="go">CPUs: 0</span>
<span class="go">Total Memory: 0 B</span>
<span class="go">Name: d39c33295ef3</span>
</code></pre></div>
<p>As you can see there are no nodes connected, as we would expect. Everything looks good.</p>
<h1 id="step-7">Step 7: Create the Swarm Agents</h1>
<p>Our Swarm manager can’t do anything interesting without agent nodes. Creating new LXC containers for the agents is not much different from what we already did with the manager. To set up new agents in an automatic fashion I’ve created a script, so that you don’t need to repeat the steps manually:</p>
<div class="highlight"><pre><span></span><code><span class="ch">#!/bin/bash</span>
<span class="nb">set</span><span class="w"> </span>-eu
<span class="nv">SWARM_MANAGER_IP</span><span class="o">=</span><span class="m">10</span>.0.3.154
<span class="nv">DOWNLOAD_DIST</span><span class="o">=</span>ubuntu
<span class="nv">DOWNLOAD_RELEASE</span><span class="o">=</span>xenial
<span class="nv">DOWNLOAD_ARCH</span><span class="o">=</span>amd64
<span class="k">for</span><span class="w"> </span>LXC_NAME<span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span>
<span class="k">do</span>
<span class="w"> </span><span class="nv">LXC_PATH</span><span class="o">=</span><span class="s2">"/var/lib/lxc/</span><span class="nv">$LXC_NAME</span><span class="s2">"</span>
<span class="w"> </span><span class="nv">LXC_ROOTFS</span><span class="o">=</span><span class="s2">"</span><span class="nv">$LXC_PATH</span><span class="s2">/rootfs"</span>
<span class="w"> </span><span class="c1"># Create the container.</span>
<span class="w"> </span>lxc-create<span class="w"> </span>-t<span class="w"> </span>download<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_DIST</span><span class="s2">"</span><span class="w"> </span>-r<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_RELEASE</span><span class="s2">"</span><span class="w"> </span>-a<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_ARCH</span><span class="s2">"</span>
<span class="w"> </span>cat<span class="w"> </span><span class="s"><<EOF >> "$LXC_PATH/config"</span>
<span class="s"># Allow running Docker inside LXC</span>
<span class="s">lxc.aa_profile = unconfined</span>
<span class="s">lxc.cap.drop =</span>
<span class="s">EOF</span>
<span class="w"> </span><span class="c1"># Start the container and wait for networking to start.</span>
<span class="w"> </span>lxc-start<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span>
<span class="w"> </span>sleep<span class="w"> </span>10s
<span class="w"> </span><span class="c1"># Install Docker.</span>
<span class="w"> </span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>apt-get<span class="w"> </span>update
<span class="w"> </span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>apt-get<span class="w"> </span>install<span class="w"> </span>-y<span class="w"> </span>docker.io
<span class="w"> </span><span class="c1"># Tell Docker to listen on all interfaces.</span>
<span class="w"> </span>sed<span class="w"> </span>-i<span class="w"> </span>-e<span class="w"> </span><span class="s1">'s/^#DOCKER_OPTS=.*$/DOCKER_OPTS="-H 0.0.0.0:2375"/'</span><span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_ROOTFS</span><span class="s2">/etc/default/docker"</span>
<span class="w"> </span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>systemctl<span class="w"> </span>restart<span class="w"> </span>docker
<span class="w"> </span><span class="c1"># Join the Swarm.</span>
<span class="w"> </span><span class="nv">SWARM_AGENT_IP</span><span class="o">=</span><span class="s2">"</span><span class="k">$(</span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>ifconfig<span class="w"> </span>eth0<span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span>-Po<span class="w"> </span><span class="s1">'(?<=inet addr:)\S+'</span><span class="k">)</span><span class="s2">"</span>
<span class="w"> </span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>docker<span class="w"> </span>run<span class="w"> </span>-d<span class="w"> </span>--restart<span class="o">=</span>always<span class="w"> </span>--name<span class="o">=</span>swarm<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>swarm<span class="w"> </span>join<span class="w"> </span>--addr<span class="o">=</span><span class="s2">"</span><span class="nv">$SWARM_AGENT_IP</span><span class="s2">:2375"</span><span class="w"> </span><span class="s2">"etcd://</span><span class="nv">$SWARM_MANAGER_IP</span><span class="s2">:2379"</span>
<span class="k">done</span>
</code></pre></div>
<p>Be sure to change the values for <code>SWARM_MANAGER_IP</code>, <code>DOWNLOAD_DIST</code>, <code>DOWNLOAD_RELEASE</code> and <code>DOWNLOAD_ARCH</code> to fit your needs.</p>
<p>Thanks to this script, creating 10 new agents is as simple as running one command:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@host:~# </span>./swarm-agent-create<span class="w"> </span>swarm-agent-<span class="o">{</span><span class="m">0</span>..9<span class="o">}</span>
</code></pre></div>
<p>Here’s an explanation of what the script does:</p>
<ul>
<li>
<p>It first sets up a new LXC container following steps 1-5 above, that is: create a new LXC container (with <code>lxc-create</code>), apply the LXC configuration (<code>lxc.aa_profile</code> and <code>lxc.cap.drop</code> rules), start the container and install Docker.</p>
<div class="highlight"><pre><span></span><code><span class="nv">LXC_PATH</span><span class="o">=</span><span class="s2">"/var/lib/lxc/</span><span class="nv">$LXC_NAME</span><span class="s2">"</span>
<span class="nv">LXC_ROOTFS</span><span class="o">=</span><span class="s2">"</span><span class="nv">$LXC_PATH</span><span class="s2">/rootfs"</span>
<span class="c1"># Create the container.</span>
lxc-create<span class="w"> </span>-t<span class="w"> </span>download<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_DIST</span><span class="s2">"</span><span class="w"> </span>-r<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_RELEASE</span><span class="s2">"</span><span class="w"> </span>-a<span class="w"> </span><span class="s2">"</span><span class="nv">$DOWNLOAD_ARCH</span><span class="s2">"</span>
cat<span class="w"> </span><span class="s"><<EOF >> "$LXC_PATH/config"</span>
<span class="s"># Allow running Docker inside LXC</span>
<span class="s">lxc.aa_profile = unconfined</span>
<span class="s">lxc.cap.drop =</span>
<span class="s">EOF</span>
<span class="c1"># Start the container and wait for networking to start.</span>
lxc-start<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span>
sleep<span class="w"> </span>10s
<span class="c1"># Install Docker.</span>
lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>apt-get<span class="w"> </span>update
lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>apt-get<span class="w"> </span>install<span class="w"> </span>-y<span class="w"> </span>docker.io
</code></pre></div>
</li>
<li>
<p>Our Swarm agents need to be reachable by the manager. For this reason we need to configure them so that they bind to a public interface. To do so, the script adds <code>DOCKER_OPTS="-H 0.0.0.0:2375"</code> and restarts Docker.</p>
<div class="highlight"><pre><span></span><code><span class="c1"># Tell Docker to listen on all interfaces.</span>
sed<span class="w"> </span>-i<span class="w"> </span>-e<span class="w"> </span><span class="s1">'s/^#DOCKER_OPTS=.*$/DOCKER_OPTS="-H 0.0.0.0:2375"/'</span><span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_ROOTFS</span><span class="s2">/etc/default/docker"</span>
lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>systemctl<span class="w"> </span>restart<span class="w"> </span>docker
</code></pre></div>
</li>
<li>
<p>Lastly, the script looks up the IP address of the LXC container and launches the Swarm agent.</p>
<div class="highlight"><pre><span></span><code><span class="c1"># Join the Swarm.</span>
<span class="nv">SWARM_AGENT_IP</span><span class="o">=</span><span class="s2">"</span><span class="k">$(</span>lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>ifconfig<span class="w"> </span>eth0<span class="w"> </span><span class="p">|</span><span class="w"> </span>grep<span class="w"> </span>-Po<span class="w"> </span><span class="s1">'(?<=inet addr:)\S+'</span><span class="k">)</span><span class="s2">"</span>
lxc-attach<span class="w"> </span>-n<span class="w"> </span><span class="s2">"</span><span class="nv">$LXC_NAME</span><span class="s2">"</span><span class="w"> </span>--<span class="w"> </span>docker<span class="w"> </span>run<span class="w"> </span>-d<span class="w"> </span>--restart<span class="o">=</span>always<span class="w"> </span>--name<span class="o">=</span>swarm<span class="w"> </span><span class="se">\</span>
<span class="w"> </span>swarm<span class="w"> </span>join<span class="w"> </span>--addr<span class="o">=</span><span class="s2">"</span><span class="nv">$SWARM_AGENT_IP</span><span class="s2">:2375"</span><span class="w"> </span><span class="s2">"etcd://</span><span class="nv">$SWARM_MANAGER_IP</span><span class="s2">:2379"</span>
</code></pre></div>
</li>
</ul>
<h1 id="step-8">Step 8: Play with the Swarm</h1>
<p>Now, if we check <code>docker info</code> on the Swarm manager, we should see 10 healthy nodes:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>-H<span class="w"> </span>localhost:3375<span class="w"> </span>info
<span class="go">Containers: 10</span>
<span class="go"> Running: 10</span>
<span class="go"> Paused: 0</span>
<span class="go"> Stopped: 0</span>
<span class="go">Images: 10</span>
<span class="go">Server Version: swarm/1.1.3</span>
<span class="go">Role: primary</span>
<span class="go">Strategy: spread</span>
<span class="go">Filters: health, port, dependency, affinity, constraint</span>
<span class="go">Nodes: 10</span>
<span class="go"> swarm-agent-0: 10.0.3.73:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:35Z</span>
<span class="go"> swarm-agent-1: 10.0.3.97:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:31:49Z</span>
<span class="go"> swarm-agent-2: 10.0.3.58:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:31:54Z</span>
<span class="go"> swarm-agent-3: 10.0.3.195:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:03Z</span>
<span class="go"> swarm-agent-4: 10.0.3.235:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:22Z</span>
<span class="go"> swarm-agent-5: 10.0.3.174:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:16Z</span>
<span class="go"> swarm-agent-6: 10.0.3.222:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:21Z</span>
<span class="go"> swarm-agent-7: 10.0.3.140:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:31:43Z</span>
<span class="go"> swarm-agent-8: 10.0.3.95:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:17Z</span>
<span class="go"> swarm-agent-9: 10.0.3.125:2375</span>
<span class="go"> └ Status: Healthy</span>
<span class="go"> └ Containers: 1</span>
<span class="go"> └ Reserved CPUs: 0 / 4</span>
<span class="go"> └ Reserved Memory: 0 B / 4.052 GiB</span>
<span class="go"> └ Labels: executiondriver=native-0.2, kernelversion=4.4.0-15-generic, operatingsystem=Ubuntu 16.04, storagedriver=overlay</span>
<span class="go"> └ Error: (none)</span>
<span class="go"> └ UpdatedAt: 2016-04-13T15:32:30Z</span>
<span class="go">Plugins:</span>
<span class="go"> Volume:</span>
<span class="go"> Network:</span>
<span class="go">Kernel Version: 4.4.0-15-generic</span>
<span class="go">Operating System: linux</span>
<span class="go">Architecture: amd64</span>
<span class="go">CPUs: 40</span>
<span class="go">Total Memory: 40.52 GiB</span>
<span class="go">Name: d39c33295ef3</span>
</code></pre></div>
<p>Let’s try running a command on the Swarm:</p>
<div class="highlight"><pre><span></span><code><span class="gp">root@swarm-manager:~# </span>docker<span class="w"> </span>-H<span class="w"> </span>localhost:3375<span class="w"> </span>run<span class="w"> </span>-i<span class="w"> </span>--rm<span class="w"> </span>docker/whalesay<span class="w"> </span>cowsay<span class="w"> </span><span class="s1">'It works!'</span>
<span class="go"> ___________</span>
<span class="go">< It works! ></span>
<span class="go"> -----------</span>
<span class="go"> \</span>
<span class="go"> \</span>
<span class="go"> \</span>
<span class="gp"> #</span><span class="c1"># .</span>
<span class="gp"> #</span><span class="c1"># ## ## ==</span>
<span class="gp"> #</span><span class="c1"># ## ## ## ===</span>
<span class="go"> /""""""""""""""""___/ ===</span>
<span class="go"> ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ / ===- ~~~</span>
<span class="go"> \______ o __/</span>
<span class="go"> \ \ __/</span>
<span class="go"> \____\______/</span>
</code></pre></div>
<h1 id="conclusion">Conclusion</h1>
<p>We created a Swarm cluster consisting of one manager and 10 agents, and we kept memory and disk usage low thanks to LXC containers. We also succeeded in confining our Docker containers with AppArmor. Overall, this setup is probably not ideal for use in a production environment, but very useful for simulating clusters on your laptop.</p>
<p>I hope you enjoyed the tutorial. Feel free to leave a comment if you have questions!</p>andreacorbelliniWed, 13 Apr 2016 18:00:00 +0000tag:andrea.corbellini.name,2016-04-13:/2016/04/13/docker-swarm-inside-lxc/information-technologydockerswarmlxccontainersdistributed-computingWhen bureaucracy hits the web: the cookie lawhttps://andrea.corbellini.name/2015/09/22/cookie-law/<p>For a few years now, every first of April I hoped to read between the news something on the lines of “the cookie law was a joke, sorry for that”. You know, bureaucracy is slow, and it’s reasonable to think that it takes time for them to reveal jokes. Yet, many firsts of April have passed, and no such announcement has been made. Many missed opportunities for Europe to show their love for progress and their competence with the web.</p>
<p>Being compliant with the EU cookie law is hard to do. It’s not just a matter of showing a boring banner, it’s a matter of defacing your web pages, writing long privacy policies that nobody will read, implementing ways to prevent certain cookies from being set.</p>
<p>The truth is: if you, as a webmaster, want to avoid wasting time and avoid headaches, you just have to avoid cookies. This is what I have done with most websites I maintain: <strong>I have removed all analytics, all social sharing buttons, all YouTube videos, all comments</strong>. This was a sad thing to do, but it was the only thing I could do: I maintain websites for free mainly as a favor for friends and non-profits I’m involved with; it’s not my day job. Also, I do not want other people being sued because of mistakes on my part: cookies may be set in the most unexpected situations and disabling every feature that could potentially set them seems the safest choice.</p>
<p>The only exception is this blog. Here, I use cookies for Google Analytics, for social sharing buttons and for Disqus. I may live without Google Analytics (even though it gives useful insights, such as performance statistics and tips), but I can’t really remove social buttons and Disqus: this is a blog and it wouldn’t make any sense to remove social features and comments.</p>
<p>Being compliant with the EU cookie law has been on my todo list for a while, and I never found the time (nor the desire) to look into it. Today I did. I spent a few hours of my time to discover that <strong>Google Analytics is “OK”</strong> (in the sense that I do not have to display an ugly banner, nor have to ask for explicit permission from the user before setting the cookies) and to discover that <strong>social buttons and Disqus are “bad”</strong> (in the sense that I have to display a banner and ask for explicit consent from the user <em>before</em> setting the cookies). In the end, the only service that I could remove is the least problematic one.</p>
<p>As I said, I really do not want to remove social buttons, Disqus or whatever third-party content I’ll want to display in the future. Therefore, in order to comply with the cookie law, I’m forced to write code, write a privacy policy, waste another bunch of hours of my time. But not today, as I’ve already had enough sense of sadness and impotence.</p>
<p>At least for now, I guess that EU cookie law compliance will stay on my todo list for some more time. If I had worked on compliance instead of writing this rant, I would probably have finished already (but then what’s the point of having a blog if you don’t blog?)</p>
<p>The cookie law wants to be “on the side of the users,” and it is based on noble principles: it wants users to be well-informed about how their data is used and by whom. However, as it is today, it’s against both users and webmasters. <strong>Webmasters have to lose their time working on compliance, and users receive a degraded experience due to silly regulations.</strong></p>
<p>I’d like to do <a href="http://nocookielaw.com/">what Silktide did</a>: actively protesting against the law, but I wouldn’t be so happy if I were sued. I’d like to read “the cookie law was a joke” in the news, but I’m starting to believe that it’s not going to happen any time soon. It seems that accepting the sadness of the reality is the only option I’m left with.</p>
<p>End of rant, let’s move on.</p>andreacorbelliniTue, 22 Sep 2015 18:35:00 +0000tag:andrea.corbellini.name,2015-09-22:/2015/09/22/cookie-law/miscblogcookie-lawHello Pelican!https://andrea.corbellini.name/2015/08/02/hello-pelican/<p>Today I switched from WordPress.com to <a href="http://getpelican.com/">Pelican</a> and <a href="https://pages.github.com/">GitHub Pages</a>.</p>
<p>First off, let me say: almost all URLs that were previously working should still work. Only the feed URLs are broken, and this is not something I can fix. If you were following my blog via a feed reader, you should update to the new feed. Sorry for the inconvenience.</p>
<p>Having said that, I’d like to share with you the motivation that made me move and the details of the migration.</p>
<h1 id="the-bad-things-of-wordpress">The bad things of WordPress</h1>
<p>Now, this isn’t meant to be a rant, so I’ll be pretty concise. WordPress, the content management system, is an excellent platform for blogging. Easy to start with, easy to maintain, easy to use. WordPress.com makes things even easier. It also comes with many useful features, like comments and social network integration.</p>
<p>The problem is: you can’t customize things or add features without paying. Of course, this is business, and I do not want to discuss business decisions made at WordPress.com. Not only that, but I could live fine with most of the major limitations. Also, I was perfectly aware of these kinds of problems with WordPress.com when I started (after all, this is not <a href="https://andrea.corbellini.name/2015/02/15/new-blog-again/">the first blog I started</a>).</p>
<p>I actually became upset with WordPress.com while writing the series of blog posts about <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">Elliptic Curve Cryptography</a>. When writing these articles, I spent a lot of time employing workarounds to overcome WordPress.com limitations. Being used to Vim and its advanced features, I also found the editors (both the old and the new one) a great obstacle to getting things done quickly. I do not want to go into the details of the problems I’m referring to; what matters is that, eventually, I gave up and realized it was time to move on and look for an alternative.</p>
<h1 id="why-pelican">Why Pelican</h1>
<p>Pelican is a static site generator. I’ve always thought that a static site had too many limitations for me. But while seeking an alternative to WordPress.com, I realized that many of those limitations were not affecting me in any way. Actually, with a static site I can do everything I want: edit my articles with Vim, render my equations with MathJax, customize my theme, version control my content, write scripts to post process my content.</p>
<p>The only bad thing about Pelican is that it does not come with any theme I truly like. I decided to make my own. I’m not entirely satisfied with it, as I feel it is too “anonymous”, but I believe it is fully responsive, fast, readable and offers all the features I want. Perhaps I’ll tweak it a little more to make it more “personal”.</p>
<p>Setting up Pelican and migrating everything required some time, but at least this time I worked on true solutions, not on ugly hacks and workarounds like I did with WordPress. This implies that when writing articles I will be able to focus more on content than other details.</p>
<h1 id="why-not-other-static-site-generators">Why not other static site generators</h1>
<p>In short: Pelican is written in Python and to my eyes it looked better than the other Python static site generators. I’ll be honest and say that I did not truly evaluate all of the alternatives: I knew <a href="...">list.org</a> switched to Pelican and that made me try Pelican before all other solutions.</p>
<h1 id="conclusion">Conclusion</h1>
<p>In the end I decided to leave WordPress for Pelican hosted on GitHub Pages. I’m pretty satisfied with the result I got. The nature of GitHub Pages prevents me from using HTTP redirects (and therefore the old feed links are broken), however in exchange I’ve got much more freedom, and this is what matters to me.</p>andreacorbelliniSun, 02 Aug 2015 18:55:00 +0000tag:andrea.corbellini.name,2015-08-02:/2015/08/02/hello-pelican/miscblogpelicanwordpressLet's Encrypt is going to start soonhttps://andrea.corbellini.name/2015/06/16/lets-encrypt-is-going-to-start-soon/<p><a href="https://letsencrypt.org/">Let’s Encrypt</a> (the free, automated and open certificate authority) has just <a href="https://letsencrypt.org/2015/06/16/lets-encrypt-launch-schedule.html">announced its launch schedule</a>. According to it, certificates will be released to the public starting from the <strong>week of September 14, 2015</strong>.</p>
<p>Their intermediate certificates, which <a href="https://letsencrypt.org/2015/06/04/isrg-ca-certs.html">were generated a few days ago</a>, will be signed by <a href="https://www.identrustssl.com/">IdenTrust</a>. What this means is that if you browse a web page secured by Let’s Encrypt, you won’t get any scary message, but the usual green lock.</p>
<figure>
<img src="https://andrea.corbellini.name/images/green-lock.png" alt="Green lock" width="612" height="188">
<figcaption><strong>You will see this...</strong></figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/red-lock.png" alt="Red lock" width="612" height="300">
<figcaption><strong>... not this.</strong></figcaption>
</figure>
<p>In case you are curious: the root certificate uses a 4096-bit RSA key, and the two intermediate certificates both use 2048-bit RSA keys. They are also <a href="https://letsencrypt.org/certificates/">planning to generate ECDSA keys later this year</a>.</p>
<p>Technical aspects aside, this will be a great opportunity for the entire web. As I have <a href="https://andrea.corbellini.name/2015/04/12/lets-encrypt-the-road-towards-a-better-web/">already written</a>, I always dreamed of an encrypted web, and I truly believe that Let’s Encrypt — or at least its approach to the problem — is the way to go.</p>
<p>So, will you get a Let’s Encrypt certificate when the time comes? I will. Not for this blog (I can’t install a certificate here without paying), but for other websites I manage.</p>
<p>Perhaps I’ll also show a “Proudly secured by Let’s Encrypt” badge.</p>andreacorbelliniTue, 16 Jun 2015 18:20:00 +0000tag:andrea.corbellini.name,2015-06-16:/2015/06/16/lets-encrypt-is-going-to-start-soon/cryptographyecdsalet's encryptrsasecuritytlsElliptic Curve Cryptography: breaking security and a comparison with RSAhttps://andrea.corbellini.name/2015/06/08/elliptic-curve-cryptography-breaking-security-and-a-comparison-with-rsa/<p><strong>This post is the fourth and last in the series <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">ECC: a gentle introduction</a>.</strong></p>
<p>In the <a href="https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/">last post</a> we have seen two algorithms, ECDH and ECDSA, and we have seen how the discrete logarithm problem for elliptic curves plays an important role for their security. But, if you remember, we said that <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#discrete-logarithm">we have no mathematical proofs</a> for the complexity of the discrete logarithm problem: we believe it to be “hard”, but we can’t be sure. In the first part of this post, we’ll try to get an idea of how “hard” it is in practice with today’s techniques.</p>
<p>Then, in the second part, we will try to answer the question: why do we need elliptic curve cryptography if RSA (and the other cryptosystems based on modular arithmetic) work well?</p>
<h1 id="breaking-the-discrete-logarithm-problem">Breaking the discrete logarithm problem</h1>
<p>We will now see the two most efficient algorithms for computing discrete logarithms on elliptic curves: the baby-step, giant-step algorithm, and Pollard’s rho method.</p>
<p>Before starting, as a reminder, here is what the discrete logarithm problem is about: <strong>given two points $P$ and $Q$ find out the integer $x$ that satisfies the equation $Q = xP$</strong>. The points belong to a subgroup of an elliptic curve, which has a base point $G$ and whose order is $n$.</p>
<h2 id="baby-step-giant-step">Baby-step, giant-step</h2>
<p>Before entering the details of the algorithm, a quick consideration: we can always write any integer $x$ as <strong>$x = am + b$</strong>, where $a$, $m$ and $b$ are three arbitrary integers. For example, we can write $10 = 2 \cdot 3 + 4$.</p>
<p>With this in mind, we can rewrite the equation for the discrete logarithm problem as follows:
$$\begin{align*}
Q & = xP \\
Q & = (am + b) P \\
Q & = am P + b P \\
Q - am P & = b P
\end{align*}$$</p>
<p>The baby-step giant-step is a “meet in the middle” algorithm. Contrary to the brute-force attack (which forces us to calculate all the points $xP$ for every $x$ until we find $Q$), we will calculate “few” values for $bP$ and “few” values for $Q - amP$ until we find a correspondence. The algorithm works as follows:</p>
<ol>
<li>Calculate $m = \left\lceil{\sqrt{n}}\right\rceil$</li>
<li>For every $b$ in $\{0, \dots, m\}$, calculate $bP$ and store the results in a hash table.</li>
<li>For every $a$ in $\{0, \dots, m\}$:<ol>
<li>calculate $amP$;</li>
<li>calculate $Q - amP$;</li>
<li>check the hash table to see whether there exists a point $bP$ such that $Q - amP = bP$;</li>
<li>if such a point exists, then we have found $x = am + b$.</li>
</ol>
</li>
</ol>
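<p>The steps above can be sketched in Python. This is a minimal illustration, not the script linked later in this post: the toy curve $y^2 = x^3 + x - 1$ over $\mathbb{F}_{10177}$, its order 10331 and the point $(1, 1)$ are taken from the sample output shown further down, while the helper names (<code>add</code>, <code>multiply</code>, <code>baby_step_giant_step</code>) and the choice of 325 as the secret are assumptions made for this example.</p>

```python
import math

# Toy curve taken from the sample output further down in this post:
# y^2 = x^3 + x - 1 over F_10177, with a subgroup of order 10331.
FIELD, COEFF_A, ORDER = 10177, 1, 10331


def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % FIELD == 0:
        return None  # p2 is the inverse of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + COEFF_A) * pow(2 * y1, -1, FIELD)
    else:
        slope = (y2 - y1) * pow(x2 - x1, -1, FIELD)
    x3 = (slope * slope - x1 - x2) % FIELD
    return x3, (slope * (x1 - x3) - y1) % FIELD


def multiply(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def baby_step_giant_step(p, q, n=ORDER):
    """Return x such that x * p == q, or None if no match is found."""
    m = math.isqrt(n - 1) + 1  # ceil(sqrt(n))
    # Baby steps: remember b for every point b * p, with b in {0, ..., m}.
    table = {}
    point = None
    for b in range(m + 1):
        table.setdefault(point, b)
        point = add(point, p)
    # Giant steps: walk through q - a * m * p by repeatedly adding -(m * p).
    mp = multiply(m, p)
    minus_mp = None if mp is None else (mp[0], -mp[1] % FIELD)
    current = q
    for a in range(m + 1):
        if current in table:
            return a * m + table[current]
        current = add(current, minus_mp)
    return None


p = (1, 1)
q = multiply(325, p)  # 325 plays the role of the unknown logarithm
x = baby_step_giant_step(p, q)
print(f"log(p, q) = {x}")
```

<p>The sketch needs Python 3.8 or later, because it relies on <code>pow(x, -1, m)</code> for modular inverses. The hash table is an ordinary <code>dict</code>, which is what makes each giant-step lookup cheap.</p>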
<p>As you can see, initially we calculate the points $bP$ with little (i.e. <strong>“baby”</strong>) increments for the coefficient $b$ ($1P$, $2P$, $3P$, …). Then, in the second part of the algorithm, we calculate the points $amP$ with huge (i.e. <strong>“giant”</strong>) increments for $am$ ($1mP$, $2mP$, $3mP$, …, where $m$ is a huge number).</p>
<figure>
<img src="https://andrea.corbellini.name/images/baby-step-giant-step.gif" alt="Baby-step, giant-step" width="310" height="346">
<figcaption>The baby-step, giant-step algorithm: initially we calculate few points via small steps and store them in a hash table. Then we perform the giant steps and compare the new points with the points in the hash table. Once a match is found, calculating the discrete logarithm is a matter of rearranging terms.</figcaption>
</figure>
<p>To understand why this algorithm works, forget for a moment that the points $bP$ are cached and take the equation $Q = amP + bP$. Consider what follows:</p>
<ul>
<li>When $a = 0$ we are checking whether $Q$ is equal to $bP$, where $b$ is one of the integers from 0 to $m$. This way, we are comparing $Q$ against all points from $0P$ to $mP$.</li>
<li>When $a = 1$ we are checking whether $Q$ is equal to $mP + bP$. We are comparing $Q$ against all points from $mP$ to $2mP$.</li>
<li>When $a = 2$ we are comparing $Q$ against all the points from $2mP$ to $3mP$.</li>
<li>…</li>
<li>When $a = m - 1$, we are comparing $Q$ against all points from $(m - 1)mP$ to $m^2 P$. Since $m = \left\lceil{\sqrt{n}}\right\rceil$, we have $m^2 \ge n$, so this range reaches $nP$.</li>
</ul>
<p>In conclusion, <strong>we are checking all points from $0P$ to $nP$</strong> (that is, all the possible points) <strong>performing at most $2m$ additions and multiplications</strong> (exactly $m$ for the baby steps, at most $m$ for the giant steps).</p>
<p>If you consider that a lookup on a hash table takes $O(1)$ time, it’s easy to see that this algorithm has both <strong>time and space complexity $O(\sqrt{n})$</strong> (or <strong>$O(2^{k / 2})$</strong> if you consider the bit length). It’s still exponential time, but much better than a brute-force attack.</p>
<h3 id="baby-step-giant-step-in-practice">Baby-step giant-step in practice</h3>
<p>It may make sense to see what the complexity $O(\sqrt{n})$ means in practice. Let’s take a standardized curve: <code>prime192v1</code> (aka <code>secp192r1</code>, <code>ansiX9p192r1</code>). This curve has order $n$ = 0xffffffff ffffffff ffffffff 99def836 146bc9b1 b4d22831. The square root of $n$ is approximately 7.922816251426434 · 10<sup>28</sup> (almost <strong>eighty octillion</strong>).</p>
<p>Now imagine storing $\sqrt{n}$ points in a hash table. Suppose that each point requires exactly 32 bytes: <strong>our hash table would need approximately 2.5 · 10<sup>30</sup> bytes of memory</strong>. <a href="http://www.csc.com/big_data/flxwd/83638-big_data_just_beginning_to_explode_interactive_infographic">Looking on the web</a>, it seems that the total world storage capacity is on the order of a zettabyte (10<sup>21</sup> bytes). This is almost <strong>ten orders of magnitude</strong> lower than the memory required by our hash table! Even if our points took 1 byte each, we would still be very far from being able to store all of them.</p>
<p>This is impressive, and is even more impressive if you consider that <code>prime192v1</code> is one of the curves with the lowest order. The order of <code>secp521r1</code> (another standard curve from NIST) is approximately 6.9 · 10<sup>156</sup>!</p>
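<p>The figures for <code>prime192v1</code> can be reproduced with a few lines of Python. The order $n$, the 32 bytes per point and the zettabyte estimate all come from the text above; the variable names are just for illustration.</p>

```python
import math

# Order of prime192v1 (secp192r1), as quoted above.
n = 0xFFFFFFFFFFFFFFFFFFFFFFFF99DEF836146BC9B1B4D22831

m = math.isqrt(n - 1) + 1   # ceil(sqrt(n)): number of baby steps to store
POINT_SIZE = 32             # bytes per stored point (figure used in the text)
WORLD_STORAGE = 10**21      # one zettabyte, the rough estimate from the text

table_bytes = m * POINT_SIZE
print(f"sqrt(n)    ~ {m:.3e}")
print(f"hash table ~ {table_bytes:.3e} bytes")
print(f"gap        ~ 10^{math.log10(table_bytes / WORLD_STORAGE):.1f}")
```

<p>Because $n$ is just below $2^{192}$, the rounded-up square root is exactly $2^{96}$, so the table would need $2^{101}$ bytes.</p>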
<h3 id="playing-with-baby-step-giant-step">Playing with baby-step giant-step</h3>
<p>I made <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/babygiantstep.py">a Python script</a> that computes discrete logarithms using the baby-step giant-step algorithm. Obviously it only works with curves with small orders: don’t try it with <code>secp521r1</code>, unless you want to receive a <code>MemoryError</code>.</p>
<p>It should produce an output like this:</p>
<div class="highlight"><pre><span></span><code>Curve: y^2 = (x^3 + 1x - 1) mod 10177
Curve order: 10331
p = (0x1, 0x1)
q = (0x1a28, 0x8fb)
325 * p = q
log(p, q) = 325
Took 105 steps
</code></pre></div>
<h2 id="pollards">Pollard’s ρ</h2>
<p>Pollard’s rho is another algorithm for computing discrete logarithms. It has the same asymptotic time complexity $O(\sqrt{n})$ as the baby-step giant-step algorithm, but its space complexity is just $O(1)$. If baby-step giant-step can’t solve discrete logarithms because of the huge memory requirements, will Pollard’s rho make it? Let’s see…</p>
<p>First of all, another reminder of the discrete logarithm problem: given $P$ and $Q$ find $x$ such that $Q = xP$. With Pollard’s rho, we will solve a slightly different problem: given $P$ and $Q$, <strong>find the integers $a$, $b$, $A$ and $B$ such that $aP + bQ = AP + BQ$</strong>.</p>
<p>Once the four integers are found, we can use the equation $Q = xP$ to find out $x$:
$$\begin{align*}
aP + bQ & = AP + BQ \\
aP + bxP & = AP + BxP \\
(a + bx) P & = (A + Bx) P \\
(a - A) P & = (B - b) xP
\end{align*}$$</p>
<p>Now we can get rid of $P$. But before doing so, remember that our subgroup is cyclic with order $n$, therefore the coefficients used in point multiplication are modulo $n$:
$$\begin{align*}
a - A & \equiv (B - b) x \pmod{n} \\
x & = (a - A)(B - b)^{-1} \bmod{n}
\end{align*}$$</p>
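<p>As a minimal sketch, the last equation translates directly into Python. The function name <code>solve_log</code> is hypothetical, and the code assumes that $n$ is prime, so that $B - b$ is invertible whenever it is not a multiple of $n$. It needs Python 3.8 or later for the modular inverse computed by <code>pow</code>.</p>

```python
def solve_log(a, b, A, B, n):
    """Return x such that Q = xP, given a*P + b*Q == A*P + B*Q.

    Assumes the subgroup order n is prime.
    """
    if (B - b) % n == 0:
        raise ValueError("B - b is not invertible modulo n; find another collision")
    return (a - A) * pow((B - b) % n, -1, n) % n

# Pairs (3, 3) and (2, 0) in a subgroup of order 5
print(solve_log(3, 3, 2, 0, 5))  # prints 3
```

<p>When $B \equiv b \pmod{n}$ the collision carries no information about $x$, and a different pair of coefficients has to be found.</p>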
<p>The principle of operation of Pollard’s rho is simple: <strong>we generate a pseudo-random sequence of points $X_1$, $X_2$, … where each $X_i = a_i P + b_i Q$</strong>. The sequence can be generated using a pseudo-random function $f$ like this:
$$(a_{i + 1}, b_{i + 1}) = f(X_i)$$</p>
<p>That is: the pseudo-random function $f$ takes the latest point $X_i$ in the sequence as the input, and gives the coefficients $a_{i + 1}$ and $b_{i + 1}$ as the output. From there, we can calculate $X_{i + 1} = a_{i + 1} P + b_{i + 1} Q$; we can then input $X_{i + 1}$ into $f$ again and repeat.</p>
<p>It doesn’t really matter how $f$ works internally (although certain functions may yield results faster than others), what matters is that $f$ determines the next point in the sequence based on the previous one, and that all the $a_i$ and $b_i$ coefficients are known by us.</p>
<p>By using such an $f$, sooner or later we will see a loop in our sequence. That is, we will see a point $X_j = X_i$ with $j \ne i$.</p>
<figure>
<img src="https://andrea.corbellini.name/images/pollard-rho.png" alt="Pollard's rho cycle visualization" width="300" height="270">
<figcaption>A visualization of what a cycle in the sequence might look like: we have some initial points ($X_0$, $X_1$, $X_2$), and then the cycle itself, formed by the points $X_3$ to $X_8$. After that, $X_9 = X_3$, $X_{10} = X_4$ and so on.<br>This picture resembles the Greek letter ρ (rho), hence the name.</figcaption>
</figure>
<p>The reason why we must see the cycle is simple: the number of points is finite, hence they must repeat sooner or later. Once we see where the cycle is, we can use the equations above to figure out the discrete logarithm.</p>
<p>The problem now is: how do we detect the cycle in an efficient way?</p>
<h3 id="tortoise-and-hare">Tortoise and Hare</h3>
<p>To detect cycles, we have an efficient method: the <strong>tortoise and hare algorithm</strong> (also known as Floyd’s cycle-finding algorithm). The picture below shows the principle of operation of the tortoise and hare method, which is at the core of Pollard’s rho.</p>
<figure>
<img src="https://andrea.corbellini.name/images/tortoise-hare.gif" alt="Tortoise and Hare" width="650" height="101">
<figcaption>We have the curve $y^2 \equiv x^3 + 2x + 3 \pmod{97}$ and the points $P = (3, 6)$ and $Q = (80, 87)$. The points belong to a cyclic subgroup of order 5.<br>We walk a sequence of pairs at different speeds until we find two different pairs $(a, b)$ and $(A, B)$ that produce the same point. In this case, we have found the pairs $(3, 3)$ and $(2, 0)$ that allow us to calculate the logarithm as $x = (3 - 2)(0 - 3)^{-1} \bmod{5} = 3$. And in fact we correctly have $Q = 3P$.</figcaption>
</figure>
<p>We take two pets, the tortoise and the hare, and make them walk our sequence of points from left to right. <strong>The tortoise</strong> (the green spot in the picture) is slow and <strong>reads each point one by one</strong>; <strong>the hare</strong> (represented in red) is fast and <strong>skips a point at every step</strong>.</p>
<p>After some time both the tortoise and the hare will have found the same point, but with different coefficient pairs. Or, to express that with equations, the tortoise will have found a pair $(a, b)$ and the hare will have found a pair $(A, B)$ such that $aP + bQ = AP + BQ$.</p>
<p>It’s easy to see that this algorithm requires constant memory (<strong>$O(1)$ space complexity</strong>). Calculating the asymptotic time complexity is not that easy, but we can build a probabilistic proof that shows how <strong>the time complexity is $O(\sqrt{n})$</strong>, as we have already said. The proof is based on the “<a href="https://en.wikipedia.org/wiki/Birthday_problem">birthday paradox</a>”, which is about the probability of two people having the same birthday, where here we are concerned about the probability of two $(a, b)$ pairs yielding the same point.</p>
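<p>To make the tortoise and hare idea concrete, here is a minimal sketch that runs it on the curve $y^2 \equiv x^3 + 2x + 3 \pmod{97}$ with $P = (3, 6)$ and $Q = (80, 87)$ from the figure above. The helper names (<code>add</code>, <code>step</code>, <code>pollard_rho</code>) and the way the walk is partitioned (by the $x$ coordinate modulo 3) are assumptions made for the example, not necessarily what the script mentioned in this post does. It needs Python 3.8 or later for modular inverses via <code>pow</code>.</p>

```python
# Curve and points from the figure above: y^2 = x^3 + 2x + 3 (mod 97).
p, curve_a = 97, 2
P, Q, n = (3, 6), (80, 87), 5  # Q = xP inside a subgroup of order n

def add(p1, p2):
    """Add two points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # p1 == -p2
    if p1 == p2:
        m = (3 * x1 * x1 + curve_a) * pow(2 * y1, -1, p)
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p)
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def step(X, a, b):
    """Advance the walk by one point while keeping X == a*P + b*Q."""
    if X is None or X[0] % 3 == 0:
        return add(X, P), (a + 1) % n, b
    if X[0] % 3 == 1:
        return add(X, X), (2 * a) % n, (2 * b) % n
    return add(X, Q), a, (b + 1) % n

def pollard_rho():
    start = None
    for a0 in range(1, n):
        start = add(start, P)  # start == a0 * P
        tortoise = hare = (start, a0, 0)
        while True:
            tortoise = step(*tortoise)  # one point at a time
            hare = step(*step(*hare))   # two points at a time
            if tortoise[0] == hare[0]:
                break
        (_, a, b), (_, A, B) = tortoise, hare
        if (B - b) % n != 0:  # otherwise the collision is useless
            return (a - A) * pow((B - b) % n, -1, n) % n
    raise ValueError("no usable collision found")

print(pollard_rho())  # prints 3, since Q = 3P
```

<p>Only the tortoise and the hare are kept in memory, no matter how long the walk is. On a subgroup this small, useless collisions (those with $B \equiv b \pmod{n}$) are common, which is why the sketch retries from a different starting point.</p>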
<h3 id="playing-with-pollards">Playing with Pollard’s ρ</h3>
<p>I’ve built <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/pollardsrho.py">a Python script</a> that computes discrete logarithms using Pollard’s rho. It is not the implementation of the original Pollard’s rho, but a slight variation of it (I’ve used a more efficient method for generating the pseudo-random sequence of pairs). The script contains some useful comments, so read it if you are interested in the details of the algorithm.</p>
<p>This script, like the baby-step giant-step one, works on a tiny curve, and produces the same kind of output.</p>
<h3 id="pollards-in-practice">Pollard’s ρ in practice</h3>
<p>We said that baby-step giant-step can’t be used in practice, because of the huge memory requirements. Pollard’s rho, on the other hand, requires very little memory. So, how practical is it?</p>
<p><strong>Certicom launched a <a href="https://www.certicom.com/index.php/the-certicom-ecc-challenge">challenge</a> in 1998</strong> to compute discrete logarithms on elliptic curves with bit lengths ranging from 109 to 359. As of today, <strong>only 109-bit long curves</strong> have been successfully broken. The latest successful attempt was made in 2004. Quoting <a href="http://en.wikipedia.org/wiki/Discrete_logarithm_records">Wikipedia</a>:</p>
<blockquote>
<p>The prize was awarded on 8 April 2004 to a group of about 2600 people represented by Chris Monico. They also used a version of a parallelized Pollard rho method, taking 17 months of calendar time.</p>
</blockquote>
<p>As we have already said, <code>prime192v1</code> is one of the “smallest” elliptic curves. We also said that Pollard’s rho has $O(\sqrt{n})$ time complexity. If we used the same technique as Chris Monico (the same algorithm, on the same hardware, with the same number of machines), how long would it take to compute a logarithm on <code>prime192v1</code>?
$$17\ \text{months}\ \times \frac{\sqrt{2^{192}}}{\sqrt{2^{109}}} \approx 5 \cdot 10^{13}\ \text{months}$$</p>
<p>This number is pretty self-explanatory and gives a clear idea of how hard it can be to break a discrete logarithm using such techniques.</p>
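<p>The estimate can be checked with a short calculation. This is a rough sketch that, like the formula above, treats the two group orders as exactly 2<sup>109</sup> and 2<sup>192</sup>; the variable names are only illustrative.</p>

```python
import math

MONTHS_109_BIT = 17  # calendar time reported for the 109-bit challenge

# Pollard's rho needs about sqrt(n) steps, so the running time scales with
# the ratio between the square roots of the two group orders.
scale = math.sqrt(2 ** 192) / math.sqrt(2 ** 109)
months = MONTHS_109_BIT * scale
years = months / 12

print(f"{months:.1e} months, about {years:.1e} years")
```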
<h2 id="pollards-vs-baby-step-giant-step">Pollard’s ρ vs Baby-step giant-step</h2>
<p>I decided to put the <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/babygiantstep.py">baby-step giant-step script</a> and the <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/pollardsrho.py">Pollard’s rho script</a> together with a <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/bruteforce.py">brute-force script</a> into a <a href="https://github.com/andreacorbellini/ecc/blob/master/logs/comparelogs.py">fourth script</a> to compare their performances.</p>
<p>This fourth script computes all the logarithms for all the points on the “tiny” curve using different algorithms and reports how much time each one took:</p>
<div class="highlight"><pre><span></span><code>Curve order: 10331
Using bruteforce
Computing all logarithms: 100.00% done
Took 2m 31s (5193 steps on average)
Using babygiantstep
Computing all logarithms: 100.00% done
Took 0m 6s (152 steps on average)
Using pollardsrho
Computing all logarithms: 100.00% done
Took 0m 21s (138 steps on average)
</code></pre></div>
<p>As we could expect, the brute-force method is tremendously slow compared to the other two. Baby-step giant-step is the fastest, while Pollard’s rho is more than three times slower than baby-step giant-step (although it uses far less memory and fewer steps on average).</p>
<p>Also look at the number of steps: brute force used 5193 steps on average for computing each logarithm. 5193 is very near to 10331 / 2 (half the curve order). Baby-step giant-step and Pollard’s rho used 152 steps and 138 steps respectively, two numbers very close to the square root of 10331 (101.64).</p>
<h2 id="final-consideration">Final consideration</h2>
<p>While discussing these algorithms, I have presented many numbers. It’s important to be cautious when reading them: algorithms can be greatly optimized in many ways. Hardware can improve. Specialized hardware can be built.</p>
<p>The fact that an approach seems impractical today does not imply that the approach can’t be improved. It also does not imply that other, better approaches do not exist (remember, once again, that we have no proofs for the complexity of the discrete logarithm problem).</p>
<h1 id="shors-algorithm">Shor’s algorithm</h1>
<p>If today’s techniques are unsuitable, what about tomorrow’s techniques? Well, things are a bit more worrisome: there exists a <strong><a href="https://en.wikipedia.org/wiki/Quantum_algorithm">quantum algorithm</a> capable of computing discrete logarithms in polynomial time: <a href="https://en.wikipedia.org/wiki/Shor%27s_algorithm">Shor’s algorithm</a></strong>, which has time complexity $O((\log n)^3)$ and space complexity $O(\log n)$.</p>
<p>Quantum computers are still far from becoming sophisticated enough to run algorithms like Shor’s. Still, the need for <a href="https://en.wikipedia.org/wiki/Post-quantum_cryptography">quantum-resistant algorithms</a> may be something worth investigating now. What we encrypt today might not be safe tomorrow.</p>
<h1 id="ecc-and-rsa">ECC and RSA</h1>
<p>Now let’s forget about quantum computing, which is still far from being a serious problem. The question I’ll answer now is: <strong>why bother with elliptic curves if RSA works well?</strong></p>
<p>A quick answer is given by NIST, which provides <a href="https://www.nsa.gov/business/programs/elliptic_curve.shtml">a table that compares RSA and ECC key sizes</a> required to achieve the same level of security.</p>
<table class="table">
<thead>
<tr><th>RSA key size (bits)</th><th>ECC key size (bits)</th></tr>
</thead>
<tbody>
<tr><td>1024</td><td>160</td></tr>
<tr><td>2048</td><td>224</td></tr>
<tr><td>3072</td><td>256</td></tr>
<tr><td>7680</td><td>384</td></tr>
<tr><td>15360</td><td>521</td></tr>
</tbody>
</table>
<p>Note that there is no linear relationship between the RSA key sizes and the ECC key sizes (in other words: if we double the RSA key size, we don’t have to double the ECC key size). This table tells us not only that ECC uses less memory, but also that key generation and signing are considerably faster.</p>
<p>But why is it so? The answer is that the fastest known algorithms for computing discrete logarithms over elliptic curves are Pollard’s rho and baby-step giant-step, while in the case of RSA we have faster algorithms. One in particular is the <strong><a href="https://en.wikipedia.org/wiki/General_number_field_sieve">general number field sieve</a></strong>: an algorithm for integer factorization that can be used to compute discrete logarithms. The general number field sieve is the fastest algorithm for integer factorization to date.</p>
<p>All of this applies to other cryptosystems based on modular arithmetic as well, including DSA, D-H and ElGamal.</p>
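<p>To give a rough idea of the gap between the two families of algorithms, the heuristic running time usually quoted for the general number field sieve on an integer $N$ (here $N$ denotes the RSA modulus, a symbol not used elsewhere in this post) is sub-exponential in the size of $N$:
$$\exp\left(\left(\sqrt[3]{\frac{64}{9}} + o(1)\right) (\ln N)^{1/3} (\ln \ln N)^{2/3}\right)$$
Pollard’s rho, on the other hand, needs about $\sqrt{n} = 2^{(\log_2 n) / 2}$ steps on a subgroup of order $n$, which is fully exponential in the bit length of $n$. This is why RSA keys must grow much faster than ECC keys to keep up with the same security level.</p>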
<h1 id="hidden-threats-of-nsa">Hidden threats of NSA</h1>
<p>And now the hard part. So far we have discussed algorithms and mathematics. Now it’s time to discuss people, and things get more complicated.</p>
<p>If you remember, in the last post we said that certain classes of elliptic curves are weak, and to solve the problem of trusting curves from dubious sources we added a random seed to our domain parameters. And if we look at standard curves from NIST we can see that they are all verifiably random.</p>
<p>If we read the Wikipedia page for “<a href="http://en.wikipedia.org/wiki/Nothing_up_my_sleeve_number">nothing up my sleeve</a>”, we can see that:</p>
<ul>
<li>The random numbers for MD5 come from the sine of integers.</li>
<li>The random numbers for Blowfish come from the first digits of $\pi$.</li>
<li>The random numbers for RC5 come from both $e$ and the golden ratio.</li>
</ul>
<p>These numbers are random because their digits are uniformly distributed. And they are also unsuspicious, because they have a justification.</p>
<p>Now the question is: <strong>where do the random seeds for NIST curves come from?</strong> The answer is, sadly: we don’t know. Those seeds have no justification at all.</p>
<p><strong>Is it possible that NIST has discovered a “sufficiently large” class of weak elliptic curves and has tried many possible seeds until they found a vulnerable curve?</strong> I can’t answer this question, but this is a legitimate and important question. We know that NIST has succeeded in standardizing at least one <a href="http://en.wikipedia.org/wiki/Dual_EC_DRBG">vulnerable random number generator</a> (a generator which, oddly enough, is based on elliptic curves). Perhaps they also succeeded in standardizing a set of weak elliptic curves. How do we know? We can’t.</p>
<p>What’s important to understand is that “verifiably random” and “secure” are not synonyms. And it doesn’t matter how hard the logarithm problem is, or how long our keys are, if our algorithms are broken, there’s nothing we can do.</p>
<p>With respect to this, RSA wins, as it does not require special domain parameters that can be tampered with. RSA (as well as other modular arithmetic systems) may be a good alternative if we can’t trust authorities and if we can’t construct our own domain parameters. And in case you are asking: yes, TLS may use NIST curves. If you check <a href="https://google.com/">https://google.com</a>, you’ll see that the connection is using ECDHE and ECDSA, with a certificate based on <code>prime256v1</code> (aka <code>secp256r1</code>).</p>
<h1 id="thats-all">That’s all!</h1>
<p>I hope you have enjoyed this series. My aim was to give you the basic knowledge, terminology and conventions to understand what elliptic curve cryptography is today. If I have reached my aim, you should now be able to understand existing ECC-based cryptosystems and to expand your knowledge by reading “not so gentle” documentation. When writing this series, I could have skipped over many details and used simpler terminology, but I felt that by doing so you would not have been able to understand what the web has to offer. I believe I have found a good compromise between simplicity and completeness.</p>
<p>Note though that by reading just this series, you will not be able to implement secure ECC cryptosystems: security requires us to know many subtle but important details. Remember the <a href="https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/#random-curves">requirements for Smart’s attack</a> and <a href="https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/#ecdsa-k">Sony’s mistake</a> — these are just two examples that should teach you how easy it is to produce insecure algorithms and how easy it is to exploit them.</p>
<p>So, if you are interested in diving deeper into the world of ECC, where to go from here?</p>
<p>First off, so far we have seen Weierstrass curves over prime fields, but you must know that there exist other kinds of curves and fields, in particular:</p>
<ul>
<li><strong>Koblitz curves over binary fields.</strong> Those are elliptic curves in the form $y^2 + xy = x^3 + ax^2 + 1$ (where $a$ is either 0 or 1) over finite fields containing $2^m$ elements (where $m$ is a prime). They allow particularly efficient point additions and scalar multiplications.
Examples of standardized Koblitz curves are <code>nistk163</code>, <code>nistk283</code> and <code>nistk571</code> (three curves defined over a field of 163, 283 and 571 bits).</li>
<li><strong>Binary curves.</strong> They are very similar to Koblitz curves and are in the form $y^2 + xy = x^3 + x^2 + b$ (where $b$ is an integer often generated from a random seed). As the name suggests, binary curves are restricted to binary fields too. Examples of standardized curves are <code>nistb163</code>, <code>nistb283</code> and <code>nistb571</code>.
It must be said that there are growing concerns that both Koblitz and Binary curves may not be as safe as prime curves.</li>
<li><strong>Edwards curves</strong>, in the form $x^2 + y^2 = 1 + d x^2 y^2$ (where $d$ is neither 0 nor 1). These are particularly interesting not only because point addition and scalar multiplication are fast, but also because the formula for point addition is always the same, in any case ($P \ne Q$, $P = Q$, $P = -Q$, …). This feature reduces the possibility of side-channel attacks, where you measure the time used for scalar multiplication and try to guess the scalar coefficient based on the time it took to compute.
Edwards curves are relatively new (they were presented in 2007) and no authority such as Certicom or NIST has yet standardized any of them.</li>
<li><strong>Curve25519</strong> and <strong>Ed25519</strong> are two particular elliptic curves designed for ECDH and a variant of ECDSA respectively. Like Edwards curves, these two curves are fast and help prevent side-channel attacks. And like Edwards curves, these two curves have not been standardized yet and we can’t find them in any popular software (except OpenSSH, which has supported Ed25519 key pairs since 2014).</li>
</ul>
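<p>The claim that Edwards curves use a single addition formula can be illustrated with a toy example. The sketch below works on the curve $x^2 + y^2 = 1 + d x^2 y^2$ over the integers modulo 13 with $d = 2$. Both values are arbitrary choices made for the example (this is not a standardized curve), and the function names are hypothetical. The neutral element is the point $(0, 1)$ and the opposite of $(x, y)$ is $(-x, y)$.</p>

```python
# Toy Edwards curve x^2 + y^2 = 1 + d*x^2*y^2 over the integers modulo 13.
# p = 13 and d = 2 are illustrative values, not a standardized curve.
p, d = 13, 2

def on_curve(point):
    x, y = point
    return (x * x + y * y - 1 - d * x * x * y * y) % p == 0

def add(p1, p2):
    """Edwards addition: the same formula is used for every pair of points."""
    (x1, y1), (x2, y2) = p1, p2
    t = d * x1 * x2 * y1 * y2
    x3 = (x1 * y2 + y1 * x2) * pow((1 + t) % p, -1, p) % p
    y3 = (y1 * y2 - x1 * x2) * pow((1 - t) % p, -1, p) % p
    return (x3, y3)

NEUTRAL = (0, 1)
P = (4, 4)
minus_P = (-P[0] % p, P[1])

print(add(P, P))        # doubling: prints (1, 0)
print(add(P, minus_P))  # adding the opposite: prints (0, 1)
print(add(P, NEUTRAL))  # adding the neutral element: prints (4, 4)
```

<p>Unlike the Weierstrass addition used in the rest of this series, there are no branches for doubling, for opposite points or for the neutral element. With this choice of $d$ (which is not a square modulo 13) the denominators are never zero.</p>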
<p>If you are interested in the implementation details of ECC, then I suggest you read the sources of <strong>OpenSSL</strong> and <strong>GnuTLS</strong>.</p>
<p>Finally, if you are interested in the mathematical details, rather than the security and efficiency of the algorithms, you must know that:</p>
<ul>
<li>Elliptic curves are <strong>algebraic varieties with genus one</strong>.</li>
<li>Points at infinity are studied in <strong>projective geometry</strong> and can be represented using <strong>homogeneous coordinates</strong> (although most of the features of projective geometry are not needed for elliptic curve cryptography).</li>
</ul>
<p>And don’t forget to study <strong>finite fields</strong> and <strong>field theory</strong>.</p>
<p>These are the keywords that you should look up if you’re interested in the topics.</p>
<p>Now the series is officially concluded. Thank you for all your friendly comments, tweets and mails. Many have asked me if I’m going to write other series on other closely related topics. The answer is: maybe. I accept suggestions, but I can’t promise anything.</p>
<p>Thanks for reading and see you next time!</p>andreacorbelliniMon, 08 Jun 2015 13:28:00 +0000tag:andrea.corbellini.name,2015-06-08:/2015/06/08/elliptic-curve-cryptography-breaking-security-and-a-comparison-with-rsa/cryptographydhdsaeccecdhecdheecdsarsasecurityElliptic Curve Cryptography: ECDH and ECDSAhttps://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/<p><strong>This post is the third in the series <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">ECC: a gentle introduction</a>.</strong></p>
<p>In the previous posts, we have seen <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#elliptic-curves">what an elliptic curve is</a> and we have defined a <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#group-law">group law</a> in order to do some math with the points of elliptic curves. Then we have <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/">restricted elliptic curves to finite fields of integers modulo a prime</a>. With this restriction, we have seen that the points of elliptic curves generate <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#scalar-multiplication">cyclic subgroups</a> and we have introduced the terms <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#base-point">base point</a>, <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#subgroup-order">order</a> and <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#cofactor">cofactor</a>.</p>
<p>Finally, we have seen that <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#scalar-multiplication">scalar multiplication in finite fields</a> is an “easy” problem, while the <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#discrete-logarithm">discrete logarithm problem</a> seems to be “hard”. Now we’ll see how all of this applies to cryptography.</p>
<h1 id="domain-parameters">Domain parameters</h1>
<p>Our elliptic curve algorithms will work in a cyclic subgroup of an elliptic curve over a finite field. Therefore, our algorithms will need the following parameters:</p>
<ul>
<li>The <strong>prime $p$</strong> that specifies the size of the finite field.</li>
<li>The <strong>coefficients $a$ and $b$</strong> of the elliptic curve equation.</li>
<li>The <strong>base point $G$</strong> that generates our subgroup.</li>
<li>The <strong>order $n$</strong> of the subgroup.</li>
<li>The <strong>cofactor $h$</strong> of the subgroup.</li>
</ul>
<p>In conclusion, the <strong>domain parameters</strong> for our algorithms are the <strong>sextuple $(p, a, b, G, n, h)$</strong>.</p>
<h2 id="random-curves">Random curves</h2>
<p>When I said that the discrete logarithm problem was “hard”, I wasn’t entirely right. There are <strong>some classes of elliptic curves that are particularly weak</strong> and allow the use of special purpose algorithms to solve the discrete logarithm problem efficiently. For example, all the curves that have $p = hn$ (that is, the order of the finite field is equal to the order of the elliptic curve) are vulnerable to <a href="http://interact.sagemath.org/edu/2010/414/projects/novotney.pdf">Smart’s attack</a>, which can be used to solve discrete logarithms in polynomial time on a classical computer.</p>
<p>Now, suppose that I give you the domain parameters of a curve. There’s the possibility that I’ve discovered a new class of weak curves that nobody knows, and probably I have built a “fast” algorithm for computing discrete logarithms on the curve I gave you. How can I convince you of the contrary, i.e. that I’m not aware of any vulnerability? <strong>How can I assure you that the curve is “safe” (in the sense that it can’t be used for special purpose attacks by me)?</strong></p>
<p>In an attempt to solve this kind of problem, sometimes we have an additional domain parameter: the <strong>seed $S$</strong>. This is a random number used to generate the coefficients $a$ and $b$, or the base point $G$, or both. These parameters are generated by computing the hash of the seed $S$. Hashes, as we know, are “easy” to compute, but “hard” to reverse.</p>
<figure>
<img src="https://andrea.corbellini.name/images/random-parameters-generation.png" alt="Random curve generation" width="500" height="74">
<figcaption>A simple sketch of how a random curve is generated from a seed: the hash of a random number is used to calculate different parameters of the curve.</figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/seed-inversion.png" alt="Building a seed from a hash" width="359" height="76">
<figcaption>If we wanted to cheat and try to construct a seed from the domain parameters, we would have to solve a "hard" problem: hash inversion.</figcaption>
</figure>
<p>A curve generated through a seed is said to be <strong>verifiably random</strong>. The principle of using hashes to generate parameters is known as “<a href="http://en.wikipedia.org/wiki/Nothing_up_my_sleeve_number">nothing up my sleeve</a>”, and is commonly used in cryptography.</p>
<p>This trick should give some sort of assurance that <strong>the curve has not been specially crafted to expose vulnerabilities known to the author</strong>. In fact, if I give you a curve together with a seed, it means I was not free to arbitrarily choose the parameters $a$ and $b$, and you should be relatively sure that the curve cannot be used for special purpose attacks by me. The reason why I say “relatively” will be explained in the next post.</p>
<p>A standardized algorithm for generating and checking random curves is described in ANSI X9.62 and is based on <a href="https://en.wikipedia.org/wiki/SHA-1">SHA-1</a>. If you are curious, you can read the algorithms for generating verifiable random curves on <a href="http://www.secg.org/sec1-v2.pdf">a specification by SECG</a> (look for “Verifiably Random Curves and Base Point Generators”).</p>
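<p>The idea behind verifiably random parameters can be sketched in a few lines. Note that this is <em>not</em> the ANSI X9.62 procedure: it is a deliberately simplified illustration in which a coefficient is obtained by hashing a seed with SHA-1 and reducing the result modulo $p$. The function names, the seed and the tiny prime are all made up for the example.</p>

```python
import hashlib

def derive_coefficient(seed: bytes, p: int) -> int:
    """Toy derivation: hash the seed and reduce the digest modulo p."""
    digest = hashlib.sha1(seed).digest()
    return int.from_bytes(digest, "big") % p

def verify(seed: bytes, p: int, b: int) -> bool:
    """Anyone who knows the seed can recompute the hash and compare."""
    return derive_coefficient(seed, p) == b

p = 97                              # tiny illustrative field
seed = b"an arbitrary public seed"  # hypothetical seed
b = derive_coefficient(seed, p)

print(b, verify(seed, p, b))
```

<p>Checking a parameter is cheap, while finding a seed that produces a chosen parameter would require inverting the hash function.</p>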
<p>I’ve created a <strong><a href="https://github.com/andreacorbellini/ecc/blob/master/scripts/verifyrandom.py">tiny Python script</a> that verifies all the random curves currently <a href="https://github.com/openssl/openssl/blob/81fc390/crypto/ec/ec_curve.c">shipped with OpenSSL</a></strong>. I strongly recommend that you check it out!</p>
<h1 id="elliptic-curve-cryptography">Elliptic Curve Cryptography</h1>
<p>It took us a long time, but finally here we are! Therefore, pure and simple:</p>
<ol>
<li>The <strong>private key</strong> is a random integer $d$ chosen from $\{1, \dots, n - 1\}$ (where $n$ is the order of the subgroup).</li>
<li>The <strong>public key</strong> is the point $H = dG$ (where $G$ is the base point of the subgroup).</li>
</ol>
<p>You see? If we know $d$ and $G$ (along with the other domain parameters), finding $H$ is “easy”. But if we know $H$ and $G$, <strong>finding the private key $d$ is “hard”, because it requires us to solve the discrete logarithm problem</strong>.</p>
<p>Now we are going to describe two public-key algorithms based on that: ECDH (Elliptic Curve Diffie-Hellman), which is used for encryption, and ECDSA (Elliptic Curve Digital Signature Algorithm), used for digital signing.</p>
<h2 id="encryption-with-ecdh">Encryption with ECDH</h2>
<p>ECDH is a variant of the <a href="https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange">Diffie-Hellman algorithm</a> for elliptic curves. It is actually a <a href="https://en.wikipedia.org/wiki/Key-agreement_protocol">key-agreement protocol</a>, more than an encryption algorithm. This basically means that ECDH defines (to some extent) how keys should be generated and exchanged between parties. How to actually encrypt data using such keys is up to us.</p>
<p>The problem it solves is the following: two parties (the usual <a href="http://en.wikipedia.org/wiki/Alice_and_Bob">Alice and Bob</a>) want to exchange information securely, so that a third party (the <a href="http://en.wikipedia.org/wiki/Man-in-the-middle_attack">Man In the Middle</a>) may intercept it, but may not decode it. This is one of the principles behind TLS, just to give you an example.</p>
<p>Here’s how it works:</p>
<ol>
<li>
<p>First, <strong>Alice and Bob generate their own private and public keys</strong>. We have the private key $d_A$ and the public key $H_A = d_AG$ for Alice, and the keys $d_B$ and $H_B = d_BG$ for Bob. Note that both Alice and Bob are using the same domain parameters: the same base point $G$ on the same elliptic curve on the same finite field.</p>
</li>
<li>
<p><strong>Alice and Bob exchange their public keys $H_A$ and $H_B$ over an insecure channel</strong>. The Man In the Middle may intercept $H_A$ and $H_B$, but won’t be able to find out either $d_A$ or $d_B$ without solving the discrete logarithm problem.</p>
</li>
<li>
<p><strong>Alice calculates $S = d_A H_B$</strong> (using her own private key and Bob’s public key), <strong>and Bob calculates $S = d_B H_A$</strong> (using his own private key and Alice’s public key). Note that $S$ is the same for both Alice and Bob, in fact:
$$S = d_A H_B = d_A (d_B G) = d_B (d_A G) = d_B H_A$$</p>
</li>
</ol>
<p>The Man In the Middle, however, only knows $H_A$ and $H_B$ (together with the other domain parameters) and would not be able to find out the <strong>shared secret $S$</strong>. This is known as the Diffie-Hellman problem, which can be stated as follows:</p>
<blockquote>
<p>Given three points $P$, $aP$ and $bP$, what is the result of $abP$?</p>
</blockquote>
<p>Or, equivalently:</p>
<blockquote>
<p>Given three integers $k$, $k^x$ and $k^y$, what is the result of $k^{xy}$?</p>
</blockquote>
<p>(The latter form is used in the original Diffie-Hellman algorithm, based on modular arithmetic.)</p>
<figure>
<img src="https://andrea.corbellini.name/images/ecdh.png" alt="ECDH" width="468" height="196">
<figcaption>The Diffie-Hellman key exchange: Alice and Bob can "easily" calculate the shared secret, the Man in the Middle has to solve a "hard" problem.</figcaption>
</figure>
<p>The principle behind the Diffie-Hellman problem is also explained in a great <a href="https://www.youtube.com/watch?v=YEBfamv-_do#t=02m37s">YouTube video by Khan Academy</a>, which later explains the Diffie-Hellman algorithm applied to modular arithmetic (not to elliptic curves).</p>
<p>The Diffie-Hellman problem for elliptic curves is assumed to be a “hard” problem. It is believed to be as “hard” as the discrete logarithm problem, although no mathematical proofs are available. What we can tell for sure is that it can’t be “harder”, because solving the logarithm problem is a way of solving the Diffie-Hellman problem.</p>
<p><strong>Now that Alice and Bob have obtained the shared secret, they can exchange data with symmetric encryption.</strong></p>
<p>For example, they can use the $x$ coordinate of $S$ as the key to encrypt messages using secure ciphers like <a href="https://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES</a> or <a href="https://en.wikipedia.org/wiki/Triple_DES">3DES</a>. This is more or less what TLS does, the difference is that TLS concatenates the $x$ coordinate with other numbers relative to the connection and then computes a hash of the resulting byte string.</p>
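<p>The three steps of the exchange can be sketched as follows. This is a minimal illustration, not production code: it uses the <code>secp256k1</code> domain parameters listed in the next section, the helper names (<code>from_hex</code>, <code>add</code>, <code>mul</code>, <code>make_keypair</code>) are hypothetical, and hashing the $x$ coordinate with SHA-256 is just a stand-in for the key derivation that a real protocol would perform.</p>

```python
import hashlib
import secrets

def from_hex(text):
    return int(text.replace(" ", ""), 16)

# secp256k1 domain parameters: y^2 = x^3 + 7 over the integers modulo p.
p = from_hex("ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff fffffffe fffffc2f")
curve_a, curve_b = 0, 7
G = (from_hex("79be667e f9dcbbac 55a06295 ce870b07 029bfcdb 2dce28d9 59f2815b 16f81798"),
     from_hex("483ada77 26a3c465 5da4fbfc 0e1108a8 fd17b448 a6855419 9c47d08f fb10d4b8"))
n = from_hex("ffffffff ffffffff ffffffff fffffffe baaedce6 af48a03b bfd25e8c d0364141")

def add(p1, p2):
    """Add two points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # p1 == -p2
    if p1 == p2:
        m = (3 * x1 * x1 + curve_a) * pow(2 * y1, -1, p)
    else:
        m = (y2 - y1) * pow((x2 - x1) % p, -1, p)
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def mul(k, point):
    """Scalar multiplication with the double and add algorithm."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def make_keypair():
    d = secrets.randbelow(n - 1) + 1  # private key in {1, ..., n - 1}
    return d, mul(d, G)               # public key H = dG

# Step 1: both parties generate their keys.
d_A, H_A = make_keypair()
d_B, H_B = make_keypair()

# Step 2: H_A and H_B travel over the insecure channel.
# Step 3: each party combines its private key with the other's public key.
S_alice = mul(d_A, H_B)
S_bob = mul(d_B, H_A)
print(S_alice == S_bob)  # prints True

# Simplified key derivation: hash the x coordinate of the shared secret.
symmetric_key = hashlib.sha256(S_alice[0].to_bytes(32, "big")).digest()
```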
<h3 id="playing-with-ecdh">Playing with ECDH</h3>
<p>I’ve created <strong><a href="https://github.com/andreacorbellini/ecc/blob/master/scripts/ecdhe.py">another Python script</a> for computing public/private keys and shared secrets over an elliptic curve</strong>.</p>
<p>Unlike all the examples we have seen till now, this script makes use of a standardized curve, rather than a simple curve on a small field. The curve I’ve chosen is <code>secp256k1</code>, from <a href="http://www.secg.org/">SECG</a> (the “Standards for Efficient Cryptography Group”, founded by <a href="https://www.certicom.com/">Certicom</a>). <a href="https://en.bitcoin.it/wiki/Secp256k1">This same curve is also used by Bitcoin</a> for digital signatures. Here are the domain parameters:</p>
<ul>
<li>$p$ = 0xffffffff ffffffff ffffffff ffffffff ffffffff ffffffff fffffffe fffffc2f</li>
<li>$a$ = 0</li>
<li>$b$ = 7</li>
<li>$x_G$ = 0x79be667e f9dcbbac 55a06295 ce870b07 029bfcdb 2dce28d9 59f2815b 16f81798</li>
<li>$y_G$ = 0x483ada77 26a3c465 5da4fbfc 0e1108a8 fd17b448 a6855419 9c47d08f fb10d4b8</li>
<li>$n$ = 0xffffffff ffffffff ffffffff fffffffe baaedce6 af48a03b bfd25e8c d0364141</li>
<li>$h$ = 1</li>
</ul>
<p>(These numbers were taken from <a href="https://github.com/openssl/openssl/blob/81fc390/crypto/ec/ec_curve.c#L766">OpenSSL source code</a>.)</p>
<p>Of course, you are free to modify the script to use other curves and domain parameters, just be sure to use prime fields and curves in Weierstrass normal form, otherwise the script won’t work.</p>
<p>The script is really simple and includes some of the algorithms we have described so far: point addition, double and add, ECDH. I recommend that you read and run it. It will produce an output like this:</p>
<div class="highlight"><pre><span></span><code>Curve: secp256k1
Alice's private key: 0xe32868331fa8ef0138de0de85478346aec5e3912b6029ae71691c384237a3eeb
Alice's public key: (0x86b1aa5120f079594348c67647679e7ac4c365b2c01330db782b0ba611c1d677, 0x5f4376a23eed633657a90f385ba21068ed7e29859a7fab09e953cc5b3e89beba)
Bob's private key: 0xcef147652aa90162e1fff9cf07f2605ea05529ca215a04350a98ecc24aa34342
Bob's public key: (0x4034127647bb7fdab7f1526c7d10be8b28174e2bba35b06ffd8a26fc2c20134a, 0x9e773199edc1ea792b150270ea3317689286c9fe239dd5b9c5cfd9e81b4b632)
Shared secret: (0x3e2ffbc3aa8a2836c1689e55cd169ba638b58a3a18803fcf7de153525b28c3cd, 0x43ca148c92af58ebdb525542488a4fe6397809200fe8c61b41a105449507083)
</code></pre></div>
<h3 id="ephemeral-ecdh">Ephemeral ECDH</h3>
<p>Some of you may have heard of ECDHE instead of ECDH. The “E” in ECDHE stands for “Ephemeral” and refers to the fact that the <strong>keys exchanged are temporary</strong>, rather than static.</p>
<p>ECDHE is used, for example, in TLS, where both the client and the server generate their public-private key pair on the fly, when the connection is established. The keys are then signed with the TLS certificate (for authentication) and exchanged between the parties.</p>
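<p>To make the idea of “keys generated on the fly” concrete, here is a small sketch of one ephemeral exchange. It assumes toy domain parameters chosen only for this example (the curve $y^2 = x^3 - x + 1$ over $\mathbb{F}_{29}$, base point $G = (0, 1)$, subgroup order $n = 37$), which are far too small for real use. The helper names are hypothetical and the signing step with the certificate is left out.</p>

```python
import secrets

# Toy domain parameters, assumed for this sketch only:
# y^2 = x^3 - x + 1 over F_29, base point G = (0, 1), subgroup order n = 37.
p, a, b = 29, -1, 1
G, n = (0, 1), 37

def add(P, Q):
    """Add two points; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = 0
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p
    return (x3, y3)

def mul(k, P):
    """Scalar multiplication with double and add."""
    result, addend = None, P
    while k:
        if k % 2 == 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k //= 2
    return result

def make_ephemeral_keypair():
    """Generate a fresh (private, public) pair for a single session."""
    d = secrets.randbelow(n - 1) + 1  # private key in {1, ..., n - 1}
    return d, mul(d, G)

# Each side creates a new key pair for this session only.
d_A, H_A = make_ephemeral_keypair()
d_B, H_B = make_ephemeral_keypair()

# After exchanging public keys, both sides compute the same point.
S_A = mul(d_A, H_B)
S_B = mul(d_B, H_A)
print(S_A == S_B)
```

<p>Calling <code>make_ephemeral_keypair()</code> again for the next session yields unrelated keys, which is the whole point of the “E” in ECDHE. Note that <code>pow(x, -1, p)</code> requires Python 3.8 or later.</p>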
<h2 id="signing-with-ecdsa">Signing with ECDSA</h2>
<p>The scenario is the following: <strong>Alice wants to sign a message with her private key</strong> ($d_A$), and <strong>Bob wants to validate the signature using Alice’s public key</strong> ($H_A$). Nobody but Alice should be able to produce valid signatures. Everyone should be able to check signatures.</p>
<p>Again, Alice and Bob are using the same domain parameters. The algorithm we are going to see is ECDSA, a variant of the <a href="https://en.wikipedia.org/wiki/Digital_Signature_Algorithm">Digital Signature Algorithm</a> applied to elliptic curves.</p>
<p>ECDSA works on the hash of the message, rather than on the message itself. The choice of the hash function is up to us, but it should be obvious that a <a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function">cryptographically-secure hash function</a> should be chosen. <strong>The hash of the message ought to be truncated</strong> so that the bit length of the hash is the same as the bit length of $n$ (the order of the subgroup). <strong>The truncated hash is an integer and will be denoted as $z$.</strong></p>
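<p>A minimal sketch of the truncation step follows. It assumes SHA-512 as the hash function (a choice made for this example) and uses the order $n$ of <code>secp256k1</code> listed earlier; the function name <code>truncated_hash</code> is hypothetical.</p>

```python
import hashlib

# Order of the secp256k1 subgroup (from the domain parameters above).
n = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141

def truncated_hash(message, n):
    """Return the leftmost n.bit_length() bits of SHA-512(message) as an int."""
    digest = hashlib.sha512(message).digest()
    e = int.from_bytes(digest, 'big')
    # The digest has len(digest) * 8 bits; drop the rightmost excess bits.
    excess_bits = max(len(digest) * 8 - n.bit_length(), 0)
    return e >> excess_bits

z = truncated_hash(b'Hello!', n)
print(hex(z))
```

<p>Here $n$ is 256 bits long, so $z$ is simply the first half of the 512-bit digest. Note that $z$ may still be larger than $n$; only the bit length is matched.</p>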
<p>The algorithm performed by Alice to sign the message works as follows:</p>
<ol>
<li>Take a <strong>random integer $k$</strong> chosen from $\{1, \dots, n - 1\}$ (where $n$ is still the subgroup order).</li>
<li>Calculate the point <strong>$P = kG$</strong> (where $G$ is the base point of the subgroup).</li>
<li>Calculate the number <strong>$r = x_P \bmod{n}$</strong> (where $x_P$ is the $x$ coordinate of $P$).</li>
<li>If $r = 0$, then choose another $k$ and try again.</li>
<li>Calculate <strong>$s = k^{-1} (z + rd_A) \bmod{n}$</strong> (where $d_A$ is Alice’s private key and $k^{-1}$ is the multiplicative inverse of $k$ modulo $n$).</li>
<li>If $s = 0$, then choose another $k$ and try again.</li>
</ol>
<p>The pair <strong>$(r, s)$ is the signature</strong>.</p>
<figure>
<img src="https://andrea.corbellini.name/images/ecdsa.png" alt="ECDSA" width="514" height="255">
<figcaption>Alice signs the hash $z$ using her private key $d_A$ and a random $k$. Bob verifies that the message has been correctly signed using Alice's public key $H_A$.</figcaption>
</figure>
<p>In plain words, this algorithm first generates a secret ($k$). This secret is hidden in $r$ thanks to point multiplication (that, as we know, is “easy” one way, and “hard” the other way round). $r$ is then bound to the message hash by the equation $s = k^{-1} (z + rd_A) \bmod{n}$.</p>
<p>Note that in order to calculate $s$, we have computed the inverse of $k$ modulo $n$. We have <a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/#p-must-be-prime">already said in the previous post</a> that this is guaranteed to work only if $n$ is a prime number. <strong>If a subgroup has a non-prime order, ECDSA can’t be used.</strong> It’s not by chance that almost all standardized curves have a prime order, and those that have a non-prime order are unsuitable for ECDSA.</p>
<h3 id="verifying-signatures">Verifying signatures</h3>
<p>In order to verify the signature we’ll need Alice’s public key $H_A$, the (truncated) hash $z$ and, obviously, the signature $(r, s)$.</p>
<ol>
<li>Calculate the integer $u_1 = s^{-1} z \bmod{n}$.</li>
<li>Calculate the integer $u_2 = s^{-1} r \bmod{n}$.</li>
<li>Calculate the point $P = u_1 G + u_2 H_A$.</li>
</ol>
<p>The signature is valid only if $r = x_P \bmod{n}$.</p>
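<p>The signing and verification steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a toy curve ($y^2 = x^3 - x + 1$ over $\mathbb{F}_{29}$, $G = (0, 1)$, $n = 37$) instead of a standardized one, SHA-256 as the hash, and hypothetical helper names. With such a small group, forged signatures can verify by chance, so it shows the mechanics and nothing more.</p>

```python
import hashlib
import secrets

# Toy domain parameters, assumed for this sketch only.
p, a, b = 29, -1, 1
G, n = (0, 1), 37

def add(P, Q):
    """Add two points; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def mul(k, P):
    """Scalar multiplication with double and add."""
    result, addend = None, P
    while k:
        if k % 2 == 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k //= 2
    return result

def truncated_hash(message):
    """Leftmost n.bit_length() bits of SHA-256(message), as an integer z."""
    e = int.from_bytes(hashlib.sha256(message).digest(), 'big')
    return e >> (256 - n.bit_length())

def sign(d, message):
    z = truncated_hash(message)
    while True:
        k = secrets.randbelow(n - 1) + 1          # step 1
        x_P, _ = mul(k, G)                        # step 2
        r = x_P % n                               # step 3
        if r == 0:                                # step 4
            continue
        s = pow(k, -1, n) * (z + r * d) % n       # step 5
        if s != 0:                                # step 6
            return r, s

def verify(H, message, signature):
    r, s = signature
    # Standard sanity check, not listed in the steps above.
    if r not in range(1, n) or s not in range(1, n):
        return False
    z = truncated_hash(message)
    u1 = pow(s, -1, n) * z % n
    u2 = pow(s, -1, n) * r % n
    P = add(mul(u1, G), mul(u2, H))
    return P is not None and P[0] % n == r

d_A = 5                    # Alice's private key (made up)
H_A = mul(d_A, G)          # Alice's public key
signature = sign(d_A, b'Hello!')
print(verify(H_A, b'Hello!', signature))
```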
<h2 id="correctness-of-the-algorithm">Correctness of the algorithm</h2>
<p>The logic behind this algorithm may not seem obvious at first sight; however, if we put together all the equations we have written so far, things will be clearer.</p>
<p>Let’s start from $P = u_1 G + u_2 H_A$. We know, from the definition of public key, that $H_A = d_A G$ (where $d_A$ is the private key). We can write:
$$\begin{align*}
P & = u_1 G + u_2 H_A \\
& = u_1 G + u_2 d_A G \\
& = (u_1 + u_2 d_A) G
\end{align*}$$</p>
<p>Using the definitions of $u_1$ and $u_2$, we can write:
$$\begin{align*}
P & = (u_1 + u_2 d_A) G \\
& = (s^{-1} z + s^{-1} r d_A) G \\
& = s^{-1} (z + r d_A) G
\end{align*}$$</p>
<p>Here we have omitted “$\text{mod}\ n$” both for brevity, and because the cyclic subgroup generated by $G$ has order $n$, hence “$\text{mod}\ n$” is superfluous.</p>
<p>Previously, we defined $s = k^{-1} (z + rd_A) \bmod{n}$. Multiplying each side of the equation by $k$ and dividing by $s$, we get: $k = s^{-1} (z + rd_A) \bmod{n}$. Substituting this result in our equation for $P$, we get:
$$\begin{align*}
P & = s^{-1} (z + r d_A) G \\
& = k G
\end{align*}$$</p>
<p><strong>This is the same equation for $P$ we had at step 2 of the signature generation algorithm!</strong> When generating signatures and when verifying them, we are calculating the same point $P$, just with a different set of equations. This is why the algorithm works.</p>
<h3 id="playing-with-ecdsa">Playing with ECDSA</h3>
<p>Of course, I’ve created <strong><a href="https://github.com/andreacorbellini/ecc/blob/master/scripts/ecdsa.py">a Python script</a> for signature generation and verification</strong>. The code shares some parts with the ECDH script, in particular the domain parameters and the public/private key pair generation algorithm.</p>
<p>Here is the kind of output produced by the script:</p>
<div class="highlight"><pre><span></span><code>Curve: secp256k1
Private key: 0x9f4c9eb899bd86e0e83ecca659602a15b2edb648e2ae4ee4a256b17bb29a1a1e
Public key: (0xabd9791437093d377ca25ea974ddc099eafa3d97c7250d2ea32af6a1556f92a, 0x3fe60f6150b6d87ae8d64b78199b13f26977407c801f233288c97ddc4acca326)
Message: b'Hello!'
Signature: (0xddcb8b5abfe46902f2ac54ab9cd5cf205e359c03fdf66ead1130826f79d45478, 0x551a5b2cd8465db43254df998ba577cb28e1ee73c5530430395e4fba96610151)
Verification: signature matches
Message: b'Hi there!'
Verification: invalid signature
Message: b'Hello!'
Public key: (0xc40572bb38dec72b82b3efb1efc8552588b8774149a32e546fb703021cf3b78a, 0x8c6e5c5a9c1ea4cad778072fe955ed1c6a2a92f516f02cab57e0ba7d0765f8bb)
Verification: invalid signature
</code></pre></div>
<p>As you can see, the script first signs a message (the byte string “Hello!”), then verifies the signature. Afterwards, it tries to verify the same signature against another message (“Hi there!”) and verification fails. Lastly, it tries to verify the signature against the correct message, but using another random public key and verification fails again.</p>
<h2 id="ecdsa-k">The importance of <em>k</em></h2>
<p>When generating ECDSA signatures, it is important to keep the secret $k$ really secret. If we used the same $k$ for all signatures, or if our random number generator were somewhat predictable, <strong>an attacker would be able to find out the private key</strong>!</p>
<p><a href="http://www.bbc.com/news/technology-12116051">This is the kind of mistake made by Sony a few years ago.</a> Basically, the PlayStation 3 game console can run only games signed by Sony with ECDSA. This way, if I wanted to create a new game for PlayStation 3, I couldn’t distribute it to the public without a signature from Sony. The problem is: all the signatures made by Sony were generated using a static $k$.</p>
<p>(Apparently, Sony’s random number generator was inspired by either <a href="http://xkcd.com/221/">XKCD</a> or <a href="http://dilbert.com/strip/2001-10-25">Dilbert</a>.)</p>
<p>In this situation, we could easily recover Sony’s private key $d_S$ by buying just two signed games, extracting their hashes ($z_1$ and $z_2$) and their signatures ($(r_1, s_1)$ and $(r_2, s_2)$), together with the domain parameters. Here’s how:</p>
<ul>
<li>First off, note that $r_1 = r_2$ (because $r = x_P \bmod{n}$ and $P = kG$ is the same for both signatures).</li>
<li>Consider that $(s_1 - s_2) \bmod{n} = k^{-1} (z_1 - z_2) \bmod{n}$ (this result comes directly from the equation for $s$).</li>
<li>Now multiply each side of the equation by $k$: $k (s_1 - s_2) \bmod{n} = (z_1 - z_2) \bmod{n}$.</li>
<li>Divide by $(s_1 - s_2)$ to get $k = (z_1 - z_2)(s_1 - s_2)^{-1} \bmod{n}$.</li>
</ul>
<p>The last equation lets us calculate $k$ using only two hashes and their corresponding signatures. Now we can extract the private key using the equation for $s$:
$$s = k^{-1}(z + rd_S) \bmod{n}\ \ \Rightarrow\ \ d_S = r^{-1} (sk - z) \bmod{n}$$</p>
<p>Similar techniques may be employed if $k$ is not static but predictable in some way.</p>
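<p>The recovery above involves only arithmetic modulo $n$, so it can be checked numerically. In the sketch below every value except $n$ (the order of <code>secp256k1</code>) is made up: <code>d</code> plays the role of the private key, <code>k</code> is the reused secret, and <code>r</code> is a placeholder for $x_P \bmod{n}$, since its actual value does not affect the algebra.</p>

```python
import hashlib

# Order of the secp256k1 subgroup; a prime, so every non-zero value is invertible.
n = 0xfffffffffffffffffffffffffffffffebaaedce6af48a03bbfd25e8cd0364141

d = 123456789012345678901234567890   # signer's private key (made up)
k = 987654321098765432109876543210   # the static k, reused for both signatures
r = 55555555555555555555             # placeholder for x_P mod n (same for both)

# Hashes of two different signed messages.
z1 = int.from_bytes(hashlib.sha256(b'game one').digest(), 'big')
z2 = int.from_bytes(hashlib.sha256(b'game two').digest(), 'big')

def s_value(z):
    """The signer's equation: s = k^-1 (z + r d) mod n."""
    return pow(k, -1, n) * (z + r * d) % n

s1, s2 = s_value(z1), s_value(z2)

# The attacker sees only (r, s1), (r, s2), z1, z2 and n.
k_recovered = (z1 - z2) * pow((s1 - s2) % n, -1, n) % n
d_recovered = pow(r, -1, n) * (s1 * k_recovered - z1) % n

print(k_recovered == k, d_recovered == d)
```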
<h1 id="have-a-great-weekend">Have a great weekend</h1>
<p>I really hope you enjoyed what I’ve written here. As usual, don’t hesitate to leave a comment or send me a poke if you need help with something.</p>
<p>Next week I’ll publish the fourth and last article of this series. It’ll be about techniques for solving discrete logarithms, some important problems of Elliptic Curve cryptography, and how ECC compares with RSA. Don’t miss it!</p>
<p><strong><a href="https://andrea.corbellini.name/2015/06/08/elliptic-curve-cryptography-breaking-security-and-a-comparison-with-rsa/">Read the next post of the series »</a></strong></p>andreacorbelliniSat, 30 May 2015 19:23:00 +0000tag:andrea.corbellini.name,2015-05-30:/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/cryptographydhdsaeccecdhecdheecdsasecuritytlsElliptic Curve Cryptography: finite fields and discrete logarithmshttps://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/<p><strong>This post is the second in the series <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">ECC: a gentle introduction</a>.</strong></p>
<p>In the <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">previous post</a>, we have seen how elliptic curves over the real numbers can be used to define a group. Specifically, we have defined a rule for <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#group-law">point addition</a>: given three aligned points, their sum is zero ($P + Q + R = 0$). We have derived a <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#geometric-addition">geometric method</a> and an <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#algebraic-addition">algebraic method</a> for computing point additions.</p>
<p>We then introduced <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#scalar-multiplication">scalar multiplication</a> ($nP = P + P + \cdots + P$) and we found out an “easy” algorithm for computing scalar multiplication: <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#double-and-add">double and add</a>.</p>
<p><strong>Now we will restrict our elliptic curves to finite fields</strong>, rather than the set of real numbers, and see how things change.</p>
<h1 id="the-field-of-integers-modulo-p">The field of integers modulo <em>p</em></h1>
<p>A finite field is, first of all, a set with a finite number of elements. An example of finite field is the set of integers modulo $p$, where $p$ is a prime number. It is generally denoted as $\mathbb{Z}/p$, $GF(p)$ or $\mathbb{F}_p$. We will use the latter notation.</p>
<p>In fields we have two binary operations: addition (+) and multiplication (·). Both are closed, associative and commutative. For both operations, there exists a unique identity element, and every element has a unique inverse (except 0, which has no multiplicative inverse). Finally, multiplication is distributive over addition: $x \cdot (y + z) = x \cdot y + x \cdot z$.</p>
<p>The set of <strong>integers modulo $p$ consists of all the integers from 0 to $p - 1$</strong>. Addition and multiplication work as in <a href="http://en.wikipedia.org/wiki/Modular_arithmetic">modular arithmetic</a> (also known as “clock arithmetic”). Here are a few examples of operations in $\mathbb{F}_{23}$:</p>
<ul>
<li>Addition: $(18 + 9) \bmod{23} = 4$</li>
<li>Subtraction: $(7 - 14) \bmod{23} = 16$</li>
<li>Multiplication: $4 \cdot 7 \bmod{23} = 5$</li>
<li>
<p>Additive inverse: $-5 \bmod{23} = 18$</p>
<p>Indeed: $(5 + (-5)) \bmod{23} = (5 + 18) \bmod{23} = 0$</p>
</li>
<li>
<p>Multiplicative inverse: $9^{-1} \bmod{23} = 18$</p>
<p>Indeed: $9 \cdot 9^{-1} \bmod{23} = 9 \cdot 18 \bmod{23} = 1$</p>
</li>
</ul>
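<p>These examples are easy to reproduce in Python, whose <code>%</code> operator always returns a result between 0 and $p - 1$ for a positive modulus. The snippet below is just a quick check of the numbers above; <code>pow(x, -1, p)</code> for the inverse needs Python 3.8 or later.</p>

```python
p = 23

addition = (18 + 9) % p
subtraction = (7 - 14) % p
multiplication = (4 * 7) % p
additive_inverse = -5 % p
multiplicative_inverse = pow(9, -1, p)

print(addition, subtraction, multiplication,
      additive_inverse, multiplicative_inverse)

# Sanity checks: an element combined with its inverse gives the identity.
assert (5 + additive_inverse) % p == 0
assert (9 * multiplicative_inverse) % p == 1
```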
<p>If these equations don’t look familiar to you and you need a primer on modular arithmetic, check out <a href="https://www.khanacademy.org/computing/computer-science/cryptography/modarithmetic/a/what-is-modular-arithmetic">Khan Academy</a>.</p>
<p>As we already said, the integers modulo $p$ are a field, and therefore all the properties listed above hold. <span id="p-must-be-prime">Note that the requirement for $p$ to be prime is important!</span> The set of integers modulo 4 is not a field: 2 has no multiplicative inverse (i.e. the equation $2 \cdot x \bmod{4} = 1$ has no solutions).</p>
<h2 id="division-modulo-p">Division modulo <em>p</em></h2>
<p>We will soon define elliptic curves over $\mathbb{F}_p$, but before doing so we need a clear idea of what $x / y$ means in $\mathbb{F}_p$. Simply put: $x / y = x \cdot y^{-1}$, or, in plain words, $x$ over $y$ is equal to $x$ times the multiplicative inverse of $y$. This fact is not surprising, but gives us a basic method to perform division: <strong>find the multiplicative inverse of a number and then perform a single multiplication</strong>.</p>
<p>Computing the multiplicative inverse can be “easily” done with the <strong><a href="http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm">extended Euclidean algorithm</a></strong>, which is $O(\log p)$ (or $O(k)$ if we consider the bit length) in the worst case.</p>
<p>We won’t go into the details of the extended Euclidean algorithm, as it is off-topic; however, here’s a working Python implementation:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">extended_euclidean_algorithm</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Returns a three-tuple (gcd, x, y) such that</span>
<span class="sd">    a * x + b * y == gcd, where gcd is the greatest</span>
<span class="sd">    common divisor of a and b.</span>

<span class="sd">    This function implements the extended Euclidean</span>
<span class="sd">    algorithm and runs in O(log b) in the worst case.</span>
<span class="sd">    """</span>
    <span class="n">s</span><span class="p">,</span> <span class="n">old_s</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">1</span>
    <span class="n">t</span><span class="p">,</span> <span class="n">old_t</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span>
    <span class="n">r</span><span class="p">,</span> <span class="n">old_r</span> <span class="o">=</span> <span class="n">b</span><span class="p">,</span> <span class="n">a</span>

    <span class="k">while</span> <span class="n">r</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">quotient</span> <span class="o">=</span> <span class="n">old_r</span> <span class="o">//</span> <span class="n">r</span>
        <span class="n">old_r</span><span class="p">,</span> <span class="n">r</span> <span class="o">=</span> <span class="n">r</span><span class="p">,</span> <span class="n">old_r</span> <span class="o">-</span> <span class="n">quotient</span> <span class="o">*</span> <span class="n">r</span>
        <span class="n">old_s</span><span class="p">,</span> <span class="n">s</span> <span class="o">=</span> <span class="n">s</span><span class="p">,</span> <span class="n">old_s</span> <span class="o">-</span> <span class="n">quotient</span> <span class="o">*</span> <span class="n">s</span>
        <span class="n">old_t</span><span class="p">,</span> <span class="n">t</span> <span class="o">=</span> <span class="n">t</span><span class="p">,</span> <span class="n">old_t</span> <span class="o">-</span> <span class="n">quotient</span> <span class="o">*</span> <span class="n">t</span>

    <span class="k">return</span> <span class="n">old_r</span><span class="p">,</span> <span class="n">old_s</span><span class="p">,</span> <span class="n">old_t</span>


<span class="k">def</span> <span class="nf">inverse_of</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">p</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Returns the multiplicative inverse of</span>
<span class="sd">    n modulo p.</span>

<span class="sd">    This function returns an integer m such that</span>
<span class="sd">    (n * m) % p == 1.</span>
<span class="sd">    """</span>
    <span class="n">gcd</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="n">extended_euclidean_algorithm</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span>
    <span class="k">assert</span> <span class="n">n</span> <span class="o">*</span> <span class="n">x</span> <span class="o">+</span> <span class="n">p</span> <span class="o">*</span> <span class="n">y</span> <span class="o">==</span> <span class="n">gcd</span>

    <span class="k">if</span> <span class="n">gcd</span> <span class="o">!=</span> <span class="mi">1</span><span class="p">:</span>
        <span class="c1"># Either n is 0, or p is not a prime number.</span>
        <span class="k">raise</span> <span class="ne">ValueError</span><span class="p">(</span>
            <span class="s1">'</span><span class="si">{}</span><span class="s1"> has no multiplicative inverse '</span>
            <span class="s1">'modulo </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">p</span><span class="p">))</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">x</span> <span class="o">%</span> <span class="n">p</span>
</code></pre></div>
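<p>As a side note, since Python 3.8 the built-in <code>pow</code> accepts a negative exponent together with a modulus, which gives the same inverse without any helper code. The sketch below uses it to perform a division in $\mathbb{F}_{23}$; the function name <code>divide</code> is made up for this example.</p>

```python
def divide(x, y, p):
    """Compute x / y in F_p as x times the multiplicative inverse of y."""
    return x * pow(y, -1, p) % p

p = 23
quotient = divide(4, 9, p)     # 4 * 9^-1 = 4 * 18 (mod 23)
print(quotient)

# Multiplying back by the divisor must give the dividend again.
assert quotient * 9 % p == 4

# When no inverse exists (here: 2 modulo 4), pow raises ValueError.
try:
    pow(2, -1, 4)
    has_inverse = True
except ValueError:
    has_inverse = False
print(has_inverse)
```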
<h1 id="elliptic-curves-in-mathbbf_p">Elliptic curves in $\mathbb{F}_p$</h1>
<p>Now we have all the necessary elements to restrict elliptic curves over $\mathbb{F}_p$. The set of points, that in the <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#elliptic-curves">previous post</a> was:
$$\begin{array}{rcl}
\left\{(x, y) \in \mathbb{R}^2 \right. & \left. | \right. & \left. y^2 = x^3 + ax + b, \right. \\
& & \left. 4a^3 + 27b^2 \ne 0\right\}\ \cup\ \left\{0\right\}
\end{array}$$
now becomes:
$$\begin{array}{rcl}
\left\{(x, y) \in (\mathbb{F}_p)^2 \right. & \left. | \right. & \left. y^2 \equiv x^3 + ax + b \pmod{p}, \right. \\
& & \left. 4a^3 + 27b^2 \not\equiv 0 \pmod{p}\right\}\ \cup\ \left\{0\right\}
\end{array}$$</p>
<p>where 0 is still the point at infinity, and $a$ and $b$ are two integers in $\mathbb{F}_p$.</p>
<figure>
<img src="https://andrea.corbellini.name/images/elliptic-curves-mod-p.png" alt="Elliptic curves in Fp" width="608" height="608">
<figcaption>The curve $y^2 \equiv x^3 - 7x + 10 \pmod{p}$ with $p = 19, 97, 127, 487$. Note that, for every $x$, there are at most two points. Also note the symmetry about $y = p / 2$.</figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/singular-mod-p.png" alt="Singular curve in Fp" width="300" height="300">
<figcaption>The curve $y^2 \equiv x^3 \pmod{29}$ is singular and has a triple point in $(0, 0)$. It is not a valid elliptic curve.</figcaption>
</figure>
<p>What previously was a continuous curve is now a set of disjoint points in the $xy$-plane. But we can prove that, even if we have restricted our domain, <strong>elliptic curves in $\mathbb{F}_p$ still form an abelian group</strong>.</p>
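<p>For small primes, the set of points can simply be enumerated by brute force. The following sketch (with hypothetical function names) lists the points of $y^2 \equiv x^3 - 7x + 10 \pmod{19}$ and checks the non-singularity condition; the point at infinity is not included in the list.</p>

```python
def is_singular(a, b, p):
    """True if 4a^3 + 27b^2 is congruent to 0 modulo p."""
    return (4 * a**3 + 27 * b**2) % p == 0

def curve_points(a, b, p):
    """All affine points (x, y) with y^2 = x^3 + ax + b (mod p)."""
    return [(x, y)
            for x in range(p)
            for y in range(p)
            if (y * y - (x**3 + a * x + b)) % p == 0]

a, b, p = -7, 10, 19
points = curve_points(a, b, p)
print(is_singular(a, b, p), len(points))

# Symmetry: if (x, y) is on the curve, so is (x, -y mod p).
assert all((x, -y % p) in points for x, y in points)

# The curve y^2 = x^3 over F_29 fails the non-singularity condition.
print(is_singular(0, 0, 29))
```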
<h1 id="point-addition">Point addition</h1>
<p>Clearly, we need to adjust our definition of addition a bit in order to make it work in $\mathbb{F}_p$. With reals, we said that the sum of three aligned points was zero ($P + Q + R = 0$). We can keep this definition, but what does it mean for three points to be aligned in $\mathbb{F}_p$?</p>
<p>We can say that <strong>three points are aligned if there’s a line that connects all of them</strong>. Now, of course, lines in $\mathbb{F}_p$ are not the same as lines in $\mathbb{R}$. We can say, informally, that a line in $\mathbb{F}_p$ is the set of points $(x, y)$ that satisfy the equation $ax + by + c \equiv 0 \pmod{p}$ (this is the standard line equation, with the addition of “$(\text{mod}\ p)$”).</p>
<figure>
<img src="https://andrea.corbellini.name/images/point-addition-mod-p.png" alt="Point addition for elliptic curves in Z/p" width="523" height="528">
<figcaption>Point addition over the curve $y^2 \equiv x^3 - x + 3 \pmod{127}$, with $P = (16, 20)$ and $Q = (41, 120)$. Note how the line $y \equiv 4x + 83 \pmod{127}$ that connects the points "repeats" itself in the plane.</figcaption>
</figure>
<p>Given that we are in a group, point addition retains the properties we already know:</p>
<ul>
<li>$Q + 0 = 0 + Q = Q$ (from the definition of identity element).</li>
<li>Given a non-zero point $Q$, the inverse $-Q$ is the point having the same abscissa but opposite ordinate. Or, if you prefer, $-Q = (x_Q, -y_Q \bmod{p})$.
For example, if a curve in $\mathbb{F}_{29}$ has a point $Q = (2, 5)$, the inverse is $-Q = (2, -5 \bmod{29}) = (2, 24)$.</li>
<li>Also, $P + (-P) = 0$ (from the definition of inverse element).</li>
</ul>
<h1 id="algebraic-sum">Algebraic sum</h1>
<p><strong>The equations for calculating point additions are exactly the same as in the previous post</strong>, except for the fact that we need to add “$\text{mod}\ p$” at the end of every expression. Therefore, given $P = (x_P, y_P)$, $Q = (x_Q, y_Q)$ and $R = (x_R, y_R)$, we can calculate $P + Q = -R$ as follows:
$$\begin{align*}
x_R & = (m^2 - x_P - x_Q) \bmod{p} \\
y_R & = [y_P + m(x_R - x_P)] \bmod{p} \\
& = [y_Q + m(x_R - x_Q)] \bmod{p}
\end{align*}$$</p>
<p>If $P \ne Q$, then the slope $m$ assumes the form:
$$m = (y_P - y_Q)(x_P - x_Q)^{-1} \bmod{p}$$</p>
<p>Else, if $P = Q$, we have:
$$m = (3 x_P^2 + a)(2 y_P)^{-1} \bmod{p}$$</p>
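<p>These formulas translate almost directly into code. The sketch below is one possible implementation (the names <code>add</code> and <code>on_curve</code> are made up); it returns $P + Q$ directly, that is $(x_R, -y_R \bmod{p})$, and represents the point at infinity with <code>None</code>. It is run on the curve $y^2 \equiv x^3 - x + 3 \pmod{127}$ with $P = (16, 20)$ and $Q = (41, 120)$ from the figure above.</p>

```python
p, a, b = 127, -1, 3

def on_curve(P):
    """Check that P satisfies y^2 = x^3 + ax + b (mod p)."""
    if P is None:
        return True
    x, y = P
    return (y * y - (x**3 + a * x + b)) % p == 0

def add(P, Q):
    """Return P + Q; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # Q = -P, so the sum is the point at infinity
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    y3 = (m * (x1 - x3) - y1) % p   # already negated: this is -y_R
    return (x3, y3)

P, Q = (16, 20), (41, 120)
result = add(P, Q)
print(result)   # (86, 81)
```

<p>The slope here is $m = 4$, which matches the line $y \equiv 4x + 83 \pmod{127}$ shown in the figure.</p>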
<p>It’s not a coincidence that the equations have not changed: in fact, these equations work in every field, finite or infinite (with the exception of $\mathbb{F}_2$ and $\mathbb{F}_3$, which are special cased). Now I feel I have to provide a justification for this fact. The problem is: proofs for the group law generally involve complex mathematical concepts. However, I found a <a href="https://arxiv.org/pdf/1710.00214">proof from Stefan Friedl</a> that uses only elementary concepts. Read it if you are interested in why these equations work in (almost) every field.</p>
<p>Back to us: we won’t define a geometric method, because there are a few problems with that. For example, in the previous post, we said that to compute $P + P$ we needed to take the tangent to the curve in $P$. But without continuity, the word “tangent” does not make any sense. We can work around this and other problems; however, a pure geometric method would just be too complicated and not practical at all.</p>
<p>Instead, you can play with the <strong><a href="https://andrea.corbellini.name/ecc/interactive/modk-add.html">interactive tool</a> I’ve written for computing point additions</strong>.</p>
<h1 id="the-order-of-an-elliptic-curve-group">The order of an elliptic curve group</h1>
<p>We said that an elliptic curve defined over a finite field has a finite number of points. An important question that we need to answer is: <strong>how many points are there exactly?</strong></p>
<p>Firstly, let’s say that the number of points in a group is called the <strong>order of the group</strong>.</p>
<p>Trying all the possible values for $x$ from 0 to $p - 1$ is not a feasible way to count the points, as it would require $O(p)$ steps, and this is “hard” if $p$ is a large prime.</p>
<p>Luckily, there’s a faster algorithm for computing the order: <a href="https://en.wikipedia.org/wiki/Schoof%27s_algorithm">Schoof’s algorithm</a>. I won’t enter the details of the algorithm — what matters is that it runs in polynomial time, and this is what we need.</p>
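<p>For comparison, this is what the naive $O(p)$ count looks like. It is only a sketch for tiny primes (it is not Schoof’s algorithm, which is considerably more involved), and the function name <code>count_points</code> is made up. It is handy for double-checking the orders of toy curves.</p>

```python
def count_points(a, b, p):
    """Count the points of y^2 = x^3 + ax + b over F_p by brute force."""
    # How many y values square to each residue modulo p.
    square_roots = {}
    for y in range(p):
        square_roots[y * y % p] = square_roots.get(y * y % p, 0) + 1

    total = 1  # the point at infinity
    for x in range(p):
        total += square_roots.get((x**3 + a * x + b) % p, 0)
    return total

order_37 = count_points(-1, 3, 37)   # y^2 = x^3 - x + 3 over F_37
order_29 = count_points(-1, 1, 29)   # y^2 = x^3 - x + 1 over F_29
print(order_37, order_29)            # 42 37
```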
<h1 id="scalar-multiplication">Scalar multiplication and cyclic subgroups</h1>
<p>As with reals, multiplication can be defined as:
$$n P = \underbrace{P + P + \cdots + P}_{n\ \text{times}}$$</p>
<p>And, again, we can use the <a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/#double-and-add">double and add algorithm</a> to perform multiplication in $O(\log n)$ steps (or $O(k)$, where $k$ is the number of bits of $n$). I’ve written an <strong><a href="https://andrea.corbellini.name/ecc/interactive/modk-mul.html">interactive tool</a> for scalar multiplication</strong> too.</p>
<p>Multiplication over points for elliptic curves in $\mathbb{F}_p$ has an interesting property. Take the curve $y^2 \equiv x^3 + 2x + 3 \pmod{97}$ and the point $P = (3, 6)$. Now <a href="https://andrea.corbellini.name/ecc/interactive/modk-mul.html">calculate</a> all the multiples of $P$:</p>
<figure>
<img src="https://andrea.corbellini.name/images/cyclic-subgroup.png" alt="Cyclic subgroup" width="322" height="255">
<figcaption>The multiples of $P = (3, 6)$ are just five distinct points ($0$, $P$, $2P$, $3P$, $4P$) and they are repeating cyclically. It's easy to spot the similarity between scalar multiplication on elliptic curves and addition in modular arithmetic.</figcaption>
</figure>
<ul>
<li>$0P = 0$</li>
<li>$1P = (3, 6)$</li>
<li>$2P = (80, 10)$</li>
<li>$3P = (80, 87)$</li>
<li>$4P = (3, 91)$</li>
<li>$5P = 0$</li>
<li>$6P = (3, 6)$</li>
<li>$7P = (80, 10)$</li>
<li>$8P = (80, 87)$</li>
<li>$9P = (3, 91)$</li>
<li>…</li>
</ul>
<p>Here we can immediately spot two things: firstly, the multiples of $P$ are just five: the other points of the elliptic curve never appear. Secondly, they are <strong>repeating cyclically</strong>. We can write:</p>
<ul>
<li>$5kP = 0$</li>
<li>$(5k + 1)P = P$</li>
<li>$(5k + 2)P = 2P$</li>
<li>$(5k + 3)P = 3P$</li>
<li>$(5k + 4)P = 4P$</li>
</ul>
<p>for every integer $k$. Note that these five equations can be “compressed” into a single one, thanks to the modulo operator: $kP = (k \bmod{5})P$.</p>
<p>Not only that, but we can immediately verify that <strong>these five points are closed under addition</strong>. Which means: however I add $0$, $P$, $2P$, $3P$ or $4P$, the result is always one of these five points. Again, the other points of the elliptic curve never appear in the results.</p>
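<p>Both observations can be reproduced with a short script. The sketch below (with hypothetical helper names, and <code>None</code> standing for the point at infinity) computes the multiples of $P = (3, 6)$ on $y^2 \equiv x^3 + 2x + 3 \pmod{97}$ with double and add, then checks that the five points are closed under addition.</p>

```python
p, a, b = 97, 2, 3

def add(P, Q):
    """Return P + Q; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def mul(k, P):
    """Scalar multiplication with double and add."""
    result, addend = None, P
    while k:
        if k % 2 == 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k //= 2
    return result

P = (3, 6)
multiples = [mul(k, P) for k in range(10)]
print(multiples)

# Closure: adding any two multiples gives (i + j mod 5) P.
closed = all(add(mul(i, P), mul(j, P)) == mul((i + j) % 5, P)
             for i in range(5) for j in range(5))
print(closed)
```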
<p>The same holds for every point, not just for $P = (3, 6)$. In fact, if we take a generic $P$:
$$nP + mP = \underbrace{P + \cdots + P}_{n\ \text{times}} + \underbrace{P + \cdots + P}_{m\ \text{times}} = (n + m)P$$</p>
<p>Which means: <strong>if we add two multiples of $P$, we obtain a multiple of $P$</strong> (i.e. multiples of $P$ are closed under addition). This is enough to <a href="https://en.wikipedia.org/wiki/Subgroup#Basic_properties_of_subgroups">prove</a> that <strong>the set of the multiples of $P$ is a cyclic subgroup</strong> of the group formed by the elliptic curve.</p>
<p>A “subgroup” is a group which is a subset of another group. A “cyclic subgroup” is a subgroup whose elements repeat cyclically, as shown in the previous example. <span id="base-point">The point $P$ is called <strong>generator</strong> or <strong>base point</strong> of the cyclic subgroup</span>.</p>
<p>Cyclic subgroups are the foundations of ECC and other cryptosystems. We will see why in the next post.</p>
<h2 id="subgroup-order">Subgroup order</h2>
<p>We can ask ourselves <strong>what the order of a subgroup generated by a point $P$ is</strong> (or, equivalently, what the order of $P$ is). To answer this question we can’t use Schoof’s algorithm, because that algorithm only works on whole elliptic curves, not on subgroups. Before approaching the problem, we need a few more bits:</p>
<ul>
<li>So far, we have defined the order as the number of points of a group. This definition is still valid, but within a cyclic subgroup we can give a new, equivalent definition: <strong>the order of $P$ is the smallest positive integer $n$ such that $nP = 0$</strong>.
In fact, if you look at the previous example, our subgroup contained five points, and we had $5P = 0$.</li>
<li>The order of $P$ is linked to the order of the elliptic curve by <a href="https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory)">Lagrange’s theorem</a>, which states that <strong>the order of a subgroup is a divisor of the order of the parent group</strong>.
In other words, if an elliptic curve contains $N$ points and one of its subgroups contains $n$ points, then $n$ is a divisor of $N$.</li>
</ul>
<p>These two pieces of information together give us a way to find out the order of a subgroup with base point $P$:</p>
<ol>
<li>Calculate the elliptic curve’s order $N$ using Schoof’s algorithm.</li>
<li>Find out all the divisors of $N$.</li>
<li>For every divisor $n$ of $N$, compute $nP$.</li>
<li>The smallest $n$ such that $nP = 0$ is the order of the subgroup.</li>
</ol>
<p>For example, the curve $y^2 = x^3 - x + 3$ over the field $\mathbb{F}_{37}$ has order $N = 42$. Its subgroups may have order $n = 1$, $2$, $3$, $6$, $7$, $14$, $21$ or $42$. If <a href="https://andrea.corbellini.name/ecc/interactive/modk-mul.html?a=-1&b=3&p=37&px=2&py=3">we try $P = (2, 3)$</a> we can see that $P \ne 0$, $2P \ne 0$, …, $7P = 0$, hence the order of $P$ is $n = 7$.</p>
<p>Note that <strong>it’s important to take the smallest divisor, not a random one</strong>. If we proceeded randomly, we could have taken $n = 14$, which is not the order of the subgroup, but one of its multiples.</p>
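<p>A small sketch of the procedure follows, applied to the example above. The curve order $N$ is passed in as a known value (computing it with Schoof’s algorithm is out of scope here), and the helper names are hypothetical. The divisors are tried in ascending order, so the first $n$ with $nP = 0$ is the smallest one.</p>

```python
p, a, b = 37, -1, 3   # y^2 = x^3 - x + 3 over F_37, order N = 42

def add(P, Q):
    """Return P + Q; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p
    else:
        m = (y1 - y2) * pow((x1 - x2) % p, -1, p) % p
    x3 = (m * m - x1 - x2) % p
    return (x3, (m * (x1 - x3) - y1) % p)

def mul(k, P):
    """Scalar multiplication with double and add."""
    result, addend = None, P
    while k:
        if k % 2 == 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k //= 2
    return result

def divisors(N):
    """Divisors of N in ascending order (fine for small N)."""
    return [d for d in range(1, N + 1) if N % d == 0]

def point_order(P, N):
    """Smallest divisor n of the curve order N such that nP = 0."""
    for n in divisors(N):
        if mul(n, P) is None:
            return n

order = point_order((2, 3), 42)
print(divisors(42), order)
```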
<p>Another example: the elliptic curve defined by the equation $y^2 = x^3 - x + 1$ over the field $\mathbb{F}_{29}$ has order $N = 37$, which is a prime. Its subgroups may only have order $n = 1$ or $37$. As you can easily guess, when $n = 1$, the subgroup contains only the point at infinity; when $n = N$, the subgroup contains all the points of the elliptic curve.</p>
<h2 id="finding-a-base-point">Finding a base point</h2>
<p>For our ECC algorithms, we want subgroups with a high order. So in general we will choose an elliptic curve, calculate its order ($N$), choose a high divisor as the subgroup order ($n$) and eventually find a suitable base point. That is: we won’t choose a base point and then calculate its order, but we’ll do the opposite: we will first choose an order that looks good enough and then we will hunt for a suitable base point. How do we do that?</p>
<p><span id="cofactor">Firstly, we need to introduce one more term. Lagrange’s theorem implies that the number <strong>$h = N / n$ is always an integer</strong> (because $n$ is a divisor of $N$). The number $h$ has a name: it’s the <strong>cofactor of the subgroup</strong>.</span></p>
<p>Now consider that for every point of an elliptic curve we have $NP = 0$. This happens because $N$ is a multiple of any candidate $n$. Using the definition of cofactor, we can write:
$$n(hP) = 0$$</p>
<p>Now suppose that $n$ is a prime number (for reasons that will be explained in the next post, we prefer prime orders). This equation, written in this form, is telling us that the point $G = hP$ generates a subgroup of order $n$ (except when $G = hP = 0$, in which case the subgroup has order 1).</p>
<p>In the light of this, we can outline the following algorithm:</p>
<ol>
<li>Calculate the order $N$ of the elliptic curve.</li>
<li>Choose the order $n$ of the subgroup. For the algorithm to work, this number must be prime and must be a divisor of $N$.</li>
<li>Compute the cofactor $h = N / n$.</li>
<li>Choose a random point $P$ on the curve.</li>
<li>Compute $G = hP$.</li>
<li>If $G$ is 0, then go back to step 4. Otherwise we have found a generator of a subgroup with order $n$ and cofactor $h$.</li>
</ol>
<p>Note that this algorithm only works if $n$ is a prime. If $n$ wasn’t a prime, then the order of $G$ could be one of the divisors of $n$.</p>
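<p>As a rough sketch, here is how the algorithm might look in Python on the toy curve $y^2 = x^3 - x + 3$ over $\mathbb{F}_{37}$, with $N = 42$ and $n = 7$. The function names and the naive way of picking a random point (trying random $x$ values and searching for a matching $y$) are assumptions made for illustration, not part of any real library.</p>

```python
import random

# Toy curve y^2 = x^3 - x + 3 over F_37; None is the point at infinity.
A, B, PRIME = -1, 3, 37


def add(p1, p2):
    """Add two curve points (chord-and-tangent formulas modulo PRIME)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % PRIME == 0:
        return None
    if p1 == p2:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % PRIME, -1, PRIME)
    else:
        m = (y1 - y2) * pow((x1 - x2) % PRIME, -1, PRIME)
    x3 = (m * m - x1 - x2) % PRIME
    return (x3, -(y1 + m * (x3 - x1)) % PRIME)


def mul(k, point):
    """Compute k * point with double and add."""
    result, addend = None, point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def random_point(rng):
    """Step 4: pick a random x until x^3 + ax + b has a square root."""
    while True:
        x = rng.randrange(PRIME)
        rhs = (x**3 + A * x + B) % PRIME
        roots = [y for y in range(PRIME) if y * y % PRIME == rhs]
        if roots:
            return (x, rng.choice(roots))


def find_base_point(curve_order, n, rng):
    """Steps 3-6: return G = hP != 0, which then has prime order n."""
    if curve_order % n != 0:
        raise ValueError("n must divide the curve order")
    h = curve_order // n  # step 3: the cofactor
    while True:
        g = mul(h, random_point(rng))  # steps 4 and 5
        if g is not None:  # step 6
            return g


G = find_base_point(42, 7, random.Random())
print(G, mul(7, G))  # the second value is None, i.e. 7G = 0
```

<p>The sketch does not check that $n$ is prime; as noted above, the result is only guaranteed to have order $n$ when it is.</p>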
<h1 id="discrete-logarithm">Discrete logarithm</h1>
<p>As we did when working with continuous elliptic curves, we are now going to discuss the question: <strong>if we know $P$ and $Q$, what is $k$ such that $Q = kP$?</strong></p>
<p>This problem, which is known as the <strong>discrete logarithm problem</strong> for elliptic curves, is believed to be a “hard” problem, in that there is no known polynomial time algorithm that can run on a classical computer. There are, however, no mathematical proofs for this belief.</p>
<p>This problem is also analogous to the discrete logarithm problem used with other cryptosystems such as the Digital Signature Algorithm (DSA), the Diffie-Hellman key exchange (D-H) and the ElGamal algorithm (it’s not a coincidence that they have the same name). The difference is that, with those algorithms, we use modular exponentiation instead of scalar multiplication. Their discrete logarithm problem can be stated as follows: if we know $a$ and $b$, what’s $k$ such that $b = a^k \bmod{p}$?</p>
<p>Both these problems are “discrete” because they involve finite sets (more precisely, cyclic subgroups). And they are “logarithms” because they are analogous to ordinary logarithms.</p>
<p>What makes ECC interesting is that, as of today, the discrete logarithm problem for elliptic curves seems to be “harder” than other similar problems used in cryptography. This implies that we need fewer bits for the integer $k$ in order to achieve the same level of security as with other cryptosystems, as we will see in detail in the fourth and last post of this series.</p>
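<p>To make the question concrete, here is a deliberately naive sketch that solves the discrete logarithm on the toy curve $y^2 = x^3 - x + 3$ over $\mathbb{F}_{37}$ by trying $k = 0, 1, 2, \dots$ in turn. The function names are hypothetical. The point of the example is that this exhaustive search takes a number of steps proportional to the subgroup order, which is hopeless for the orders used in real cryptography.</p>

```python
# Toy curve y^2 = x^3 - x + 3 over F_37; None is the point at infinity.
A, B, PRIME = -1, 3, 37


def add(p1, p2):
    """Add two curve points (chord-and-tangent formulas modulo PRIME)."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % PRIME == 0:
        return None
    if p1 == p2:
        m = (3 * x1 * x1 + A) * pow(2 * y1 % PRIME, -1, PRIME)
    else:
        m = (y1 - y2) * pow((x1 - x2) % PRIME, -1, PRIME)
    x3 = (m * m - x1 - x2) % PRIME
    return (x3, -(y1 + m * (x3 - x1)) % PRIME)


def discrete_log(q, p):
    """
    Return the smallest k >= 0 such that q == k * p, or None if q is
    not in the subgroup generated by p. Brute force: one addition per step.
    """
    current, k = None, 0  # current always holds k * p
    while True:
        if current == q:
            return k
        current = add(current, p)
        k += 1
        if current is None:
            return None  # wrapped around to 0 without meeting q


P = (2, 3)
Q = add(add(P, P), P)  # Q = 3P
print(Q, discrete_log(Q, P))
```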
<h1 id="more-next-week">More next week!</h1>
<p>Enough for today! I really hope you enjoyed this post. Leave a comment if you didn’t.</p>
<p>Next week’s post will be the third in this series and will be about ECC algorithms: key pair generation, ECDH and ECDSA. That will be one of the most interesting parts of this series. Don’t miss it!</p>
<p><strong><a href="https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/">Read the next post of the series »</a></strong></p>andreacorbelliniSat, 23 May 2015 14:08:00 +0000tag:andrea.corbellini.name,2015-05-23:/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/cryptographyeccmathsecurityElliptic Curve Cryptography: a gentle introductionhttps://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/<p>Those of you who know what public-key cryptography is may have already heard of <strong>ECC</strong>, <strong>ECDH</strong> or <strong>ECDSA</strong>. The first is an acronym for Elliptic Curve Cryptography, the others are names for algorithms based on it.</p>
<p>Today, we can find elliptic curve cryptosystems in <a href="https://tools.ietf.org/html/rfc4492">TLS</a>, <a href="https://tools.ietf.org/html/rfc6637">PGP</a> and <a href="https://tools.ietf.org/html/rfc5656">SSH</a>, which are just three of the main technologies on which the modern web and IT world are based. Not to mention <a href="https://en.bitcoin.it/wiki/Secp256k1">Bitcoin</a> and other cryptocurrencies.</p>
<p>Before ECC became popular, almost all public-key algorithms were based on RSA, DSA, and DH, alternative cryptosystems based on modular arithmetic. RSA and friends are still very important today, and often are used alongside ECC. However, while the magic behind RSA and friends can be easily explained, is widely understood, and <a href="http://code.activestate.com/recipes/578838-rsa-a-simple-and-easy-to-read-implementation/">rough implementations can be written quite easily</a>, the foundations of ECC are still a mystery to most.</p>
<p>With a series of blog posts I’m going to give you a gentle introduction to the world of elliptic curve cryptography. My aim is not to provide a complete and detailed guide to ECC (the web is full of information on the subject), but to provide <strong>a simple overview of what ECC is and why it is considered secure</strong>, without losing time on long mathematical proofs or boring implementation details. I will also give <strong>helpful examples together with visual interactive tools and scripts to play with</strong>.</p>
<p>Specifically, here are the topics I’ll touch:</p>
<ol>
<li><strong><a href="https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/">Elliptic curves over real numbers and the group law</a></strong> (covered in this blog post)</li>
<li><strong><a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/">Elliptic curves over finite fields and the discrete logarithm problem</a></strong></li>
<li><strong><a href="https://andrea.corbellini.name/2015/05/30/elliptic-curve-cryptography-ecdh-and-ecdsa/">Key pair generation and two ECC algorithms: ECDH and ECDSA</a></strong></li>
<li><strong><a href="https://andrea.corbellini.name/2015/06/08/elliptic-curve-cryptography-breaking-security-and-a-comparison-with-rsa/">Algorithms for breaking ECC security, and a comparison with RSA</a></strong></li>
</ol>
<p>In order to understand what’s written here, you’ll need to know some basic stuff of set theory, geometry and modular arithmetic, and have familiarity with symmetric and asymmetric cryptography. Lastly, you need to have a clear idea of what an “easy” problem is, what a “hard” problem is, and their roles in cryptography.</p>
<p>Ready? Let’s start!</p>
<h1 id="elliptic-curves">Elliptic Curves</h1>
<p>First of all: what is an elliptic curve? Wolfram MathWorld gives an excellent and complete <a href="http://mathworld.wolfram.com/EllipticCurve.html">definition</a>. But for our aims, an elliptic curve will simply be <strong>the set of points described by the equation</strong>:
$$y^2 = x^3 + ax + b$$</p>
<p>where $4a^3 + 27b^2 \ne 0$ (this is required to exclude <a href="https://en.wikipedia.org/wiki/Singularity_(mathematics)">singular curves</a>). The equation above is what is called <em>Weierstrass normal form</em> for elliptic curves.</p>
<figure>
<img src="https://andrea.corbellini.name/images/curves.png" alt="Different shapes for different elliptic curves" width="440" height="450">
<figcaption>Different shapes for different elliptic curves ($b = 1$, $a$ varying from 2 to -3).</figcaption>
</figure>
<figure>
<img src="https://andrea.corbellini.name/images/singularities.png" alt="Types of singularities" width="300" height="220">
<figcaption>Types of singularities: on the left, a curve with a cusp ($y^2 = x^3$). On the right, a curve with a self-intersection ($y^2 = x^3 - 3x + 2$). None of them is a valid elliptic curve.</figcaption>
</figure>
<p>Depending on the values of $a$ and $b$, elliptic curves may assume different shapes on the plane. As can be easily seen and verified, elliptic curves are symmetric about the $x$-axis.</p>
<p>For our aims, <strong>we will also need a <a href="https://en.wikipedia.org/wiki/Point_at_infinity">point at infinity</a></strong> (also known as ideal point) to be part of our curve. From now on, we will denote our point at infinity with the symbol 0 (zero).</p>
<p>If we want to explicitly take into account the point at infinity, we can refine our definition of elliptic curve as follows:
$$\left\{ (x, y) \in \mathbb{R}^2\ |\ y^2 = x^3 + ax + b,\ 4 a^3 + 27 b^2 \ne 0 \right\}\ \cup\ \left\{ 0 \right\}$$</p>
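<p>As a small illustration of this definition, the sketch below checks the non-singularity condition and tests whether a point satisfies the curve equation. The function names are made up for this example, and <code>None</code> is used as a stand-in for the point at infinity. The comparisons are exact, so the sketch assumes integer (or <code>Fraction</code>) coordinates; floating point values would need a tolerance.</p>

```python
def is_singular(a, b):
    """True if y^2 = x^3 + ax + b is singular, i.e. 4a^3 + 27b^2 == 0."""
    return 4 * a**3 + 27 * b**2 == 0


def is_on_curve(point, a, b):
    """True if point belongs to the curve; None is the point at infinity."""
    if point is None:
        return True
    x, y = point
    return y**2 == x**3 + a * x + b


# The two singular curves from the figure above.
print(is_singular(0, 0))   # y^2 = x^3 (cusp)
print(is_singular(-3, 2))  # y^2 = x^3 - 3x + 2 (self-intersection)

# A valid curve, y^2 = x^3 - 7x + 10, and a point on it.
print(is_singular(-7, 10), is_on_curve((1, 2), -7, 10))
```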
<h1 id="groups">Groups</h1>
<p>A group in mathematics is a set for which we have defined a binary operation that we call “addition” and indicate with the symbol +. In order for the set $\mathbb{G}$ to be a group, addition must be defined so that it respects the following four properties:</p>
<ol>
<li><strong>closure:</strong> if $a$ and $b$ are members of $\mathbb{G}$, then $a + b$ is a member of $\mathbb{G}$;</li>
<li><strong>associativity:</strong> $(a + b) + c = a + (b + c)$;</li>
<li>there exists an <strong>identity element</strong> 0 such that $a + 0 = 0 + a = a$;</li>
<li>every element has an <strong>inverse</strong>, that is: for every $a$ there exists $b$ such that $a + b = 0$.</li>
</ol>
<p>If we add a fifth requirement:</p>
<ol start="5">
<li><strong>commutativity:</strong> $a + b = b + a$,</li>
</ol>
<p>then the group is called an <em>abelian group</em>.</p>
<p>With the usual notion of addition, the set of integer numbers $\mathbb{Z}$ is a group (moreover, it’s an abelian group). The set of natural numbers $\mathbb{N}$ however is not a group, as the fourth property can’t be satisfied.</p>
<p>Groups are nice because, if we can demonstrate that those four properties hold, we get some other properties for free. For example: <strong>the identity element is unique</strong>; also the <strong>inverses are unique</strong>, that is: for every $a$ there exists only one $b$ such that $a + b = 0$ (and we can write $b$ as $-a$). Either directly or indirectly, these and other facts about groups will be very important for us later.</p>
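<p>The properties can be checked mechanically on a small finite set. The sketch below does so by brute force; since $\mathbb{Z}$ and $\mathbb{N}$ are infinite, it uses the integers modulo 6 as a finite stand-in, which is my own choice for illustration and not an example from the text. The function name is hypothetical.</p>

```python
from itertools import product


def is_abelian_group(elements, op, identity):
    """Brute-force check of the five properties on a finite set."""
    elements = list(elements)
    members = set(elements)
    pairs = list(product(elements, repeat=2))
    closure = all(op(a, b) in members for a, b in pairs)
    associativity = all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(elements, repeat=3)
    )
    has_identity = all(op(a, identity) == a == op(identity, a) for a in elements)
    has_inverses = all(
        any(op(a, b) == identity for b in elements) for a in elements
    )
    commutativity = all(op(a, b) == op(b, a) for a, b in pairs)
    return (closure and associativity and has_identity
            and has_inverses and commutativity)


# Integers modulo 6 under addition: all five properties hold.
print(is_abelian_group(range(6), lambda a, b: (a + b) % 6, 0))

# The same set under multiplication (identity 1): 0 has no inverse.
print(is_abelian_group(range(6), lambda a, b: (a * b) % 6, 1))
```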
<h1 id="group-law">The group law for elliptic curves</h1>
<p>We can define a group over elliptic curves. Specifically:</p>
<ul>
<li>the elements of the group are the points of an elliptic curve;</li>
<li>the <strong>identity element</strong> is the point at infinity 0;</li>
<li>the <strong>inverse</strong> of a point $P$ is the one symmetric about the $x$-axis;</li>
<li><strong>addition</strong> is given by the following rule: <strong>given three aligned, non-zero points $P$, $Q$ and $R$, their sum is $P + Q + R = 0$</strong>.</li>
</ul>
<figure>
<img src="https://andrea.corbellini.name/images/three-aligned-points.png" alt="Three aligned points" width="300" height="300">
<figcaption>The sum of three aligned points is 0.</figcaption>
</figure>
<p>Note that with the last rule, we only require three aligned points, and three points are aligned without respect to order. This means that, if $P$, $Q$ and $R$ are aligned, then $P + (Q + R) = Q + (P + R) = R + (P + Q) = \cdots = 0$. This way, we have intuitively proved that <strong>our + operator is both associative and commutative: we are in an abelian group</strong>.</p>
<p>So far, so great. But how do we actually compute the sum of two arbitrary points?</p>
<h1 id="geometric-addition">Geometric addition</h1>
<p>Thanks to the fact that we are in an abelian group, we can write $P + Q + R = 0$ as $P + Q = -R$. This equation, in this form, lets us derive a geometric method to compute the sum between two points $P$ and $Q$: <strong>if we draw a line passing through $P$ and $Q$, this line will intersect a third point on the curve, $R$</strong> (this is implied by the fact that $P$, $Q$ and $R$ are aligned). <strong>If we take the inverse of this point, $-R$, we have found the result of $P + Q$</strong>.</p>
<figure>
<img src="https://andrea.corbellini.name/images/point-addition.png" alt="Point addition" width="287" height="300">
<figcaption>Draw the line through $P$ and $Q$. The line intersects a third point $R$. The point symmetric to it, $-R$, is the result of $P + Q$.</figcaption>
</figure>
<p>This geometric method works but needs some refinement. Particularly, we need to answer a few questions:</p>
<ul>
<li><strong>What if $P = 0$ or $Q = 0$?</strong> Certainly, we can’t draw any line (0 is not on the $xy$-plane). But given that we have defined 0 as the identity element, $P + 0 = P$ and $0 + Q = Q$, for any $P$ and for any $Q$.</li>
<li><strong>What if $P = -Q$?</strong> In this case, the line going through the two points is vertical, and does not intersect any third point. But if $P$ is the inverse of $Q$, then we have $P + Q = P + (-P) = 0$ from the definition of inverse.</li>
<li><strong>What if $P = Q$?</strong> In this case, there are infinitely many lines passing through the point. Here things start getting a bit more complicated. But consider a point $Q' \ne P$. What happens if we make $Q'$ approach $P$, getting closer and closer to it?
<br>
<figure>
<img src="https://andrea.corbellini.name/images/animation-point-doubling.gif" width="300" height="300" alt="The result of P + Q as Q is approaching P">
<figcaption>As the two points become closer together, the line passing through them becomes tangent to the curve.</figcaption>
</figure>
<p>As $Q'$ tends towards $P$, the line passing through $P$ and $Q'$ becomes tangent to the curve. In the light of this we can say that $P + P = -R$, where $R$ is the point of intersection between the curve and the line tangent to the curve in $P$.</p></li>
<li><strong>What if $P \ne Q$, but there is no third point $R$?</strong> We are in a case very similar to the previous one. In fact, we are in the case where the line passing through $P$ and $Q$ is tangent to the curve.
<br>
<figure>
<img src="https://andrea.corbellini.name/images/animation-tangent-line.gif" alt="The result of P + Q as Q is approaching P" width="300" height="300">
<figcaption>If our line intersects just two points, then it means that it's tangent to the curve. It's easy to see how the result of the sum becomes symmetric to one of the two points.</figcaption>
</figure>
<p>Let’s assume that $P$ is the tangency point. In the previous case, we would have written $P + P = -Q$. That equation now becomes $P + Q = -P$. If, on the other hand, $Q$ were the tangency point, the correct equation would have been $P + Q = -Q$.</p></li>
</ul>
<p>The geometric method is now complete and covers all cases. With a pencil and a ruler we are able to perform addition involving every point of any elliptic curve. If you want to try, <strong>take a look at the <a href="https://andrea.corbellini.name/ecc/interactive/reals-add.html">HTML5/JavaScript visual tool</a> I’ve built for computing sums on elliptic curves!</strong></p>
<h1 id="algebraic-addition">Algebraic addition</h1>
<p>If we want a computer to perform point addition, we need to turn the geometric method into an algebraic method. Transforming the rules described above into a set of equations may seem straightforward, but actually it can be really tedious because it requires solving cubic equations. For this reason, here I will report only the results.</p>
<p>First, let’s get rid of the most annoying corner cases. We already know that $P + (-P) = 0$, and we also know that $P + 0 = 0 + P = P$. So, in our equations, we will avoid these two cases and we will only consider <strong>two non-zero, non-symmetric points $P = (x_P, y_P)$ and $Q = (x_Q, y_Q)$</strong>.</p>
<p><strong>If $P$ and $Q$ are distinct</strong> ($x_P \ne x_Q$), the line through them has <strong>slope</strong>:
$$m = \frac{y_P - y_Q}{x_P - x_Q}$$</p>
<p>The <strong>intersection</strong> of this line with the elliptic curve is a third point $R = (x_R, y_R)$:
$$\begin{align*}
x_R & = m^2 - x_P - x_Q \\
y_R & = y_P + m(x_R - x_P)
\end{align*}$$</p>
<p>or, equivalently:
$$y_R = y_Q + m(x_R - x_Q)$$</p>
<p>Hence $(x_P, y_P) + (x_Q, y_Q) = (x_R, -y_R)$ (pay attention to the signs and remember that $P + Q = -R$).</p>
<p>If we wanted to check whether this result is right, we would have to check whether $R$ belongs to the curve and whether $P$, $Q$ and $R$ are aligned. Checking whether the points are aligned is trivial; checking that $R$ belongs to the curve is not, as we would need to solve a cubic equation, which is not fun at all.</p>
<p>Instead, let’s play with an example: according to our <a href="https://andrea.corbellini.name/ecc/interactive/reals-add.html">visual tool</a>, given $P = (1, 2)$ and $Q = (3, 4)$ over the curve $y^2 = x^3 - 7x + 10$, their sum is $P + Q = -R = (-3, 2)$. Let’s see if our equations agree:
$$\begin{align*}
m & = \frac{y_P - y_Q}{x_P - x_Q} = \frac{2 - 4}{1 - 3} = 1 \\
x_R & = m^2 - x_P - x_Q = 1^2 - 1 - 3 = -3 \\
y_R & = y_P + m(x_R - x_P) = 2 + 1 \cdot (-3 - 1) = -2 \\
& = y_Q + m(x_R - x_Q) = 4 + 1 \cdot (-3 - 3) = -2
\end{align*}$$</p>
<p>Yes, this is correct!</p>
<p>Note that these equations work even if <strong>one of $P$ or $Q$ is a tangency point</strong>. Let’s try with $P = (-1, 4)$ and $Q = (1, 2)$.
$$\begin{align*}
m & = \frac{y_P - y_Q}{x_P - x_Q} = \frac{4 - 2}{-1 - 1} = -1 \\
x_R & = m^2 - x_P - x_Q = (-1)^2 - (-1) - 1 = 1 \\
y_R & = y_P + m(x_R - x_P) = 4 + (-1) \cdot (1 - (-1)) = 2
\end{align*}$$</p>
<p>We get the result $P + Q = (1, -2)$, which is the same result given by the <a href="https://andrea.corbellini.name/ecc/interactive/reals-add.html?px=-1&py=4&qx=1&qy=2">visual tool</a>.</p>
<p><strong>The case $P = Q$ needs to be treated a bit differently</strong>: the equations for $x_R$ and $y_R$ are the same, but given that $x_P = x_Q$, we must use a different equation for the <strong>slope</strong>:
$$m = \frac{3 x_P^2 + a}{2 y_P}$$</p>
<p>Note that, as we would expect, this expression for $m$ is the first derivative of:
$$y_P = \pm \sqrt{x_P^3 + ax_P + b}$$</p>
<p>To prove the validity of this result it is enough to check that $R$ belongs to the curve and that the line passing through $P$ and $R$ has only two intersections with the curve. But again, we don’t prove this fact, and instead try with an example: $P = Q = (1, 2)$.
$$\begin{align*}
m & = \frac{3x_P^2 + a}{2 y_P} = \frac{3 \cdot 1^2 - 7}{2 \cdot 2} = -1 \\
x_R & = m^2 - x_P - x_Q = (-1)^2 - 1 - 1 = -1 \\
y_R & = y_P + m(x_R - x_P) = 2 + (-1) \cdot (-1 - 1) = 4
\end{align*}$$</p>
<p>Which gives us $P + P = -R = (-1, -4)$. <a href="https://andrea.corbellini.name/ecc/interactive/reals-add.html?px=1&py=2&qx=1&qy=2">Correct</a>!</p>
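<p>The equations above translate almost literally into code. The following is a minimal sketch, assuming rational coordinates so that <code>fractions.Fraction</code> keeps the arithmetic exact. The function name <code>add_points</code> and the use of <code>None</code> for the point at infinity are my own conventions. It reproduces the three worked examples on the curve $y^2 = x^3 - 7x + 10$.</p>

```python
from fractions import Fraction


def add_points(p, q, a):
    """
    Add two points of y^2 = x^3 + ax + b (b is not needed by the formulas).
    None represents the point at infinity.
    """
    if p is None:
        return q
    if q is None:
        return p
    (xp, yp), (xq, yq) = p, q
    if xp == xq and yp == -yq:
        return None  # P + (-P) = 0 (this also covers doubling when y = 0)
    if p == q:
        m = Fraction(3 * xp**2 + a, 2 * yp)  # slope of the tangent
    else:
        m = Fraction(yp - yq, xp - xq)  # slope of the chord
    xr = m**2 - xp - xq
    yr = yp + m * (xr - xp)
    return (xr, -yr)  # P + Q = -R


a = -7
print(add_points((1, 2), (3, 4), a))   # two distinct points
print(add_points((-1, 4), (1, 2), a))  # Q is a tangency point
print(add_points((1, 2), (1, 2), a))   # point doubling
```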
<p>Although the procedure to derive them can be really tedious, our equations are pretty compact. This is thanks to Weierstrass normal form: without it, these equations could have been really long and complicated!</p>
<h1 id="scalar-multiplication">Scalar multiplication</h1>
<p>Other than addition, we can define another operation: <strong>scalar multiplication</strong>, that is:
$$nP = \underbrace{P + P + \cdots + P}_{n\ \text{times}}$$</p>
<p>where $n$ is a natural number. I’ve written a <strong><a href="https://andrea.corbellini.name/ecc/interactive/reals-mul.html">visual tool</a> for scalar multiplication</strong> too, if you want to play with that.</p>
<p>Written in that form, it may seem that computing $nP$ requires $n$ additions. If $n$ has $k$ binary digits, then our algorithm would be $O(2^k)$, which is not really good. But there exist faster algorithms.</p>
<p>One of them is the <span id="double-and-add"><strong>double and add</strong></span> algorithm. Its principle of operation can be better explained with an example. Take $n = 151$. Its binary representation is $10010111_2$. This binary representation can be turned into a sum of powers of two:
$$\begin{align*}
151 & = 1 \cdot 2^7 + 0 \cdot 2^6 + 0 \cdot 2^5 + 1 \cdot 2^4 + 0 \cdot 2^3 + 1 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 \\
& = 2^7 + 2^4 + 2^2 + 2^1 + 2^0
\end{align*}$$</p>
<p>(We have taken each binary digit of $n$ and multiplied it by a power of two.)</p>
<p>In view of this, we can write:
$$151 \cdot P = 2^7 P + 2^4 P + 2^2 P + 2^1 P + 2^0 P$$</p>
<p>What the double and add algorithm tells us to do is:</p>
<ul>
<li>Take $P$.</li>
<li><em>Double</em> it, so that we get $2P$.</li>
<li><em>Add</em> $2P$ to $P$ (in order to get the result of $2^1P + 2^0P$).</li>
<li><em>Double</em> $2P$, so that we get $2^2P$.</li>
<li><em>Add</em> it to our result (so that we get $2^2P + 2^1P + 2^0P$).</li>
<li><em>Double</em> $2^2P$ to get $2^3P$.</li>
<li>Don’t perform any addition involving $2^3P$.</li>
<li><em>Double</em> $2^3P$ to get $2^4P$.</li>
<li><em>Add</em> it to our result (so that we get $2^4P + 2^2P + 2^1P + 2^0P$).</li>
<li>…</li>
</ul>
<p>In the end, we can compute $151 \cdot P$ performing just seven doublings and four additions.</p>
<p>If this is not clear enough, here’s a Python script that implements the algorithm:</p>
<div class="highlight"><pre><span></span><code><span class="k">def</span> <span class="nf">bits</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Generates the binary digits of n, starting</span>
<span class="sd">    from the least significant bit.</span>
<span class="sd">    bits(151) -> 1, 1, 1, 0, 1, 0, 0, 1</span>
<span class="sd">    """</span>
    <span class="k">while</span> <span class="n">n</span><span class="p">:</span>
        <span class="k">yield</span> <span class="n">n</span> <span class="o">&</span> <span class="mi">1</span>
        <span class="n">n</span> <span class="o">>>=</span> <span class="mi">1</span>


<span class="k">def</span> <span class="nf">double_and_add</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
<span class="w">    </span><span class="sd">"""</span>
<span class="sd">    Returns the result of n * x, computed using</span>
<span class="sd">    the double and add algorithm.</span>
<span class="sd">    """</span>
    <span class="n">result</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="n">addend</span> <span class="o">=</span> <span class="n">x</span>

    <span class="k">for</span> <span class="n">bit</span> <span class="ow">in</span> <span class="n">bits</span><span class="p">(</span><span class="n">n</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">bit</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
            <span class="n">result</span> <span class="o">+=</span> <span class="n">addend</span>
        <span class="n">addend</span> <span class="o">*=</span> <span class="mi">2</span>

    <span class="k">return</span> <span class="n">result</span>
</code></pre></div>
<p>If doubling and adding are both $O(1)$ operations, then <strong>this algorithm is $O(\log n)$</strong> (or $O(k)$ if we consider the bit length), which is pretty good. Surely much better than the initial $O(n)$ algorithm!</p>
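<p>To connect the complexity claim with the $n = 151$ example, here is a small sketch that counts the operations strictly needed by double and add. The function name is hypothetical. It counts one doubling for every bit after the first and one addition for every set bit after the first. The script above, as written, performs one extra doubling after the last bit and one extra addition to the initial 0, which does not change the asymptotic cost.</p>

```python
def count_operations(n):
    """
    Return (doublings, additions) needed to compute n * P with
    double and add, for a positive integer n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    doublings = n.bit_length() - 1      # reach the highest power of two
    additions = bin(n).count("1") - 1   # combine the selected powers
    return doublings, additions


print(count_operations(151))  # (7, 4), as in the example above
```

<p>Both counts are bounded by the bit length of $n$, which is where the $O(\log n)$ bound comes from.</p>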
<h1 id="logarithm">Logarithm</h1>
<p>Given $n$ and $P$, we now have at least one polynomial time algorithm for computing $Q = nP$. But what about the other way round? <strong>What if we know $Q$ and $P$ and need to find out $n$</strong>? This problem is known as the <strong>logarithm problem</strong>. We call it “logarithm” instead of “division” for conformity with other cryptosystems (where instead of multiplication we have exponentiation).</p>
<p>I don’t know of any “easy” algorithm for the logarithm problem; however, <a href="https://andrea.corbellini.name/ecc/interactive/reals-mul.html?a=-3&b=1&px=0&py=1">playing with multiplication</a>, it’s easy to see some patterns. For example, take the curve $y^2 = x^3 - 3x + 1$ and the point $P = (0, 1)$. We can immediately verify that, if $n$ is odd, $nP$ is on the curve in the left half-plane; if $n$ is even, $nP$ is on the curve in the right half-plane. If we experimented more, we could probably find more patterns that eventually could lead us to write an algorithm for computing the logarithm on that curve efficiently.</p>
<p>But there’s a variant of the logarithm problem: the <em>discrete</em> logarithm problem. As we will see in the next post, if we reduce the domain of our elliptic curves, <strong>scalar multiplication remains “easy”, while the discrete logarithm becomes a “hard” problem</strong>. This duality is the key brick of elliptic curve cryptography.</p>
<h1 id="see-you-next-week">See you next week</h1>
<p>That’s all for today, I hope you enjoyed this post! Next week we will discover <strong>finite fields</strong> and the <strong><em>discrete</em> logarithm problem</strong>, along with examples and tools to play with. If this stuff sounds interesting to you, then stay tuned!</p>
<p><strong><a href="https://andrea.corbellini.name/2015/05/23/elliptic-curve-cryptography-finite-fields-and-discrete-logarithms/">Read the next post of the series »</a></strong></p>andreacorbelliniSun, 17 May 2015 11:24:00 +0000tag:andrea.corbellini.name,2015-05-17:/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/cryptographybitcoindhdsaeccmathpgprsasecuritysshtlswebLet's Encrypt: the road towards a better web?https://andrea.corbellini.name/2015/04/12/lets-encrypt-the-road-towards-a-better-web/<p>I’ve always dreamed of an encrypted web, where HTTPS is the standard and plain HTTP is no more. A web where eavesdropping or manipulating information is not possible, or at least much harder than today.</p>
<p>I remember that I got excited when I first heard of <strong><a href="http://www.cacert.org/">CAcert</a>: “a community-driven Certificate Authority that issues certificates to the public at large for free”</strong>. Unfortunately, CAcert’s root certificate never made it into the major web browsers and operating systems. Whatever the reasons, the result is that visiting an HTTPS website with a certificate released by CAcert produces nothing but a <a href="https://cacert.org/">scary warning with a call to leave the site</a>, making CAcert unsuitable for most.</p>
<p><a href="https://www.startssl.com/">StartCom</a>, on the other hand, has made it into the major browsers. But although its certificates are released for free, it has never become very widespread. Also, StartCom <a href="https://news.ycombinator.com/item?id=7557764">has</a> <a href="https://www.techdirt.com/articles/20140409/11442426859/shameful-security-startcom-charges-people-to-revoke-ssl-certs-vulnerable-to-heartbleed.shtml">been</a> <a href="https://twitter.com/startssl/status/453631038883758080">heavily</a> <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=994033">criticized</a> for how the Heartbleed vulnerability was handled, and AFAIK this has driven many customers away.</p>
<h1 id="lets-encrypt">Let’s Encrypt</h1>
<p>Recently, I learned about <strong><a href="https://letsencrypt.org/">Let’s Encrypt</a>: a “free, automated, and open” Certificate Authority</strong> arriving in mid-2015. There are many important facts that make Let’s Encrypt different from, and better than, all the other Certificate Authorities out there. I’ll let you discover all of them. Probably, the most important fact is that Let’s Encrypt has <strong><a href="https://letsencrypt.org/sponsors/">important sponsors</a>, including Mozilla</strong>. And this is what matters today, because it gives Let’s Encrypt a chance to be included in at least one major browser.</p>
<figure>
<a href="https://letsencrypt.org/"><img src="https://andrea.corbellini.name/images/letsencrypt-logo-horizontal.png" alt="Let's Encrypt" width="519" height="124"></a>
<figcaption>Let's Encrypt logo.</figcaption>
</figure>
<p>Another interesting fact about Let’s Encrypt is that its <strong>certificates are released in <a href="https://letsencrypt.org/howitworks/technology/">a way that is both secure and automated</a> at the same time</strong>. This gives the opportunity for other (potential) Certificate Authorities to adopt the same automated system.</p>
<p>If Let’s Encrypt wins, then everyone will have an easy way to obtain a free HTTPS certificate for their website. The next big step would be making Let’s Encrypt increase in adoption and the final step would be deprecating plain HTTP. There are however a few open questions:</p>
<ul>
<li>What will be the answer from Google, Apple, Microsoft and other major browser and operating system makers?</li>
<li>What will be the reaction of Verisign and Comodo? (That together hold <a href="http://w3techs.com/technologies/overview/ssl_certificate/all">more than 50%</a> of all the certificates currently used on the web.)</li>
<li>Will they declare war on Let’s Encrypt or will they consolidate their efforts on customer services and Extended Validation?</li>
<li>Will the technology behind Let’s Encrypt allow the creation of a new model for certificate management? Will we see web servers and providers with built-in support for it?</li>
</ul>
<p>I do not have an answer to these questions; time will tell. However, I really hope my dream becomes a reality soon. If you, like me, want Let’s Encrypt to be a success, then please <strong>share and discuss</strong> it. Perhaps, one day, we will find ourselves teaching juniors that HTTPS has not always been the standard… :)</p>andreacorbelliniSun, 12 Apr 2015 16:07:00 +0000tag:andrea.corbellini.name,2015-04-12:/2015/04/12/lets-encrypt-the-road-towards-a-better-web/information-technologysecuritytlsweblet's encryptRunning Ubuntu Snappy inside Dockerhttps://andrea.corbellini.name/2015/03/25/running-ubuntu-snappy-inside-docker/<p>Many of you may have already heard of <a href="https://developer.ubuntu.com/en/snappy/">Ubuntu Core</a>. For those who haven’t, it’s a minimal Ubuntu version that runs only a few essential services and ships with a new package manager (snappy) that provides <em>transactional</em> updates. Ubuntu Core provides a lightweight base operating system which is fast to deploy and easy to keep up to date. It also uses a nice <a href="https://wiki.ubuntu.com/SecurityTeam/Specifications/SnappyConfinement">security model</a>.</p>
<p>All these characteristics make it particularly appealing for the cloud. And, in fact, people are starting to consider it for building their (micro)services architectures. Some weeks ago, a user on Ask Ubuntu asked: <a href="http://askubuntu.com/questions/566736/can-i-run-snappy-ubuntu-core-as-a-guest-inside-docker/577248">Can I run Snappy Ubuntu Core as a guest inside Docker?</a> The problem is that Ubuntu Core does not ship with an official Docker image that we can pull, so we are forced to set it up manually. Here’s how.</p>
<h1 id="creating-the-docker-image">Creating the Docker image</h1>
<h2 id="step-1-get-the-latest-ubuntu-core">Step 1: get the latest Ubuntu Core</h2>
<p>As of writing, the latest Ubuntu Core image is alpha 3 and can be downloaded with:</p>
<div class="highlight"><pre><span></span><code><span class="gp">$ </span>wget<span class="w"> </span>http://cdimage.ubuntu.com/ubuntu-core/releases/alpha-3/ubuntu-core-WEBDM-alpha-03_amd64-generic.img.xz
</code></pre></div>
<p>(If you browse to <a href="http://cdimage.ubuntu.com/ubuntu-core/releases/alpha-3/">cdimage.ubuntu.com</a>, you can also find the signed hashsums.)</p>
<p>The downloaded image is XZ-compressed and we need to extract it:</p>
<div class="highlight"><pre><span></span><code><span class="gp">$ </span>unxz<span class="w"> </span>ubuntu-core-WEBDM-alpha-03_amd64-generic.img.xz
</code></pre></div>
<h2 id="step-2-connect-the-image-using-qemu-nbd">Step 2: connect the image using qemu-nbd</h2>
<p>The file we have just downloaded and extracted is a filesystem dump. The previous version of the image (Alpha 2) was a QCOW2 image (the format used by QEMU). In order to access its contents, we have a few options. Here I’ll show one that works with both filesystem dumps and QCOW2 images. The trick consists in using <code>qemu-nbd</code> (a tool from the <a href="https://apps.ubuntu.com/cat/applications/qemu-utils/">qemu-utils</a> package):</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>qemu-nbd<span class="w"> </span>-rc<span class="w"> </span>/dev/nbd0<span class="w"> </span>ubuntu-core-WEBDM-alpha-03_amd64-generic.img
</code></pre></div>
<p>This command will create a virtual device named <code>/dev/nbd0</code>, with virtual partitions named <code>/dev/nbd0p1</code>, <code>/dev/nbd0p2</code>, … Use <code>fdisk -l /dev/nbd0</code> to get an idea of what partitions are inside the image.</p>
<h2 id="step-3-mount-the-filesystem">Step 3: mount the filesystem</h2>
<p>The partition we are interested in is <code>/dev/nbd0p3</code>, so we need to mount it:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>mkdir<span class="w"> </span>nbd0p3
<span class="gp"># </span>mount<span class="w"> </span>-r<span class="w"> </span>/dev/nbd0p3<span class="w"> </span>nbd0p3
</code></pre></div>
<h2 id="step-4-create-a-base-docker-image">Step 4: create a base Docker image</h2>
<p>As suggested in the <a href="https://docs.docker.com/articles/baseimages/">Docker documentation</a>, creating a base Docker image from a directory is pretty straightforward:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>tar<span class="w"> </span>-C<span class="w"> </span>nbd0p3<span class="w"> </span>-c<span class="w"> </span>.<span class="w"> </span><span class="p">|</span><span class="w"> </span>docker<span class="w"> </span>import<span class="w"> </span>-<span class="w"> </span>ubuntu-core:alpha-3
</code></pre></div>
<p>Our newly created image will now appear when running <code>docker images</code>:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>images
<span class="go">REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE</span>
<span class="go">ubuntu-core alpha-3 f6df3c0e2d74 5 seconds ago 543.5 MB</span>
</code></pre></div>
<p>Let’s verify if we did a good job:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>run<span class="w"> </span>ubuntu-core:alpha-3<span class="w"> </span>snappy
<span class="go">Usage:snappy [-h] [-v]</span>
<span class="go"> {info,versions,search,update-versions,update,rollback,install,uninstall,tags,config,build,booted,chroot,framework,fake-version,nap}</span>
<span class="go"> ...</span>
</code></pre></div>
<p>Yes! We have successfully added Ubuntu Core to the available Docker images and we have run our first snappy container!</p>
<h1 id="installing-and-running-software">Installing and running software</h1>
<p>Without wasting too many words, here’s how to install and run the <code>xkcd-webserver</code> snappy package inside docker:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>run<span class="w"> </span>-p<span class="w"> </span><span class="m">8000</span>:80<span class="w"> </span>ubuntu-core:alpha-3<span class="w"> </span>/bin/sh<span class="w"> </span>-c<span class="w"> </span><span class="s1">'snappy install xkcd-webserver && cd /apps/xkcd-webserver/0.3.1 && ./bin/xkcd-webserver'</span>
<span class="go">WARN: AppArmor not available when processing AppArmor hook</span>
<span class="go">Failed to get D-Bus connection: Operation not permitted</span>
<span class="go">Failed to get D-Bus connection: Operation not permitted</span>
<span class="go">** (process:13): WARNING **: user.vala:637: Can not connect to logind</span>
<span class="go">xkcd-webserver 21 kB [======================================] OK</span>
<span class="go">WARNING: failed to connect to dbus: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory</span>
<span class="go">Part Tag Installed Available Fingerprint Active</span>
<span class="go">xkcd-webserver edge 0.3.1 - 3a9152b8bff494 *</span>
</code></pre></div>
<p>Now, if you visit http://localhost:8000/ you should see a random XKCD comic.</p>
<p>If you have paid attention, you may have noticed a few warnings about AppArmor, DBus and logind. The reason why you are seeing these warnings is pretty simple: we started neither AppArmor nor DBus nor logind. Now, generally speaking, we could run init inside Docker and fix these and other warnings. However, that’s not what Docker is meant for. So if you want to run AppArmor or similar stuff <em>from inside</em> Docker or LXC, then you should probably consider virtualization.</p>
<h1 id="dockerfile">Dockerfile</h1>
<p>Once you have created the base Docker image, you can start creating some <code>Dockerfile</code>s, if you need to. Here’s an example:</p>
<div class="highlight"><pre><span></span><code><span class="k">FROM</span><span class="w"> </span><span class="s">ubuntu-core:alpha-3</span>
<span class="k">RUN</span><span class="w"> </span>snappy<span class="w"> </span>install<span class="w"> </span>xkcd-webserver
<span class="k">EXPOSE</span><span class="w"> </span><span class="s">80</span>
<span class="k">CMD</span><span class="w"> </span><span class="nb">cd</span><span class="w"> </span>/apps/xkcd-webserver/0.3.1<span class="w"> </span><span class="o">&&</span><span class="w"> </span>./bin/xkcd-webserver
</code></pre></div>
<p>This <code>Dockerfile</code> does the same job as the previous command: it installs <code>xkcd-webserver</code> and runs it on port 80 inside the container. In order to use it, first build it:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>build<span class="w"> </span>-t<span class="w"> </span>xkcd-webserver<span class="w"> </span>.
</code></pre></div>
<p>Check that it has been correctly built:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>images
<span class="go">REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE</span>
<span class="go">xkcd-webserver latest 260e0116e9e3 3 minutes ago 543.5 MB</span>
<span class="go">ubuntu-core alpha-3 f6df3c0e2d74 About an hour ago 543.5 MB</span>
</code></pre></div>
<p>Then run it, publishing the container’s port 80 on port 8000 of the host:</p>
<div class="highlight"><pre><span></span><code><span class="gp"># </span>docker<span class="w"> </span>run<span class="w"> </span>-p<span class="w"> </span><span class="m">8000</span>:80<span class="w"> </span>xkcd-webserver
</code></pre></div>
<p>Again, you should see a random XKCD comic on <a href="http://localhost:8000/">http://localhost:8000/</a>.</p>
<h1 id="conclusion">Conclusion</h1>
<p>That’s all folks! I hope you enjoyed this tiny guide, and if you need help, please ask a question on Ask Ubuntu with the <a href="http://askubuntu.com/questions/tagged/ubuntu-core">ubuntu-core tag</a>, which I’m subscribed to.</p>andreacorbelliniWed, 25 Mar 2015 20:46:00 +0000tag:andrea.corbellini.name,2015-03-25:/2015/03/25/running-ubuntu-snappy-inside-docker/clouddockersnappyubuntuubuntu corexkcdAre LXC and Docker secure?https://andrea.corbellini.name/2015/02/20/are-lxc-and-docker-secure/<p>Since its initial release in 2008, LXC has become widespread among servers. Today, it is becoming the preferred deployment strategy in many contexts, also thanks to Docker and, more recently, LXD.</p>
<p>LXC and Docker are used not only to achieve modular architecture design, but also as a way to run untrusted code in an isolated environment.</p>
<p>We can agree that the LXC and Docker ecosystems are great and work well, but there’s an important question that I believe everyone should ask, but too few people are asking: <strong>are LXC and Docker secure?</strong></p>
<figure>
<img src="https://andrea.corbellini.name/images/broken-chain.jpg" alt="Broken Chain">
<figcaption>A system is as safe as its weakest component.</figcaption>
</figure>
<p>In order to answer this question, I won’t go deep into the details of what LXC and Docker are. The web is full of information on <a href="http://en.wikipedia.org/wiki/Cgroups#NAMESPACE-ISOLATION">namespaces</a> and <a href="http://en.wikipedia.org/wiki/Cgroups">cgroups</a>. Rather, I’d like to show what LXC and Docker can do, what they cannot do, and what their default configuration allows them to do. My hope is to provide a quick checklist for those who want to go with LXC/Docker, but are unsure about what they need to pay attention to.</p>
<h1 id="what-lxc-and-docker-can-do">What LXC and Docker can do</h1>
<p>As we all know, LXC confines processes mainly thanks to two Linux kernel features: namespaces and cgroups. These provide ways to control and limit access to resources such as memory or the filesystem. So, for example, you can limit the bandwidth used by processes inside a container, you can lower their CPU scheduling priority, and so on.</p>
<p>As is well known, processes inside an LXC guest cannot:</p>
<ul>
<li>directly interact with the host processes, or with other LXC containers;</li>
<li>access the root filesystem, unless configured otherwise;</li>
<li>access special devices (block devices, network interfaces, …), unless configured otherwise;</li>
<li>mount arbitrary filesystems;</li>
<li>execute special <code>ioctl</code>s, special syscalls or special interrupts that would affect the behavior of the host.</li>
</ul>
<p>And at the same time, processes inside an LXC guest can find an environment that is perfectly suitable to run a working operating system: I can run init, I can read from <code>/proc</code>, I can access the internet.</p>
<p>This is most of what LXC can do, and it’s also what you get by default. Docker (when used with the LXC backend) is a wrapper around LXC that provides utilities for easy deployment and management of the containers, so <strong>everything that applies to LXC, applies to Docker too</strong>.</p>
<p>If this sounds great, then beware that there are things you should know…</p>
<h1 id="you-need-a-security-context">You need a security context</h1>
<p>LXC is somewhat incomplete. What I mean is that some parts of special filesystems like procfs or sysfs are not faked. For example, as of now, I can successfully change the value of the host’s <code>/proc/sys/kernel/panic</code> or <code>/sys/class/thermal/cooling_device0/cur_state</code> from inside a guest.</p>
<p>The reason why LXC is “incomplete” doesn’t really matter (it’s actually the kernel that is incomplete, but anyhow…). What matters is that certain nasty actions can be forbidden, not by LXC itself, but by an AppArmor/SELinux profile that blocks read and write access to certain <code>/proc</code> and <code>/sys</code> components. The AppArmor rules have shipped in Ubuntu since 12.10 (Quantal), and have been included upstream since early 2014, together with the SELinux rules.</p>
<p>Therefore, <strong>a security context like AppArmor or SELinux is required to run LXC safely</strong>. Without it, the root user inside a guest can take control of the host.</p>
<p>Check that AppArmor or SELinux is running and configured properly. If you want to go with Grsecurity, then remember to configure it manually.</p>
<h1 id="limit-resource-consumption">Limit resource consumption</h1>
<p>LXC offers ways to limit resource usage, but no special restrictions are put in place by default. <strong>You have to configure them by yourself.</strong></p>
<p>With the default configuration, I can run fork bombs, request huge memory maps, keep all CPUs busy, and generate heavy I/O load. All of this without special privileges. Remember this when running untrusted code.</p>
<figure>
<img src="https://andrea.corbellini.name/images/memory-usage.png" alt="Uncontrolled memory consumption">
</figure>
<p>To limit resource consumption in LXC, open the configuration file for your container and set the <code>lxc.cgroup.&lt;system&gt;</code> values you need.</p>
<p>For example, if you want to limit the container memory usage to 512 MiB, set <code>lxc.cgroup.memory.limit_in_bytes = 512M</code>. Note that, with only that option set, a container that exceeds the 512 MiB cap will start using swap without limits. If this is not what you want, then also set <code>lxc.cgroup.memory.memsw.limit_in_bytes = 512M</code>, which caps memory and swap combined. Note that to use both options you may need to add <code>cgroup_enable=memory</code> and <code>swapaccount=1</code> to the kernel command line.</p>
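<p>As a minimal sketch, the relevant lines of a container configuration file could look like the following. The file path and the container name <code>mycontainer</code> are hypothetical, and the keys assume LXC 1.x on a cgroup v1 kernel; <code>memory.memsw.limit_in_bytes</code> is the cgroup v1 knob that limits memory plus swap.</p>
<div class="highlight"><pre><span></span><code># /var/lib/lxc/mycontainer/config (hypothetical path and container name)
# Cap RAM usage at 512 MiB.
lxc.cgroup.memory.limit_in_bytes = 512M
# Cap RAM + swap at 512 MiB as well, so the guest cannot spill into swap.
lxc.cgroup.memory.memsw.limit_in_bytes = 512M
</code></pre></div>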
<p>To have an overview of all possible options, check out <a href="https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/ch-Subsystems_and_Tunable_Parameters.html">Red Hat’s documentation</a> or the <a href="https://www.kernel.org/doc/Documentation/cgroups/">Kernel documentation</a>.</p>
<p>With Docker, the story is similar: just use <code>--lxc-conf</code> from the command line to set LXC’s options.</p>
<h1 id="limit-disk-usage">Limit disk usage</h1>
<p>Something that LXC cannot do is limit mass storage usage. Luckily, <strong><a href="https://www.stgraber.org/2013/12/27/lxc-1-0-container-storage/">LXC integrates nicely with LVM</a></strong> (and btrfs, and zfs, and overlayfs), and you can use that to easily limit disk usage. You can, for example, create a logical volume for each of your guests, and give that volume a limited size, so that space usage inside a guest cannot grow indefinitely.</p>
<p>The same <a href="http://developerblog.redhat.com/2014/09/30/overview-storage-scalability-docker/">holds for Docker</a>.</p>
<h1 id="pay-attention-at-devrandom">Pay attention to <code>/dev/random</code></h1>
<p><strong>Processes inside LXC guests</strong>, by default, can read from <code>/dev/random</code> and <strong>can consume the entropy of the host</strong>. This may cause trouble if you need large amounts of randomness (to generate keys or whatever).</p>
<p>If this is something that you don’t want, then configure LXC so that it <a href="https://wiki.archlinux.org/index.php/Linux_Containers#Cgroups_device_configuration">denies access to the character devices</a> <code>1:8</code> (random) and <code>1:9</code> (urandom). Denying access to the path <code>/dev/random</code> is not enough, as <code>mknod</code> is allowed inside guests.</p>
<p>Note however that doing so may break many applications inside the LXC guest that need randomness. Maybe consider using a different machine for processes that require randomness for security purposes.</p>
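<p>A sketch of what such a device restriction might look like in the container configuration, assuming LXC 1.x with the cgroup v1 devices controller. If your distribution’s default configuration explicitly allows these devices in an included file, the deny lines presumably need to come after that include to take effect.</p>
<div class="highlight"><pre><span></span><code># Deny read, write and mknod on the character devices 1:8 and 1:9.
lxc.cgroup.devices.deny = c 1:8 rwm
lxc.cgroup.devices.deny = c 1:9 rwm
</code></pre></div>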
<h1 id="use-unprivileged-containers">Use unprivileged containers</h1>
<p><strong>Containers can be <a href="https://www.stgraber.org/2014/01/17/lxc-1-0-unprivileged-containers/">run from an unprivileged user</a></strong>. This means that UID 0 of the guest does not map to UID 0 of the host, and many potential security holes simply can’t be exploited. Unfortunately, <a href="https://github.com/docker/docker/issues/2918">Docker has no support for unprivileged containers</a> yet.</p>
<p>However, if Docker is not a requirement and you can do well with LXC, start experimenting with unprivileged containers and consider using them in production.</p>
<p>Programs like Apache will complain that they are unable to change their ulimit (because setting the ulimit is a privilege of the real root user). If you need to run programs that require special privileges, either configure them so that they do not complain, or consider using <a href="http://linux.die.net/man/7/capabilities">capabilities</a> (but do not abuse them, and be cautious, or you risk introducing more problems than the ones you are trying to solve!)</p>
<h1 id="conclusion">Conclusion</h1>
<p>LXC, Docker and the entire ecosystem around them can be considered quite mature and stable. They’re surely production ready, and, if the right configuration is put in place, it can be pretty difficult to cause trouble to the host.</p>
<p>However, whether they can be considered secure or not is up to you: <strong>what are you using containers for? Who are you giving access to? What privileges are you giving, what actions are you restricting?</strong></p>
<p>Always remember what LXC and Docker do by default, and what they do not do, especially when you use them to run untrusted code. Those that I have listed may only be a few of the problems that LXC, Docker and friends may expose. Remember to carefully review your configuration before opening the doors to others.</p>
<h1 id="further-reading">Further reading</h1>
<p>If you liked this article, you’ll find these ones interesting too:</p>
<ul>
<li><a href="http://blog.docker.com/2013/08/containers-docker-how-secure-are-they/">Containers & Docker: how secure are they?</a>, from the Docker blog.</li>
<li>Stéphane Graber’s <a href="https://www.stgraber.org/2014/01/01/lxc-1-0-security-features/">Security features</a> from his <a href="https://www.stgraber.org/2013/12/20/lxc-1-0-blog-post-series/">LXC 1.0: Blog post series</a>.</li>
</ul>andreacorbelliniFri, 20 Feb 2015 16:36:00 +0000tag:andrea.corbellini.name,2015-02-20:/2015/02/20/are-lxc-and-docker-secure/clouddockerlxcsecurityPrime numbers and universe factorieshttps://andrea.corbellini.name/2015/02/15/prime-numbers-and-universe-factories/<p>I’m an XKCD fan, and I read it regularly. There’s a comic that I particularly enjoyed: <a href="http://xkcd.com/10/">Pi Equals</a>.</p>
<figure>
<a href="http://xkcd.com/10/"><img src="http://imgs.xkcd.com/comics/pi.jpg" width="469" height="247" alt="Pi Equals"></a>
<figcaption>The comic <a href="http://xkcd.com/10/" title="Pi Equals">Pi Equals</a>, from XKCD.com (CC-BY-NC 2.5).</figcaption>
</figure>
<p>Well, it appears that Randall was right in that there’s a help message hidden somewhere. And I just found it in a prime number:</p>
<div class="highlight"><pre><span></span><code>245178888024581899558766786108789912235672909204719666025638877624752119760547413887830514281649480308707369249
</code></pre></div>
<p>That number corresponds to the ASCII encoding of this message:</p>
<div class="highlight"><pre><span></span><code>help!! i'm trapped in a universe factory!!!!!!
</code></pre></div>
<p>Apparently, universe factory workers speak English and write ASCII. Nice coincidence, huh?</p>
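<p>For the curious, here is a minimal Python sketch of the mapping between text and numbers. It assumes the bytes are read as one big-endian unsigned integer, which the leading digits of the number above are consistent with; the function names are made up for this example.</p>
<div class="highlight"><pre><span></span><code>def text_to_int(text):
    """Interpret the ASCII bytes of text as one big-endian unsigned integer."""
    return int.from_bytes(text.encode("ascii"), "big")


def int_to_text(number):
    """Inverse of text_to_int: turn the integer back into its ASCII string."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("ascii")


message = "help!! i'm trapped in a universe factory!!!!!!"
print(text_to_int(message))
print(int_to_text(text_to_int(message)))
</code></pre></div>
<p>If the big-endian assumption holds, the first line printed should be the number quoted above.</p>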
<h1 id="the-discovery">The discovery</h1>
<p>Yesterday I was playing with the two <a href="https://en.wikipedia.org/wiki/Illegal_prime">illegal primes</a> listed on Wikipedia. I was already aware of them, but I had never decoded them until yesterday. While doing so I wondered: how many prime numbers can be directly mapped to an executable file? Also, how many prime numbers can be directly mapped to plain English texts? Perhaps, while digging through prime numbers, could we find something like the Iliad or a fully working operating system?</p>
<p>Well, while asking myself those highly philosophical questions, Randall’s comic quickly came to my mind, and I decided to start looking for help requests hidden in primes. You can’t imagine how many of them I found!</p>
<p>At first I tried looking for all prime numbers corresponding to strings starting with <code>HELP! I'M TRAPPED IN A UNIVERSE FACTORY!</code>, with an arbitrary suffix. I found many of them, but I wasn’t satisfied with the result: I wanted something that was purely English/ASCII, without any garbage. Therefore I tried appending hashtags like <code>#help</code> or <code>#universe</code>, but could not find any interesting combination that was also a prime number (apparently, use of Twitter is forbidden inside universe factories).</p>
<p>So I decided to change my approach: I looked for all primes corresponding to <code>HELP</code>, followed by a variable number of exclamation marks, followed by <code>I'M TRAPPED IN A UNIVERSE FACTORY</code>, followed by more exclamation marks. I could not find anything.</p>
<p>But then I tried with a lower case string, and… I found lots of such primes!</p>
<div class="highlight"><pre><span></span><code>help i'm trapped in a universe factory!!!!!!!
help! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!! i'm trapped in a universe factory!!!!!!
help!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!! i'm trapped in a universe factory!!!!
help!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!! i'm trapped in a universe factory!
help!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
help!!!!!!!!!!!!!!!!!!!!!!!!!!! i'm trapped in a universe factory!!!!!!!
...
</code></pre></div>
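<p>The search described above can be sketched in a few lines of Python. This is not the script I originally used, just an illustration under some assumptions: the text is encoded as a big-endian integer, the number of exclamation marks on each side is capped by a hypothetical <code>max_marks</code> parameter, and candidates are filtered with a single base-2 Fermat check, which can let some composites through.</p>
<div class="highlight"><pre><span></span><code>from itertools import product

PREFIX = "help"
MIDDLE = " i'm trapped in a universe factory"


def candidates(max_marks):
    """Yield (text, integer) for every pair of exclamation-mark counts up to max_marks."""
    for before, after in product(range(max_marks + 1), repeat=2):
        text = PREFIX + "!" * before + MIDDLE + "!" * after
        yield text, int.from_bytes(text.encode("ascii"), "big")


def passes_fermat_base_2(n):
    """Weak probabilistic check: every odd prime passes, and so do a few composites."""
    return n % 2 == 1 and pow(2, n - 1, n) == 1


def search(max_marks):
    """Return the candidate strings whose integers pass the base-2 Fermat check."""
    return [text for text, n in candidates(max_marks) if passes_fermat_base_2(n)]


for text in search(10):
    print(text)
</code></pre></div>
<p>Three-argument <code>pow</code> performs modular exponentiation, so even numbers with hundreds of digits are checked almost instantly.</p>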
<p>I picked the one I liked most and verified its primality with <a href="http://www.wolframalpha.com/input/?i=is+245178888024581899558766786108789912235672909204719666025638877624752119760547413887830514281649480308707369249+prime%3F">Wolfram|Alpha</a> and <a href="http://www.numberempire.com/primenumbers.php">numberempire.com</a>.</p>
<p>I’m not 100% sure that all the others are primes, as I used the <a href="https://en.wikipedia.org/wiki/Fermat_primality_test">Fermat primality test</a>, which is probabilistic: some composite numbers pass it too. However, I’m impressed by what I found. Now I can’t stop wondering how much literature, physics or technology could be hidden in prime numbers, in plain English and UTF-8 encoded. :D</p>
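<p>To make the uncertainty concrete, here is a small sketch of a Fermat test in Python. The function name and the default bases are my own choices for this example. The test relies on Fermat’s little theorem: if <code>n</code> is prime and does not divide <code>a</code>, then <code>a</code> raised to <code>n - 1</code> is congruent to 1 modulo <code>n</code>. The converse does not always hold, which is why a pass only means “probably prime”.</p>
<div class="highlight"><pre><span></span><code>def fermat_probable_prime(n, bases=(2, 3, 5, 7)):
    """Return False if n is certainly composite, True if it passes every base."""
    if n in bases:
        return True
    if n in (0, 1) or n % 2 == 0:
        return False
    # A prime n that does not divide a satisfies a^(n-1) = 1 (mod n).
    return all(pow(a, n - 1, n) == 1 for a in bases)


print(fermat_probable_prime(97))               # 97 is prime
print(fermat_probable_prime(341, bases=(2,)))  # 341 = 11 * 31 fools base 2
print(fermat_probable_prime(341))              # base 3 exposes it
print(fermat_probable_prime(29341))            # 29341 = 13 * 37 * 61, a Carmichael number
</code></pre></div>
<p>Carmichael numbers such as 29341 pass the test for every base that is coprime to them, so adding more bases does not fully remove the doubt. A stronger test like Miller-Rabin, or a deterministic proof, is needed for certainty.</p>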
<p>(Obviously, I’m perfectly aware of what’s happening here, but I thought this was a nice fact to share. It could also be a nice number to print on a shirt.)</p>
<p><strong>Dear universe factory worker, I’m going to rescue you, sooner or later. Just tell me how.</strong></p>andreacorbelliniSun, 15 Feb 2015 16:54:00 +0000tag:andrea.corbellini.name,2015-02-15:/2015/02/15/prime-numbers-and-universe-factories/funfunmathNew blog, againhttps://andrea.corbellini.name/2015/02/15/new-blog-again/<p>This must be the third blog I have started from scratch. But this time, I’m making a serious commitment: I’m going to write here regularly.</p>
<p>Wish me luck!</p>andreacorbelliniSun, 15 Feb 2015 12:23:00 +0000tag:andrea.corbellini.name,2015-02-15:/2015/02/15/new-blog-again/miscmiscblog