Guix and Org mode tutorial
Conférence francophone d'informatique en Parallélisme, Architecture et Système, Bordeaux, 2025
1. Foreword
Reproducibility of a research study in computer science has always been a complex matter. One of the biggest challenges is to recreate the same software environment. The latter is often built manually or using modules [1], especially on high-performance computing (HPC) platforms. The main issue with this approach is that modules and build instructions are likely to vary from one machine to another depending on system configuration and available software. Some package managers such as Spack [2] allow users to define their own package variants and let them choose the configuration, the dependencies or the compiler. However, as the package manager still depends on compilers and other components provided by the underlying system, the reproducibility of software environments remains threatened. From this point of view, container solutions such as Singularity [3] or Docker [4] are more robust, but they do not make updating or managing multiple environment variants very easy. For example, if we want to use a different version of one or more packages, we either have to modify the container interactively, which would make it even less reproducible, or rebuild it, which can take a lot of time if performed regularly. To cope with these limitations and take full control over our software environments so as to be able to adapt and reproduce them easily, we propose to explore the usage of Guix [5], [6].
Another challenge to the reproducibility of an experimental scientific study is
the ability to rerun the associated computational experiments. We often need to
repeat experiments more than once, using different software versions and
algorithmic options, possibly months or even years apart. The importance of
maintaining a complete and well-structured documentation—allowing both ourselves
and others to recreate the experimental software environment, rerun all the
experiments, and post-process the results—is therefore critical. While a
detailed README file and thoroughly commented source code address this need to
some extent, we propose going further by adopting literate programming
[7], a paradigm that combines source code with a natural language
narrative. Among the many literate programming tools available, we will present
Org mode [8], [9] for the Emacs editor. With Org mode, we can
write a document that combines formatted text, images, and figures with blocks
of source code. These code blocks can be extracted as files [10],
for example to be compiled, or evaluated on the fly [11]. The results
of the evaluation—such as a figure or the return value of a computation—can then
be included directly in the document. We can export this document in various
formats [12], and share it with the community, whether as a
scientific publication, a research report, or a simple web page.
Finally, we will discuss the long-term availability of the elements of a scientific study and the associated source code, including external dependencies. Even when hosted in version-controlled repositories on a source code forge such as Inria Forge or BitBucket, they can suddenly become unavailable—for example, due to a repository migration or the permanent shutdown of the platform. We will therefore introduce the Software Heritage [13] and the Zenodo [14] projects, which are dedicated to the long-term archival of source code, experimental datasets and other artifacts.
2. Goals
In this tutorial, once we have become familiar with Guix and Org mode, we will
start from an existing experimental study. The software environment for this
study can be built manually or using a Docker container. Its documentation
relies on a README file and comments in the associated source code.
Participants will learn the basics of Guix, discover literate programming, and
explore the core features of Org mode. We will then use Guix and Org mode to
improve the reproducibility of this scientific study—both in terms of the
experimental environment and the experiments themselves. By the end of the
session, participants should have constructed a self-contained Git repository
that reimplements the study, this time managing the software environment with
Guix and applying the literate programming paradigm using Org mode. The ultimate
goal is to be able to reproduce the entire study: rerunning all experiments,
post-processing the results, and generating the associated scientific article as
well as a document explaining the process. At the end of the tutorial, if time
permits, we will discuss the long-term archival of an experimental scientific
study and go further in our use of Guix and Org mode.
3. Target audience
This tutorial is intended for scientists and engineers looking for a complete software framework that enables them to:
- create reproducible software environments, whether for high-performance computing (HPC) experiments, to achieve reproducible research results, or to share development environments,
- maintain complete and structured documentation that makes it possible to reproduce a set of computational experiments along with the associated software environment,
- ensure the long-term availability of the components of an experimental scientific study.
A basic knowledge of GNU/Linux is required: processes, command line, and software installation.
4. Pre-requisites
The hands-on sessions can take place:
- on participants' laptops, provided they have installed GNU Guix version 1.4.0 in advance by following the instructions (see footnote 1); approximately 20 GiB of free space on the root partition (/) will be needed for the software environment of the experimental study;
- on a remote machine that will be made accessible to participants via SSH; in that case, participants will need to have an SSH client and know how to use it.
5. Workspace
For the needs of this session, we have created a special project group, namely Guix and Org mode tutorial – ComPAS 2025, on the GitLab platform with the following structure:
Guix and Org mode tutorial -- ComPAS 2025
├── Software
│   ├── guix-compas
│   └── minisolver
├── Studies
│   ├── Reference study
│   └── Study using Guix and Org mode
├── Tutorial
├── Résumé
└── Slides
The Software subgroup provides software resources for this tutorial. The minisolver repository features the source code of minisolver [15], a simple application for solving dense linear systems arising from aeroacoustic simulations. We address the guix-compas repository later in Section 6.5.1.
The Studies subgroup contains two versions of the same experimental research study of the minisolver application:
- Reference study, which does not rely on Guix and Org mode for reproducibility,
- Study using Guix and Org mode, which does rely on Guix and Org mode.
The Tutorial repository contains the Org sources of the present document, and the Slides repository contains the Org sources of the introductory presentation. We do not discuss these repositories further.
5.1. Reference study
This repository contains an experimental study of minisolver with the following structure:
Reference study
├── .gitignore
├── .gitlab-ci.yml
├── Dockerfile
├── LICENSE
├── README.md
├── benchmarks.csv
├── cylinder.png
├── plot.py
├── references.bib
├── results.csv
└── study.tex
In this case, we do not rely on Guix and literate programming using Org mode to ensure reproducibility of the research study. To build a software environment close enough to the original for redoing the experiments, we can use either a combination of the native system package manager and a manual build, as detailed in README.md, or the accompanying pre-built Docker container defined in Dockerfile.
To redo the experiments on the minisolver application defined in benchmarks.csv, we must settle for the instructions and explanations in the README.md file. results.csv keeps the results of the latest experimental campaign.
The plot.py Python script draws figures based on the experimental results. Finally, study.tex and references.bib represent the LaTeX source of the study manuscript and the referenced bibliography, respectively.
5.2. Study using Guix and Org mode
This repository contains the same research study as the Reference study repository. However, here we do rely on Guix and literate programming using Org mode to ensure reproducibility of the study. See the structure of the repository below:
Study using Guix and Org mode
├── .gitignore
├── .gitlab-ci.yml
├── AUTHORS
├── LICENSE
├── README.md
├── cylinder.png
├── references.bib
├── results.csv
└── study.org
The first thing we can observe is the disappearance of benchmarks.csv and plot.py. Moreover, the study manuscript is now in the Org format (see study.org). The Org syntax allows us to combine formatted text with blocks of source code. Therefore, the manuscript now incorporates both benchmarks.csv and plot.py. Before running experiments, one thus has to extract (or tangle in the Org terminology) [10] the contents of benchmarks.csv into the corresponding *.csv file. To draw the figures, one can execute the elements of plot.py directly from within the study.org file [11]. The Org file can also be easily exported [12] to various output formats such as PDF (through LaTeX), HTML, ODF, plain text, etc. Note that exporting the Org document automatically produces the figures. We further detail the Org format later in Section 6.4.
The hands-on session will be based on this repository. The main branch contains the complete configuration we should have built by the end of the session. The level0 branch represents the starting point, which participants will complete during the session. For participants joining us later or wanting to skip one or more phases, the other levelX branches correspond to different levels of completion of the hands-on session.
6. Hands-on session
In the first place, we will put the experimental studies aside and familiarize ourselves with Guix.
6.1. Installing Guix
If you plan to use Guix on the PlaFRIM cluster, connect over secure shell (SSH) with:
ssh compas-gen<ID>@formation.plafrim.fr
… where <ID> is your temporary identifier, then
ssh plafrim
From here, you can jump directly to Section 6.2.
Here, we assume that we are running a third-party Linux distribution such as Debian, Fedora or Manjaro. We can install the Guix package manager on top of that distribution without interfering with our primary package manager. To do so, we use the official installation shell script, which must be run with superuser privileges.
cd /tmp
wget https://git.savannah.gnu.org/cgit/guix.git/plain/etc/guix-install.sh
chmod +x guix-install.sh
sudo ./guix-install.sh
Then, we just need to follow the on-screen instructions.
6.2. Running Guix for the first time
If you're using the PlaFRIM cluster, skip the remainder of this section and jump to Section 6.3.
After the installation, we proceed with a short sequence of commands to ensure a smooth user experience with Guix from now on. To begin with, we install our first package using Guix, i.e. glibc-locales, to allow the system to switch locales.
guix install glibc-locales
Then, to be able to acquire new versions of installed packages, we will need to pull the latest version of Guix first. The following command can take a while to execute, especially when run for the first time.
guix pull
Once the process finishes, we need to follow the hint the command gives us and add the following lines to our .bash_profile or .bashrc to always get access to the most recent Guix built by guix pull.
GUIX_PROFILE="$HOME/.config/guix/current"
. "$GUIX_PROFILE/etc/profile"
We also have to tell our shell to use this new Guix.
hash guix
Finally, we can update our installed packages.
guix upgrade
To get information on the generation (revision in Guix terminology) of Guix being used, we can use:
guix describe
6.3. Familiarization with Guix
Let us enter our first Guix environment containing two packages, bash and cowsay, using the guix shell command and launch a shell inside of that environment. The --container or -C switch would spawn the new environment within an isolated container which, by default, has no access to the host filesystem (except for the current working directory), to the host network or to the host environment variables. The command below uses the --pure switch instead, which unsets the existing environment variables but does not isolate the filesystem or the network. See guix shell --help for more details.
guix shell --pure bash cowsay -- bash
To test the cowsay package within the Guix shell, try typing cowsay "Hello world", for example.
On some systems, including the PlaFRIM platform we rely on for this tutorial, using --container fails with an error along these lines:
$ guix shell --container coreutils
guix shell: error: clone: 2114060305: Invalid argument
This indicates that the system lacks support for Linux's unprivileged user namespaces. The guix shell commands in the rest of the tutorial thus resort to --pure, which provides weaker isolation but still gives good control over the environment. See guix shell --help for more details.
We can simply type exit to get back to our original shell. Also, we do not have to run an interactive shell inside of the environment. We can directly execute a given command, for example:
guix shell --pure cowsay -- cowsay "Hello world"
Note that we did not include bash this time because we did not need it. The above command line should give us the following output:
 ______________
< Hello world >
 --------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
6.3.1. Manifests
The guix shell command seems very convenient. However, let us imagine that we need not just two but 26 packages in our environment. The command line would become quite long, right? The good news is that we can instead put our list of packages into a file, referred to as a manifest, then use the --manifest or -m option to pass the manifest to our guix shell command line.
Manifest files [16] use the syntax of the Scheme language [17], which can be intimidating in the beginning. Fortunately, guix shell has the --export-manifest option, which automatically generates the manifest file corresponding to the environment specified on the command line. Let us thus create the manifest corresponding to our latest single-package environment and save it to a Scheme file named cowsay.scm.
guix shell --export-manifest cowsay > cowsay.scm
Our manifest should look like this. Not so scary in the end, is it?
(specifications->manifest
(list "cowsay"))
Finally, we can enter the target environment using the manifest file and retry a cowsay command.
guix shell --pure -m cowsay.scm -- cowsay "Hello from the manifest"
6.3.2. Channels
Note that, by default, guix shell considers the latest versions of the specified packages available in our current revision of Guix. Well, maybe we are fine with that at this point, but what happens if we want to enter the exact same environment a couple of weeks, months or years later? Maybe the packages will not even be available anymore or, at least, not in the same version.
Software packages in Guix are provided through dedicated Git repositories called channels [18]. The official Guix channel guix, automatically set up in our Guix installation, currently provides more than 20,000 packages [19]. However, many other channels are available, e.g. for scientific HPC software. We will discuss the usage of multiple channels later. For the moment, let us focus on the default guix channel.
To ensure the same revision of Guix providing the same packages in the same versions, we can accompany our manifest with a channel file, also written in Scheme. In the latter we can specify the channel or channels to use together with the desired revision number, i.e. commit.
We can obtain the currently used commit of the guix channel (and other channels, if any) by typing guix describe. The output should look as follows, modulo the language, date and time ;-)
Pokolenie 1 17. december 2024 02:27:16 (súčasné)
  guix c3290ce
    zdroj repozitára: https://git.savannah.gnu.org/git/guix.git
    vetva: master
    úprava: c3290cee6add60b7e56f5f919d9498d78542790a
Using the -f option of guix describe, we can generate the corresponding channel file, e.g. channels.scm, as follows.
guix describe -f channels > channels.scm
The contents of channels.scm should then look like this.
(list (channel
        (name 'guix)
        (url "https://codeberg.org/guix/guix.git")
        (branch "master")
        (commit
          "c3290cee6add60b7e56f5f919d9498d78542790a")
        (introduction
          (make-channel-introduction
            "9edb3f66fd807b096b48283debdcddccfea34bad"
            (openpgp-fingerprint
              "BBB0 2DDF 2CEA F6A8 0D1D E643 A2A0 6DF2 A33A 54FA")))))
Finally, to execute the guix shell command using our channel file, we can use the guix time-machine command with the --channels, or -C, option.
guix time-machine -C channels.scm -- shell --pure \
  -m cowsay.scm -- cowsay "Great, a channel and a manifest file"
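A channel file only guarantees the same package versions when every channel in it carries a commit. As a rough sanity check, the following Python sketch scans the text of a channel file and reports the channels that lack one. It is not part of Guix: the function name unpinned_channels and the example channel named extra are hypothetical, and the regular expressions are a heuristic that assumes the layout written by guix describe -f channels rather than a real Scheme parser.

```python
import re

# Heuristic patterns, assuming the layout written by `guix describe -f channels`.
CHANNEL_START = re.compile(r"\(channel\s")
NAME = re.compile(r"\(name\s+'([^\s()]+)\)")
COMMIT = re.compile(r"\(commit\s+\"[0-9a-f]{40}\"\)")


def unpinned_channels(channels_text):
    """Return the names of channels that lack a full 40-character commit."""
    # Everything before the first "(channel " is the enclosing "(list".
    chunks = CHANNEL_START.split(channels_text)[1:]
    missing = []
    for chunk in chunks:
        name = NAME.search(chunk)
        if COMMIT.search(chunk) is None:
            missing.append(name.group(1) if name else "<unnamed>")
    return missing


# Hypothetical example: the second channel has no commit field.
example = """
(list (channel
        (name 'guix)
        (url "https://codeberg.org/guix/guix.git")
        (commit "c3290cee6add60b7e56f5f919d9498d78542790a"))
      (channel
        (name 'extra)
        (url "https://example.org/extra.git")))
"""
print(unpinned_channels(example))
```

An empty list means that every channel found by the heuristic is pinned; any other result names the channels whose revision would follow the branch tip.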
Let us do some Org now!
6.4. Familiarization with Org
To improve the reproducibility of numerical experiments, we propose to write the source code of the scripts and configuration files that allow us to design and automate numerical experiments following the paradigm known as literate programming [7]. The idea of this approach is to associate source code with an explanation of its purpose written in a natural language.
There are numerous software tools designed for literate programming. We rely on Org mode for the Emacs text editor [8], [9], which defines the Org markup language for combining formatted text, images and figures with traditional source code. Files containing documents written in Org mode should end with the .org extension.
Extracting a compilable or interpretable source code from an Org document is called tangling [10]. It is also possible to evaluate a particular source code block directly from the Emacs editor [11] while editing. For example, this can be particularly useful for the visualization of experimental results.
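To make the idea of tangling concrete, here is a deliberately simplified Python sketch of what the operation amounts to: collecting the bodies of the source blocks and concatenating them into the target file. This is not how Org mode is implemented. The function name tangle and the sample document are hypothetical, and the sketch only honors a single global :tangle header argument, ignoring per-block arguments, languages and every other Org feature.

```python
def tangle(org_text):
    """Collect the bodies of all source blocks, keyed by their tangle target."""
    target = None      # file named by a global ":tangle" header argument
    in_block = False
    files = {}
    for line in org_text.splitlines():
        keyword = line.strip().lower()
        if in_block:
            if keyword.startswith("#+end_src"):
                in_block = False
            elif target is not None:
                files.setdefault(target, []).append(line)
        elif keyword.startswith("#+property:") and ":tangle" in line:
            # The word following ":tangle" is taken as the output file name.
            target = line.split(":tangle", 1)[1].split()[0]
        elif keyword.startswith("#+begin_src"):
            in_block = True
    return {name: "\n".join(body) + "\n" for name, body in files.items()}


# Hypothetical two-block document, used only to exercise the sketch.
example = """\
#+PROPERTY: header-args :tangle greet.py

First, we define the message.

#+BEGIN_SRC python
message = "Hello"
#+END_SRC

Then, we print it.

#+BEGIN_SRC python
print(message)
#+END_SRC
"""
tangled = tangle(example)
print(tangled["greet.py"])
```

The narrative text between the blocks is dropped and only the code survives, which is exactly the separation between documentation and executable source that literate programming relies on.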
Finally, an Org document can be exported to various output formats [12] such as LaTeX or Beamer, HTML and so on.
Listing 1 shows an example of Org syntax. In this excerpt, there is some formatted text followed by a Python code block. The line starting with #+PROPERTY specifies that all of the source code blocks in that particular Org file should be tangled into a Python script file named rss.py. Figure 1 shows an HTML output corresponding to Listing 1.
#+PROPERTY: header-args :tangle rss.py
...
...
Memory usage statistics of a particular process are stored in
~/proc/<pid>/statm~ where ~<pid>~ is the process identifier (PID). In this file,
the field =VmRSS= holds the amount of real memory used by the process at instant
$t$. See the associated function below.

#+BEGIN_SRC python
def rss(pid):
    with open("/proc/%d/statm" % pid, "r") as f:
        line = f.readline().split()
        VmRSS = int(line[1])
        return VmRSS
#+END_SRC
...
Figure 1: HTML output corresponding to the Org document excerpt in Listing 1.
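The Python function in Listing 1 returns the second field of /proc/<pid>/statm as is. On Linux, the fields of this file are expressed in memory pages, so a conversion is needed to obtain a size in bytes. The following sketch shows one way to do that conversion, assuming a Linux system with /proc mounted; the name rss_bytes is hypothetical and not part of the rss.py script of the listing.

```python
import os


def rss_bytes(pid):
    """Return the resident set size of process `pid` in bytes (Linux only)."""
    with open("/proc/%d/statm" % pid, "r") as f:
        fields = f.readline().split()
    resident_pages = int(fields[1])  # second field: resident pages
    return resident_pages * os.sysconf("SC_PAGE_SIZE")


# Report the memory footprint of the current Python process.
print(rss_bytes(os.getpid()))
```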
Before going further, let us extend our previous Guix environment, i.e. cowsay.scm, with the emacs and the emacs-org packages. We will need the bash package too.
guix shell --export-manifest bash emacs emacs-org cowsay > cowsay.scm
We can now compose an Org file by ourselves. Let us enter the new environment, open Emacs inside and create a file named hello.org.
guix time-machine -C channels.scm -- shell --pure -m cowsay.scm -- \
emacs --no-init-file hello.org
Then, we will try to describe the short shell script below in that Org file, following the aforementioned example. Note that tangling hello.org should produce a shell script file named hello.sh.
MESSAGE="Hello world"
if test "$1" != ""; then
  MESSAGE="$1"
fi
cowsay "$MESSAGE"
If all went well, we should have something similar to this in our hello.org file.
#+PROPERTY: header-args :tangle hello.sh

This file describes a simple shell script ~hello.sh~. It begins by defining a
default greeting message.

#+BEGIN_SRC shell
MESSAGE="Hello world!"
#+END_SRC

However, if the user provides a custom message, we prefer to show this one
instead.

#+BEGIN_SRC shell
if test "$1" != ""; then
  MESSAGE="$1"
fi
#+END_SRC

Finally, let the cow say our message!

#+BEGIN_SRC shell
cowsay "$MESSAGE"
#+END_SRC
To save our modifications to hello.org, we can use C-x C-s. In order for the #+PROPERTY setting to take effect, we need to refresh the current buffer by selecting Org > Refresh/Reload > Refresh setup current buffer from the Emacs application menu.
Then, to tangle the shell script from our Org file, we can use C-c C-v C-t.
Finally, to execute the resulting hello.sh script in our Guix environment, we close Emacs using C-x C-c and run the command
guix time-machine -C channels.scm -- shell --pure -m cowsay.scm -- \
  bash hello.sh "I can do Guix and Org now"
which should give us this fancy output.
 ____________________________
< I can do Guix and Org now >
 ----------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
6.5. Building a reproducible study
We are now ready for the core part of the hands-on session. We are going to improve the reproducibility of the research study from the Reference study thanks to Guix and the literate programming approach through Org mode.
Before we begin, we need to clone the Study using Guix and Org mode repository we are going to work with. It already contains all of the files required for it to work but some of them need to be completed.
git clone https://gitlab.com/guix-org-tutorial-compas-2025/studies/guix-org-study.git
On PlaFRIM, the git executable is not available straight away. It is necessary to load the corresponding module using module load tools/git.
If you want to keep a copy of the repository with all you'll have done during the session, feel free to fork it first!
We navigate to the root of the local clone
cd guix-org-study
and, depending on where in the hands-on session we want to join, we check out the right branch to start from:
- git checkout level0: the channel definition (the very beginning),
- git checkout level1: the manifest definition,
- git checkout level2: the literate description of a source code in Org,
- git checkout level3: the reproduction of the study in a Guix environment.
6.5.1. Channels
The study relies on minisolver, which in turn depends on the hmat-oss library [20]. The official guix channel does not provide these packages yet. We have thus prepared a custom channel in the guix-compas repository (see Section 5) to make these packages available in Guix.
In study.org, locate the 'Channels' section and complete the Scheme code block within so as to match the following list of channels while respecting the syntax seen in Section 6.3.2. At the end, we can tangle the corresponding channel file channels.scm from within Emacs using C-c C-v C-t.
- guix, the official Guix channel (already present in the code block)
  - link: https://codeberg.org/guix/guix.git
  - commit: 6c9e010283424498ca094e163f29c1156df13752
- guix-compas, the channel of this tutorial
  - link: https://gitlab.com/guix-org-tutorial-compas-2025/software/guix-compas.git
  - commit: 8a349c190a4d7c5eb7453943dd91e758a3bfc1e2
6.5.2. Manifests
The only thing we need to run numerical experiments on the minisolver application is the minisolver application itself. It conveniently allows us to run multiple experiments with different parameters in a batch.
At first, we will need to compose the guix shell command allowing us to enter the correct environment. Let us check whether we have access to a working minisolver package through one of the channels we specified in our channels.scm file in Section 6.5.1. For this, we can run a small experiment.
guix time-machine -C channels.scm -- shell --pure minisolver -- \
minisolver --size 4000 --solver hmat
If everything worked well, our standard output should end with something like
[minisolver] relative error = 9.4328e-04
[minisolver] test-specific cleaning ... done
[minisolver] computation completed
[minisolver] cleaning ... done
[minisolver] solver finalization ... done
[minisolver] total computation time = 0.138832 s
However, in addition to the minisolver package, we will need some other packages to help us:
- perform basic operations,
  bash coreutils which git
- manipulate and plot experimental results,
  python python-pandas python-matplotlib
- read and export Org mode documents,
  emacs emacs-org
- produce the final manuscript of the study using LaTeX.
  texlive-scheme-basic texlive-collection-fontsrecommended texlive-type1cm texlive-underscore texlive-dvipng texlive-babel-english texlive-latexmk texlive-wrapfig texlive-ulem texlive-capt-of texlive-listings texlive-fancyvrb texlive-upquote texlive-lineno texlive-biblatex texlive-biber texlive-xcolor
The list of TeX Live packages may seem a bit long. Indeed, we could have simply used the global texlive package, but the latter weighs almost 4 GiB and provides a lot of LaTeX packages we do not need.
We can observe that the list of requested packages differs from the one in the README.md file in the Reference study repository of the same study. This is because we do not need to include the packages required to build minisolver, e.g. hmat-oss. Guix will build the minisolver package in the appropriate environment automatically.
Complete the guix shell command line below so as to include minisolver and all the other necessary packages and test it.
guix time-machine -C channels.scm -- shell --pure <list-of-packages> minisolver -- minisolver --size 4000 --solver hmat
We would not want to type this command line more than once or twice, would we? We can obtain the corresponding manifest thanks to the --export-manifest option of the guix shell command.
guix time-machine -C channels.scm -- shell --export-manifest <list-of-packages> minisolver
Finally, we need to complete the Scheme source code blocks in the 'Manifest' section of study.org with portions of the output of the above command and tangle our manifest file using C-c C-v C-t.
To enter the resulting experimental environment and spawn a shell in it, we can now use
guix time-machine -C channels.scm -- shell --pure -m manifest.scm -- bash --norc
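Since every step of the study runs inside the same pinned environment, the guix time-machine prefix of the command above can be factored out. The following Python sketch only assembles that command line from the options already used in this section. The helper name guix_command and its default file names are hypothetical, and nothing is executed unless the resulting list is passed to subprocess.run on a machine where Guix is installed.

```python
import shlex


def guix_command(command, channels="channels.scm", manifest="manifest.scm"):
    """Wrap `command` (a list of strings) in the pinned Guix environment."""
    prefix = [
        "guix", "time-machine", "-C", channels, "--",
        "shell", "--pure", "-m", manifest, "--",
    ]
    return prefix + list(command)


argv = guix_command(["bash", "--norc"])
# Show the command as it would be typed in a terminal.
print(shlex.join(argv))
# To actually run it (requires Guix): subprocess.run(argv, check=True)
```

Keeping the command as a list of arguments rather than a single string avoids quoting problems when the wrapped command itself contains spaces or quotes.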
6.5.3. Org
Let us focus on the figures in the section 'Experimental study' of both the Reference study and the Study using Guix and Org mode. In the former, we use the Python script plot.py to produce the two figures (see Section 5.1). In the latter, plot.py is a part of the Org document study.org (see Section 5.2), which provides a literate description of the script in the section 'Post-processing results' of the appendix. However, this literate description is incomplete. The 'Preamble' and the 'Data compression' subsections of 'Post-processing results' describe approximately two thirds of plot.py.
Complete the 'Multi-threaded execution' subsection based on the contents of the corresponding part of the original plot.py script from the Reference study. We can consider comments as surrounding formatted text and split the source code into Python source code blocks. Do not hesitate to consult the 'Preamble' and the 'Data compression' subsections for inspiration.
Note that starting a line with <s and hitting Tab allows one to quickly generate a pair of source code block delimiters, i.e. #+begin_src and #+end_src. Do not forget to specify the target programming language following the opening delimiter, e.g. #+begin_src python in this case.
Once it is done, let us return to the section 'Experimental study' featuring the two figures. It contains two 'Addendum' subsections, one per figure. Each addendum subsection describes a single code block assembling the required code blocks from 'Post-processing results' to plot the corresponding figure. The evaluation of the code blocks in the addendums takes place during the export of study.org into the target format, i.e. PDF in this case. However, it is possible to execute these code blocks from within Emacs as well. We will address this approach in Section 6.6.3.
For now, let us proceed to the most important challenge of the day – reproducing the entire Study using Guix and Org mode.
6.5.4. Reproducing the study
Before going further, we have to tangle the list of experiments to run, i.e. benchmarks.csv, from study.org.
emacs --batch --no-init-file --load publish.el --eval '(execute-block-in-file "get_benchmarks" "study.org")'
As we have already tangled channels.scm and manifest.scm, we can now look into README.md of the Reference study and try to identify the minisolver command for running the experiments. Execute the command from within the root of the Study using Guix and Org mode repository and in the right Guix environment, i.e. using channels.scm and manifest.scm. We have seen the associated guix command lines at the end of Section 6.5.2. Note that after the execution of benchmarks we should get the results in another *.csv file, i.e. results.csv.
Finally, to post-process the results (see Section 6.5.3) and publish the final manuscript of the study, we can use this command.
emacs --batch --no-init-file --load publish.el --eval '(org-publish "study")'
6.5.5. Reproducing guidelines
As we can see, the resulting manuscript study.pdf contains neither the appendix nor the addendum subsections of the section 'Experimental study' (see Section 6.5.3). Instead, the conclusion of the manuscript refers the reader to the archive of the study's repository.
Nevertheless, using the command
emacs --batch --no-init-file --load publish.el --eval '(org-publish "companion")'
we can produce the complete manuscript. While the first version would be suitable for publishing in a conference or a journal, the second version provides everything the reader would need to understand how the experiments were done and how to reproduce the study. In particular, the 'Quick-start guide' section summarizes all the steps.
We can see the complete manuscript as a companion document we can publish together with the study in HAL, arXiv or similar. The archived repository of the study could then include a pointer to the PDF version of this companion document.
Anyway, before sharing the companion document, let us complete its 'Quick-start guide' section, which is currently missing the minisolver command line for running experiments we used in Section 6.5.4.
Finally, if we re-run
emacs --batch --no-init-file --load publish.el --eval '(org-publish "companion")'
we obtain a complete companion document.
 ____________
< Well done >
 ------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
6.6. Going further
6.6.1. Long-term availability
Up to this point of the tutorial, we have addressed the reproducibility of software environments and the reproducibility of numerical experiments. Another challenge is the long-term availability of the elements of a scientific study. These elements include, but are not limited to, the source code of the packages and their dependencies composing the experimental software environment, the results of numerical experiments, associated scientific publications as well as the literate description of these components.
For the storage of source code, it is a common practice to rely on remote repositories managed with version control tools such as Git or Subversion. For the storage of datasets and other types of artifacts often accompanying an experimental study, there are many file-hosting and sharing services. Nevertheless, the hosting platforms, commercial or institutional, do not guarantee the long-term availability of a particular version of a particular file or the long-term availability of the service itself.
This is where the Software Heritage [13] and the Zenodo [14] projects come into play. On the one hand, Software Heritage focuses on the archival of source code. It is able to archive multiple types of version-controlled repositories and features a seamless integration with Guix. On the other hand, Zenodo presents itself as a platform for storing not only source code, but also experimental datasets and other types of artifacts. Both of these platforms then allow for referencing the archives through unique identifiers. See the GitLab page of the Study using Guix and Org mode repository for an example.
6.6.2. Modular environment
The illustrative experimental studies we feature in this tutorial include
numerical experiments on the minisolver
application. One of the
dependencies of the latter is the OpenBLAS library [21] providing an
open-source implementation of the linear algebra BLAS [22] and LAPACK
[23] routines. However, the latter have multiple implementations.
Among the frequently used ones are the vendor-specific implementations such as
the Math Kernel Library (MKL) [24] from Intel(R). Let us imagine that we
want to perform the numerical experiments of our study, but replace OpenBLAS by
MKL in our software stack.
If we were building our software environment manually, we would have to
recompile a significant part of the software stack. If we were using a container
solution, such as Docker, we would have to rewrite the recipe of the container.
In contrast, the guix shell
command in Guix provides the --with-input
option allowing us to replace a software package in the graph of dependencies of
the target environment by another one and rebuild the dependent packages
automatically. Let us see the usage of this option on a concrete example.
The default guix
channel does not provide MKL directly as it is not free and
open-source software. We thus have to include the
guix-science-nonfree
channel in the list of channels of the Study using Guix and Org mode (see Sections
6.5.1 and 6.3.2). Add the following item to the
Scheme source code block corresponding to channels.scm
in study.org
and
tangle the file using C-c C-v C-t
from within Emacs.
(channel
 (name 'guix-science-nonfree)
 (url "https://codeberg.org/guix-science/guix-science-nonfree.git")
 (introduction
  (make-channel-introduction
   "58661b110325fd5d9b40e6f0177cc486a615817e"
   (openpgp-fingerprint
    "CA4F 8CF4 37D7 478F DA05 5FD4 4213 7701 1A37 8446"))))
First, we are going to put the manifest aside and launch minisolver
in an
environment where we replace OpenBLAS by MKL.
guix time-machine -C channels.scm -- \
  shell --pure --with-input=openblas=intel-oneapi-mkl minisolver -- \
  minisolver --size 4000 --solver hmat
The output should contain this line.
[minisolver] blas = mkl
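When the replacement is part of a scripted experiment, the check for this line can be automated. The following sketch assumes that the solver prints the line exactly as shown above. The helper name uses_blas is hypothetical, and the printf call only stands in for real solver output.

```shell
# Hypothetical helper: reads solver output on stdin and succeeds only if it
# contains the exact line "[minisolver] blas = <name>".
uses_blas() {
  grep -q -F -x "[minisolver] blas = $1"
}

# Stand-in for real solver output; in practice, pipe the guix shell
# command shown above into uses_blas instead of printf.
if printf '%s\n' '[minisolver] blas = mkl' | uses_blas mkl; then
  echo "MKL is in use"
else
  echo "MKL is NOT in use" >&2
fi
```

The -F and -x options make grep treat the pattern as a fixed string and match whole lines only, so the square brackets are not interpreted as a character class.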
The --with-input
option also works together with the --export-manifest
option of the guix shell
command (see Section 6.5.2). For
example, the above guix shell
command line with --export-manifest
gives us
the following manifest.
(use-modules (guix transformations))

(define transform1
  (options->transformation
    '((with-input . "openblas=intel-oneapi-mkl"))))

(packages->manifest
  (list (transform1 (specification->package "minisolver"))))
Now, try to update the Scheme code blocks corresponding to the manifest of the
Study using Guix and Org mode (see Section 6.5.2) in study.org
so as to
replace OpenBLAS by MKL in the entire environment.
6.6.3. In-place code block evaluation
In Section 6.5.3, we have done some literate programming in Org mode.
The goal was to complete the literate description of the plot.py
script from
the Reference study in the 'Post-processing' section of the appendix in
study.org
of the Study using Guix and Org mode. The 'Experimental study' section of the
latter features two figures. It contains two 'Addendum' subsections, one per
figure. Each addendum subsection describes a single code block assembling the
required code blocks from 'Post-processing results' to plot the corresponding
figure. The evaluation of the code blocks in the addendums takes place during
the export of study.org
into the target format, i.e. PDF in this case.
However, it is possible to execute these code blocks from within Emacs as well.
If this is not already the case, let us enter the environment of the Study using Guix and Org mode
guix time-machine -C channels.scm -- shell --pure -m manifest.scm -- bash --norc
and open study.org
in Emacs.
emacs -nw --load publish.el study.org
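Before running the two commands above, it is possible to check whether the current shell already runs inside an environment created by guix shell, which exports the GUIX_ENVIRONMENT variable there. The sketch below is only a convenience; the helper name in_guix_env is hypothetical. Note that the variable tells us that we are in some Guix environment, not that it is the one described by manifest.scm.

```shell
# guix shell exports GUIX_ENVIRONMENT (the profile path) inside the
# environments it creates; in_guix_env is a hypothetical helper name.
in_guix_env() {
  [ -n "${GUIX_ENVIRONMENT:-}" ]
}

if in_guix_env; then
  echo "Already inside a Guix environment: $GUIX_ENVIRONMENT"
else
  echo "Not inside a Guix environment; run the guix time-machine command first."
fi
```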
Now, we navigate to the first 'Addendum' subsection in the 'Experimental study' section and place the cursor on the corresponding source code block.
Note that you can use C-s
to search for
and quickly locate the target source code block.
With the cursor on the source code block, use C-c C-v C-e
to execute it from
within the Emacs editor. The purpose of this code block is to produce the figure
in the 'Data compression' subsection of 'Experimental study'. When the execution
finishes, locate the line beginning with the #+RESULTS: keyword.
Below this line comes the result of the execution of the block plot_parallel.
The result is a link to a file named parallel.pdf
corresponding to the
resulting figure. Finally, display the figure in Emacs by clicking on the file
link.
7. Pointers
In addition to the bibliography at the end of the document, the following pointers may be of interest to those who would like to learn more about using Guix and Org mode for Emacs:
- https://hpc.guix.info/ (Guix-HPC, reproducible software deployment for high-performance computing: channels, packages, events, …)
- https://cours-mf.gitlabpages.inria.fr/is328/tuto-chameleon.html (tutorial on how to use Guix or Singularity images produced by Guix on HPC platforms such as PlaFRIM)
- https://felsoci.sk/blog/posts.html (blog of Marek Felšöci with posts on Guix and Org mode usage - for work and for home)
8. References
Footnotes
Binary installation (GNU Guix Reference guide), https://guix.gnu.org/manual/en/html_node/Binary-Installation.html