Some bits about the Debian Installer by the development team

Joey Hess

Holger Levsen

Christian Perrier

Frans Pop

This article is free; you may redistribute it and/or modify it under the terms of the GNU General Public License.

Abstract

The release of Debian sarge has highlighted one of the biggest improvements brought by this new version of the Universal Operating System: its installation process, namely the debian-installer.

This article features details about some specific parts of the debian-installer project.

First, we recall the old installer, the boot-floppies. Then the design and development model of the new debian-installer are presented briefly, although they are not the main scope of the talk. Next the debian-installer internationalization framework and the maintenance of the Installation Guide are explained in more detail. Finally, some thoughts are developed about the future of debian-installer and the areas where more contributions are welcome.


Table of Contents

1. Debian Installer: past and present
2. Debian Installer internationalization and localization issues
2.1. Some history
2.2. Generalities
2.3. Organization
2.4. Technical aspects
2.5. The future
3. Installation Guide
3.1. Translations
3.2. Technical aspects
3.3. How can you help?
4. The future of Debian Installer: how to help
4.1. Translate debian-installer
4.2. Make it install an UTF-8 system
4.3. Get involved in supporting your architecture in debian-installer
4.4. Work on debian-installer updates for sarge
4.5. Work on supporting sarge installs using the etch installer
4.6. Help us develop a graphical installer
4.7. Help improve automated installs
4.8. Add encrypted filesystem support to partman
4.9. Help improve hardware detection
4.10. Help support non-free drivers/firmware
4.11. Help support secure apt
4.12. Help support installation by the blind
4.13. Help improve base-config, X setup, etc
4.14. Add support for installs via PPPOE
4.15. Make it easier to build custom CDs and otherwise customize the installer
4.16. Find a way to remove a single question from the standard install process
4.17. Invent the next great thing in debian-installer
4.18. Add moon-buggy to debian-installer
5. Conclusion

1. Debian Installer: past and present

A long time ago, installing Debian meant using the boot-floppies, an installer so old that even its name harks back to days of yore when a floppy disk was the only reasonable way to install Linux.

It's important to realize what a good installer the boot-floppies were for their time. It's possible to find comments circa 1995 praising how well Debian's installer worked compared to some of the alternatives back then, although you may need to read them three times to believe it! The boot-floppies worked, they only took six floppies, and you didn't even need to use a command line to install Debian.

So the boot-floppies began as a best-of-breed Linux installer, were then retrofitted to kinda support CD installs and a few ports, and after a few years began to show their age as hardware detection, streamlined graphical installers, and fancy partitioners became standard in Linux installation.

By the late 90's they were in maintenance mode and were only pulled out, laboriously dusted off and brought up to date with newer kernels, architectures, and other changes in the several months before new releases of Debian. In a time when everyone seemed to be busy mass-installing server farms of hundreds of machines using cutting-edge new hardware, the out-of-date boot-floppies began to hold Debian back.

There's a lesson to be learned in this, and it is that when it comes to installation, Debian cannot afford to rest on its laurels.

Now that sarge is released, Debian finally has an installation system that we can be proud of. Kudos to everyone who helped make the new Debian Installer the awesome installer it is today; you've wildly exceeded our expectations, when the debian-installer project began, of what a great installer we would have for sarge.

But things should not stop here, because debian-installer is not just an installer for sarge; it's a base we can use for making the Debian install process continue to improve for etch and beyond. And the installer is intentionally not a self-contained installer that does its own thing in its own corner using its own technology. Instead, from its very core (debootstrap) up to its finishing touches (tasksel) it's a part of Debian that is designed to be improved as Debian as a whole is improved, and to itself promote improvements in Debian.

The development of debian-installer is team work which takes place on the debian-installer development mailing list and the freenode IRC channel #debian-boot, where all commit messages are published through a bot. Minutes and logs of scheduled IRC meetings, examples of preseeding configurations, FAQs and more are collected in a wiki at http://wiki.debian.net/DebianInstaller. Additionally we hold physical work camps like the one preceding debconf5.

To join the debian-installer team, check out the source code from its subversion repository (you'll find more detailed instructions on the debian-installer development page at http://www.debian.org/devel/debian-installer), read on in /installer/doc/devel, subscribe to the mailing list and simply say "Hi I'm foo and want to do bar."

Especially now that Sarge is released, we are looking forward to welcoming new debian-installer developers with new ideas! (And we already have lots of ideas, as you will find out at the end of this paper.) And even though we expect lots of people reading or listening to this to have somewhat deeper knowledge about debian-installer, here's a quick overview for the rest:

The design of debian-installer is modular: from the various initrd kernel images with their kernel modules to the micro debs, called udebs, which are stripped-down Debian packages (they don't need to conform to policy, have less content and no docs, and are not installable on a standard Debian system) built from the same source as the "real" ones.

Due to the use of debconf (the tool, not the conference) it is possible to have various frontends as well as the already famous preseeding feature. anna's not nearly apt but still manages its job very well, which is loading udebs and placing them in the proper order into main-menu, which in non-expert mode is hidden from the user.

The debian-installer source code is managed in a subversion repository on alioth, where all new development takes place in the trunk. During the emergence of sarge several release candidate (RC) branches were used, which are now abandoned, though the sarge branch is still there and will be used for further point releases of sarge. In principle the debian-installer in etch will be able to install sarge; in practice this will depend on how kernel updates are managed.

2. Debian Installer internationalization and localization issues

2.1. Some history

2.1.1. Why localize?

There is a recurrent debate about the need to localize the installation process of a Linux distribution. Indeed, a widespread idea is that Unix/Linux system administrators need some fluency in English to do their work, so they should not really care about using their own language to install a Linux system.

This very narrow point of view is fortunately quite outdated in the Debian community and the need for l10n of the installer has been well established for years. However, the main arguments are still valid and revisiting them may be of some interest.

First of all, not every system administrator prefers using English. Most of them, in real life, pick their own language when it is available.

Moreover, Debian is more and more newbie friendly. As explained elsewhere in this paper, one of the design goals of the installer was making it easy to use for every user, including newcomers to the Linux environment. Here, offering a completely localized installation process becomes a key point.

Finally, more and more Custom Debian Distributions are based on Debian sarge and thus depend on Debian Installer for their installation process. Some of these distributions are targeted at home users or for use in educational environments, where a complete l10n is also a key argument of choice.

2.1.2. Boot floppies internationalization

The former installer of the Debian distribution, namely boot-floppies, was a single package which was already well internationalized[1] and thus had only a single translation file for localization[2]. It was already using gettext to handle translatable material. However, its l10n was completely unsynchronised with the l10n of other material involved in the installation process. This led to only partially localized installation procedures, which required quite a good knowledge of English even for languages for which the boot-floppies were fully translated.

The boot-floppies development did not include string freeze procedures, i.e. periods during development when no change was allowed in the English messages. The French translators remember a lot of headaches catching up with last minute changes...

The number of supported languages was already quite high, but was limited by the number of technically skilled translators, able to deal themselves with several tasks not directly related to translations, such as commits, resynchronizations, etc.

2.1.3. Early design choices of debian-installer for i18n

The very modular nature of debian-installer has made l10n easier because it became really easy for translators to focus on the most important issues.

The key point has been putting all translatable strings into debconf-style templates (sometimes by using specific templates introduced with cdebconf, the C-reimplementation of debconf). The advent, during early sarge development, of the use of gettext for debconf templates i18n, made the process very simple for translators.

2.1.4. The advent of the debian-installer i18n team

During the early stages of Debian Installer, the translators were at first the translators of boot-floppies, or came from established translation teams. The work was then pretty informal, with translators spontaneously joining the debian-installer development mailing list and getting access to the CVS repository to commit translations themselves.

There were already a few dozen debian-installer packages at that time and following the status of translations rapidly became nearly impossible for translators.

The first debian-installer beta versions also raised the problem of completing the translation work for the releases with hardly any infrastructure to track the completion status, and showed a need for a gateway between developers and the emerging team of translators. The need for some documentation for i18n and l10n work for the installer was also a good motivation to get people specifically involved in i18n coordination.

The first signs of i18n coordination appeared in mid 2003 with the first beta releases of debian-installer. Petter Reinholdtsen, because of the Skolelinux needs in that area, began acting as a coordinator and wrote the first version of a specific documentation of debian-installer l10n.

The beta 4 release finally made the i18n team a reality, providing debian-installer developers with a reliable way to include i18n and l10n in release planning. At the time of this writing, the debian-installer i18n team consists of Christian Perrier, Dennis Stampfer and Frans Pop (more specialized in the installation manual i18n).

2.2. Generalities

2.2.1. I18n with po-debconf

Each debian-installer package, just like other regular packages, includes a translation template in the debian/po directory in the package's source tree. In that directory, one file per language groups together the translations of all the template's strings for that package.

The final package build process assembles these templates and their translations without any manual intervention from the package maintainer or another debian-installer developer.

That very simple process is completely identical to processes for regular Debian packages using debconf template translation, which makes the work simpler for both translators and developers.
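To illustrate this layout, here is a minimal Python sketch that lists the languages for which a package ships a translation, given the file names found in its debian/po directory. The file names (templates.pot, POTFILES.in, one <language>.po per language) follow the usual po-debconf convention; the directory listing itself is made up for the example.

```python
def languages_in_po_dir(file_names):
    """Return the sorted language codes of the PO files in a debian/po directory."""
    languages = []
    for name in file_names:
        # Each translation is stored as <language code>.po; templates.pot and
        # POTFILES.in are not translations and are skipped.
        if name.endswith(".po"):
            languages.append(name[:-len(".po")])
    return sorted(languages)


# Hypothetical directory listing for one package.
example_listing = ["POTFILES.in", "templates.pot", "fr.po", "nl.po", "pt_BR.po"]
print(languages_in_po_dir(example_listing))  # ['fr', 'nl', 'pt_BR']
```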

2.2.2. The concept of levels

The design choices of debian-installer make it strongly interact with regular Debian packages. Indeed, as soon as possible in the installation process, debian-installer relies on packages to control the installation process. Even in the first stage (the installation phase before the only reboot), some of the user interaction is provided by stripped udeb versions of regular Debian packages.

After the reboot, the second stage of debian-installer is controlled by the base-config package, which may still be seen as a debian-installer package (it is maintained by the debian-installer team). However, this package heavily relies on other packages such as shadow, tasksel, pppconfig and a few others to achieve some tasks. This means that a full l10n of the installation process, as seen from the user, has to include l10n of these packages.

In order to give translators priorities and make their work progressive, the i18n team has established levels of translation which group together all involved parts, sorted by order of importance with regards to l10n.

The technical part of this section will describe details about these levels.

2.3. Organization

2.3.1. The i18n team

The role of the i18n coordination team is very broad. Indeed, it may be summarized as "be the pivotal point between debian-installer developers and debian-installer translators". Even if translators have always been considered to be full members of the debian-installer team, the need for this gathering point was obvious.

The role of the i18n coordination team may be summarized in the following tasks:

  • act as a reference point for translators;

  • guarantee consistency in English writing style;

  • search for and validate new translators;

  • maintain helper tools;

  • follow development and maintain localechooser;

  • provide expertise in more specific topics such as encoding, left-to-right text handling, and the special packages needed to support Arabic and oriental languages;

  • manage the testing of the release candidates with regard to i18n/l10n;

  • assist translators with the Debian BTS and package maintainers by handling l10n commits for them;

  • improve the framework to make the translators' life easy;

  • maintain contact with translators, derived distributions and organised free software l10n teams (Arabeyes, Indlinux...);

  • maintain the installation manual i18n.

2.3.2. The translators

The translators and the translation teams are obviously the people who do the real job. The basic principle in debian-installer i18n (and indeed in Debian as a whole) is leaving translators and translation teams with great freedom to choose their work method.

This results in various situations, from languages where a single individual is the only person responsible for the whole translation, QA work, and interaction with the i18n team; up to languages where a very structured team exists.

The only requirement made by the i18n team is getting the name and email address of a reliable contact who will be further called the language coordinator for this language. Having a backup coordinator is preferred, but not required. And, finally, if the translation is made with the support of a translation team (internal or external to the Debian project), the name and address of this team is needed (possibly with some write access to the team's mailing list).

Though the i18n team assists translators in their tasks and sometimes commits files in their name, autonomy is highly desired, with the language coordinator personally committing the translations produced by the team and filing bug reports for packages that are not part of the debian-installer team maintenance area.

2.3.3. The developers

In the i18n process, the role of debian-installer developers is to ensure that all possibly translatable material is properly marked as translatable and uses strings from the debconf i18n system for their packages. The cdebconf package documentation gives good examples of the use of debconf strings even outside the context of debconf.

debian-installer developers also have to coordinate their work with the i18n team when their work is likely to introduce new translatable strings. Such strings need peer review to check the consistency of the English message style. Developers have to follow the calls for string freezes during release preparation and must work very closely with the i18n team when exceptions to the string freeze policy are needed.

Developers of packages not maintained by the debian-installer team, but needed by the installer (mostly source packages that produce udeb packages) are requested to get in touch with the debian-installer i18n team when their package includes translatable material. Following debian-installer string freezes is here also very valuable.

2.4. Technical aspects

2.4.1. The levels

2.4.1.1. Level 1

The so-called level 1 of translation in Debian includes all udeb files which are part of the first stage of the installation process and are maintained by the debian-installer team. This consists of the whole debian-installer/ source tree on the debian-installer development server.

A few packages which technically belong in level 1 because some of their translatable material is used during the first stage of the installation process, have been moved to level 2 to make it clear that the level 1 is the core of debian-installer translations.

At the time of writing, level 1 consists of 65 packages with l10n messages for a total of 1270 strings.

To make the translators' work easier, all strings in level 1 files are gathered together in a single master file in packages/po in the debian-installer development trees (sarge branch and development branch).

This method of work does not allow the work to be split, but it makes the maintenance process far easier and improves Quality Assurance (because it guarantees that the same string is translated the same way across packages).

The main job of the synchronization script (l10n-sync) is to spread out translations from the master file to the PO files in each package. The internals of this script are detailed in the debian-installer i18n documentation. This script also combines the PO template files (POT files) from all packages into a single POT file, which may be used by translators starting a new translation.

Some languages may be defined as prospective languages, which means that the translations for these languages are not spread out to individual packages. The reason for this is the impact translations have on the initial RAM disk size and the memory requirements of debian-installer. Languages are removed from the prospective list only after the debian-installer i18n team has received the agreement of the debian-installer development coordinator.

Nine prospective languages were waiting for the "gates opening". These translations were started after the language freeze of debian-installer which occurred during debian-installer RC2 development with the decision not to add more languages up to the sarge release.

Including the prospective languages, the total number of languages supported in debian-installer is 50, including English. The number of languages supported in the sarge branch is 41.
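The effect of the prospective list can be sketched in a few lines of Python. This is only an illustration of the filtering idea, not the actual l10n-sync code; the language codes "xx" and "yy" are placeholders. The last lines check that the figures quoted above are consistent with each other, assuming the sarge branch figure also counts English.

```python
def languages_to_spread(all_languages, prospective):
    """Languages whose translations are copied into the individual packages."""
    return [lang for lang in all_languages if lang not in prospective]


# Placeholder language codes, for illustration only.
all_languages = ["fr", "nl", "xx", "yy"]
prospective = {"xx", "yy"}
print(languages_to_spread(all_languages, prospective))  # ['fr', 'nl']

# Figures from the text: 50 languages in total, 9 of them prospective.
total_supported = 50
prospective_count = 9
print(total_supported - prospective_count)  # 41
```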

2.4.1.2. Level 2

debian-installer translation level 2 includes all packages prompting users during a default (high priority) base system installation.

This includes two packages maintained by the debian-installer team itself (base-config and tasksel) as well as regular packages maintained by Debian developers outside the debian-installer team.

Packages producing udebs used during the first stage of the installation process have been added to this level. These include iso-codes and console-data.

Finally, two regular packages are also included in level 2 because part of their debconf templates are used during the second stage of the installation process: shadow and exim4.

2.4.1.3. Level 3

debian-installer translation level 3 includes all packages prompting users during any kind of base system installation.

It currently includes only regular Debian packages, some of which are quite widely used in some types of installs (aptitude, pppconfig). Others that no longer prompt users but used to in the past are kept for historical reasons or in case they come back (popularity-contest, pcmcia-cs...), along with a few for which only the base-config menu entry currently needs translation (xdebconfigurator, l10n-config).

2.4.1.4. Level 4

debian-installer translation level 4 includes all packages that display messages during a base system installation.

The purpose of including these packages is to try to reach the state where users only see messages in their own language while installing the system. A side-effect is pushing for the translation of two of the most important packages in Debian: dpkg and apt.

Shadow has been added to this list because it very briefly displays a few messages. The side effect here is also getting some very widely used programs such as login, passwd and all other such utilities translated into many languages.
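The four levels can be summarized as a small data structure. The Python sketch below only records the packages named in this section (it is not a complete list, and level 1 is omitted because its 65 packages are not enumerated here) and shows that a package such as shadow can belong to several levels.

```python
# Short descriptions of the levels, paraphrased from the text above.
LEVEL_DESCRIPTIONS = {
    1: "first-stage udebs maintained by the debian-installer team",
    2: "packages prompting users during a default (high priority) install",
    3: "packages prompting users during any kind of base system install",
    4: "packages that display messages during a base system install",
}

# Only the packages named in this paper; the real lists are longer.
EXAMPLE_PACKAGES = {
    2: ["base-config", "tasksel", "iso-codes", "console-data", "shadow", "exim4"],
    3: ["aptitude", "pppconfig", "popularity-contest", "pcmcia-cs",
        "xdebconfigurator", "l10n-config"],
    4: ["dpkg", "apt", "shadow"],
}


def levels_for(package):
    """Return the sorted list of levels in which a package appears."""
    return sorted(
        level for level, packages in EXAMPLE_PACKAGES.items() if package in packages
    )


print(levels_for("shadow"))  # [2, 4]
print(levels_for("apt"))     # [4]
```

A translator following the priorities described above would work through the levels in increasing order.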

2.4.2. Typical process to integrate a new translation

Several steps are required for new translators to join the debian-installer translation team and start working on translations in their language. The typical sequence ("the New Translator Process") is the following:

  • First contact: the translator may come by him/herself, or be directed by a third party to the debian-installer development team. The i18n team also proactively approaches translators already working in other FLOSS projects and tries to convince them to work on debian-installer translations;

  • Locale and language code identification: from information given by the candidate translator, the i18n team determines the language code that needs to be used. If a locale already exists in Debian for that language, the work can go on. Otherwise, the coordinator and the translator work on writing one (which is a very tedious task) and propose it to the locales package maintainers;

  • Local language name: the translator sends the coordinators the name of the language as a UTF-8 string. The i18n coordinator updates localechooser accordingly;

  • Activate early support: the i18n coordinator asks the translator for the information needed for this: display specifics, input systems...

  • The translator begins working on level 1, translating a few strings, and sends the new PO file to the i18n coordinator who checks it and commits it (not forgetting to add the given language to the prospective languages list);

  • The new translator registers for an account on Alioth and the i18n coordinator adds him/her to the debian-installer committers list. Subscription to the mailing list is also highly encouraged (and actually presented as mandatory). Subscription to the mailing list is also encouraged but less enforced;

  • Finally, the i18n coordinator assists the new translator with SVN commits/checkouts so that (s)he can work autonomously.

This whole process has proven to be a good way to integrate new translators and teams progressively. It also gives the i18n team a good way to really test the translators' motivation and avoid losing people in the wild.

2.4.3. The framework

Stricto sensu, the debian-installer i18n framework only consists of the tools used to manage translation of level 1 packages (all packages developed by the debian-installer team) and the tools used to monitor the translation status. This paper will not describe any special method involved in handling translations for level 2-4 packages because there is indeed nothing really special about it: just regular bug reporting and handling of PO files by their maintainers.

2.4.3.1. Handling of level 1 PO files

All PO files coming from all debian-installer packages are combined together in a big PO file named the master file. This has the advantage of sparing translators from digging into each and every debian-installer package to update its translation. Moreover, as several packages share common strings (or similar strings), having these translated only once helps in keeping general consistency among translations.

Translators only work on the master file, and all merging/spreading of translations between this master file and the individual packages is handled by a special script named l10n-sync. This script is maintained by the debian-installer i18n coordination team and is run from one of the Debian servers available to developers.

l10n-sync tasks:

  • merge together all POT files in a single POT file

  • update the master files from this POT file

  • update all packages PO files for each language from the master file of that language

All this is achieved with a simple shell script, by using the various gettext utilities. A minor drawback of this is being dependent on the version of the gettext utilities used to run l10n-sync. For instance, woody's gettext utilities have different behaviour and different options than sarge's gettext utilities. This sometimes resulted in gratuitous commits by the script and some headaches for the debian-installer developers.
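The three tasks can be modeled with plain dictionaries. The following Python sketch is a toy model under simplifying assumptions, not the real script (which is a shell script built on the gettext utilities): a POT file is reduced to a list of msgids, a PO file to a mapping from msgid to msgstr, and fuzzy or obsolete entries are not handled. Package names and strings are made up.

```python
def merge_templates(package_templates):
    """Merge the msgids of all package POT files into one master template.

    Strings shared by several packages appear only once, in first-seen order.
    """
    master = []
    seen = set()
    for msgids in package_templates.values():
        for msgid in msgids:
            if msgid not in seen:
                seen.add(msgid)
                master.append(msgid)
    return master


def update_master(master_template, old_master):
    """Refresh one language's master file from the master template.

    Known translations are kept, new strings start untranslated (empty),
    and strings no longer present in the template are dropped.
    """
    return {msgid: old_master.get(msgid, "") for msgid in master_template}


def spread(package_templates, master):
    """Build each package's PO file for one language from its master file."""
    return {
        package: {msgid: master.get(msgid, "") for msgid in msgids}
        for package, msgids in package_templates.items()
    }


# Hypothetical packages and strings, for illustration only.
templates = {
    "pkg-a": ["Continue", "Choose a language"],
    "pkg-b": ["Continue", "Partition disks"],
}
french_master = {"Continue": "Continuer", "Choose a language": "Choisir une langue"}

master_template = merge_templates(templates)
french_master = update_master(master_template, french_master)
per_package = spread(templates, french_master)
print(per_package["pkg-b"])  # {'Continue': 'Continuer', 'Partition disks': ''}
```

Because the shared string "Continue" is translated once in the master file, both packages receive the same translation, which is the consistency benefit mentioned earlier.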

l10n-sync also has the ability to update a branch of the installer development repository from the PO files in another branch. For instance, the current sarge branch is updated from the PO files in the trunk. This is again aimed at simplifying the work of translators, who only have to deal with one file.

2.4.3.2. Other levels handling

For levels 2 to 4, the translation files are dealt with individually. The translators work on each of them, complete them and then send them as bug reports against the relevant packages.

To make this work easier, the translation status pages maintenance scripts collect the files from either the Debian archive (when no public repository is accessible for the given package) or from the maintainer(s) CVS or SVN repositories. That allows translators to easily grab the files to translate without digging into packages and/or repositories.

The debian-installer i18n coordinators usually have commit access to the relevant repositories and follow the development of these packages through the Package Tracking System, which allows them to act as gateways between translators and maintainers and save some precious time for maintainers.

2.4.3.3. The translation status pages

The debian-installer i18n team maintains status pages for the translations of all levels. The main status page is, at the time of this writing, located at http://people.debian.org/%7Eseppy/d-i/translation-status.html. It gives news about the translations (usually modified strings), statistics about the translation of each level and access to all relevant files (PO files and POT files) for all levels. The status page also features graphics of the translation ratio progress and gives access to other useful resources.
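The kind of statistic shown on such a page can be sketched as follows. This Python example is an assumption about how a translation ratio might be computed, not a description of the actual status scripts: a PO file is reduced to a mapping from msgid to msgstr, and fuzzy entries are ignored. The sample strings are made up.

```python
def translation_ratio(po_entries):
    """Return the percentage of translated strings in one PO file.

    po_entries maps each msgid to its msgstr; an empty msgstr means the
    string is untranslated. Fuzzy entries are not modeled in this sketch.
    """
    if not po_entries:
        return 0.0
    translated = sum(1 for msgstr in po_entries.values() if msgstr)
    return 100.0 * translated / len(po_entries)


# Made-up entries, for illustration only.
example = {
    "Continue": "Continuer",
    "Go Back": "Revenir en arrière",
    "Partition disks": "Partitionner les disques",
    "Detecting hardware": "",
}
print(translation_ratio(example))  # 75.0
```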

The status page is built by scripts run from the i18n coordinators account on people.debian.org. These scripts spider through all relevant repositories or the Debian archive (when no repository is available for the package). They currently cannot easily work with distributed revision control systems. Two packages in the various debian-installer l10n levels use such systems (apt and dpkg). For these packages, one of the i18n coordinators maintains a special arch branch for l10n.

2.4.4. Quality Assurance work

2.4.4.1. Error checking, proofreading

Peer review is one of the cornerstones of translation work. Because translation is very tedious work, translators are very prone to errors, spelling mistakes and typos. For that reason, all teams that can afford the required manpower should invest a strong effort in peer review. The only drawback is that it needs at least two translators, which is not yet the case for all languages.

Except for automated spellchecking, no special infrastructure exists for peer review of translations. Several teams have set up their own policy and sometimes even documented it. The Dutch team's peer review system, for instance, is very well documented: http://wiki.debian.net/?l10nCoordination.

2.4.4.2. Spellchecking

During the very late steps of sarge development, Davide Viti set up a framework for automated spellchecking of debian-installer translations. This framework makes use of the aspell utility as well as aspell dictionaries for many of the languages supported in debian-installer.

A common word list is built for each debian-installer i18n level: this word list contains all words not recognized by aspell (most of the time, jargon words, words kept in English, acronyms, etc.). Each language team can also build its own word list with all words from this language which are not recognized by aspell even if they are correct (here again, most often jargon words, acronyms, country names, etc.).
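The role of the two word lists can be illustrated with a small Python sketch. It does not call aspell: a plain set of words stands in for the aspell dictionary, and the function name, word lists and sample sentence are all hypothetical.

```python
import re


def words_to_review(text, dictionary, common_words, language_words):
    """Return the words of a translation that no word list accounts for.

    dictionary stands in for the spellchecker's dictionary, common_words for
    the per-level common list, language_words for the language team's list.
    """
    unknown = []
    # Match runs of letters only (digits and underscores are excluded).
    for word in re.findall(r"[^\W\d_]+", text):
        lower = word.lower()
        if lower in dictionary or lower in common_words or lower in language_words:
            continue
        if word not in unknown:
            unknown.append(word)
    return unknown


# Made-up data, for illustration only.
dictionary = {"le", "serveur", "ne", "répond", "pas", "au"}
common_words = {"dhcp"}  # acronym kept in English
language_words = set()
print(words_to_review("Le serveur DHCP ne répond pas au kernell",
                      dictionary, common_words, language_words))  # ['kernell']
```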

The spellchecking system also checks for incorrect variable substitution in translations. Translators should not translate variable names in debconf translations, but mistakes or typos often happen in that matter, which results in "strange" displays to users. Variable substitution checking has been one of the most immediate benefits from the spellchecking framework for debian-installer.
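A variable substitution check of this kind can be sketched in Python. Debconf variables are written ${NAME} in templates; the function below compares the variable names found in the original string with those found in the translation. It is only a sketch of the idea, not the code used by the framework, and the sample strings are made up.

```python
import re

# Debconf template variables have the form ${NAME}.
VARIABLE = re.compile(r"\$\{([^}]+)\}")


def variable_mismatches(msgid, msgstr):
    """Return (missing, unexpected) variable names in a translation.

    missing: variables of the original that the translation lacks.
    unexpected: variables in the translation that the original does not have,
    typically a variable name that was translated by mistake.
    """
    original = set(VARIABLE.findall(msgid))
    translated = set(VARIABLE.findall(msgstr))
    return sorted(original - translated), sorted(translated - original)


# Correct translation: the variable name is left untouched.
print(variable_mismatches("Formatting ${DEVICE}...", "Formatage de ${DEVICE}..."))
# ([], [])

# Faulty translation: the variable name itself was translated.
print(variable_mismatches("Formatting ${DEVICE}...", "Formatage de ${PERIPHERIQUE}..."))
# (['DEVICE'], ['PERIPHERIQUE'])
```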

2.5. The future

2.5.1. Web-based translation tools

Web-based translation tools are currently gaining popularity in FLOSS development. They allow nearly everyone to participate in the translation effort because access is made very easy, even for people not running Free Software environments.

Web-based translation tools cannot completely replace specialized software such as KBabel, Poedit, Gtranslator or Emacs PO mode, especially for heavy translation work. However, they may be very convenient for translation maintenance as well as giving l10n teams more opportunities to get new (wo)manpower.

Two main systems have emerged recently:

  • Rosetta, developed for the Ubuntu project, is used quite widely in Ubuntu for l10n work. Being mostly developed by contributors in (or near) the Debian community, it may very likely be adapted for Debian specific needs. Its main drawback is being based on a technology which does not meet the Debian Free Software Guidelines (at least at the time of this writing).

  • Pootle, developed by contributors of the translate.org project, has gained very wide recognition from several major l10n projects in FLOSS. The developers have set up a server at http://pootle.wordforge.org but they also made the Pootle Toolkit available to teams who want to set up their own Pootle server. Setting up a Pootle server for Debian could be a way to go, which could help in debian-installer l10n as well as all other l10n work in the project.

2.5.2. Language packs

Currently, integrating a new language in debian-installer requires removing it from the prospective list so that the PO files for it are built into all debian-installer level 1 packages. Then, each one of these packages has to be rebuilt and uploaded for the language translations to be available in debian-installer.

The drawback here is that each additional language bloats debian-installer, so the size of the initial RAM disk image as well as the memory requirements of debian-installer may get out of control. This is the major reason why 9 languages have been kept out of the sarge debian-installer since September 2004, while the debian-installer team was doing its best to keep debian-installer in a releasable state so that it would no longer be a blocker for the whole release of sarge.

3. Installation Guide

The purpose of the Debian Installation Guide is to provide users with a definitive guide to installing Debian using any of the installation methods for all of the architectures supported by Debian. Although for some architectures the basics of the installation process are fairly well covered, the manual unfortunately falls well short of this goal.

A separate version of the Installation Guide is published for each architecture. It is available:

  • in the debian-installer-manual package
  • on the 1st CD of the full binary CD sets
  • on the release pages of the Debian website (Sarge version at http://www.debian.org/releases/sarge/installmanual)
  • on the d-i project pages on Alioth (Etch version; working version)

The following formats are available: HTML, text, PDF and PostScript. The PostScript format is not actually published anywhere, but can be built.

The current Installation Guide for Sarge is in large part a copy of the Woody manual for the boot-floppies installation system. A lot of work was done in 2003 and 2004 by Miroslav Kuře to document most of the components of debian-installer.

The chapters needing most work are those concerning preparation of the installation and booting the installer. These chapters still contain a lot of outdated text based on Woody. They should be reorganized to provide:

  • a solid general introduction to debian-installer, including system requirements

  • a description of the available installation methods (each consisting of a boot method and the source used to retrieve debian-installer components (udebs) and base system packages)

  • a description of the basic options available for the installation (e.g. installing using a 2.6 kernel; how to configure the network statically instead of using DHCP)

  • a description for each installation method of how to prepare for the installation, including how to obtain the necessary image/media and how to boot the installer

The main challenge is to keep the text concise while still covering all architectures and especially the differences between them.

3.1. Translations

Translation of the manual is very active. There are currently nine complete translations that are published on the official website. Only five of those are included on the installation CDs because the others were completed after the RC3 release of debian-installer. These nine languages are:

  • Simplified Chinese
  • Czech
  • French
  • German
  • Japanese
  • Portuguese
  • Portuguese (Brazilian)
  • Russian
  • Spanish

In addition to those nine translations, there are two that are almost complete (Korean and Traditional Chinese), four that are in progress (Greek, Italian, Catalan and Dutch) and another four that are currently not maintained (Tagalog, Basque, Finnish and Danish).

The translations that are not yet complete are published on Alioth. For the translations that are in progress, we try to keep the untranslated parts up-to-date in English so that the manual is still usable.

3.2. Technical aspects

The Installation Guide is written in XML. The text is divided into about 160 individual XML files, the division being mostly by either subject, d-i component or architecture.

There is only one source for all architectures; the source even contains some variation between the Sarge version of the installer and the version currently in development for Etch. The manual makes heavy use of conditions to determine which text is actually included in which version of the manual. This is regulated at build time through parameters set in the build scripts.
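
The idea of condition-based inclusion can be illustrated with a minimal Python sketch. It is not the manual's actual build code: the `condition` attribute name, the `;` separator and the sample text are assumptions made for illustration only. Elements that carry a condition are kept only when one of their values is among the conditions active for the current build.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample source: attribute name and values are assumptions.
SOURCE = """<chapter>
  <para>Common text.</para>
  <para condition="i386;powerpc">Boot from the firmware menu.</para>
  <para condition="sparc">Boot from OpenBoot.</para>
</chapter>"""


def profile(element, active):
    """Remove children whose 'condition' matches none of the active values."""
    for child in list(element):
        cond = child.get("condition")
        if cond is not None and not (set(cond.split(";")) & active):
            element.remove(child)
        else:
            profile(child, active)  # unconditional or matching: descend
    return element


def build(active):
    """Return the paragraph texts that survive profiling for one build."""
    root = profile(ET.fromstring(SOURCE), set(active))
    return [para.text for para in root.iter("para")]


if __name__ == "__main__":
    print(build(["sparc"]))
    print(build(["i386"]))
```

Calling `build` with a different set of active conditions yields a different manual from the same source, which is the property the single-source approach relies on.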

3.2.1. Building

The build system has evolved considerably over the past two years. There is one central script that builds the manual for a specific architecture, for a specific language and for one or more formats. This script is generally called by different wrapper scripts that build the whole set of versions of the manual:

  • for the debian-installer-manual package (per architecture), also used for the installation CDs; this script is called from the Makefile that builds a release of debian-installer
  • for the website; run on request by Frank Lichtenheld
  • "daily" builds (in practice 4 times a week); only launched by an update in the repository for a language

The last script also includes functions to update and commit POT and PO files (see below), mail build logs to translators, calculate translation statistics, create an index page, and upload and install the built files on Alioth.

The build system uses the following packages:

docbook, docbook-xml, docbook-xsl, xsltproc

The basic toolset to process the XML files and generate the HTML version of the manual.

openjade, jadetex, docbook-dsssl

Used to generate first a temporary TeX file and then a temporary DVI file from which the PDF format is generated.

gs-common

Used to generate the PostScript format from the temporary DVI file.

w3m

First a single-file HTML document is generated, after which w3m is used to generate the text file.
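
The dependencies between formats described above can be sketched as a small graph. This is only an illustration in Python, not the real build script: the artifact names and the way the tools are split across the individual steps are assumptions, and the tool that turns the DVI file into PDF is not named in the text. The sketch shows how shared intermediates (the TeX and DVI files) need to be built only once when several formats are requested.

```python
# Each artifact maps to (prerequisite, tools). Tool groups come from the
# package list above; how they split across steps is an assumption.
STEPS = {
    "html":        ("xml", "docbook-xsl, xsltproc"),
    "single-html": ("xml", "docbook-xsl, xsltproc"),
    "txt":         ("single-html", "w3m"),
    "tex":         ("xml", "openjade, docbook-dsssl"),
    "dvi":         ("tex", "jadetex"),
    "pdf":         ("dvi", "tool not named in the text"),
    "ps":          ("dvi", "gs-common"),
}


def plan(targets):
    """Return the artifacts to build, prerequisites first, without repeats."""
    ordered = []

    def visit(artifact):
        # The XML source always exists; built artifacts are not rebuilt.
        if artifact == "xml" or artifact in ordered:
            return
        prerequisite, _tools = STEPS[artifact]
        visit(prerequisite)
        ordered.append(artifact)

    for target in targets:
        visit(target)
    return ordered


if __name__ == "__main__":
    for artifact in plan(["pdf", "ps", "txt"]):
        print(artifact, "<-", STEPS[artifact][0], "using", STEPS[artifact][1])
```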

3.2.2. Translation tools

Originally translations were done by translating the raw XML files. As many translators are not very comfortable using XML, we introduced the option of using PO files for translation last year. This method is currently used by seven translations, of which two were converted from XML.

Three toolsets were considered to generate templates from the English XML files and XML files from the translated PO files:

  • po4a
  • poxml (from the KDE project)
  • po-debiandoc

po-debiandoc does not handle XML and therefore was not really an option. Both of the others were tested; both had issues. po4a had no problems converting the XML files, but would often divide text that logically belongs together into separate strings, mainly if a paragraph contained tags (like an example or footnote), which makes translation harder. poxml would keep paragraphs nicely together (including embedded tags), but choked on the fact that a lot of the individual XML files are not “well-formed” XML. In the end we chose poxml and built a set of scripts that would integrate multiple XML files into a single XML file per chapter, which is always well-formed. In practice we feel having the larger templates helps translation rather than hinders it.
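
Why merging helps can be shown with a minimal Python sketch. It does not reproduce our scripts: the fragment contents, the `chapter` wrapper and the function names are assumptions for illustration. A fragment with several top-level elements is not well-formed on its own, but the merged per-chapter file with a single root element is.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments; the first has two top-level elements.
FRAGMENTS = [
    "<title>Booting</title>\n<para>Intro.</para>",
    "<sect1><title>From CD</title><para>Insert the CD.</para></sect1>",
]


def is_well_formed(text):
    """Report whether the text parses as a single XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def merge_chapter(chapter_id, fragments):
    """Concatenate fragments under one root element for the chapter."""
    body = "\n".join(fragments)
    return f'<chapter id="{chapter_id}">\n{body}\n</chapter>'


if __name__ == "__main__":
    print(is_well_formed(FRAGMENTS[0]))
    print(is_well_formed(merge_chapter("boot", FRAGMENTS)))
```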

For the old XML based translations we had already introduced a rudimentary translation statistics overview showing number of files translated, not translated and, more recently, outdated. There also are scripts available to translators showing which files need updating and the changes in the English text since the last update of the translation.
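
A statistics overview of this kind can be sketched in a few lines of Python. The sketch assumes that each translated file records the revision of the English file it was translated from; that bookkeeping, the file names and the function name are hypothetical and only serve the illustration.

```python
def classify(english, translated):
    """Sort files into translated, untranslated and outdated.

    english:    {path: current revision of the English file}
    translated: {path: English revision the translation was based on}
    """
    stats = {"translated": [], "untranslated": [], "outdated": []}
    for path, revision in sorted(english.items()):
        if path not in translated:
            stats["untranslated"].append(path)
        elif translated[path] < revision:
            stats["outdated"].append(path)  # English changed since then
        else:
            stats["translated"].append(path)
    return stats


if __name__ == "__main__":
    english = {"welcome.xml": 12, "boot.xml": 30, "partman.xml": 7}
    translated = {"welcome.xml": 12, "boot.xml": 25}
    for state, files in classify(english, translated).items():
        print(state, len(files), files)
```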

For the translations using PO files, these are of course no longer needed. Instead we make use of the same framework as used for the translation statistics of the levels introduced earlier. Translations using PO files are also included in the spellchecking framework developed by Davide Viti, although this has not yet been formally introduced because some minor improvements are needed to make the spellchecks really usable.

The PO file framework has its own set of scripts that "merge" the individual XML files into the XML file per chapter, update the POT files after changes in the English XML files, update the PO files of translations and generate translated XML files from the PO files. These scripts are currently run automatically only for the daily unofficial build; for the official builds the generated XML files are saved to the Sarge branch of the repository. There is one special script that can convert an existing XML-based translation to PO files.
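
The update and generation steps can be pictured with a simplified Python sketch in which a PO file is reduced to a dictionary from English string to translation. Real PO handling (fuzzy matching, headers, plural forms) is deliberately left out, and the function names are hypothetical. Untranslated strings fall back to English, so the generated manual stays usable.

```python
def update_catalog(catalog, msgids):
    """Keep existing translations, add new strings, drop obsolete ones."""
    return {msgid: catalog.get(msgid, "") for msgid in msgids}


def render(msgids, catalog):
    """Produce translated text, falling back to English where untranslated."""
    return [catalog.get(msgid) or msgid for msgid in msgids]


if __name__ == "__main__":
    catalog = {"Welcome": "Bienvenue", "Old text": "Ancien texte"}
    msgids = ["Welcome", "New paragraph"]  # strings in the current template
    catalog = update_catalog(catalog, msgids)
    print(catalog)
    print(render(msgids, catalog))
```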

3.3. How can you help?

There are four areas where help with the manual is very welcome: general review and editing; architecture specific review and editing; build system improvement; translation. The last one is fairly obvious, but the others deserve some elaboration. If you would like to help, please mail us at debian-boot@lists.debian.org.

We are still making changes for the Sarge version of the manual and about once every two weeks the latest revision of both the English version and updated translations is published on the Debian website. This also means that the versions available on installation CDs and in the debian-installer-manual package are somewhat outdated.

For reference, here are our own plans for the near future:

  • reorganization of entities used:

    1. to make branding of the manual easier for derived distributions

    2. to make some entities translatable

  • restructuring of chapters 2 and 3 (although that has been in the planning for quite some time already)

  • creation of a style guide for the different formats and improvement of the readability and consistency of the manual

3.3.1. General review and editing

Review is something everybody can help with: take the manual, read through it, perform installation tests and let us know what parts can be improved. General remarks are not really useful at the moment as we are already aware that some restructuring is needed, especially in chapters 2 to 5. However, if you see things that are just plain wrong, or if there are things in the chapters “Welcome to Debian”, “Using the Debian Installer” or “Booting Into Your New Debian System” that you think should be improved, suggestions are very welcome (especially if a proposed new text is attached).

If you want to contribute on a higher level, you should invest some time to get familiar with the installer: try different installation methods and options, preferably on some different architectures[3]. You will also need to get familiar with the English XML files, but that should not be a big problem.

3.3.2. Architecture specific review and editing

A lot of sections in the manual are architecture specific. For some architectures these have not been checked at all and so still document a Woody boot-floppies installation more than a Sarge debian-installer installation. For these architectures a warning has been included at the start of the Installation Guide.

What we face here is basically the same issue already mentioned earlier in this article and also partly at the basis of the Vancouver proposal: we really need people involved in our ports to care about other aspects of porting than getting the kernel to run and packages built.

Help for all architectures is welcome, but the following are officially in bad shape (some more than others): arm, hppa, mips(el), powerpc, s390, sparc. The most urgent of those obviously is powerpc as that is likely to have the largest newbie user base.

Working on this will probably mean you will have to get at least somewhat familiar with the XML files to find out which parts are architecture specific and how architecture specific changes can safely be made.

3.3.3. Build system improvement

The build system is currently fairly complicated. Part of this is probably inevitable, but some parts could probably be improved by using different tools to generate e.g. the PDF and TXT formats. What would be very welcome is some help from packagers and experienced users of available tools to help choose the best toolset or improve the way we use the current tools.

Currently we cannot build a PDF or PS version of the manual for Greek and for oriental languages. Also, the font used for Russian is not very nice. Nikolai Prokoschenko has done a lot of preliminary work using db2latex to solve this. However, help is needed to either finish his work and package it for Debian or to find another solution.

There also seems to be interest to use the tools used for the manual for other Debian documents, although this seems to depend on who you talk to. It would be very nice to get some more interaction with other groups within Debian working on documentation and documentation toolsets and maybe work towards a more common infrastructure.

4. The future of Debian Installer: how to help

Here are some ways you can get involved and help debian-installer development.

4.1. Translate debian-installer

We are still looking for more new languages, as well as more translators to keep the translations we have current.

4.2. Make it install an UTF-8 system

With all these languages, it's time Debian became UTF-8 by default. If you agree, then research the pitfalls, convince everyone this is the right choice, and make it happen.

4.3. Get involved in supporting your architecture in debian-installer

debian-installer has between zero and two porters for most architectures. This sometimes results in debian-installer developers who have little experience with an architecture making important changes for it. For example, a few months ago kernel updates for debian-installer hppa had to be done by a developer who knew very little about the architecture. This was not ideal.

Continued involvement of porters to the less popular architectures is essential if debian-installer is to continue to support those architectures, let alone improve its support. It's quite possible that future debian-installer beta releases for etch may not be available for all ports, if we don't have active developers and testers for those ports.

We wouldn't mind seeing debian-installer work as an installer for the Hurd or a BSD either.

4.4. Work on debian-installer updates for sarge

debian-installer's modular design makes it pretty easy to update individual udebs in a sarge point release to fix important bugs. Many such bugs are already known and fixed in unstable; some fixes need backporting to sarge, while some fixes are backported to our sarge branch and need only testing and uploads.

Some examples of things it would be nice to fix include some nasty bugs in translations (especially debconf substitution variable typos), updates to the mirror list to reflect changes since rc3, kernel security updates, and updates to the discover1-data hardware detection database to make sarge autodetect some more hardware.

4.5. Work on supporting sarge installs using the etch installer

It should be possible to install sarge using a debian-installer with a newer kernel than 2.6.8 and 2.4.27, so that as new hardware comes out sarge can still be installed on it. Etch's debian-installer is planned to be backwards compatible and capable of installing sarge, but to make it work usefully some packages in sarge (such as kernel and hardware detection packages) will need to be updated in a backports-like repository. And someone would need to do the work to set up such a repository, put together a version of debian-installer that can use this repository, and make CD images based on it.

4.6. Help us develop a graphical installer

Work on a graphical version of debian-installer has already begun, and indeed it's now mostly usable. It still needs a lot of work, so people proficient in UI polishing, graphics, and the technical side of making it work well on a variety of graphics hardware are needed in the debian-installer team.

4.7. Help improve automated installs

Automated installs work for sarge, but the documentation is poor, they can be hard to set up, and the need to preseed parameters to set up a network interface at the kernel boot line is unwieldy and problematic. Also, work needs to be done on making the partitioner more flexible and friendly for automated installs, enabling automated LVM and RAID setup, and so on.

4.8. Add encrypted filesystem support to partman

Partman was designed with support for encrypted filesystems in mind, so this is just a small matter of code. Your laptop will thank you.

4.9. Help improve hardware detection

We hope to be switching debian-installer to use hotplug for hardware detection, at least for 2.6, based on work done by Ubuntu, and we need people to help work on that. Also one of sarge's weak points is that SATA often doesn't work very well, or that other HDD modules are loaded in the wrong order to work properly. We also have issues with network interfaces sometimes changing names between the installer and the installed system.

4.10. Help support non-free drivers/firmware

Since Debian is removing non-free firmware from main for etch, we need to make sure that debian-installer can still make use of that firmware, from non-free, if the user needs it. We have some interesting ideas involving initramfs and catting non-free images to free debian-installer images, but all this needs developers to make happen or to come up with better ideas.

4.11. Help support secure apt

debian-installer will need changes to support secure apt and signed Packages files. To secure the whole installation, debian-installer needs to check signatures of the udebs it downloads, too.
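
The chain of checks this implies can be sketched in Python. This is an illustration under loud assumptions, not apt's or debian-installer's actual code: the signature check itself is not shown (the trusted digest is assumed to come from already verified, signed metadata), and the one-line-per-file "name digest" index format is a hypothetical simplification of a real Packages file.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest used for all comparisons in this sketch."""
    return hashlib.sha256(data).hexdigest()


def parse_index(index: bytes) -> dict:
    """Parse hypothetical 'filename digest' lines into a dictionary."""
    entries = {}
    for line in index.decode("ascii").splitlines():
        if not line.strip():
            continue
        name, value = line.split()
        entries[name] = value
    return entries


def udeb_is_trusted(name, payload, index, trusted_index_digest):
    """Accept a udeb only if it is covered by a verified index."""
    # Step 1: the index must match a digest taken from signed metadata.
    if digest(index) != trusted_index_digest:
        return False
    # Step 2: the udeb must match the digest listed in that index.
    return parse_index(index).get(name) == digest(payload)


if __name__ == "__main__":
    payload = b"fake udeb contents"
    index = ("partman-base.udeb " + digest(payload) + "\n").encode("ascii")
    print(udeb_is_trusted("partman-base.udeb", payload, index, digest(index)))
```

The point of the two steps is that a tampered udeb and a tampered index are both rejected, as long as the digest of the index comes from a source whose signature has been checked.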

4.12. Help support installation by the blind

Sarge's installer only supports blind users installing on i386 with 2.4, using floppies. We need someone who is committed to making this better and making debian-installer fully accessible to these users.

4.13. Help improve base-config, X setup, etc

The installation doesn't finish where debian-installer proper leaves off. The part of the install that follows is noticeably lacking polish compared to the rest of debian-installer, and we need to find ways to improve that. One possibility is to move all of this part into the first stage of the installer, so that the installer boots directly into a fully installed desktop system.

4.14. Add support for installs via PPPOE

PPPOE is not well integrated with the installer: it is not supported in the first stage install and has to be set up by hand in the second stage. To get this improved we need a developer who is familiar with PPPOE.

4.15. Make it easier to build custom CDs and otherwise customize the installer

Many of our users need custom versions of the installer, and we need improved documentation and/or better tools for doing this.

A particular problem area is CD generation, which is very hard to get working. That could be improved by making some easier CD creation tool (like pickax) support debian-installer, or improving debian-cd, or some other way.

Another problem area is building versions of the installer with a custom kernel or third party modules. This needs to be made easier, so that a regular user can hope to do it, and it needs to be well documented.

4.16. Find a way to remove a single question from the standard install process

Every question the installer asks a user is one more place for the installation process to go wrong or for a newbie to give up. During your next install, pick a question and figure out ways to do away with it.

Some candidates:

  • Why should the installer ask the user to choose a Debian mirror when it can pick a good one?

  • Could the installer get enough sanity checks added so it could sometimes safely install grub without asking the user for confirmation?

  • Could the installer automatically resize existing Windows partitions to make room for Debian, without the user having to work out how to do it by hand?

  • Could the installer figure out that a machine is a laptop and install the right set of packages for it?

4.17. Invent the next great thing in debian-installer

The debian-installer developers love to take a good idea and run with it. For example, last spring some basic work was contributed to let debian-installer detect other OSes and other Linux distros also installed on the machine. Now we detect a dozen or so OSes and all major distros; this feature is a key part of debian-installer and is used in the bootloader setup, partitioner, and elsewhere. It wouldn't have happened without that initial idea and the beginning of work on what has become our os-prober component. Next time you install sarge, see if you can think of a novel way that debian-installer could improve the installation process -- and then work on coding up enough of it so there's a plausible premise of a nice feature -- and we will help you take it the rest of the way.

4.18. Add moon-buggy to debian-installer

This one has just been on our TODO list for too long to be ignored.

5. Conclusion

Please don't assume that debian-installer development is over, or that it can be put off until near the release of etch. We lost some development momentum while the installer was frozen for the sarge release. Now we're trying to ramp things back up, involve as many developers with the installer as we can, and make sarge's old installer, in a year's time, look as quaint as the boot-floppies do today.

So, check out the source, read the docs and join the team!



[1] Internationalization, often abbreviated to i18n, is the action of getting software ready to work with different languages in different countries

[2] Localization, often abbreviated to l10n, is the action of translating software and documentation to different languages

[3] Use hercules to install Debian on an emulated S/390 mainframe!