Planet Skolelinux

April 23, 2014

Petter Reinholdtsen

Install hardware dependent packages using tasksel (Isenkram 0.7)

It would be nice if it was easier in Debian to get all the hardware related packages relevant for the computer installed automatically. So I implemented a way to do it, using my Isenkram package. To use it, install the tasksel and isenkram packages and run tasksel as user root. You should be presented with a new option, "Hardware specific packages (autodetected by isenkram)". When you select it, tasksel will install the packages isenkram claims are fit for the current hardware, hot pluggable or not.

The implementation is in two files: one is the tasksel menu entry description, and the other is the script used to extract the list of packages to install. The first part is in /usr/share/tasksel/descs/isenkram.desc and looks like this:

Task: isenkram
Section: hardware
Description: Hardware specific packages (autodetected by isenkram)
 Based on the detected hardware various hardware specific packages are
 proposed.
Test-new-install: mark show
Relevance: 8
Packages: for-current-hardware

The second part is in /usr/lib/tasksel/packages/for-current-hardware and looks like this:

#!/bin/sh
#
(
    isenkram-lookup
    isenkram-autoinstall-firmware -l
) | sort -u

All in all, a very short and simple implementation making it trivial to install the hardware dependent packages we all may want to have installed on our machines. I've not been able to find a way to get tasksel to tell you exactly which packages it plans to install before doing the installation. So if you are curious or careful, check the output from the isenkram-* command line tools first.
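A minimal sketch of such a check, assuming both isenkram command line tools are installed and behave as in the for-current-hardware script. The function name preview_hw_packages is made up for this illustration and is not part of isenkram:

```shell
#!/bin/sh
# Print the packages the tasksel task would propose, one per line.
# Runs the same two commands as the for-current-hardware script;
# the function name is hypothetical.
preview_hw_packages() {
    (
        isenkram-lookup
        isenkram-autoinstall-firmware -l
    ) | sort -u
}
```

Calling preview_hw_packages before starting tasksel shows the merged, de-duplicated list that the menu entry would act on.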

The information about which packages are handling which hardware is fetched either from the isenkram package itself in /usr/share/isenkram/, from git.debian.org or from the APT package database (using the Modaliases header). The APT package database parsing has caused a nasty resource leak in the isenkram daemon (bugs #719837 and #730704). The cause is in the python-apt code (bug #745487), but using a workaround I was able to get rid of the file descriptor leak and reduce the memory leak from ~30 MiB per hardware detection down to around 2 MiB per hardware detection. It should make the desktop daemon a lot more useful. The fix is in version 0.7 uploaded to unstable today.

I believe the current way of mapping hardware to packages in Isenkram is a good draft, but in the future I expect isenkram to use the AppStream data source for this. A proposal for getting proper AppStream support into Debian is floating around as DEP-11, and a GSoC project will take place this summer to improve the situation. I look forward to seeing the result, and welcome patches for isenkram to start using the information when it is ready.

If you want your package to map to some specific hardware, either add a "Xb-Modaliases" header to your control file like I did in the pymissile package or submit a bug report with the details to the isenkram package. See also all my blog posts tagged isenkram for details on the notation. I expect the information will be migrated to AppStream eventually, but for the moment I have no better place to store it.

April 23, 2014 12:50 PM

April 15, 2014

Petter Reinholdtsen

FreedomBox milestone - all packages now in Debian Sid

The Freedombox project is working on providing the software and hardware to make it easy for non-technical people to host their data and communication at home, and to communicate with their friends and family encrypted and away from prying eyes. It is still going strong, and today a major milestone was reached.

Today, the last of the packages currently used by the project to create the system images were accepted into Debian Unstable. It was the freedombox-setup package, which is used to configure the images during build and on the first boot. Now all one needs to get going is the build code from the freedom-maker git repository and packages from Debian. And once the freedombox-setup package enters testing, we can build everything directly from Debian. :)

Some key packages used by Freedombox are freedombox-setup, plinth, pagekite, tor, privoxy, owncloud and dnsmasq. There are plans to integrate more packages into the setup. User documentation is maintained on the Debian wiki. Please check out the manual and help us improve it.

To test for yourself and create boot images with the FreedomBox setup, run this on a Debian machine using a user with sudo rights to become root:

sudo apt-get install git vmdebootstrap mercurial python-docutils \
  mktorrent extlinux virtualbox qemu-user-static binfmt-support \
  u-boot-tools
git clone http://anonscm.debian.org/git/freedombox/freedom-maker.git \
  freedom-maker
make -C freedom-maker dreamplug-image raspberry-image virtualbox-image

Root access is needed to run debootstrap and mount loopback devices. See the README in the freedom-maker git repo for more details on the build. If you do not want all three images, trim the make line. Note that the virtualbox-image target is not really virtualbox specific. It creates an x86 image usable in kvm, qemu, vmware and any other x86 virtual machine environment. You might need the version of vmdebootstrap in Jessie to get the build working, as it includes fixes for a race condition with kpartx.

If you instead want to install using a Debian CD and the preseed method, boot a Debian Wheezy ISO and use this boot argument to load the preseed values:

url=http://www.reinholdtsen.name/freedombox/preseed-jessie.dat

I have not tested it myself the last few weeks, so I do not know if it still works.

If you wonder how to help, one task you could look at is using systemd as the boot system. It will become the default for Linux in Jessie, so we need to make sure it is usable on the Freedombox. I did a simple test a few weeks ago, and noticed dnsmasq failed to start during boot when using systemd. I suspect there are other problems too. :) To detect problems, there is a test suite included, which can be run from the plinth web interface.

Give it a go and let us know how it goes on the mailing list, and help us get the new release published. :) Please join us on IRC (#freedombox on irc.debian.org) and the mailing list if you want to help make this vision come true.

April 15, 2014 08:10 PM

April 11, 2014

Petter Reinholdtsen

Language codes for POSIX locales in Norway

Twelve years ago, I wrote a short note on the use of language codes in Norway. I was just reminded of it when I was asked whether the note was still relevant, and thought it would be useful to repeat what still applies. What I wrote back then is just as relevant today.

When selecting a language in programs on Unix, one chooses among many language codes. For the languages in Norway, the following language codes are recommended (recommended locale in parentheses):

nb (nb_NO)
Bokmål in Norway
nn (nn_NO)
Nynorsk in Norway
se (se_NO)
Northern Sami in Norway

All programs using other codes should be changed.

The language code should be used when .po files are named and installed. This is not the same as the locale code. For Norwegian Bokmål, the files should be named nb.po, while the locale (LANG) should be nb_NO.

Unless we get these codes standardised across all programs with Norwegian translations, it is impossible to give the LANG variable a single value that works for all programs.

The language codes are the official codes from ISO 639, and their use in connection with POSIX locales is standardised in RFC 3066 and ISO 15897. This recommendation is in line with those standards.

The following codes are or have been in use as locale values for "Norwegian" languages. They should be avoided, and replaced when discovered:

norwegian -> nb_NO
bokmål -> nb_NO
bokmal -> nb_NO
nynorsk -> nn_NO
no -> nb_NO
no_NO -> nb_NO
no_NY -> nn_NO
sme_NO -> se_NO
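The replacement table can be sketched as a small shell function. This is only an illustration: the function name normalize_no_locale is made up, and keeping an encoding or modifier suffix such as ".UTF-8" untouched is an assumption added here, not something the table itself specifies.

```shell
#!/bin/sh
# Map a legacy Norwegian locale value to the recommended one.
# Any encoding or modifier suffix (".UTF-8", "@euro") is kept as-is;
# that part is an assumption of this sketch. Unknown values are
# printed unchanged. The function name is hypothetical.
normalize_no_locale() {
    value=$1
    base=${value%%[.@]*}        # part before the first "." or "@"
    suffix=${value#"$base"}     # remaining suffix, may be empty
    case $base in
        norwegian|bokmål|bokmal|no|no_NO) base=nb_NO ;;
        nynorsk|no_NY)                    base=nn_NO ;;
        sme_NO)                           base=se_NO ;;
    esac
    printf '%s\n' "$base$suffix"
}
```

For example, normalize_no_locale no_NY.UTF-8 prints nn_NO.UTF-8, while a value that is already recommended, such as nb_NO, passes through unchanged.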

Note, regarding the Sami languages, that se_NO in practice refers to Northern Sami in Norway, while for example smj_NO refers to Lule Sami. This note is however not intended to give advice on Sami language codes; the Divvun project does a better job there.

April 11, 2014 07:30 PM

April 09, 2014

Petter Reinholdtsen

S3QL, a locally mounted cloud file system - nice free software

For a while now, I have been looking for a sensible offsite backup solution for use at home. My requirements are simple: it must be cheap and locally encrypted (in other words, I keep the encryption keys, and the storage provider does not have access to my private files). One idea my friends and I had many years ago, before the cloud storage providers showed up, was to use Google mail as storage, writing a Linux block device storing blocks as emails in the mail service provided by Google, and thus get heaps of free space. On top of this one can add encryption, RAID and volume management to have lots of (fairly slow, I admit that) cheap and encrypted storage. But I never found time to implement such a system. The last few weeks, however, I have looked at a system called S3QL, a locally mounted network backed file system with the features I need.

S3QL is a fuse file system with a local cache and cloud storage, handling several different storage providers, any with an Amazon S3, Google Drive or OpenStack API. There are heaps of such storage providers. S3QL can also use a local directory as storage, which combined with sshfs allows for file storage on any ssh server. S3QL includes support for encryption, compression, de-duplication, snapshots and immutable file systems, allowing me to mount the remote storage as a local mount point, look at and use the files as if they were local, while the content is stored in the cloud as well. This allows me to have a backup that should survive fire. The file system can not be shared between several machines at the same time, as only one can mount it at a time, but any machine with the encryption key and access to the storage service can mount it if it is unmounted.

It is simple to use. I'm using it on Debian Wheezy, where the package is included already. So to get started, run apt-get install s3ql. Next, pick a storage provider. I ended up picking Greenqloud, after reading their nice recipe on how to use S3QL with their Amazon S3 service, because I trust the laws in Iceland more than those in USA when it comes to keeping my personal data safe and private, and thus would rather spend money on a company in Iceland. Another nice recipe is available from the article S3QL Filesystem for HPC Storage by Jeff Layton in the HPC section of Admin magazine. When the provider is picked, figure out how to get the API key needed to connect to the storage API. With Greenqloud, the key did not show up until I had added payment details to my account.

Armed with the API access details, it is time to create the file system. First, create a new bucket in the cloud. This bucket is the file system storage area. I picked a bucket name reflecting the machine that was going to store data there, but any name will do. I'll refer to it as bucket-name below. In addition, one needs the API login and password, and a locally created password. Store it all in ~root/.s3ql/authinfo2 like this:

[s3c]
storage-url: s3c://s.greenqloud.com:443/bucket-name
backend-login: API-login
backend-password: API-password
fs-passphrase: local-password
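The file holds both the API credentials and the encryption passphrase, so it makes sense to keep it readable by its owner only. A minimal sketch of writing it that way follows. The function names, the argument order and the use of /dev/urandom with base64 as a stand-in password generator are all assumptions made for this illustration:

```shell
#!/bin/sh
# Print a 48 character random passphrase (36 random bytes, base64 encoded).
# Stand-in for any sensible password generator; the name is hypothetical.
random_passphrase() {
    head -c 36 /dev/urandom | base64 | tr -d '\n'
}

# Write an S3QL authinfo file readable only by its owner.
# Usage: write_authinfo FILE STORAGE-URL LOGIN PASSWORD PASSPHRASE
# The function name and argument order are made up for this sketch.
write_authinfo() {
    (
        umask 077                      # new files and directories are private
        mkdir -p "$(dirname "$1")"
        cat > "$1" <<EOF
[s3c]
storage-url: $2
backend-login: $3
backend-password: $4
fs-passphrase: $5
EOF
        chmod 600 "$1"                 # also tighten a file that already existed
    )
}
```

A call could look like write_authinfo /root/.s3ql/authinfo2 s3c://s.greenqloud.com:443/bucket-name API-login API-password "$(random_passphrase)". Remember to store the passphrase somewhere safe as well, as the data can not be decrypted without it.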

I create my local passphrase using pwget 50 or similar, but any sensible way to create a fairly random password should do it. Armed with these details, it is now time to run mkfs, entering the API details and password to create it:

# mkdir -m 700 /var/lib/s3ql-cache
# mkfs.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
  --ssl s3c://s.greenqloud.com:443/bucket-name
Enter backend login: 
Enter backend password: 
Before using S3QL, make sure to read the user's guide, especially
the 'Important Rules to Avoid Loosing Data' section.
Enter encryption password: 
Confirm encryption password: 
Generating random encryption key...
Creating metadata tables...
Dumping metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Compressing and uploading metadata...
Wrote 0.00 MB of compressed metadata.
# 

The next step is mounting the file system to make the storage available.

# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
  --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
Using 4 upload threads.
Downloading and decompressing metadata...
Reading metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Mounting filesystem...
# df -h /s3ql
Filesystem                              Size  Used Avail Use% Mounted on
s3c://s.greenqloud.com:443/bucket-name  1.0T     0  1.0T   0% /s3ql
#

The file system is now ready for use. I use rsync to store my backups in it, and as the metadata used by rsync is downloaded at mount time, no network traffic (and storage cost) is triggered by running rsync. To unmount, one should not use the normal umount command, as this will not flush the cache to the cloud storage, but instead run the umount.s3ql command like this:

# umount.s3ql /s3ql
# 
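The mount, rsync and unmount steps can be tied together in a small script. This is a sketch under stated assumptions, not a tested backup tool: the function names, the DRY_RUN switch and the plain rsync -a invocation are made up here, while the S3QL options, paths and storage URL are the ones used earlier in this post.

```shell
#!/bin/sh
# One backup cycle: mount, rsync, unmount.
# With DRY_RUN=1 the commands are printed instead of executed.
# Function names, DRY_RUN and the rsync options are assumptions.
STORAGE_URL=s3c://s.greenqloud.com:443/bucket-name
MOUNTPOINT=/s3ql

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

backup_cycle() {
    src=$1
    run mount.s3ql --cachedir /var/lib/s3ql-cache \
        --authfile /root/.s3ql/authinfo2 --ssl --allow-root \
        "$STORAGE_URL" "$MOUNTPOINT" || return 1
    run rsync -a "$src" "$MOUNTPOINT/"
    status=$?
    # Always unmount with umount.s3ql, even if rsync failed,
    # so that the cache is flushed to the cloud storage.
    run umount.s3ql "$MOUNTPOINT" || return 1
    return "$status"
}
```

Running DRY_RUN=1 backup_cycle /home prints the three commands without touching the file system, which is a cheap way to review them before a real run as root.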

There is a fsck command available to check the file system and correct any problems detected. This can be used if the local server crashes while the file system is mounted, to reset the "already mounted" flag. This is what it looks like when processing a working file system:

# fsck.s3ql --force --ssl s3c://s.greenqloud.com:443/bucket-name
Using cached metadata.
File system seems clean, checking anyway.
Checking DB integrity...
Creating temporary extra indices...
Checking lost+found...
Checking cached objects...
Checking names (refcounts)...
Checking contents (names)...
Checking contents (inodes)...
Checking contents (parent inodes)...
Checking objects (reference counts)...
Checking objects (backend)...
..processed 5000 objects so far..
..processed 10000 objects so far..
..processed 15000 objects so far..
Checking objects (sizes)...
Checking blocks (referenced objects)...
Checking blocks (refcounts)...
Checking inode-block mapping (blocks)...
Checking inode-block mapping (inodes)...
Checking inodes (refcounts)...
Checking inodes (sizes)...
Checking extended attributes (names)...
Checking extended attributes (inodes)...
Checking symlinks (inodes)...
Checking directory reachability...
Checking unix conventions...
Checking referential integrity...
Dropping temporary indices...
Backing up old metadata...
Dumping metadata...
..objects..
..blocks..
..inodes..
..inode_blocks..
..symlink_targets..
..names..
..contents..
..ext_attributes..
Compressing and uploading metadata...
Wrote 0.89 MB of compressed metadata.
# 

Thanks to the cache, working on files that fit in the cache is very quick, about the same speed as local file access. Uploading large amounts of data is for me limited by the bandwidth out of and into my house. Uploading 685 MiB with a 100 MiB cache gave me 305 kiB/s, which is very close to my upload speed, and downloading the same Debian installation ISO gave me 610 kiB/s, close to my download speed. Both were measured using dd. So for me, the bottleneck is my network, not the file system code. I do not know what a good cache size would be, but suspect that the cache should be larger than your working set.

I mentioned that only one machine can mount the file system at a time. If another machine tries, it is told that the file system is busy:
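To put those rates in perspective, the transfer time follows directly from the size and the sustained rate. A small sketch of the arithmetic, with a made-up function name and integer division that rounds down:

```shell
#!/bin/sh
# Estimate how long a transfer takes at a given sustained rate.
# Usage: transfer_seconds SIZE_IN_MIB RATE_IN_KIB_PER_S
# Integer arithmetic, rounded down; the function name is hypothetical.
transfer_seconds() {
    echo $(( $1 * 1024 / $2 ))
}

transfer_seconds 685 305   # upload of the ISO, prints 2299
transfer_seconds 685 610   # download of the ISO, prints 1149
```

That is roughly 38 minutes for the upload and 19 minutes for the download of the 685 MiB image.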

# mount.s3ql --cachedir /var/lib/s3ql-cache --authfile /root/.s3ql/authinfo2 \
  --ssl --allow-root s3c://s.greenqloud.com:443/bucket-name /s3ql
Using 8 upload threads.
Backend reports that fs is still mounted elsewhere, aborting.
#

The file content is uploaded when the cache is full, while the metadata is uploaded once every 24 hours by default. To ensure the file system content is flushed to the cloud, one can either umount the file system, or ask S3QL to flush the cache and metadata using s3qlctrl:

# s3qlctrl upload-meta /s3ql
# s3qlctrl flushcache /s3ql
# 

If you are curious about how much space your data uses in the cloud, and how much compression and deduplication cut down on the storage usage, you can use s3qlstat on the mounted file system to get a report:

# s3qlstat /s3ql
Directory entries:    9141
Inodes:               9143
Data blocks:          8851
Total data size:      22049.38 MB
After de-duplication: 21955.46 MB (99.57% of total)
After compression:    21877.28 MB (99.22% of total, 99.64% of de-duplicated)
Database size:        2.39 MB (uncompressed)
(some values do not take into account not-yet-uploaded dirty blocks in cache)
#

I mentioned earlier that there are several possible suppliers of storage. I did not try to locate them all, but am aware of at least Greenqloud, Google Drive, Amazon S3 web services, Rackspace and Crowncloud. The latter even accepts payment in Bitcoin. Pick one that suits your needs. Some of them provide several GiB of free storage, but the price models are quite different and you will have to figure out what suits you best.

While researching this blog post, I had a look at research papers and posters discussing the S3QL file system. There are several, which told me that the file system is getting a critical check by the science community and increased my confidence in using it. One nice poster is titled "An Innovative Parallel Cloud Storage System using OpenStack's Swift Object Store and Transformative Parallel I/O Approach" by Hsing-Bung Chen, Benjamin McClelland, David Sherrill, Alfred Torrez, Parks Fields and Pamela Smith. Please have a look.

Given my problems with different file systems earlier, I decided to check out the mounted S3QL file system to see if it would be usable as a home directory (in other words, that it provided POSIX semantics when it comes to locking and umask handling etc). Running my test code to check file system semantics, I was happy to discover that no error was found. So the file system can be used for home directories, if one chooses to do so.

If you do not want a locally mounted file system, and want something that works without the Linux fuse file system, I would like to mention the Tarsnap service, which also provides locally encrypted backup using a command line client. It has a nicer access control system, where one can split out read and write access, allowing some systems to write to the backup and others to only read from it.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

April 09, 2014 09:30 AM

April 08, 2014

Petter Reinholdtsen

The EU Court of Justice confirmed today that the Data Retention Directive is illegal

Today the ruling from the EU Court of Justice on the Data Retention Directive finally arrived, and not surprisingly the directive was ruled illegal and in violation of the citizens' fundamental rights. If you wonder what the Data Retention Directive is, there is a great documentary available from NRK that I have earlier recommended everyone to watch.

Here is a small selection of news stories about the case, and I expect more will show up during the day. More can be found via mylder.

I think it is very good that yet another voice establishes that totalitarian surveillance of the population is unacceptable, but it is still just as important to protect the private sphere as before, as the technological possibilities still exist and are being exploited, and I believe efforts in projects like Freedombox and Dugnadsnett are more important than ever.

Update 2014-04-08 12:10: The fundraiser to stop the Data Retention Directive in Norway is run by the association Digitalt Personvern, which has collected 843 215 kroner so far but will probably need much more unless Høyre and Arbeiderpartiet change their minds on the issue. Only the parties Høyre and Arbeiderpartiet voted for the Data Retention Directive, and one of them must change its mind for there to be a majority against it in Stortinget. See more about the issue at Holder de ord.

April 08, 2014 09:30 AM