Portage tree using SquashFS and OverlayFS

Category: Gentoo Tags: Portage, SquashFS, OverlayFS

How to put your main portage repository into a squashfs image and still be able to update/sync it using overlayfs.


Preface

Gentoo normally stores the portage repository tree under /usr/portage. It consists of roughly 200k files and needs around 950MB of disk space.
That's a lot if you run from a flash card or USB stick. The number of files in particular exhausts a large portion of the inodes on smaller partitions, so you might need to adjust the inode count when creating the filesystem.
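
To see what your own tree costs, something like the following could be used. It is only a sketch: tree_stats is a name I made up, and the default path is simply the one used in this article.

```shell
# Print the number of files and the disk usage (in MB) of a directory tree.
tree_stats() {
    dir=${1:-/usr/portage}
    files=$(find "$dir" -type f | wc -l)
    size_mb=$(du -sm "$dir" | cut -f1)
    echo "files=$((files)) size_mb=$size_mb"
}

# Example call (path as used in this article):
# tree_stats /usr/portage
# Free inodes on the filesystem holding the tree:
# df -i /usr
```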

Putting the portage tree into a squashfs image leaves only one file that needs around 100MB of disk space with the default configuration. That's much better, but squashfs images are not writeable, so what do you do when updating/syncing your portage tree?

This is where overlay filesystems come in handy. Believe it or not, overlayfs has been part of the Linux kernel since 3.18, so no extra kernel patches are needed.

So how does this work? We build a squashfs image from the portage tree, mount it, and put an overlayfs on top of it.
Now the tree is writeable and portage updates/syncs are possible.
Later we only need to rebuild the squashfs image when the tree has changed.


Preparations - Kernel

First you need to make sure squashfs and overlayfs are enabled in your kernel.

File systems  --->
 [*] Miscellaneous filesystems  --->
   <*>   SquashFS 4.0 - Squashed file system support
...
 <*> Overlay filesystem support

Also enable any additional compression algorithms you want to use, and the 4k device block size option if you run from a flash card or USB stick.

To mount an image you need a loop device too.

[*] Block devices  --->
  <*>   Loopback device support

We will use /dev/shm for write operations. It should already be enabled and mounted, as that is standard on Gentoo, so no further adjustments should be needed.

You can of course build the requirements as modules rather than statically into the kernel.
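
A quick way to check an existing kernel config for the three options could look like this. It is a sketch under assumptions: check_kernel_opts is a made-up helper and the default config path is a guess about your system. CONFIG_SQUASHFS, CONFIG_OVERLAY_FS and CONFIG_BLK_DEV_LOOP are the symbols behind the menu entries above.

```shell
# Report whether the kernel options needed here are built in (=y) or modular (=m).
# The default config path is an assumption; pass your own as the first argument.
check_kernel_opts() {
    config=${1:-/usr/src/linux/.config}
    for opt in CONFIG_SQUASHFS CONFIG_OVERLAY_FS CONFIG_BLK_DEV_LOOP; do
        if grep -Eq "^${opt}=(y|m)\$" "$config"; then
            echo "$opt: ok"
        else
            echo "$opt: missing"
        fi
    done
}

# Example call:
# check_kernel_opts /usr/src/linux/.config
```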


Preparations - Filesystem

By default emerge also stores distfiles inside your portage tree (under /usr/portage/distfiles). They are rather large and already compressed, so there is no need to put them into the image. They would also exhaust RAM, because all overlayfs changes are backed by /dev/shm.

I store those files under /var/lib/portage/distfiles.

mkdir -p /var/lib/portage/distfiles
chown portage:portage /var/lib/portage/distfiles

Tell portage to use this directory for storing distfiles. Edit /etc/portage/make.conf and add:

DISTDIR="/var/lib/portage/distfiles"
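
To double check the setting, the value can be read back from the file. This is a minimal sketch, assuming your make.conf only contains plain VAR="value" assignments that a shell can source. get_distdir is a made-up name. On a real system, `portageq envvar DISTDIR` should report the value Portage actually uses.

```shell
# Print the DISTDIR value set in a make.conf-style file.
# The file is sourced in a subshell so its variables do not leak
# into the current shell.
get_distdir() {
    conf=${1:-/etc/portage/make.conf}
    ( . "$conf" && printf '%s\n' "$DISTDIR" )
}

# Example call:
# get_distdir /etc/portage/make.conf
```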

Remove the old distfiles from the current portage tree.

rm /usr/portage/distfiles/*

Emerge squashfs-tools for creating squashfs filesystems.

emerge -av squashfs-tools

Enable the desired compression algorithms via USE flags.

Create a squashfs image from your current portage tree.

mksquashfs /usr/portage /usr/portage.sqfs
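
To see how much the image actually saves, the image size can be compared with the summed file sizes of the tree. This is a sketch only: image_ratio is a made-up helper, and it compares against apparent file sizes, not the disk usage figure quoted in the preface, so the percentage will look less dramatic.

```shell
# Print the image size as a percentage of the summed file sizes in the tree.
# Uses GNU find (-printf) and GNU stat (-c), both standard on Gentoo.
image_ratio() {
    src=$1
    image=$2
    src_bytes=$(find "$src" -type f -printf '%s\n' |
        awk '{ sum += $1 } END { printf "%.0f\n", sum }')
    img_bytes=$(stat -c %s "$image")
    [ "$src_bytes" -gt 0 ] || { echo "no files in $src" >&2; return 1; }
    echo "$((img_bytes * 100 / src_bytes))%"
}

# Example call (paths as used in this article):
# image_ratio /usr/portage /usr/portage.sqfs
# Superblock details of the image, including the compressor used:
# unsquashfs -s /usr/portage.sqfs
```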

Now you can remove your portage tree. If you are unsure, you can also keep it and delete it later.

rm -r /usr/portage/*

Finally we need a directory to mount the squashfs image on. It serves as the lowerdir of the overlay and may live on any filesystem; only upperdir and workdir have to share one filesystem.

mkdir /usr/portage-ro


The real thing

Ok, everything is set up. Now let's put it to use. I wrote an init script (based on some script made for aufs, can't remember where I found it).

#!/sbin/openrc-run
# Copyright 1999-2016 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
# $Id$
# /etc/init.d/squash_portage

SQFS_CUR="/usr/portage.sqfs"
OVERLAY_DIR="/dev/shm/.portage-overlay"
RO_DIR="/usr/portage-ro"
PORT_DIR="/usr/portage"
UPPER_DIR="$OVERLAY_DIR/upper"
WORK_DIR="$OVERLAY_DIR/work"
SQFS_NEW="$OVERLAY_DIR/portage-new.sqfs"

depend() {
    need localmount
}

start() {
    ebegin "Mounting read-only portage tree squashfs image"
    [ -d $RO_DIR ] || mkdir -p $RO_DIR
    mount -rt squashfs -o loop,nodev,noexec $SQFS_CUR $RO_DIR
    retval=$?
    eend $retval
    [ $retval -ne 0 ] && return $retval

    ebegin "Mounting read-write portage tree with overlay"
    [ -d $UPPER_DIR ] || mkdir -p $UPPER_DIR
    [ -d $WORK_DIR ] || mkdir -p $WORK_DIR

    mount -t overlay -o lowerdir=$RO_DIR,upperdir=$UPPER_DIR,workdir=$WORK_DIR overlay $PORT_DIR
    eend $?
}

stop() {
    if [ -n "$(ls -A "$UPPER_DIR" | head -n1)" ]; then
        ebegin "Update portage tree squashfs"
#        mksquashfs $PORT_DIR $SQFS_NEW -comp xz -b 1M -Xdict-size 100% 2>/dev/null
        mksquashfs $PORT_DIR $SQFS_NEW 2>/dev/null
        eend $?
    fi
    ebegin "Unmounting portage tree"
    umount -t overlay $PORT_DIR
    umount -t squashfs $RO_DIR
    if [ -f $SQFS_NEW ]; then
        mv $SQFS_NEW $SQFS_CUR
    fi
    rm -rf $OVERLAY_DIR
    eend 0
}

So what happens here? All files/directories are defined as variables and should match the ones from the preparations section above.

start() gets called when the service is started, stop() gets called when the service is stopped (what magic).

First, the squashfs image of the portage tree is mounted at our read-only portage directory.

mount -rt squashfs -o loop,nodev,noexec $SQFS_CUR $RO_DIR

Then the directories for handling the overlayfs are created. They need to be created every time the system starts up, because /dev/shm is RAM based.

[ -d $UPPER_DIR ] || mkdir -p $UPPER_DIR
[ -d $WORK_DIR ] || mkdir -p $WORK_DIR

Finally the overlayfs is mounted at the actual portage directory.

mount -t overlay -o lowerdir=$RO_DIR,upperdir=$UPPER_DIR,workdir=$WORK_DIR overlay $PORT_DIR

Overlayfs needs 3 directories. A "lowerdir" which can be readonly (our squashfs mountpoint), an "upperdir" where write operations are stored (a dir in /dev/shm) and a "workdir" (also in /dev/shm).
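
To see what the overlay has recorded after a sync, you can simply look into the upperdir. A minimal sketch, with list_overlay_changes being a made-up helper: new or modified files show up as regular files, while deletions are stored by overlayfs as "whiteout" character devices.

```shell
# List what the overlay has recorded in its upper directory.
# Regular files are new or modified; character devices are the
# whiteouts overlayfs uses to mark deleted files.
list_overlay_changes() {
    upper=${1:-/dev/shm/.portage-overlay/upper}
    find "$upper" -mindepth 1 -type c -printf 'deleted  %P\n'
    find "$upper" -mindepth 1 -type f -printf 'changed  %P\n'
}

# Example call (path as defined in the init script):
# list_overlay_changes /dev/shm/.portage-overlay/upper
```

The size of that directory is also the amount of RAM the overlay currently occupies.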

That's it about having a writeable portage tree on top of a squashfs image.

But all changes are made in /dev/shm, so they only exist in RAM. They must be persisted if we don't want to sync the same changes every time the system is started. Have a look at the stop() function.

First, it checks whether there are any changes in our "upperdir".

if [ -n "$(ls -A "$UPPER_DIR" | head -n1)" ]; then

If there are any changes, a new squashfs image is created from the current portage tree.

mksquashfs $PORT_DIR $SQFS_NEW 2>/dev/null

This is how changes are persisted.

Overlayfs and squashfs image are unmounted.

umount -t overlay $PORT_DIR
umount -t squashfs $RO_DIR

If a new squashfs image exists, it replaces the current one.
Finally, the working directory in /dev/shm is deleted to free any RAM that was used by overlayfs.
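
Note that the script does not check whether mksquashfs succeeded, and moving a file from /dev/shm to /usr is really a copy across filesystems. A more careful replacement step could look like the sketch below. replace_image is a made-up helper and not part of the script above.

```shell
# Replace the current image only if the new one exists and is not empty.
# The new image is first copied next to the target, so the final rename
# happens within one filesystem and is atomic.
replace_image() {
    new=$1
    cur=$2
    [ -s "$new" ] || { echo "new image missing or empty, keeping $cur" >&2; return 1; }
    cp "$new" "$cur.tmp" && mv "$cur.tmp" "$cur" && rm -f "$new"
}

# Possible use in stop(), instead of the plain mv:
# replace_image "$SQFS_NEW" "$SQFS_CUR"
```

The price is that the filesystem holding the image needs room for two copies for a moment.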

Save the script as /etc/init.d/squash_portage, make it executable and start it.

chmod a+x /etc/init.d/squash_portage
service squash_portage start
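
To confirm that both mounts are in place, the mount table can be filtered for the two mount points. A small sketch, with show_portage_mounts being a made-up helper. The mounts file is a parameter only so the function can be tried on sample data.

```shell
# Print filesystem type and mount point for the two mounts the service creates.
show_portage_mounts() {
    mounts=${1:-/proc/mounts}
    awk '$2 == "/usr/portage-ro" || $2 == "/usr/portage" { print $3, $2 }' "$mounts"
}

# Example call:
# show_portage_mounts
```

On a working setup this should list a squashfs mount at /usr/portage-ro and an overlay mount at /usr/portage.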

If everything went fine, update your portage tree and restart the squash_portage service.

emerge --sync
service squash_portage restart

If there were any changes, the image should have been regenerated.

If you did not do so earlier, you can now remove your old portage tree. Also add the squash_portage service to the default runlevel.

service squash_portage stop
rm -r /usr/portage/*
rc-update add squash_portage default
service squash_portage start


Fine tuning

In the current setup mksquashfs uses gzip, its default compression algorithm. Gzip has an average compression ratio. I achieved the best compression with xz.

mksquashfs $PORT_DIR $SQFS_NEW -comp xz -b 1M -Xdict-size 100%

But it takes much more time.
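
If you want numbers for your own machine, a small comparison run could look like this. It is a sketch under assumptions: compare_compressors is a made-up helper, squashfs-tools must be built with xz support, and the output directory needs enough free space for a throwaway image.

```shell
# Build throwaway images with different compressors and print size and time.
compare_compressors() {
    src=$1
    out_dir=$2
    for comp in gzip xz; do
        img="$out_dir/test-$comp.sqfs"
        start=$(date +%s)
        mksquashfs "$src" "$img" -comp "$comp" -noappend >/dev/null || return 1
        end=$(date +%s)
        echo "$comp: $(stat -c %s "$img") bytes, $((end - start))s"
        rm -f "$img"
    done
}

# Example call (paths are placeholders):
# compare_compressors /usr/portage /var/tmp
```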

Also, the changes are only persisted when the service is stopped. Right now this only happens when the computer is shut down or rebooted.

Depending on your computer's uptime you might want to persist changes in between.
You could schedule a cron job that restarts the squash_portage service.
You can also persist the changes every time the portage tree is updated. To do so, place an executable script in the postsync hook folder.

#!/bin/bash
# /etc/portage/postsync.d/remount_squash_portage.sh

/etc/init.d/squash_portage restart
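
For the cron variant, it may make sense to restart the service only when the overlay has actually grown. A minimal sketch of such a guard follows. needs_persist is a made-up helper and the 100MB default limit is an arbitrary assumption.

```shell
# Succeed if the overlay's upper directory uses more than limit_kb kilobytes.
# Both defaults are assumptions for illustration.
needs_persist() {
    upper=${1:-/dev/shm/.portage-overlay/upper}
    limit_kb=${2:-102400}
    used_kb=$(du -sk "$upper" | cut -f1)
    [ "$used_kb" -gt "$limit_kb" ]
}

# Possible use from a cron script:
# needs_persist && /etc/init.d/squash_portage restart
```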


Conclusion

For my setup this solution proved very useful. I don't have to care about the number of files, and it also needs less space. I hope you find it useful too.

Btw. I store the kernel sources as a squashfs image too: 52k files/650MB vs 1 file/95MB. But without overlayfs; I build the kernel outside of the sources folder.


08/16/2016


Update

Replaced /sbin/runscript with /sbin/openrc-run.