Linux Mount Namespaces
I've been refreshing myself on the low-level guts of Linux container technology. Here are some notes on mount namespaces.
In the examples below, I use more than one root shell simultaneously. To disambiguate them, the examples feature a numbered shell prompt: 1# for the first shell and 2# for the second.
Preliminaries
Namespaces are normally associated with processes and are removed when the last associated process terminates. To make them persistent, you have to bind-mount the corresponding virtual file from an associated process's entry in /proc to another path [1].
The receiving path needs to have its "propagation" property set to "private". Most likely your system's existing mounts are mostly "shared". You can check the propagation setting for mounts with
1# findmnt -o+PROPAGATION
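The same information is exposed in the optional fields of /proc/self/mountinfo, which is handy on systems without findmnt. Here is a minimal sketch; the function name propagation_from_mountinfo is my own invention, and it simplifies by reporting only one label per mount (a mount that is both shared and a slave is reported as shared).

```shell
# propagation_from_mountinfo: read mountinfo-format lines on stdin and
# print "<mount point> <propagation>" for each of them.
# Hypothetical helper; simplified to a single label per mount.
propagation_from_mountinfo() {
    awk '{
        prop = "private"    # no optional fields at all means private
        # Optional fields start at field 7 and end at the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++) {
            if ($i ~ /^shared:/)
                prop = "shared"
            else if ($i ~ /^master:/ && prop != "shared")
                prop = "slave"
            else if ($i == "unbindable")
                prop = "unbindable"
        }
        print $5, prop
    }'
}
```

Typical use would be `propagation_from_mountinfo < /proc/self/mountinfo`.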
We'll create a new directory to hold the mount namespaces we create, and set its propagation to private via a bind-mount of itself to itself.
1# mkdir /root/mntns
1# mount --bind --make-private /root/mntns /root/mntns
The namespace itself needs to be bind-mounted over a file rather than a directory, so we'll create one.
1# touch /root/mntns/1
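Running the bind-mount a second time stacks another mount on top of the first, so if you script these steps it may be worth making them idempotent. The following is only a sketch: run and prepare_ns_slot are hypothetical helper names, the DRY_RUN variable is my own convention for previewing the commands, and the mount-point check assumes paths without spaces (which mountinfo would escape).

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
# Hypothetical helper for illustration.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prepare_ns_slot DIR NAME: make DIR a private bind mount of itself and
# create the empty file DIR/NAME for a namespace to be bound onto.
prepare_ns_slot() {
    run mkdir -p "$1" || return
    # Only bind-mount if DIR is not already listed as a mount point.
    if ! awk -v d="$1" '$5 == d { found = 1 } END { exit !found }' \
            /proc/self/mountinfo 2>/dev/null; then
        run mount --bind --make-private "$1" "$1" || return
    fi
    run touch "$1/$2"
}
```

For example, `DRY_RUN=1 prepare_ns_slot /root/mntns 1` prints the commands it would run, and `prepare_ns_slot /root/mntns 1` (as root) runs them.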
Creating and persisting a new mount namespace
1# unshare --mount=/root/mntns/1
We are now 'inside' the new namespace in a new shell process. We'll change the shell prompt to make this clearer
PS1='inside# '
We can make a filesystem change, such as mounting a tmpfs
inside# mount -t tmpfs /mnt /mnt
inside# touch /mnt/hi-there
And observe it is not visible outside that namespace
2# findmnt /mnt
2# stat /mnt/hi-there
stat: cannot statx '/mnt/hi-there': No such file or directory
Back in the namespace shell, we can find an integer identifier for the namespace via the shell process's /proc entry:
inside# readlink /proc/$$/ns/mnt
It will be something like mnt:[4026533646].
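If you want just the number (for example, to match it against the NS column of lsns), the wrapper can be stripped off with sed. A small sketch follows; ns_id and same_mntns are names I made up for illustration.

```shell
# ns_id LINK: print the number inside a "type:[number]" namespace link
# target, or nothing if LINK does not have that shape.
ns_id() {
    printf '%s\n' "$1" | sed -n 's/^[a-z_]*:\[\([0-9][0-9]*\)\]$/\1/p'
}

# same_mntns PID_A PID_B: succeed if both processes are in the same
# mount namespace, by comparing their /proc/PID/ns/mnt link targets.
same_mntns() {
    a=$(readlink "/proc/$1/ns/mnt") &&
    b=$(readlink "/proc/$2/ns/mnt") &&
    [ "$a" = "$b" ]
}
```

For example, `ns_id "$(readlink /proc/$$/ns/mnt)"` prints the identifier of the current shell's mount namespace.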
From another shell, we can list namespaces and see that it exists:
2# lsns -t mnt
        NS TYPE NPROCS   PID USER COMMAND
…
4026533646 mnt       1 52525 root -bash
If we exit the shell that unshare created,
inside# exit
running lsns again should [2] still list the namespace, albeit with the NPROCS column now reading 0.
2# lsns -t mnt
We can see that a virtual filesystem of type nsfs is mounted at the path we selected when we ran unshare:
2# grep /root/mntns/1 /proc/mounts
nsfs /root/mntns/1 nsfs rw 0 0
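To find every persisted namespace rather than one known path, you can filter on the filesystem type instead. A minimal sketch; list_nsfs_mounts is a hypothetical name, and it assumes the /proc/mounts field order shown above (device, mount point, type, ...).

```shell
# list_nsfs_mounts: read /proc/mounts-format lines on stdin and print
# the mount point of every entry whose filesystem type is nsfs.
list_nsfs_mounts() {
    awk '$3 == "nsfs" { print $2 }'
}
```

Use it as `list_nsfs_mounts < /proc/mounts`. When a persisted namespace is no longer needed, unmounting its path (here, `umount /root/mntns/1`) removes the bind mount that keeps it alive.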
Entering the namespace from another process
This is relatively easy:
1# nsenter --mount=/root/mntns/1
1# stat /mnt/hi-there
File: /mnt/hi-there
…
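nsenter can also run a single command inside the namespace instead of starting an interactive shell. Here is a sketch of a small wrapper; in_mntns is a hypothetical name, and the existence check is just a convenience so a mistyped path fails with a clearer message.

```shell
# in_mntns NSFILE CMD [ARGS...]: run CMD inside the mount namespace
# persisted at NSFILE (needs root, like nsenter itself).
in_mntns() {
    nsfile=$1
    shift
    if [ ! -e "$nsfile" ]; then
        echo "in_mntns: $nsfile does not exist" >&2
        return 1
    fi
    # "--" ends nsenter's own options; the rest is the command to run.
    nsenter --mount="$nsfile" -- "$@"
}
```

For example, `in_mntns /root/mntns/1 stat /mnt/hi-there` should find the file created earlier, without leaving you in a new shell.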
More to come in future blog posts!
[1] This feels really weird to me, at least at first. I suppose it fits with the "everything is a file" philosophy.
[2] I've found that lsns in util-linux 2.38.1 (from 2022-08-04) doesn't list mount namespaces with no associated processes, but 2.41 (from 2025-03-18) does. The fix landed on 2022-11-08. For extra fun, I notice that a namespace can be held persistent with a file descriptor which is unlinked from the filesystem.