# cat /dev/two



SSSD and Kerberos Replay Cache

Recently I was contacted about a problem where sssd was "randomly" failing to continue functioning on a rebuilt server. Unreliably reproducible bugs are of course the most annoying to troubleshoot; fortunately, this one was not actually random. A quick peek at the debug and audit logs showed sssd getting hung up trying to work with a file named ${hostname}-[0-9]+_0 in /var/tmp which was labeled "sshd_tmp_t". Ah, this old problem. It seems to come up pretty frequently when using a particular kerberos server.

The machine in question was set up to resolve users and groups via sssd, which was configured to use kerberos for authentication ( auth_provider = krb5 ) and ldap for identification/authorization ( id_provider = ldap ), the latter of which was binding to its server via GSSAPI ( ldap_sasl_mech = GSSAPI ). Like any good kerberos client, sssd keeps a kerberos replay cache to protect against certain types of attacks, or at least it does when the non-default option of "krb5_validate" is set to "true" (you are setting it to "true", right?). As one would expect, sshd maintains a replay cache too.
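
For reference, the setup described above corresponds roughly to a domain section like the one below. This is only a sketch: the domain name "example.com" is a placeholder, and a working configuration would also need server and search base settings that are not covered here.

```ini
# /etc/sssd/sssd.conf (fragment); the domain name is a hypothetical placeholder
[domain/example.com]
id_provider = ldap
auth_provider = krb5
ldap_sasl_mech = GSSAPI
krb5_validate = true
```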

When creating the kerberos replay cache, unless overridden, the kerberos libraries decide on the default file name by the identifier of the first viable service key in the selected keytab. That means, if your hostname is "erwin.example.com" and your system keytab happens to look like:


then your default replay cache file will be:


However, if your keytab happens to look like:


then your default replay cache file will be something like:


At this point you might reasonably ask, who cares what the replay cache is named? That is a very reasonable question. In this particular case, SELinux cares, a lot. One of SELinux's primary enforcement models is centered around the label on a file, and one of the methods for ensuring a file is correctly labeled upon creation is its full path. Absent path based recommendations (subject to policy approval), newly created files are labeled based on the parent process's label and the label of the directory where the file is getting created. (That is simplified a little bit, but it is sufficient for this explanation.)

If there were no path based rules in the standard RHEL/Fedora SELinux policy, when sssd wrote the replay cache into /var/tmp, the file would be labeled user_tmp_t. When sshd wrote the replay cache into /var/tmp, the file would be labeled sshd_tmp_t. As one would expect, sssd is not permitted to work with files labeled sshd_tmp_t, those are exclusive to sshd. So, if the file sssd thought was supposed to be the host's kerberos replay cache happened to be labeled sshd_tmp_t, sssd would be unable to manipulate the replay cache and would consequently fail secure (aka, stop functioning). The "random" failure experienced was sssd stopping after any user had authenticated to sshd (thereby causing sshd to write out a replay cache with the "wrong" label), effectively denying sssd access to the replay cache.

To address this, a path based rule is included in SELinux that says

/var/tmp/host_0 --  system_u:object_r:krb5_host_rcache_t:s0

By default, /var/tmp is a world writable space; a file by just about any name, created by any user, can exist there. That means any SELinux path based rules for /var/tmp need to be quite specific. One cannot simply say that any file in that directory should be labeled krb5_host_rcache_t. Since the normal keytab layout results in a file name of "host_0", this is nearly always sufficient. However, as we have seen, the solution completely relies on the order inside the keytab being such that the file generated by the kerberos libraries is named "host_0".
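
Since everything hinges on which entry comes first in the keytab, it can be handy to check that mechanically. The following is a rough sketch, not a definitive tool: the function names are made up for this example, and it assumes the MIT "klist -k" layout (a rule line of dashes followed by "KVNO Principal" rows).

```shell
# Print the first principal listed in `klist -k` style output read from stdin.
# Assumes the MIT layout: a "---- ----" rule line, then "KVNO Principal" rows.
first_keytab_principal() {
    awk 'in_body && NF >= 2 { print $2; exit }
         /^----/            { in_body = 1 }'
}

# Succeed only when that first principal is a "host/" service principal.
keytab_host_first() {
    case "$(first_keytab_principal)" in
        host/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Hypothetical usage against the system keytab:
#   klist -k /etc/krb5.keytab | keytab_host_first || echo "host/ entry is not first"
```

This only reports the ordering. It does not try to predict the replay cache file name.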

That covers why the problem was occurring, on to how to fix it. There are a number of possible resolutions, here are two of them:

  1. Reorder the keytab so the "host/" entry is first
  2. Configure sssd to use a different rcache directory

Reorder the keytab so the "host/" entry is first

This is relatively simple for a single host. ktutil from the krb5-workstation package has the ability to manipulate keytabs. Read the original keytab in twice, delete the extraneous entries to reverse the order, and write the corrected version out to a new file.

$ ktutil
ktutil:  rkt /tmp/bad.keytab
ktutil:  list
slot KVNO Principal
---- ---- ----------------------------------
   1    2                 erwin$@EXAMPLE.COM
   2    2 host/erwin.example.com@EXAMPLE.COM

ktutil:  rkt /tmp/bad.keytab
ktutil:  list
slot KVNO Principal
---- ---- ----------------------------------
   1    2                 erwin$@EXAMPLE.COM
   2    2 host/erwin.example.com@EXAMPLE.COM
   3    2                 erwin$@EXAMPLE.COM
   4    2 host/erwin.example.com@EXAMPLE.COM

ktutil:  delent 1
ktutil:  list
slot KVNO Principal
---- ---- ----------------------------------
   1    2 host/erwin.example.com@EXAMPLE.COM
   2    2                 erwin$@EXAMPLE.COM
   3    2 host/erwin.example.com@EXAMPLE.COM

ktutil:  delent 3
ktutil:  list
slot KVNO Principal
---- ---- ----------------------------------
   1    2 host/erwin.example.com@EXAMPLE.COM
   2    2                 erwin$@EXAMPLE.COM

ktutil:  wkt /tmp/fixed.keytab

Replace the host keytab with the re-ordered host keytab, restart the associated daemons, and all is well. The kerberos libraries will now generate a replay cache name of "host_0" and the default SELinux path rules will cover everything.
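
For keytabs with more than two entries, the same read-twice-and-delete trick generalizes. The sketch below only prints the ktutil commands; it does not run them. The function name and its arguments are hypothetical, and it assumes (as the listing above suggests) that ktutil renumbers the slots after each delent.

```shell
# Print ktutil commands that rotate the entry in slot $4 to the front of a
# keytab holding $3 entries, reading from $1 and writing to $2.
ktutil_rotate_commands() {
    src=$1 dst=$2 total=$3 host_slot=$4
    # Read the keytab twice so every entry appears in two places.
    printf 'rkt %s\nrkt %s\n' "$src" "$src"
    # Drop everything ahead of the first copy of the wanted entry.
    i=1
    while [ "$i" -lt "$host_slot" ]; do
        printf 'delent 1\n'
        i=$((i + 1))
    done
    # Trim the tail so that exactly $total entries remain.
    i=$host_slot
    while [ "$i" -le "$total" ]; do
        printf 'delent %s\n' "$((total + 1))"
        i=$((i + 1))
    done
    printf 'wkt %s\nquit\n' "$dst"
}

# Hypothetical usage, reproducing the two-entry session shown above:
#   ktutil_rotate_commands /tmp/bad.keytab /tmp/fixed.keytab 2 2
```

It is worth reviewing the generated commands, and listing the resulting keytab, before replacing anything in production.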

Configure sssd to use a different rcache directory

While rearranging a keytab is the traditional solution, depending on your environment, it may not scale well. This is especially true if your kerberos realm happens to be backed by Active Directory. (Note to self, get around to writing a post on decent ways to join RHEL into an AD domain without samba, it comes up frequently enough)

Fortunately, as of RHEL6.2 (scroll to BZ#732974) sssd provides a workaround. The bug is unfortunately private, but it points to a patch in sssd >= 1.7 which was also backported to sssd 1.5 and sssd 1.6.

The patch adds an option, "krb5_rcache_dir", which can be used to specify the directory for storage of replay caches and then sets the new default to be "%{_localstatedir}/cache/krb5rcache" (aka "/var/cache/krb5rcache/").
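
If the packaged default does not suit, the option can be set explicitly in the domain section. A minimal sketch, again with a placeholder domain name; the value shown is simply the default described above:

```ini
# /etc/sssd/sssd.conf (fragment); the domain name is a hypothetical placeholder
[domain/example.com]
krb5_rcache_dir = /var/cache/krb5rcache
```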

Additionally, as of selinux-policy-3.7.19-105.el6, a supplemental path labeling rule was added:

/var/cache/krb5rcache(/.*)? system_u:object_r:krb5_host_rcache_t:s0

That means, if you are using sssd >= 1.7 or the patched 1.5/1.6 along with selinux-policy >= 3.7.19-105, this problem is already solved for you. The updated sssd writes out its replay cache into "/var/cache/krb5rcache/", and any file in that folder is labeled "krb5_host_rcache_t". It no longer matters what the kerberos libraries happen to name the file.
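
To see why the directory-wide rule is more forgiving than the single-file rule, here is a rough approximation of file context matching. It is only a sketch: the real matching is done by libselinux with its own regex engine and rule ordering, and the function name and the sample file name are invented for this example. The one thing it does mirror is that file context patterns have to match the whole path.

```shell
# Succeed when path $2 matches file-context pattern $1 in its entirety.
# grep -x forces a whole-line match, approximating the implicit anchoring.
fc_matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

# Hypothetical checks:
#   fc_matches '/var/tmp/host_0' '/var/tmp/erwin-1234_0'                          # fails
#   fc_matches '/var/cache/krb5rcache(/.*)?' '/var/cache/krb5rcache/erwin-1234_0' # succeeds
```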


More "An SELinux constraint violation"

The example used in addressing an SELinux constraint violation touched on giving mount_t the attributes 'mcsreadall' and 'mcswriteall' by way of incorporating calls to the interfaces mcs_file_read_all() and mcs_file_write_all() with the argument 'mount_t'. In part four of that example, one of the most important parts of changing SELinux policy, determining if the modification is acceptable, was glossed over. That resulted in a number of questions, so I will try to address it here.

The interfaces mcs_file_read_all() and mcs_file_write_all() allow the type provided as an argument to bypass the constraints which make categories restrictive. This means, if a type, let's say user_t, was granted mcs_file_read_all(), a user confined by user_t would be able to read files labeled with any category (provided they otherwise had access), and not just those categories to which they had been granted access.

It would be the functional equivalent of granting every instance of user_t SystemLow-SystemHigh (c0.c1023). That would obviously be less than desirable. If you had taken the time to label sensitive files into categories like Finance and Mind_Control_Research, the intent was clearly to protect those files from being read by users not explicitly granted access to those categories.
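
As a toy model of what the constraint checks, consider the sketch below. It is heavily simplified and is not how the kernel evaluates constraints: categories are plain comma separated lists with no ranges, sensitivities are ignored, and the function name is made up. It only illustrates the "process categories must cover the file categories, unless the type carries the override attribute" idea.

```shell
# mcs_can_read PROCESS_CATS FILE_CATS [ATTRIBUTE]
# Succeeds when every file category is also held by the process, or when the
# process type carries the mcsreadall attribute (passed as the third argument).
mcs_can_read() {
    [ "${3:-}" = "mcsreadall" ] && return 0
    for c in $(printf '%s' "$2" | tr ',' ' '); do
        case ",$1," in
            *",$c,"*) ;;          # the process holds this category
            *) return 1 ;;        # missing category: constraint violation
        esac
    done
    return 0
}

# Hypothetical checks:
#   mcs_can_read "c1,c2" "c2"        # allowed
#   mcs_can_read ""      "c2"        # denied
#   mcs_can_read ""      "c2" mcsreadall   # allowed, the override bypasses the check
```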

So, clearly we don't want to be assigning the attributes 'mcsreadall' and 'mcswriteall' to any random type, the grant needs to be carefully considered. In this case, that means, is it appropriate to grant anything labeled 'mount_t' the attributes 'mcsreadall' and 'mcswriteall'?

Ok, well, what is allowed to run as 'mount_t'? Let's start with what can transition to the type. It turns out, a few things:

$ sesearch -C -T | grep "process mount_t" | sort | uniq
   type_transition automount_t fusermount_exec_t : process mount_t; 
   type_transition automount_t mount_exec_t : process mount_t; 
   type_transition bootloader_t fusermount_exec_t : process mount_t; 
   type_transition bootloader_t mount_exec_t : process mount_t; 
   type_transition crond_t mount_exec_t : process mount_t; 
   type_transition devicekit_disk_t fusermount_exec_t : process mount_t; 
   type_transition devicekit_disk_t mount_exec_t : process mount_t; 
   type_transition hald_t fusermount_exec_t : process mount_t; 
   type_transition hald_t mount_exec_t : process mount_t; 
   type_transition hotplug_t fusermount_exec_t : process mount_t; 
   type_transition hotplug_t mount_exec_t : process mount_t; 
   type_transition initrc_t mount_exec_t : process mount_t; 
   type_transition insmod_t fusermount_exec_t : process mount_t; 
   type_transition insmod_t mount_exec_t : process mount_t; 
   type_transition livecd_t fusermount_exec_t : process mount_t; 
   type_transition livecd_t mount_exec_t : process mount_t; 
   type_transition local_login_t fusermount_exec_t : process mount_t; 
   type_transition local_login_t mount_exec_t : process mount_t; 
   type_transition puppet_t fusermount_exec_t : process mount_t; 
   type_transition puppet_t mount_exec_t : process mount_t; 
   type_transition remote_login_t fusermount_exec_t : process mount_t; 
   type_transition remote_login_t mount_exec_t : process mount_t; 
   type_transition rgmanager_t fusermount_exec_t : process mount_t; 
   type_transition rgmanager_t mount_exec_t : process mount_t; 
   type_transition ricci_modcluster_t fusermount_exec_t : process mount_t; 
   type_transition ricci_modcluster_t mount_exec_t : process mount_t; 
   type_transition ricci_modstorage_t fusermount_exec_t : process mount_t; 
   type_transition ricci_modstorage_t mount_exec_t : process mount_t; 
   type_transition rlogind_t fusermount_exec_t : process mount_t; 
   type_transition rlogind_t mount_exec_t : process mount_t; 
   type_transition rshd_t fusermount_exec_t : process mount_t; 
   type_transition rshd_t mount_exec_t : process mount_t; 
   type_transition sosreport_t fusermount_exec_t : process mount_t; 
   type_transition sosreport_t mount_exec_t : process mount_t; 
   type_transition sshd_t fusermount_exec_t : process mount_t; 
   type_transition sshd_t mount_exec_t : process mount_t; 
   type_transition staff_t fusermount_exec_t : process mount_t; 
   type_transition sysadm_t fusermount_exec_t : process mount_t; 
   type_transition sysadm_t mount_exec_t : process mount_t; 
   type_transition system_cronjob_t mount_exec_t : process mount_t; 
   type_transition udev_t fusermount_exec_t : process mount_t; 
   type_transition udev_t mount_exec_t : process mount_t; 
   type_transition user_t fusermount_exec_t : process mount_t; 
   type_transition xdm_t fusermount_exec_t : process mount_t; 
   type_transition xdm_t mount_exec_t : process mount_t; 
   type_transition xend_t fusermount_exec_t : process mount_t; 
   type_transition xend_t mount_exec_t : process mount_t; 

Each one of those lines indicates the second column, for instance, 'user_t', can transition to 'mount_t', by executing a binary labeled by the third column. In the case of 'user_t', it can transition to 'mount_t' by calling any binary labeled 'fusermount_exec_t'.

Alright then, there are quite a few source types, so let's first focus on the transition points. There seem to be a lot fewer of those.

$ sesearch -C -T | grep "process mount_t" | awk '{ print $3 }' | sort -u
fusermount_exec_t
mount_exec_t

Two, that is manageable. Unfortunately it just tells us that the binaries in question are going to be labeled 'fusermount_exec_t' and 'mount_exec_t', not which binaries. Of course based on the name, you can probably guess which binaries are going to be labeled with those two types, but let's check anyway.

$ egrep -h ':(fuser)?mount_exec_t' /etc/selinux/targeted/contexts/files/*
/bin/mount.*    --  system_u:object_r:mount_exec_t:s0
/bin/umount.*   --  system_u:object_r:mount_exec_t:s0
/sbin/mount.*   --  system_u:object_r:mount_exec_t:s0
/sbin/umount.*  --  system_u:object_r:mount_exec_t:s0
/bin/fusermount --  system_u:object_r:fusermount_exec_t:s0
/usr/bin/fusermount --  system_u:object_r:fusermount_exec_t:s0

Raise your hand if you didn't see that coming. Right, moving on. We are still trying to get an idea of how much we might regret granting 'mount_t' the ability to bypass all category security, and are therefore trying to identify what paths exist from a malicious attacker to the type 'mount_t'.

Right now we know that a bad guy running as any of the following could transition to 'mount_t' by way of calling binaries labeled 'mount_exec_t' and 'fusermount_exec_t':


The notable exceptions here are that 'staff_t' and 'user_t' can only use 'fusermount_exec_t', not 'mount_exec_t', and that 'crond_t' can only use 'mount_exec_t'.
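
The per-source breakdown can be pulled out of the sesearch listing mechanically. A small sketch, assuming input in the same format as the type_transition output shown earlier (the function names are made up for this example):

```shell
# Read type_transition lines on stdin and print, for each source type,
# the entry point types it can use to reach mount_t.
entrypoints_by_source() {
    awk '$1 == "type_transition" && $6 == "mount_t;" {
             if ($2 in seen) seen[$2] = seen[$2] " " $3
             else            seen[$2] = $3
         }
         END { for (s in seen) print s ": " seen[s] }' | sort
}

# Keep only the source types that have a single entry point.
single_entrypoint_sources() {
    entrypoints_by_source | awk 'NF == 2'
}

# Hypothetical usage:
#   sesearch -C -T | grep "process mount_t" | single_entrypoint_sources
```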

Of course, beyond just calling something labeled with these exec types, there would have to be a flaw in the labeled binary that the malicious individual could exploit. Just to show how this would work, let's make an example.

As root, fake a buggy file labeled 'fusermount_exec_t'.

$ cat << EOF > /scratch/buggymount
#!/bin/sh
id -Z
cat /scratch/tmp/file
EOF
$ chmod 755 /scratch/buggymount
$ chcon -u system_u -r object_r -t fusermount_exec_t /scratch/buggymount

Then create a classified file

$ echo "classified" > /scratch/tmp/file
$ chcat +c2 /scratch/tmp/file
$ ls -lZ /scratch/tmp/file | cut -d\  -f 4-
staff_u:object_r:user_tmp_t:s0:c2 /scratch/tmp/file

Trying to read it via an unprivileged shell generates an AVC that is a constraint violation, as does trying to use our fake buggy mount binary.

$ runcon -l s0 /bin/zsh
$ id -Z
$ cat /scratch/tmp/file
cat: /scratch/tmp/file: Permission denied
$ /scratch/buggymount
cat: /scratch/tmp/file: Permission denied

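However, if the patch from earlier has been applied, the direct read is still denied, but the fake buggy mount binary, now running as 'mount_t', is no longer stopped by the constraint: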

$ runcon -l s0 /bin/zsh
$ id -Z
$ cat /scratch/tmp/file
cat: /scratch/tmp/file: Permission denied
$ /scratch/buggymount

Do you care? Well, it depends on how paranoid you are (or what your business requirements dictate). As covered earlier, the automount problem which originally exposed this constraint violation can be fixed with a much smaller permission grant. There is likely no need to give mount_t an MCS override for anything more than 'search' on directories. However, there does not currently exist a way to do that without creating a different attribute, splitting up the constraint definitions, and adding an interface. Only then could we give 'mount_t' just the permissions it actually requires.

Such a change would be pretty invasive, so, as many things in computer security do, it boils down to risk management. In a perfect world, permissions should only be as wide as they absolutely need to be. Of course in a perfect world, code wouldn't have bugs, so there would be no way to trick mount into doing something malicious on your behalf, and we would have unlimited resources to develop precisely defined minimal permissions.

Realistically, it is a trade off. Granting 'mount_t' the attribute 'mcsreadall' is a much better solution than turning off selinux, and the hole it opens is not large. The number of binaries that run as 'mount_t' is similarly small, and most have been around for many years and are heavily audited. That does not mean a bug does not exist or won't be introduced, but it does not exactly have the same security footprint as a web browser. Of course, /bin/fusermount is generally installed suid, which is always a good reason to step up the paranoia.

That is great and all, but what did the experts decide was best in this case? They granted the attributes 'mcsreadall' and 'mcswriteall' to 'mount_t' by way of the interfaces 'mcs_file_read_all()' and 'mcs_file_write_all()'.


An SELinux constraint violation (p 5 of 5)

This is part 5 of a 5 part post on identifying and fixing an SELinux constraint violation. If you have not already read the previous parts, you may want to start at the beginning.

Giving mount_t the mcsreadall attribute.

Before we get too far down the road, let's build a quick module to test that this attribute actually solves the problem.

# create the TE file assigning the attribute
$ cat << EOF > testmcsautofsnfs4.te
module testmcsautofsnfs4 1.0;
require {
    type mount_t;
    attribute mcsreadall;
}
typeattribute mount_t mcsreadall;
EOF

# generate a binary policy module
$ checkmodule -M -m testmcsautofsnfs4.te -o testmcsautofsnfs4.mod

# create a policy module package
$ semodule_package -o testmcsautofsnfs4.pp -m testmcsautofsnfs4.mod

# install the policy module
$ semodule -i testmcsautofsnfs4.pp

After the module is loaded, autofs mounts to the troublesome filesystem work once again, excellent.

Now remove this module to get back to a vanilla policy. We should be using interfaces rather than directly granting attributes.

$ semodule -r testmcsautofsnfs4

The source is already unpacked, and since what we are searching for is relatively unique, it won't take long to find via grep.

$ grep -R mcsreadall modules/
    modules/kernel/mcs.if:      attribute mcsreadall;
    modules/kernel/mcs.if:  typeattribute $1 mcsreadall;
    modules/kernel/mcs.te:attribute mcsreadall;

Sure enough, the interface comes right up.

## <summary>
##      This domain is allowed to read files and directories
##      regardless of their MCS category set.
## </summary>
## <param name="domain">
##      <summary>
##      Domain target for user exemption.
##      </summary>
## </param>
## <rolecap/>
interface(`mcs_file_read_all',`
        gen_require(`
                attribute mcsreadall;
        ')

        typeattribute $1 mcsreadall;
')

If the policy source was not handy, this information can also be quickly pulled from the selinux policy documentation that is part of the "selinux-policy-doc" rpm. That package contains "/usr/share/selinux/devel/policyhelp" which is a quick way to pull up the documentation in your web browser. Use the left hand panel to navigate to kernel -> mcs and mcs_file_read_all is sitting right there.

Ok, so we have the interface, how do we use it? First, update the .te file:

$ cat << 'EOF' > testmcsautofsnfs4.te
policy_module(testmcsautofsnfs4, 1.0)

gen_require(`
        type mount_t;
')

mcs_file_read_all(mount_t)
EOF


Unfortunately checkmodule does not support the macros necessary to use interfaces, so we have to call make directly.

$ make NAME=targeted -f /usr/share/selinux/devel/Makefile

This will generate a .pp, just as checkmodule + semodule_package had. Now to load it:

$ semodule -i testmcsautofsnfs4.pp

Test the mount again, everything remains working, excellent.

Pushing the change out to multiple systems

A binary policy package file is not terribly easy to centrally manage. What if you need the modification on a lot of systems while waiting for upstream to patch the base policy? Simple, make the module into an rpm to add to your site's internal repository.

The selinux packaging draft covers most of the details; here is a spec file template:

%define modulename examplesiteautofsnfs4
%define selinux_variants mls strict targeted
%define selinux_policyver %(sed -e 's,.*selinux-policy-\\([^/]*\\)/.*,\\1,' /usr/share/selinux/devel/policyhelp)

Summary: Example-Site selinux policy tweak, allow autofs to mount NFS with secontext
Name: examplesiteconf-selinux-autofsnfs4
Version: 1.0
Release: 1%{?dist}
License: GPLv2
Group: System Environment/Base
URL: http://example.org
BuildArch:      noarch
Source0:        %{modulename}.if
Source1:        %{modulename}.te
Source2:        %{modulename}.fc
Source3:        README
BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root
BuildRequires:  checkpolicy, selinux-policy-devel, hardlink, selinux-policy-doc
%if "%{selinux_policyver}" != ""
Requires:       selinux-policy >= %{selinux_policyver}
%endif

%description
Provides Example-Site selinux configuration which allows mount_t
to read all mcs ranges. This is needed so that autofs can
successfully mount nfs4 with mount point category labeling.
For more information, see Red Hat case 00000000.

%prep
%{__mkdir} -p SELinux
cp -p %{SOURCE0} %{SOURCE1} %{SOURCE2} SELinux
cp %{SOURCE3} .

%build
cd SELinux
for selinuxvariant in %{selinux_variants}
do
    make NAME=${selinuxvariant} -f /usr/share/selinux/devel/Makefile
    mv %{modulename}.pp %{modulename}.pp.${selinuxvariant}
    make NAME=${selinuxvariant} -f /usr/share/selinux/devel/Makefile clean
done
cd -

%install
# Install SELinux policy modules
cd SELinux
for selinuxvariant in %{selinux_variants}
do
    install -d %{buildroot}%{_datadir}/selinux/${selinuxvariant}
    install -p -m 644 %{modulename}.pp.${selinuxvariant} \
        %{buildroot}%{_datadir}/selinux/${selinuxvariant}/%{modulename}.pp
done
cd -
rm -rf %{buildroot}/SELinux

# Hardlink identical policy module packages together
/usr/sbin/hardlink -cv %{buildroot}%{_datadir}/selinux

%clean
%{__rm} -rf %{buildroot}

%post
# Install SELinux policy modules
for selinuxvariant in %{selinux_variants}
do
  /usr/sbin/semodule -s ${selinuxvariant} -i \
    %{_datadir}/selinux/${selinuxvariant}/%{modulename}.pp &> /dev/null || :
done

%postun
# Clean up after package removal
if [ $1 -eq 0 ]; then
  # Remove SELinux policy modules
  for selinuxvariant in %{selinux_variants}
  do
    /usr/sbin/semodule -s ${selinuxvariant} -r %{modulename} &> /dev/null || :
  done
fi

%files
%defattr(-, root, root, 0755)
%doc README
%{_datadir}/selinux/*/%{modulename}.pp

%changelog
* Tue Aug 07 2012 Example-Site  1.0-1%{?dist}
- initial build
