Software not working? Disable SELinux.

So, this is a break from my normal philosophical theme to talk about a real experience I had this week.

Basically, I was trying out some software that I thought might be helpful for the software development I do at work; I won’t say what it was or who makes it. For those who don’t know, I work on SELinux libraries and the policy toolchain.

That said, I downloaded the trial version of the software and got to work. The first time I ran it I got a very obscure error that had to do with how the app was being run (after installing it according to the included instructions). After figuring out the “correct” way to run it, I started it up, only to get another (more obscure) error. I emailed the support address found on their page, including strace output and a description of what happened with both issues.

I got a reply from someone I originally took for a low-level support staffer who had been instructed to say this, but who is actually a cofounder of the company. The email read:

Hi Joshua,

I think you may have selinux enabled (FC5 has this by default).

If /etc/selinux/config has the line:

SELINUX=enforcing

then you need to change it to:

SELINUX=disabled

(or permissive should work, although I’ve not tried it).

Unfortunately, you need to reboot for the changes to take effect.

Hope this works for you (if not, please don’t hesitate to ask).

I was a little shocked. I had heard of people being told this by vendors, but it had never happened to me until now. My reply, maybe a little terse, said that I couldn’t really disable SELinux and still use the software to work on SELinux libraries and programs. I got a reply from the other founder of the company, complete with source code snippets, descriptions of what should be happening, and speculation that SELinux was somehow causing the problems. At least he said he was willing to learn about SELinux, since I claimed that, with my development machine in permissive mode, SELinux could not be causing the problem.

The first bug was pretty obvious to me from the code. I pointed it out in a reply and included a test program that reproduces the problem with or without SELinux. That didn’t solve the more serious problem though, the one I had been unable to work around.

The second bug was a bit more concerning. The source code snippet showed the same variable being passed to open() and, if open() failed, printed in the error message. The error message printed fine, but strace showed that the open() call had garbage in it. While the cofounder I was talking to suspected SELinux was somehow mangling the open() call, I assured him that SELinux never mangles syscalls; it only returns permission errors when something is denied. I was now on a mission to find out the real cause.
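To make the "permission errors only" point concrete, here is a minimal sketch in Python (not the test program I sent, and not the vendor's code). The function name classify_open_failure is made up for illustration. It assumes that an SELinux denial typically shows up to the caller as an ordinary EACCES failure, the same errno that plain file permissions produce, so any other failure points somewhere else.

```python
import errno
import os


def classify_open_failure(path):
    """Try to open path read-only and describe how the call failed, if it did.

    Hypothetical helper for illustration only.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        if exc.errno == errno.EACCES:
            # A denial looks like this to the program. It could come from
            # ordinary file permissions or from SELinux, so check the logs.
            return "EACCES: permission denied, check permissions and the audit log"
        # Anything else (ENOENT, ENAMETOOLONG, ...) is not a permission error.
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return "%s: not a permission error" % name
    os.close(fd)
    return "ok"
```

Calling it on a path that does not exist reports ENOENT rather than EACCES, which is the kind of result that should steer the investigation away from any access control system.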

During this discussion I mentioned some details on how SELinux works, and in particular how it is implemented in Fedora Core 5 (the distro I had the issues on). On Fedora Core 5, SELinux uses the targeted policy by default. The targeted policy confines only part of the system, specifically high-risk, network-facing daemons that generally run with lots of unnecessary access (think Apache, sendmail, cups, etc.). The vast majority of the time a user will not be affected by SELinux running.

Briefly, one minor exception is that Fedora Core 5 made unconfined_t (the unconfined domain that users and other unconfined apps run in) unable, by default, to execute the stack or anonymous memory, or to load libraries with text relocations, which is very good. Ulrich Drepper has more information about those permissions. Some 3rd party software is affected by these permissions, for one reason or another. This is not hard to fix without disabling SELinux altogether, but some vendors choose to put you at risk rather than do 5 minutes of research.

Back to the issue at hand: the software was not only running in unconfined_t, but the system itself was running in permissive mode (unfortunately this development VM has to be in permissive mode because I often reload policies that invalidate all the contexts on the system), and therefore it was very unlikely that SELinux was interfering with the software.
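A quick sanity check before blaming SELinux is to see what mode the machine is actually configured for. Here is a rough Python sketch that pulls the SELINUX= setting out of the contents of /etc/selinux/config, the file quoted in the email above. The function name configured_selinux_mode is made up for illustration. The file only records the mode chosen at boot, so a machine switched at runtime may differ from what it says.

```python
def configured_selinux_mode(config_text):
    """Return the value of the SELINUX= line in config_text, or None.

    Hypothetical helper: it only parses text and does not query the kernel
    for the mode that is in effect right now.
    """
    mode = None
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        # Match the key exactly so that SELINUXTYPE= is not picked up.
        if sep and key.strip() == "SELINUX":
            mode = value.strip().lower()
    return mode
```

Feeding it the text of /etc/selinux/config returns "enforcing", "permissive" or "disabled". If the answer is permissive, denials are logged but not enforced, which is why I was confident SELinux was not stopping the software.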

The subsequent emails were pretty productive, including a link to a new build of the tool that fixed the first problem. After that I tracked down the second problem: there was some memory corruption related to how I was building my binaries (apparently their software didn’t like static binaries). This has been reported and I’m sure they’ll attend to it, but the point is that it wasn’t SELinux’ fault at all. If this happened to someone who didn’t know better they would have decreased their security for no reason at all.

Now, the point of this article is to talk to vendors, and to users who are told by their vendors that SELinux is causing problems with their software. SELinux has become the popular thing to pick on, mainly because, it seems, vendors don’t understand it and it had a reputation for breaking things back in the FC2 days (when the old strict policy was used and caused many, many problems).

The fact is that SELinux is a disruptive technology. There will always be people who fight against disruptive technologies, people who like things the way they are and see no need to change. I personally believe this disruptive technology is necessary for the future of computing but that is another topic. The most unfortunate part about the naysayers is that by telling people to disable SELinux they aren’t just affecting their app, they are reducing the overall security of the system.

Think about it: a vendor telling you to disable SELinux is, in many ways, like a vendor decades ago telling you to run chmod -R o+rw / on your system because those pesky permissions were clearly the cause of problems in their software. I can make no claims about whether the quality of software has increased or decreased over time, but I can say disabling your security system to run an app is rarely the right thing to do.

So how do you know if it really is SELinux’ fault? The main indication is a denial. You can check your audit log or messages file for an AVC (access vector cache) message detailing what access was denied. Most of the time it isn’t difficult to tell whether a denial is related to the application at hand. If the problem can’t be explained by a lack of permission to do something (and “something” is much more granular and comprehensive under SELinux than under Linux by itself), it almost certainly isn’t SELinux’ fault. If it is a permission problem and the application truly needs the access, recent SELinux advances let you insert the required allow rules into your policy without too much effort (see Dan Walsh’s blog for this sort of help). If it is SELinux’ fault, there is a pretty comprehensive list of resolutions in the Fedora Core 5 SELinux FAQ.
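If you would rather not eyeball the log, a short script can pull out the interesting parts of each denial. The following Python sketch assumes the usual "avc: denied { ... } for key=value ..." message layout. The function name parse_avc_denials and the sample log line are made up for illustration and are not taken from my machine or the vendor's software.

```python
import re

# Matches the core of an AVC denial and captures the denied permissions.
AVC_DENIED = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)"
)
# Matches key=value pairs, where the value may be double-quoted.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

# Invented example line, for illustration only.
SAMPLE_LOG = (
    'type=AVC msg=audit(1163776448.949:821): avc:  denied  { read write } for  '
    'pid=2345 comm="exampleapp" name="shadow" dev=dm-0 ino=98765 '
    "scontext=user_u:system_r:unconfined_t "
    "tcontext=system_u:object_r:shadow_t tclass=file\n"
)


def parse_avc_denials(log_text):
    """Return one dict per AVC denial found in log_text (hypothetical helper)."""
    denials = []
    for line in log_text.splitlines():
        match = AVC_DENIED.search(line)
        if match is None:
            continue  # not a denial, ignore
        fields = {
            key: value.strip('"')
            for key, value in FIELD.findall(match.group("rest"))
        }
        denials.append({
            "permissions": match.group("perms").split(),
            "comm": fields.get("comm"),          # command that was denied
            "scontext": fields.get("scontext"),  # who asked
            "tcontext": fields.get("tcontext"),  # what they asked for
            "tclass": fields.get("tclass"),      # kind of object
        })
    return denials
```

If the list comes back empty, or none of the entries name your application in the comm field, that is a strong hint the problem lies elsewhere.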

I believe it goes without saying that if you have a 3rd party application trying to write to /etc/shadow or /dev/mem you should not add that to your policy and should instead contact the vendor about their badly broken software. Likewise, if the application is doing other nasty things like executing anonymous memory mappings (of course JIT compilers have to do this, but a standard application generally shouldn’t), you should urge the vendor to fix their software.

I suspect that as SELinux gets more uptake, vendors will be forced to actually look into SELinux and do what is necessary to make their software work with it. This, of course, can include fixing their software or, if the access is truly necessary, writing policy that is included with the application. This gives the added benefit of letting you, the end user, actually see what the software is doing (if you care) and forces the vendor to evaluate their own software to see what it does. Software is often developed over a long period of time by many teams of programmers, so the current developers may not even know what huge parts of the software do.

However, the vendors aren’t going to do this until it makes sense from a financial point of view. I suggest making that happen as soon as possible by simply telling vendors “No, I will not disable SELinux”. If they want your business they’ll start looking at fixing the problems, which is a benefit to everyone including the vendor. A quick web search for “disable selinux” shows many companies (and even open source projects) which list disabling SELinux as the “solution” to their malfunctioning software. That list includes: VMware, Novell, Brother printers, Oracle, Sun, @Mail, Positive Software, Zend (PHP), Subversion, … the list goes on.

So, if you happen to have the misfortune of dealing with a vendor who chooses to put your infrastructure, business or personal machine at risk rather than fixing their software, just send them a link to this article. It may help you, them and everyone else who uses their products.