Kate's Comment

Thoughts on British ICT, energy & environment, cloud computing and security from Memset's MD

Security Aspects Of Open Source Software

Nick and I have built a market-leading, multi-award-winning, multi-million dollar hosting/cloud IaaS company using entirely open source software and an “automate everything” philosophy. We have recently attained a cross-government CESG accreditation for our service under the G-Cloud project, incorporating the open source hypervisor Xen, even though Xen itself was not certified. Here are my views on why open source is actually more secure and reliable than alternatives.

Why we use open source

We turned to open source for a number of reasons:

Price

You don’t need to pay for proprietary software anymore: simply download the open source software and install it, without paying a penny. Furthermore, you usually get unrestricted access to the source code, enabling you to modify it to suit your requirements.

Flexibility

Once you have the software installed you are free to host your applications wherever you like. This means you no longer need to put all your information in one basket, say with Google; instead you are able to separate the software from the host and own your own data. A good example of how to achieve that would be Zimbra, an open source, Web-based software as a service (SaaS) suite of office applications that can be hosted by any managed hosting provider.

Efficient systems integration

By using open source software and adapting it to suit our needs, with fairly minimal development effort, we’ve been able to build on those foundations to automate a large number of our processes such as account billing, administration, provisioning, maintenance and monitoring activities so that they require very little staff input. Our preferred core tools (all open source of course) are:

  • Python (programming language)
  • MySQL or SQLite (databases)
  • Django (application framework)
  • Nginx and Apache (Web servers)

A key part of our approach is “one database to rule them all”. Thus, our configuration management, billing, and everything-else database (aka “The Database of Doom”) is something we have built ourselves, using the above tools. As with most development approaches we tend to start with an object model, which we give to Django to turn into a database structure, and then hang a bunch of appropriate code off it. The in-house built stuff takes care of the following:

  • firewall rules management
  • asset management
  • IP address management
  • Domain Name Servers (DNS)
  • VLANs and switch fabric management
  • network connectivity/bandwidth regulation, shaping and accounting
  • automated provisioning of infrastructure as a service (IaaS)
  • customer accounts and details
  • billing / invoicing

That might seem like quite a lot but actually, thanks to the extremely powerful and highly efficient open source tools we employ, we have built all of that and maintain it with a very small development team (single digits). When you consider that we are the cheapest cloud IaaS provider on the UK market (including cheaper than Amazon) thanks to our approach of massive automation, and that most companies competing at scale have massive development teams, this should hopefully impress!
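To give a flavour of the “start with an object model” idea described above, here is a minimal sketch that uses only Python's standard library rather than Django itself. The `IPAddress` class, its field names and the `create_table_sql` and `save` helpers are all hypothetical; they are a much-simplified stand-in for what a framework's model layer does, not Django's API and not our production code.

```python
import sqlite3
from dataclasses import dataclass, fields

# Map Python types to SQLite column types (a deliberately tiny subset).
SQL_TYPES = {int: "INTEGER", str: "TEXT", bool: "INTEGER"}


@dataclass
class IPAddress:
    """Hypothetical object model for one managed IP address."""
    address: str
    hostname: str
    customer_id: int
    in_use: bool


def create_table_sql(model) -> str:
    """Derive a CREATE TABLE statement from a dataclass definition."""
    columns = ", ".join(
        f"{f.name} {SQL_TYPES[f.type]} NOT NULL" for f in fields(model)
    )
    table = model.__name__.lower()
    return f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, {columns})"


def save(conn: sqlite3.Connection, obj) -> int:
    """Insert one object as a row and return its new primary key."""
    names = [f.name for f in fields(obj)]
    placeholders = ", ".join("?" for _ in names)
    table = type(obj).__name__.lower()
    cur = conn.execute(
        f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})",
        [getattr(obj, name) for name in names],
    )
    return cur.lastrowid


# The object model drives the schema; code then hangs off the same classes.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(IPAddress))
row_id = save(conn, IPAddress("192.0.2.10", "web1.example.com", 42, True))
```

The point of the pattern is that the class definition is the single source of truth: adding a field to the class changes the schema and the insert code together.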

However, that is just the core, and in many ways it is little more than a rather large and complex, database-driven, object-oriented spiderweb of applications. The clever bit is that it also stitches together a whole bunch of open source applications, much like the conductor of an orchestra. Below are some examples of the individual open source tools and applications that make up our orchestra:

  • Xen Hypervisor to chop up a host server into multiple Miniserver VM virtual servers (more on this below)
  • Logical Volume Manager (LVM) – does the hard disk bit of Miniserver VMs
  • OpenStack Swift to run our Memstore cloud storage service
  • Linux Virtual Server (LVS) and Heartbeat to run our Performance Patrol clustering and load balancing service
  • iptables – runs our firewalls
  • Nagios which we use for monitoring all servers and services in our estate

Because all of those bits of software are open source, and therefore built with the intention of being transparent and easily accessible, it has been very simple for us to integrate them. Most of them are actually driven by simple textual configuration files, which we auto-generate and automatically propagate to the appropriate places. This elegant simplicity also means that our systems are very, very reliable. On the one hand, there is not much to go wrong, and on the other, when something does go wrong it is easy for us to poke around to figure out what the problem is without having to get in touch with some third party software vendor’s hopeless support desk!
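As an illustration of generating a textual configuration file from a database, the sketch below renders firewall rows into the plain-text format that `iptables-restore` reads. The `firewall_rule` table, its columns and the sample rows are assumptions made up for the example (with SQLite standing in for the master database); a real system would also validate every value before writing it into a rule.

```python
import sqlite3


def render_iptables(conn: sqlite3.Connection) -> str:
    """Render firewall rows as text in iptables-restore format."""
    lines = [
        "*filter",
        ":INPUT DROP [0:0]",      # default policy: drop inbound traffic
        ":FORWARD DROP [0:0]",
        ":OUTPUT ACCEPT [0:0]",
    ]
    rows = conn.execute(
        "SELECT protocol, source, port FROM firewall_rule ORDER BY id"
    )
    for protocol, source, port in rows:
        # One ACCEPT rule per database row.
        lines.append(
            f"-A INPUT -p {protocol} -s {source} --dport {port} -j ACCEPT"
        )
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


# Hypothetical rule table with two sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE firewall_rule ("
    "id INTEGER PRIMARY KEY, protocol TEXT, source TEXT, port INTEGER)"
)
conn.executemany(
    "INSERT INTO firewall_rule (protocol, source, port) VALUES (?, ?, ?)",
    [("tcp", "0.0.0.0/0", 443), ("tcp", "192.0.2.0/24", 22)],
)
config_text = render_iptables(conn)
```

Because the output is just text, propagating it is a matter of copying a file to the right machine and loading it, and debugging it is a matter of reading it.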

Cheap enhancements

Following on from the above, another of the beauties of open source tools is that, unlike their proprietary counterparts, it is possible to modify or enhance them yourselves relatively inexpensively. We have done exactly this with many of the solutions we use. For example, we improved the virtualisation technologies we employ to enhance their fair scheduling (i.e. making sure that one VM cannot monopolise the host’s resources). Another good example is OpenStack Swift, the object-storage system we use for Memstore. Normally it does not come with a content delivery network element nor the ability to upload files via FTP or SFTP. We added those pieces of functionality, giving us a competitive edge over other Swift users.

In some cases we have chosen to open source that code ourselves, as we have done with the FTP server add-on to Memstore. We have actually taken over as the lead developers on a project that had been ailing for lack of support, and others are now using that software. They can be anywhere in the world; for example, a Russian ISP is one of the organisations that has been helping us to further develop that package with detailed testing and improvement suggestions as well as some of their own patches. Through being open ourselves we are tapping into a wider communal resource that we would not otherwise have; in effect some of the minnows are able to group together to compete in terms of software development with the major players.

Mobility

Because all of our systems are Web-based it is really easy for people to work from home or on the road – one of the many advantages of a SaaS model but without the usual lock-in associated with proprietary providers – some popular enterprise resource management software vendors spring to mind! A good example of such a package we use is Trac – an integrated wiki, ticketing & project management system and software repository (it contains staff job lists/work flows as well as all company documentation). Further, because we can see all the inner workings of the systems we use, we can be completely confident about their security. For instance, even I with my rusty old coder skills can comprehend Trac’s security measures as well as the defences we put around it as part of self-hosting it (how it does access control, how we ensure it is only accessible via HTTPS, where the data is, who has physical access to the machines it is on, etc.), which gives me much more confidence than an opaque, proprietary SaaS service elsewhere.

Reliable, free personal operating systems

We have now migrated most of our staff to *nix-based systems (mostly Linux, but some of us use MacOS), and all they need is a browser and an email client. Firefox & Thunderbird are certainly enterprise-quality these days and indeed the collective subjective viewpoint is that using, say, Ubuntu (an open source desktop/laptop operating system) with the likes of Firefox, Thunderbird and LibreOffice is actually significantly more reliable than Microsoft Windows platforms. This opinion comes from many years’ experience among our systems administrators – they love Linux because it is more reliable and more secure (see below) than MS Windows as a personal operating system.

Transparency (no lock-in & easy analytics)

Open source is not solely about publishing your code and getting a community of developers and users to collaborate to build, maintain and improve it. It is also something of a design mind set; when you are building something that needs to be easily understandable and maintained by many you have to be transparent and easy to comprehend from the ground up.

This means that most open source applications store their information in very accessible ways, most commonly in databases like MySQL in fact, and usually with very clearly defined and accessible structures. As a result it is very easy to export and import data between systems. It also means that you can do data mining and analytics very easily. We run the company in a very scientific way, exploiting very large data sets (we collect about five million data points per day) to inform our management decision-making. We do this without any expensive software or special expertise; I can instantly access statistics from any of our systems using tools that our development team cooked up, sometimes in a matter of minutes, which simply run a query on the appropriate database.
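A tool of the kind described above can be as small as the sketch below: one function wrapping one aggregate query. The `bandwidth_sample` table, its columns and the sample figures are invented for illustration, and SQLite stands in for whichever database the application actually uses.

```python
import sqlite3

# Hypothetical table of per-server daily bandwidth measurements.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bandwidth_sample (server TEXT, day TEXT, megabytes REAL)"
)
conn.executemany(
    "INSERT INTO bandwidth_sample VALUES (?, ?, ?)",
    [
        ("web1", "2012-06-01", 120.0),
        ("web1", "2012-06-02", 180.0),
        ("db1", "2012-06-01", 40.0),
        ("db1", "2012-06-02", 60.0),
    ],
)


def usage_by_server(conn: sqlite3.Connection):
    """Return (server, total, daily average) rows, busiest server first."""
    return conn.execute(
        "SELECT server, SUM(megabytes) AS total, AVG(megabytes) AS daily_avg "
        "FROM bandwidth_sample GROUP BY server ORDER BY total DESC"
    ).fetchall()


report = usage_by_server(conn)
for server, total, daily_avg in report:
    print(f"{server}: {total:.0f} MB total, {daily_avg:.0f} MB/day")
```

This only works because the schema is open and documented; with a closed data store, the same report would depend on whatever export or reporting features the vendor chose to sell.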

This open design approach is in stark contrast to many proprietary solutions where the vendor’s strategy is to make it very hard to dig into the data so that a) the users get locked in to that solution and b) they can monopolise the addition of helpful services or data analytics tools and make more money on that part too.

Increased Security

It is the opinion of everyone at Memset (all experts in their fields and many fully “bi-lingual” in terms of Linux and Windows operating systems) that, in general, open source applications are more secure than their commercial equivalents. This may seem a bold claim, but let us examine it. The source code for open source software is just that; open and public. Open source applications stand naked in the wind and anyone in the world may scrutinise them and attempt to hack them with this significant advantage. The result of this is that if there are any exploits they are rapidly discovered by the open source community of developers (those who specialise in finding exploits and publishing their findings are called “white-hat hackers”, as opposed to “black-hat hackers” – the ones who do it in secret for personal gain) with an interest in a particular project who then release a patch.

By contrast, proprietary (closed source) software is not open to this wide scrutiny. Instead the usual way exploits are discovered is by a hacker somewhere in the world who busily takes advantage of the exploit or bug for their personal gain – and your personal loss. Inevitably, news of such exploits gradually leaks out and the corporations behind the software patch the hole. However, it can often be many days or even weeks before the likes of Microsoft become aware of the hackers making merry with a flaw in their code, and it is not uncommon for it to take many more weeks to get the patch rolled out. In the intervening time there is often significant damage done.

For evidence of what I’m talking about you need only look at the security advisory sites to see that there are, in general, many, many more serious security exploits for closed source products than open source products. Further, such sites also demonstrate that it is more common for exploits in open source software to be first discovered by white-hat hackers who do not exploit the bugs.

To take two more simplistic examples: if you put a Linux server online with no firewall and default settings then it is extremely unlikely to get “rooted” (root is the Linux equivalent of the Windows administrator account – the super user account which can do anything) – it is secure by default. In contrast, as a hosting company we rapidly learned that you cannot deploy a default-install Windows machine without a firewall since it immediately gets hacked, usually by automated viruses. What do I mean by immediately? Well, our record was 17 seconds from the completion of boot up! This is why, if you look at the Miniserver VM virtual servers on our Web site, we provide a free basic firewall with Windows machines.

The Basis of Miniserver Virtual Machines

All of our Miniserver virtual machines use Xen, which is an open source virtual machine monitor or hypervisor originally developed by Dr. Ian Pratt at the University of Cambridge’s computing department. Since then it has had contributions from many major companies including IBM, Microsoft and Intel. Originally the main goal of the design and development was being able to run up to a hundred full-featured OS instances on a single computer or server. Xen provides secure isolation, resource control, quality of service guarantees and also protects each individual account on the system. The advantages of having this technology are clear and for applications such as web hosting where server load and higher amounts of processor or memory power are not needed the benefits and cost savings are huge.

Why are Miniserver VMs better?

One reason that Xen is more effective than commercial programmes like Virtuozzo is that operating systems must be explicitly modified to run on Xen. This enables Xen to achieve high-performance virtualization and also prevents any sharing of memory or processes, or any individual account on the server disrupting any others. Xen also allocates each account on the system its own sub kernel, making it, at an operational OS level, a dedicated machine. This means that should one account fail or crash, the others would continue unaffected. Virtuozzo, on the other hand, relies on the services of a single kernel, so all of the VPSs on a given server must run basically the same operating system. That reliance on a single kernel is another major issue for Virtuozzo: should the underlying OS kernel fail, all VPSs running on the server would be brought down as a result.

Xen uses a technique called paravirtualization to achieve high performance (typical performance penalties are around 2%). At the other end of the spectrum, emulation solutions entail performance penalties of around 20%.

After extensive testing we opted for using Xen even though we had to do a reasonable amount of bespoke work ourselves to port Linux operating systems to the server. Many other hosting companies have now jumped on the virtual server bandwagon and most of these have decided to use Virtuozzo. That is because it is very easy to set up and administer, and also allows them to put up to sixty accounts on one server. We only put 5-10 virtual machine accounts on each physical server and as a result performance and uptime are excellent.

Another important consideration is that because Xen is open source, unlike Virtuozzo, we can keep the costs of our Miniservers lower and we do not need to put a large number of accounts on a server to cover our underlying cost.

Open Source Advantage & The Future

Being open source, Xen also allows us to offer customers the ability to manage and change their kernels, and as Xen supports different operating systems on the same server we can offer customers the choice of either Debian or Fedora as the underlying Linux OS.

Our firewalls use open source Linux iptables, which we automatically configure from our master database (open source MySQL, of course) with our own scripts. We also use the new open source OpenStack software for our cloud storage solution, Memstore.

We also use an open source implementation of the VLAN and bridging software (for creating a virtual switch on the host servers), all using standardized, open interfaces and protocols.

Using Linux and open source software for our core infrastructure also means we are able to use commodity servers for everything from our firewall-routers to our virtual machine hosts. By doubling up on everything (which is made economically very feasible when using commodity hardware and not paying license fees) we are able to achieve huge levels of resilience for very little outlay.
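The arithmetic behind doubling up can be sketched in a few lines. This is a back-of-envelope model only: it assumes the two units fail independently and that failover is perfect, and the 99% single-unit figure is an invented illustration rather than a measured number from our estate.

```python
def combined_availability(single: float, copies: int = 2) -> float:
    """Probability that at least one of `copies` independent units is up."""
    return 1 - (1 - single) ** copies


single = 0.99                             # assumed availability of one box
paired = combined_availability(single)    # two boxes, either can serve

print(f"one unit:  {single:.2%}")
print(f"two units: {paired:.2%}")
```

Under those assumptions the chance of both units being down at once is the product of the individual failure probabilities, which is why a second cheap commodity server buys far more resilience than a single expensive one.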

Security and brand concerns

Moving into the government hosting space, we’ve been able to dispel some of the myths that open source is insecure and unsuitable for high-security requirements. In fact, Memset firmly believes that open source is more secure than closed source software. Recent penetration tests carried out by Encription, who are CREST & Tiger certified, found no vulnerabilities or warnings of any kind. The tests included attempts to launch attacks on virtual machines (VMs) sharing the same host server, and the failure to cause any impairment to the performance or security of the attacked VMs demonstrates the integrity of Memset’s Xen-based hypervisor layer.

Gaining cross-government CESG accreditation for our service, incorporating the open source hypervisor, even though Xen itself was not certified, shows that any virtualisation software could be used to put government VMs at IL2 on the same machine as other customers (in our public cloud), regardless of the software’s security certification.

I believe people are hung up on virtualisation and make a big issue out of VMware versus Xen. But it’s not a big issue. Yes, VMware is accredited to multi-tenant IL3 VMs alongside untrusted ones, but in reality you need to have a separate infrastructure stack for IL3 hosting anyway as it can’t be connected to the public internet. It’s possible to use Xen in that setting; we are just going to run a secure IL3 community cloud for government. If some public sector customers still have an issue we just provision a private cloud within that estate.

The Government ICT strategy, released in March 2012, said open source solutions should be considered alongside proprietary frameworks during digital procurement. By 2015, the Government hopes to procure 50% of ICT through cloud-based solutions; this simply highlights the need to get up to speed with a platform that’s relatively new to the public sector.

Is Open Source Software Really Enterprise-Ready?

Despite the fact that there are several enterprises using open source to run mission-critical functions, there are some CIOs that still prefer proprietary software for their enterprise requirements. Their major concern with open source projects is whether the software will be supported in the future, given its reliance on an unpaid community of volunteers. There are several flaws in this perspective:

First off, you have this problem with commercial software too: what if the supplier fails, or, in the case of one like Microsoft, what happens when they change version and stop supporting yours?

Second, while some open source packages are indeed more of a labour of love than something commercially motivated, as cited in my example with Memstore’s FTP/SFTP server, there are an increasing number of commercial entities that recognise the value in pooling their efforts in a collaborative manner and have bet the farm, so to speak, on open source solutions. You might say that we’re an odd, small British company, but I would remind you that OpenStack (the leading open source cloud IaaS solution) is actually Rackspace’s code – the multi-billion dollar number one managed hosting company. They saw their market share being eroded by Amazon Web Services and their answer was to fight back by open sourcing their code base. It was a very clever move; they have in effect united us Davids in the war on the Goliath that is Amazon Web Services.

Third, those CIOs simply do not understand developers! Personally I would also trust the projects that are more of a labour of love, provided I could be confident that if push came to shove we could take it on and become the leads ourselves. This is because, unlike C*O execs, developers (open source oriented ones I should add, I don’t necessarily include Microsoft-oriented devs in this) are in general extremely bright men and women who are not overly motivated by money but are more interested in having interesting and challenging problems to solve. They also take satisfaction from collaboration, a bit like scientists, and are highly motivated by the kudos that contribution to open source projects brings. In fact, I would go as far as to say that the open source development community is the single example of functional, healthy communism.

Finally, even if a project does stop being supported, because of the aforementioned transparency and because (if you’ve taken my advice) the solutions are self-hosted (ie. you’re getting the software from someone other than the organisation providing the hosting) you are in total control of your own data and can easily migrate to someone else.

We do not use Google Docs, for example, mainly because I don’t want all my company information to be stored on servers with a company I don’t wholly and utterly trust. Not only am I not confident that they can keep our data safe and/or will not misuse it, but there is no guarantee that they will not close ranks and start making it hard to export stuff. Unlikely, yes, but personally I’d much rather put my faith in a community of enlightened, liberal, intelligent people who are just trying to make things work a little better than in a purely capitalistic entity.
