Introducing YippieMove '09. Easy email transfers. Now open for all destinations.

Introduction
“No way! That’s impossible.” Well, actually it’s not. Using Open Source technology, it’s possible to create a competitive IT infrastructure at very low cost. Not only does Open Source software let you build more customized solutions that better fit your needs, it also means that you can spend your budget on hardware rather than software.

Last month I was asked by a company to figure out how to ‘modernize’ their IT infrastructure with a minimal (almost nonexistent) budget. After plenty of thinking and research, I came to realize that the only way to do this was to use some kind of thin-client solution. The solution that fit my needs best was the Linux Terminal Server Project (LTSP). LTSP uses network boot (we will use PXE) to boot the clients directly from the server. Therefore, we can use obsolete PCs without hard drives (which also reduces noise) as thin clients. The only thing each client needs is a fast network adapter with PXE support.

Some of the widest adoption of LTSP has been within the K-12 education field. Since many educational institutions work with very tight budgets, LTSP has a strong advantage there. It is a way to cut costs without compromising too much on usability. Edubuntu is a Linux distribution that targets educational institutions. What makes Edubuntu very interesting is that it comes with LTSP support out of the box, which enables system administrators with limited knowledge of Linux to get a thin-client setup running with little effort. Since Edubuntu is closely related to regular Ubuntu, it’s not very hard to get Ubuntu up and running as an LTSP server.

Infrastructure

This layout is quite typical for an LTSP setup. You might also want to add a couple of network printers.

The Hardware
With a very limited budget, I realized that a thin-client solution would be the most realistic approach. As the name implies, the clients are thin and most of the load will be on the server. Therefore we will spend most of the money on a solid server. After a good amount of research to find a cheap, yet powerful and expandable server, I found the HP ProLiant ML115. The server comes with a 64-bit 2.2 GHz dual-core AMD Opteron CPU, which will serve our needs well. However, it only comes with 512 MB of RAM, which is insufficient for the number of users we intend to have. Therefore we’ll need to purchase some additional RAM. The RAM consumption estimates vary across different LTSP projects, ranging from 256 MB + 32 MB per client to 1024 MB + 64 MB per client. However, since I’d rather be on the safe side, I’d recommend that you purchase another 2 GB of RAM (2.5 GB in total) and put it in the server (1024 MB + 150 MB per client).
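The sizing rule above is simple arithmetic: a fixed base plus a per-client share. Here is a minimal sketch of it in Python, assuming ten clients (the number of network adapters budgeted for). The function name and the labels are only illustrative.

```python
def server_ram_mb(clients: int, base_mb: int, per_client_mb: int) -> int:
    """Estimated server RAM in MB: a fixed base plus a share for each client."""
    return base_mb + clients * per_client_mb


CLIENTS = 10  # assumption: one client per budgeted network adapter

estimates = {
    "low estimate": server_ram_mb(CLIENTS, base_mb=256, per_client_mb=32),
    "high estimate": server_ram_mb(CLIENTS, base_mb=1024, per_client_mb=64),
    "safe side": server_ram_mb(CLIENTS, base_mb=1024, per_client_mb=150),
}

for label, megabytes in estimates.items():
    print(f"{label}: {megabytes} MB")
```

For ten clients this works out to 576 MB, 1664 MB and 2524 MB respectively, so the 2.5 GB total covers even the most cautious rule. Change CLIENTS to see how the requirement grows with more seats.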

The next thing we need to add is a Gigabit switch, to reduce the risk of the network becoming a bottleneck (note that I’m not sure if a 10/100 Mbit network would actually create a bottleneck in this setup, but I’d rather be safe than sorry). Since Gigabit is cheap today, going Gigabit all the way seems like a reasonable move. Therefore, I’ve budgeted for 10 Gigabit network adapters (with PXE support) and a 16-port Gigabit switch (the HP server comes with a Gigabit network adapter).

Now we need somewhere to store the users’ data with high security and performance. Since we’re on a limited budget, we will use a software RAID solution rather than a high-end hardware RAID solution. A RAID-5 setup on three SATA disks using Linux’s software RAID is probably the cheapest and most reliable option for the price. It improves performance and lets the array survive the loss of any one drive without losing any data. We also use a separate OS disk to reduce the I/O load on the array.
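To see why a three-disk RAID-5 array survives the loss of any one drive, it helps to look at the parity idea in isolation. The sketch below is a toy illustration in Python and not how Linux’s software RAID is implemented. It treats one stripe as two data blocks plus an XOR parity block, whereas a real array works on much larger chunks and rotates the parity across the disks. The 250 GB disk size is a hypothetical figure.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on three disks: two data blocks and their parity.
data_a = b"user"
data_b = b"data"
parity = xor_blocks(data_a, data_b)

# Simulate losing the disk that held data_b and rebuilding it from the rest.
rebuilt_b = xor_blocks(data_a, parity)


def raid5_usable_gb(disks: int, disk_size_gb: int) -> int:
    """One disk's worth of space goes to parity; the rest holds data."""
    return (disks - 1) * disk_size_gb


usable = raid5_usable_gb(disks=3, disk_size_gb=250)  # hypothetical disk size
print(rebuilt_b == data_b, usable)
```

The same XOR rebuilds whichever single block is missing, which is why one failed drive costs no data. The price is one disk’s worth of capacity: three hypothetical 250 GB disks give 500 GB of usable space.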

Because everything is both stored and running on the server, it’s crucial that we protect the equipment from power failures and spikes (the server by itself is a single point of failure). Therefore I’ve added a UPS to the budget (1100 VA) that will not only protect our hardware but also reduce server downtime.

The last thing we need to add is the clients. It’s quite likely that you have many retired PCs lying around (preferably Pentium II or better). If not, you’re likely to find many of these computers for sale (or for free) in your local classified ads section. If Craigslist is available where you live, it is a great source for this kind of hardware. Even though many of you will get these computers for free, I’ve budgeted $50 per client.

Budget

Here’s the complete budget for the project. As you can see, the total is just below $3,000.

The Software
I’ve divided this section into two parts: Server and Desktop. Although all software will be running on the server, the users will only see the software on the Desktop side, which is why I have separated them.

Server
As already touched upon, the server will be running Linux. To be more precise, I’ve decided to use Ubuntu Server 6.06.1 LTS (64-bit). The reason why I didn’t choose the most recent version (7.04) is to minimize the administrative effort. Since 6.06 is the Long Term Support (LTS) version, the Ubuntu team will supply it with security patches and updates for a longer period than the 7.04 release (6.06 LTS Server will be supported until 2011).

Installing Ubuntu is very straightforward. The only thing to consider in this setup is to make sure we install the system on the hard drive that came with the server and not on the RAID-5 array.

Configuring the RAID-5 array can be done in a couple of different ways. You can either do this during the installation (with Ubuntu’s graphical utility), or wait and set it up afterward in the console (see the Software RAID HOWTO for details). After setting up the RAID, go ahead and mount it at /raid or something similar.

Now it’s time to set up LTSP, which is the foundation of our cost-saving solution. There are a couple of different ways to do this, but the one I found most useful was to follow a guide from Novell (strangely enough) available here. You might also want to take a look at Ubuntu’s Thin-client documentation.

Before you start the installation, go ahead and symlink /opt, which is where LTSP will be installed, to a directory on your RAID array (ln -s /raid/opt /opt; the target on the array comes first and the link name second). This will install all the packages on the RAID array instead of the system disk. Finally, you will want to add a test user (or a real user, it’s your call). This is done by simply using the user-management tool in Ubuntu. Note that you probably want to keep the users’ home directories on the RAID array. To achieve this, you can either symlink the entire /home to /raid/home or just set the home directory to /raid/home/user during user creation.
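The direction of the symlink matters: the link lives at /opt and points at the directory on the RAID array, not the other way around. The following minimal Python sketch plays this out inside a temporary directory, so it touches nothing on a real system. The paths only stand in for /opt and /raid/opt, and the marker file name is made up for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    raid_opt = os.path.join(root, "raid", "opt")  # stands in for /raid/opt
    opt_link = os.path.join(root, "opt")          # stands in for /opt
    os.makedirs(raid_opt)

    # os.symlink(target, link_name) takes its arguments in the same order
    # as `ln -s target link_name`.
    os.symlink(raid_opt, opt_link)

    # Anything written through the link ends up in the RAID directory.
    with open(os.path.join(opt_link, "ltsp-marker"), "w") as marker:
        marker.write("installed via the /opt link\n")

    landed_on_raid = os.path.exists(os.path.join(raid_opt, "ltsp-marker"))
    link_target = os.readlink(opt_link)

print(landed_on_raid, link_target.endswith(os.path.join("raid", "opt")))
```

On a real server the link name (/opt here) must not already exist as a directory when the link is created. The same pattern applies if you choose to symlink /home to /raid/home.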

Desktop
The LTSP setup comes with the most common software used in a corporate environment. This includes:
* Firefox – A great web browser
* Open Office – A Microsoft Office replacement
* Evolution – A Microsoft Outlook replacement
* The Gimp – An Adobe Photoshop replacement (arguably less powerful)
Plus a long list of other applications, such as a PDF viewer.

You might also want to install Wine. Wine is a Windows compatibility layer that runs (legally) without any Windows license. Although it does not run all Windows software, many applications work very well.

If you have needs beyond the applications listed above, there’s more available. Pretty much any software that runs in a Linux environment will run on these thin clients. Although I haven’t tested it, you should be able to run a fully virtualized Windows environment using virtualization software such as VMware Workstation, or CPU- and RAM-intensive software such as MATLAB or CAD applications.


Here’s a screenshot of LTSP in action (from the client side) running Open Office and the Gimp in Swedish.

The Clients
The last part is to get the clients to actually boot over the network. If you decide to use a different network card than the one specified in the budget above, make sure it supports PXE booting. Many budget NICs don’t support this feature. There are also other ways to boot (such as from a floppy or a CD), but I’m not going to cover those in this article.

The Pros and The Cons
Although this approach is a very good way to create an updated desktop environment while at the same time minimizing the administrator’s job, it does come with some drawbacks. Unfortunately, many companies today are stuck with custom software that only runs on Windows. Although Wine is a great compatibility layer, you might be forced to purchase a license for VMware Workstation (and Windows) to run some specialized applications. If you’re lucky, your custom software was written in Java and will run on Linux as well.

Another thing to consider is the transition from the old environment (quite likely Windows) to this new environment. Although the transition is likely to be smooth for a crew of young people, members of the older generation (40+) are likely to require some training before being able to use the system fully. Both Open Office and Firefox work very similarly to their Windows counterparts, but Evolution is slightly different.

Another pro is the lack of viruses on Linux. Since we’ve left Windows behind, the likelihood of being infected by a virus is almost zero.

In summary, a solution such as this might or might not suit your needs. If you do have the flexibility to use Open Source software in your organization, and are able to run your custom software either under emulation or through a web-based interface, this is a great way to cut costs from the IT budget and spend the money in a way that benefits the company more.

Update: I’ve now actually deployed this solution at a company. For information about how it went, go ahead and read “Deploying the sub-$3,000 IT-infrastructure.”


At least in my personal opinion, one of the strongest trends seen at the LinuxWorld expo in San Francisco over the last few years has been virtualization. This year many exhibitors had taken the next step and were actually using VMware products on their exhibit computers to simulate a number of servers in a network. For instance, Hyperic demoed their systems management software with a set of virtual servers.

Whenever virtualization comes up, the idea of grid computing isn’t far away as enterprises wish to maximize server utilization by turning their data centers into grids that each deliver the ‘services’ of CPU, memory, ports and so on. But in this brave new world of virtual machines and grid processing there is an element missing. If you’re moving your computing over to a grid computing model, why is there no corresponding grid storage model?

The commercial open source startup Cleversafe has that corresponding model. By employing a mathematical algorithm known as an Information Dispersal Algorithm, found in the cryptographic field of research, Cleversafe separates data into slices that can be distributed to different servers, even across the world. But it’s much more than just slicing and dicing: the algorithm adds redundancy and security as it goes about its task. When the algorithm is done, each individual slice is useless in isolation, and yet not all slices are needed to reconstruct the original data. In other words, your data is safer both in terms of security and in terms of reliability.

Cleversafe is not the first entity to come up with such a scheme. The idea of an Information Dispersal Algorithm is known from Adi Shamir’s paper ‘How to Share a Secret’ and other publications. When we met up with Cleversafe’s Chairman and CTO Chris Gladwin at LinuxWorld, he mentioned that the Information Dispersal Algorithm had been used in many applications before – even to store launch codes for nuclear weapons securely.
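To make the “any k of n pieces” idea concrete, here is a minimal Python sketch of the threshold scheme from Shamir’s paper mentioned above. It is not Cleversafe’s code, and the function names, the choice of prime and the example secret are all assumptions made for illustration. Note also that in Shamir’s scheme every share is as large as the secret itself, so the sketch shows the reconstruction property rather than the storage savings of a dispersal algorithm.

```python
import secrets

# Arithmetic is done modulo a prime larger than any secret we want to share.
PRIME = 2**127 - 1  # a Mersenne prime


def split_secret(secret: int, shares: int, threshold: int) -> list[tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a non-negative integer below PRIME")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and the number of shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coefficients = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for coefficient in reversed(coefficients):  # Horner's rule
            result = (result * x + coefficient) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 gives back the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        numerator, denominator = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                numerator = (numerator * -xj) % PRIME
                denominator = (denominator * (xi - xj)) % PRIME
        secret = (secret + yi * numerator * pow(denominator, -1, PRIME)) % PRIME
    return secret


pieces = split_secret(123456789, shares=16, threshold=12)
print(recover_secret(pieces[:12]) == 123456789)  # the first twelve pieces
print(recover_secret(pieces[4:]) == 123456789)   # a different set of twelve
```

Any twelve of the sixteen pieces reconstruct the value, while fewer than twelve reveal nothing about it. The modular inverse via pow(x, -1, p) needs Python 3.8 or later.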

The scheme is different from a simple parity scheme in that you can configure how many redundant pieces you want. With parity as found in common RAID setups, you can lose any one storage unit in the set. With an Information Dispersal Algorithm, you can make your system resistant to failure or corruption of any one, two or indeed any number of units in the set. If there’s a strike in your data center in Texas, and your German data center is on fire, your data will still be fully accessible through the remaining servers provided you began with a sufficient number of servers. And as opposed to the brute force solution of multiple mirrors of the data, the dispersal algorithm has a much smaller overhead.

Google is a well-known proponent of the brute force solution: the Google File System implementation suggests that the best method to keep your data continuously available is to keep three copies of it at all times. Cleversafe is a smarter system. If you have 16 slice servers (known as pillars in Cleversafe terminology) with a redundancy of 4 slices (known as the threshold), you can lose up to four servers simultaneously and still retain your data. At the same time, the total storage overhead is only 4/12, or about 33%, on top of the original data. The advantage compared to Google’s three copies method is clear: with three copies you only protect yourself against the failure of any two servers, and yet you pay a much greater price with a total of 200% storage and bandwidth overhead. And that’s not all. While you’re storing two additional copies of your data, you have effectively tripled the risk of that data being stolen. When a careless system administrator forgets the backup tapes in his car overnight and the car gets stolen, all those credit card numbers or what have you will be out in the wild, even though only one out of three locations was compromised. In our Cleversafe example, 12 separate servers would have to be compromised simultaneously, which is quite unlikely by comparison.
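The comparison above comes down to two small formulas. This Python sketch reproduces the numbers from the example, measuring overhead as extra storage relative to the original data. The function names are made up for illustration.

```python
def dispersal_overhead(pillars: int, tolerated_losses: int) -> float:
    """Extra storage, as a fraction of the original data, for a dispersed store."""
    slices_needed = pillars - tolerated_losses  # slices required to rebuild
    return tolerated_losses / slices_needed


def replication_overhead(copies: int) -> float:
    """Extra storage when keeping `copies` full copies of the data."""
    return float(copies - 1)


dispersed = dispersal_overhead(pillars=16, tolerated_losses=4)  # 4/12
replicated = replication_overhead(copies=3)                     # two extra copies

print(f"dispersal:   survives 4 lost servers, overhead {dispersed:.0%}")
print(f"replication: survives 2 lost servers, overhead {replicated:.0%}")
```

The dispersed layout tolerates twice as many failures for roughly a sixth of the extra storage (33% versus 200%).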

Cleversafe is not alone: there are other actors on the software market, such as the PASIS system. PASIS’ home page describes functionality very similar to Cleversafe’s: “PASIS is a survivable storage system. Survivable storage systems can guarantee the confidentiality, integrity, and availability of stored data even when some storage nodes fail or are compromised by an intruder.” Nonetheless, Cleversafe appears to be a step ahead of its competitors at this time and is poised to be the first to deliver grid storage to a wider market.

While the Cleversafe software is developed as open source through the Cleversafe Open Source Community at cleversafe.org, there is a commercial company behind Cleversafe: Cleversafe, Inc. Cleversafe, Inc. plans to generate revenue by offering a storage grid for rent based on the Cleversafe technology. “The market for a more secure, more cost effective storage solution is enormous,” says Jon Zakin, CEO of Cleversafe in a press release issued in May.

The Cleversafe project is available as Open Source under the GPL 2.0 License. The version online is apparently an early alpha version and is not ready for production use. According to Mr. Gladwin, there will most likely be a new version within a month, and sometime at the beginning of next year Cleversafe may be ready for production use. In the meantime, you can download the current alpha version of the software at the Cleversafe Open Source Website. You can read more about the algorithm at Cleversafe.org’s wiki, and there’s also a Flash video describing the idea.


© 2006-2009 WireLoad, LLC.