
If you’re a regular here at Playing With Wire, you’ve probably already read our articles about Cacti. While Cacti does a great job of visualizing the load on your servers, it does not (by default) alert you when a server goes down.

When we launched YippieMove we quickly realized that we needed a reliable third party that could ping our servers from several locations across the globe, to ensure that access to our site was never quietly broken. Since we’re quite tech-savvy here at WireLoad, we had a hard time justifying paying more than a few bucks per month for a service like this, because such a service is so easy to write. (We actually did write our own uptime monitor with alerts a few years back using Curl, Crontab and some other tools, but we would rather outsource this service.)
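For the curious, the do-it-yourself version really is tiny. Below is a minimal sketch in Python, meant to be run from cron. It is not the Curl-and-Crontab script we actually used, and the addresses in it are made up:

#!/usr/bin/env python3
# Minimal uptime check with email alerts, intended to run from cron.
# A sketch only: the alert address and sender below are hypothetical.
import smtplib
import urllib.request
from email.mime.text import MIMEText

SITES = ["http://www.playingwithwire.com/", "https://www.yippiemove.com/"]
ALERT_TO = "1234567890@vtext.com"  # carrier email-to-SMS gateway (made up)

for url in SITES:
    try:
        urllib.request.urlopen(url, timeout=10)
    except Exception as exc:
        msg = MIMEText("%s appears down: %s" % (url, exc))
        msg["Subject"] = "DOWN: %s" % url
        msg["From"] = "monitor@example.com"  # hypothetical sender
        msg["To"] = ALERT_TO
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)

Drop a line like ‘*/15 * * * * /usr/local/bin/uptime_check.py’ into your crontab and you have the poor man’s version of these services. What you don’t get is multiple vantage points across the globe, which is the whole reason to pay someone else.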

So the search began. We required a few things of this service:

  • Several servers across the globe that ping our servers.
  • Cheap. Preferably free (we don’t mind some ads).
  • Decent statistics showing response-times etc.
  • Reliable alert system by e-mail (luckily most US cell providers allow you to send email to your phone, using [email protected].)
  • Must allow monitoring of both SSL and non-SSL servers.
  • A minimum of 4 monitors (we needed to monitor playingwithwire.com, wireload.net, and yippiemove.com [with and without SSL]), though it would also be great if we could monitor our mail server.
  • The more frequent the pings the better.
  • No back-links required.

One of the most impressive sites we found was Pingdom, a small Swedish firm trusted by companies such as IBM, Loopt and Twitter (wow, with Twitter as a customer they must surely spend more bandwidth on alerts than on pings). What we really liked about Pingdom was the general look and feel of their site: it feels fresh, responsive and reliable. The pricing is definitely within reason: they charge $9.95 for their Basic plan, which includes 5 checks and 20 SMS messages.

The next site we stumbled upon was SiteUptime. The site has a decent look and feel, though it does not come close to Pingdom. After examining their pricing, we realized that we needed their Advanced plan, since none of the lower plans allowed SSL monitoring. That plan costs $10 per month. While their site and visualization lag behind Pingdom’s, the Advanced plan does give you 10 monitors, as opposed to Pingdom’s 5.

Another site we found was Pingability. The general look and feel of the site is OK, but the service offered was not great. The free plan requires a back-link (which we think is unacceptable for a professional site). At the same time the premium service, for $9.95, only offers one monitor.

Next up for review is Wormly. Priced at $9 per month, their Bronze plan seems like a reasonable alternative: it includes 5 monitors, and they ping your server 5 times every 5 minutes, which is good enough. Unfortunately there’s a big ‘but’: no SSL monitoring (at least as far as we can tell). That’s a deal-breaker. In Wormly’s defense, they do offer something that sets them apart from the competition, namely the ‘Server Health Monitor.’ This service is similar to Cacti (it certainly looks RRDTool-based) and visualizes server load. However, they will probably have a hard time selling it to security-conscious organizations, since it requires a monitoring client to be installed on the server (it’s hard to get this data otherwise).

Basicstate is the final service we will cover in this article. A lot can be said about Basicstate’s web design (it’s _really_ bad), but they offer a very competitive service. They ping every 15 minutes and allow you to monitor as many sites as you want (including SSL). While the site might not be very pleasing to browse, it does offer sufficient statistics (with graphs). In addition, they send you daily reports about all your monitored sites (with timing data for DNS, connect, request, TTFB and TTLB). The only drawback we discovered is that you cannot monitor the same domain name with both SSL and non-SSL (sub-domains are fine though). This may or may not be an issue for you.

The verdict? We settled on Basicstate. Later on, as we grow, we might consider switching to Pingdom, but we’re happy with Basicstate for now. Although we did experience some false alerts, the guy who runs the site (we assume), Spenser, did a great job of providing in-depth explanations of the alerts by email. So if you’re on a tight budget, Basicstate is our recommendation. If you have more money to spend, go for Pingdom.


Wow! Has it been one year already? Yes! Today marks exactly one year since we first published a short introduction to the blog (Starting Up). Since then we’ve had more than 90,000 unique visitors. In this celebratory post, I’d like to recap the year and bring you a list of the three most successful articles we’ve written over those twelve months.

1. Why Gentoo Shouldn’t be on Your Server (~40,000 views)
Our by far most popular article during the course of the year was “Why Gentoo Shouldn’t be on Your Server.” Not only did it catch a great deal of attention on blogs and forums around the world, it even made it to Slashdot. The article became so widely read because it discussed what is apparently a very sacred Open Source topic: Gentoo Linux. Many consider this Linux distribution the most elite and advanced one around because of its endless ability to be customized to your own needs. Our article took a look at how that kind of flexibility fares in the server world, and how it worked out for us after a year of use. We wrote that while we liked the distribution, it didn’t seem like the best idea to run it on a production server. Gentoo can be a great distribution for the lab machine, where you want to stay up to date with the most recent versions of everything, but for our production servers we would rather have something more stable that requires less frequent updates (only security updates), such as FreeBSD or Ubuntu (LTS).

2. Building a modern IT infrastructure for a small company (10 clients) with a sub-$3,000 budget (~18,000 views)

Interestingly enough, this article made it to the top without being mentioned on any of the big blogs: instead, the majority of our visitors found it through StumbleUpon. It is the first in a series of two articles about how to create a modern IT infrastructure from scratch on an extremely small budget. To achieve this, we rely heavily on Open Source software for all parts of the organization. Without going into details, we use a software suite called LTSP to turn a set of cheap old computers into modern thin clients.

In the first article, we talk about the entire concept of using LTSP and how everything fits together. In the second article (Deploying the sub-$3,000 IT-infrastructure), we actually deploy this concept in a real company. Not only do we discuss how we ended up setting everything up, but we also provide detailed information on the exact hardware used, how it was configured, and what the actual cost was.

3. Bye Bye Binders, I Won’t Miss You at All (~10,000 views)
This article was a bit different from the articles we usually write, but it drew quite a lot of attention and made it all the way to Lifehacker. In it we wrote about how you could turn those ugly binders in your bookshelf into something more useful and pretty: a set of PDF files. If you have a scanner with an Automatic Document Feeder (ADF), this guide helps you organize all of those documents into conveniently accessible PDFs.

So what did we learn over this past year?

  • Digg is overrated. Unless one of the few power-diggers ‘diggs’ your post, you probably won’t receive any noticeable traffic.
  • StumbleUpon is very sporadic and unpredictable. Even though no major blog wrote about our LTSP article, it’s still the second most visited article on our blog.
  • Blogging takes time. Although we really enjoy writing most of the articles here, it does take a whole lot of time.
  • You need quite a bit of traffic to make any money off a blog. Even though we’ve had over 100,000 visits this past year, our advertising revenue doesn’t even cover our hosting costs.
  • It takes time to establish a user-base. Nowadays we receive more traffic in a couple of days than we did in a month in the beginning.

We hope that you’ve enjoyed the first year of Playing With Wire, and that you will enjoy another year with our technology, internet and startup articles.


At least in my personal opinion, one of the strongest trends at the LinuxWorld expo in San Francisco in recent years has been virtualization. This year many exhibitors had taken the next step and were actually using VMware products on their exhibit computers to simulate a number of servers in a network. For instance, Hyperic demoed their systems management software with a set of virtual servers.

Whenever virtualization comes up, the idea of grid computing isn’t far away as enterprises wish to maximize server utilization by turning their data centers into grids that each deliver the ‘services’ of CPU, memory, ports and so on. But in this brave new world of virtual machines and grid processing there is an element missing. If you’re moving your computing over to a grid computing model, why is there no corresponding grid storage model?

The commercial open source startup Cleversafe has that corresponding model. By employing a mathematical algorithm known as an Information Dispersal Algorithm, which grew out of cryptographic research, Cleversafe separates data into slices that can be distributed to different servers, even across the world. But it’s much more than just slicing and dicing: the algorithm adds redundancy and security as it goes about its task. When the algorithm is done, each individual slice is useless in isolation, and yet not all slices are needed to reconstruct the original data. In other words, your data is safer both in terms of security and in terms of reliability.

Cleversafe is not the first entity to come up with such a scheme. The idea of an Information Dispersal Algorithm is known from Adi Shamir’s paper ‘How to Share a Secret’ and other publications. When we met up with Cleversafe’s Chairman and CTO Chris Gladwin at LinuxWorld, he mentioned that the Information Dispersal Algorithm had been used in many applications before – even to store launch codes for nuclear weapons securely.

The scheme differs from a simple parity scheme in that you can configure how many redundant pieces you want. With parity as found in common RAID setups, you can lose any one storage unit in the set. With an Information Dispersal Algorithm, you can make your system resistant to failure or corruption of any one, two or indeed any number of units in the set. If there’s a strike at your data center in Texas and your German data center is on fire, your data will still be fully accessible through the remaining servers, provided you began with a sufficient number of them. And as opposed to the brute force solution of keeping multiple mirrors of the data, the dispersal algorithm has a much smaller overhead.
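To make the threshold idea concrete, here is a toy (k, n) threshold scheme in Python, in the spirit of Shamir’s paper mentioned above. It is purely an illustration of the concept – Cleversafe’s actual Information Dispersal Algorithm is a different construction, designed to be far more storage-efficient:

# Toy (k, n) threshold scheme: split a secret into n shares such that
# any k of them reconstruct it, while fewer than k reveal nothing.
import random

PRIME = 2**127 - 1  # all arithmetic is done modulo a large prime

def split(secret, k, n):
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
            for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation at x = 0 recovers the constant term.
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * -xj % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

shares = split(42, k=3, n=5)
print(reconstruct(shares[:3]))  # any 3 of the 5 shares print 42

Lose any two of the five servers holding these shares and the data survives; steal any two and you learn nothing. Those are the same two properties Cleversafe gets from dispersal.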

Google is a well known proponent of the brute force solution: the Google File System implementation suggests that the best method to keep your data continuously available is to keep three copies of it at all times. Cleversafe is a smarter system. If you have 16 slice servers (known as pillars in Cleversafe terminology) and 4 of the slices are redundant – so that any 12 slices, the threshold, are enough to reconstruct the data – you can lose up to four servers simultaneously and still retain your data. At the same time the total overhead in storage space is only 4/12, or 33%. The advantage compared to Google’s three-copies method is clear: with three copies you only protect yourself against the failure of any two servers, and yet you pay a much greater price with a total of 200% storage and bandwidth overhead. And that’s not all. While you’re storing two additional copies of your data, you have effectively tripled the risk of that data being stolen. When a careless system administrator forgets the backup tapes in his car overnight and the car gets stolen, all those credit card numbers or what have you will be out in the wild, even though only one out of three locations was compromised. In our Cleversafe example, 12 separate servers would have to be compromised simultaneously – quite unlikely by comparison.

Cleversafe is not alone; there are other actors in this market, such as the PASIS system. PASIS’ home page describes functionality very similar to Cleversafe’s: “PASIS is a survivable storage system. Survivable storage systems can guarantee the confidentiality, integrity, and availability of stored data even when some storage nodes fail or are compromised by an intruder.” Nonetheless, Cleversafe appears to be a step ahead of its competitors at this time and is poised to be the first to deliver grid storage to a wider market.

While the Cleversafe software is developed as open source through the Cleversafe Open Source Community at cleversafe.org, there is a commercial company behind the project: Cleversafe, Inc., which plans to generate revenue by offering a storage grid for rent based on the Cleversafe technology. “The market for a more secure, more cost effective storage solution is enormous,” says Jon Zakin, CEO of Cleversafe, in a press release issued in May.

The Cleversafe project is available as Open Source under the GPL 2.0 license. The version online is an early alpha and is not ready for production use. According to Mr. Gladwin, there will most likely be a new version within a month, and Cleversafe may be ready for production use early next year. In the meantime, you can download the current alpha version of the software at the Cleversafe Open Source website, read more about the algorithm on Cleversafe.org’s wiki, and watch a Flash video describing the idea.


Yahoo! recently released a new Firefox extension called YSlow. This article describes how to get started and what we did at Playing With Wire to get our front page to load in almost half the time.

YSlow is a handy little tool for analyzing the performance of your websites. It will give you vital statistics and grade your site on 13 performance points with helpful hints for what you may be able to do to improve the loading speed of your pages.

Installation

YSlow is actually a plugin to a plugin in Firefox: to use it, you need Firefox and the Firebug web developer plugin. Both are easily installed via links from YSlow’s homepage, however.

Once you’ve restarted Firefox with the new plugins installed, all you have to do is activate Firebug for a particular site and you’re ready to go. Normally this means surfing to the site and then revealing Firebug from its icon in the Firefox status bar. Just click the Firebug icon and it will reveal its main view, which will most likely say that Firebug is disabled. Click ‘Enable Firebug for this web site’ and you’re ready to start dissecting the site’s performance with Yahoo!’s YSlow.

[Screenshot: Firebug revealed. This is what Firebug looks like.]

Taking your site apart

Once you have Firebug enabled, switch to the ‘Performance’ tab and you’ll get a grade on your website’s loading performance. The grade breaks down into several subcomponents, each corresponding to a point in Yahoo!’s Thirteen Simple Rules for Speeding Up Your Web site. The grading is fairly arbitrary and should be taken with a grain of salt: for example, if your page triggers 35 downloads and just four of them lack an Expires header, YSlow will give you a harsh F in that category.

[Screenshot: YSlow gives Playing With Wire an F for Expires headers. Hello, is this Google? Could you reconfigure your ad servers for me?]

Nonetheless, the sub-points of the grade sheet are great hints for what you can do with your site. While you’ll probably be forced to ignore the ‘grades’ if you have externally sourced ad units like we do, you can still work your way through the list and fix everything that you do have control over. This is what we did with Playing With Wire, and astonishingly enough we reduced the download size to about half of what it used to be. Below are the best tricks we learned or revisited after using YSlow.

Eliminate HTTP requests

This is a well known method that we had already worked into the design of Playing With Wire. The idea is to have as few CSS, image and Javascript files as possible. Most browsers will only download two files from a given host at a time, and there is always some overhead associated with downloading each new file. If you can combine files, you reduce this overhead.
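The build step for this can be as simple as concatenation. A sketch in Python, with made-up file names:

# Merge several stylesheets into one file so the browser makes a single
# request instead of three. The file names here are hypothetical.
import pathlib

parts = ["reset.css", "layout.css", "theme.css"]
combined = "\n".join(pathlib.Path(name).read_text() for name in parts)
pathlib.Path("site.css").write_text(combined)

The same trick applies to Javascript; just mind the order if one file depends on another.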

YSlow’s Components page let us know exactly what we were bringing in through links and we could eliminate an external Javascript we were no longer using.

Add Expires headers

Expires headers are important to let web browsers know that once they’ve cached an image, CSS include or Javascript file, they can keep using it for a while. Without these headers most browsers will keep downloading the same files over and over out of fear that they may change frequently. Again, YSlow’s Components tab reveals the relevant information: the Expires column lets you know what expiry date your web server is broadcasting for each downloaded file. If you notice files with no value in the Expires column, it may be time to go into your web server configuration file. In Apache, a section like this one might just do the trick:

<VirtualHost ...>
  ...
  # mod_expires must be loaded for the directives below to take effect.
  ExpiresActive On

  <Directory ...>
    ...
    ExpiresByType text/css "access plus 1 week"
    ExpiresByType text/javascript "access plus 1 week"
    ExpiresByType image/gif "access plus 1 week"
    ExpiresByType image/jpg "access plus 1 week"
    ExpiresByType image/jpeg "access plus 1 week"
    ExpiresByType image/png "access plus 1 week"
  </Directory>
</VirtualHost>

Enable compression

Most modern clients support streaming compression, a feature that lets the web server compress data before sending it to the client. This reduces the download time of the page at the expense of some CPU time on the server. While most graphics can’t be compressed much, it works out great for HTML, CSS and Javascript: all of these can often be reduced to as little as a third of their original size. The best part is that the web server won’t try to compress the data unless it already knows the client can handle it.

How to set up compression depends on your application and web server software. If you’re using Apache, you can have the server do it for you for normal files. For dynamic content, such as pages generated by PHP, it depends: in WordPress there’s a switch in the Options tab, and if you’re using wp-cache like Playing With Wire is, it may take some more work to get things up and running, but it’s well worth the effort.
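To get a feel for the payoff, you can compress a chunk of markup yourself. A quick illustration in Python, with made-up HTML:

# Repetitive markup compresses extremely well, often to a third or less
# of its original size. The HTML below is invented for the demonstration.
import gzip

html = b"<html><body>" + b"<p>Hello, wire!</p>" * 500 + b"</body></html>"
packed = gzip.compress(html)
print("%d bytes -> %d bytes" % (len(html), len(packed)))

This is essentially what the web server does on the fly for every client that announces gzip support in its Accept-Encoding header.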

Conclusion

Between these tricks and a couple more, Playing With Wire came out about half as heavy for the main HTML, CSS and Javascript, and as a result it feels much more responsive. All in all, YSlow was a helpful utility, especially thanks to its ‘Components’ tab, which made it easy to see which parts of the page were being cached and compressed properly and which were not. The grading system wasn’t very helpful: not all of the grade hints were applicable, and none of the hints were unknown in the field. Still, the list of suggestions was useful as a kind of laundry list of things to do for the site, combined with data specific to your own pages. At the end of the day, I’ve found a new partner for optimizing websites in Yahoo!’s YSlow.


My recent post about Wikipedia’s Wikia linking brought on some emotional responses. Since there seem to be some misunderstandings about what I’m arguing, in this post I’ll lay out a hopefully more succinct description of why Wikipedia’s actions are unfortunate and, ironically, promote spam rather than combat it on a large scale.

First of all, please be mindful that nowhere am I making the argument that ‘spam is good’, nor that Wikipedia should be a platform for spam. You will notice that Playing With Wire is not linked to from Wikipedia, and that we have no direct self-interest in Wikipedia’s use, or lack thereof, of ‘nofollow’ tags. What I do have an interest in is the health of the internet as a whole, and I believe that Wikipedia’s recent actions risk harming that health.

At the crux of the matter is the way modern search engines separate spam from useful content. Google and other search engines separate valid content from spam by inspecting the way the world wide web is interlinked. A site is considered ‘trusted’ if it has many inbound links from other trusted sites. The theory is that since humans make most links, sites that are useful to actual people accumulate plenty of human links over their life span, while spam sites only get links from other spam sites. If you score sites based on the quality of their incoming links, over time some sites rise above the general noise. As far as Google is concerned these are the ‘non-spam’ sites: other trusted websites have confirmed their validity.

You will notice that there is something circular about this system – a catch-22 if you will. To know which sites on the internet can be trusted, you must already know which sites can be trusted so that they may vote. To solve this apparent paradox, Google seeds the system so that every site starts with some base amount of trust. From there on Google starts to count: outgoing links ‘give’ trust to other sites, and incoming links conversely ‘receive’ trust from other sites. A mathematical formula balances the total amount of ‘trust’ so that eventually a stable structure crystallizes.
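For illustration, here is a toy version of that counting process in Python: a PageRank-style iteration over a made-up three-site web. Google’s real formula is of course far more involved:

# Trust flows along outgoing links; iterate until the scores stabilize.
# The miniature web below is invented: one site that never links out
# (all external links no-followed) and two blogs that link to it.
links = {
    "wikipedia": [],
    "blog_a": ["wikipedia", "blog_b"],
    "blog_b": ["wikipedia", "blog_a"],
}
damping = 0.85
trust = {site: 1.0 for site in links}  # seed: every site starts equal
for _ in range(50):
    trust = {
        site: (1 - damping) + damping * sum(
            trust[s] / len(links[s]) for s in links if site in links[s]
        )
        for site in links
    }
print(trust)

Run it and ‘wikipedia’ ends up with the highest score while the two blogs that voted for it sink: trust flows in and never comes back out, which is exactly the imbalance described below.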

You may think of this trusted structure as the sea with little trusted islands rising out of it. Google gives you good search results because most of the time it can find you an island rather than having to dive into the sea floor mud of spam and noise that is the general internet.

It is this structure and balance that makes Wikipedia’s choice of anti-spam technique so unfortunate. Since a lot of trusted sites have given their vote to Wikipedia, they have essentially lowered themselves a little into the sea in the process. Normally this would be fine, because when a trusted site lowers itself in this way it causes other islands to rise. Those islands in turn give away some of their buoyancy to yet other islands, and so forth. In the greater scheme of things the mud stays on the bottom and the islands stay on top.

Wikipedia has over time built a very strong position within this system. Wikipedia is one of the most trusted sites on the web as far as Google is concerned; it still amazes me how often Wikipedia comes up right at the top of search queries. Wikipedia is essentially one of the very few mountains in our sea analogy. But by not voting for other valid sites, Wikipedia pushes every other island back into the mud by its own sheer weight. Google can no longer give us as many valid search results, because the islands sit closer to the mud than Wikipedia does. These sites gave away their trust in the greater balance to a site that gives nothing back.

In a system where importance is measured as the relative difference between the average and the peaks, one enormous peak reduces the effectiveness of the whole system. What’s worse, Wikipedia is setting a very distressing example. Imagine for a moment that every site on the internet decided to do what Wikipedia is doing now and used nofollow tags on all external links. This would be in every site’s individual interest, as they would no longer give away their votes. But in doing so, the whole system would be ruined, as every site would be reduced to the level of the mud.

Ironically, Wikipedia is promoting spam on the internet in the process of trying to rid itself of it.

