Introducing YippieMove '09. Easy email transfers. Now open for all destinations.

Our email transfer service YippieMove is essentially software as a service. The customer pays us to run some custom software on fast machines with a lot of bandwidth. We initially picked VMware virtualization technology for our back-end deployment because we wanted to isolate individual runs, to simplify maintenance and to make scaling dead easy. VMware ultimately proved to be the wrong choice for these requirements.

Ever since the launch over a year ago we used VMware Server 1 for instantiating the YippieMove back-end software. For that year performance was not a huge concern because we were prioritizing many other things for YippieMove ’09. Then, towards the end of development, we began doing performance work. We switched from a data storage model best described as “a huge pile of files” to a much cleaner sqlite3 design. The reason for this was technical: the email mover process opened so many files at the same time that we’d hit various limits on simultaneously open file descriptors. While running sqlite over NFS posed its own set of challenges, they were not as insurmountable as juggling hundreds of thousands of files in a single folder.

The new sqlite3 system worked great in testing – and then promptly bogged down on the production virtual machines.

Figure: CPU usage on one of our core servers running VMware (a tough CPU week).

We had heard before that I/O performance and disk performance are the weaknesses of virtualization, but we thought we could work around that by putting the job databases on an NFS export from a non-virtualized server. Instead the slowness we saw blew our minds. The core servers spent a constant 70% of CPU time on system tasks, and despite an uninterrupted 100% CPU usage we could not transfer more than 400 Kbit/s worth of IMAP traffic per physical machine. This was off by an order of magnitude from our expected throughput.

Obviously something was wrong. We doubled the amount of memory per server, we quadrupled sqlite’s internal buffers, we turned off sqlite auto-vacuuming, we turned off synchronization, we added more database indexes. These things helped but not enough. We twiddled endlessly with NFS block sizes but that gave nothing. We were confused. Certainly we had expected a performance difference between running our software in a VM compared to running on the metal, but that it could be as much as 10X was a wake-up call.
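For readers who want to try the same knobs, here is a minimal sketch of that kind of sqlite tuning using Python's built-in sqlite3 module. The table, the index and the 8000 page cache size are hypothetical stand-ins, not our actual schema or settings.

```python
import os
import sqlite3
import tempfile


def open_tuned(path):
    """Open a job database with the kind of tuning described above."""
    conn = sqlite3.connect(path)
    # auto_vacuum has to be chosen before the first table is created.
    conn.execute("PRAGMA auto_vacuum = NONE")
    # A larger page cache; 8000 pages is an assumed figure.
    conn.execute("PRAGMA cache_size = 8000")
    # Turn off synchronization: faster, but a crash can corrupt the file.
    conn.execute("PRAGMA synchronous = OFF")
    # Hypothetical table and index, for illustration only.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages "
        "(id INTEGER PRIMARY KEY, folder TEXT, uid INTEGER)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS messages_folder_uid "
        "ON messages (folder, uid)"
    )
    conn.commit()
    return conn


def pragma(conn, name):
    """Read back the current value of a PRAGMA setting."""
    return conn.execute(f"PRAGMA {name}").fetchone()[0]


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "jobs.db")
    conn = open_tuned(db_path)
    for setting in ("auto_vacuum", "cache_size", "synchronous"):
        print(setting, pragma(conn, setting))
```

Note that cache_size and synchronous apply per connection, so they have to be set again every time the database is opened.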

At this point we realized that no amount of tweaking was likely to get our new sqlite3 version out of its performance hole. The raw performance just wasn’t there. We suspected at least part of the problem was that we were running FreeBSD guests in VMware. We checked that we were using the right network card driver (yes we were). We checked the OS version – 7.1, yep that one was supposedly the best you could get for VMware. We tuned various sysctl values according to guides we found online. Nothing helped.

We could have switched to a more VM-friendly guest OS such as Ubuntu and hoped it would improve performance. But what if that didn’t resolve the situation? That’s when FreeBSD jails came up.

Jails are a sort of lightweight virtualization technique available on the FreeBSD platform. They are like a chroot environment on steroids where not only the file system is isolated but individual processes are confined to a virtual environment – like a virtual machine without the machine part. The host and the jails use the same hardware but the operating system puts a clever disguise on the hardware resources to make the jail seem like its own isolated system.

Since nobody could think of an argument against using jails we gave them a shot. Jails feature all the things we wanted to get out of VMware virtualization:

  • Ease of management: you can pack up a whole jail and duplicate it easily
  • Isolation: you can reboot a jail if you have to without affecting the rest of the machine
  • Simple scaling: it’s easy to give a new instance an IP and get it going

At the same time jails don’t come with half the memory overhead. And theoretically I/O performance should be a lot better since there is no emulated hard drive.

And sure enough, system CPU usage dropped by half. That CPU time was immediately put to good use by our software. And so even though we still ran at 100% CPU usage, overall throughput was much higher – up to 2.5 Mbit/s. Sure, there was still room for us to get closer to the theoretical maximum performance, but now we were in the right ballpark at least.
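To put those two throughput figures in perspective, here is a small back-of-the-envelope calculation. The 1 GB mailbox is a hypothetical example and the units are decimal (1 Kbit = 1000 bits).

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Time needed to move size_bytes at a steady line rate."""
    return size_bytes * 8 / rate_bits_per_second


VMWARE_RATE = 400_000    # 400 Kbit/s, the rate we saw under VMware
JAIL_RATE = 2_500_000    # 2.5 Mbit/s, the peak rate we saw with jails
MAILBOX = 1_000_000_000  # hypothetical 1 GB mailbox

if __name__ == "__main__":
    for label, rate in (("VMware", VMWARE_RATE), ("jails", JAIL_RATE)):
        minutes = transfer_seconds(MAILBOX, rate) / 60
        print(f"{label}: {minutes:.0f} minutes per GB")
    print(f"speedup: {JAIL_RATE / VMWARE_RATE:.2f}x")
```

Under these assumptions a mailbox that needs five and a half hours per machine at the VMware rate goes through in under an hour at the jail rate.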

More expensive versions of VMware offer process migration and better resource pooling, something we’ll be keen to look into when we grow. It’s very likely our VMware setup had some problems, and perhaps they could have been resolved by using fancier VMware software or porting our software to run in Ubuntu (which would be fairly easy). But why cross the river for water? For our needs today the answer was right in front of us in FreeBSD: jails offer a much more lightweight virtualization solution and in this particular case it was a smash hit performance win.


As you guys have noticed by now, we have done a little refresh of Playing With Wire. At the same time we chose to upgrade to WordPress 2.7 from WordPress 2.2.3.

Unfortunately early versions of WordPress did not specify UTF-8 encoding for the tables created in the database. After the upgrade, WordPress was using UTF-8 but our tables were still in Latin 1, and we got quite a collection of funny characters in some of our postings. Examples include “â€™” instead of a quotation mark, or a stray “Â” in the middle of some whitespace.
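The effect is easy to reproduce outside of WordPress. The sketch below, which is our illustration and not part of the fix, encodes text as UTF-8 and then reads the bytes back as cp1252, the character set MySQL calls latin1. The function names are made up for this example.

```python
def misread_as_latin1(text):
    """Encode text as UTF-8, then decode the bytes as cp1252 ("latin1" in MySQL)."""
    # Only covers bytes that Python's cp1252 codec defines; a few byte
    # values (0x9D, for example) raise UnicodeDecodeError here.
    return text.encode("utf-8").decode("cp1252")


def repair(garbled):
    """Undo the misreading by reversing the two steps."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    curly_apostrophe = "\u2019"   # the typographic quotation mark
    non_breaking_space = "\u00a0"
    print(repr(misread_as_latin1(curly_apostrophe)))
    print(repr(misread_as_latin1(non_breaking_space)))
    print(repair(misread_as_latin1("it\u2019s")))
```

A single curly quotation mark turns into three characters and a non-breaking space picks up an extra accented capital A in front of it, which matches the kind of debris we saw in our posts.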

After searching for a while we found the solution at bawdo2001’s blog:

mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset DBNAME > DBNAME.sql
sed -e 's/latin1/utf8/g' -i ./DBNAME.sql
mysql -p --default-character-set=utf8 DBNAME < DBNAME.sql

In other words, just dump the database in latin1, swap out latin1 for utf8 in the output SQL and then reimport in utf8. Just make sure you get a good backup of your database in a separate file before you start reimporting.


Today we are very excited to announce YippieMove ’09, our largest update ever to our user-friendly online email transfer solution. YippieMove ’09 breaks down email barriers by unlocking transfers from anywhere to anywhere; you can now take your email from almost any IMAP account and shuttle it over to any other account. YippieMove does all this faster than before, while shining with gorgeous new graphs and visuals.

The idea of YippieMove is to unlock email and let the user make the switch to another email provider. When the original YippieMove was released we did just that – as long as you wanted to switch to Gmail. That’s all changed. In YippieMove ’09 any of our pre-configured email providers can now be the destination of your email transfer. You set up a new Zimbra mail account? No problem, we’ll get your old email in there. HyperOffice? Sure, if that’s what you want. Just like usual you can enter your own providers too if you’re a little handy.

Speed is up in the new version: through better caching and smart point optimizations in the mover we cut many transfer times in half. Most of you will hardly notice since the previous version of YippieMove routinely chewed through even huge jobs in just a few hours. But for the few of you who carry your whole life memoirs and then some in your inbox, the new version will really race to the finish. To reflect our confidence in the new speedy transfer engine we bumped up all the limits. Transfer twice as many emails and twice as many bytes with this new version: 20,000 emails and 20 GB respectively.

The new status page is the coolest new feature. You now get running updates on your transfer job with much more detail than before. What folders have been transferred and which ones are still in queue, what sizes your folders are, how many emails you have. It’s all in there. And since there’s so much data we have distilled it into line charts and bar charts, giving you an easy overview.

And it’s still all online. There is no bulky Windows-only resource hogging program to download. Nothing to install. Everything happens on our servers, using our internet connections. Just fill in your details and you’re good to go.

We are really happy with the new version. It’s available today at www.yippiemove.com.


Apr
03.
Category: Uncategorized

In preparation for our YippieMove ’09 unveiling next week (you can get a pre-announcement sneak peek now) WireLoad is today releasing version 0.2 of OFC WireLoad Edition.

When we began work on the new Status page of YippieMove ’09 we searched high and low for good charting software, both server based and dynamic. OFC 2 came out on top. OFC is an excellent Flash charts program written primarily by John Glazebrook. It supports several different chart types including line graphs, bar graphs and pie charts. It dynamically reads its data using JSON.
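As a rough illustration of what "reads its data using JSON" means in practice, the sketch below builds a chart description on the server side with Python's json module. The key names (title, elements, x_axis, y_axis) are written from memory of the OFC 2 format and should be treated as assumptions to check against the OFC documentation; the function name and the data are made up.

```python
import json


def status_chart(folder_names, emails_moved):
    """Build a bar chart description as a JSON string.

    The key names are assumed from OFC 2's JSON format, not verified.
    """
    chart = {
        "title": {"text": "Emails transferred per folder"},
        "elements": [{"type": "bar", "values": list(emails_moved)}],
        "x_axis": {"labels": {"labels": list(folder_names)}},
        "y_axis": {"min": 0, "max": max(emails_moved, default=0)},
    }
    return json.dumps(chart)


if __name__ == "__main__":
    print(status_chart(["INBOX", "Sent"], [120, 45]))
```

The Flash file would then fetch a document like this from a URL and redraw itself, so the server only ever has to produce plain JSON.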

To meet WireLoad’s specific design goals for YippieMove’s status page a number of modifications were made. We needed a particular look and feel, we wanted the fastest possible load times and there were a couple of glitches when using our particular data sets that needed fixing. Since many of these changes were very specific to our use case we opted to just branch the software and not disturb the ordinary development of OFC. This branch is what we are releasing today as OFC WireLoad Edition 0.2. We hope it will benefit the OFC community and perhaps interested parties will be able to find pieces and parts they can use elsewhere.

OFC 2 Hyperion was used as the base. An overview of the changes can be found below.

Visual Changes

  • Support for a gradient background.
  • Chart encompassing border.
  • Look of axes changed.
  • Pie chart drop shadow.
  • “Fuzzy” grid lines sharpened up.
  • New ‘spinner’ progress indicator.

OFC WireLoad Edition Graph

Functional Changes

  • Fast loading progress indicator which starts showing before the whole flash file has downloaded and remains until the graph data has been loaded.
  • New on-the-side legend for pie charts.
  • New build script for building without the Windows-specific FlashDevelop.

Size Reduction

  • Each chart type can be enabled or disabled at build time, which enables a site specific light-weight build. Many individual functions such as image saving can similarly be disabled.
  • Embedded fonts are no longer required for 0-90 degree rotated X axis labels or rotated Y axis labels.
  • Reduction of some redundant code.

The final version used on YippieMove’s status page is about 50KiB, down from 200KiB in the original.

If you want to set OFC WireLoad Edition 0.2 up for a test, be aware that when using IE7, SWFObject did not always properly detect the running Flash version in our testing. So you may see unexpected degradation to your non-Flash content. Updating to the latest version of Flash seems to resolve the issue, regardless of your installed version – it’s the reinstalling itself that fixes the problem. Word on the net is that there is an installation corruption issue happening to some IE7 users.

Downloads and a complete change log can be found on WireLoad’s open source page.

Jan
28.
Category: Technology

We recently had to decide on a configuration format for one of our internal utilities. In this post I’ll talk a little about why we picked YAML as the format and the reasoning behind it.

WireLoad has a couple of servers, each running a number of different services. For a long time we had almost one backup script per service, all hand hacked in bash to fit the requirements of the application. This wasn’t great because it meant we repeated a lot of work. To make the situation a little more manageable we developed BackupWire, a simple backup utility in a single file with a minimal number of dependencies.

The design goals of BackupWire were:

  • Minimal footprint: BackupWire shouldn’t be much heavier than the bash scripts we already had. Why? Because if it was huge and difficult to deploy we might end up writing little bash scripts instead!
  • Minimal dependencies: BackupWire should not have many dependencies. This is for the same reason as in the previous point. BackupWire needs to be easy to install.
  • Readable configuration: One of the problems with bash scripts is that once they’re a little complicated it gets hard to see what’s happening. BackupWire’s real purpose is to alleviate that headache by distilling most backup jobs down to a few lines of configuration.

In order to make everything dead simple the configuration was stored in the BackupWire script itself. This would make it easier to relocate the script, and it would decrease the chance that a config file was not found due to things such as the cron environment being sparse. However, this design decision made it hard to update the script with new versions. In addition, the configuration format became a little cumbersome because it was just a set of Python class instantiations. Hence this feature went against the third design goal of BackupWire. The latest version now uses a configuration file instead.

Thinking that the world really doesn’t need another arbitrary configuration syntax, I wanted to pick a standardized configuration format. So I read up on Wikipedia’s entry on configuration files and found the top three contenders: Lua, XML and YAML.

Lua, being a programming language these days, looked like it would add too many dependencies to BackupWire. BackupWire is written in Python, which we already have on all servers, but Lua we don’t use for anything else so it would be a new requirement which would have to be installed on each server. Also, it just struck me as a little excessive to have a full blown second programming language as a configuration format unless the application was really complex.

The other problem with Lua was that googling Lua config tutorial didn’t really give that many good results, making me think that perhaps the focus of the language has shifted from configuration to something else over time.

XML was immediately off the table, perhaps obviously to some of our readers. Most importantly XML is not a very readable language with its abundance of symbols and markup. But also, it’s not very easy to write for the same reason. The people behind Django’s documentation put it best when they said, “Making humans edit XML is sadistic!”

YAML is both readable and easy to write. There is also a lightweight Python module called PyYAML to read the format. Using YAML the new BackupWire configuration files are definitely to the point and concise without being complicated to edit. Here is an example of the new configuration format we developed in YAML syntax:

name:       "Sample Backup"         
to:         "/backup/"                   
frequency:  "daily"                 

tasks:
 - run: 
   command: 'df -h'
   log_output: True
 - archive:
   name: "etc.tbz"
   contents: ["/etc/", "/opt/etc/"]
 - archive:
   name: "tmp.tbz"
   contents: [ "/tmp/" ]
# Dump a database using a run task with 
# %(targetFolder)s to locate the destination.
 - run:
   command: 'mysqldump --quick --extended-insert 
     --compact --single-transaction 
     -u backup --databases sample 
     | bzip2 >%(targetFolder)s/mysql-sample.sql.bz2'
---

Not too bad as far as readability goes, and it is all standardized YAML, sparing the world yet one more syntax.
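For the curious, here is a minimal sketch of how a file in this format could be read with PyYAML. It is not BackupWire's actual loader: the load_tasks function, the way task types are detected and the shortened sample are all assumptions made for illustration. The %(targetFolder)s placeholder is filled in with ordinary Python string formatting.

```python
import yaml  # PyYAML, a third-party module

SAMPLE = """\
name: "Sample Backup"
to: "/backup/"
tasks:
 - run:
   command: 'df -h'
   log_output: True
 - archive:
   name: "etc.tbz"
   contents: ["/etc/", "/opt/etc/"]
 - run:
   command: 'mysqldump sample
     | bzip2 >%(targetFolder)s/mysql-sample.sql.bz2'
"""


def load_tasks(text, target_folder):
    """Parse a config and return (backup name, list of (kind, params))."""
    config = yaml.safe_load(text)
    tasks = []
    for task in config["tasks"]:
        # With this indentation each list item is one flat mapping in which
        # the task type ("run" or "archive") is a key with an empty value.
        kind = "run" if "run" in task else "archive"
        params = {k: v for k, v in task.items() if k != kind}
        if "command" in params:
            # Expand %(targetFolder)s; commands without it pass through.
            params["command"] = params["command"] % {"targetFolder": target_folder}
        tasks.append((kind, params))
    return config["name"], tasks


if __name__ == "__main__":
    name, tasks = load_tasks(SAMPLE, "/backup/2009-01-28")
    print(name)
    for kind, params in tasks:
        print(kind, params)
```

The multi-line command is folded into a single line by the YAML parser, so long shell pipelines can be wrapped in the configuration file without any extra escaping.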


© 2006-2009 WireLoad, LLC.
Logo photo by William Picard. Theme based on BlueMod © 2005 - 2009 FrederikM.de, based on blueblog_DE by Oliver Wunder.