Microsoft has announced a victory in the MinuteSort benchmark, claiming to have tripled the amount of data sorted by the previous record holder, a team from Yahoo. MinuteSort measures how much data a system can sort in 60 seconds. As more data moves into the cloud, the ability to sort it quickly becomes increasingly important.
According to Microsoft's post on TechNet, "In raw numbers, the team's system sorted 1401 gigabytes in just 60 seconds - using 1033 disks across 250 machines." Microsoft says this is roughly "one-sixth of the hardware resources" that Yahoo used, yet the system sorted around three times as much data. If both figures hold, that works out to roughly 18 times as much data sorted per unit of hardware.
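To put those figures in perspective, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above; the variable names are illustrative, the conversion assumes decimal units (1 GB = 1000 MB), and the "one-sixth" and "three times" ratios are treated as exact even though the article gives them as approximations.

```python
# Figures quoted from Microsoft's TechNet post.
DATA_GB = 1401      # gigabytes sorted
SECONDS = 60        # MinuteSort time limit
DISKS = 1033
MACHINES = 250

# Assumption: decimal units, 1 GB = 1000 MB.
MB_PER_GB = 1000

# Sorted data per second across the whole cluster.
aggregate_gb_per_s = DATA_GB / SECONDS

# Average share of that rate per machine and per disk.
per_machine_mb_per_s = aggregate_gb_per_s * MB_PER_GB / MACHINES
per_disk_mb_per_s = aggregate_gb_per_s * MB_PER_GB / DISKS

# Approximate ratios from the article, treated as exact here.
DATA_RATIO = 3.0          # about three times the data Yahoo sorted
HARDWARE_RATIO = 1 / 6    # about one-sixth of Yahoo's hardware

# Data sorted per unit of hardware, relative to the previous record.
relative_efficiency = DATA_RATIO / HARDWARE_RATIO

print(f"Aggregate:   {aggregate_gb_per_s:.2f} GB/s")
print(f"Per machine: {per_machine_mb_per_s:.1f} MB/s")
print(f"Per disk:    {per_disk_mb_per_s:.1f} MB/s")
print(f"Relative efficiency: about {relative_efficiency:.0f}x")
```

These are rates of sorted data, not raw disk I/O: a sort has to both read and write its input, so the actual traffic each disk handled would be higher than the per-disk average suggests.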
Interestingly, Microsoft Research did not use Hadoop, as one might expect. Instead, its researchers created a new system called "Flat Datacenter Storage" (FDS). The "flat" part is the key to the design. Microsoft explains:
[Microsoft Research's Jeremy] Elson compares FDS to an organizational chart. In a hierarchical company, employees report to a superior, then to another superior, and so on. In a "flat" organization, they basically report to everyone, and vice versa.
Combined with "full bisection bandwidth networks," FDS appears to be an effective way to sort data. Hadoop is currently the de facto commercial standard in the industry, so FDS may steal some of its thunder. Microsoft has hinted that FDS will likely be deployed on some of its own projects. The numbers have not yet been confirmed by an external party, so for now this is just Microsoft praising itself. I suggest taking the claim with a grain of salt until someone else verifies it.