Clusters

Clusters exist mainly for two reasons: high availability (HA) and load balancing.

The industry has created a number of solutions for managing the HA problem. Load balancing, however, is still a challenging problem, especially when it is not only the CPU load but also the I/O load, or a combination of the two, that needs to be distributed over several computers. Load balancing also includes the HA problem to some extent: the nodes of a cluster fail from time to time, and the system has to cope with that, or uptime goals cannot be reached.

Examples of individual cluster software

Here is a list of cluster components that were developed in the past:

Of course, in a real cluster application, a number of such components have to be combined to get the desired functionality.

Technology

Choosing the right technology is difficult. The usual "developer reflexes" are not always the right ones, and innovative approaches get a chance to prove themselves.

The stability of the running programs cannot be emphasized enough. In normal environments a developer is already happy when the application does not crash weekly. In a cluster system, even a crash rate (per running instance) of once per month may be far too high. With 100 nodes, this rate would mean roughly 3 cluster outages per day.
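The arithmetic behind this estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope model: it assumes one running instance per node, independent crashes, a 30-day month, and that every single crash counts as a cluster outage. The function name and parameters are made up for illustration.

```python
def expected_outages_per_day(nodes, crashes_per_instance_per_month,
                             days_per_month=30.0):
    """Expected number of crashes per day across the whole cluster.

    Assumes (hypothetically) one instance per node and independent
    crashes, so the per-instance rates simply add up.
    """
    crashes_per_month = nodes * crashes_per_instance_per_month
    return crashes_per_month / days_per_month


# 100 nodes, each instance crashing once per month on average
rate = expected_outages_per_day(nodes=100, crashes_per_instance_per_month=1.0)
print(f"{rate:.1f} expected outages per day")
```

Under these assumptions the failure rate grows linearly with the number of nodes, which is why a per-instance crash rate that seems acceptable on a single machine becomes a problem at cluster scale.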

Correctness is also of paramount importance. Given that a cluster computation is very costly (and often also lengthy), it is not desirable to find at the end of the computation that the results are incorrect.

Other criteria for choosing the right technology are the runtime speed and the "time-to-market", i.e. how long the development cycle takes.

Gerd Stolpmann prefers:

Software architecture

The architecture of a cluster system is extremely important in order to reach the promised performance goals. Generally, Amdahl's law limits the possible speed-ups in a cluster system, and a careful analysis must be done to ensure that all parts of the system really are parallelizable.
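To make the limit concrete, here is a minimal sketch of Amdahl's law in its usual textbook form, speed-up = 1 / ((1 - p) + p / N), where p is the parallelizable fraction of the work and N the number of nodes. The function name and the sample fractions are chosen for illustration only, and the model ignores communication and coordination overhead, which usually makes real results worse.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Upper bound on the speed-up according to Amdahl's law.

    parallel_fraction: share of the total work that can run in parallel (0..1)
    nodes: number of nodes working on the parallel part (>= 1)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Compare how much a 100-node cluster can gain for different workloads
for p in (0.50, 0.90, 0.99):
    print(f"parallel fraction {p:.2f}: "
          f"speed-up on 100 nodes <= {amdahl_speedup(p, 100):.1f}")
```

Even if 90% of the work is parallelizable, the speed-up can never exceed 10, no matter how many nodes are added, because the remaining serial 10% dominates the runtime.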

Success often depends on the data architecture. If the data can be organized in a cluster-friendly way, it can be accessed and processed simultaneously in a natural way, and the best speed-ups can be achieved. Some examples:

Components/Products

There is expertise for:

 
Dipl.-Inform. Gerd Stolpmann