we brake mostly for unix based, open source solutions
services ::.
[ cluster design and auditing ]
[ power efficiency and cooling ]
[ network design ]
[ automated maintenance and monitoring ]
[ server consolidation through Xen and KVM virtual machines ]
[ high availability clustering for critical services ]
[ cluster file systems and high performance storage ]
[ documentation ]
[ project cost ]



cluster design and auditing ::.
One of our main strengths is specifying and designing compute clusters with efficiency in mind. That means lower cost, less weight, higher rack density, and better cooling and networking, among much else. We also audit existing infrastructure and report on areas of waste and inefficiency; addressing these gets more performance out of your systems and paves the way for smooth, carefully planned expansion. When the time comes for planning, coordination and installation, we work with vendors to make everything come together. Details below.

Our work on your project ends up paying for itself many times over, for a net decrease in total systems cost.
power efficiency and cooling ::.
One of the biggest problems organizations face today is not only the massive cost of powering the machines that make up a cluster, but also the cost of cooling the systems infrastructure. We have worked through many different scenarios and can provide plans for a much reduced power footprint based on DC power infrastructure, load-based CPU throttling, and careful monitoring of and alerting on power and machine room temperature through SNMP and open source monitoring tools. We know how to use good airflow design and efficient rack organization to further optimize cooling. Remote monitoring and administration of power centers and Power Distribution Units (PDUs) is available for more flexibility.
network design ::.
We can design your networking infrastructure around your needs, anything from small 'unmanaged' switches all the way up to high density large datacenter switches supporting 100Gb/s interfaces, custom switching and routing configurations, and InfiniBand and Myrinet configurations. This includes network traffic monitoring, custom VLAN configurations, Layer 3 top talkers and NetFlow data (if the hardware supports it), and much more. We lean towards Cisco gear because of its stability and breadth of supported features, but will, of course, work with whatever networking gear you prefer.
automated maintenance and monitoring ::.
We provide automation solutions for many aspects of new and existing clusters. Some worth mentioning are automatic node installation mechanisms, including custom Kickstart and PXE solutions (and other 'netboot' implementations), as well as prepackaged open source cluster install packages such as ROCKS. Beyond installation, we also offer a wide range of automation and monitoring solutions. Common choices include the Ganglia cluster monitor and Nagios for node and server monitoring and alerting. We also configure custom installations of the Cacti graphing suite, which can be used for network monitoring, disk and file server I/O, machine room temperature monitoring, and any SNMP enabled device, and which supports custom scripts to track almost anything that changes over time. Another extremely powerful tool for automating installation and systems configuration is the cfengine package, which can be used extensively in nearly all environments. Using these and other monitoring solutions helps us discover inefficiencies in the system so we can improve cluster and server performance, save money on infrastructure and streamline your operation.
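To give a flavor of what a custom monitoring check looks like, here is a minimal sketch of a threshold check in the style of a Nagios plugin. It relies on the standard plugin convention of status codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN); the function name `check_temperature`, the thresholds and the sample reading are our own illustrative assumptions, and a real check would fetch the reading from a sensor, for example over SNMP.

```python
# Nagios plugin convention: status 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_temperature(reading_c, warn_c=27.0, crit_c=32.0):
    """Classify a machine room temperature reading.

    The default thresholds are illustrative assumptions, not recommendations.
    Returns a (status_code, message) pair.
    """
    if reading_c is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    if reading_c >= crit_c:
        status = CRITICAL
    elif reading_c >= warn_c:
        status = WARNING
    else:
        status = OK
    # Text after the "|" is performance data that graphing tools can pick up.
    message = (
        f"TEMP {LABELS[status]} - {reading_c:.1f}C "
        f"| temp={reading_c:.1f};{warn_c};{crit_c}"
    )
    return status, message


# Hypothetical reading, hard-coded here in place of a real sensor query.
status, message = check_temperature(29.5)
print(status, message)
```

When run as an actual plugin, the script would print the message and exit with the status code so the monitoring server can raise the matching alert.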
server consolidation through Xen and KVM virtual machines ::.
We can build VM server solutions using the major UNIX/Linux based hypervisors, including Xen and KVM/libvirt. Among other things, VMs of this sort make good hosts for monitoring platforms. We can, of course, deploy VMs in any way you require.
high availability clustering for critical services ::.
We offer High Availability (HA) solutions through Heartbeat and Pacemaker, which are available on several UNIX platforms. Be it redundant Web Server, VM Server or MySQL server clusters, we can make the environment more failure resistant and maximize uptime. We also offer services for DRBD live block device replication between servers, which eliminates the need for expensive SAN backends for shared storage and makes storage fully redundant in real time.
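The basic idea behind heartbeat-driven failover can be sketched in a few lines. This is a toy model only, not how Heartbeat or Pacemaker are implemented: the `Node` class, the `pick_active` function, the node names and the 10 second timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat was received


def pick_active(nodes, now, dead_after=10.0):
    """Return the first node, in priority order, whose heartbeat is still fresh.

    A node is treated as failed once it has been silent for more than
    `dead_after` seconds. Returns None if every node has gone silent.
    """
    for node in nodes:
        if now - node.last_heartbeat <= dead_after:
            return node.name
    return None


# Hypothetical two-node pair, listed in priority order.
primary = Node("web1", last_heartbeat=100.0)
standby = Node("web2", last_heartbeat=103.0)

# Both nodes reported recently, so the primary keeps the service.
print(pick_active([primary, standby], now=105.0))

# The primary stops reporting while the standby keeps sending heartbeats.
standby.last_heartbeat = 118.0
print(pick_active([primary, standby], now=120.0))
```

A production cluster manager adds much more on top of this, such as fencing and quorum handling, which is why we rely on established tools rather than home-grown scripts.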
cluster file systems and high performance storage ::.
We recognize the troubles inherent in delivering data to large computing environments in a fast and efficient way. Depending on the nature of your environment, we can provide several solutions based on what we think your system would most benefit from:

NFS:
Tried and true, the UNIX Network File System (NFS) can be leveraged for file serving in certain cluster environments. While it is not as scalable as many of its file system brethren, we can carefully tune and monitor NFS to give you much more performance than a default install will provide. If the I/O load on your servers and cluster nodes is modest, this solution can be implemented with the least administrative overhead. It should be noted that NFS was not designed with clusters in mind.

GPFS:
The General Parallel File System (GPFS), written by IBM, was designed specifically for high performance cluster access. It stripes data across many data server nodes in parallel, so aggregate data transfer bandwidth grows with the number of servers. It can be deployed in two ways: parallel server delivery or SAN based delivery. It also supports in-band data replication and therefore has high value for disaster recovery situations. Depending on the size of your equipment budget, the storage can be nearly limitless, and maximum bandwidth is often constrained only by your networking infrastructure. We have seen implementations that broke 20Gb/s data delivery to a large cluster without breaking a sweat. It should be noted that this is a commercial solution and is not open source. Other features include snapshots, data pools and quotas, and it is POSIX compliant. It can support a massive namespace: a single filesystem can be petabytes in size.
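To illustrate why striping helps, here is a simplified model, not GPFS's actual placement algorithm. Blocks are dealt out round-robin to the data servers, and in this idealized picture aggregate bandwidth grows in proportion to the number of servers until the network becomes the limit. The server names and the bandwidth figures are hypothetical.

```python
def stripe_blocks(num_blocks, servers):
    """Assign block indices to servers round-robin, a simple model of striping."""
    layout = {server: [] for server in servers}
    for block in range(num_blocks):
        layout[servers[block % len(servers)]].append(block)
    return layout


def aggregate_bandwidth(num_servers, per_server_gbps, network_gbps):
    """Idealized bandwidth: linear in the number of servers, capped by the network."""
    return min(num_servers * per_server_gbps, network_gbps)


servers = ["nsd1", "nsd2", "nsd3", "nsd4"]  # hypothetical data server names
print(stripe_blocks(8, servers))

# Assumed figures: 4 Gb/s per data server, 20 Gb/s of network capacity.
for n in (1, 2, 4, 8):
    print(n, "servers:", aggregate_bandwidth(n, per_server_gbps=4, network_gbps=20), "Gb/s")
```

With these assumed numbers the model tops out once the network is saturated, which matches our experience that the network is usually the real ceiling.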

ZFS:
Developed by Sun Microsystems, ZFS is a high performance file system meant primarily for local storage, and it takes the place of hardware and software based RAID solutions. It originated on Solaris and OpenSolaris and is also available on FreeBSD and, through the OpenZFS project, on Linux. It is installed on top of a JBOD ('just a bunch of disks') configuration, so all you need is a disk enclosure or other medium to house an array of independent disks, and ZFS will form them into arrays that resemble mirroring (RAID 1), striping (RAID 0), or several types of striping + parity (RAID 5/6, etc). You can configure several subsets and pools within the filesystem, each set up differently. Some bonuses of ZFS are live file system growth, pool migration, cache space for high speed disk access, snapshots, and much more. File system creation is nearly instantaneous, even for filesystems over 10TB in size. ZFS can be leveraged for cheaper RAID-like disk pooling with commodity hardware, and its performance suits a multitude of applications, such as database access, general purpose file serving, backups, staging areas, and much more. It also supports iSCSI exporting. Note that this is not considered a 'cluster' file system, and is still considered 'local' storage for one machine.
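When planning a pool, a rough capacity estimate per layout is a useful first step. The sketch below uses the usual rule of thumb that each parity level costs one disk's worth of space and that a mirror holds one disk's worth of data; it ignores metadata and padding overhead, so real numbers will be somewhat lower. The function name `usable_tb` and the six 2TB disks are assumptions for illustration.

```python
def usable_tb(layout, disks, disk_tb):
    """Approximate usable capacity of a single group of identical disks.

    Ignores metadata and padding overhead, so treat the result as an upper bound.
    """
    parity = {"stripe": 0, "raidz1": 1, "raidz2": 2}
    if layout == "mirror":
        if disks < 2:
            raise ValueError("a mirror needs at least two disks")
        return disk_tb  # every disk holds a full copy of the data
    if layout not in parity:
        raise ValueError(f"unknown layout: {layout}")
    if disks <= parity[layout]:
        raise ValueError("not enough disks for this layout")
    return (disks - parity[layout]) * disk_tb


# Hypothetical enclosure: six 2TB disks.
for layout in ("stripe", "raidz1", "raidz2", "mirror"):
    print(layout, usable_tb(layout, disks=6, disk_tb=2), "TB")
```

The trade-off is the familiar one: the more disks given over to parity or copies, the less usable space and the more failures the group can survive.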
documentation ::.
We meticulously document all systems we implement in whatever way is easiest for you to manage; we find that wikis work well for most projects, but we can document in most formats if you prefer. We do this so that we are not permanently tied in as the only point of contact for your systems operation: many of the systems and tools we implement can be managed by local technical staff after implementation. We do not rewrite manuals for tools that are documented elsewhere, but we do carefully document the specifics of what we do, in a way that is open to your whole organization for ease of use and understanding.
project cost ::.
Because each project is so different, we cannot give a fixed price list for our services. Instead, we will draw up a project prospectus and cost analysis based on your needs after first contact with your organization.

Contact us for more details!