(Note: I am really disappointed that after selling 400 million copies of Windows 7 Microsoft no longer supports Windows XP but has begun to downgrade its functionality in bits and pieces making it no longer suitable for this project. Since Doing the project under Windows 7 is even more difficult than doing it under Linux and Linux offer many more ultimate advantages than Windows I'm making the transition to Linux and as I learn will update this project accordingly. Also I am considering making this an open source collaboration under Linux so let me know if you would like to collaborate.)
Since the values in the array exceeded the limits of a Microsoft long integer variable and I had to use single or double precision variables (in the search for duplicate values by comparing each value in the array with an index ranging from the minimum to the maximum array value) the amount of available memory and time was often exceeded.
An array size of 265 elements that contains integer values that increase exponentially in value each cycle with potentially no limit in the number of cycles or the values in the array the time a memory required quickly exceeded the capability of my single personal computer.
An array and its set of values and a compare index example:
1 2622695870048
- -
- -
5 6645178834471
6 63491409181716
7 83861832294247
8 2622695870048
9 45029704436816
10 18117222372627
11 6659485477833
12 50600552165484
13 37292816675177
14 36784339702878
15 40751538724887
16 41576195825074
17 37319353740877
18 15405643023342
19 77102983094426
- -
- -
265 41576195825074
...with a compare index from 0 to 28239790275938 for this particular array.
I considered rewriting the program in ASP, opening an Amazon or Google or Microsoft cloud account but decide to tinker with a small Beowulf cluster first after reading several articles about them.
Since it was possible to duplicate the array and copy it to another computer it and it was also possible to split the index in half and count duplicates in the lower range of values on one computer and the upper range on another computer
28239790275938 / 2 = 14119895137969
lower range processed on one computer 0 to 14119895137969 - 1
upper range processed on another computer: 14119895137969 to 28239790275938
then adding more and more computers could potentially provide a solution to the limits of a single computer.
If you have a problem that can be divided in a similar manner and run one half on one computer and the other half on another you can build a supercomputer by simply dividing the problem and adding more computers .
Of course you might want to use an Ethernet link to transfer data instead of using floppies and you might want to avoid computers that run at speeds less than 1 GHz but the idea does not change.
Because my application does not demand too much server memory or processing time to count the duplicates I can usually run several instance of the program on each computer in the background without too much slow down or memory loss for other programs. With dedicated servers I can usually run a lot more instances of the program.
The system I have up and running can now handle an array size of 265 values requiring an undivided index from 0 to 1,280,000,000,000 (1.28 trillion) for a single computer or that can be divided into 38 segments to reduce index value size to 33,684,210 (33 million), 38 instances can run on 7 computers without issues.
Although it might take overnight or longer to complete this task and be slow by industrial standards it shows that with enough computers the possibilities are virtually unlimited..
Remove these ads by
Signing UpStep 1What components did I use?
After a power supply failed on the computer I was using to run seven hard drives I decided to limit hard drives per computer to four per computer. To do this I needed to crank up some much older computers and add them to the network. In addition I decided to add a new but inexpensive system to take some of the hard drive load
To keep costs down I recruited an old case and bought a new Intel D525 motherboard with an embedded processor and a total of four logic units and added 4GB laptop memory. Total cost for this repair and addition including the new power supplies was around $250.
Now the network has six computers and seventeen hard drives. Since I only need one hard drive per computer I will not need to buy hard drives for some time.All that is really needed to boot the D525 and handle the data and programming is a flash drive.
I started with a standard Ethernet configuration consisting of network peers using four PC's that were already running under some version of Microsoft Windows and were connected using a four port wireless Westell Ultraline Series 3 Model 9200EM router that came with my Verizon FIOS installation. This particular router allowed one of the cables to be made from one hundred feet of four copper conductor telephone line which ran past CRT monitors, fans and florescent lights and up and down walls without any detectable interference, data loss of slow down.
The oldest PC in the network is an Intel D865GLC board that lost its on-board audio after lightning hit my amplifier. I still use it on occasion for word processing, email, and exploring the web while my other computers are busy. I set it up to monitor security cameras right after I lost the audio instead of throwing it away. I also decided to use it explore oil immersion cooling which completely eliminated system noise but made replacing components a pain. Although I could upgrade it to 2GB I decided not remove it from the oil bath but to leave it at 1.5GB. It runs Windows XP32
The next oldest system has an Asus A8V-X motherboard I purchased to replace a far better Foxconn board that also got zapped through the audio port but unlike the other board did not survive to tell the story. I bought the A8V-X to keep the AMD X2 4400 CPU I paid $450 new when they first came out. The A8V-X is far inferior to the Foxconn but it allowed me to keep the X2 processor in service. ...to think I could have 2 Intel i7's now for the same price no longer erks me but it did for awhile. Memory is maxed out at 4GB of 200MHz (DDR400) PC3200. It also runs Windows XP32.
The next is a BioStar G31D-M7 and a dual core Intel P6 CPU running at 2.7MHz with 4GB of Adata DDR2-800 (PC-2 6400). I will have to replace this system soon as it has a tendency to lock up and turn off when several programs run at the same time and may exceed CPU or memory capacity or just not be designed to handle it on the board. It runs Windows XP64 which may actually be the problem.
The forth computer is an Acer A0751h netbook which has an Atom Z520 CPU running at 1.33GHz and 2GB internal memory. It has a 260GB hard drive and a 11.6 screen but I got it at a discount because it came with Vista. Vista Home Basic presents a number of hardware and software bottlenecks and challenges but once persuaded to grant access and to run applications programs it is as fast as the desktops. I use if as the client and any processing I share over the whole network gets done in better than one forth the time since the servers only process things that can be run at the same time like counting each occurrence of a first name in an array that would otherwise exceed available memory or time or the maximum value of a long integer variable type (2,147,483,647).
If your setup is close to this one then you can use it to develop a Beowulf network. Although you can use older and slower computers running under 1.0 GHz its probably a good idea to adjust the workload accordingly. I tried using an old laptop that runs at 300 MHz and has a 5.5 gig hard drive but eventually had to limit it to only 5 locations or nodes for the whole computer.
All tasks that use this setup benefit from much shorter completion times without overtaxing each node by running in an interleaved mode that is controlled by the operating system. With this setup I can explore other ways of avoiding the requirement of excess memory and time while remaining in a novice friendly programming language like Microsoft Visual Basic 6.
Any improvements you can make to the software or hardware of your network to exceed the capacity of this setup will help but any configuration that is similar will be sufficient to get started. You can experiment with upgrading or adding older hardware and operating systems after your network is established. Although I have added older computers running Windows 98SE at around 300 MHz their slower speed causes a noticeable and perhaps unacceptable delay. Identical systems with highest affordable speed are the best way to go.
| « Previous Step | Download PDFView All Steps | Next Step » |
1
comment
|
Add Comment
|
![]() |
Add Comment
|


















































