Ever wonder how cloud services like iCloud, Amazon EC2, Rackspace, or Dropbox actually work? Although companies try to hide it, each of these major corporate services relies on thousands of servers that are always on, constantly drawing large amounts of energy and requiring numerous hardware replacements just to carry out their daily functions.
This project will show you how to build your own cloud server to gain more power and flexibility with your computer and teach you a bit about how computer networking works (if you don't already know). I used a CuBox (~3 watts of power) with a LaCie HDD (~24 watts of power) for my own setup, but you can use whatever you'd like. Altogether, my cloud server uses less than half the power of an average light bulb! No more Google Drive, Dropbox, or anything else.
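To put those wattage numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes the server runs around the clock at a constant draw, and it assumes a 60 W incandescent bulb for the "average light bulb" (that figure is my assumption for the comparison, not a measurement):

```shell
#!/bin/sh
# Annual energy use in kWh for a device that draws a constant number of watts.
annual_kwh() {
    # watts * 24 hours * 365 days / 1000 = kWh per year
    awk -v w="$1" 'BEGIN { printf "%.1f\n", w * 24 * 365 / 1000 }'
}

annual_kwh 27    # CuBox (~3 W) + LaCie HDD (~24 W): prints 236.5
annual_kwh 60    # assumed 60 W bulb, for comparison: prints 525.6
```

Multiply the result by your own electricity rate to estimate the yearly cost.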
-You'll need to buy some new hardware (an ARM computer and probably an HDD)
-Although this is energy efficient it does use a tiny bit of energy that wouldn't otherwise be used
-Read the service agreement from your ISP before doing this! Passing lots of large files over WAN will slow down their network and could lead to the ISP throttling your bandwidth or worse (they don't like large upload usage), so read your service agreement and use your head before setting this up! You should have no problem with a business account or fiber internet. I am not responsible for any problems that you create.
-Free access to all the files on your server
-Similar to Dropbox, Rackspace, or Google Drive except you own it and can modify it to your needs (for example: giving it as much storage space as you want)!
-Works well as a web server (for low traffic sites) and the website can be configured and monitored remotely
-An encrypted proxy server for insecure, public networks
-VPN for access to your home network (or wherever your server is)
-grid computing, virtual server hosting, more?
-Having fun learning about some of the basics of servers and the internet!
-ARM computer (not actually *required* but it's what makes this project so low energy and awesome)
-at least one HDD or SSD for storage
-some patience and sticktoitiveness (I'll try my best to keep things minimally geeky but be prepared to google around, soak up some wiki articles, and jump into the unknown)
Step 1: Internet Background
I feel like I have very little chance of really being able to do this section justice but I'll try laying out some of the basics. Feel free to skip it if you already know this or are just too excited to stop and read all this mumbo-jumbo.
If you're reading this on the Instructables site right now, then you've opened up a port on your computer and home router to transmit internet traffic, sent a connection to a remote domain name that's attached to a WAN IP address, and connected through a port on a remote computer to a web server application that's hosting the instructables.com web page files. Look at the picture above for a diagram of this process.
A domain name is the URL address aka the name of the website (instructables.com in this case). This is bound to a WAN IP address (184.108.40.206, WAN = Wide Area Network) which is what computers use to state their location on the internet to other computers (so others can connect to you and you can record who you're connected to). When you pass your connection through the internet to a web server, you'll be bouncing your connection through many intermediary computers that relay your signal forward (there's no way that one cable could connect you directly to Melbourne, Australia if you are positioned in Tajikistan, for instance). Then, your home router translates this WAN IP to a LAN IP (192.168.0.1 --> 192.168.0.3, LAN = Local Area Network) using a technique called Network Address Translation (NAT). Try typing $ traceroute instructables.com into your command line to see what hops you're using for your connection right now.
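If you're ever unsure whether an address is a LAN IP or a WAN IP, the LAN ones come from three reserved private ranges (10.x.x.x, 172.16.x.x through 172.31.x.x, and 192.168.x.x). Here is a small sketch of that check. The function name is made up for this example, and it only looks at the leading numbers rather than validating the whole address:

```shell
#!/bin/sh
# Succeeds if the IPv4 address falls in one of the private (LAN) ranges.
# Only the first one or two octets are inspected; the rest is not validated.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*)
            return 0 ;;
        172.*)
            # Pull out the second octet and check for 16 through 31.
            second=${1#172.}
            second=${second%%.*}
            [ "$second" -ge 16 ] 2>/dev/null && [ "$second" -le 31 ]
            return ;;
        *)
            return 1 ;;
    esac
}
```

For example, "is_private_ipv4 192.168.0.3 && echo LAN" prints LAN, while the same test on the WAN address above prints nothing.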
For any of the traffic to pass through the computers, a port must be open. The default port is 80 for HTTP (normal internet) and 443 for HTTPS (secure internet, used when making online purchases etc.) but there are thousands of different ports that allow various applications to communicate over the internet with other computers. Samba, for example, uses 445, SSH uses 22, and FTP uses 20&21.
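The ports mentioned above are the only ones this project touches, so a tiny lookup like the following can serve as a cheat sheet. The function and the short service names are just labels I picked for this sketch; on most Unix-like systems the file /etc/services holds the full list:

```shell
#!/bin/sh
# Prints the default port(s) for the handful of services used in this project.
default_port() {
    case "$1" in
        http)  echo 80 ;;
        https) echo 443 ;;
        ssh)   echo 22 ;;
        smb)   echo 445 ;;        # Samba file sharing
        ftp)   echo "20 21" ;;    # data and control
        *)     echo "unknown service: $1" >&2; return 1 ;;
    esac
}
```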
To recap, your web browser passes a signal through an open port over the LAN to the router. The router receives the signal on a port, uses NAT to alter the transmission, and pushes the connection through a port on the other side to send it over WAN. After bouncing back and forth across the world, your signal finally hits the Instructables server. The web server application on the Instructables server receives your signal through the port, transmits the web page info, and sends it all back.
It's really more complicated than this but this is probably enough to grok everything we'll do in this project :p
Here's some more info:
Google's 20 Things I learned
Step 2: Get an ARM Computer (Happy Shopping)
If you haven't heard, ARM computers have become seriously powerful and are much more affordable than a standard desktop or laptop computer. My CuBox was $130+$40 for shipping and it has half the processing power and the same amount of RAM as my $1100 MacBook. The ARM architecture, of course, is different from most PCs (which generally use i386 or x64) so software doesn't always run correctly and Linux is by far the most stable OS. Getting an ARM computer for your server is what makes this project so energy efficient and butt-kicking. If you're already using a server at home, the energy savings after a few years will probably more than pay for the ARM computer anyway.
There are lots of new ARM computers to choose from right now so you'll need to pin down exactly which computer will be best. Some have more graphical power, RAM, connections for cables, etc. so pick according to your needs. Here's some stuff to consider:
-eSATA is much faster than USB 2.0 and holds its own against USB 3.0 (good for file transfers and streaming)
-HDMI output is nice if you'll use the computer as a home media center (watching on your TV)
-more CPU power + RAM means your server will be faster
-some computers don't come with cases
The Raspberry Pi is kind of the holy grail at $35/25 but is a little less powerful than a CuBox. The main drawback is that an order now may take months to come because it's so popular. There's also the Beagle Board, older Shiva + Guru Plugs, Rikomagic, CuBox, and lots more. I like the CuBox because it has good hardware, eSATA, HDMI, an IR receiver for things like TV remotes, and comes in a pretty looking case.
Step 3: Install the Basic OS
Agonizing over which Linux distribution to install shouldn't be a very high priority. I'm using Arch Linux for my system because it's easy to set up and manage and comes without a GUI pre-installed. All of the programs and commands for this project will work on other flavors of Linux (just not the pacman commands, those are Arch only). The GUI won't really be necessary unless you'll want to use your ARM as a media center, web browser, image editing studio, etc. Normal computers are much better suited for these activities.
There are many different boot loaders for all the ARM computers so just use whatever you have to boot into the operating system.
Step 4: Download Necessary Programs
Hold on to your butts! In this step we'll download and configure all the programs in only a few commands. Log in as root and type these commands into your terminal:
$ pacman -Syu #refreshes the package databases and upgrades installed packages
$ pacman -S openntpd #openntpd keeps the system clock in sync (very necessary)
$ pacman -S python2 #install python2
$ pacman -S rsync #install rsync
$ pacman -S samba avahi #install samba and avahi
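A quick way to confirm the installs worked is to check that the programs are now on your PATH. This is only a sketch: the helper name is mine, and the binary names in the usage line (smbd for Samba, avahi-daemon for avahi) are what I'd expect the packages to provide, so adjust them if your distribution differs:

```shell
#!/bin/sh
# Prints the name of every listed command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Run it as root like this, and anything it prints still needs installing:

$ missing_tools rsync python2 smbd avahi-daemon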
Now that that's done we'll need to configure Samba so we can share files with other computers safely:
$ adduser MyUserName #creates a new user for guest access, use defaults and pick a password
$ smbpasswd -a MyUserName #make a Samba password for the user account
$ nano /etc/samba/smb.conf #opens the config file for Samba (vi is way better than nano if you can do it!)
Disabling printer support in Samba can save RAM if it's not needed. Add the following lines to the [global] section of your "/etc/samba/smb.conf":
load printers = no
printing = bsd
printcap name = /dev/null
disable spoolss = yes
Not using NetBIOS name resolution can save on resources too. NetBIOS is useful for finding names of other Samba shares on a local network and offering up the shares from the host computer (the ARM computer) to other computers. This isn't needed because we'll be putting a static IP address on the ARM computer and using it only as a server. Disable it with this:
disable netbios = yes
Then, edit "/etc/conf.d/samba" and remove "nmbd" from the "SAMBA_DAEMONS" so that it looks like this:
# Configuration for the Samba init script
# space separated list of daemons to launch
#SAMBA_DAEMONS=(smbd nmbd winbindd)
SAMBA_DAEMONS=(smbd)
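Samba also needs to know which folder to share. The mount command later in this project connects to a share called server_drive, so smb.conf needs a section with that name. Here is a minimal sketch that prints such a section. The helper name and the /mnt/storage path in the usage line are placeholders of mine, so swap in wherever your drive is actually mounted:

```shell
#!/bin/sh
# Prints a basic smb.conf share section.
# $1 = share name, $2 = directory to share, $3 = user allowed to connect
smb_share_stanza() {
    cat <<EOF
[$1]
   path = $2
   valid users = $3
   read only = no
EOF
}
```

Append the output to the config file as root:

$ smb_share_stanza server_drive /mnt/storage MyUserName >> /etc/samba/smb.conf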
Start up Samba with this:
$ /etc/rc.d/samba restart #restarts the samba process with new configurations (Arch keeps its init scripts in /etc/rc.d)
Ok, almost done. Now we'll finally edit the IP address, hostname, and daemons:
$ nano /etc/rc.conf #opens the main config file for processes that start at boot time
# HOSTNAME: Hostname of machine. Should also be put in /etc/hosts
# Static IP example
address=192.168.XX.XX #replace XX with your info
#interface=eth0 #comment out the section for DHCP by inserting "#" at the beginning of the line
DAEMONS=(!hwclock openntpd syslog-ng network netfs crond samba sshd) #set up these applications to run on boot
If you want a web server try something lightweight like nginx (pronounced engine x) or Cherokee which has a GUI configuration:
$pacman -Sy application_name
For more details go to the Arch Linux ARM site!
Step 5: Client Computer Downloads
For the client computers that you'll be using to connect to the server, you'll want to have SSH, Samba, and sshuttle. Windows systems don't come with SSH so you should try getting a copy of PuTTY. I think sshuttle only works on *nix computers (not Windows) but I never really use Windows stuff and haven't tried installing it there. Samba is pretty straightforward: support for its file sharing protocol (SMB) comes built into OS X and Windows.
Step 6: Nmap Network Scanner
Ok, time to learn how to use Nmap. This is absolutely essential for any networking project! It's like bringing running shoes for a marathon or a calculator for linear algebra.
Nmap will scan networks and deliver all kinds of information back depending on what you ask for. For this project, port scanning and IP scanning will be helpful so we'll use a couple basic commands.
This will scan for all of the computers working on a LAN:
$ nmap -sP 192.168.0.0/24 #uses ping scanning
Once you find a computer of interest just nmap its IP (for example 192.168.0.53):
$ nmap 192.168.0.53
Now you'll get a list of open ports!
Many other scanning methods can be done, most of which require sudo, but the above example gets the job done fairly easily.
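When a scan comes back with a long report, it can help to boil it down to just the open port numbers. This is a sketch of a filter for nmap's normal output, where each port line looks like "22/tcp open ssh". The function name is mine:

```shell
#!/bin/sh
# Reads nmap's normal output on stdin and prints only the open port numbers.
open_ports() {
    awk '$1 ~ /^[0-9]+\/(tcp|udp)$/ && $2 == "open" {
        split($1, parts, "/")   # "22/tcp" becomes parts[1]="22", parts[2]="tcp"
        print parts[1]
    }'
}
```

Pipe a scan straight into it:

$ nmap 192.168.0.53 | open_ports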
Step 7: Port Forwarding
Most residential computers operate behind a router with a firewall. This means that all internet traffic that travels to a computer on the home LAN must be passed through a port on the router first. If, for example, you try scanning your router with nmap, you should see that port 80 is open. That is the default port for HTTP, which allows internet traffic to flow in and out of your LAN. It would be common to have only ports 80 and 443 open on a router but any port can be opened or closed to your liking!
We're going to set up an SSH server on the home network and have it function over WAN so we can connect back to it when we're out and about. The first thing to do is tweak the router so it forwards connections back to the ARM server. Open up a web browser and type in 192.168.0.1 (or whatever your router's address is; it usually ends in 1) to log into your router's configuration interface. There should be a page called Setup or something similar, click on that and go to the Port Forwarding section. Enter the IP address of the ARM server and make sure that you set the same number for the start port and end port (that's to accept a connection and push it through the other side of the router using a specific port).
The default SSH port is 22, so if you want to use that just set the port forwarding to work over port 22 and that should be it. Unfortunately many public networks block transmissions over port 22 so when you try to connect back to your ARM server, you'll hit their firewall. Luckily the HTTP port is open on pretty much every network, so if the ARM SSH server is on port 80 you should be able to connect without trouble! Enter this command on your ARM computer to alter /etc/ssh/sshd_config:
$ vi /etc/ssh/sshd_config
#Port 22 <-- Change this
Port 80 <-- to this!
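If you'd rather not open an editor, the same change can be scripted. The helper below is a sketch of my own, not part of OpenSSH: it rewrites the first-column Port line (commented out or not) in whatever file you point it at, and does nothing if the file has no Port line at all. Try it on a copy before touching the real config:

```shell
#!/bin/sh
# Rewrites the "Port" line of an sshd_config-style file in place.
# $1 = path to the config file, $2 = new port number
set_sshd_port() {
    tmp=$(mktemp) || return 1
    # Match "Port 22" or "#Port 22" at the start of a line and replace it.
    sed -E "s/^#?[[:space:]]*Port[[:space:]]+[0-9]+.*/Port $2/" "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For example:

$ cp /etc/ssh/sshd_config /tmp/sshd_config.test
$ set_sshd_port /tmp/sshd_config.test 80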
Restart the SSH server with this:
$ /etc/rc.d/sshd restart
Now the SSH for your ARM computer will only accept connections using port 80. Make note of your WAN IP by either checking your router or googling "What's my ip?". Try connecting back to your ARM server using the WAN IP and the SSH port you chose to make sure everything is working.
Step 8: Proxy Connections
What is a proxy connection? Remember how your internet connection bounces around between different ISP servers on the internet to get to any destination? A proxy will basically allow you to select your first bounce before sending your internet connection out to the sea of the internet, making it appear as if your traffic is originating from the proxy server you chose.
Surfing the web on public networks can be insecure due to the variety of different ways people can hack your internet connection and see your information. Using sshuttle you can pass all of your internet traffic back to your home server while also encrypting your connection in an SSH tunnel. Since cracking this connection is very difficult you'll be secure from any dubious eavesdroppers.
The program sshuttle is wonderful because it allows you to proxy your internet connection to your ARM server without any difficult software installations or confusing extra stuff. A command like this will build an SSH encryption tunnel to your server and push all of your traffic through it:
$ cd ~/Applications/sshuttle #or use whatever folder you store sshuttle in
$ ./sshuttle -vr server_USERNAME@serverIP --dns 0/0
-v is for verbose to show what the application is doing
-r is for a remote connection
For connecting to a port other than 22 on the server (22 is the SSH default) try:
$ ./sshuttle -v --remote=server_USERNAME@serverIP:port_number --dns 0/0
Other options exist for tunneling only specific traffic. Check the man page and this github post for more examples.
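Since the command changes a little depending on the port and on how much traffic you want tunneled, here is a sketch of a helper that assembles it for you. The function is my own wrapper, and it only uses the flags shown above. The last argument is the subnet to tunnel, which falls back to 0/0 (everything) when left out:

```shell
#!/bin/sh
# Prints an sshuttle command line for the given server.
# $1 = user on the server, $2 = server address,
# $3 = SSH port (optional), $4 = subnet to tunnel (optional, default 0/0)
sshuttle_cmd() {
    remote="$1@$2"
    if [ -n "$3" ]; then
        remote="$remote:$3"
    fi
    subnet=${4:-0/0}
    echo "./sshuttle -v --remote=$remote --dns $subnet"
}
```

For example, "sshuttle_cmd server_USERNAME serverIP 80 192.168.0.0/24" prints a command that connects over port 80 and tunnels only traffic headed for the 192.168.0.x network.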
Step 9: Custom VPN Service
VPN is an acronym for virtual private network. This forwards a remote network to your computer to make your computer seem like it's on that network. This is most useful if you're going to install sshuttle on your home network router because you'll be able to forward your home LAN to your computer (so you can try Samba, SSH, etc. as if you were at home).
Try a command like this to forward your home network from the ARM computer:
$ cd ~/Applications/sshuttle
$ ./sshuttle -NHvr server_USERNAME@serverIP
-N routes any subnets on your home LAN (you'll probably only have one LAN network but doesn't hurt)
-H scans for host names of computers on the forwarded network and sends them back to your computer
Step 10: Samba Fileserver Through SSH Tunnel Over WAN
Now, time for the real meat of this project! A distributed file system (like Samba or NFS) works better than many other solutions for cloud serving because it will mount the server's drive to the local filesystem (as opposed to only a web-interface or ftp/sftp browsing). The effect is similar to plugging in a flash drive. The files aren't actually stored on the local computer but you still have r/w access to everything. So editing documents, watching videos, listening to music, or whatever else can be done without needing to actually download anything and everything works quickly. All of the traffic is encrypted with an SSH tunnel too, which keeps curious eavesdroppers away from your info, and SSH's compression can speed up the file transfers.
What we'll do first is ssh port forwarding to the server (the ARM computer) from the client computer (your computer). This will forward all connections on a specific port from the client computer back to the server. We can take advantage of this feature when connecting with samba! For samba, we'll actually tell it to mount the local computer on the port that is port forwarded so we connect back to our server through an ssh tunnel.
Here is the command to build the ssh port forwarding (use port 445 because this is the samba default port):
$ ssh -C -c blowfish -L8392:localhost:445 server_USERNAME@serverIP
-C is for data compression (speeds up connection)
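-c blowfish is for the blowfish encryption algorithm which is secure but allows for faster connections than other algorithms (newer OpenSSH releases have removed Blowfish, so leave this option off there and use the default cipher)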
-L[bind_address:]port:host:hostport is for port forwarding
-p [server_port] not shown but can connect to SSH on a non-default port on the server (if you don't like using port 22)
Then make a directory on your client computer where you'll mount your server (for OS X I like doing /Users/my_username/mount_spot and dragging the mount_spot folder into the Finder sidebar):
$ mkdir /Users/username/mount_spot <--(make this at the desired location)
Keep the window with ssh up and open a new shell (command line window), mounting Samba with this (NOTE: 8392 is an arbitrary number, switch it with whatever as long as it's above 1023, since lower ports are reserved for root):
$ mount -t smbfs //server_USERNAME@localhost:8392/server_drive ~/mount_spot
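Because the local port has to be picked by hand, it's easy to choose one that won't work. This sketch builds the tunnel command and refuses ports below 1024 (which need root) or above 65535 (which don't exist). The function is my own helper, and it leaves out the cipher option so that SSH uses its default:

```shell
#!/bin/sh
# Prints the ssh port-forwarding command for tunneling Samba (port 445).
# $1 = local port to listen on, $2 = user on the server, $3 = server address
tunnel_cmd() {
    case "$1" in
        ''|*[!0-9]*)
            echo "port must be a number" >&2
            return 1 ;;
    esac
    if [ "$1" -lt 1024 ] || [ "$1" -gt 65535 ]; then
        echo "pick a port between 1024 and 65535" >&2
        return 1
    fi
    echo "ssh -C -L$1:localhost:445 $2@$3"
}
```

For example, "tunnel_cmd 8392 server_USERNAME serverIP" prints the forwarding command used above, minus the cipher flag.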
Unmount the drive with this:
$ umount ~/mount_spot
Close SSH port forwarding by switching to the window with SSH and typing:
$ exit
Step 11: Rsync for File Syncing
Ok, you're quickly amassing a variety of helpful tools which will allow you to network however you'd like. The final tool I'll show how to use is rsync. Samba works pretty well for many things but when it comes to transferring medium to large files, it moves like a snail. I've found that rsync over ssh is my favorite tool for moving these types of files. The program is just like syncing an iPod except that you dictate which file/folder to sync and where to sync it to. For example, I could be working on a project that's hosted on my server at home and upload a folder for a new section that contains four videos, a spreadsheet, and documents. The files would get compressed and passed to the home server with lightning speed.
Here's a command for syncing from client to server:
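$ rsync -avze ssh /Users/userName/Documents/CowStudy/grazingStatistics/ email@example.com:~/Documents/CowStudy/grazingStatistics
That will update the information in the grazingStatistics folder on the server to include new items from the client computer. The trailing slash on the source matters: it tells rsync to send the folder's contents rather than nesting a second grazingStatistics folder inside the one on the server.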
For going from server to client using port 80 try this:
$ rsync -avze "ssh -p 80" firstname.lastname@example.org:~/Documents/OlympicStatistics/Rhythmic_Gymnastics/Ribbon_Dancing ~/Documents/Fun_Statistics
This would update the Fun_Statistics folder on your local computer with the contents of the Ribbon_Dancing folder on the server. If an old version of Ribbon_Dancing was already in Fun_Statistics, rsync would detect this and only move new files from Ribbon_Dancing to Fun_Statistics (this comes in handy for website updates, working on projects, updating media libraries, etc.).
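One rsync detail that trips people up is the trailing slash on the source path. You can see the difference safely on your own machine before trying it over the network. This sketch only works in a throwaway temp directory, and the folder and file names are made up for the demonstration:

```shell
#!/bin/sh
# Shows how a trailing slash on the rsync source changes what gets copied.
demo_trailing_slash() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/src/Ribbon_Dancing" "$work/with_slash" "$work/without_slash"
    echo "scores" > "$work/src/Ribbon_Dancing/results.txt"

    # No trailing slash: the folder itself lands inside the destination.
    rsync -a "$work/src/Ribbon_Dancing" "$work/without_slash"
    # Trailing slash: only the folder's contents land in the destination.
    rsync -a "$work/src/Ribbon_Dancing/" "$work/with_slash"

    # List the resulting files, then clean up.
    (cd "$work" && find with_slash without_slash -type f | LC_ALL=C sort)
    rm -rf "$work"
}
```

Running demo_trailing_slash lists with_slash/results.txt and without_slash/Ribbon_Dancing/results.txt, which is the same behavior you'll get when the destination is your server.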
Check out the man page and articles for more rsync goodness!
For extremely large transfers, UFTP may work well but I haven't actually tested it much quite yet.
Step 12: Bonus: Ideas for Extra Uses
Woohoo! You have a cloud server, now what can you do with it and how can it be improved?
-Media files (videos, music, video game Roms, etc.) can be streamed to yourself so you can enjoy them without needing to actually download anything (try mapping files to video game emulators or a local copy of XBMC)
-If you like running websites, this system serves as a great web server and with SSH and Samba, it can be monitored and configured remotely. The light hardware may only be good for one or two low volume sites but many ARM computers could be used in tandem for load balancing for large sites (just think, a server blade made of ARM computers that runs on less power, is more tolerant to failure, and can even do grid computing! Baidu converted to ARM servers a while back)
-For times when you don't want to carry a laptop around, a virtual machine running on a flash drive may be used for transmitting files to and from your server. This can also be used as a mini work station with all the normal applications you like to use. Mike Levin has made a great tutorial for this on his site: Using QEMU Ubuntu Tutorial (hit the NEXT POST button on the bottom to follow chronologically). This is missing some steps for the OS X section but I may make another 'ible to fill in the gaps and add some other tweaks. If you're feeling kinda lazy I made a TinyCore Linux virtual machine you can download from here: cloud-computer-package
-Home automation system for various tasks
-a honeypot for defending against hacking attempts on large networks
-install the applications FreePBX + Asterisk for a great voicemail system for the office or home (here's an Asterisk for Raspberry Pi site)
-More stuff (not all cloud computer related) for ARM
Post any ideas of your own! Nothing is totally out of the question but keep in mind the limited RAM/processing power.
Good luck and have fun!