Hack a $30 WiFi Pan-Tilt Camera - Video, Audio, and Motor Control With Python




Introduction: Hack a $30 WiFi Pan-Tilt Camera - Video, Audio, and Motor Control With Python

In this Instructable, you'll learn how to intercept the video, microphone, and controls of the $30 Kaicong SIP1602 wireless pan-tilt camera on Windows, Linux, or OSX! Everything is rolled neatly into Python scripts; you can use the output data for things like voice transcription, computer vision, and automated directional control. If you're feeling truly adventurous, keep on reading and you'll learn the methods we used to discover and reverse engineer wireless cameras!

Installation time: ~30 minutes

You will Need:

  • A Kaicong SIP1602 WiFi Pan-Tilt Camera
    • NOTE: Apparently the popularity of this Instructable wiped out Amazon's stock. Others have reported that this $45 Tenvis camera is a good substitute.
  • A computer or network router with an ethernet port
  • A working 802.11B/G wireless network (wireless N isn't supported with this camera)
  • Basic knowledge of command prompt or terminal (change directories, run a file)

For anything other than just installing and running the camera code, intermediate-level experience in Python and OpenCV will also be very useful. Let's get to it!

If you like this hack, don't forget to follow us on Instructables, Facebook or Twitter, and check out our other projects on our website!

Step 1: Setting Up the Camera

On the box that contains the camera is Kaicong's motto: "Nothing Important Than Safty". And it shows - they really made the manual secure, because anyone that can't read Mandarin is going to have a pretty hard time understanding it! That said, installation is surprisingly simple.

First Steps:

  1. Remove the camera, wall charger, and ethernet cable from the box.
  2. Plug the wall charger into a nearby outlet, and connect the camera to your computer or router ethernet port via the cable.
  3. Turn your camera over and look on the bottom. You should see a host name listed, as well as the username and password for your camera (spoiler alert: it's "admin" and "123456")
  4. Type your domain (mine was 385345.kaicong.info) into your browser. Enter your camera's login information at the prompt, and you'll be directed to a list of links to different browsing modes. Choose the best mode for your browser.
  5. Have fun clicking buttons for a bit! You'll notice that the Server Push Mode for Chrome and Firefox doesn't have a working microphone or speaker output, which is quite lame. The IE version also requires installing an ActiveX object, but all features work once it is installed.
  6. Also take note of the IP address in the webpage URL - we'll need this later.

Wireless Setup:

  1. There should be a small gear icon at the bottom of the column of control buttons on the page you were on at step 5. Click this, and you'll be taken to the settings page.
  2. Click on the Wireless Lan Settings link on the left side of the page.
  3. Click on the text input labeled "SSID" and enter your wireless network's name.
  4. Make sure the "Authetication" drop down button is set to your network's auth type (usually WPA2-PSK AES if your network has a password)
  5. Click on the "Share Key" text input and enter in your network password if you have one.
  6. Click the "Set" button. Your camera will reset and connect to the network.


By default, these cameras are viewable by anyone on the internet who guesses your <number>.kaicong.info address - which can be awesome for projects, but not so awesome for security and privacy. To solve this, you can either change your DDNS username and password, or simply set both of them to blank (thereby making it impossible to access your camera from outside your local network).

Step 2: Installing Python Controls

With the camera setup complete, we'll need to install a few libraries before we can run our scripts.

For Windows: Here are links to Windows installation tutorials, or pages where you can find the Windows installer.

For Ubuntu: setup can be done via this command: sudo apt-get install python python-opencv python-pyaudio python-pygame

For OSX: First install OpenCV and Homebrew - I had to additionally install eigen (brew install eigen) to prevent compiler errors.

Then run the following:

brew install python

brew install gcc

brew install homebrew/python/pygame

brew install portaudio

Then download the pyaudio wrapper for OSX and install that as well.
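
Before moving on, it can save some head-scratching to confirm that the libraries actually import. The snippet below is a small sanity check, not part of the project's repository; the function name check_modules is made up for illustration, and the module names are simply the import names of OpenCV, PyAudio, and Pygame.

```python
import importlib


def check_modules(names=("cv2", "pyaudio", "pygame")):
    """Return a dict mapping each module name to True if it imports cleanly."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # Covers both "not installed" and "installed but a binary dependency is missing".
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_modules().items():
        print("%-8s %s" % (name, "ok" if ok else "MISSING"))
```

Anything reported as MISSING needs another look at the install steps for your platform before the camera scripts will run.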

The Repo:

Now that we've got the dependencies out of the way, head over to the git repository where this project is hosted, download it, and extract the files. Open up a command window or terminal in the directory with the extracted files, and run each script with the following commands, replacing with the IP address of your camera:

python KaicongAudio.py
This script pulls audio from the mic and plays it on your speakers.

python KaicongVideo.py
This script pulls video from the camera and displays it in an OpenCV window.

python KaicongMotor.py
This script opens up a black Pygame window. Click it with the mouse so it can capture your keyboard, then use the WASD keys to pan and tilt the camera!

At this point, we've successfully hooked up the camera and can intercept audio, video, and motor control from it via programming. But how did we do this? Read on to find out...

Step 3: How We Did It: Hacking Motion

We started out with a camera with a web page interface and wanted to control it programmatically, so what better way to figure out how it works than to inspect the code?

We saved the webpage to disk and looked at monitor.htm. It was there that we found some interesting looking variables, such as PTZ_UP and PTZ_STOP, which appeared to be motion control constants. Keeping that in mind, we opened up the web inspection console (Ctrl+Shift+C in Chrome) and inspected the network traffic while clicking the camera motion buttons. We found several calls to a decoder_control.cgi page with a "command=" argument matching the constants we found earlier in the HTML - one whenever a click begins, and another whenever a click ends. So the controls are ON/OFF and via HTTP GET request? Let's find out!

We copied the url we saw:

into the browser and loaded the page, and sure enough the camera began moving! From then it was a matter of throwing the constants and a formattable URL string into Python to complete the controller. Done.
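
To make the start/stop idea concrete, here is a minimal Python 3 sketch of that kind of controller. It is an illustration under assumptions, not the code from our repository: port 81, the loginuse/loginpas query fields, and PTZ_STOP = 1 are guesses that you should check against your own camera's monitor.htm and network traffic, and the direction values 0/2/4/6 are the ones a commenter reports for a similar camera. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Direction values as reported by a commenter for a similar camera.
# PTZ_STOP = 1 is an assumption; verify it in monitor.htm on your unit.
PTZ_UP, PTZ_STOP, PTZ_DOWN, PTZ_LEFT, PTZ_RIGHT = 0, 1, 2, 4, 6

KEY_TO_COMMAND = {"w": PTZ_UP, "s": PTZ_DOWN, "a": PTZ_LEFT, "d": PTZ_RIGHT}


def control_url(host, command, user="admin", password="123456", port=81):
    """Build a decoder_control.cgi URL (port and field names are assumptions)."""
    query = urlencode({"loginuse": user, "loginpas": password, "command": command})
    return "http://%s:%d/decoder_control.cgi?%s" % (host, port, query)


def command_for_key(key, pressed):
    """Start moving on key-down, stop on key-up; None for unmapped keys."""
    if key not in KEY_TO_COMMAND:
        return None
    return KEY_TO_COMMAND[key] if pressed else PTZ_STOP


def send(host, command, **kwargs):
    """Issue the GET request; the camera reacts as soon as it arrives."""
    with urlopen(control_url(host, command, **kwargs), timeout=5) as response:
        return response.read()
```

For example, send("192.168.1.50", PTZ_LEFT) followed a moment later by send("192.168.1.50", PTZ_STOP) mirrors the pair of requests the web page makes when a button is pressed and released (the IP address here is a placeholder).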

But what about video? A camera's not a camera without it, after all...

Step 4: How We Did It: Hacking Video

As it turns out, video hacking was actually pretty simple - we looked in the network requests and found a lot of requests to snapshot.cgi. Entering one of these into Chrome produced a still image every time the page was loaded. Neat!

But we wanted something a bit more efficient: the streamed video that the ActiveX object seemed to receive. The ActiveX object itself didn't seem too useful to disassemble (reversing assembly code is way overrated), so instead we opened up Wireshark. We filtered the capture down to the IP of our camera (Capture->Options->Capture Filter) and started the capture, before reloading the ActiveX control page in our browser. What we found were two GET requests for audiostream.cgi and livestream.cgi, presumably for the audio and video.

Putting aside the audio url for now, we turned to Google to see if anyone had decoded an IP camera video stream before. Under a search for "IP camera HTTP stream" we found a handy little python script to get everything running in OpenCV. All it took was replacing the script's URL with ours, and we were in business!
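
Scripts of this kind usually work by scanning the HTTP byte stream for complete JPEG images. The sketch below shows that idea in isolation; it assumes the stream is a sequence of JPEG frames separated by headers, which we have not verified byte-for-byte here, and the function name extract_frames is made up for illustration.

```python
JPEG_START = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"    # JPEG end-of-image marker


def extract_frames(buffer):
    """Split complete JPEG images out of a byte buffer.

    Returns (frames, remainder). The remainder holds any trailing partial
    frame so it can be prepended to the next chunk read from the stream.
    """
    frames = []
    while True:
        start = buffer.find(JPEG_START)
        end = buffer.find(JPEG_END, start + 2) if start != -1 else -1
        if start == -1 or end == -1:
            break
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]
    if start != -1:
        # A frame has begun but not finished: keep it from its start marker.
        buffer = buffer[start:]
    else:
        # No frame in progress: keep one byte in case a marker is split across chunks.
        buffer = buffer[-1:]
    return frames, buffer
```

In a read loop you would append each chunk from the stream to the remainder, call extract_frames, and hand every returned frame to OpenCV, for example with cv2.imdecode on a NumPy byte array. Note that this simple scan can be fooled by a JPEG that embeds a thumbnail with its own end marker.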

Next, it was time to intercept the audio.

Step 5: How We Did It: Hacking Audio

Getting video wasn't too hard. Hopefully audio would be just as easy, right? After a few hours of Google searching, it looked like no one else had ever managed to successfully pull out and decode the audio stream of an IP camera. We were on our own.

Going back to our audiostream.cgi url we found via Wireshark, we captured a few bytes of audio with Ubuntu:


Then we hit Ctrl+C to cut off the stream. Raw audio in hand, we marched over to Audacity to attempt to play it via File->Import->Raw Data. Most attempts sounded like noise; however, we found that using the VOX ADPCM encoding at 8kHz produced something recognizable!

There was still the matter of removing that weird pattern of clicks. I figured it had something to do with the packets, as with the video stream we had to remove some headers at the start and end. Maybe the same was true with audio?

We looked a bit more closely at each packet, and noticed that the data started with the same 0x55aa15a8... bytes, plus a value that looked to be counting upwards each packet, and a long stream of zeroes, for a total of 32 bytes. Presumably, Audacity was taking these packet headers as audio data and trying to decode them, which is what made the nasty clicking sounds.
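
Stripping those headers can be sketched in a few lines of Python 3. This is an illustration rather than our experimental scripts: it only relies on the four leading bytes quoted above and the 32-byte header length, and the name split_packets is hypothetical. Since we don't state the packet length here, the sketch finds packet boundaries by searching for the marker bytes, which could misfire if the same four bytes ever appear inside the audio data.

```python
PACKET_MAGIC = bytes.fromhex("55aa15a8")  # leading header bytes seen in the capture
HEADER_LENGTH = 32                        # marker + counter + zero padding


def split_packets(raw):
    """Return the audio payload of each packet in raw, with headers removed."""
    payloads = []
    start = raw.find(PACKET_MAGIC)
    while start != -1:
        # The next header marks the end of this packet's payload.
        next_start = raw.find(PACKET_MAGIC, start + HEADER_LENGTH)
        end = next_start if next_start != -1 else len(raw)
        payloads.append(raw[start + HEADER_LENGTH:end])
        start = next_start
    return payloads
```

Writing b"".join(split_packets(raw)) to a file gives header-free data to import into Audacity. The last payload may be cut short if the capture was interrupted mid-packet.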

A few experimental Python scripts later, we removed the headers and passed the result through the ADPCM decoder in Audacity - most of the clicks were removed! But there were a few left over, specifically during the noisier parts of the audio.

So we read up on how ADPCM works - apparently it encodes audio as the difference between samples, and caches the previous audio state so that it can add the two and produce a new sample. After a few more Python scripts, we managed to capture the packets directly and reset this state at the start of each packet. Clicks were completely removed, and nothing but camera audio remained. Success!
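
Here is a minimal sketch of that decode-with-reset idea, using the standard Dialogic/VOX ADPCM step table since that is the Audacity setting that worked for us. Treat it as an illustration under assumptions rather than the repository's decoder: the nibble order (high nibble first), resetting the state to zero, and the 12-bit output range are all guesses that may need adjusting for a given camera, and the function names are hypothetical.

```python
# Dialogic/VOX ADPCM step sizes (49 entries) and step-index adjustments.
STEP_SIZES = [
    16, 17, 19, 21, 23, 25, 28, 31, 34, 37, 41, 45, 50, 55, 60, 66,
    73, 80, 88, 97, 107, 118, 130, 143, 157, 173, 190, 209, 230, 253,
    279, 307, 337, 371, 408, 449, 494, 544, 598, 658, 724, 796, 876,
    963, 1060, 1166, 1282, 1411, 1552,
]
INDEX_ADJUST = [-1, -1, -1, -1, 2, 4, 6, 8]


def decode_packet(payload):
    """Decode one packet of 4-bit ADPCM into signed 12-bit samples.

    The predictor and step index start from zero on every call, which is
    the per-packet state reset described above.
    """
    predicted, index = 0, 0
    samples = []
    for byte in bytearray(payload):
        for nibble in (byte >> 4, byte & 0x0F):  # high nibble first (assumed)
            step = STEP_SIZES[index]
            # Rebuild the sample difference from the 3 magnitude bits.
            diff = step >> 3
            if nibble & 4:
                diff += step
            if nibble & 2:
                diff += step >> 1
            if nibble & 1:
                diff += step >> 2
            # Bit 3 is the sign: add or subtract from the previous sample.
            predicted += -diff if nibble & 8 else diff
            predicted = max(-2048, min(2047, predicted))
            index = max(0, min(len(STEP_SIZES) - 1, index + INDEX_ADJUST[nibble & 7]))
            samples.append(predicted)
    return samples


def decode_stream(payloads):
    """Decode a sequence of packet payloads, resetting state for each one."""
    samples = []
    for payload in payloads:
        samples.extend(decode_packet(payload))
    return samples
```

Without the reset, state left over from one packet leaks into the next, which is the kind of error that showed up as clicks in the louder passages.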

Step 6: The Future

It's awesome to have such a complex device completely controllable via python. We plan on using our camera for person detection and room occupancy tracking as well as spoken voice commands, but we can think of a few other uses for a camera like this one, such as:

  • Augment an RC car or plane to display first-person video while driving
  • Put it on an airsoft or NERF turret and track your victims
  • Set up a rockin' custom built home security system
  • Use CV object recognition to track where your pets go automatically when you aren't there
  • Make a remote telepresence robot
  • Rip off the camera part and attach anything that needs to be precisely positioned (lasers, robot arms...)
  • Create a remote time-lapse system with slow panning over time

We'll be hacking on these cameras with at least a few such projects in mind, so be sure to follow us on Facebook and Instructables if you're looking for inspiration!

Third Prize in the Gadget Hacking and Accessories Contest

Participated in the Sensors Contest


    4 years ago

    I had a quick look at my old Chinese IP camera, and found a way to simply control it without python (am afraid of snakes) but just from a script with 'wget':

    # keep moving up
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=0"

    # keep moving down
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=2"

    # keep moving left
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=4"

    # keep moving right
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=6"

    # one step up
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=0+&onestep=0"

    # one step down
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=2+&onestep=0"

    # one step left
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=4+&onestep=0"

    # one step right
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=6+&onestep=0"

    # scan around, then goto center
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=25"

    # vscan
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=26"

    # vscan stop
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=27"

    # hscan
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=28"

    # hscan stop
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=29"

    # left up
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=90+&onestep=0"

    # right up
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=91+&onestep=0"

    # left down
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=92+&onestep=0"

    # right down
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=93+&onestep=0"

    # goto preset 1
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=31"

    2 3 4 5 6 7 8 9 10 11 12 13
    .... 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55

    # goto preset 14
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=57"

    # goto preset 15
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=59"

    # store preset 1
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=30"

    2 3 4 5 6 7 8 9 10 11 12 13 14
    .... 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56

    # store preset 15
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/decoder_control.cgi?command=58"

    # brightness 1
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=1&value=16"

    .... 2 3 4 5 6 7
    32 48 64 80 96 112

    # brightness 8
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=1&value=128"

    # brightness 9
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=1&value=144"

    # contrast 1
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=2&value=1"

    ... 2, 3, 4, 5, 6, 7, 8, 9

    # contrast 5
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=2&value=5"

    # contrast 10
    wget -O /dev/zero --user USER --password PASSWORD "http://CAMERA_IP_ADDRESS:PORT/camera_control.cgi?param=2&value=10"

    # For the picture I wrote mcamip a long long time ago:


    # Audio recording to mp3

    wget --user=USER --password=PASSWORD http://CAMERA_IP_ADDRESS:PORT/videostream.asf -O - 2>/dev/zero | ffmpeg -f asf -i - -acodec libmp3lame -y filename.mp3

    Actually I used 'snort' and logged the communication from my browser, then looked in the logfile for cgi to find the commands; those command numbers are also listed if you look at 'view source' in the browser.

    I finally managed to get cmd to find the files (I had to drag and drop to get the exact file path), but now the commands time out because of a connection failure. Everything works fine with the default software. I have tried changing the port and double-checked that the passwords are the same. Any other ideas? The exact error is:

    Opening url:
    Traceback (most recent call last):
    File KaicongVideo.py", line 47, in <module>
    File KaicongInput.py", line 44, in run
    KaicongInput.py", line 19, in connect
    self.stream = urllib2.urlopen(self.uri)
    File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
    File "C:\Python27\lib\urllib2.py", line 400, in open
    response = self._open(req, data)
    File "C:\Python27\lib\urllib2.py", line 418, in _open
    '_open', req)
    File "C:\Python27\lib\urllib2.py", line 378, in _call_chain
    result = func(*args)
    File "C:\Python27\lib\urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
    File "C:\Python27\lib\urllib2.py", line 1177, in do_open
    raise URLError(err)
    urllib2.URLError: <urlopen error [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond>

    Any ideas on what other direction I can approach this from would be great.

    Mario Mey

    8 years ago on Introduction

    Hi, man. I'm trying to make it work with my Foscam FI8918W WiFi IPCam.

    I had to change some stuff. To get video, this is the new line in KaicongVideo.py:

    URI = "http://%s:8080/videostream.cgi?user=%s&pwd=%s&streamid=3&audio=1&filename="

    Where 8080 is my port (not 81) and the file is videostream.cgi (I think I changed KaicongInput.py and KaicongOutput.py)

    Now I'm having problems with Audio and Motor.

    Audio: I can't find the audio file from the camera. There are videostream.cgi and videostream.asf (supposedly with audio), but there's no audiostream.cgi.

    Motor: This works in the web browser and moves the camera:


    But, changing the port, user and pass in KaicongMotor.py, I have this error:

    urllib2.HTTPError: HTTP Error 401: Unauthorized

    Any help would be very appreciated.


    Reply 8 years ago on Introduction

    Hey Mario, awesome job so far!

    For the motor stuff it sounds as though it's using a session of some kind... I'd try using that URL along with a valid field for loginpas (AKA your password) and see if that works. If not, you might have to use http://pycurl.sourceforge.net/ or some other Python HTTP library with session support.

    As for audio... if it's mixed in with the video in an .asf file it'll be quite a bit harder to rip it out. If your web UI has a mute/audio button, try doing a packet capture with it off and another with it on, and compare the two. For us, doing this showed a series of same-length packets that were a different size than the ones for video. 

    Hope this helps!

    Mario Mey

    Reply 8 years ago on Introduction

    Hey, man, Foscam sent me a link to the CGI/SDK for the HD Camera. It is a good reference to learn from.


    This link was sent to me because I asked them about sending audio to the cam... it is possible, and the Foscam Android App CAN do it (it was the first time that I sent audio to my camera and heard it come out of it).

    Mario Mey

    Reply 8 years ago on Introduction

    Here I found the manual for making it work without the built-in page (Spanish):


    Well, there's no audio-only file. You are right about ripping from .asf... and, for now, it would be complicated to look into capturing packets...

    As for the motor... again, looking into pycurl would take a lot of time I don't have right now. But you gave me good info.

    Thank you!


    8 years ago on Introduction

    Really cool stuff man, great way to use some stuff that might be just lying around.


    8 years ago on Step 6

    I'll be testing this with my Tenvis JPT3815. The Tenvis also has the capability to send audio to the speaker in the camera. I will be interested in exploring that if you guys haven't already.


    8 years ago on Introduction

    I don't suppose you can get the IR lights to shut down? The standard commands 94 & 95 should theoretically turn them on/off, but this doesn't seem to work on the 1602. Any different result when hacked? Also, has anyone tried the "Sip1601, 1603,1605,1606 multilingual network camera firmware"? http://kaicong.cc/forum.php?mod=viewthread&tid=40058 Again with the goal to wrest control of the IR programmatically..

    Hades X_X

    8 years ago on Introduction

    Sorry to bother you, but how exactly should one go about doing this: "To solve this, you can either change your DDNS username and password"


    8 years ago on Step 6

    When trying to execute the python command I get the error 'Please build and install the PortAudio Python bindings first.' I have installed the required libraries.


    8 years ago on Step 5

    What do you think the difficulty would be for creating a library for visual basic?


    8 years ago on Introduction

    FYI: This model of camera is "unavailable" on Amazon. Think it might work on the other models? ...I was soooo ready to pull the trigger (Amazon).


    Reply 8 years ago on Introduction

    Thanks for pointing that out, looks like our instructable got a bit too popular! Other people are saying this one should also work: http://www.amazon.com/gp/product/B006I0KL6Y/ref=as_li_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=B006I0KL6Y&linkCode=as2&tag=todmedblo-20&linkId=3VMGGJBNFWN5X2RF


    Reply 8 years ago on Introduction

    Yes, but usually if stock is present... you can watch the price rise! One more question before I try to pull the trigger again. When you say 'should work'... MANY MANY of these cameras are identical on the outside regardless of who is selling them or who the manufacturer is, and OFTEN the electronics are ALSO identical. See where I'm going with this...?


    8 years ago on Introduction

    Very interesting! I've got a Tenvis cam which is basically the same thing. Gotta try this with my Linux rig.


    Reply 8 years ago on Introduction

    Let us know! Apparently the Kaicong camera is now sold out, and we're trying to find a good substitute for people to use.


    8 years ago on Introduction

    Does anyone know how this camera compares to the Foscam equivalent? I ask because it looks almost identical, and the UI looks extremely similar. I'm wondering if this camera and the Foscam are really the same under the hood.


    Reply 8 years ago on Introduction

    Can't say we've tried it ourselves, though there's a good chance it'd work - at most requiring only small changes to the code. Other people are also saying there's a similarly priced Tenvis camera on the market that'll work!