Introduction: Pull Website Data Easily on iOS

About: I'm a maker and student at Harvard studying electrical engineering.

There are lots of ways for an iOS developer to pull data from a website. However, many of these ways require complicated Objective-C code or fluency in HTML, PHP, or JavaScript. Let's explore a simpler method. This can be used for many purposes, but currently I'm using it to get closing and delay information here in snowy Central New York.

Requirements:

-Basic Understanding of Objective-C

-Access to an Apple Mac with Xcode 4 or greater

Step 1: Theory

Here's how this works:

1. We get the HTML source code of the website, and put it in an NSString object.

-Why NSString? NSString has lots of useful methods that let us search through the data.

2. Find all instances of a substring between two HTML snippets that surround the data we want.

3. Convert this data to usable information.

4. Because websites are constantly updated, it is a good idea to provide your users with a large list of sources to pull the data from. We will explicitly define what will be the leading and trailing "cap" of the data we want to extract, so if the website owner changes the way they display the information then this source may not work until you update the app.
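
As a rough illustration of item 3 in the list above, here is a minimal sketch of tidying one extracted string before showing it to the user. The function name CleanExtractedText is hypothetical (it is not part of my app), and it only handles a handful of common HTML entities, so treat it as a starting point rather than a full HTML decoder.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tidies one extracted string for display.
// Only a few common entities are handled; this is not a full HTML decoder.
static NSString *CleanExtractedText(NSString *raw)
{
    NSString *text = raw;
    // Replace a few common HTML entities. &amp; goes last so that
    // "&amp;lt;" becomes "&lt;" rather than "<".
    text = [text stringByReplacingOccurrencesOfString:@"&nbsp;" withString:@" "];
    text = [text stringByReplacingOccurrencesOfString:@"&lt;" withString:@"<"];
    text = [text stringByReplacingOccurrencesOfString:@"&gt;" withString:@">"];
    text = [text stringByReplacingOccurrencesOfString:@"&quot;" withString:@"\""];
    text = [text stringByReplacingOccurrencesOfString:@"&amp;" withString:@"&"];
    // Strip leading and trailing spaces, tabs and line breaks.
    return [text stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

For example, passing in @"  Smith &amp; Sons Academy\n" gives back @"Smith & Sons Academy".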

Step 2: Getting the Website HTML Source

This part is simple. To encapsulate the HTML code in a string simply use the following lines of code:

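NSURL *URL = [NSURL URLWithString:@"http://your-url-here"];
// This downloads synchronously and returns nil on failure, so call it off the main thread in a real app.
NSData *data = [NSData dataWithContentsOfURL:URL];

// NSData is not NUL-terminated, so decode it with an explicit encoding. html is nil if the page is not valid UTF-8.
NSString *html = data ? [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding] : nil;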
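
One thing to watch for: the conversion above gives you nil when the page is not valid UTF-8. Here is a minimal sketch of a fallback, assuming that ISO Latin 1 is an acceptable second guess for the pages you care about. The function name HTMLStringFromData is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: turns downloaded bytes into a string.
// Tries UTF-8 first, then falls back to ISO Latin 1 (an assumption;
// the page could really be in some other encoding).
static NSString *HTMLStringFromData(NSData *data)
{
    if (data == nil || [data length] == 0) {
        return nil; // the download failed or the page was empty
    }
    NSString *html = [[NSString alloc] initWithData:data
                                           encoding:NSUTF8StringEncoding];
    if (html == nil) {
        // Not valid UTF-8. ISO Latin 1 maps every byte to a character,
        // so this succeeds, but non-Latin text may come out wrong.
        html = [[NSString alloc] initWithData:data
                                     encoding:NSISOLatin1StringEncoding];
    }
    return html;
}
```

You would call it as HTMLStringFromData(data) in place of the last line above, and still check the result for nil before searching it.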

That's it in terms of getting the source code. But wait! We can't get ahead of ourselves! First you need to figure out what the caps around your data are before you can start using the data! Here's how to do it:

-Save your desired website as HTML source code. In Safari, simply do cmd-s and select page source, not web archive, from the dropdown.

-Open the file in your favorite text editor (you will need to right-click and do open with, as Safari will want to open the file as default). I'll be using the free and awesome CodePad as my text editor.

-Scroll down in the file to where the desired information is.

-Find the data you want to extract. For the data to be good for extraction, at least one of the "caps" around it should be a string of text that surrounds every element in the list (if you want to get data from a list/table) and appears nowhere else on the page. Don't worry too much about this, because chances are the combination of both caps will be unique, which works fine in almost all cases.

-'What are the caps?' you may be asking. Look at the first and second photos. The leading cap for my data is highlighted in the first photo. Every other element in this list of closings has this leading cap, and it is seen nowhere else in the page. The trailing cap in this case is the </h3> tag as seen in the second photo. This is by no means unique in the web page, but if the combination of the first and second cap is unique then you shouldn't get any unwanted data.

-The caps tell your code where the data is. The code uses them to pull out the information between the two caps, which is detailed in the following step.

-Once you have identified caps for your data then you can move on to the next step.
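
If you want to double-check a cap before committing to it, you can count how often it appears in the page. This is a minimal sketch, and the function name CountOccurrences is hypothetical. The idea is that the leading cap should appear exactly as many times as there are items in the list, while the trailing cap is allowed to appear more often.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts how many times `cap` appears in `html`.
static NSUInteger CountOccurrences(NSString *cap, NSString *html)
{
    if ([cap length] == 0) {
        return 0; // an empty cap can never be found
    }
    NSUInteger count = 0;
    NSRange searchRange = NSMakeRange(0, [html length]);
    while (searchRange.length > 0) {
        NSRange found = [html rangeOfString:cap options:0 range:searchRange];
        if (found.location == NSNotFound) {
            break;
        }
        count++;
        // Continue searching right after this match.
        NSUInteger next = found.location + found.length;
        searchRange = NSMakeRange(next, [html length] - next);
    }
    return count;
}
```

If a page lists two closings, CountOccurrences(@"<h3 class=\"name\">", html) should come back as 2. A bigger number means the cap also shows up somewhere you did not expect.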

Step 3: Getting the Data!

We'll be using some awesome code I found in a StackOverflow answer to extract the data in the form of an NSMutableArray. The array will have a bunch of NSStrings, each one being some data that the code was able to find between your two caps. If you are not familiar with NSMutableArrays then do some reading, as they are extremely useful and used in ALMOST ALL programming languages, although they may be present in a different form.

Here is the code to extract the data in the form of a method:

// Returns every substring of text that sits between the start and end caps.
-(NSMutableArray*)stringsBetweenString:(NSString*)start andString:(NSString*)end andText:(NSString*)text
{
    NSMutableArray* strings = [NSMutableArray arrayWithCapacity:0];

    // Nothing to search if the download failed or the page was empty.
    if ([text length] == 0)
    {
        return strings;
    }

    NSRange startRange = [text rangeOfString:start];

    // Keep going for as long as another leading cap is found.
    while (startRange.location != NSNotFound)
    {
        // Search for the trailing cap from the end of the leading cap onward.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [text length] - targetRange.location;
        NSRange endRange = [text rangeOfString:end options:0 range:targetRange];
        if (endRange.location == NSNotFound)
        {
            break; // a leading cap with no trailing cap: stop here
        }

        // Everything between the two caps is one piece of data.
        targetRange.length = endRange.location - targetRange.location;
        [strings addObject:[text substringWithRange:targetRange]];

        // Look for the next leading cap after this trailing cap.
        NSRange restOfString;
        restOfString.location = endRange.location + endRange.length;
        restOfString.length = [text length] - restOfString.location;
        startRange = [text rangeOfString:start options:0 range:restOfString];
    }

    NSLog(@"%@",strings);
    return strings;
}

I called it as such:

NSMutableArray *titles = [self stringsBetweenString:@"<h3 class=\"name\">" andString:@"</h3>" andText:html];

NSMutableArray 'titles' is the array I use to 'catch' the returned data.

The argument 'start' is the leading cap of your data. The argument 'end' is, you guessed it, the trailing cap of your data. 'text' will be the string 'html' that you got from the website, which has the html data encapsulated in it. If everything goes well, you should now have an NSMutableArray of the data. The array prints out to the console for your convenience. In my example photo there was only one closing on the day that I built the app, so that's why only one entry is present. If there were multiple closings, then there would be multiple rows on the table and items in the array. But wait! There's more!

Quick note: Why the \ before the quotations in the start string? The compiler will throw errors if you do not do this, as it cannot differentiate between the quotations at the start and end of your string and the quotations actually in the string. If your cap has quotations, then add a \ before them to let the compiler know that these are part of the text.
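
If you'd like to try the idea on a small piece of sample HTML first, here is a minimal sketch of the same extraction written as a plain function. Note that this is a different technique from the method above: it uses NSScanner instead of manual NSRange math. The function name StringsBetweenCaps is hypothetical.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects every substring found between `start`
// and `end`, using NSScanner rather than NSRange arithmetic.
static NSArray *StringsBetweenCaps(NSString *start, NSString *end, NSString *text)
{
    NSMutableArray *results = [NSMutableArray array];
    if ([text length] == 0) {
        return results;
    }
    NSScanner *scanner = [NSScanner scannerWithString:text];
    // Scanners skip whitespace and ignore case by default; turn both off
    // so the caps are matched exactly and the data is returned untouched.
    [scanner setCharactersToBeSkipped:nil];
    [scanner setCaseSensitive:YES];

    while (![scanner isAtEnd]) {
        // Jump to the next leading cap and step over it.
        [scanner scanUpToString:start intoString:NULL];
        if (![scanner scanString:start intoString:NULL]) {
            break; // no more leading caps
        }
        // Read up to the trailing cap. `found` stays empty if the
        // trailing cap follows the leading cap immediately.
        NSString *found = @"";
        [scanner scanUpToString:end intoString:&found];
        if (![scanner scanString:end intoString:NULL]) {
            break; // a leading cap with no trailing cap
        }
        [results addObject:found];
    }
    return results;
}
```

Calling StringsBetweenCaps(@"<h3 class=\"name\">", @"</h3>", sample) with sample set to @"<h3 class=\"name\">School A</h3><p>x</p><h3 class=\"name\">School B</h3>" returns an array holding "School A" and "School B".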

Step 4: Getting Lots of Data

Check out the example photo above. I'm getting two pieces of data from the website, not just one - both the title and the description of the cancellation. To do something like this, you should call that method again, but this time catch the result in a new array. You can then combine the two together using the power of objects! In my example I have a custom NSObject of 'Closing' with two properties, one NSString called "title" and one NSString called "info". If you do not know how to make classes, then read up because these are incredibly crucial in Objective-C and other object-oriented languages. I use the following code to combine the two arrays and make one object:

for (NSUInteger i = 0; i < [titles count]; i++) {
    Closing *tmp = [[Closing alloc] init];
    tmp.title = [titles objectAtIndex:i];
    tmp.info = [descriptions objectAtIndex:i];
    [self.appDelegate.closings addObject:tmp];
}

descriptions is my array of descriptions. Using two arrays also lets you check whether the data is possibly corrupted because the webmaster changed the HTML used to display the information. Compare the two arrays to see if they are the same length. If they are not, you may have a problem. Run this check before the loop above, because objectAtIndex: throws an exception if descriptions turns out to be shorter than titles:

if ([titles count] != [descriptions count]) {
    //Alert code goes here! (Possibly a UIAlertView ?)
}
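
To make the pieces of this step concrete, here is a minimal sketch that puts the length check and the loop together. The Closing class below is my guess at the simplest version of the class described above (two string properties), the function name ClosingsFromArrays is hypothetical, and the code assumes ARC is turned on.

```objc
#import <Foundation/Foundation.h>

// Assumed shape of the Closing class: just two string properties.
@interface Closing : NSObject
@property (nonatomic, copy) NSString *title;
@property (nonatomic, copy) NSString *info;
@end

@implementation Closing
@end

// Hypothetical helper: pairs up the two arrays, or returns nil when
// their lengths differ (a sign that the page layout has changed).
static NSArray *ClosingsFromArrays(NSArray *titles, NSArray *descriptions)
{
    if ([titles count] != [descriptions count]) {
        return nil;
    }
    NSMutableArray *closings = [NSMutableArray arrayWithCapacity:[titles count]];
    for (NSUInteger i = 0; i < [titles count]; i++) {
        Closing *closing = [[Closing alloc] init];
        closing.title = [titles objectAtIndex:i];
        closing.info = [descriptions objectAtIndex:i];
        [closings addObject:closing];
    }
    return closings;
}
```

A nil result is your cue to show the alert instead of updating the list.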

Step 5: Other Tidbits

Again, it's extremely helpful to include more than one data source for your user in case the one source has been changed. It's also nice to include a refresh button, which can simply clear the existing arrays and re-add the data. In terms of displaying the data, I would look into UITableViews or UICollectionViews. They are perfectly designed to use arrays of data as their datasource, and they aren't too hard to set up. You can also make them look really nice, like in my example.
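
Here is a minimal sketch of the multiple-sources idea: try each source in order and use the first one that yields any data. Everything named here is hypothetical, including the dictionary keys "url", "start" and "end", the HTMLFetcher block type, and ExtractBetween, which is a compact stand-in for the extraction method from Step 3. The fetcher is passed in as a block so the download code from Step 2 can be plugged in.

```objc
#import <Foundation/Foundation.h>

// Returns the HTML for a URL string, or nil on failure (hypothetical type).
typedef NSString *(^HTMLFetcher)(NSString *urlString);

// Compact stand-in for the extraction method from Step 3.
static NSArray *ExtractBetween(NSString *html, NSString *start, NSString *end)
{
    NSMutableArray *found = [NSMutableArray array];
    NSArray *pieces = [html componentsSeparatedByString:start];
    // The first piece is everything before the first leading cap, so skip it.
    for (NSUInteger i = 1; i < [pieces count]; i++) {
        NSString *piece = [pieces objectAtIndex:i];
        NSRange endRange = [piece rangeOfString:end];
        if (endRange.location != NSNotFound) {
            [found addObject:[piece substringToIndex:endRange.location]];
        }
    }
    return found;
}

// Tries each source in order and returns the first non-empty result.
static NSArray *ResultsFromFirstWorkingSource(NSArray *sources, HTMLFetcher fetch)
{
    for (NSDictionary *source in sources) {
        NSString *html = fetch([source objectForKey:@"url"]);
        if (html == nil) {
            continue; // this source could not be downloaded
        }
        NSArray *results = ExtractBetween(html,
                                          [source objectForKey:@"start"],
                                          [source objectForKey:@"end"]);
        if ([results count] > 0) {
            return results;
        }
    }
    return [NSArray array]; // no source produced any data
}
```

One caveat with this design: an empty result from a source might also mean there is simply nothing to report that day, so the sketch cannot tell "no closings" apart from "the caps no longer match".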

Why did I write this tutorial in Objective-C rather than Swift, Apple's new programming language? It's quite simple. You need to know Objective-C. Swift is still a new language, so most of the nice repositories on GitHub and other places are not written in Swift. Swift apps also require iOS 7 or later, which excludes older iOS devices from downloading or running your app.

One more thing then I'm done. Please please please like and comment on this tutorial! It shows me that people are actually reading my stuff, giving me incentive to put out more!

Participated in the
Explore Science Contest