Update 12-20-08: The myspace class has been updated to reflect changes in the way MySpace outputs the HTML for friend listing pages. Download link is the same.
Presenting two MySpace classes:
- MySpace Mobile
- MySpace.com
programmed in PHP and released under the GPL. Download both classes here (myspace_classes.zip).
These classes provide basic profile and friend scraping functionality. My intention in programming them was to free personal & relationship data from MySpace for use in other applications. They were not created for spamming or other nefarious kinds of activities.
Each has its strengths and weaknesses. Used together, they can be very powerful. Here’s a brief breakdown:
MySpace Mobile
MySpace Mobile (located at m.myspace.com) makes it easy to extract user-entered profile information. Profile info is broken up into six categories:
- Interests and Personality
- Basic Info
- Background and Lifestyle
- Schools
- Companies
- Networking
“Basic Info”, for instance, contains Gender, Occupation, City, Ethnicity, etc. It’s easy to extract this info from MySpace Mobile because the site lays it out in easy-to-match HTML. Look at this DOM view of MySpace Mobile’s “Basic Info” section:
[Image: DOM view of MySpace Mobile’s “Basic Info” section]
Talk about a breeze with preg_match…
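As a rough illustration of that kind of matching, here is a minimal sketch. The markup in $html is a hypothetical stand-in (the real m.myspace.com HTML is not reproduced here), parseLabelledFields() is an illustrative helper rather than a method of either class, and it uses preg_match_all instead of a single preg_match so that every label/value pair is picked up in one pass.

```php
<?php
// Hypothetical markup standing in for the "Basic Info" section;
// the real m.myspace.com HTML is not reproduced here.
$html = '<div id="basic">'
      . '<span class="label">Gender:</span> <span class="value">Male</span><br />'
      . '<span class="label">Occupation:</span> <span class="value">Programmer</span><br />'
      . '<span class="label">City:</span> <span class="value">San Francisco</span><br />'
      . '</div>';

/**
 * Collect "Label: value" pairs into an associative array.
 * Illustrative helper only; not part of the myspace_m class.
 */
function parseLabelledFields(string $html): array
{
    // Group 1 captures the label (without the colon), group 2 the value.
    $pattern = '~<span class="label">\s*([^<:]+):\s*</span>\s*'
             . '<span class="value">\s*([^<]*?)\s*</span>~i';

    $fields = [];
    if (preg_match_all($pattern, $html, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $match) {
            $fields[trim($match[1])] = html_entity_decode($match[2], ENT_QUOTES, 'UTF-8');
        }
    }
    return $fields;
}

print_r(parseLabelledFields($html));
```

With real pages the pattern would have to be adapted to whatever tags and attributes MySpace Mobile actually emits.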
For my purposes I found Interests & Personality, Basic Info and Background & Lifestyle to be the most interesting, and thus only included them as part of the profile parsing functionality. It wouldn’t be difficult to extend this to Schools, Companies and Networking.
Unfortunately MySpace Mobile is excruciatingly slow compared to MySpace.com. In my tests, it takes about 0.5 seconds to grab a friend’s Basic Info. This might seem fast, but it means grabbing 300 friends’ info would take ~150 seconds, or 2½ minutes. In fact it would normally take longer, but I set up cURL to time out after 2 seconds: if a page takes longer than 2 seconds to load, the request is dropped and the class tries one more time to grab the page. As a result, some pages will occasionally be dropped completely. The TIMEOUT value can be easily adjusted to your liking.
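The timeout-and-retry behaviour can be sketched roughly as follows. This is not the class’s actual code: fetchWithRetry() and curlFetch() are hypothetical names, and the fetcher is passed in as a callable so the retry logic can be tried out without touching the network. Only standard cURL options (CURLOPT_RETURNTRANSFER, CURLOPT_TIMEOUT) are used.

```php
<?php
/**
 * Call $fetch($url) up to $maxAttempts times and return the first
 * non-null body, or null if every attempt fails (the page is "dropped").
 */
function fetchWithRetry(callable $fetch, string $url, int $maxAttempts = 2): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $body = $fetch($url);
        if ($body !== null) {
            return $body;
        }
    }
    return null;
}

/**
 * Fetch a page with cURL, giving up after $timeoutSeconds.
 * Returns null on timeout or any other transfer error.
 */
function curlFetch(string $url, int $timeoutSeconds = 2): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // whole-request timeout in seconds
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Offline demonstration with a stub fetcher that fails on its first call.
$calls = 0;
$flaky = function (string $url) use (&$calls): ?string {
    $calls++;
    return $calls < 2 ? null : 'page body';
};
var_dump(fetchWithRetry($flaky, 'http://example.invalid/profile'));

// Against a live page the call would look like:
// $page = fetchWithRetry('curlFetch', $url);
```

Raising the timeout or the attempt count trades speed for fewer dropped pages.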
MySpace
The MySpace class is primarily useful for quickly grabbing all of a user’s friends’ rudimentary info: Profile ID, Main Image, and Name. In my tests, grabbing between 300 and 400 friends took about 2 seconds. Grabbing 70 friends took 1.5 seconds.
“Grabbing” friends means the class will return a 2D array of your friends’ info. In this example:
$test = new myspace();
$test->login("username", "password");
$my_friends = $test->grabFriendBasics($test->myID);
$my_friends will be an array with the following keys: ['name'], ['img_url'] and ['id']. Each will contain the same number of values, e.g. if you have 70 friends then name, img_url and id will have 70 values each, in the order in which your friends are displayed on your friend listing pages.
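Because the three lists run in parallel, index $i in each one refers to the same friend. If one record per friend is more convenient, a small helper can regroup them. The sketch below assumes the array shape described above; friendsToRecords() is a hypothetical helper and the sample values are made up.

```php
<?php
/**
 * Turn the parallel 'name' / 'img_url' / 'id' lists into one
 * associative array per friend, preserving the original order.
 */
function friendsToRecords(array $friends): array
{
    $records = [];
    foreach ($friends['id'] as $i => $id) {
        $records[] = [
            'id'      => $id,
            'name'    => $friends['name'][$i],
            'img_url' => $friends['img_url'][$i],
        ];
    }
    return $records;
}

// Made-up sample data in the shape described above.
$my_friends = [
    'name'    => ['Alice', 'Bob'],
    'img_url' => ['http://example.com/alice.jpg', 'http://example.com/bob.jpg'],
    'id'      => [1001, 1002],
];

foreach (friendsToRecords($my_friends) as $friend) {
    echo $friend['id'], ': ', $friend['name'], "\n";
}
```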
Together
I have found the most interesting utility in using the MySpace class to grab all rudimentary friend info, and then using those friend IDs to grab all demographic info with MySpace Mobile. Here’s an example:
$rud = new myspace();
$rud->login("username", "password");
$my_friends = $rud->grabFriendBasics($rud->myID);
$msm = new myspace_m();
$msm->login("username", "password");
for($i = 0; $i < count($my_friends['id']); $i++) {
    $details = $msm->getFriendDetails($my_friends['id'][$i], array("basic"));
    // If first iteration, output the structure of the Basic info listing
    if($i == 0) {
        print_r($details['basic_def']);
    }
    print_r($details['basic']);
}
Of course, I don’t expect these classes to necessarily be used for what I’ve set them up for. If you have any feature requests, leave a comment below.
For a live demonstration of simple functionality, along with more code examples, check out the Demo.
I’m here at Freebase’s Hackday ’08 and things are just getting started. I have been interested in the Semantic Web for a couple years now, and though Freebase’s Metaweb technology didn’t originally provide an RDF interface, Freebase now has an RDF endpoint. The Metaweb Query Language (MQL) is much more intuitive to me, but I think RDF has a much bigger payoff in the end as a W3C recommendation. I’m looking forward to attending talks on interfacing with Metaweb using RDF.
On another note, the theme of today’s gathering is “Open Government.” I remember hearing a call recently for developers to work on more beneficial, perhaps humanitarian applications. Applications that will make everyday tasks easier to accomplish, and the important things easier to find. Doctor’s appointments, accurate medical information, transparency of gov’t, etc… Talks are starting. Will update soon.
Post-Hackday Update
Instead of going into detail on the different sessions I attended, I should just get straight to the point: Freebase, I like the idea of your mission. I appreciate the open-ended nature of your system and the ease of using the MQL. But I can’t help but feel like you’re in a blind race to beat everyone to the structured-data punch. Where are the killer apps? Your GIX framework for mashing, say, points of architectural interest onto a map of Marin County, is not very new or interesting. I know examples like that are watered-down versions of more complex applications for the sake of demonstration, but really - I have yet to see anything out of Freebase or the use of Freebase that has really blown my mind.
This is by no means a final judgment. I hold out for the day I see a killer application based on Freebase technologies. But I think the Semantic Web revolution will depend on information that contains meaning and utility for our vocational and personal lives. That includes social networks, e-mails, text messages, data on our hard drives…
So I leave with a few questions for the Freebase community: will Freebase’s “bases” technology enable this kind of user-centric information warehousing? Will there be a point (or does it exist already) where I can submit to Freebase a FOAF file of all my friends on MySpace and query it with the MQL? Obviously there are privacy concerns, so is Freebase going to give the average user leverage for maintaining their own data?
All in all, I had a good time at Freebase. I met some cool people who helped me gain a better understanding of the structure of MQL queries, and of the technologies available for working on my own projects (thanks James Levy!). And who doesn’t love a t-shirt with a rhino holding a flag? Thank you to everyone at Freebase for making the day light-hearted and fun.
The idea is essentially based on what the folks over at SquareSpace have done - provide a WYSIWYG interface for designing/modifying blog templates. I’ve had this idea for a while now (though on a larger scale than just WP), but it wasn’t until recently, when I became more adept with JavaScript, that I began to think seriously about how to implement it. I decided to narrow my focus to WP for a couple of reasons:
- The WP community is large, and many people could benefit from this kind of tool.
- WP has a limited number of design-influencing template functions. This makes my job easier.
Implementation
There are a few different aspects to developing the back-end functionality for this kind of tool. Note that I’m writing off the top of my head, so this list is sure to change.
- Understand the structure of the selected WP theme: header, footer, the loop, content (page, post, comments, etc), sidebar. This would be accomplished by parsing the theme’s main files (index.php, page.php, theloop.php, etc), and determining where each template function resides in relation to surrounding HTML elements.
- Use the information gathered from parsing to create a “map”: a link between template functionality and the Document Object Model. This will allow us to determine which elements of the WP suite are actually implemented on the page, where they are, and the limitations of their placement on the page.
- Functionality to modify the theme’s template files based on front-end design changes. This includes activating and placing widgets, HTML arrangement, CSS mods (+ other things I’m not thinking of?)
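The first step in the list above (finding where template functions sit inside a theme file) could be prototyped with PHP’s built-in tokenizer. This is only a sketch under stated assumptions: TEMPLATE_TAGS is a short, hypothetical subset of WordPress template functions, findTemplateTags() is an illustrative name, and the sample theme source is made up. It reports which tags appear and on which line, but not yet their position relative to the surrounding HTML elements.

```php
<?php
// Hypothetical, incomplete list of template tags to look for.
const TEMPLATE_TAGS = [
    'get_header', 'get_footer', 'get_sidebar', 'have_posts',
    'the_post', 'the_title', 'the_content', 'comments_template',
];

/**
 * Return every known template tag found in $source, in order of
 * appearance, together with the line it appears on.
 */
function findTemplateTags(string $source): array
{
    $found = [];
    foreach (token_get_all($source) as $token) {
        // Single-character tokens are plain strings; all others are
        // arrays of [token id, text, line number].
        if (is_array($token)
            && $token[0] === T_STRING
            && in_array($token[1], TEMPLATE_TAGS, true)) {
            $found[] = ['tag' => $token[1], 'line' => $token[2]];
        }
    }
    return $found;
}

// Made-up theme file, kept in a nowdoc so it is treated as plain text.
$theme = <<<'THEME'
<?php get_header(); ?>
<div id="content">
<?php while (have_posts()) : the_post(); ?>
  <h2><?php the_title(); ?></h2>
  <?php the_content(); ?>
<?php endwhile; ?>
</div>
<?php get_sidebar(); ?>
<?php get_footer(); ?>
THEME;

foreach (findTemplateTags($theme) as $hit) {
    echo $hit['tag'], ' (line ', $hit['line'], ")\n";
}
```

Building the “map” would then mean pairing each hit with the enclosing HTML element, for example by also inspecting the T_INLINE_HTML tokens that surround it.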
The front-end component is a little more obvious I think. I’m not too certain of the best layout, but SquareSpace has the right idea. Semi-transparent overlaid windows with all the design-related functionality needed: “drag-and-drop the sidebar here”, “give the footer this background color”, “put a calendar right here”, etc. At this point I’m more concerned about interfacing and modifying template files.
In terms of actually programming this plugin, I’ve had a lot of success with the Yahoo User Interface JavaScript library, so I’d want to use that for the front-end. And of course an AJAX interface to the PHP back-end to make modifications to the template files.
Clearly still mulling this over. Comments welcome!