Alex Barnett blog


Facebook export - WTF????

On hearing the news over at TechCrunch that I'm able to export my social network data out of Facebook as a .CSV file using the FriendCSV app, I did exactly that.

I have 74 contacts on Facebook but I managed to export 144 records. That's 70-odd people's "social data" (education, work experience, current location, hometown, affiliations, date of birth), all belonging to people I don't know. Like Justin over here, I seem to have more "friends" than I bargained for. And I've now got their data.

This is not good. Facebook - wtf is going on????

Update: Dan Birdwhistell, developer of the FriendsCSV app, commented below soon after I posted to explain that the issue I described above was NOT the fault of Facebook but a problem on the FriendsCSV data-processing side of things:

"Hey Alex.  I just posted this on Techcrunch as well.  WTF is right, but this very unfortunate glitch has been fixed.  Here's what happened:  After valleywag, techmeme, digg, etc. all picked it up, the server got overwhelmed and we had around 25 dumps that were in queue.  FB times out after a few minutes, so to speed up with the dump, we added some threading to the libraries, which pushed the exports through in an instant, but also misplaced some of the data in what we now know to be at least four separate csv dumps.  When we were alerted to this, we removed the threading and all was right again; however, the error did occur and it was our fault.  We'll continue to test the app during the night just to make sure this doesn't happen again. " 

"Deepest apologies that this happened."
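Dan doesn't say exactly how the threading misplaced the data, so the following is only a guess at the general shape of the problem, not a reconstruction of the FriendsCSV code. A common way for threaded exports to mix up records is a buffer shared between jobs. A minimal Python sketch of the safer pattern, where each export builds its CSV from state local to that job, might look like this (`FAKE_FRIENDS`, `export_csv` and `export_all` are made-up names for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
import csv
import io

# Hypothetical stand-in for the friend data each export job would fetch.
FAKE_FRIENDS = {
    "alice": [("Bob", "London"), ("Carol", "Paris")],
    "dave": [("Erin", "Seattle")],
}


def export_csv(user):
    """Build one user's CSV using only state local to this call."""
    buf = io.StringIO()  # created per job, never shared between threads
    writer = csv.writer(buf)
    writer.writerow(["name", "location"])
    for name, location in FAKE_FRIENDS[user]:
        writer.writerow([name, location])
    return user, buf.getvalue()


def export_all(users, workers=4):
    """Run the exports in parallel and key each dump by its owner."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(export_csv, users))


if __name__ == "__main__":
    for user, dump in export_all(["alice", "dave"]).items():
        print(user)
        print(dump)
```

Because nothing mutable is shared between jobs, one person's rows have no route into somebody else's file, however many exports are queued.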

Thanks for getting back to me so quickly Dan. Here's the thing though - you have shown how easily all this data can be extracted - that's a good thing, and well done for bringing our attention to that, but you have also shown how wrong it can go - that's the bad thing. It's small scale in this case but you've highlighted how this can go bad in a more general sense.

I've been very supportive of the "my data" efforts; however, my support has been based on the presumption that I'm actually talking about "my data", not somebody else's. This real-world example has opened my eyes to the fact that opening up social networks / the social graph will be a complex business, fraught with possible downsides.

At large scale, the ability to extract all my social graph data, as opposed to the "my data" that I give an application permission to use, raises the following question: whose data is my social graph data? Is it each individual's, or is it mine once Jo Smith has allowed me access to it?


Dan Birdwhistell said:

Alex. Thanks for the thoughtful response here. We were thinking about many of the same questions and issues here, and even though it was a big glitch that brought some of them to the forefront, I'm interested to see how the discussion develops re: your question of the extent to which a user (or the developers building apps for that user) actually owns the "social graph" and data tied to it. When we started building this little exporting graph, we knew that we would be pushing against the walls of what people felt was comfortable; however, it was just a guns-blazing version of what other apps have been doing for quite some time.

As you'll see in some of my other comments on the TC thread, what will be interesting, I believe, is what people actually DO with these data after they suck them out.

For instance, why did you decide to do the data dump, and what use does it provide you?  I personally did it so that I could more easily sort by various items and then decide whom I needed to contact/invite for specific things.  It also allowed me to sort for multiple column variables.

# October 23, 2007 7:59 PM

Gandalfe said:

Figures. Hope you pinged the facebook team.  :o(

# October 25, 2007 2:10 PM

Matt Katz said:

I tried the friendscsv app and it does appear to be all better now.  

The more interesting question you raise is about the private information in my data.

I think you solve that by treating your social graph, bloglist, etc as a list of pointers.  

My APML, OPML, etc should contain URLs that _don't_ contain my authentication info.  That way I can publicly expose anything I subscribe to, but you can only dig into those details if you have authorization to do so.

In facebook, the things you should be able to expose responsibly are the facebook profile IDs and the limited data (picture, etc) that by default one can see without being a friend.  From there on out, people with the appropriate permissions should be able to spider through to whoever they have perms for.
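To make the pointer idea concrete, here is a rough Python sketch. It is purely illustrative and does not use any real Facebook API: `Profile`, `PROFILES`, `export_graph` and `resolve` are invented names, and the permission model is an assumption. The export holds nothing but profile IDs, and the details behind each ID are filtered by who is asking.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    profile_id: str
    public: dict    # what anyone can see by default (name, picture, ...)
    private: dict   # what only authorized viewers can see
    allowed_viewers: set = field(default_factory=set)


# Hypothetical in-memory stand-in for the social network's profile store.
PROFILES = {
    "p1": Profile("p1", {"name": "Jo Smith"}, {"hometown": "Leeds"}, {"alex"}),
    "p2": Profile("p2", {"name": "Sam Lee"}, {"hometown": "Austin"}, set()),
}


def export_graph(friend_ids):
    """Export the graph as a list of pointers (IDs), not the data itself."""
    return sorted(friend_ids)


def resolve(profile_id, viewer):
    """Follow a pointer; private fields appear only for permitted viewers."""
    profile = PROFILES[profile_id]
    data = dict(profile.public)
    if viewer in profile.allowed_viewers:
        data.update(profile.private)
    return data


if __name__ == "__main__":
    for pid in export_graph({"p2", "p1"}):
        print(pid, resolve(pid, viewer="alex"))
```

An exported list like this can be shared or even lost without leaking anything beyond the IDs, because the private fields are only handed out at lookup time.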

In friendcsv, the author is not exposing the data, rather he is letting you retrieve the data people have exposed to you.  You losing the data is your irresponsible ass.  Do it and risk the shame and censure of your friends.

# October 26, 2007 10:32 AM