Yesterday's launch of Google Buzz, the search giant's attempt to enter the microblogging and status-update world dominated by Twitter, shows once again that the age of distributed publishing is here. People state their opinions and ideas openly in many different venues on the net, from Twitter to Facebook to their own weblogs. They are not limited to one of these venues; instead they regularly switch from one to another. Therefore it is relatively hard to keep track of these statements.
As stated in earlier posts on this blog, this kind of distributed opinions and arguments could be very valuable if they can be gathered efficiently and integrated into decision-making processes. I would like to take a closer look at the first of these two conditions, the efficient scraping of arguments and opinions, and provide a basic tool to get started with this work.
As a short demonstration of what unified search can do, I quickly built this crude Yahoo Pipe (more on that later) that searches through Twitter, blog posts and blog comments. Feel free to use it and take a look at the internals. This demo search shows the results for the term "Google Buzz". Of course, these results could also be delivered in RSS format.
Now let's take a look at how to build such a unified search tool with a lot more functionality.
First of all, the venues and areas in which the search for statements and opinions is conducted need to be narrowed down. Simply searching the whole net with a common search engine would not solve the problem, as the results are out of context and carry little meta information. In other words: search engine results are very fuzzy and hard to deal with. Before starting to engineer tools to find the statements in the different venues, it has to be clear which kind of meta information is needed to work further with the information found. These three questions are examples of possible meta information:
In which venue has the statement been posted? (Blog, Twitter, Facebook etc.)
Was it a reaction to another statement or does it stand on its own? (Blog post, comment, @reply etc.)
Where was the statement made (geographically)? (Many tools like Twitter or Google Buzz attach geodata to posts)
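To make the three questions above a bit more concrete, here is a minimal sketch in Python of a record that could hold one statement together with its meta information. The class name `Statement` and all field names are made up for this illustration; they are not taken from any existing tool.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Statement:
    """One opinion or argument found on the social web (hypothetical structure)."""

    text: str
    venue: str                          # question 1: "blog", "twitter", "facebook", ...
    in_reply_to: Optional[str] = None   # question 2: URL of the statement it reacts to, if any
    location: Optional[Tuple[float, float]] = None  # question 3: (latitude, longitude), if known

    def is_reaction(self) -> bool:
        """True if the statement answers another one instead of standing on its own."""
        return self.in_reply_to is not None


# Example values are invented for demonstration purposes only.
standalone = Statement(text="Trying out Google Buzz", venue="twitter")
reply = Statement(
    text="I agree with this post",
    venue="blog",
    in_reply_to="https://example.org/some-post",
)
```

Keeping the optional fields empty by default reflects the fact that not every venue delivers reply or geo information.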
Once it has been decided where to search and for what, the solution can be built. Most social web content is public and can be searched simply by reproducing the site's own search queries. For example, if you load the URL https://twitter.com/search?q=pep-net in your browser, the Twitter search results for the term "pep-net" will be presented to you. By adding the right parameters to the URL, the search can for example be restricted to results within a 50 kilometer radius around the city of Hamburg. On the basis of this approach it is fairly easy to build unified searches covering many social web sites at once.
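Reproducing a site's search query mostly means assembling a URL. The following sketch shows one way to do that in Python. Only the `q` parameter is taken from the example URL above; the `geocode` parameter name, its value format and the Hamburg coordinates are assumptions for illustration and would have to be checked against the documentation of the site in question.

```python
from urllib.parse import urlencode


def build_search_url(base_url, term, extra_params=None):
    """Return a search URL for `term`, optionally with additional query parameters."""
    params = {"q": term}
    if extra_params:
        params.update(extra_params)
    # urlencode escapes special characters such as spaces and commas.
    return base_url + "?" + urlencode(params)


# Reproduces the example URL from the text.
plain_url = build_search_url("https://twitter.com/search", "pep-net")

# ASSUMPTION: "geocode" with "latitude,longitude,radius" is a hypothetical
# parameter used here to illustrate the 50 km radius around Hamburg.
local_url = build_search_url(
    "https://twitter.com/search",
    "pep-net",
    {"geocode": "53.55,9.99,50km"},
)
```

The same helper can be pointed at the search page of any other venue, as long as its parameter names are known.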
Building the actual search can be done using a tool like Yahoo Pipes or by implementing it in a programming language like PHP or Python. Using existing tools like Pipes allows for very rapid development, even for a non-programmer, but the final result lacks performance. Using a programming language allows for maximal flexibility and performance, but the implementation work is much harder.
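For the programming language route, the core of a unified search could look roughly like the sketch below. It assumes that each venue is wrapped in a small function that takes a search term and returns a list of hits with a `text` and a `published` field; that interface, the function names and the sample data are all invented for this example. The stand-in sources do not touch the network, so the merging logic can be tried out on its own.

```python
from typing import Callable, Dict, List

Hit = Dict[str, str]
Source = Callable[[str], List[Hit]]


def unified_search(term: str, sources: Dict[str, Source]) -> List[Hit]:
    """Query every source, tag each hit with its venue and return the newest first."""
    merged: List[Hit] = []
    for venue, search in sources.items():
        try:
            hits = search(term)
        except Exception:
            # One unreachable venue should not break the whole search.
            continue
        for hit in hits:
            merged.append({**hit, "venue": venue})
    # ISO 8601 timestamps sort correctly as plain strings.
    merged.sort(key=lambda hit: hit["published"], reverse=True)
    return merged


# Stand-in sources with invented data; real ones would fetch and parse search pages or feeds.
def fake_twitter(term: str) -> List[Hit]:
    return [{"text": f"Trying out {term}", "published": "2010-02-10T09:30:00"}]


def fake_blogs(term: str) -> List[Hit]:
    return [{"text": f"First thoughts on {term}", "published": "2010-02-10T11:00:00"}]


def broken_source(term: str) -> List[Hit]:
    raise ConnectionError("venue unreachable")


results = unified_search(
    "Google Buzz",
    {"twitter": fake_twitter, "blogs": fake_blogs, "comments": broken_source},
)
```

Passing the sources in as a dictionary keeps the merging code independent of any single venue, so new venues can be added without changing `unified_search` itself.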
This article is of course only meant to be a quick introduction to the basic concepts of unified search and its possible use in participation processes. If you are interested in this area, just post a comment here.