This is an idea I’ve been toying with the last couple of months after working a lot on integrating Twitter, gathering RSS feeds and implementing an activity stream for iknow.co.jp.

When talking about Twitter over a couple of drinks, my friends often roll their eyes. Again, this idiot evangelizing this ‘look-at-me’ service. In a way they are right: Twitter has limited value for its users. But the thing that I found most interesting about Twitter is the potential of activity streams and the machine-to-human aspects of it. In my previous article on how to build a Twitter service, I concluded that Twitter has severe limits when it comes to general activity streaming.

But guess what, activity is still HOT and serious innovation here is the new gold.

Activity Streams?

After sites like Plaxo, more and more websites are restructuring certain areas of their sites to facilitate ‘activity streams’. LinkedIn just introduced one:

Only a few weeks before 37Signals made this update to my Backpack account:

These activity streams are still vaguely defined: some call it Newsroom, others Latest Activity or Updates. Here in Tokyo we have many different names for it: ‘notifications’ in programming code, ‘Activity’ on one page and ‘My News/マイニュース’ on the other.

Nevertheless we can extract some simple facts from this almost naturally occurring phenomenon:
  • Events are plotted along a timeline and serve as news flashes.
  • They include updates about friends or followees.
  • They notify you about system events (new message, you have listened to 1000 songs, congratulations!)
  • They serve as calls to action: read this message, check out this blog entry etc. This serves as a social and content creation lubricant.

Lifestreaming?

This word is flooding today’s buzztalk. Applications like friendfeed.com and tumblr.com basically allow you to aggregate stuff from other services and republish that in one big ass activity stream.

However, there are big flaws in most of these services:
  • They are not updated in real time; they poll external services and passively retrieve new updates.
  • If they have filter/categorization capabilities at all, they provide that functionality through their site and not the actual API.
  • They focus on social events and omit the upcoming wave of recommendation engine notifications.
  • Actively syndicating events through e.g. XMPP is still hard to find.
  • Their paradigm is Software as a Service rather than Platform as a Service.

an Open Activity Platform

I think there will be a need to fire off all these notifications into some sort of standardized activity broker. Such an Activity Broker should have these core responsibilities:
  • Providing categorization and call-to-action hooks for notifications.
  • Providing authentication and easy access for external services.
  • Sophisticated filtering mechanisms for users.
  • Allowing real-time pushing of these incoming notifications to third party services.

I’m not really aware of current standardization drafts that accommodate these; perhaps DataPortability.org will do the job. But standards like OAuth, JSON and XMPP fit very well with the implementation of such a platform.
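To make the broker responsibilities a bit more concrete, here is a minimal in-memory sketch in Ruby. It is only an illustration under assumptions: the `Notification` fields, the `ActivityBroker` class and its `subscribe`/`publish` methods are made-up names, not part of any existing standard or service, and a real broker would push over something like XMPP instead of calling a block.

```ruby
require 'json'
require 'time'

# Hypothetical notification format; these field names are assumptions and do
# not come from any standard.
Notification = Struct.new(:source, :category, :text, :action_url, :created_at) do
  def to_json(*args)
    to_h.merge(created_at: created_at.getutc.iso8601).to_json(*args)
  end
end

class ActivityBroker
  def initialize
    @subscribers = []
  end

  # Register a consumer. The filter decides which notifications it wants;
  # the handler block receives each matching notification as a JSON string.
  def subscribe(filter = ->(_notification) { true }, &handler)
    @subscribers << [filter, handler]
  end

  # Push a notification to every interested subscriber as soon as it arrives.
  # Returns the number of subscribers it was delivered to.
  def publish(notification)
    matching = @subscribers.select { |filter, _handler| filter.call(notification) }
    matching.each { |_filter, handler| handler.call(notification.to_json) }
    matching.size
  end
end
```

A frontend that only cares about messages could subscribe with `broker.subscribe(->(n) { n.category == 'message' }) { |json| puts json }`; the category and the action URL are what would carry the categorization and call-to-action hooks.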

Interestingly – while searching for a suitable home for such a platform – I discovered the domain openactivity.org has already been taken by a certain company from Redmond:

Domain Name:OPENACTIVITY.ORG
Registrant Name:Sean Lyndersay
Registrant Street1:One Microsoft Way
Registrant City:Redmond
Registrant Email:seanlynd@microsoft.com

Skinning the Platform

The other side of the coin is the tools that plug into this Open Activity Broker. Value for users will lie in tools that handle display and control very well. Lifestreamers and the like should be mere frontends. Secondly, tremendous value can be added by plugging in recommendation engines and integrating with existing services.

I haven’t done a thorough comparison of current lifestream-like applications out there, but I do have some ideas for such an Activity Frontend. The core value will be sophisticated filtering and categorization of incoming notifications. A filter like that should be organic, e.g. voting down certain sources or notification types. Right now I’m following about 50 people on Twitter and I don’t have time to give any attention to this public timeline – I need to easily and seriously filter the noise!
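One way to picture such an organic filter is a simple scoring scheme in which every vote nudges a source or a notification type up or down. The sketch below is an assumption-laden toy: the `VoteFilter` class, the item shape (a hash with `:source` and `:type`) and the threshold are all invented for illustration.

```ruby
# Hypothetical "organic" filter: votes change the score of a source or a
# notification type, and items whose combined score drops below the
# threshold are hidden.
class VoteFilter
  def initialize(threshold = -1)
    @threshold = threshold
    @scores = Hash.new(0)
  end

  # kind is :source or :type, name is e.g. a username or 'tweet'.
  def vote_up(kind, name)
    @scores[[kind, name]] += 1
  end

  def vote_down(kind, name)
    @scores[[kind, name]] -= 1
  end

  # An item is a hash with :source and :type keys (an assumed shape).
  def score(item)
    @scores[[:source, item[:source]]] + @scores[[:type, item[:type]]]
  end

  def visible?(item)
    score(item) >= @threshold
  end

  def apply(items)
    items.select { |item| visible?(item) }
  end
end
```

With the default threshold of -1, a single down-vote is forgiven and a second one hides the source, so the filter adapts gradually instead of acting like a hard block list.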

During one of my takeout-sushi lunches I made the above mockup. The important filtering/categorization controls are missing, because building those will require serious thinking about user interaction. Techniques like Comet can make this web application real-time (pushing notifications on the page as they happen).

Also, integration with the Desktop world is an interesting prospect. As the platform should play well with current open standards like XMPP, so should the frontend play well with open UI libraries like Growl (or Snarl for Windows and Ghosd for Linux):

a Project has been Born?

Not yet, although I probably cannot resist writing some prototypes. However, for something like this to really work well, there needs to be some kind of community actively backing the non-profit part, the Open Activity Broker. This will require a lot of commitment, making the whole package a full-time gig.

Any takers? Or any tips that this is just reinventing the wheel? Please share your thoughts.

How to Build a Twitter Agent

on February 15, 2008

Note, while working on this project this ReadWriteWeb article was released, illustrating the future potential of the Jabber/XMPP protocol.

In this article we will build an actually useful Twitter Service that will allow us to track the Blogosphere. In the process we will get hands-on programming experience with Ruby, DRb, Twitter and Jabber. This will sharpen our developer skill-set to get ready for the upcoming (Folk)Semantic Web. We will also evaluate the problems seen and the opportunities ahead.

Background

Whether you want to call it Web 2.0, Web 3.0, the Semantic Web or the Web of Data – change is happening. In the past years we’ve seen the tremendous power of Folksonomies and now this Social Web is colliding with the emergence of the Semantic Web, resulting in the first semantic services. For us developers and creative entrepreneurs it’s important to get ready for this new wave of business opportunities. I find the whole notion of Intelligent Agents very interesting. For our little project however, we will create a Stupid Agent :]

Technologies like Jabber/XMPP and DRb will enable us to move from a reactive web to a proactive web. Right now this proactive realtime push of data is important for the more liquid content creation services. Micro/nano blogging platforms such as Twitter and Tumblr are good examples of this. This is one of the reasons that they already have a Jabber service set up.

I had used Jabber before to communicate with my geek friends. For this project, I had to set up a Jabber client and Jabber account. Call me stupid, but I actually had to spend 30 minutes figuring out how the hell I had to create an account and choose which server to use (turns out you can do that in the client). Now of course XMPP/Jabber is just a standard for enabling IM communication, but apart from Google’s GTalk there hasn’t been much widespread use by ordinary users. In my view, these uses of XMPP for machine to machine (to human) are much more interesting.

Case: The Observatory Bot

I’ve been programming for quite some time now. When learning a new language I really hate doing little examples that produce zero user/business value. That’s why I think the best way to learn new technologies is to solve real world problems right away. Of course tutorials are valuable to just get a general idea of what’s going on, but don’t waste too much time on them – implement straight away.

Our Twitter Service will also need to create some value for the user and must be production ready. However, don’t get your hopes up too much since these are experimental technologies with dependence on external services like Jabber and Twitter.

Remember my last little project? Wigitize.com is actually generating a lot of data. It’s tracking about 5000 to 6000 feeds every hour! Let’s do something with that data :]

The Observatory:
  • is a Twitter-only service
  • will allow you to ‘track’ the Blogosphere
  • will send you a direct message when something happens in the Blogosphere

Basically this is like Twitter’s IM functionality to track the Twittersphere. So in a sense The Observatory will be a proof of concept portal between the Twittersphere and the Blogosphere.

The Architecture

Right now Wigitize.com uses BackgrounDRb to perform background tasks and also to update all RSS feeds periodically. Every time the feed aggregation process finds a new feed entry it will create a FeedEntry instance. The creation of these objects serves as an event for the Observatory Bot. These events have to be pushed to the Observatory Bot in some way.

What can Twitter’s IM service do for us?

To play around with Twitter’s agent you need to set up a Jabber account and a Twitter account. For debugging I’ve found the MacOS tool JabberFox very helpful.

Basically, these commands are available:

However, I think there are a lot more hidden commands which can be used. When I emailed the Twitter developers, they told me there is a command called “d”. This can be used to send direct messages (d username message). Very useful!

Coding the Bot

Our implementation choice for today will be Ruby. If you’ve programmed intensively in other languages before you’ve probably come to the conclusion that Ruby is quite different from most others. Ruby’s flexible object model allows for great extension of the language itself (e.g. 3.minutes.ago). It is therefore no surprise that interesting Semantic Web projects like ActiveRDF are choosing Ruby as their language.

To communicate with Twitter we can use this cool Ruby Twitter API. Unfortunately all of these APIs are HTTP/REST driven and really limit what we can do in terms of realtime response. Also, if you want to build a serious production-ready service, constantly polling Twitter will kill both parties.

So we need to interface with their Jabber Service. Jabber is a friendly name for Instant Messaging (IM) using the open XMPP protocol. Luckily, there is a Ruby library called XMPP4R which does most of the XMPP work for us. This blog post provides some simple examples and this German wiki entry provides sample code showing how to use callbacks (very important for a bot).

I’ve wrapped all of this in a simple JabberBot class (jabber_bot.rb) that can be used like this:

  class MyJabberBot < JabberBot
    def on_message(from, body)
      say(from, "You said: #{body}")
    end
  end
  my_jabber_bot = MyJabberBot.new('observatory@jabber.org', 'password')
  my_jabber_bot.connect_and_authenticate
  my_jabber_bot.run

As you can see in the diagram, I’ve built a TwitterBot on top of this JabberBot. Unfortunately it’s not possible to do all communication with Twitter through Jabber yet. For example: there are no events for when users start following other users, nor ways to retrieve information. This is why twitter_bot.rb is essentially a hybrid using both the Twitter API and Twitter’s Jabber service. Feel free to use all source code provided here, I know it will be useful to some of you out there. This is how to use this TwitterBot:

  twitter_bot = TwitterBot.new('observatory', 'password', 'observatory@jabber.org', 'password')
  twitter_bot.track_phrases = ['observatory.topoints.com']
  twitter_bot.on_directed_tweet do |username, message|
    puts("directed tweet: #{username} says #{message}")
  end
  twitter_bot.on_tweet do |username, message|
    puts("something from #{username}: #{message}")
  end
  twitter_bot.on_track do |username, message, phrase|
    puts("track: #{username} says #{message} (keyword: #{phrase})")
  end
  twitter_bot.runn(:follow_all_followers => true)
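For readers wondering how block callbacks like on_tweet and on_track can work, here is a rough sketch of the pattern. This is not the code from twitter_bot.rb: the `CallbackBot` class, its `dispatch` method and the rule that a directed tweet starts with `@botname` (and is passed on without that prefix) are assumptions made for the example.

```ruby
# Hypothetical sketch of storing and dispatching block callbacks;
# this is not the implementation used in twitter_bot.rb.
class CallbackBot
  attr_accessor :track_phrases

  def initialize(bot_name)
    @bot_name = bot_name
    @track_phrases = []
    @callbacks = Hash.new { |hash, event| hash[event] = [] }
  end

  def on_directed_tweet(&block)
    @callbacks[:directed_tweet] << block
  end

  def on_tweet(&block)
    @callbacks[:tweet] << block
  end

  def on_track(&block)
    @callbacks[:track] << block
  end

  # Classify an incoming message and call the matching blocks.
  def dispatch(username, message)
    directed = message.match(/\A@#{Regexp.escape(@bot_name)}\b\s*(.*)\z/mi)
    phrase = @track_phrases.find { |p| message.downcase.include?(p.downcase) }
    if directed
      # Assumption: the "@botname" prefix is stripped before the callback runs.
      fire(:directed_tweet, username, directed[1])
    elsif phrase
      fire(:track, username, message, phrase)
    else
      fire(:tweet, username, message)
    end
  end

  private

  def fire(event, *args)
    @callbacks[event].each { |callback| callback.call(*args) }
  end
end
```

In a real bot, `dispatch` would be called from the Jabber message callback for every incoming message.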

Now that we have the basic building blocks to build our service, let’s build our core business logic (observatory_twitter_bot.rb):

This means that we will send a greeting when people start following us:

  on_follow do |username|
    logger.info("#{username} is following us, will follow #{username} too and send welcome message")
    follow(username)
    direct_message(username, "the Observatory is now ready to serve you, use '@observatory track [keyword]' to get blogosphere updates.")
  end
  on_unfollow do |username|
    logger.info("#{username} stopped following us")
  end

Note: in order to get the on_follow event, we have to poll the Twitter HTTP API. Since Twitter limits the rate to 70 requests per hour, I poll every two minutes (30 requests per hour) to be on the safe side.
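The polling idea boils down to fetching the follower list at a fixed interval and comparing it with the previous one. A minimal sketch, assuming a hypothetical `FollowerPoller` class and a `fetch_followers` callable that wraps whatever HTTP call returns the current follower names (neither is part of the Twitter API or of the bot's source):

```ruby
require 'set'

# Hypothetical poller: fetch_followers is any callable returning the current
# follower names, for example a wrapper around an HTTP API call.
class FollowerPoller
  POLL_INTERVAL = 120                       # seconds, i.e. every two minutes
  REQUESTS_PER_HOUR = 3600 / POLL_INTERVAL  # 30, well below the quoted limit of 70

  def initialize(fetch_followers)
    @fetch_followers = fetch_followers
    @known = nil
  end

  # Compare the current follower list with the previous one.
  # Returns [new_followers, lost_followers] as sorted arrays.
  def poll_once
    current = Set.new(@fetch_followers.call)
    previous = @known || current # the first poll only records a baseline
    @known = current
    [(current - previous).sort, (previous - current).sort]
  end

  # Poll forever, yielding :follow and :unfollow events to the given block.
  def run
    loop do
      followed, unfollowed = poll_once
      followed.each { |name| yield(:follow, name) }
      unfollowed.each { |name| yield(:unfollow, name) }
      sleep POLL_INTERVAL
    end
  end
end
```

Treating the first poll as a baseline is a design choice: it avoids greeting every existing follower again each time the bot restarts.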

And that we will start tracking the Blogosphere for them when they say the magic word:

  on_directed_tweet do |username, message|
    logger.info("directed tweet: #{username} says #{message}")
    if (phrase = track_phrase(message))
      logger.info("tracking '#{phrase}' for user #{username}")
      begin
        direct_message(username, "Will send a direct message anytime something happens in the Blogosphere regarding '#{phrase}'")
        Tracker.for(username, phrase)
      rescue => e
        logger.error("tracking failure: #{e.to_s}")
      end
    end
  end
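The track_phrase helper used above is not shown in this article. One possible version, offered purely as an assumption about its behaviour, pulls the keyword out of messages such as "track ruby" or "@observatory track 'Twitter'" and returns nil for anything else:

```ruby
# Hypothetical implementation of a track_phrase helper: returns the keyword
# from messages like "track ruby" or "@observatory track 'Twitter'",
# or nil when the message is not a track command.
def track_phrase(message)
  match = message.to_s.strip.match(/\A(?:@\w+\s+)?track\s+(.+)\z/i)
  return nil unless match

  # Drop surrounding quotes so that 'Twitter' and Twitter mean the same thing.
  phrase = match[1].strip.gsub(/\A['"]+|['"]+\z/, '')
  phrase.empty? ? nil : phrase
end
```

Returning nil for non-commands is what makes the `if (phrase = track_phrase(message))` guard work.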

Now when a new FeedEntry is created, we need to make sure that these Twitter users get notified when their tracked phrase matches the FeedEntry. Since this might take up some time, I’ve created a background worker task for it:
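The matching step inside such a worker could look roughly like the sketch below. It is not the worker used by Wigitize: `BlogEntry`, `notifications_for` and the trackers hash (username mapped to tracked phrases) are hypothetical stand-ins for the FeedEntry and Tracker models.

```ruby
# Hypothetical stand-in for a feed entry: anything with a title and a url.
BlogEntry = Struct.new(:title, :url)

# Returns [username, message] pairs for every tracked phrase that occurs in
# the entry title (case-insensitive). trackers maps usernames to phrases.
def notifications_for(entry, trackers)
  title = entry.title.downcase
  trackers.flat_map do |username, phrases|
    phrases.select { |phrase| title.include?(phrase.downcase) }.map do |phrase|
      [username, "'#{phrase}': #{entry.title} #{entry.url}"]
    end
  end
end
```

Each resulting pair could then be handed to the bot's direct_message method, which is where DRb comes in.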

As you might see, Distributed Ruby (DRb) makes it extremely easy to control our bot remotely. In the ObservatoryBot we say:

  DRb.start_service("druby://:8997", self)

And all bot functionality can be accessed by calling: observatory_bot = DRbObject.new(nil, 'druby://:8997')
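For anyone who has not used DRb before, the whole round trip fits in a few lines. The sketch below runs server and client in one process for brevity; `BotFront` is a hypothetical stand-in for the bot, and port 0 (let the operating system pick a free port) is used instead of the fixed port 8997 from above.

```ruby
require 'drb/drb'

# Hypothetical front object standing in for the bot.
class BotFront
  def initialize
    @sent = []
  end

  # Pretend to send a direct message and report how many were sent so far.
  def direct_message(username, text)
    @sent << [username, text]
    @sent.size
  end
end

# Server side: expose the object. Port 0 lets the OS pick a free port;
# the bot itself uses the fixed URI druby://:8997 instead.
DRb.start_service('druby://127.0.0.1:0', BotFront.new)
uri = DRb.uri

# Client side (normally another process): obtain a proxy and call it
# as if it were a local object.
remote_bot = DRbObject.new_with_uri(uri)
count = remote_bot.direct_message('dominiek', 'New entry about Twitter')
puts "messages sent: #{count}"

DRb.stop_service
```

Arguments and return values are marshalled across the connection, so it is easiest to stick to simple values such as strings, numbers and arrays.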

Now that we have our autonomous agent it would be nice if we could easily start and stop it in a production environment. I found the Ruby Gem called Daemons extremely useful to wrap these things up.

First, set up a file that runs the never ending process (eg script/observatory_twitter_bot.rb):

require 'logger'
require File.dirname(__FILE__) + '/../config/boot'
require File.dirname(__FILE__) + '/../config/environment'
require File.dirname(__FILE__) + '/../lib/observatory_twitter_bot'

logger = Logger.new(File.join(RAILS_ROOT, 'log/observatory_twitter_bot.log'))
observatory_twitter_bot = ObservatoryTwitterBot.new(logger)
observatory_twitter_bot.runn

Next, wrap this up in a daemon script (eg script/observatory_twitter_bot):

#!/usr/bin/env ruby
require File.dirname(__FILE__) + '/../config/boot'
require 'rubygems'
require 'daemons'

Daemons.run('script/observatory_twitter_bot.rb')


The Demo

Right now if all communication lines with Twitter are working fine, the service is up and running. I’ve made a little bot homepage at observatory.topoints.com

@observatory track ‘Twitter’:

Problems and Opportunities

In the development of the Observatory I had one big obstacle: Twitter is often down and it can cripple your service and development time. I understand that Twitter is a small team under enormous pressure but there have been a lot of complaints about this.

Nevertheless Twitter and its developers really kick ass. I emailed Alex Payne and he was excited about what I’m doing (and also the Twitter things happening in the iKnow! project in Japan). He responded fairly quickly and immediately whitelisted my Twitter account to up the rate limits.

While working on this project I realized that Twitter in its current state isn’t really suitable for system-to-human notifications. Twitter could expand their system to be a true notification framework, but I’m not sure if they will. If they don’t, there is a tremendous business opportunity here. Imagine an open API that mashes up with technologies like XMPP and Growl. A service like that could become THE notification-bus of the web! (Already Growl is pretty big in Mac land). A full blog post about this startup idea coming up!

Geek Food for Thought

What about RubyOnRails, Jabber, DRb, Daemons and BackgrounDRb? I think we are seeing a new framework here! In this interesting article by Danny Ayers he talks about a toolset for agents. For Java there is already “an Agent Framework” that I haven’t checked out yet. I can imagine that these frameworks make development like this easier and facilitate better system autonomy. Of course it’s desirable to give such a framework a pragmatic paintjob by using Rails paradigms.

Above I’ve illustrated Rails’ missing brother. I think it’s also good to take into account interesting technologies like Juggernaut and Comet. These are basically JavaScript push techniques that make synchronous interaction on the asynchronous web possible.

When you combine all these micro asynchronous communication lines you get one big synchronous connection line between machine agents and user agents.

Creating a Twitter Loop

on January 29, 2008

Note: hashtags.org has banned me for spam and I understand. Sorry guys! The experiment was stopped as of 11:00 JST

Last night I did a little art experiment: what happens when you use a service like TwitterFeed.com to feed your Twitter status updates back into Twitter itself?

For this experiment I:
  • created a user called loopyloop
  • used twitterfeed to update both the RSS and Atom feed
  • used the great service hashtags.org to keep track of the activity

I was hoping to see some interesting emergent behavior, but the end result was a simple update every half an hour with 5 updates.

I do think it is interesting and important to think about full-artificial organisms living and looping through our cyberspace.

Twitter is great, and this small $5 million company is growing like crazy. The core of their service is inherently simple: blogging/chatting in messages of no more than 140 characters. They opened up all of their APIs and Twitter is now flourishing with activity. Hell, we even integrated it in our language learning platform: http://iknow.co.jp/ (more to come). It’s only the first step however, these services could open up even more!

iKnow is a service that specializes in online learning, so the SNS part of the site is important but still secondary. Right now in iKnow you can upload a picture together with your journal entry. It would be nice if we could provide more picture uploading and managing functionality, but we don’t want to build that. Flickr specializes in these things and would make a nice addition. Unfortunately, a service like Flickr doesn’t provide a truly transparent API yet – people still need to register for a Flickr account.

Also, it would be nice if we could provide status updates for all people on the website using Twitter – but people still need to go through the registration procedure at Twitter.com to be able to use it. I think the next success in web integration lies in opening up your APIs to an extent where it is completely transparent to use them. You don’t need to worry about registering at the third party service.

But what about the revenue? If people don’t come to your website anymore, you cannot get any advertising money! That may be so, since most sites still rely on people being exposed to banners/adwords on their website. These things will change however:

  • there will be in-content advertising, an example is the already emerging in-video ads
  • the freemium model will also be applied to APIs. A free API is for personal use, a premium API is for integration use in other web services. At the very least, this premium API will not require people to pre-register.

The big picture really makes sense: online services will be more specialized. Right now you could imagine that photo management is done by something like flickr/picasaweb, status updates by twitter, music integration by last.fm etc. But things could fragment even more when there is an open integration market: facebook-style wall service, embeddable message/mail service, tagging service, rating service or image cropping service. An example of the latter is Picnik, a service recently integrated into Flickr.

sidethought: will this kill branding?

Most of you might have heard about the latest trend: twitter.com. At first I thought the idea was really stupid and could only be used for egocentric people (like me):

  • twitter is a blog that allows you to write posts with a maximum of 140 characters
  • these one-liners can be 'watched' by your friends or you can watch your friends

"I'm at the bakery", "Britney Spears is so cool!" are examples of these so-called 'tweets'. But these examples are bad. Like blogging, people can write really really useless things. Blogs are also most often abused by people who write about 'how depressed they are' or how shallow they are.

But some blogs, a small number, provide real high quality content. Some of them even get printed to books (The book I'm reading now by Seth Godin appears to be a printout of his blog).

If I'm right, blog posts are micro-content (and my semantic web obsessed colleague can correct me if I'm wrong). Quality blog posts are on average the size of magazine articles and they provide the same information. Blog posts have the extra advantage of being interconnected in the blogosphere.

Twitter on the other hand is nano-content, and it obviously has other uses. People are not quite clear yet about these uses. Examples of high quality tweets are: "stranded in Korea because of typhoon" and "Harry Potter dies in the end". These provide quick communications to the people that are subscribed to your messages.

Another use would be to use twitter as a thought notebook, where you can write little ideas like "hey, what about yocto-content" or "KFC is booming here, buy KFC China stocks". An advantage of these tweets is that they can be written quickly, which allows you to have a very active messaging stream. This is one of the main abstract uses of twitter: displaying activity. My colleague had an interesting suggestion that just like corporate blogs, there might be a use for corporate twitter-like applications.

As a wannabe entrepreneur, I think there are 2 big opportunities here:

Twitter without the hype

There are many tools that make twittering easier, like twittermail (mail2twitter). But one obvious thing always painfully catches the eye: the big ugly twitter logo on our profiles. On your twitter page (like http://twitter.com/dominiek) you can only customize the layout a bit, but it will always look like this.

Therefore, it would be great to have a service that allows you to simply have a list of latest-thoughts or latest-communiques. It should also allow geeks to put in their own markup code and to attach their own domains (like thoughts.dominiek.com). But more importantly, it should allow corporations to use their own brand. So this service should be transparent and brandless (this fits Seth's statement that branding is a dying industry, sorry Russ Meyer).

Yocto-content

If twitter is nano content, would something even smaller also work? Let's check wikipedia for a name:

So what would this yocto-content look like? Probably one word or a hyphenated word. You can have a stream of simple keywords to 'tag your life' in a way. For example: work, container, work, work, namkee, heineken, back-to-work, vacation, china, work, holland, bureaucracy, fly, korean. You could then visualize this stream (over time) like a tagcloud. Also you can compare it to the clouds of other people and detect similar lives or interests. Other uses are still to be explored.. ;]
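Both ideas, the tagcloud and the comparison with other people, are easy to play with. A small sketch, with the caveat that the method names are invented here and that cosine similarity is just one possible way to compare two clouds:

```ruby
# Count how often each keyword occurs in a stream (the weights of the cloud).
def tag_cloud(stream)
  stream.each_with_object(Hash.new(0)) { |tag, counts| counts[tag] += 1 }
end

# Cosine similarity between two clouds: close to 1.0 when the keyword
# proportions are the same, 0.0 when no keyword is shared.
def cloud_similarity(a, b)
  dot = a.sum { |tag, count| count * b.fetch(tag, 0) }
  norm = ->(cloud) { Math.sqrt(cloud.values.sum { |count| count**2 }) }
  denominator = norm.call(a) * norm.call(b)
  denominator.zero? ? 0.0 : dot / denominator
end

my_life = tag_cloud(%w[work container work work namkee heineken back-to-work
                       vacation china work holland bureaucracy fly korean])
puts my_life['work']
```

Sorting the cloud by count gives the font sizes for a visual tagcloud, and running `cloud_similarity` against other people's clouds gives a crude measure of similar lives or interests.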