Friday, February 28, 2014

Progress and Architecture of the Yammer Search Connector for SharePoint

Over the last few days I took some time to improve my Yammer Search Connector for SharePoint. I am doing this partly as preparation for my talk at Collaboration Days in Zürich and partly out of curiosity to see whether this can work.

So far everything has worked well. The display templates for the SharePoint search results are always more work than expected, especially if you have to write asynchronous JavaScript in SharePoint Designer.

I chose to adapt the Microblog Display Template that is normally used for displaying Newsfeed posts.

This is the result:


Not too bad, is it?

You can see the similarity to the Newsfeed search results, but this time all data is from Yammer: content, author, creation date, like count, reply count and whether it is a "root" post or not. If it's a root post, the caption says "Xy yammered...", and for a reply it says "Xy answered...".
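The caption rule is simple enough to sketch in a few lines. This is only an illustration in Python, not the actual display template code (which is JavaScript). The function name `caption` and the `replied_to_id` parameter are assumptions: I treat a message without a parent ID as a root post.

```python
from typing import Optional


def caption(author: str, replied_to_id: Optional[int]) -> str:
    """Return the search result caption for a Yammer message.

    Assumption: a message with no parent (replied_to_id is None)
    is a "root" post; anything else is a reply.
    """
    if replied_to_id is None:
        return f"{author} yammered..."
    return f"{author} answered..."
```

For example, `caption("Xy", None)` gives the root caption, while `caption("Xy", 42)` gives the reply caption.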

The user image is also from Yammer. It will even be displayed if you are not logged in to Yammer.

On the left you see some other data from Yammer being used as refiners: groups, network names and like count. This could easily be extended to topics or result type.

Architecture

Now for the architecture. Here is a diagram:


As mentioned, this works quite well. (The Sync-apps use the YamrSync library.)

Here is my reasoning for using a cache in between Yammer and SharePoint:
  • a SharePoint full crawl would hammer the Yammer REST API, hitting API limits and possibly attracting attention by causing excessive load
  • Yammer is in the cloud, not in your local data center: you download the data over your Internet connection, so it can take some time to get everything, especially big documents. A crawl would take forever, and continuous crawls are not supported for non-SharePoint content
The idea is to use a small console application that loads data from Yammer over time. This of course has caveats:
  • we essentially duplicate what the SharePoint crawl does (although with more control over how we crawl)
  • we need a cache
  • the cache currently is MongoDB, which is robust and scalable but outside the Microsoft world - would this fit in?
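To make the "load over time into a cache" idea more concrete, here is a minimal sketch in Python. It is not the YamrSync code. A plain dictionary stands in for MongoDB, and `fetch_page`, `sync_messages` and `fake_fetch` are hypothetical names; a real `fetch_page` would call the Yammer REST API. The sketch assumes messages come newest first and that each one carries a numeric `id`, so the loop can page backwards and stop after a fixed number of requests per run.

```python
import time
from typing import Any, Callable, Dict, List, Optional

Message = Dict[str, Any]


def sync_messages(
    fetch_page: Callable[[Optional[int]], List[Message]],
    cache: Dict[int, Message],
    max_requests: int = 10,
    pause_seconds: float = 1.0,
) -> int:
    """Page backwards through messages and upsert them into the cache.

    Stops when a page comes back empty or the request budget is used up.
    Returns the number of requests made.
    """
    older_than: Optional[int] = None
    requests_made = 0
    while requests_made < max_requests:
        page = fetch_page(older_than)
        requests_made += 1
        if not page:
            break
        for msg in page:
            # Upsert by ID: loading the same page twice creates no duplicates.
            cache[msg["id"]] = msg
        older_than = min(msg["id"] for msg in page)
        # Pause between requests to stay below the API limits.
        time.sleep(pause_seconds)
    return requests_made


def fake_fetch(older_than: Optional[int]) -> List[Message]:
    """Stand-in for the real API call: five messages, two per page."""
    ids = [5, 4, 3, 2, 1]
    remaining = [i for i in ids if older_than is None or i < older_than]
    return [{"id": i, "body": f"message {i}"} for i in remaining[:2]]
```

Calling `sync_messages(fake_fetch, cache, pause_seconds=0)` with an empty dictionary fills it with all five messages. The request budget is what gives us "more control over how we crawl": a scheduled run can fetch a little each time instead of pulling everything at once.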

So what do you think?

Is this a road worth going further down? Or is this prototype good for its initial purpose - demonstrating how you can integrate external systems into SharePoint Search - but nothing more? Maybe Yammer would pull the plug anyway.

What would be the minimum requirements to run this in a production environment? And is it possible to meet those requirements?

I'd love to hear your take on this in the comments!