Wednesday, November 20, 2013

A system for recommending similar reviews

If you have been a regular visitor to Solomon Says in the past, you may have noticed that the similar reviews section (for the uninitiated, on the right bar) was a little disappointing. The reviews shown as similar were all too often not similar at all. When they were, the similarity was either too general (all fantasy books are similar to all other fantasy books) or, more rarely, accurate by chance. All in all, the "Similar reviews" section wasn't something you could trust.

No more! Today I have put out the first version of a recommendation system that shows genuinely similar reviews so you can discover more, and more importantly, better content. I have been surfing the site all evening, and the quality of recommendations on every page looks solid.



 Now for the details. Recommendation system theories describe two kinds of systems - one that gives suggestions based on the nature of the items involved, and one that gives suggestions looking at how users interact with different items. The older recommendation generator was meant to be the second kind of system. It made the simple assumption that if a visitor goes from one review to another, the reviews were linked in some way. The nature of the link was not known, except for its existence and the assumption that it was unidirectional. So we maintained mappings of all such source-destination pairs along with a count of the number of times this transition happened. Given any review, the top n most heavily visited reviews from that page were shown as similar reviews.
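To make the old approach concrete, here is a minimal sketch of the idea in Python. It is not the code that ran on the site: the in-memory `transitions` store, the function names and the review slugs are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: source review -> counts of destination reviews.
transitions = defaultdict(Counter)


def record_visit(source, destination):
    """Count one visitor going from `source` to `destination` (one direction only)."""
    transitions[source][destination] += 1


def similar_reviews(source, n=5):
    """Return up to n reviews most often visited from `source`."""
    counts = transitions.get(source, Counter())
    return [destination for destination, _count in counts.most_common(n)]


# Made-up traffic, purely for illustration.
record_visit("earthsea", "tombs-of-atuan")
record_visit("earthsea", "tombs-of-atuan")
record_visit("earthsea", "dune")
print(similar_reviews("earthsea", n=2))  # ['tombs-of-atuan', 'dune']
```

With only a handful of visits per page, a single random click is enough to put an unrelated review in the list, which is exactly the weakness described next.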

The assumptions of this system are justifiable, but as it turns out, only at a very large scale. If the majority of the traffic is new visitors with a high bounce rate (as ours is), what happens is that visitors keep surfing to random reviews to explore the site instead of systematically exploring it. This creates all kinds of source-destination mappings, most of which do not mean anything. My assumption was that large amounts of traffic would weed out the anomalies and strengthen genuinely similar relationships (it has), but this hasn't worked well enough with my current visitor stats.

The new system falls in the first category of systems described above. The approach is simple. We tag each reviewed item with its characteristics (e.g. books might be tagged with 'epic fantasy', 'light read', 'capitalism' etc.). These tags are shared between items sharing similar characteristics. This is not too difficult since the scale is not very large and the tagging is done while creating the review itself. Some amount of discipline in creating and assigning tags suffices to maintain good item-to-tag relationships. Since an item can have arbitrarily many tags, it allows me to describe them in a fine-grained way ("Modern George R. R. Martin style fantasy" instead of just "Fantasy"). All the new system has to do for a given item is to find other items that have the maximum number of tags in common with it. These are shown in the 'Similar Reviews' section.
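The tag-overlap idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the site's actual code: the `ITEM_TAGS` mapping, the tags in it and the alphabetical tie-break are all invented for the example.

```python
# Hypothetical item -> tags mapping; the tags are made up for illustration.
ITEM_TAGS = {
    "A Wizard of Earthsea": {"fantasy", "coming of age", "earthsea"},
    "The Tombs of Atuan": {"fantasy", "earthsea"},
    "A Game of Thrones": {"fantasy", "epic fantasy"},
    "Freakonomics": {"economics"},
}


def similar_items(item, item_tags, n=5):
    """Return up to n other items, ranked by how many tags they share with `item`."""
    target = item_tags[item]
    scored = []
    for other, tags in item_tags.items():
        if other == item:
            continue
        shared = len(target & tags)  # number of tags in common
        if shared:
            scored.append((shared, other))
    # Most shared tags first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _shared, other in scored[:n]]


print(similar_items("A Wizard of Earthsea", ITEM_TAGS))
# ['The Tombs of Atuan', 'A Game of Thrones']
```

Items with no tags in common are left out entirely, so an item with very specific tags may get fewer than n suggestions rather than irrelevant ones.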

The tags themselves can be seen at the end of each review as "Related Topics" and can be clicked to see all the items they apply to. This is a further means of surfing niche corners of the website's content.


To be fair, this isn't rocket science backed by vast amounts of data such as what an Amazon or a GoodReads might run. But even a lukewarm recommendation generator is better than none, and the difference in the results shown to the user is extremely striking. Content that was hitherto invisible (because nobody ever saw or read those reviews) now appears in many places. This gives the users a powerful new avenue to explore and discover content which didn't exist before. For Solomon Says, it means (hopefully) increased user retention and engagement.

As an example, check out the recommendations for "A Wizard of Earthsea". As of this writing, two belong to the same series (which is good), and the others belong to four different fantasy series, all of which have something in common with "A Wizard of Earthsea" (which is fantastic). Some of these recommended reviews haven't received much traffic on the site. The new system increases their visibility to visitors.

Check out the new gizmo, dear reader, and drop me a line in the comments or here about what you think.

Thursday, May 16, 2013

YOU’VE BEEN……memcached!

Listen to this song. This is a great song.


That was in no way relevant to this post.

Further in pursuit of making SolomonSays faster, I have been looking into caching solutions for a while now. After going through a ton of blog posts, I decided to go with memcached as the caching back-end. I started this yesterday, and owing to extreme ease of installation and use (and my own, personal awesomeness), SolomonSays today runs on memcached.
The expected benefits are:
  1. Fewer queries being run means snappier performance. This will matter more and more as the site gains visitors because Django doesn’t support database connection pooling out of the box.
  2. A direct consequence of #1 is that the load on our MySQL data server reduces. This is pertinent because the site runs on an EC2 micro instance (free tier) and computational resources are minimal.
On the Linux production system, the process was as simple as:
  1. yum install memcached
  2. memcached -d -m 128 (to run memcached as a daemon with 128MB of memory)
  3. Configure memcached as the caching backend for Django as described here.
After that it was just a matter of analyzing what needed to be cached in the application and using the cache for it. Currently I cache popular reviews (for the right panel of most screens), the data for building the top menu, and reviews by their ids.
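Caching reviews by their ids boils down to a "look in the cache first, hit the database only on a miss" pattern. The sketch below shows that pattern with a tiny in-memory stand-in instead of a real memcached client so that it runs anywhere. `InMemoryCache`, `get_review`, `fake_db`, the key format and the 300 second timeout are all assumptions for illustration; the `get`/`set` shape simply mirrors the one Django's cache API exposes.

```python
class InMemoryCache:
    """Stand-in for a memcached-backed cache; only get/set are modelled, no expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        # A real cache would drop the entry after `timeout` seconds.
        self._data[key] = value


def get_review(review_id, cache, load_from_db):
    """Return the review from the cache, loading and storing it on a miss."""
    key = "review:%d" % review_id
    review = cache.get(key)
    if review is None:
        review = load_from_db(review_id)  # the expensive query
        cache.set(key, review, timeout=300)
    return review


db_calls = []


def fake_db(review_id):
    """Hypothetical database lookup that records how often it is called."""
    db_calls.append(review_id)
    return {"id": review_id, "title": "Example review"}


cache = InMemoryCache()
get_review(7, cache, fake_db)
get_review(7, cache, fake_db)  # served from the cache
print(len(db_calls))  # 1
```

The catch with this pattern is staleness: whenever a review is edited, its cache entry has to be overwritten or deleted, otherwise visitors keep seeing the old version until the timeout expires.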

The tricky bit was setting memcached up for my development environment which is Windows. As Zurmo.org mentions:
Memcache was designed with Linux in mind and not windows, so it has posed some installation issues because Windows users are not so familiar with having to compile code from source as memcache does not come with any installation software.
However, it all worked out in the end with the help of the link above and this.

Hopefully you are now experiencing a website which is faster than it was before.
Thoughts?  Still think it’s too slow? Feel free to drop me a line.

Wednesday, May 15, 2013

Solomon says : Need for Speed

Long time no see!!!

I have been away from SolomonSays for most of this year (the reasons for which will soon be discussed elsewhere). Over the last week or so, however, the mists have lifted and I have returned to the fun and games with a vengeance.

The speed of the website has been one of the biggest concerns for me over the last few months.  Speed tests at WebPageTest showed that the home page was taking ~11 seconds to load completely. Not cool at all! So this was the first order of business.


Two of the biggest sluggards on the site were:

1. The auto-completing search box – A jQuery UI autocomplete component which took the complete list of reviews as JSON input at page load. Basically _everything_ in the system was queried on each page request. To make matters worse, this was a blocking call (synchronous request) due to some other implementation issues. So the page loading couldn’t progress till this part was complete.

Instead of tweaking my implementation of the search, I chose to replace it with Google site search. This gives a twofold benefit:
  1. All the performance overhead described above goes away.
  2. The search functionality becomes much more powerful. The older search worked if you typed the exact name of the item in it. The new component provides full blown Google search functionality.
2. The site uses a bunch of JavaScript components and loads a whole bunch of .js and .css files. Optimization 101 says to combine them all into a single .js file and a single .css file. So this is what I finally did using django-compress. I love the simplicity of usage – just put whatever you want to compress between {% compress js/css %} and {% endcompress %} tags and voila, you are good to go. Almost entirely non-intrusive.

Not everything worked as expected (of course):

  1. The scripts being loaded from external sources like Google, Addthis etc. are not compressed and have to be loaded as before.
  2. Some of the JavaScript components like TinyMCE (used for accepting user reviews) and carouFredSel (used for the scrolling image gallery in each review) didn’t like being compressed independently of the rest of their packages. So I was obliged to keep them out of the great squeeze.
Even so, I am now serving 1 js file instead of 4 and 1 css file instead of 7.

Web page test now reports a complete load time of ~6 seconds. Hurray!!!


Saturday, December 8, 2012

Complete makeover for the home page

Eat your own dog food.

Some days ago, I posted a review for Steve Krug’s “Don’t make me think”, a brilliant and funny book on website usability and recommended that everyone designing things for the web heed it. Since I am one of them, and design has been a slow learning process for me, I paid special attention to what the book said.

Some of the advice really hit home:
  • Navigation should be obvious and intuitive.
  • What the website is about should be clear.
  • The home page is a whole other beast.
  • Minimize text.
  • Optimize workflows where there is a chance of user error.
This set me thinking about the Solomon Says homepage. Indeed, I tinkered with it even as I read the book. Together with some ideas I already had, I have released a severely updated version of the homepage today. Here’s what has changed:
  • A tagline right below the logo on left top corner stating exactly what Solomon Says is.
  • The “whys” and “hows” of Solomon Says are laid out first thing on the homepage. It breaks Steve’s “minimize text” rule, but that will have to do for now in favour of mission clarity.
  • A new display for new reviews, which scrolls through them instead of the previous static six-tiled display.
  • A blog feed is added to the main page to provide easy access to discussions and any ideas/upcoming features I am mulling over.
  • No need to explicitly type your email id when requesting reviews or providing feedback. Just log in via the super handy Janrain login widget – we’ll handle the rest.
  • Tools for following Solomon Says on social media have been moved to top right corner for easy access and to reduce clutter in the right panel.

With these changes, hopefully we have a more intuitive and informative home page which will allow readers to “get it” and use the site more effectively.

I have, however, not followed the biggest, most important bit of advice in the book. No user testing was conducted to see if the readers react as expected to the changes. This is where you, my loyal readers, come in. Give the new version a spin, and let me know how you like it.

After all, giving you what you need is what this is all about.

Cheers,
Kislay

Tuesday, December 4, 2012

Always keep a backup. Of everything.

Let me tell you a story of failures and backups and pain.

So last night I finished a bunch of changes to Solomon Says.  After the regular load of testing (that lasts 15 minutes and includes opening a bunch of pages on Firefox and Chrome), I uploaded the changes and tried to bring the server back up. Everything exploded in my face at about the same time. The only reason we are still in business is that I had backups. In decreasing order of importance, the following backups saved the day:
  1. Database
  2. Code/Configurations
  3. Images
So pretty much everything :)

At this point, a note on the deployment process is in order. Here’s how it goes:
  1. Stop python fcgi process and nginx service.
  2. Delete the production code.
  3. Run the DB migration script.
  4. Upload the entire code from my laptop to the production location.
  5. Start python and nginx
#2, #3, and #4 didn’t go too well.

#2 – My dev. environment is Windows, but production is on Linux. So there’s a bunch of stuff related to path handling (‘/’ vs ‘\’ etc.) that I change just for development. This is automatically handled in production by using a different configuration file. Alas, I ran the delete for #2 from one level higher in the directory structure. Boom goes the config. And on bringing the server up, I got a load of ‘access permission denied’ errors. I spent half an hour analyzing the arcane debug messages, then gave up and restored the entire code base file by file and change by change.

#3 – I missed selecting a couple of ‘where’ conditions when running the migration script. Result – 2 of the main tables got randomly changed. Considering how crappy the day had been so far, I realized it only on restarting the server. So I brought the server down again, restored the DB to its previous avatar from the backup, and ran the migration with extra precaution.

#4 – My development copy did not have quite a few of the images related to the newer reviews I had posted. And since I had deleted the production data in #2, the server started throwing a ‘Suspicious Operation’ exception (What the hell is that? It should have said ‘File not found’ or something). In view of the blunders I had made for #2 and #3, I assumed a mistake in the new configuration I had created and spent another hour debugging, then gave up and copied over the image folders from the backup to production.
All told, something that should have taken 15 minutes took 4 hours.
Lesson learnt. Always keep a backup. Of everything.
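One way to bake that lesson into the deployment itself is to make "take a copy" an unskippable part of the "delete and upload" step. The sketch below is only an illustration of that idea, not the script I use: the function name, the directory layout and the ".bak" naming are all hypothetical, and it covers only the code directory, not the database.

```python
import shutil
from pathlib import Path


def replace_with_backup(prod_dir, new_code_dir, backup_root):
    """Copy prod_dir to a backup, then replace it with new_code_dir.

    If copying the new code fails, the backup is restored before re-raising.
    Returns the path of the backup copy.
    """
    prod_dir = Path(prod_dir)
    new_code_dir = Path(new_code_dir)
    backup_dir = Path(backup_root) / (prod_dir.name + ".bak")

    if backup_dir.exists():
        shutil.rmtree(backup_dir)  # keep only the latest backup in this sketch
    shutil.copytree(prod_dir, backup_dir)  # back up BEFORE deleting anything

    shutil.rmtree(prod_dir)
    try:
        shutil.copytree(new_code_dir, prod_dir)
    except Exception:
        # Put the old code back so the site can be brought up again.
        if prod_dir.exists():
            shutil.rmtree(prod_dir)
        shutil.copytree(backup_dir, prod_dir)
        raise
    return backup_dir
```

Because the delete only ever touches the exact directory passed in, and only after the copy has succeeded, a slip like running the delete one level too high becomes much harder to make.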

Looking just a little bit better

A lot of the feedback that I have received on Solomon Says (a big thank you to everyone who spent time and effort providing it) has been regarding the styling and design aspect of the website. Or rather, the lack of styling and design in the website. Now, I am no designer. CSS3 and templating were not quite my fortes when I started working on it. So operating out of my ignorance of these fields, I have been forced to improve the design of the site in increments. Get something working, make it usable, and put it out there. Then improve what it looks like in the next iteration.

I'd like to share with you some of the changes I'm currently working on to the layout of the review pages. This primarily involves improving the data panel just above the text of the review. For the uninitiated, this is what it currently looks like on book and travel reviews respectively.


Both look very cramped and difficult to interact with. The huge orange rating section is sort of a waste of space, and the images don’t get due prominence (especially harmful on travel reviews). So I thought through these problems and came up with a small redesign which hopefully makes everything cleaner and easier to access. Check it out.


The new version is only slightly different from the current one but I think it lends a much more spaced-out feel to the whole page. You would also have noted that there is a small panel of image thumbnails right above the ratings section. These are the images that currently show up below the text of the review, like so:



I never really liked this design because pics are cool and everybody loves them. So I moved the images right to the top in a combination of a sliding thumbnail carousel and Fancybox. Now they are easily accessible, and clicking on the thumbnail gallery blows them up to full size too! Like this.


A lot more groovy, even if I say so myself! I am planning to roll out the changes in about two weeks after a few minor tweaks and testing.

So what do you think? Like the new look? Not quite? Let me know in the comments or drop me a line at solomonsaysindia@gmail.com. Suggestions/flowery words of praise/hate mail are all welcome.

Solomon Says at ISB

First things first – Please fill out this short survey. This will help me in assessing what I can do to make Solomon Says more exciting and useful for its users. I really, really appreciate it.

Now for the news of the week.

Solomon Says is currently the subject of a marketing project/case study in a course on Entrepreneurial Decision Making (EnDM) at ISB (Indian School of Business). The project is being conducted by Varun Jain (a very close friend of mine from my undergrad days at NSIT) of the ISB class of 2013 under Prof. Arun Pereira. Over the course of the project, I will be working with the two aforementioned gentlemen (mostly with Varun) to conduct market surveys, audience analysis and other analytical wizardry to refine SolomonSays into an even more awesome product.

Quick background on how this came about. Essentially Varun was looking for a start-up to whet his new-found marketing chops on. I was going around writing reviews and hacking away to glory with no time for reaching out to the wide world and finding a place in it. We discussed the website one day, and agreed that it could use some MBA lovin’. So starting this week with the survey mentioned above, we’ll be doing some basic scoping exercises to (hopefully) understand our audience and define our market with a lot more clarity than before. These efforts will also try to discover how readers interact with the website and what we can build into it to make that experience smooth.

I have written before that I do not have a proper business plan yet for SolomonSays. Throughout this ISB affair, my focus will continue to be on how to make this the best, most helpful reviews website on the web. No doubt there are parts of the project which demand an emphasis on revenue streams and sustainability, but those come later. Till then, the spotlight, my dear readers, is on you.

Don't forget the survey.