Notes from Web 2.0 Expo - Scaling Synchronous Web Apps: Lessons Learned from Meebo, by Sandy Jen, co-founder of Meebo.

Notes from Sandy Jen's presentation.

Sandy is a co-founder of Meebo, which describes itself as live communication for the Web. For people who don't use it or know about it, Meebo provides a Web page from which you can use instant messengers such as GTalk, Messenger, AIM, etc.

A few things were called out immediately: what works for someone else might not work for you. She seemed wary of consultants, since the team that develops the product will always know the most about it.

She defined sync vs. async and mentioned that browsers are built for asynchronous use, so building synchronous web apps is like shoving a square peg into a round hole (elaborated later). Examples of good sync apps are Facebook IM, GTalk, Meebo, etc.

The "Round Hole" consists of: multiple platforms (different browsers), spotty network connections (VPN, dialup, etc.), only 2 simultaneous open HTTP requests allowed (expected to change in the next version of browsers), page views (Meebo uses one page, so Alexa can't judge how long a user stayed), static content, and no downloads. The "Square Peg", by contrast, is instantaneous data transfer, long polling, making the browser do work, and a seamless user experience.
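To make long polling concrete, here is a minimal sketch of the server-side idea: a request blocks until there is data for the client or a timeout expires, and the client immediately polls again. This is not Meebo's implementation (their backend is C++ behind an event-based Web server); the class name LongPollChannel, its methods, and the timeout values are assumptions made up for illustration.

```python
import queue
import threading


class LongPollChannel:
    """Pending messages for one client (hypothetical helper, not Meebo code)."""

    def __init__(self):
        self._messages = queue.Queue()

    def publish(self, message):
        # Called when an event arrives for this client, e.g. a new IM.
        self._messages.put(message)

    def poll(self, timeout=30.0):
        # Hold the request open until a message arrives or the timeout expires.
        try:
            first = self._messages.get(timeout=timeout)
        except queue.Empty:
            return []  # empty response; the client is expected to poll again
        batch = [first]
        # Drain whatever else is already queued so one response carries it all.
        while True:
            try:
                batch.append(self._messages.get_nowait())
            except queue.Empty:
                return batch


if __name__ == "__main__":
    channel = LongPollChannel()
    # Simulate a message arriving while the "request" is being held open.
    threading.Timer(0.1, channel.publish, args=["hello"]).start()
    print(channel.poll(timeout=2.0))
```

Batching everything that is already queued into one response matters when the browser only allows 2 open HTTP requests: one is tied up by the held poll, so each response should carry as much as it can.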

She recommended defining which parts of an app are synchronous and how much you can get away with not being synchronous. Another recommendation was not to underestimate the server-side architecture (it will be fragile the first time it is released). It is difficult to predict user behavior. There can be different bottlenecks: memory, CPU, bandwidth, storage, disk I/O, etc.

Examples of "Square Peg" helpers: long polling (COMET); Web servers (Apache wasn't a good fit for Meebo, so a single-threaded, event-based Web server was used); compiled vs. interpreted (the Meebo backend is in C++; there were scalability issues with PHP, etc.); databases (MySQL is used with a schema that is not complicated; an option could be to store data in Amazon - start simple and worry later); memcache (a lot of the data wasn't cacheable); load balancers (normally expensive, and you have to buy them in pairs).

Simple is usually better - first ask: What am I using it for? Is my data cacheable? What am I gaining? Do I understand the technology? Can I use DNS round robin instead of load balancers (Meebo started with ww1, etc.)? FastCGI vs. Web server modules vs. PHP (Meebo didn't want to reinvent the wheel - it started out using CGI + C++, which was slow, and finally added a module to the Web server, which solved the scalability issues)? Do I need to save state, or is it persistent?

Launching feature light is not a bad thing.

Tug of war - front end vs. backend: Where does the workload make sense? The browser gets slow as more computation is pushed to it (Meebo users use IE). She suggested just picking one and asking for feedback on whether it got slower or faster compared to the previous version.

Good enough vs. perfect - Users don't care about what technology you use (or how clever you are). The product should just work. Release enough and things will move toward perfection. Don't get married to tech, but don't flirt too much.

Don't over-design for the unknown. It is hard to roll back an entire design.

Nothing simulates real life. You are not the end user, and users will behave in ways you never imagined, so contingency plans are key. Don't build flood gates, build dams - that is, make it possible to enable/disable components or change operating parameters at runtime. This applies to both front end and backend. Be very transparent with users about when a feature is released, and set appropriate expectations.
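One way to picture enabling/disabling components and changing parameters at runtime is a small table of switches that request handlers consult on every call. This is only a sketch under assumptions: RuntimeSwitches, handle_file_transfer, and the switch names are hypothetical and were not shown in the talk.

```python
import threading


class RuntimeSwitches:
    """Thread-safe on/off flags and tunable parameters (hypothetical helper)."""

    def __init__(self, defaults):
        self._lock = threading.Lock()
        self._values = dict(defaults)

    def set(self, name, value):
        with self._lock:
            # Reject unknown names so a typo cannot silently create a new switch.
            if name not in self._values:
                raise KeyError(name)
            self._values[name] = value

    def get(self, name):
        with self._lock:
            return self._values[name]


def handle_file_transfer(switches):
    # The handler checks the switch on each request, so flipping it takes
    # effect immediately, without a restart or redeploy.
    if not switches.get("file_transfer_enabled"):
        return "File transfer is temporarily unavailable."
    return "transfer started"


if __name__ == "__main__":
    switches = RuntimeSwitches(
        {"file_transfer_enabled": True, "poll_timeout_seconds": 30}
    )
    print(handle_file_transfer(switches))
    switches.set("file_transfer_enabled", False)  # operator disables the feature
    print(handle_file_transfer(switches))
```

In a real deployment the set call would sit behind some admin interface, and the disabled path is a natural place to show users a clear message, in keeping with the point about transparency.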

Use your own product - Don't be afraid to find bugs, and keep your finger on the pulse of the community. You are scaling customer service, not just hardware and software.

Be aware of what is going on - Monitor key areas (have enough hooks in the backend) with tools like Zabbix, Nagios, Hyperic, or Ganglia. Don't ignore your alerts; ignoring what your system tells you is extremely dangerous.
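As an illustration of a backend "hook", the sketch below counts events in a sliding time window and reports when a threshold is exceeded. It does not use the API of Zabbix, Nagios, Hyperic, or Ganglia; RateMonitor and its parameters are assumptions, standing in for a value that such a tool could be configured to check.

```python
import time
from collections import deque


class RateMonitor:
    """Counts events in a sliding window and flags when a threshold is exceeded."""

    def __init__(self, window_seconds, threshold, clock=time.monotonic):
        self._window = window_seconds
        self._threshold = threshold
        self._clock = clock  # injectable so the monitor can be tested
        self._events = deque()

    def _trim(self, now):
        # Drop timestamps that have fallen out of the window.
        while self._events and now - self._events[0] > self._window:
            self._events.popleft()

    def record(self):
        now = self._clock()
        self._events.append(now)
        self._trim(now)

    def count(self):
        self._trim(self._clock())
        return len(self._events)

    def alerting(self):
        # True while more than `threshold` events fall inside the window.
        return self.count() > self._threshold


if __name__ == "__main__":
    # Hypothetical use: alert on more than 100 failed logins per minute.
    failed_logins = RateMonitor(window_seconds=60, threshold=100)
    failed_logins.record()
    print(failed_logins.count(), failed_logins.alerting())
```

The threshold is the part that needs tuning: an alert that fires constantly trains people to ignore it, which is exactly the danger described above.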

There is no magic solution to scalability!