Mapping the Skype experience with graph analytics and machine learning

(Amrita Ray, Visha Chadha and Vinod Shanbhag all working from our Microsoft Sunnyvale office)

There are many ways to communicate today: apps, social networks, phone calls, texts and more. Skype is one of these communication tools, with a strong global network of millions of users. Yet a tool is only useful if people are using it. If there is no one to chat with, people may leave that tool or service. Our data science team mapped this network effect to keep users engaged and avoid large-scale churn. We developed a series of algorithms and models to measure the Skype network through machine learning and graph analytics. The following picture is a simple high-level overview of our work:

Graph 1

Identifying the Problem

Before we could improve the user experience, we had to test our theory that the network effect, combined with engagement at appropriate intervals, prevents major drop-off. To do this, we applied graph theory concepts to identify key user sub-networks and measured how they changed over time. Using machine learning models, we were then able to predict which users were most disengaged.
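To make the idea of sub-networks and their changes over time concrete, here is a minimal sketch in plain Python. It is an illustration only, not the pipeline described in this post: the function names, the toy contact pairs and the choice of connected components as the definition of a "sub-network" are all assumptions.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency map from (user_a, user_b) contact pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected_components(graph):
    """Return the sub-networks (connected components) as a list of sets."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components

def degree_change(graph_before, graph_after):
    """Change in contact count per user between two time windows."""
    users = set(graph_before) | set(graph_after)
    return {
        u: len(graph_after.get(u, ())) - len(graph_before.get(u, ()))
        for u in users
    }

# Hypothetical toy data: contact pairs observed in two consecutive windows.
week1 = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]
week2 = [("u1", "u2"), ("u4", "u5"), ("u5", "u6")]

g1, g2 = build_graph(week1), build_graph(week2)
components_week1 = connected_components(g1)
# Users whose number of contacts dropped between the two windows.
shrinking_users = sorted(u for u, d in degree_change(g1, g2).items() if d < 0)
```

A user whose contact count shrinks from one window to the next is the kind of signal that could feed a disengagement model as a feature; which features the real models used is not stated here.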

We built a PII-safe dynamic communication network graph to map these insights, and created APIs for outputs from the graph and the machine learning model. Our analytics, statistical analysis and machine learning showed a correlation between user retention and a strong network. This meant that people with smaller, more personal networks tended to have a larger impact when leaving the platform: once they left, the rest of their network typically followed. To test this thesis, we used data queries on Microsoft’s big data MapReduce system, Stanford’s open source graph software, KPIs for network health and statistical analysis. We tracked attrition of these users against those with larger networks, and our model generated the following retention graph:
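The post does not say how the graph was made PII-safe. One common approach, shown here purely as an assumption, is to replace raw identifiers with a keyed hash before any edges are stored, so the graph structure survives but the identities do not. The key, the sample addresses and the function names below are hypothetical.

```python
import hashlib
import hmac

def pseudonymize(user_id: str, secret_key: bytes) -> str:
    """Map a raw user identifier to a stable keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

def pseudonymize_edges(edges, secret_key):
    """Replace both endpoints of every contact pair with their pseudonyms."""
    return [
        (pseudonymize(a, secret_key), pseudonymize(b, secret_key))
        for a, b in edges
    ]

# Hypothetical key and data, for illustration only.
SECRET_KEY = b"example-key-not-for-production"
raw_edges = [
    ("alice@example.com", "bob@example.com"),
    ("bob@example.com", "carol@example.com"),
]
safe_edges = pseudonymize_edges(raw_edges, SECRET_KEY)
```

Because the same identifier always maps to the same pseudonym under a given key, edges that share a user still connect in the pseudonymized graph.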

Graph 2 (Due to the network effect, the loss of users with small networks is significantly higher than that of users with larger networks)
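A comparison like the one in the graph can be sketched by bucketing users by network size and computing a retention rate per bucket. The threshold of five contacts, the bucket names and the toy records below are hypothetical and do not come from the Skype data.

```python
def retention_by_network_size(users, small_max=5):
    """Compute the retention rate for small and large networks.

    users: iterable of (contact_count, retained) pairs.
    small_max: hypothetical cut-off; counts at or below it are "small".
    """
    totals = {"small": [0, 0], "large": [0, 0]}  # bucket -> [users, retained]
    for contact_count, retained in users:
        bucket = "small" if contact_count <= small_max else "large"
        totals[bucket][0] += 1
        totals[bucket][1] += int(retained)
    return {
        bucket: (kept / n if n else None)
        for bucket, (n, kept) in totals.items()
    }

# Hypothetical toy records: (number of contacts, still active after the period).
toy_users = [
    (2, False), (3, True), (1, False), (4, False),
    (12, True), (30, True), (8, False), (15, True),
]
rates = retention_by_network_size(toy_users)
```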

Solving the Problem

Once we had proved our theory, we identified how the business could create a better overall experience to retain networks. We translated what we learned into multiple analytical questions and hypotheses, then used them to map out a project plan with a clear road map, the best methods of engagement and applications of our data set. The work involved many teams, spanning product managers, data engineers and leadership. Sharing our insights with them and setting quarterly engagement goals helped build our own internal network dedicated to solving channel drop-off.

Aligning closely with project managers and engineers was particularly effective for our team. It helped us connect product impact and business metrics with our data science. Working together across all these teams eased some of the workload; even so, over the course of this project our local team grew from one to four to manage the multiple conversations and data sets.

(Our Sunnyvale team working on Stanford’s open source graph software showed correlation between a user’s engaged Skype community and stronger retention.)

Measuring the Impact

Through in-app and email engagement campaigns at critical junctures for user drop-off, we saw 8% of Skype users re-engage with the platform week after week, and network adoption increased by 5% over the previous year.

We also used the data and insights to bring back inactive users with the help of their network. Through regular contact with active friends, the network effect eventually brought these users back to the platform. The success of the campaign was measured by an increased click-through rate (CTR) of 29% and significantly higher retention of the inactive users, 24%, with statistical significance (p-value < 0.05). This directly correlated with a positive impact on monthly active users.
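The post does not name the statistical test behind the p-value. One standard way to compare retention between a campaign group and a control group is a two-proportion z-test, sketched below with the standard library only. The group sizes and retained counts are hypothetical, not the campaign's actual numbers.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns (z, p_value) using the pooled standard error.
    """
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: retained users out of those contacted vs. a control group.
z, p_value = two_proportion_z_test(600, 5000, 500, 5000)
is_significant = p_value < 0.05
```

With large groups the normal approximation is reasonable; for small samples an exact test would be a safer choice.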

If this project interests you, feel free to check out open positions on our team: https://aka.ms/MicrosoftBayAreaCareers. Many thanks to my product manager and engineering leads in Sunnyvale, Visha Chadha and Vinod Shanbhag, and to the many other data science, project management and engineering teams that helped in the execution. Remote contributors to our data science efforts were led by Avleen Bijral, Xin Deng and Alexandre Matos Martins.