Virality Prediction and Community Structure

A companion webpage to the paper by Lilian Weng, Filippo Menczer, and Yong-Yeol Ahn

"Can we understand and predict virality of memes by leveraging network structure?"

Paper online | Supplementary

Download: Slides | Dataset (Download | Readme)

Cite as: Lilian Weng, Filippo Menczer, and Yong-Yeol Ahn. Virality Prediction and Community Structure in Social Networks. Nature Scientific Reports, 3:2522, 2013.
      

Visualization of Gangnam Style



Introduction

We live in an attention economy. Corporations and political campaigns spend enormous resources competing for people's limited attention so that their messages and products go viral. Can we understand why certain things go viral? And can we predict what will go viral?

We propose that network communities allow us to predict viral memes. By analyzing Twitter hashtags, we show that:

  1. Communities allow us to estimate how much the spreading pattern of a meme deviates from that of infectious diseases;
  2. Viral memes tend to spread like epidemics;
  3. We can predict the virality of memes based on early spreading patterns in terms of community structure. In fact, we can predict the success of a meme after 4 weeks by looking at only its first 50 tweets.


The video Gangnam Style was uploaded to YouTube in July 2012 and collected over 1.4 billion views within 8 months, becoming the most viewed video in YouTube's history. With a world population of only about 7 billion, that amounts to one view for every five people on Earth. Could we have predicted this tremendous success at the very beginning?


Why are communities important?

Simple versus Complex Contagion

If something spreads like a complex contagion, communities enhance spreading within them and keep it from leaving them. This trapping effect of communities on information spread is boosted collectively by structural trapping, social reinforcement, and homophily.

Structural trapping

Communities cripple the global spread because they act as traps for random flows.

Social reinforcement

Complex contagions are sensitive to social reinforcement. A few concentrated adoptions inside highly clustered communities can induce many multiple exposures.
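
To make the contrast concrete, here is a minimal sketch rather than the model used in the paper. It assumes a toy network of two fully connected communities joined by a single bridge edge, and a deterministic threshold rule in which a node adopts once a given number of its neighbors have adopted. The helper names `two_cliques` and `spread` are hypothetical.

```python
from itertools import combinations


def two_cliques(size):
    """Build two cliques of `size` nodes joined by a single bridge edge."""
    graph = {node: set() for node in range(2 * size)}
    for block in (range(size), range(size, 2 * size)):
        for u, v in combinations(block, 2):
            graph[u].add(v)
            graph[v].add(u)
    # One bridge between the two communities: node size-1 <-> node size.
    graph[size - 1].add(size)
    graph[size].add(size - 1)
    return graph


def spread(graph, seeds, threshold):
    """Deterministic threshold contagion (an illustrative assumption).

    A node adopts once at least `threshold` of its neighbors have adopted.
    Returns the final set of adopters.
    """
    adopted = set(seeds)
    while True:
        newly = {
            node
            for node in graph
            if node not in adopted
            and len(graph[node] & adopted) >= threshold
        }
        if not newly:
            return adopted
        adopted |= newly


graph = two_cliques(5)
seeds = {0, 1}  # two early adopters inside the first community

simple = spread(graph, seeds, threshold=1)    # a single exposure is enough
complex_ = spread(graph, seeds, threshold=2)  # repeated exposure is required

print(len(simple), len(complex_))
```

With a threshold of one exposure the toy meme crosses the bridge and reaches every node. With a threshold of two it saturates the first community but never leaves it, because the node on the far side of the bridge only ever has one adopting neighbor.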

Homophily

Communities capture homophily, as people sharing similar characteristics naturally establish more edges among themselves. Thus, we expect similar tastes among community members, making people more susceptible to memes from peers in the same community.


Viral memes spread like diseases!

Quantifying concentration
Because of the strong trapping effect of communities, we expect communication and meme adoption to be more concentrated within communities if the meme behaves like a complex contagion. To quantify the concentration of a meme in communities, we introduce four baseline models, each incorporating different aspects of community trapping. Two measurements, dominance and entropy, are designed to gauge the strength of meme concentration. The higher the dominance or the lower the entropy, the stronger the concentration of the meme.
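
As a rough illustration of the two measurements, the sketch below makes two assumptions that this page does not spell out: dominance is taken to be the share of a meme's tweets that fall in its single most active community, and entropy is taken to be the Shannon entropy of the tweet distribution across communities. The function names and the toy community labels are hypothetical.

```python
import math
from collections import Counter


def dominance(community_labels):
    """Share of tweets in the most active community (assumed definition)."""
    if not community_labels:
        raise ValueError("need at least one tweet")
    counts = Counter(community_labels)
    return max(counts.values()) / len(community_labels)


def entropy(community_labels):
    """Shannon entropy in bits of tweets across communities (assumed definition)."""
    if not community_labels:
        raise ValueError("need at least one tweet")
    total = len(community_labels)
    shares = [count / total for count in Counter(community_labels).values()]
    return sum(p * math.log2(1 / p) for p in shares)


# Toy data: the community each of a meme's 10 tweets was posted from.
concentrated = ["A"] * 9 + ["B"]
dispersed = ["A", "B", "C", "D", "E"] * 2

for name, labels in (("concentrated", concentrated), ("dispersed", dispersed)):
    print(name, round(dominance(labels), 3), round(entropy(labels), 3))
```

Under these assumed definitions the concentrated toy meme has high dominance and low entropy, while the dispersed one has low dominance and high entropy, matching the direction described above.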

Do all memes behave like complex contagions?

No! While the majority of memes are not viral, viral memes behave differently. Their concentration in the empirical data matches that of the simple cascading model (see the gray areas). Community structure does not seem to trap successful memes as much as others. These memes behave like simple contagions, permeating through many communities.

Quantifying the strength of social reinforcement

To further distinguish viral memes from others by type of contagion, we explicitly estimate the strength of social reinforcement: for a given meme, we measure the average number of exposures that each adopter experienced before adopting it.

Viral memes require as little reinforcement as the simple cascading model, while non-viral memes need as many exposures as models that consider social reinforcement or homophily. We arrive at the same conclusion: viral memes behave like simple contagions rather than complex ones.
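
The exposure measurement can be sketched as follows, under simplifying assumptions. We assume a hypothetical `following` mapping from each user to the set of users they follow, and an adoption list in chronological order in which each user appears once. An exposure is counted here as one earlier adopter among the accounts a user follows, which may differ from the exact counting used in the paper.

```python
def average_exposures(following, adoption_order):
    """Average number of earlier adopters each adopter was following.

    `following` maps a user to the set of users they follow.
    `adoption_order` lists adopters chronologically, each user once.
    Both structures are assumptions made for this sketch.
    """
    if not adoption_order:
        raise ValueError("need at least one adopter")
    adopted = set()
    exposures = []
    for user in adoption_order:
        # Followees who had already adopted when this user adopted.
        exposures.append(len(following.get(user, set()) & adopted))
        adopted.add(user)
    return sum(exposures) / len(exposures)


# Toy example: a starts the meme, b follows a, c follows a and b, d follows c.
following = {"b": {"a"}, "c": {"a", "b"}, "d": {"c"}}
order = ["a", "b", "c", "d"]
print(average_exposures(following, order))
```

In this toy example the four adopters saw 0, 1, 2, and 1 earlier adopters respectively, so the average is one exposure per adopter.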


Predict which memes can go viral

The above findings suggest an intriguing possibility: a high concentration of a meme hints that the meme is interesting only to certain communities, while a weak concentration implies universal appeal. Concentration might therefore be used to predict the virality of the meme.

The following figure illustrates how the diffusion pattern of a viral meme differs from that of a non-viral one when analyzed through the lens of community concentration.

We then apply a machine learning algorithm, random forests, to predict which memes will go viral based on the first 50 messages of each meme. The community-based features used in the classifier significantly improve the prediction outcomes. In our experiments, we are able to detect memes that are viral after 4 weeks using only knowledge of the first 50 tweets of each meme. For example, when virality is estimated by the number of meme adopters, 60-70% of our predictions are accurate and 50-70% of the actually viral memes are reported.
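
The two percentages above appear to correspond to what is usually called precision (the fraction of memes predicted viral that really are viral) and recall (the fraction of viral memes that the classifier reports). The sketch below shows how these two quantities are computed. The labels are made-up toy values, not results from our experiments, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for boolean 'viral' labels.

    `predicted` and `actual` are equal-length sequences of booleans.
    """
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    true_positives = sum(1 for p, a in zip(predicted, actual) if p and a)
    predicted_positives = sum(1 for p in predicted if p)
    actual_positives = sum(1 for a in actual if a)
    precision = true_positives / predicted_positives if predicted_positives else 0.0
    recall = true_positives / actual_positives if actual_positives else 0.0
    return precision, recall


# Toy labels for 8 memes (illustrative only).
predicted = [True, True, True, False, False, False, False, False]
actual = [True, True, False, True, False, False, False, False]

precision, recall = precision_recall(predicted, actual)
print(round(precision, 2), round(recall, 2))
```

Here two of the three memes predicted viral really are viral, and two of the three viral memes are found, so both precision and recall are two thirds.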


Any applications based on the findings?

We see great potential for applications in social media marketing: social networks could give their users better advice as to which posts are likely to yield the best advertising return on investment (ROI).

Our method is easy to play with. It exploits only network structure, without needing access to message content, and can be easily applied to any socio-technical network from a small sample of data.

We believe that many other complex dynamics of human society, from ethnic tension to global conflicts, and from grassroots social movements to political campaigns, could be better understood by continued investigation of network structure.