Date Published: July 3, 2017
Publisher: Public Library of Science
Author(s): Yingying Xu, Zhixin Liu, Jichang Zhao, Chiwei Su, Wei-Xing Zhou.
This study provides new insights into the relationships between social media sentiments and the stock market in China. Using machine learning, we classify microblogs posted on Sina Weibo, a Chinese counterpart of Twitter, into five detailed sentiments: anger, disgust, fear, joy, and sadness. Using wavelet analysis, we find close positive linkages between sentiments and the stock return, and these linkages vary with both frequency and time. The five detailed sentiments are positively related to the stock return for certain periods, particularly since October 2014 at medium to high frequencies (periods of less than ten trading days), when the stock return is undergoing significant fluctuations. Sadness appears to have a closer relationship with the stock return than the other four sentiments. As to the lead-lag relationships, the stock return causes Weibo sentiments rather than the reverse for most of the periods with significant linkages. Compared with polarity sentiments (negative vs. positive), detailed sentiments provide more information regarding relationships between Weibo sentiments and the stock market. The stock market exerts positive effects on the bullishness and agreement of microblogs. Meanwhile, agreement leads the stock return in-phase at a frequency corresponding to approximately 40 trading days, indicating that less disagreement improves certainty about the stock market.
Early research on stock market prices, e.g., random walk theory and the Efficient Market Hypothesis (EMH), suggests that stock prices are driven by new information that is unpredictable [1, 2]. Later studies, however, present contrary evidence that stock market prices can indeed be predicted to a certain degree, casting doubt on the EMH. Furthermore, the emotions and mood of financial agents affect investment decisions [4, 5], thus resulting in changes in stock markets. A growing body of research suggests that early indicators can be extracted from online social media (e.g., blogs, Twitter feeds) to predict changes in the stock market. In contrast, other work finds that sentiments regarding stock markets have no discernible effect on stock prices.
Over the past decade, significant progress has been made in sentiment tracking techniques that extract indicators of public mood directly from social media content, such as Internet stock message boards like Yahoo! Finance and, in particular, large-scale Twitter feeds and microblogs [18, 19]. The first study to investigate the relationship between Internet stock message boards and stock market activities uses a sample of over 3,000 stocks listed on Yahoo! message boards. It finds that changes in daily posting volume are associated with earnings-announcement events and with daily changes in stock trading volume and returns. Consequently, the volume of messages predicts changes in next-day stock trading volume and returns. Another study examines the relationship between Internet message board activity and abnormal stock returns and trading volume, based on the RagingBull.com discussion forum. It reports that changes in investor opinions correlate with abnormal industry-adjusted returns and abnormally high trading volume on days with unusually high message activity. However, no such linkage is found between message board activity and industry-adjusted returns or trading volume, which is consistent with the EMH. A further study categorizes stock recommendations (buy, sell, and neutral) through a database under Google’s ownership and suggests that these recommendations are overwhelmingly positive. Nevertheless, no evidence is found that the recommendations contain new information, indicating no significant linkage between Internet stock messages and stock market activities. By contrast, other work establishes that stock microblog sentiments do have predictive power for simple and market-adjusted returns, using microblog postings from StockTwits and Yahoo! Finance, respectively.
We collected 157,351 stock-related microblogs, with 12 to 3,292 postings per day; this represents an average of 1,299 posts per trading day with a standard deviation of 623 messages. Using the classification method explained previously, we classify microblogs into five categories, i.e., anger, disgust, fear, joy, and sadness. Furthermore, as a comparison, we count posts with the sentiment of joy as positive microblogs and the remaining posts as negative microblogs for each day of the time series. As suggested in the literature, the natural starting point is to test whether the level of microblogs or their bullishness affects stock markets. The second issue is whether greater disagreement is associated with the stock market. Financial theory provides two distinct perspectives on disagreement. Whereas some authors hold the hypothesis that disagreement induces trading, the “no-trade theorem” argues that disagreement leads to a revision of stock prices. Following the literature [6, 11, 57], we define the proxy of Bullishness for each day as:
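The displayed formula does not appear in this text. As a hedged reconstruction, assuming the logarithmic form commonly used in the message-board literature that this passage cites (an assumption, not confirmed by the text itself), it can be written as:

$$
\mathrm{Bullishness}_t = \ln\!\left(\frac{1 + M_t^{\mathrm{Positive}}}{1 + M_t^{\mathrm{Negative}}}\right)
$$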
where $M_t^{\mathrm{Positive}}$ and $M_t^{\mathrm{Negative}}$ denote the numbers of positive and negative microblogs on a particular day $t$. The index of Bullishness measures the share of surplus positive signals and gives more weight to a greater number of microblogs in a specific sentiment (positive or negative). The agreement among positive and negative microblogs is represented by:
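The displayed formula does not appear in this text. A hedged reconstruction, assuming the standard agreement index from the same literature and chosen so that the index equals one when all microblogs share a sentiment, is:

$$
\mathrm{Agreement}_t = 1 - \sqrt{1 - \left(\frac{M_t^{\mathrm{Positive}} - M_t^{\mathrm{Negative}}}{M_t^{\mathrm{Positive}} + M_t^{\mathrm{Negative}}}\right)^{2}}
$$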
The value of $\mathrm{Agreement}_t$ is greater if more agents share the same sentiment. If all microblogs regarding the stock market are of the same sentiment, Agreement equals one. The Shanghai Composite Index (SHCI) reflects the average stock price of the Shanghai stock market. Thus, we use the stock return based on the closing price of the SHCI as the index of the stock market. The data are available in the Wind database.
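To make the two indices concrete, the following is a minimal Python sketch of how daily polarity counts, Bullishness, and Agreement could be computed from classified microblogs. It is not the authors' code. The functional forms, ln((1 + positive) / (1 + negative)) for Bullishness and 1 - sqrt(1 - ((positive - negative) / (positive + negative))^2) for Agreement, are assumptions taken from the standard message-board literature, and the function names and sample labels are hypothetical.

```python
import math

# Per the text: joy counts as positive, the other four sentiments as negative.
POSITIVE_LABELS = {"joy"}


def polarity_counts(labels):
    """Return (positive, negative) counts for one day's sentiment labels."""
    positive = sum(1 for label in labels if label in POSITIVE_LABELS)
    return positive, len(labels) - positive


def bullishness(positive, negative):
    """Assumed form: ln((1 + positive) / (1 + negative))."""
    return math.log((1 + positive) / (1 + negative))


def agreement(positive, negative):
    """Assumed form: 1 - sqrt(1 - share^2), share = (pos - neg) / (pos + neg)."""
    total = positive + negative
    if total == 0:
        raise ValueError("no microblogs on this day")
    share = (positive - negative) / total
    # max() guards against a tiny negative value from floating-point rounding.
    return 1 - math.sqrt(max(0.0, 1 - share ** 2))


# Hypothetical labels for a single trading day.
day = ["joy", "joy", "joy", "anger", "sadness", "fear"]
pos, neg = polarity_counts(day)
print(pos, neg, bullishness(pos, neg), agreement(pos, neg))  # 3 3 0.0 0.0
```

With an evenly split day both indices are zero. A day on which every post carries the same polarity gives an Agreement of one, matching the property stated above.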
This article classifies microblogs posted on Sina Weibo into five detailed sentiments (anger, disgust, fear, joy, and sadness), two polarity sentiments (positive vs. negative), and two indices representing bullishness and disagreement regarding stock markets (Bullishness and Agreement). Then, we assess the relationships between Weibo sentiments and the stock market in China through wavelet analysis, which is effective in capturing both frequency and time-varying features within a unified framework. There are contradictory results in the literature regarding linkages between sentiments extracted from social media and stock markets. Considering that market participants in emerging markets are more likely to be affected by psychology and sentimentality than those in developed markets, potential relationships between sentiments of investors and stock markets may be found in China. A time-frequency view of relationships between Weibo sentiments and the Chinese stock market over seven months suggests that the linkage is not always significant, particularly at high frequencies. Our findings indicate that relationships between microblog sentiments and stock returns have both frequency and time-varying features. On the one hand, the linkage between sentiments and the stock return appears to be stronger at medium-term horizons of around ten trading days. Compared with the other four sentiments, sadness has a closer relationship with the stock market. Polarity sentiments have certain significant relationships with the stock market, but these are less persistent than those of detailed sentiments. On the other hand, the agreement of investors is a leading indicator of the stock market over the long run. However, regarding the lead-lag relationship, it is the stock market that leads detailed and polarity sentiments rather than the reverse. The relationship between bullishness and the stock market is not particularly significant.
According to the results, Chinese investors’ sentiments are likely to be affected by domestic and peripheral market trends and abnormal events. Meanwhile, social media sentiments exert certain effects on Chinese market trends.