Getting statistical about emotion: using machine learning methods to predict the emotional trajectories of literary texts

    Research output: Contribution to conference (Abstract)

    Forthcoming

    In recent years, interest in applying statistical methods to problems across many areas of study has grown. This has led to a boom in the development of data-driven solutions to a range of commercial and academic problems in domains as diverse as business analytics, neuroscience, healthcare and social media analysis. The field of sentiment analysis (i.e. the task of “automatically determining valence, emotions, and other affectual states from text” (Mohammed, 2016)) has begun to answer the question of how we can evaluate the emotional content of text, particularly in commercial domains and social media. This work has obvious applications for companies that want to engage with consumer opinions of their products or services. However, while there is a rich literature on tracking sentiment and emotion in these domains, modelling the emotional trajectory of longer narratives, such as literary texts, poses new challenges. Previous work in sentiment analysis has focused on using information from within a sentence to predict a valence value for that sentence. We instead explore the influence of previous sentences on the sentiment of a given sentence in context, investigating whether information present in a history of previous sentences can be used to predict a valence value for the sentence that follows. We evaluate both linear and non-linear machine learning methods and a range of feature combinations. We also compare context histories of different sizes to determine what range of previous sentences is most informative for our models. We establish a linear relationship between sentence context history and the valence value of the current sentence, and demonstrate that sentences closer to the target sentence are more informative. We show that the inclusion of semantic word embeddings enriches our model predictions.
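    The context-history idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the features, models or data used, so everything here is an assumption made for illustration: the only features are the valence values of the k previous sentences, the model is ordinary least squares, and the names `make_history_windows`, `fit_linear` and `predict` are hypothetical. Word embeddings and non-linear models are not covered.

```python
from typing import List, Sequence, Tuple


def make_history_windows(
    valences: Sequence[float], k: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each sentence's valence with the valences of the k sentences before it.

    Hypothetical helper: the abstract does not say how context features are built.
    """
    X, y = [], []
    for t in range(k, len(valences)):
        # Most recent sentence first: column 0 is sentence t-1, column k-1 is sentence t-k.
        X.append([valences[t - j] for j in range(1, k + 1)])
        y.append(valences[t])
    return X, y


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept, solved via the normal equations.

    Returns [intercept, w_1, ..., w_k]. Assumed stand-in for a "linear method".
    """
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * target for r, target in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][n] for i in range(n)]


def predict(weights: Sequence[float], history: Sequence[float]) -> float:
    """Predict the next valence from a history ordered most recent first."""
    return weights[0] + sum(w * h for w, h in zip(weights[1:], history))


if __name__ == "__main__":
    # Made-up per-sentence valence scores, for illustration only.
    scores = [0.2, 0.5, 0.1, -0.3, -0.6, -0.2, 0.4, 0.7, 0.3, -0.1]
    X, y = make_history_windows(scores, k=2)
    weights = fit_linear(X, y)
    print(weights, predict(weights, [scores[-1], scores[-2]]))
```

    Under these assumptions, comparing the magnitudes of the fitted weights across history positions, and refitting with different values of k, is one simple way to ask which previous sentences carry the most information about the target sentence.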
    Original language: English
    Number of pages: 1
    Publication status: Accepted - 21 May 2019
    Event: International Conference of the Royal Statistical Society (RSS 2019) - Belfast, United Kingdom
    Duration: 02 Sep 2019 to 05 Sep 2019
    https://events.rss.org.uk/rss/frontend/reg/thome.csp?pageID=83705&ef_sel_menu=1647&eventID=270

    Conference

    Conference: International Conference of the Royal Statistical Society (RSS 2019)
    Abbreviated title: RSS 2019
    Country: United Kingdom
    City: Belfast
    Period: 02/09/2019 to 05/09/2019

    ID: 168775853