A1 Journal article (refereed), original research

A Personality Mining System for German Twitter Posts with Global Vectors Word Embedding

Open Access publication

Publication Details

Authors: Usselmann Henning, Ahmad Rangina, Siemon Dominik

Publisher: Institute of Electrical and Electronics Engineers (IEEE): OAJ / IEEE

Publication year: 2021

Language: English

eISSN: 2169-3536

JUFO level of this publication: 2

Digital Object Identifier (DOI): http://dx.doi.org/10.1109/ACCESS.2021.3130937

Social media address: https://www.researchgate.net/publication/356542894_A_Personality_Mining_System_for_German_Twitter_Posts_with_Global_Vectors_Word_Embedding


Personality influences people's behaviors, attitudes, beliefs, and feelings, so many scientific studies already benefit from easy ways of measuring it. Big Five personality traits can be derived by analyzing a person's written text. One approach is to apply the unsupervised learning algorithm Global Vectors Word Embedding (or Representation), abbreviated GloVe, to English Twitter posts. The overall objective of our research is to show that this algorithm can also be applied to German Twitter posts. To that end, we built a framework for training and applying machine learning models for personality prediction. We tested whether a working prediction model for English Twitter users can be adapted for German users, which could reduce the effort of collecting training data. We evaluated our models against a personality survey with a sample of German users. Adapting an existing model does not perform as well as expected, but it helps prepare the framework for higher volumes of data. In the end, the final model is based on the evaluation data, which results in acceptable performance. Via a web application (https://www.miping.de), anyone can easily retrieve personality scores for any public German Twitter user. Altogether, we show that GloVe is suitable for predicting personality from German-language text. The published framework and source code allow for independent improvements to, and easy application of, the trained model. Scientific studies and other applications, e.g., chatbots, could now easily incorporate personality data.

Last updated on 2021-09-12 at 09:13