
E-Democracy Project Progress Updates and News

E-Democracy General Description

To fulfil the general goals of eDemocracy described on the project main page, the first step is to collect relevant and promising data from social media about the German elections in 2017 and then run further analytics on the collected datasets. Data collection consists of streaming from Twitter as well as retrieving posts from Facebook. To this end, we prepared keyword lists related to the German election and configured our streaming processes based on them. There are three types of lists: organizations, politicians and selectors. The "organization list", as its name suggests, contains the screen names of political organizations and associations as well as newspapers, TV channels and other organizations related to politics. The "selector list" consists of keywords such as political parties, key political terms and topics associated with the German election. The "politicians list" contains politicians such as German election candidates. Because the project deals with large volumes of data, a NoSQL engine (MongoDB) is used to store the retrieved data. On this page, we provide periodic progress reports with facts and statistics about our dataset.
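As an illustration of how the three lists could be applied to an incoming tweet, here is a minimal Python sketch. It is not the project's actual streaming code. It assumes that the organization and politician lists match on the author's screen name and that selectors match on the tweet text. The selector keywords, the tweet layout (`user.screen_name`, `text`) and the function name `matching_lists` are hypothetical; the screen names are short excerpts taken from the tables on this page.

```python
# Short excerpts standing in for the full lists (lowercased for matching).
ORGANIZATIONS = {"faz_net", "mdraktuell"}
POLITICIANS = {"anked", "kahrs"}
SELECTORS = {"bundestagswahl", "btw17"}  # hypothetical selector keywords


def matching_lists(tweet):
    """Return the names of all lists that a tweet matches."""
    matches = set()
    author = tweet["user"]["screen_name"].lower()
    if author in ORGANIZATIONS:
        matches.add("organizations")
    if author in POLITICIANS:
        matches.add("politicians")
    # Selectors are matched as case-insensitive substrings of the text.
    text = tweet["text"].lower()
    if any(keyword in text for keyword in SELECTORS):
        matches.add("selectors")
    return matches


example = {"user": {"screen_name": "anked"}, "text": "Heute Abend: #BTW17"}
print(matching_lists(example))
```

A tweet can match more than one list, which is why the function returns a set rather than a single label.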

Progress Report, Reported on Friday, 05.09.2017

The following plot summarizes the change in the total number of collected tweets over the last three reporting periods.

reporting_periods.png

The first reporting period is the time between the release of the first report and the release of the second report. Similarly, the second reporting period runs from 30th August to 5th September. For ease and fairness of comparison, we considered only the last 6 days of the first reporting period, since it is longer than the second. As the plot shows, the volume streamed through the selectors list grows very fast as we get closer to the election date. This also suggests that our selectors list is in line with the most discussed topics in the media. Furthermore, the digital inclusion and social activity of organizations grow faster than those of politicians. Our collected dataset has reached 155.878 GB in size.

The following list shows the ten least active politicians and organizations on Twitter. Since the ten listed politicians hardly ever use Twitter, and we have seen just 1 posted tweet during one reporting period, the list is not significant enough to yield interesting results, so we will not track changes in it in future reports.

 

Rank | Politician: Real Name | Screen Name | Status Count | Followers Count | Organization: Real Name | Screen Name | Status Count | Followers Count
---- | --------------------- | ----------- | ------------ | --------------- | ----------------------- | ----------- | ------------ | ---------------
1 | Jonas Geissler | GeisslerJonas | 1 | 46 | AfD Saarland | AfDSaar | 35 | 177
2 | Meryem Celikkol | meryem_celikkol | 2 | 6 | thügida/wls | thuegidawls | 89 | 72
3 | Walter Kubach | walterkubach | 2 | 4 | DIE LINKE HESSEN | DIELINKEHESSEN | 126 | 1,440
4 | Sandra Weeser | WeeserSandra | 2 | 10 | DIE LINKE. M-V | DIE_LINKE_MV | 167 | 346
5 | Michael Braedt | michael_braedt | 2 | 5 | CDU Bremen | CDUBremen | 169 | 506
6 | Martin Schaefer | MartinSchaefer2 | 3 | 4 | Russlanddeutsche AfD | AfDrus | 169 | 793
7 | Sylvia Gabelmann | SylviaGabelmann | 3 | 13 | Laut🥛Gedacht | _Laut_Gedacht | 217 | 1,483
8 | Dirk Presch | DirkPresch | 3 | 6 | Daniele Ganser | _DanieleGanser | 277 | 13.5K
9 | Christel Sprößler | spro_ler | 4 | 64 | Wissensmanufaktur | W_Manufaktur | 310 | 5,305
10 | Marco Rützel | Marco_Ruetzel | 4 | 6 | Heimat Zukunft | HeimatZukunft | 319 | 803

 
In future reports, we will try to answer the following questions by querying our dataset:

  • How many accounts have over 10K followings/followers?
  • How many accounts have over 20K followings/followers?
  • How many accounts have over 30K followings/followers?
  • How many accounts have over 40K followings/followers?
  • How many Twitter accounts are verified in total?
  • How many accounts post tweets between 01:00 and 07:00 AM?
  • At what time of day are the most tweets posted?
  • What is the average time between tweets in two different time periods (08:00-23:00 and 23:00-08:00) for each political party?
  • Has any Twitter account lost followers? If yes, which are the top ten?
  • What are the top 10 tweets of the reporting period in terms of "likes"?
  • What are the top 10 tweets of the reporting period in terms of "replies"?
  • What are the top 10 tweets of the reporting period in terms of "retweets"?
  • Which 10 politicians have received the most "likes" in total for all their tweets in the reporting period?
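Some of the questions above can be sketched in a few lines of Python over in-memory records. This is only an illustration under stated assumptions, not the queries that will actually be run against MongoDB. The field names `followers_count`, `screen_name` and `created_at` are modeled on Twitter's JSON, the timestamps are assumed to be ISO 8601 strings, and all sample records and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; real data would come from the tweet store.
sample_accounts = [
    {"screen_name": "acct_a", "followers_count": 21900},
    {"screen_name": "acct_b", "followers_count": 14000},
    {"screen_name": "acct_c", "followers_count": 6},
]
sample_tweets = [
    {"screen_name": "acct_a", "created_at": "2017-09-01T02:15:00"},
    {"screen_name": "acct_b", "created_at": "2017-09-01T12:30:00"},
    {"screen_name": "acct_b", "created_at": "2017-09-01T12:45:00"},
]


def accounts_over(accounts, threshold):
    """Count the accounts with more than `threshold` followers."""
    return sum(1 for a in accounts if a["followers_count"] > threshold)


def tweets_per_hour(tweets):
    """Histogram of tweets by hour of day (0-23)."""
    return Counter(datetime.fromisoformat(t["created_at"]).hour for t in tweets)


def night_posters(tweets, start_hour=1, end_hour=7):
    """Screen names with at least one tweet in [start_hour, end_hour)."""
    return {
        t["screen_name"]
        for t in tweets
        if start_hour <= datetime.fromisoformat(t["created_at"]).hour < end_hour
    }


for threshold in (10_000, 20_000, 30_000, 40_000):
    print(threshold, accounts_over(sample_accounts, threshold))
print(tweets_per_hour(sample_tweets).most_common(1))
print(night_posters(sample_tweets))
```

On the full dataset the same counts would more likely be computed inside the database than in application code; the sketch only pins down what each question measures.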

Progress Report, Reported on Friday, 30.08.2017

For the period between the last report and today (17 days), 1,959,336 tweets (an average of 115,255 tweets per day) have been added for our selectors list. Similarly, politicians and organizations have posted a total of 269,272 and 935,659 tweets in this period, respectively (an average of 15,839 tweets per day for politicians and 55,038 for organizations). Our dataset has reached 145.883 GB in size.
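The daily averages quoted above follow from dividing each total by the 17 days of the reporting period. The short Python check below reproduces them. Using integer division (rounding down) is an assumption, chosen because it matches the reported figures; the variable names are illustrative.

```python
DAYS = 17  # length of the reporting period stated in the report

# Totals as stated in the report.
totals = {
    "selectors": 1_959_336,
    "politicians": 269_272,
    "organizations": 935_659,
}

# Whole tweets per day, rounded down.
daily_average = {name: total // DAYS for name, total in totals.items()}

for name, average in daily_average.items():
    print(f"{name}: {average:,} tweets per day")
```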

The politician with the most followers is still 'Martin Schulz', and the political party with the most tweets is the SPD (Twitter ID: spdde).

The list of the ten most active candidates in this reporting period is similar to the previously reported list, with the difference that "Paul Schmidt" (Twitter ID: PaulSch72969276) has entered the list by posting more tweets and now ranks directly behind "SebRoloff", who remains in 6th place. Accordingly, the new list looks as follows:
 

Rank | Politician: Real Name | Screen Name | Status Count | Followers Count | Organization: Real Name | Screen Name | Status Count | Followers Count
---- | --------------------- | ----------- | ------------ | --------------- | ----------------------- | ----------- | ------------ | ---------------
1 | Anke Domscheit-Berg | anked | 77,950 | 21.9K | der Frankfurter Allgemeinen Zeitung | FAZ_NET | 267,629 | 41.4K
2 | Johannes Kahrs | kahrs | 64,017 | 14K | Mitteldeutscher Rundfunk | MDRAktuell | 262,581 | 45.9K
3 | Julia Schramm | _juliaschramm | 61,824 | 20.5K | welt.de | welt | 252,050 | 1.25K
4 | Roland Panter | pant3r | 33,197 | 2,943 | Der Dutschi | Der_Dutschi | 203,492 | 6,719
5 | Dieter Janecek | DJanecek | 32,082 | 7,165 | Reuters Top News | Reuters | 198,955 | 18.7M
6 | Sebastian Roloff | SebRoloff | 25,555 | 2,221 | Saarbrücker Zeitung | azaktuell | 192,078 | 9,920
7 | Paul Schmidt | PaulSch72969276 | 25,485 | 161 | News-Now | juergen_p | 184,168 | 6,947
8 | Dorothee Bär | DoroBaer | 25,010 | 62.9K | n-tv Nachrichtensender | ntvde | 179,463 | 599K
9 | Matthias Zach | m_zach | 24,225 | 894 | BILD-Redaktion | bild | 164,392 | 1.66M
10 | Uwe Schummer | UweSchummer | 24,148 | 7,532 | FOCUS Online | focusonline | 162,535 | 507K
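Changes between two reported rankings can also be derived mechanically. The Python sketch below compares the politician rankings of the 18.08.2017 and 30.08.2017 reports, using the screen names exactly as they appear in the two tables. The function name `compare_rankings` and its return layout are illustrative choices, not part of the project's tooling.

```python
# Politician screen names in rank order, copied from the two report tables.
top10_aug18 = ["anked", "kahrs", "_juliaschramm", "pant3r", "DJanecek",
               "SebRoloff", "DoroBaer", "m_zach", "UlrichKelber", "UweSchummer"]
top10_aug30 = ["anked", "kahrs", "_juliaschramm", "pant3r", "DJanecek",
               "SebRoloff", "PaulSch72969276", "DoroBaer", "m_zach", "UweSchummer"]


def compare_rankings(old, new):
    """Return (entered, dropped, moved) between two ranked lists."""
    old_rank = {name: i for i, name in enumerate(old, start=1)}
    new_rank = {name: i for i, name in enumerate(new, start=1)}
    entered = [name for name in new if name not in old_rank]
    dropped = [name for name in old if name not in new_rank]
    # Accounts present in both lists whose rank changed: name -> (old, new).
    moved = {
        name: (old_rank[name], new_rank[name])
        for name in new
        if name in old_rank and old_rank[name] != new_rank[name]
    }
    return entered, dropped, moved


entered, dropped, moved = compare_rankings(top10_aug18, top10_aug30)
print("entered:", entered)
print("dropped:", dropped)
print("moved:", moved)
```

Run on the two tables, the comparison reports one new entry, one account that left the list, and two accounts that each moved down by one place.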

Progress Report, Reported on Friday, 18.08.2017

As part of our data collection task, we started testing our streaming in May 2017 in order to find and correct potential bugs in our system. After investigating some errors and fixing bugs, the actual streaming started in mid-July 2017. We considered a total of 441 organizations, 1167 politicians and 219 keywords (selectors) for streaming posts associated with the German election from Twitter. We use MongoDB to store the streamed tweets. In addition, we collect posts, likes, dislikes and comments from public Facebook accounts.

As of today, the size of our dataset is 133.9 GB. We have collected a total of 34 million tweets associated with organizations, 731,000 tweets from politicians and 6 million tweets through our selectors list. The Twitter account with the most followers is 'Martin Schulz' with 475,677 followers. Furthermore, in the list of organizations, "der Frankfurter Allgemeinen Zeitung" has the highest status count with 263,094 tweets. The status count is the total number of tweets that an account has posted since it was created. The following plot shows the average activity of the individual members of each list. It is obtained by dividing the total number of tweets by the number of members in each list and normalizing the result.

screen_shot_2017-09-07_at_14.04.45.png
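The per-member averages behind the plot can be approximated from the rounded figures given in this report. The Python sketch below divides each list's tweet total by its number of members. The report does not say how the values were normalized, so the final step (dividing each average by the sum of all three) is only an assumption, and the variable names are illustrative.

```python
# Rounded tweet totals and list sizes as stated in this report.
tweet_totals = {"organizations": 34_000_000, "politicians": 731_000, "selectors": 6_000_000}
list_sizes = {"organizations": 441, "politicians": 1167, "selectors": 219}

# Average number of collected tweets per list member.
per_member = {name: tweet_totals[name] / list_sizes[name] for name in tweet_totals}

# Assumed normalization: express each average as a share of the sum.
total = sum(per_member.values())
normalized = {name: value / total for name, value in per_member.items()}

for name in per_member:
    print(f"{name}: {per_member[name]:,.1f} tweets per member, share {normalized[name]:.3f}")
```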

The following table shows the top 10 most active Twitter accounts of politicians and organizations in terms of status count, i.e. the accounts that have posted the most tweets of all time.
 

Rank | Politician: Real Name | Screen Name | Status Count | Followers Count | Organization: Real Name | Screen Name | Status Count | Followers Count
---- | --------------------- | ----------- | ------------ | --------------- | ----------------------- | ----------- | ------------ | ---------------
1 | Anke Domscheit-Berg | anked | 77,314 | 21.9K | der Frankfurter Allgemeinen Zeitung | FAZ_NET | 263,094 | 41.4K
2 | Johannes Kahrs | kahrs | 61,946 | 14K | Mitteldeutscher Rundfunk | MDRAktuell | 260,242 | 45.9K
3 | Julia Schramm | _juliaschramm | 61,337 | 20.5K | welt.de | welt | 244,541 | 1.25K
4 | Roland Panter | pant3r | 32,841 | 2,943 | Der Dutschi | Der_Dutschi | 195,999 | 6,719
5 | Dieter Janecek | DJanecek | 31,252 | 7,165 | Reuters Top News | Reuters | 194,798 | 18.7M
6 | Sebastian Roloff | SebRoloff | 24,989 | 2,221 | Saarbrücker Zeitung | azaktuell | 191,504 | 9,920
7 | Dorothee Bär | DoroBaer | 24,869 | 62.9K | News-Now | juergen_p | 180,297 | 6,947
8 | Matthias Zach | m_zach | 23,965 | 894 | n-tv Nachrichtensender | ntvde | 174,900 | 599K
9 | Ulrich Kelber | UlrichKelber | 23,811 | 15.4K | BILD-Redaktion | bild | 161,618 | 1.66M
10 | Uwe Schummer | UweSchummer | 23,468 | 7,532 | FOCUS Online | focusonline | 157,511 | 507K

 
We also checked which cities had the most tweets in connection with the German elections. Interestingly, Paris was the city with the most tweets for the list of organizations, Berlin for politicians and Rio de Janeiro for the list of selectors. We are currently studying our datasets further to find out why the most popular cities for two of the lists are non-German.
-------------------
For further collaboration, for information about our dataset, or to suggest additional political or social questions that could be answered by querying our dataset, please contact any of the following people: