SocialOcean enables users to explore geo-tagged social media data. In the context of my Master's thesis, it is tailored to echo chamber detection, but depending on the pre-processed features it can be adapted for other purposes as well. The tool uses a Lucene index and a corresponding PostgreSQL database; a script to create the Lucene index is included. This repository is an Eclipse RCP project, so it can be extended with plug-ins.
The initial idea and a prototype were presented at EuroVis 2017.
A demonstration video, a poster and a short paper can be downloaded at: https://lighthouse-bodensee.de/michaelhundt/eurovis2017/
The Master's thesis, the slides of my final presentation, and a short introduction video can be downloaded at:
https://lighthouse-bodensee.de/michaelhundt/socialocean/
-
You will need an Eclipse RCP version to run this project as an Eclipse application:
http://www.eclipse.org/downloads/packages/eclipse-rcp-and-rap-developers/neon3
Good tutorials for Eclipse RCP can be found at:
-
For a local version you need a PostgreSQL database:
Download PostgreSQL and install the PostGIS extension.
Clone this git repository and import the project into Eclipse.
-
Create a db_config.properties file (following the template) in the settings folder that matches your database credentials.
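As a rough illustration of how such a properties file can be turned into a PostgreSQL JDBC URL, here is a minimal sketch. The key names `host`, `port` and `database` are assumptions made for this example; the actual keys are the ones given in the template in the settings folder, and the class below is not part of the project.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

class DbConfigSketch {

    // Reads key=value pairs in the standard .properties format.
    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    // Builds a PostgreSQL JDBC URL. The keys "host", "port" and "database"
    // are hypothetical; replace them with the key names from the template.
    static String jdbcUrl(Properties props) {
        String host = props.getProperty("host", "localhost");
        String port = props.getProperty("port", "5432");
        String database = props.getProperty("database");
        if (database == null || database.isEmpty()) {
            throw new IllegalArgumentException("missing 'database' entry");
        }
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) throws IOException {
        // Inline stand-in for the contents of a db_config.properties file.
        String example = "host=localhost\nport=5432\ndatabase=socialocean\n";
        System.out.println(jdbcUrl(load(new StringReader(example))));
    }
}
```

In a real setup the `Reader` would be a `FileReader` on the file in the settings folder, and user name and password would be passed separately to `DriverManager.getConnection`.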
Depending on the system that you use, you may have to adapt the configuration of the target platform or the .product file.
SocialOcean.product --> Configuration --> Configuration File (macosx, solaris, win32)
SocialOcean.product --> Contents --> Add Required Plug-ins
For the pre-processing we need a tweets table and a users table. The obligatory fields are described below. Scripts (1) to (3) offer some basic pre-processing; script (4) builds the Lucene index.
src/scripts:
(1) AddCategoryScript.java
(2) AddSentimentScript.java
(3) Geocoding.java
(4) IndexTweets.java
Scripts (1) and (2) need the following database fields:
- tweet_id, long
- tweet_content, String
The indexing script (4) in its current form needs the following database fields from the tweets table:
- tweet_id, long
- tweet_creationdate, String, timestamp of the form "yyyy-dd-MM hh:mm:ss", example: "2013-08-01 01:15:00"
- tweet_content, String
- relationship, String (Tweet, Followed)
- latitude, double
- longitude, double
- hasurl, boolean
- user___screenname, String
- source, String
- user___language, String
- positive, int (result of SentiStrength.jar)
- negative, int (result of SentiStrength.jar)
- category, String (1)
- sentiment, String (2)
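The example timestamp "2013-08-01 01:15:00" does not by itself show which of its two-digit fields is the month. The sketch below, which is not taken from the project and assumes the pattern letters follow `java.text.SimpleDateFormat`, parses the example once with the pattern as documented above and once with month and day swapped. It may help when checking that your own data is in the expected order.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

class TimestampPatternSketch {

    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Parses the timestamp with the given pattern and returns the month (1-12).
    static int monthOf(String pattern, String timestamp) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(UTC);
        format.setLenient(false); // reject out-of-range values such as month 13
        try {
            Calendar calendar = Calendar.getInstance(UTC, Locale.ROOT);
            calendar.setTime(format.parse(timestamp));
            return calendar.get(Calendar.MONTH) + 1; // Calendar months start at 0
        } catch (ParseException e) {
            throw new IllegalArgumentException("cannot parse: " + timestamp, e);
        }
    }

    public static void main(String[] args) {
        String example = "2013-08-01 01:15:00";
        // Pattern as documented: year, day, month.
        System.out.println(monthOf("yyyy-dd-MM hh:mm:ss", example));
        // Alternative reading: year, month, day.
        System.out.println(monthOf("yyyy-MM-dd hh:mm:ss", example));
    }
}
```

With the documented pattern the example is read as 8 January 2013; with month and day swapped it is read as 1 August 2013.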
And the following fields from the users table:
- gender, String (default: unknown)
- user___statusescount, int
- user___followerscount, int
- user___friendscount, int
- user___listedcount, int
- desc___score, double (value in [0,1] that rates the text of the user description)
- latitude, double (3)*
- longitude, double (3)*
*Import reference data: cities1000 from GeoNames and timezone_shapes:
- If it is not yet included and you would like to have a GUI tool for the database, you can download DataGrip or pgAdmin.
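Before running the scripts, it can be useful to check that the tables contain all the obligatory fields listed above. The following is a minimal sketch of such a check. The class and method names are hypothetical and not part of the repository; the field names are copied from the lists above. The set of available column names is assumed to come from elsewhere, for example from `ResultSetMetaData.getColumnName` over a `SELECT * ... LIMIT 0` query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class RequiredColumnsSketch {

    // Fields the indexing script expects in the tweets table.
    static final List<String> TWEET_FIELDS = Arrays.asList(
            "tweet_id", "tweet_creationdate", "tweet_content", "relationship",
            "latitude", "longitude", "hasurl", "user___screenname", "source",
            "user___language", "positive", "negative", "category", "sentiment");

    // Fields the indexing script expects in the users table.
    static final List<String> USER_FIELDS = Arrays.asList(
            "gender", "user___statusescount", "user___followerscount",
            "user___friendscount", "user___listedcount", "desc___score",
            "latitude", "longitude");

    // Returns the required names that are absent from the available columns.
    // The comparison ignores case.
    static List<String> missing(List<String> required, Set<String> available) {
        Set<String> lower = new HashSet<>();
        for (String name : available) {
            lower.add(name.toLowerCase(Locale.ROOT));
        }
        List<String> result = new ArrayList<>();
        for (String name : required) {
            if (!lower.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the column names read from an actual tweets table.
        Set<String> columns = new HashSet<>(TWEET_FIELDS);
        columns.remove("category");
        columns.remove("sentiment");
        System.out.println("Missing tweet fields: " + missing(TWEET_FIELDS, columns));
    }
}
```

In this usage example, `category` and `sentiment` are reported as missing, which corresponds to a table on which scripts (1) and (2) have not yet been run.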