Resources
This resource page collects interesting or potentially useful documents from the various fields that the EchoBurst browser extension draws on.
The primary driver of the EchoBurst extension, which we hope to use to classify comments by political leaning and by how constructive or toxic they are.
- Change My View Subreddit: Users of this subreddit post in the hopes of having their established view changed. They, and other commenters, can then award 'Deltas' (the mathematical symbol for change) to the comments that were most persuasive. This could be utilized to create a dataset of what constitutes a persuasive comment, using the Delta measure as a label.
- SemEval - International Workshop on Semantic Evaluation
- Google's Perspective API: This API is able to detect and flag toxic comments, with some success.
- Research on Determining Political Leaning Using Sentiment Analysis
- Research on Teaching Computers to Detect Sarcasm (Abstract Only)
- Article on Predicting Political Leaning Using Text Analysis
- Identifying Partisan Slant in News Articles
- Article on Twitter Comments about Abortion after George Tiller's Shooting
- Article on "Stance" Dataset
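The Change My View idea above, using Deltas as a persuasiveness label, could be sketched roughly as follows. This is only a minimal illustration: the `Comment` record, the `label_by_delta` helper, and the list of Delta-awarded comment IDs are all hypothetical, and the real subreddit data would first need to be scraped and parsed into this shape.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    """Hypothetical record for one Change My View comment."""
    comment_id: str
    text: str


def label_by_delta(comments, delta_awarded_ids):
    """Return (text, label) pairs: 1 if the comment earned a Delta, else 0."""
    awarded = set(delta_awarded_ids)
    return [(c.text, 1 if c.comment_id in awarded else 0) for c in comments]


# Made-up sample data, for illustration only.
comments = [
    Comment("c1", "Here is a study that contradicts your premise."),
    Comment("c2", "You are just wrong."),
]
labeled = label_by_delta(comments, ["c1"])
```

The resulting pairs could then be fed to whichever supervised classifier we settle on.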
We will need to collect substantial data sets in order to properly train our model to recognize political leaning and toxicity/constructiveness.
- All of Reddit: A dataset consisting of over 17 billion comments taken from Reddit. Unlabeled.
- Reddit (May 2015): A small subset of the previously mentioned dataset. Unlabeled.
- Up-to-date Reddit Dataset
- Kaggle: A source of labeled datasets for supervised learning.
- Knoema - Data Visualized Structurally
- Stance Dataset by Saif M. Mohammad
- Comment Scraper Tool: If we wish to scrape and test target web pages, this tool may be useful and more quickly implemented than a custom scraper.
- Tumblr xKit extension: Tumblr has an extension called xKit that lets users target certain words, so that any posts containing those words will not show up on their newsfeed. I wouldn't call it a Data Collection Tool, but I wasn't sure where else to put it. It's just interesting to look at. Here is a page with its features. I encourage you all to go to xKit's tip page and check out the Blacklist and Whitelist posts features.
- Scrapy Python Web Scraping Library
- Portia Web Scraping Tool
- Import.io Web Scraping Tool
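Whichever collection tool we choose, the raw comments will need filtering before they are useful for training. A rough sketch, assuming the comments arrive as newline-delimited JSON with `body` and `subreddit` fields (an assumption about the dump format, to be checked against the actual files); the `iter_comments` helper and its parameters are hypothetical:

```python
import io
import json


def iter_comments(lines, subreddits=None, min_length=20):
    """Yield cleaned comment dicts from newline-delimited JSON lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the whole run
        if not isinstance(record, dict):
            continue
        body = record.get("body", "")
        # Drop deleted/removed comments and ones too short to classify.
        if body in ("[deleted]", "[removed]") or len(body) < min_length:
            continue
        if subreddits is not None and record.get("subreddit") not in subreddits:
            continue
        yield {"subreddit": record.get("subreddit"), "body": body}


# Made-up sample standing in for a real dump file.
sample = io.StringIO(
    '{"subreddit": "politics", "body": "I disagree, and here is my reasoning."}\n'
    '{"subreddit": "politics", "body": "[deleted]"}\n'
    '{"subreddit": "aww", "body": "What a wonderfully fluffy little dog!"}\n'
    'not json\n'
)
kept = list(iter_comments(sample, subreddits={"politics"}))
```

Because the function is a generator, it can stream a very large file line by line instead of loading it all into memory.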
In order to create a system that opens minds, we must understand them, and why they close off into echo chambers in the first place. Biases, particularly confirmation bias, are well-researched subjects, and we must leverage this research to guide our approach.
We will need to create a system that offers enough reward and incentive to keep people engaged with comments they might find distasteful, and certainly disagreeable. Gamification is one possible way to do this, and it encompasses a wide range of approaches.