This repository contains the screenplays and plot synopses with turning point (TP) annotations for 99 movies. For each movie, the dataset provides:
- The Wikipedia plot synopsis (extended summary of 35 sentences on average) with sentence-level TP annotations.
- The screenplay (all dialogue and description parts of the movie) segmented into scenes (selected from the Scriptbase dataset).
- The cast information (according to IMDb).
We split the dataset into a training set and a test set. For the movies in the test set, we also provide scene-level TP annotations for the corresponding screenplays.
This folder contains the screenplays (./moviename/moviename_script.txt) and the IMDb metadata (./moviename/moviename_imdb_meta.txt) for all movies in TRIPOD.
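The path pattern above can be sketched as a small helper. This is only an illustration: the function name `movie_files` and the `root` argument are hypothetical, and it assumes the folder name on disk is exactly the movie name used in the file names.

```python
from pathlib import Path


def movie_files(root, movie_name):
    """Build the expected screenplay and IMDb metadata paths for one movie.

    Assumes the layout ./moviename/moviename_script.txt and
    ./moviename/moviename_imdb_meta.txt described in this README.
    The paths are only constructed here, not checked for existence.
    """
    movie_dir = Path(root) / movie_name
    script_path = movie_dir / f"{movie_name}_script.txt"
    meta_path = movie_dir / f"{movie_name}_imdb_meta.txt"
    return script_path, meta_path
```

Calling `movie_files(".", "moviename")` returns the two paths for that movie's folder; check `script_path.exists()` before reading, since the exact spelling of folder names may differ from the names in the CSV files.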
This folder contains:
- TRIPOD_synopses_train.csv: all synopses and respective TP annotations for the movies of the training set.
It contains the movie name, the raw synopsis, the synopsis segmented into sentences, and the sentence index (starting from 0) that corresponds to each TP.
Note: When several reliable sets of annotations are available for a movie, we provide all of them. For this reason, the movie name field is the actual movie name followed by an underscore and the index of the annotation (e.g., Reservoir Dogs_0, Reservoir Dogs_1, Reservoir Dogs_2...). Movies with only one available set of annotations have only the moviename_0 version.
- TRIPOD_synopses_test.csv: all synopses and respective TP annotations for the movies of the test set.
- TRIPOD_screenplays_test.csv: the gold-standard annotations for the screenplays of the test set.
It contains the movie name and, for each TP, a list of the screenplay scene indices (starting from 0) that correspond to it.
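The moviename_index naming convention used in the synopsis files can be undone with a short helper. The sketch below is illustrative only: the function names `split_annotation_key` and `group_by_movie` are hypothetical, and it works on plain strings because the CSV column names are not stated here.

```python
from collections import defaultdict


def split_annotation_key(key):
    """Split 'Reservoir Dogs_1' into ('Reservoir Dogs', 1).

    Splits on the LAST underscore, so titles that themselves contain
    underscores are kept intact.
    """
    title, sep, index = key.rpartition("_")
    if not sep or not index.isdigit():
        raise ValueError(f"not a moviename_index key: {key!r}")
    return title, int(index)


def group_by_movie(keys):
    """Map each movie title to the sorted list of its annotation indices."""
    grouped = defaultdict(list)
    for key in keys:
        title, index = split_annotation_key(key)
        grouped[title].append(index)
    return {title: sorted(indices) for title, indices in grouped.items()}
```

For example, `group_by_movie(["Reservoir Dogs_0", "Reservoir Dogs_1", "Reservoir Dogs_2"])` groups the three annotation sets under the single title `"Reservoir Dogs"`, which is convenient when evaluating against all available annotations of a movie.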
A Python script for segmenting the screenplays into scenes, following the manual segmentation of the screenwriters. This segmentation also agrees with the scene indices used in the gold-standard annotations of the screenplays in the test set.