Skip to content

Latest commit

 

History

History
62 lines (44 loc) · 1.3 KB

README.md

File metadata and controls

62 lines (44 loc) · 1.3 KB

Multimodal

CUMMOSI dataset download

🚀 Installation

The first step is to download the SDK:

git clone https://github.com/CMU-MultiComp-Lab/CMU-MultimodalSDK.git

Next, you need to install the SDK on your python enviroment.

cd CMU-MultimodalSDK
pip install .

MELD dataset download

wget http://web.eecs.umich.edu/~mihalcea/downloads/MELD.Raw.tar.gz

🚀 Installation

Since the MELD dataset does not provide audio data, we can use ffmpeg to manually obtain the audio data.

The first step is to download the ffmpeg:

wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-i686-static.tar.xz
xz -d ffmpeg-git-amd64-static.tar.xz
tar -xvf ffmpeg-git-amd64-static.tar
cd ./ffmpeg-git-amd64-static
./ffmpeg

Extracting audio data

First you need to convert the video data into audio data:

python /MELD/split_audio_from_video.py

Download the pre-trained audio feature extractor wave2vec:

wegt https://dl.fbaipublicfiles.com/fairseq/wav2vec/wav2vec_large.pt

Use pretrained wave2vec to extract audio features:

python wav2vec_embedding.py

IEMOCAP dataset download

To obtain the IEMOCAP data you just need to fill out an electronic release form below.

https://sail.usc.edu/iemocap/iemocap_release.htm