How to train new SRL model ? #25

Open
GraphGrailAi opened this issue Mar 12, 2016 · 4 comments

Comments

@GraphGrailAi

Hi, deepnl is cool, but I cannot find a good tutorial on how to train a custom Semantic Role Labeling model (for a language other than English).
I have read the presentation and the article: http://docslide.us/documents/the-tsunami-of-deep-learning-over-nlp-giuseppe-attardi-dipartimento-di-informatica.html , http://www.aclweb.org/anthology/W15-1515.

What I need: I want to pass a bunch of .txt files with text data to deepnl and get a trained model as the result. Then, as I guess, I can pass this model to tagger = SRLTagger.load(open(filename)), and it is ready to add semantic roles to each word in a sentence.

Then I want to use the semantic roles to identify facts and opinions about some objects, e.g.: "I don't like BankName because it doesn't supply customer service" - the output will be BankName - customer service. That means the problem with this bank is customer service.
Am I on the right path?

@attardi
Owner

attardi commented Mar 12, 2016

On 12 mar 2016, at 13:19, GraphGrail notifications@github.com wrote:

> Hi, deepnl is cool, but I cannot find a good tutorial on how to train a custom Semantic Role Labeling model (for a language other than English).
> I have read the presentation and the article: http://docslide.us/documents/the-tsunami-of-deep-learning-over-nlp-giuseppe-attardi-dipartimento-di-informatica.html , http://www.aclweb.org/anthology/W15-1515.
>
> What I need: I want to pass a bunch of .txt files with text data to deepnl and get a trained model as the result. Then, as I guess, I can pass this model to tagger = SRLTagger.load(open(filename)), and it is ready to add semantic roles to each word in a sentence.

In principle yes, but DeepNL expects input in CoNLL format.
You need to perform sentence splitting and tokenization first with some separate tool.
And for training you need a corpus annotated with predicates, as in the CoNLL Shared Task 2008.
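
A minimal sketch of the preprocessing step described above, for readers following along. It splits raw text into sentences and tokens with naive regular expressions and writes one token per line, with a blank line between sentences. The two-column layout (ID, FORM) and the helper names `split_sentences`, `tokenize` and `to_conll` are assumptions for illustration; the thread does not say which columns DeepNL reads, and a real tokenizer for the target language will do better than these regexes.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive tokenizer: words (keeping inner apostrophes) or single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)


def to_conll(text):
    """Return one token per line as ID<TAB>FORM, sentences separated by a blank line.

    The ID/FORM layout is an assumption, not DeepNL's documented input format.
    """
    blocks = []
    for sent in split_sentences(text):
        rows = ["%d\t%s" % (i, tok) for i, tok in enumerate(tokenize(sent), start=1)]
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_conll("I like it. It's fine!"))
```

A script like this only covers the raw tokens. The predicate and argument columns needed for SRL training have to come from an annotated corpus.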

> Then I want to use the semantic roles to identify facts and opinions about some objects, e.g.: "I don't like BankName because it doesn't supply customer service" - the output will be BankName - customer service. That means the problem with this bank is customer service.
> Am I on the right path?

Yes, but the path might be a long one ;-)


@GraphGrailAi
Author

Thanks for the answer. Sentence splitting and tokenization are not a problem; I can do that myself in Python.
But I don't understand what a "corpus annotated with predicates, as in the CoNLL Shared Task 2008" is.
I googled https://catalog.ldc.upenn.edu/LDC2009T12 but there is no data sample available to reproduce the format.

I also found http://conll.cemantix.org/2012/data.html but the instructions are hard to read and mostly unclear.

@attardi
Owner

attardi commented Mar 12, 2016

On 12 mar 2016, at 16:04, GraphGrail notifications@github.com wrote:

> Thanks for the answer. Sentence splitting and tokenization are not a problem; I can do that myself in Python.
> But I don't understand what a "corpus annotated with predicates, as in the CoNLL Shared Task 2008" is.
> I googled https://catalog.ldc.upenn.edu/LDC2009T12 but there is no data sample available to reproduce the format.

This is the right one.

> I also found http://conll.cemantix.org/2012/data.html but the instructions are hard to read and mostly unclear.

This is a different task.
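
To give an idea of what "annotated with predicates" means in practice, here is a hedged sketch of reading predicate and argument columns from a CoNLL-2008-style file. It assumes tab-separated columns with the predicate in the 11th column and one argument column per predicate after it, which is the layout given in the shared task description as far as I can tell; check it against the corpus documentation. The sample sentence is hand-written for illustration and is not taken from LDC2009T12, and `row`, `read_sentences` and `propositions` are hypothetical helper names.

```python
def row(idx, form, pred="_", arg="_"):
    """Build one illustrative 12-column line; unused columns are filled with '_'."""
    return "\t".join([str(idx), form] + ["_"] * 8 + [pred, arg])


# Hand-written sample, NOT real corpus data: "bank" is argument A1 of "closed".
SAMPLE = [
    row(1, "The"),
    row(2, "bank", arg="A1"),
    row(3, "closed", pred="close.01"),
    row(4, "."),
    "",
]


def read_sentences(lines):
    """Yield sentences as lists of column lists; blank lines separate sentences."""
    sent = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            if sent:
                yield sent
                sent = []
        else:
            sent.append(line.split("\t"))
    if sent:
        yield sent


def propositions(sent, pred_col=10):
    """Return [(predicate, [(role, word), ...]), ...] for one sentence.

    Assumes '_' marks an empty cell and that the k-th predicate's arguments
    sit in column pred_col + 1 + k (0-based indices).
    """
    preds = [r[pred_col] for r in sent if r[pred_col] != "_"]
    result = []
    for k, pred in enumerate(preds):
        col = pred_col + 1 + k
        args = [(r[col], r[1]) for r in sent if r[col] != "_"]
        result.append((pred, args))
    return result


if __name__ == "__main__":
    for s in read_sentences(SAMPLE):
        print(propositions(s))
```

A training corpus for a new language would need these predicate and role columns filled in for every sentence, which is the annotation work the raw .txt files do not contain.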

@GraphGrailAi
Author

Your answers are so short :)
Is there a tutorial on how to make a corpus annotated with predicates from raw text data?
(Maybe off-topic, but http://nilc.icmc.usp.br/nlpnet/training.html describes the steps for training SRL, and they don't work for me.)
