Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ensure consistency of nO dim for biLSTM #484

Merged
merged 2 commits into from
Mar 9, 2021

Conversation

svlandeg
Copy link
Member

@svlandeg svlandeg commented Mar 6, 2021

Fixes explosion/spaCy#7088

When using a biLSTM, the given nO dimension of the Thinc model would be divided by half and stored as such. I think this behaviour is unexpected, and it clashes with some of the shape inference & listener code in spaCy.

I think it makes more sense to keep the given nO, but provide the Torch model with half the value for its hidden features, so that the final output matches the required nO dimension.

This PR changes the behaviour both for PyTorchLSTM and LSTM (if bi is True) - both would otherwise predict vectors that were incompatible with their internal nO value - cf test and spaCy issue linked above.

The additional line in the test makes CauchySimilarity error as well. Not sure (yet) whether this is due to an error in get_width or the model itself, or whether this model should just be excluded from this particular line in the test. Suggest to address this in a follow-up PR.

@codecov
Copy link

codecov bot commented Mar 6, 2021

Codecov Report

Merging #484 (a35ad58) into master (1d998ad) will decrease coverage by 0.36%.
The diff coverage is 88.46%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #484      +/-   ##
==========================================
- Coverage   59.31%   58.94%   -0.37%     
==========================================
  Files         102      102              
  Lines        7206     7207       +1     
==========================================
- Hits         4274     4248      -26     
- Misses       2932     2959      +27     
Impacted Files Coverage Δ
thinc/tests/layers/test_layers_api.py 0.00% <0.00%> (ø)
thinc/layers/lstm.py 98.18% <100.00%> (-0.02%) ⬇️
thinc/layers/cauchysimilarity.py 32.43% <0.00%> (-67.57%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d998ad...a35ad58. Read the comment docs.

@svlandeg svlandeg added bug Bugs and behaviour differing from documentation feat / layers Weights layers, transforms, combinators, wrappers labels Mar 6, 2021
@honnibal
Copy link
Member

honnibal commented Mar 9, 2021

Yeah this is sensible, I agree that a bunch of places expect nO to behave a certain way.

@honnibal honnibal merged commit f92bb9c into explosion:master Mar 9, 2021
@svlandeg svlandeg deleted the bugfix/bilstm_dim branch March 9, 2021 12:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs and behaviour differing from documentation feat / layers Weights layers, transforms, combinators, wrappers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using spacy.TorchBiLSTMEncoder.v1 in tok2vec encode throws error when used for Pretraining
2 participants