effects coding #46

Open
KarinOudshoorn opened this issue Jul 18, 2023 · 4 comments
Comments

@KarinOudshoorn

When using logitr with effects-coded categorical variables (contr.sum), the transformation from the categorical variable to dummy variables produces only one dummy column, irrespective of how many categorical variables are in the model. To solve this, the contrasts of the categorical variables should be checked, and an alternative to fastDummies should be used when they are not the default.
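
A minimal sketch of the kind of check suggested above, using only base R. This is not logitr's actual code: the toy `brand` factor and the helper `has_custom_contrasts()` are hypothetical, and `model.matrix()` is assumed here as one possible alternative to fastDummies, since it honors the contrasts set on a factor.

```r
# Hypothetical toy factor standing in for a categorical attribute such as brand
brand <- factor(c("dannon", "hiland", "weight", "yoplait", "dannon", "weight"))

# Hypothetical helper: a factor with no explicit contrasts carries
# no "contrasts" attribute
has_custom_contrasts <- function(x) !is.null(attr(x, "contrasts"))
has_custom_contrasts(brand)  # FALSE

# Switch the factor to effects (sum-to-zero) coding
contrasts(brand) <- contr.sum(nlevels(brand))
has_custom_contrasts(brand)  # TRUE

# model.matrix() respects the contrasts attribute; drop the intercept column
X <- model.matrix(~ brand)[, -1, drop = FALSE]
ncol(X)  # one column per level minus one
```

With four levels this yields three columns, and rows belonging to the last level are coded -1 in every column.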

@jhelvy
Owner

jhelvy commented Jul 18, 2023

I can't reproduce this issue. Here is an example.

{logitr} uses dummy coding by default for categorical variables. In this case, you get 3 brand coefficients as expected:

library(logitr)

model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand')
)

coef(model)
#>       price         feat  brandhiland  brandweight brandyoplait 
#>  -0.3665546    0.4914392   -3.7154773   -0.6411384    0.7345195 

Now if I use contr.sum to get effects coding, I still get 3 brand coefficients:

yogurt$brand <- as.factor(yogurt$brand)
contrasts(yogurt$brand) = contr.sum(4)

model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand')
)

coef(model)

#>     price       feat     brand1     brand2     brand3 
#> -0.3665883  0.4913432  0.9055508 -2.8100654  0.2643329 
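
As a quick sanity check on the two outputs above: under contr.sum the omitted last level has an effect equal to minus the sum of the others. The sketch below assumes the brand levels are ordered dannon, hiland, weight, yoplait (alphabetical, consistent with dannon being the reference level in the dummy-coded output), so brand1 is dannon and yoplait is the omitted level.

```r
# Effects-coded brand coefficients copied from the output above
b <- c(brand1 = 0.9055508, brand2 = -2.8100654, brand3 = 0.2643329)

# Under sum-to-zero coding the omitted level is minus the sum of the rest
# (level order is an assumption, see the note above)
effects <- c(
  dannon  = unname(b["brand1"]),
  hiland  = unname(b["brand2"]),
  weight  = unname(b["brand3"]),
  yoplait = -sum(b)
)

# Re-express relative to dannon to compare with the dummy-coded model
relative <- effects[-1] - effects["dannon"]

# Dummy-coded coefficients copied from the first output above
dummy <- c(hiland = -3.7154773, weight = -0.6411384, yoplait = 0.7345195)

round(relative - dummy, 3)
```

The differences are zero to three decimal places, so the two parameterizations describe the same fitted model.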

@KarinOudshoorn
Author

KarinOudshoorn commented Jul 18, 2023 via email

@jhelvy
Owner

jhelvy commented Jul 18, 2023

Ah yes, I see. When I include randPars with effects coding, it appears to be ignored. I just get back the same results as the prior model, where brand is modeled with fixed parameters:

library(logitr)

yogurt$brand <- as.factor(yogurt$brand)
contrasts(yogurt$brand) = contr.sum(4)

model <- logitr(
  data = yogurt, outcome = 'choice', obsID = 'obsID',
  pars = c('price', 'feat', 'brand'),
  randPars = c(brand = 'n')
)

coef(model)
#>     price       feat     brand1     brand2     brand3
#> -0.3665883  0.4913432  0.9055508 -2.8100654  0.2643329

I believe this is probably a pretty small issue in the code. It looks like it might be rooted in the names of the variables changing when using effects coding. I'll look into it.
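
To illustrate the naming change, here is a small base R sketch. It does not use logitr's internals; it only shows how the design-matrix column names differ between the two coding schemes, which is the kind of mismatch that could make a name-based lookup miss the brand columns. The toy `brand` factor is hypothetical.

```r
# Hypothetical toy factor with the same four levels as the yogurt brands
brand <- factor(c("dannon", "hiland", "weight", "yoplait"))

# Default dummy (treatment) coding: columns are named after the levels
dummy_names <- colnames(model.matrix(~ brand))[-1]

# Effects coding: contr.sum() has no column names, so columns are numbered
contrasts(brand) <- contr.sum(4)
effects_names <- colnames(model.matrix(~ brand))[-1]

dummy_names    # "brandhiland" "brandweight" "brandyoplait"
effects_names  # "brand1" "brand2" "brand3"

# A lookup keyed on the level-based names finds none of the new columns
any(effects_names %in% dummy_names)  # FALSE
```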

I may also show this as an example in the documentation for those who want to use different coding schemes.

@KarinOudshoorn
Author

KarinOudshoorn commented Jul 18, 2023 via email

jhelvy mentioned this issue Oct 17, 2023