lookup_users error with output for one specific twitter handle #574

Closed
mareichler-spra opened this issue Apr 23, 2021 · 3 comments
@mareichler-spra

Problem

I'm trying to get user information for the user_ids returned by get_followers, but I receive an error. I've used this code on ~340 handles and only receive the error with one of them. I'm not sure if this is a bug in the package or an issue with Twitter's API.

Expected behavior

Expected the output to be a data frame with information about the users pulled.

Reproduce the problem

followers_ids  <- get_followers("UWBayArea")
followers_info <- lookup_users(followers_ids$user_id)
Error: Assigned data `users$user_id` must be compatible with existing data.                                                        
x Existing data has 4234 rows.
x Assigned data has 4334 rows.
i Only vectors of size 1 are recycled.
Run `rlang::last_error()` to see where the error occurred.

#run rlang::last_error() for more info 
rlang::last_error()
<error/tibble_error_assign_incompatible_size>
Assigned data `users$user_id` must be compatible with existing data.
x Existing data has 4234 rows.
x Assigned data has 4334 rows.
i Only vectors of size 1 are recycled.
Backtrace:
Run `rlang::last_trace()` to see the full context.

#run rlang::last_trace() for more info 
rlang::last_trace()
<error/tibble_error_assign_incompatible_size>
Assigned data `users$user_id` must be compatible with existing data.
x Existing data has 4234 rows.
x Assigned data has 4334 rows.
i Only vectors of size 1 are recycled.
Backtrace:
     x
  1. +-rtweet::lookup_users(followers_ids$user_id)
  2. | \-rtweet::users_with_tweets(results)
  3. |   +-base::`$<-`(...)
  4. |   \-tibble:::`$<-.tbl_df`(...)
  5. |     \-tibble:::tbl_subassign(...)
  6. |       \-tibble:::vectbl_recycle_rhs(...)
  7. |         +-base::withCallingHandlers(...)
  8. |         \-vctrs::vec_recycle(value[[j]], nrow)
  9. +-vctrs:::stop_recycle_incompatible_size(...)
 10. | \-vctrs:::stop_vctrs(...)
 11. |   \-rlang::abort(message, class = c(class, "vctrs_error"), ...)
 12. |     \-rlang:::signal_abort(cnd)
 13. |       \-base::signalCondition(cnd)
 14. \-(function (cnd) ...

Also, it only appears to be an issue for a certain section of user_ids:

# no error, produces tbl of 4263 obs. of 20 variables 
followers_info <- lookup_users(followers_ids$user_id[c(1:980, 1052:nrow(followers_ids))])

# produces error (different error than above) 
followers_info <- lookup_users(followers_ids$user_id[981:1051])
Error in `$<-.data.frame`(`*tmp*`, "user_id", value = c("994662436253913088",  : 
  replacement has 71 rows, data has 0

Not sure if it's important but all of the user_id's that create issues are 18 characters long (there are user_id's that are 18 characters long and do not produce issues).
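
For anyone trying to narrow down a failing range like this, here is a minimal base R sketch, not part of rtweet. It splits the IDs into chunks, wraps each call in tryCatch, and records which chunks raise an error. The helper name find_failing_chunks and its lookup_fn argument are assumptions made for illustration; in practice lookup_fn would presumably be rtweet::lookup_users. The fake_lookup function only simulates a failure so the example runs without API access.

```r
# Hypothetical helper (not an rtweet function): report which chunks of ids
# make a lookup function throw an error.
# `lookup_fn` is any function that takes a character vector of ids.
find_failing_chunks <- function(ids, lookup_fn, chunk_size = 100) {
  stopifnot(length(ids) > 0)
  starts <- seq(1, length(ids), by = chunk_size)
  ends <- pmin(starts + chunk_size - 1, length(ids))
  failed <- vapply(seq_along(starts), function(i) {
    chunk <- ids[starts[i]:ends[i]]
    # TRUE if the lookup errors for this chunk, FALSE otherwise
    tryCatch({
      lookup_fn(chunk)
      FALSE
    }, error = function(e) TRUE)
  }, logical(1))
  data.frame(start = starts, end = ends, failed = failed)
}

# Stand-in for lookup_users(): errors when a chunk contains a "bad" id.
fake_lookup <- function(ids) {
  if (any(ids %in% c("15", "16"))) stop("simulated lookup failure")
  data.frame(user_id = ids, stringsAsFactors = FALSE)
}

chunks <- find_failing_chunks(as.character(1:30), fake_lookup, chunk_size = 10)
print(chunks)
```

Re-running the helper on a failing chunk with a smaller chunk_size narrows the range further, at the cost of more API calls.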

rtweet version

## copy/paste output
packageVersion("rtweet")
[1] ‘0.7.0.9000’

Session info

## copy/paste output
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rtweet_0.7.0.9000

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13   magrittr_2.0.1    hms_0.5.3         progress_1.2.2    rappdirs_0.3.3    R6_2.5.0          rlang_0.4.10     
 [8] httr_1.4.2        tools_4.0.2       cli_2.3.0         withr_2.4.1       askpass_1.1       ellipsis_0.3.1    openssl_1.4.2    
[15] assertthat_0.2.1  tibble_3.0.6      lifecycle_1.0.0   crayon_1.4.1      vctrs_0.3.6       curl_4.3          glue_1.4.2       
[22] compiler_4.0.2    pillar_1.4.7      prettyunits_1.1.1 jsonlite_1.7.2    pkgconfig_2.0.3 
@llrs llrs added the bug label Apr 24, 2021
@llrs
Member

llrs commented Apr 24, 2021

Thanks for the report! I can reproduce the error. I think this is related to how rtweet parses the responses, which is not ideal (see #558). I had started working on this (#572); your code will be useful to track progress and to test whether the new parser works better 👌

@llrs
Member

llrs commented Apr 24, 2021

Hmm, this issue really reports 2 bugs. The code can be minimized even further using the first problematic user_id: lookup_users("994659707766833153"), or the equivalent with the screen name, lookup_users("DadliDerya"). This is an account without any tweets, which might explain why there are more user IDs than tweets. The account currently has no likes, tweets or replies; it only follows some people. Even fixing this doesn't really solve the original problem, probably because some users are deleted or suspended.

@nicoleprause

I just posted a new issue that I think might show generalization. I am tracking accounts that seem to have high bot purchases, where users are deleted within an hour of get_followers runs. It may be useful to include a row with only the ID and "fail".

However, that raises the second problem. Some users appear to have a URL field and others do not. Any mismatch also causes this error. For example, if the first user ID happens to have that field and the second user ID does not, it will fail. I have "fixed" that with a slow loop, but it would be even better just to have the fields consistently added, even if they cannot be populated!
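
A rough base R sketch of that kind of workaround, for illustration only: look up IDs one at a time, turn a failed lookup into a row holding just the ID and "fail", and pad missing fields with NA before binding. The names lookup_each, bind_with_all_fields, the status column and the lookup_fn argument are all hypothetical and not part of rtweet; fake_lookup stands in for lookup_users so the example runs offline.

```r
# Hypothetical helper: row-bind data frames whose columns differ,
# filling any column a data frame lacks with NA.
bind_with_all_fields <- function(dfs) {
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(df) {
    for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
    df[, all_cols, drop = FALSE]
  })
  do.call(rbind, padded)
}

# Hypothetical helper: look up ids one by one (slow); a failed lookup
# becomes a row with only the id and status "fail".
lookup_each <- function(ids, lookup_fn) {
  rows <- lapply(ids, function(id) {
    tryCatch({
      res <- lookup_fn(id)
      res$status <- "ok"
      res
    }, error = function(e) {
      data.frame(user_id = id, status = "fail", stringsAsFactors = FALSE)
    })
  })
  bind_with_all_fields(rows)
}

# Stand-in for lookup_users(): id "2" fails, only id "1" has a url field.
fake_lookup <- function(id) {
  if (id == "2") stop("simulated: user not found")
  out <- data.frame(user_id = id, stringsAsFactors = FALSE)
  if (id == "1") out$url <- "https://example.com"
  out
}

result <- lookup_each(c("1", "2", "3"), fake_lookup)
print(result)
```

One request per ID is slow and uses rate limit quickly, so this is a stopgap rather than a substitute for the package returning consistent fields.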

@llrs llrs closed this as completed Sep 18, 2021