search_tweets with retryonratelimit = TRUE goes beyond requested limit #346

Closed
DonBunk opened this issue Jul 25, 2019 · 3 comments

DonBunk commented Jul 25, 2019

Problem

The problem is pretty basic. Simple queries such as:
search_tweets(q = 'datascience', n = 10, retryonratelimit = TRUE)
return thousands of tweets, which implies that the n parameter is completely ignored. If I run search_tweets(q = 'datascience', n = 10), 10 tweets are returned, as requested. No errors are produced in either case.

Expected behavior

search_tweets(q = 'datascience', n = 10, retryonratelimit = TRUE) returns only 10 tweets.

rtweet version 0.6.9

Session info

R version 3.4.3 (2017-11-30)
Platform: x86_64-koji-linux-gnu (64-bit)
Running under: Amazon Linux 2

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rtweet_0.6.9

loaded via a namespace (and not attached):
[1] Rcpp_1.0.1 rstudioapi_0.10 knitr_1.20 magrittr_1.5 hms_0.4.2
[6] progress_1.2.2 R6_2.4.0 rlang_0.3.4 httr_1.4.0 tools_3.4.3
[11] DT_0.6 htmltools_0.3.6 askpass_1.1 openssl_1.3 yaml_2.2.0
[16] digest_0.6.18 assertthat_0.2.1 tibble_2.1.1 crayon_1.3.4 htmlwidgets_1.3
[21] curl_3.3 pillar_1.4.0 compiler_3.4.3 prettyunits_1.0.2 jsonlite_1.6
[26] pkgconfig_2.0.2

Token

$APP_NAME = dons...


DonBunk commented Aug 6, 2019

I believe the issue is in lines 279-280 of search_tweets.R:

        .search_tweets(
          q = q, n = remaining,...)

I do not think that should be n = remaining, because remaining is set to 18,000 on line 276, so every request asks for a full 18,000 tweets regardless of n. Instead, I think it should be the ith entry of a list that breaks the total number of tweets requested into rate_limit-sized chunks, plus the final remainder. Below is a function I wrote that does that, but I am sure someone else could come up with something more elegant.

get_list_of_n <- function(total_n, query_max, remaining_queries = query_max) {
  # Returns a list of request sizes: the total number requested, broken up
  # into rate_limit chunks, plus the final remainder.
  # remaining_queries is what is left in the current rate-limit window; the
  # default (a full window) is an assumption so that it can be omitted.

  # If total_n fits in the current window, it is just a single request.
  if (total_n < remaining_queries) {
    return(list(total_n))
  }

  # The first request uses up whatever is left in the current window, if anything.
  if (remaining_queries == 0) {
    n_list <- list()
  } else {
    n_list <- list(remaining_queries)
  }

  # What is still needed once the current window is exhausted.
  still_needed <- total_n - remaining_queries
  integer_n <- still_needed %/% query_max
  remainder <- still_needed %% query_max

  if (remainder > 0) {
    c(n_list, rep(query_max, integer_n), remainder)
  } else {
    c(n_list, rep(query_max, integer_n))
  }
}
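
For comparison, here is a more compact sketch of the same chunking idea. The name chunk_sizes and its arguments are hypothetical and not part of rtweet; it assumes the first request is capped by whatever is left in the current rate-limit window and every later request by the full window size.

```r
# Hypothetical helper (not part of rtweet): split a requested total into
# per-request sizes. `remaining` is what is left in the current window and
# is assumed to default to a full window.
chunk_sizes <- function(total_n, query_max, remaining = query_max) {
  stopifnot(total_n >= 0, query_max > 0, remaining >= 0)
  first <- min(total_n, remaining)   # what the current window can serve
  rest <- total_n - first            # what must wait for later windows
  sizes <- c(first, rep(query_max, rest %/% query_max), rest %% query_max)
  sizes[sizes > 0]                   # drop empty requests
}

chunk_sizes(10, 18000)          # 10
chunk_sizes(40000, 18000)       # 18000 18000 4000
chunk_sizes(40000, 18000, 500)  # 500 18000 18000 3500
```

The sizes always sum to total_n, so a loop that requests one chunk per iteration cannot overshoot the requested limit.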

I call this function on line 265, where the rt list is first populated, with n_list <- get_list_of_n(total_n = n, query_max = 18000), and then change lines 279-280 of search_tweets.R to:

        .search_tweets(
          q = q, n = n_list[[i]],...)
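
To show the effect of that change without touching the Twitter API, here is a minimal sketch of the loop with a stand-in for the real call. Both fake_search and search_in_chunks are hypothetical names used only for illustration; the real .search_tweets() takes more arguments and returns tweet data.

```r
# Stand-in for the real API call: returns a data frame with exactly n rows.
# (Hypothetical; the real .search_tweets() talks to the Twitter API.)
fake_search <- function(q, n) {
  data.frame(query = rep(q, n), id = seq_len(n))
}

# Request each chunk in turn instead of asking for a full 18,000 every time.
search_in_chunks <- function(q, n_list) {
  rt <- vector("list", length(n_list))
  for (i in seq_along(n_list)) {
    rt[[i]] <- fake_search(q = q, n = n_list[[i]])
  }
  do.call(rbind, rt)
}

res <- search_in_chunks("datascience", list(18000, 18000, 4000))
nrow(res)  # 40000
```

With a single-entry list such as list(10), the same loop makes one request for 10 rows, which is the behavior expected from n = 10.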

Please let me know if you have any questions or if I should just submit a formal pull request.


manu-torres commented Oct 20, 2019

I am still having this issue as of October 2019 with rtweet version 0.6.9. If I search without the retryonratelimit = TRUE parameter, I get the expected number of tweets, but when I include that option it ignores the n option and tries to retrieve all available tweets.

Is there any way I can fix this on my installation, or do I have to wait for it to be patched?

It's a pity, because this is by far the best Twitter interface for R and this is the only "drawback" that I have found. With that fixed, it would be perfect for sentiment analysis.

llrs mentioned this issue on Feb 15, 2021
llrs added the bug label on Feb 16, 2021

hadley commented Feb 27, 2021

Now tracking in #510

hadley closed this as completed on Feb 27, 2021