
get_friends() should return a 0 x 2 tibble on failure #339

Closed
alexpghayes opened this issue Jun 24, 2019 · 9 comments

@alexpghayes
Contributor

Problem

When get_friends() fails, for example when requesting the friend list of an account that you do not have access to, it returns a 0 x 0 tibble. This breaks any code that assumes you will receive a two-column tibble.

library(rtweet)

f <- get_friends("@karlrohe")
f
#> # A tibble: 225 x 2
#>    user      user_id            
#>    <chr>     <chr>              
#>  1 @karlrohe 1140773875837521924
#>  2 @karlrohe 26637348           
#>  3 @karlrohe 895195137663528964 
#>  4 @karlrohe 143430352          
#>  5 @karlrohe 344363822          
#>  6 @karlrohe 140337831          
#>  7 @karlrohe 1115423427907682307
#>  8 @karlrohe 1539769902         
#>  9 @karlrohe 3183383692         
#> 10 @karlrohe 47516412           
#> # ... with 215 more rows

# protected account I don't follow
f2 <- get_friends("@RababAlkhalifa")
#> Warning: /1.1/friends/ids.json - Not authorized.
#> Warning: ^^ warning regarding user: @RababAlkhalifa
f2
#> # A tibble: 0 x 0

colnames(f) <- c("from", "to")

colnames(f2) <- c("from", "to")
#> Error in attr(x, "names") <- as.character(value): 'names' attribute [2] must be the same length as the vector [0]

Created on 2019-06-24 by the reprex package (v0.2.1)

Proposed new behavior

get_friends("@RababAlkhalifa")
#> Warning: /1.1/friends/ids.json - Not authorized.
#> Warning: ^^ warning regarding user: @RababAlkhalifa
#> # A tibble: 0 x 2
#>    user      user_id            
#>    <chr>     <chr>    
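
Until the function itself returns a consistent shape, one workaround is to normalise the result on the caller's side. The following is a minimal sketch, not rtweet's behavior: `ensure_friend_cols()` is a hypothetical helper, and `tibble()` stands in for the 0 x 0 result shown above so the example runs without API access.

```r
library(tibble)

# Hypothetical helper (not part of rtweet): replace a column-less result with
# a zero-row tibble that still has the two expected character columns.
ensure_friend_cols <- function(x) {
  if (ncol(x) == 0) {
    return(tibble(user = character(), user_id = character()))
  }
  x
}

f2 <- ensure_friend_cols(tibble())  # tibble() stands in for the 0 x 0 result
colnames(f2) <- c("from", "to")     # renaming now works
dim(f2)
#> [1] 0 2
```

Downstream code that renames or binds columns can then treat failed and successful lookups the same way.
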
@hmeleiro

I'm having the same issue, and I should add that it is a problem not only for code that assumes it is receiving a two-column data.frame, but for the function itself. If I run this code, where the first user is a protected account I don't follow, the process stops and returns nothing:

users <- c("Don_Pelayo_2014", "hmeleiros")
friends <- get_friends(users = users)

#> Warning: /1.1/friends/ids.json - Not authorized.
#> Warning: ^^ warning regarding user: Don_Pelayo_2014
#> 1 friend networks collected!
#> Error: Must extract column with a single valid subscript.
#> x Subscript `grep("id$", names(x))[1]` can't be `NA`.
#> Run `rlang::last_error()` to see where the error occurred.

@AltfunsMA

Below is another example with a user that no longer exists.

If any user_id or screen_name fails for whatever reason, the pipeline inside rtweet itself breaks down, regardless of whether the others succeed. This is particularly frustrating if you've let it run with retryonratelimit and user No. 120 happens to have removed their account.

rtweet::get_friends(c("796572560821485568", "BarackObama"), token = tk)
#> Warning: 34 - Sorry, that page does not exist.
#> Warning: ^^ warning regarding user: 796572560821485568
#> 1 friend networks collected!
#> Error: Must extract column with a single valid subscript.
#> x Subscript `grep("id$", names(x))[1]` can't be `NA`.
#> Run `rlang::last_error()` to see where the error occurred.

@llrs
Member

llrs commented Feb 16, 2021

I think this was requested in another issue, but since I can't find it now, I'll leave this one open in the hope of fixing it.

@hadley
Member

hadley commented Mar 5, 2021

In the dev version, this now throws an error. I think this is the best that rtweet can do. I don't think it makes sense to silently drop errors because there are many potential errors that you do want to know about (e.g. your auth is bad, or your internet connection is down), and presumably even if the error is a non-existent user, you'd still want to know exactly which users didn't exist. Instead, I think you'd be better off using some standard technique for dealing with failure, e.g.:

library(purrr)
users <- c("796572560821485568", "BarackObama")

results <- map(users, safely(rtweet::get_friends))
#> Reading token from ~/Library/Caches/rtweet/auth.rds
results <- transpose(results)
results$error
#> [[1]]
#> <simpleError: Twitter API failed [404]
#>  * Sorry, that page does not exist. (34)>
#> 
#> [[2]]
#> NULL

dplyr::bind_rows(results$result)
#> # A tibble: 5,000 x 2
#>    user        ids       
#>    <chr>       <chr>     
#>  1 BarackObama 1330457336
#>  2 BarackObama 30354991  
#>  3 BarackObama 3157910605
#>  4 BarackObama 739119366 
#>  5 BarackObama 1111940934
#>  6 BarackObama 61636488  
#>  7 BarackObama 22255654  
#>  8 BarackObama 22712235  
#>  9 BarackObama 346197350 
#> 10 BarackObama 18509818  
#> # … with 4,990 more rows

Created on 2021-03-05 by the reprex package (v1.0.0)
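
To see exactly which users failed, the transposed error list can be matched back to the input vector. This is a rough sketch of one way to do it: `fake_get_friends()` is a hypothetical stand-in for `rtweet::get_friends()` so the example runs offline, and its error message is made up for illustration.

```r
library(purrr)

# Hypothetical stand-in for rtweet::get_friends(): errors for one user and
# returns a small data frame for any other.
fake_get_friends <- function(user) {
  if (user == "796572560821485568") {
    stop("Twitter API failed [404]")
  }
  data.frame(user = user, ids = c("1", "2"), stringsAsFactors = FALSE)
}

users <- c("796572560821485568", "BarackObama")
results <- transpose(map(users, safely(fake_get_friends)))

# TRUE where the request produced an error
failed <- !map_lgl(results$error, is.null)

# Which users failed, and with what message
users[failed]
map_chr(results$error[failed], conditionMessage)

# Combine only the successful results
friends <- do.call(rbind, results$result[!failed])
```

Keeping `failed` aligned with `users` makes it easy to log or retry the problem accounts separately.
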

@AltfunsMA

AltfunsMA commented Mar 9, 2021

How does this interact with retryonratelimit? I'm guessing now that get_friends() can call rate_limit() every time (or perhaps the return object contains this info?), so it will actually work pretty seamlessly.

However, the retryonratelimit switch in get_friends() and other functions suggests they will handle rate limits on a per-call basis, as opposed to having a global retryonratelimit (see #173). I avoided safely() and similar approaches because I thought I'd have to implement the rate limiting myself (which nonetheless seemed doable even within rtweet, given that rate_limit() is exported).

Maybe the fact that retryonratelimit will take into account previous calls to the same function needs to be explicitly noted in the docs?

@hadley
Member

hadley commented Mar 9, 2021

This is mostly being tracked in #510; once that's complete, every function will have an argument that lets you wait until the rate limit resets (which will default to a global option, as in #173).

I'm not sure what you mean in your last question.

@AltfunsMA

AltfunsMA commented Mar 11, 2021

I meant that something like counter in map(my_iter, ~f(.x, counter = TRUE)) would normally restart with each iteration; so retryonratelimit is doing something special. But the explanation I was looking for is actually already in the Details section of get_friends or get_followers:

When retryonratelimit = TRUE this function internally makes a rate limit API call to get information on (a) the number of requests remaining and (b) the amount of time until the rate limit resets. [my emphasis]

It is not spelt out in search_tweets so that's probably where it stuck in my head that rate limits were calculated internally by each function... I realise now that would actually be pretty hard/useless and that some of my code likely relies on that not being the case... but perhaps it's good to make clear across the board how retryonratelimit (or its replacement) works, which I'm sure you'll get to once #510 is done!

@hadley
Member

hadley commented Mar 11, 2021

@AltfunsMA that's not how retryonratelimit works any more. It now inspects the rate-limit header that's returned by every request.
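
For readers unfamiliar with the header-based approach, here is an illustrative sketch of the general idea. It assumes the standard Twitter API v1.1 response headers `x-rate-limit-remaining` and `x-rate-limit-reset`; `seconds_until_reset()` is a hypothetical helper and is not how rtweet itself is implemented.

```r
# Hypothetical helper: given response headers as a named list, return how many
# seconds to wait before the next request (0 if requests remain).
seconds_until_reset <- function(headers, now = Sys.time()) {
  remaining <- as.numeric(headers[["x-rate-limit-remaining"]])
  reset <- as.numeric(headers[["x-rate-limit-reset"]])  # Unix epoch seconds
  if (remaining > 0) {
    return(0)
  }
  max(0, reset - as.numeric(now))
}

# Made-up headers: no requests left, window resets 90 seconds from `now`
now <- as.POSIXct("2021-03-11 12:00:00", tz = "UTC")
headers <- list(
  "x-rate-limit-remaining" = "0",
  "x-rate-limit-reset" = as.character(as.numeric(now) + 90)
)
seconds_until_reset(headers, now = now)
#> [1] 90
```

Because the information arrives with every response, no separate rate-limit API call is needed, and the state carries across repeated calls such as those made inside map().
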

@hadley
Member

hadley commented Mar 31, 2021

Closing since now mostly fixed in dev, and the remaining bits are tracked in #510.

@hadley closed this as completed Mar 31, 2021