search_tweets pulls way more tweets than n specified #449

Closed
send-me-dogs opened this issue Oct 2, 2020 · 4 comments

@send-me-dogs

Problem

I run a line like the one below, asking for 10 tweets containing the word "funny", but it usually returns over 12,000 tweets. I've tried other low values of n, and the result is always the same.

data <- search_tweets("funny", n = 10, include_rts = FALSE)

rtweet version

0.7.0

Token

request: https://api.twitter.com/oauth/request_token
authorize: https://api.twitter.com/oauth/authenticate
access: https://api.twitter.com/oauth/access_token
rstats2twitter
key: 6j7Ig4xzHlBr8uUJ5A4Ym0NTf
secret:
oauth_token, oauth_token_secret, user_id, screen_name
--

@Arf9999

Arf9999 commented Oct 19, 2020

I can't reproduce this.

> data <- rtweet::search_tweets("funny", n = 10, include_rts = FALSE)
> glimpse(data)
Rows: 10
Columns: 90
$ user_id                 <chr> "72434931", "1560525734", "1250740123076923392", "1264512468782477312", "1315992408904077…
$ status_id               <chr> "1318134034136993797", "1318134033096822784", "1318134033067302912", "1318134028441022464…
$ created_at              <dttm> 2020-10-19 10:16:58, 2020-10-19 10:16:57, 2020-10-19 10:16:57, 2020-10-19 10:16:56, 2020…
$ screen_name             <chr> "breezy_brynn", "Blu4evrlvsJDC", "renhyunn", "WillOfTheKing", "luciflary", "unboundings",…
$ text                    <chr> "This is funny only if those waters aren’t dangerous which they probably aren’t lol \U000…
$ source                  <chr> "Twitter for iPhone", "Twitter Web App", "Twitter for Android", "Twitter for Android", "N…
$ display_text_width      <dbl> 106, 237, 28, 32, 33, 187, 59, 121, 115, 57
$ reply_to_status_id      <chr> NA, "1317980807915540481", NA, NA, NA, NA, NA, NA, NA, NA
$ reply_to_user_id        <chr> NA, "520481378", NA, NA, NA, NA, NA, NA, NA, NA
$ reply_to_screen_name    <chr> NA, "brnxsheri", NA, NA, NA, NA, NA, NA, NA, NA
$ is_quote                <lgl> TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE
$ is_retweet              <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
$ favorite_count          <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ retweet_count           <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
$ quote_count             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ reply_count             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ hashtags                <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA]
$ symbols                 <list> [NA, NA, NA, NA, NA, NA, NA, NA, NA, NA]
$ urls_url                <list> ["twitter.com/nochlllwlll/st…", NA, "twitter.com/yutafirst/stat…", "twitter.com/OniNoSan…
$ urls_t.co               <list> ["https://t.co/BC7nmrXZvt", NA, "https://t.co/WBcPpQqMDJ", "https://t.co/oGJVkL10OH", NA…
$ urls_expanded_url       <list> ["https://twitter.com/nochlllwlll/status/1317671130744979463", NA, "https://twitter.com/…
$ media_url               <list> [NA, NA, NA, NA, "http://pbs.twimg.com/media/Ekrz6UNUcAExTUd.jpg", NA, NA, NA, NA, NA]
$ media_t.co              <list> [NA, NA, NA, NA, "https://t.co/2Qh0Qd8wqV", NA, NA, NA, NA, NA]
$ media_expanded_url      <list> [NA, NA, NA, NA, "https://twitter.com/luciflary/status/1318134025769283584/photo/1", NA,…
$ media_type              <list> [NA, NA, NA, NA, "photo", NA, NA, NA, NA, NA]
$ ext_media_url           <list> [NA, NA, NA, NA, "http://pbs.twimg.com/media/Ekrz6UNUcAExTUd.jpg", NA, NA, NA, NA, NA]
$ ext_media_t.co          <list> [NA, NA, NA, NA, "https://t.co/2Qh0Qd8wqV", NA, NA, NA, NA, NA]
$ ext_media_expanded_url  <list> [NA, NA, NA, NA, "https://twitter.com/luciflary/status/1318134025769283584/photo/1", NA,…
$ ext_media_type          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ mentions_user_id        <list> [NA, <"520481378", "14480351", "1267210398266204162">, NA, NA, NA, NA, NA, NA, NA, NA]
$ mentions_screen_name    <list> [NA, <"brnxsheri", "adamblust", "Jennife79527579">, NA, NA, NA, NA, NA, NA, NA, NA]
$ lang                    <chr> "en", "en", "en", "en", "en", "en", "en", "en", "en", "en"
$ quoted_status_id        <chr> "1317671130744979463", NA, "1317817105538953216", "1318133850061623296", NA, NA, NA, NA, …
$ quoted_text             <chr> "she left her cheating bf at sea LMAOO https://t.co/pO6NS0U23H", NA, "yuta: walking \n\nj…
$ quoted_created_at       <dttm> 2020-10-18 03:37:33, NA, 2020-10-18 13:17:36, 2020-10-19 10:16:14, NA, NA, NA, NA, NA, 2…
$ quoted_source           <chr> "Twitter for iPhone", NA, "Twitter for Android", "Twitter for Android", NA, NA, NA, NA, N…
$ quoted_favorite_count   <int> 181462, NA, 3527, 0, NA, NA, NA, NA, NA, 25
$ quoted_retweet_count    <int> 33946, NA, 1115, 0, NA, NA, NA, NA, NA, 12
$ quoted_user_id          <chr> "883455697", NA, "1139883418492256256", "1316026461082451972", NA, NA, NA, NA, NA, "74952…
$ quoted_screen_name      <chr> "NOCHlLLWlLL", NA, "yutafirst", "OniNoSantoryu", NA, NA, NA, NA, NA, "vuyiswamb"
$ quoted_name             <chr> "lil uzi vers", NA, "sab`", "‍ ‍ ‍ ‍ ‍ ‍𝘡𝘖𝘙𝘖", NA, NA, NA, NA, NA, "Vuvu Acting King Yama…
$ quoted_followers_count  <int> 4262, NA, 4819, 97, NA, NA, NA, NA, NA, 23908
$ quoted_friends_count    <int> 798, NA, 3662, 97, NA, NA, NA, NA, NA, 19639
$ quoted_statuses_count   <int> 152158, NA, 28270, 123, NA, NA, NA, NA, NA, 47252
$ quoted_location         <chr> "dmv", NA, "", "", NA, NA, NA, NA, NA, "Pretoria , Mamelodi"
$ quoted_description      <chr> "Don’t care, didn’t ask, plus you’re not black #BLACKLIVESMATTER", NA, "♡特殊な: #YUTA ؛ ✒⠁t…
$ quoted_verified         <lgl> FALSE, NA, FALSE, FALSE, NA, NA, NA, NA, NA, FALSE
$ retweet_status_id       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_text            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_created_at      <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_source          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_favorite_count  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_retweet_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_user_id         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_screen_name     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_name            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_followers_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_friends_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_statuses_count  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_location        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_description     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ retweet_verified        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ place_url               <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ place_name              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ place_full_name         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ place_type              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ country                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ country_code            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ geo_coords              <list> [<NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA…
$ coords_coords           <list> [<NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA…
$ bbox_coords             <list> [<NA, NA, NA, NA, NA, NA, NA, NA>, <NA, NA, NA, NA, NA, NA, NA, NA>, <NA, NA, NA, NA, NA…
$ status_url              <chr> "https://twitter.com/breezy_brynn/status/1318134034136993797", "https://twitter.com/Blu4e…
$ name                    <chr> "breezy \U0001f618", "LC", "yan | semi-ia", "𝙼𝚘𝚗𝚔𝚎𝚢 𝕯▪︎𝙻𝚞𝚏𝚏𝚢", "luci ☆ playing nicola’s r…
$ location                <chr> "", "", "+65 she/her ☁️", "{ Shinsekai }", "", "any prns", "East London, South Africa", "…
$ description             <chr> "", "Happily Married. No DM's #resist #fbr #blm #voteblue #Biden2020", "ɴ\u1d04\u1d1b 𝒏𝒐𝒊…
$ url                     <chr> NA, NA, NA, NA, NA, "https://t.co/gt6d4CZ5Yu", NA, NA, "https://t.co/8m88cQsZl1", "https:…
$ protected               <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
$ followers_count         <int> 145, 1188, 470, 106, 89, 156, 5592, 323, 992, 3154
$ friends_count           <int> 321, 1205, 626, 81, 91, 112, 694, 387, 774, 698
$ listed_count            <int> 1, 1, 4, 0, 0, 3, 0, 2, 16, 7
$ statuses_count          <int> 11145, 5343, 554, 199, 87, 1496, 10963, 1551, 26426, 45143
$ favourites_count        <int> 32523, 4909, 2329, 137, 130, 725, 205, 1622, 55139, 15865
$ account_created_at      <dttm> 2009-09-08 01:11:50, 2013-07-01 13:15:26, 2020-04-16 10:57:51, 2020-05-24 11:04:35, 2020…
$ verified                <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE
$ profile_url             <chr> NA, NA, NA, NA, NA, "https://t.co/gt6d4CZ5Yu", NA, NA, "https://t.co/8m88cQsZl1", "https:…
$ profile_expanded_url    <chr> NA, NA, NA, NA, NA, "https://kalosprofessor.carrd.co/", NA, NA, "http://tadatonin.carrd.c…
$ account_lang            <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
$ profile_banner_url      <chr> "https://pbs.twimg.com/profile_banners/72434931/1585276761", "https://pbs.twimg.com/profi…
$ profile_background_url  <chr> "http://abs.twimg.com/images/themes/theme10/bg.gif", "http://abs.twimg.com/images/themes/…
$ profile_image_url       <chr> "http://pbs.twimg.com/profile_images/1243367201995993088/nBcam0q1_normal.jpg", "http://pb…

@AlexB51

AlexB51 commented Oct 26, 2020

I've had the same issue with search_tweets, as well as with get_friends and get_followers. For me, the returned counts roughly appear to round up to the rate limits. For example, search_tweets(query, n = 10000) returned 18,000 tweets, and get_followers(acct_name, n = 100000) returned 140,000 users (70k rate limit before timeout). I'm not sure whether you've seen the same pattern on your side. I'm also using rtweet 0.7.0.
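
The counts reported above are consistent with results being rounded up to whole rate-limit windows. Below is a minimal sketch of that arithmetic. It assumes the standard search limit of 180 requests per 15-minute window at up to 100 tweets per request, and it uses the 70k follower figure quoted in the comment. `round_up_to_window` is a hypothetical helper, not an rtweet function, and the sketch only illustrates the observed numbers, not the package's internals.

```r
# Hypothetical helper (not part of rtweet): round a requested count
# up to the next whole rate-limit window.
round_up_to_window <- function(n, window_size) {
  ceiling(n / window_size) * window_size
}

# Assumed search limit: 180 requests per 15 minutes x 100 tweets per request.
search_window <- 180 * 100

# Follower window size as quoted in the comment above (an assumption here).
followers_window <- 70000

round_up_to_window(10000, search_window)      # 18000
round_up_to_window(100000, followers_window)  # 140000
```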

llrs mentioned this issue Feb 15, 2021
llrs added the bug label Feb 18, 2021
@llrs
Member

llrs commented Feb 18, 2021

There seems to be a bug somewhere when retryonratelimit = TRUE: I get 179,000 results with:

data <- search_tweets("funny", n = 10000, include_rts = FALSE, retryonratelimit = TRUE)
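
Until the underlying bug is fixed, one possible stopgap is to trim the returned data frame to the requested size. Below is a minimal sketch, assuming the result is an ordinary data frame with one row per tweet. `trim_to_n` is a hypothetical helper, not part of rtweet, and the stand-in data frame replaces a real API call.

```r
# Hypothetical helper (not part of rtweet): keep at most n rows.
trim_to_n <- function(tweets, n) {
  utils::head(tweets, n)
}

# Stand-in for an oversized search_tweets() result; no API call is made.
oversized <- data.frame(
  status_id = as.character(seq_len(25)),
  stringsAsFactors = FALSE
)

trimmed <- trim_to_n(oversized, n = 10)
nrow(trimmed)  # 10
```

This only discards the surplus rows after the fact. It does not reduce the number of API requests made or the time spent waiting on rate limits.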

@hadley
Member

hadley commented Feb 27, 2021

Closing in favour of #510.

hadley closed this as completed Feb 27, 2021