If you’re looking for historical Tweets about a highly specific topic, time and/or location, then the Twitter Advanced Search option is a great tool! While it’s a good option for manually browsing Tweets, it doesn’t offer an export option to collect the raw Tweets into a CSV file for further analysis and processing. Furthermore, Twitter search may not show all results for a search string, as the website can use sampling and hide certain results.
To get around these obstacles and properly export Tweets from the Twitter archive, we need to use the Twitter API, which lets us search Tweets with all the options advanced search provides, but with even more granularity (like searching by specific location, something advanced search doesn’t support).
One caveat is that the Twitter API will only let you search 7 days into the past unless you’re approved for the Academic Research access level, which allows you to search by specific dates beyond 7 days ago. So if you need to scrape historical Tweets you will need to look into obtaining this level of access, or you can use the older Twitter V1.1 Enterprise API, which provides 30 days of Tweets.
The 7-day limit is often not a problem, though, as the whole point of Twitter is to know what’s happening in real time. For that use case, the free 7-day Tweet search the API offers is sufficient for most research projects.
In this article, we’ll walk through the Twitter API V2 Tweet Search Endpoint, with detailed examples of advanced operators such as location search and time ranges.
You’ll find that nearly all of the search functionality is wrapped into a single
query parameter that accepts a string of keywords and search operators. We’ll walk through these examples here so you can better understand what’s possible with the API, or you can browse the official Twitter Query Operator Documentation for more information.
We’ll cover the most common use cases below, but please check the above links if you’d like to see ALL of the possible operators.
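As a rough illustration of how the single query parameter travels in a request, here is a minimal Python sketch. It assumes the v2 recent-search endpoint URL and Bearer-token authentication as they were documented at the time of writing; `build_search_url` and `auth_headers` are hypothetical helper names, and the sketch only builds the request rather than sending it.

```python
from urllib.parse import urlencode

# Assumed endpoint: v2 recent search, which covers roughly the past 7 days.
SEARCH_URL = "https://api.twitter.com/2/tweets/search/recent"


def build_search_url(query: str, max_results: int = 10) -> str:
    """Return the request URL with the whole search string URL-encoded."""
    params = {"query": query, "max_results": max_results}
    return f"{SEARCH_URL}?{urlencode(params)}"


def auth_headers(bearer_token: str) -> dict:
    """Header used for app-only (Bearer token) authentication."""
    return {"Authorization": f"Bearer {bearer_token}"}


# Keywords and operators all go into the one `query` string.
url = build_search_url('"free beer" OR "free vodka"')
print(url)
```

The URL and headers can then be passed to whatever HTTP client you prefer.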
Scraping with Stevesie If you need help collecting the results out of this endpoint into CSV format, you can see our Tweets & Archive Search Integration which will allow you to extract Twitter data without writing any code. See the workflow version to scrape thousands of results (auto-pagination) for your search terms.
The easiest search to perform is a keyword search: just enter one, like
beer and you’ll get back results. By default, entering multiple keywords will use
AND logic and only return Tweets containing both keywords. E.g. a search for
free beer will return Tweets containing both of these words.
OR logic If you want to change the behavior to return Tweets containing either word, then you can put an
OR in between your search terms, e.g. a search for
beer OR vodka will return Tweets containing either of these terms.
Sub-groups If you want to combine this logic, you can use parentheses to make the grouping explicit. E.g.
(free beer) OR (free vodka) will return Tweets that either contain both free and beer OR free and vodka.
Exact Match If you need to guarantee that the search terms in an
AND clause appear next to each other, you can simply quote them. This will automatically group them too, so we can use our example above and change it to
"free beer" OR "free vodka" and the results will only show Tweets with either of these exact matches.
Negation You can also specify words to not include in your results with the
- sign. Simply prepend it to a clause, e.g.
beer -free will exclude any results containing the word free. This also works with groups or quotes, e.g.
-(free beer) or
To find all Tweets containing a hashtag, simply provide it as your query with the
# sign. E.g.
#beer will return all Tweets with this hashtag. You can also use this in conjunction with keyword searches, e.g.
#beer -free will exclude any Tweets with the word “free.”
To find all Tweets from a certain user, simply prepend the username with
from: and use it as a search operator just like above. To find Tweets from multiple accounts in a single search, you can combine the clauses with OR, e.g.
from:twitterdev OR from:twitterapi - don’t forget the OR, as the default
AND logic will probably not work so well here ;)
You can filter Tweets by language using the
lang: operator. Just add it to your search term and specify a
BCP 47 language. E.g.
lang:en to filter for only English Tweets. You can see the full list of supported languages in building a query (scroll to the bottom).
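The keyword, user and language operators described so far are plain text, so they can be assembled with ordinary string handling. The sketch below is one possible way to do that; the helper names (`any_of`, `exact`, `exclude`, `from_users`) are made up for illustration and are not part of any Twitter library.

```python
def any_of(*clauses: str) -> str:
    """Join clauses with OR and wrap them in parentheses to keep the grouping explicit."""
    return "(" + " OR ".join(clauses) + ")"


def exact(phrase: str) -> str:
    """Quote a phrase so its words must appear next to each other."""
    return f'"{phrase}"'


def exclude(clause: str) -> str:
    """Negate a keyword, quoted phrase or group by prepending a minus sign."""
    return f"-{clause}"


def from_users(*usernames: str) -> str:
    """Match Tweets from any of the given accounts (OR, not AND)."""
    return any_of(*(f"from:{name}" for name in usernames))


# Clauses separated by spaces are combined with the default AND logic.
query = " ".join([
    any_of(exact("free beer"), exact("free vodka")),
    exclude("wine"),
    from_users("twitterdev", "twitterapi"),
    "lang:en",
])
print(query)
```

Wrapping every OR group in parentheses is a deliberate choice here: it avoids relying on operator precedence when the group is mixed with AND clauses.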
Academic Track Required for location based Tweet scraping. You can find Tweets that are tagged with a geolocation if you have access to the Academic Track. There are four main options to find Tweets by location we’ll walk through below:
Place Search This will match Tweets for a specific
place that Twitter has indexed and knows about. You can often get Place IDs from other Tweets you searched for, or you can specify by name like “new york city”. Some examples are:
place:"new york city" OR place:seattle OR place:fd70c22040963ac7
Country Search You can also specify Tweet location by country with the
place_country operator. Use it with any valid ISO Country Code like this:
place_country:US OR place_country:MX OR place_country:CA
Coordinate Search If you have a specific latitude and longitude in mind, you can use the
point_radius operator with a radius of up to 25 miles. The syntax is as follows:
point_radius:[longitude latitude radius] (including the brackets) and an example is:
point_radius:[2.355128 48.861118 16km] OR point_radius:[174.761070 -41.287336 20mi]
Bounding Box If you have a larger area to search for, you can specify 2 coordinates to form a bounding rectangle (called “box” for some reason) to restrict the Tweet search to. The format is:
bounding_box:[west_long south_lat east_long north_lat] where the width and height of the box can be up to 25 miles. An example is:
bounding_box:[-105.301758 39.964069 -105.178505 40.09455]
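Because both coordinate operators expect longitude before latitude, it is easy to swap the two by accident. The following sketch formats the operators and rejects out-of-range values; the function names are hypothetical, the 25-mile radius limit is the one stated above, and the box size is not checked since that would need a distance calculation.

```python
MAX_RADIUS_MI = 25.0      # radius limit described in this article
KM_PER_MILE = 1.609344


def _check_lon_lat(lon: float, lat: float) -> None:
    """Raise if the pair cannot be a valid (longitude, latitude)."""
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")


def point_radius(lon: float, lat: float, radius: float, unit: str = "mi") -> str:
    """Format a point_radius operator; longitude comes first."""
    _check_lon_lat(lon, lat)
    if unit not in ("mi", "km"):
        raise ValueError(f"unit must be 'mi' or 'km', got {unit!r}")
    miles = radius if unit == "mi" else radius / KM_PER_MILE
    if not 0 < miles <= MAX_RADIUS_MI:
        raise ValueError(f"radius must be between 0 and {MAX_RADIUS_MI} miles")
    return f"point_radius:[{lon} {lat} {radius}{unit}]"


def bounding_box(west: float, south: float, east: float, north: float) -> str:
    """Format a bounding_box operator (box size is NOT validated here)."""
    _check_lon_lat(west, south)
    _check_lon_lat(east, north)
    return f"bounding_box:[{west} {south} {east} {north}]"


print(point_radius(2.355128, 48.861118, 16, "km"))
print(bounding_box(-105.301758, 39.964069, -105.178505, 40.09455))
```

Passing a latitude where the longitude belongs will often trip the range check, for example when the value ends up above 90, though a swap where both numbers happen to be small would still slip through.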
While time range search is available to all access levels, in reality it’s probably only useful for the Academic Research Track, as you’ll otherwise be limited to only the past week of Twitter data. To specify a time range to get Tweets back, you need to use the
start_time and end_time fields, providing exact datetimes in ISO 8601 format. E.g.
YYYY-MM-DDTHH:mm:ssZ. Note that these fields are not part of the
query field, unlike everything described earlier.
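To make the separation between the search string and the time range concrete, here is a small sketch that formats datetimes in the ISO 8601 shape shown above and places them next to the query as their own request parameters. It assumes the parameter names `start_time` and `end_time` from the v2 search endpoint; `iso8601` and `search_params` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone


def iso8601(dt: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:mm:ssZ in UTC."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def search_params(query: str, start: datetime, end: datetime) -> dict:
    """Build request parameters; the time range sits beside the query, not inside it."""
    start_str, end_str = iso8601(start), iso8601(end)
    if start >= end:
        raise ValueError("start must be earlier than end")
    return {"query": query, "start_time": start_str, "end_time": end_str}


# Example: a three-day window ending at a fixed point in time.
end = datetime(2022, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
start = end - timedelta(days=3)
params = search_params("#beer lang:en", start, end)
print(params)
```

Converting everything to UTC before formatting keeps the trailing `Z` honest, whichever local timezone the datetimes started in.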