Last active
December 11, 2021 11:07
Instagram scraping user tags 2020
Brief demonstration of how to scrape and collect user tags without needing login information.
Many data endpoints, such as "viewing followers", are blocked for public (not logged in) users.
This is a way to get around that block for one of these endpoints.
Scrape user tags (the same data shown at instagram.com/username/tagged):
https://www.instagram.com/graphql/query/?query_hash=31fe64d9463cbbe58319dced405c6206&variables={"id":"29883180","first":12}
URL breakdown
# Start of the URL
https://www.instagram.com/graphql/query/
# query_hash is the query "type" (e.g. user tagged or user followers): an opaque string that tells Instagram to return "tagged posts" for the given user
?query_hash=31fe64d9463cbbe58319dced405c6206
# id is the profile id of the target user. This is the first call, before pagination begins. For tagged posts, 12 is the maximum for "first"; note that Instagram doesn't always return exactly 12
&variables={"id":"29883180","first":12}
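The breakdown above can be assembled in code rather than by hand. This is a minimal Python sketch using only the standard library. The helper name build_tagged_url is hypothetical, and the query hash is simply the one quoted in this gist. The JSON in "variables" is percent-encoded so the URL is safe to send as-is.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://www.instagram.com/graphql/query/"
# Query hash for "tagged posts", copied from this gist.
TAGGED_QUERY_HASH = "31fe64d9463cbbe58319dced405c6206"


def build_tagged_url(user_id, first=12, after=None):
    """Build the tagged-posts query URL for a profile id (hypothetical helper)."""
    variables = {"id": str(user_id), "first": first}
    if after is not None:
        # Cursor taken from a previous response; omitted on the first call.
        variables["after"] = after
    params = {
        "query_hash": TAGGED_QUERY_HASH,
        # Compact separators give the {"id":"...","first":12} form shown above.
        "variables": json.dumps(variables, separators=(",", ":")),
    }
    # urlencode percent-encodes the braces, quotes, colons and commas.
    return BASE_URL + "?" + urlencode(params)


print(build_tagged_url("29883180"))
```

Decoding the "variables" parameter of the printed URL gives back the same {"id":"29883180","first":12} object as in the breakdown.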
Pagination
Once we load the initial URL, we get back JSON containing all the images/data.
To paginate deeper, we need to pass the next cursor.
Find "end_cursor" in the JSON; it should look like a messy string ending in "==", like so:
"QVFEZ2FULUZTdHE5N2ZtWGJXNDVrWkZJZzVKY3VHOUtGT2F2UUZOaGpjeWpsOFc0Y1UzdU9MaEJYUERZRENMc0NKSzZsNnF4dU1WTW9DOHNIb2JGN2F0WA=="
In the URL, change &variables={"id":"29883180","first":12} to
&variables={"id":"29883180","first":12,"after":"END_CURSOR_STRING_HERE"}
Example
https://www.instagram.com/graphql/query/?query_hash=31fe64d9463cbbe58319dced405c6206&variables={"id":"29883180","first":12,"after":"QVFEZ2FULUZTdHE5N2ZtWGJXNDVrWkZJZzVKY3VHOUtGT2F2UUZOaGpjeWpsOFc0Y1UzdU9MaEJYUERZRENMc0NKSzZsNnF4dU1WTW9DOHNIb2JGN2F0WA=="}
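The pagination steps can be sketched as a loop: fetch a page, find "end_cursor", and pass it back as "after". The Python below is a sketch under stated assumptions, not a definitive implementation. All function names are hypothetical, fetch_json is a caller-supplied function that takes a URL and returns parsed JSON, and the fake response used in the demo is made up. Because the exact nesting of the response is not described here, the code searches the whole JSON tree for "end_cursor" instead of assuming a path.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://www.instagram.com/graphql/query/"
TAGGED_QUERY_HASH = "31fe64d9463cbbe58319dced405c6206"  # from this gist


def build_tagged_url(user_id, first=12, after=None):
    """Build the query URL; 'after' is the cursor from the previous page."""
    variables = {"id": str(user_id), "first": first}
    if after is not None:
        variables["after"] = after
    params = {
        "query_hash": TAGGED_QUERY_HASH,
        "variables": json.dumps(variables, separators=(",", ":")),
    }
    return BASE_URL + "?" + urlencode(params)


def find_end_cursor(node):
    """Depth-first search of parsed JSON for the first "end_cursor" value."""
    if isinstance(node, dict):
        if "end_cursor" in node:
            return node["end_cursor"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        cursor = find_end_cursor(child)
        if cursor:
            return cursor
    return None


def iter_pages(user_id, fetch_json, max_pages=5):
    """Yield parsed JSON pages, following end_cursor until none is returned."""
    after = None
    for _ in range(max_pages):
        page = fetch_json(build_tagged_url(user_id, after=after))
        yield page
        after = find_end_cursor(page)
        if not after:
            break  # no cursor means there is nothing further to request


# Offline demo with a fake fetcher; the response layout here is invented.
fake_pages = iter([
    {"data": {"page_info": {"end_cursor": "CURSOR_1=="}, "edges": ["post_a"]}},
    {"data": {"page_info": {"end_cursor": None}, "edges": ["post_b"]}},
])
pages = list(iter_pages("29883180", lambda url: next(fake_pages)))
print(len(pages), "pages fetched")
```

To run this against the real endpoint, fetch_json would be replaced by a function that performs the HTTP request and parses the body as JSON. The max_pages cap is a design choice to keep the loop bounded.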
hello,
do you know if it's possible to query by username instead of user id?
if not, do you know if it's possible to retrieve a user id by username (maybe in a different endpoint)?
or use the https://rapidapi.com/neotank/api/instagram130 /username-by-id method.
to my knowledge it's not possible, better to just grab the user id using the username.
it's best practice to always save user ids when saving usernames