@cosmocatalano
Last active November 4, 2024 17:16
Quick-and-dirty Instagram web scrape, just in case you don't think you should have to make your users log in to deliver them public photos.
<?php
// Returns a big old hunk of JSON (decoded into an array) from a non-private IG account page,
// or NULL if the page could not be fetched or no longer embeds window._sharedData.
function scrape_insta($username) {
    $insta_source = @file_get_contents('http://instagram.com/'.$username);
    if ($insta_source === FALSE) {
        return NULL;
    }
    $shards = explode('window._sharedData = ', $insta_source);
    if (!isset($shards[1])) {
        return NULL;
    }
    $insta_json = explode(';</script>', $shards[1]);
    return json_decode($insta_json[0], TRUE);
}

// Supply a username
$my_account = 'cosmocatalano';
// Do the deed
$results_array = scrape_insta($my_account);
// An example of where to go from there (NULL if any key along the way is missing)
$latest_array = $results_array['entry_data']['ProfilePage'][0]['user']['media']['nodes'][0] ?? NULL;
if ($latest_array === NULL) {
    echo 'No photo data found.<br/>';
} else {
    echo 'Latest Photo:<br/>';
    echo '<a href="http://instagram.com/p/'.$latest_array['code'].'"><img src="'.$latest_array['display_src'].'"></a><br/>';
    echo 'Likes: '.$latest_array['likes']['count'].' - Comments: '.$latest_array['comments']['count'].'<br/>';
}
/* BAH! An Instagram site redesign in June 2015 broke quick retrieval of captions, locations and some other stuff.
echo 'Taken at '.$latest_array['location']['name'].'<br/>';
//Heck, let's compare it to a useful API, just for kicks.
echo '<img src="http://maps.googleapis.com/maps/api/staticmap?markers=color:red%7Clabel:X%7C'.$latest_array['location']['latitude'].','.$latest_array['location']['longitude'].'&zoom=13&size=300x150&sensor=false">';
*/
?>
@rramoscabral

hey really enjoyed this post. i made a quick lil mockup on the break down of scraping user tags without login.
https://gist.github.com/levlet/23cf1c17e6bb6e353f5823b3392c1e01

By any chance does anyone happen to have a way to collect followers without logging in?

Page not found

@ycaty

ycaty commented Feb 2, 2021

hey really enjoyed this post. i made a quick lil mockup on the break down of scraping user tags without login.
https://gist.github.com/levlet/23cf1c17e6bb6e353f5823b3392c1e01
By any chance does anyone happen to have a way to collect followers without logging in?

Page not found

updated link
https://gist.github.com/ycaty/23cf1c17e6bb6e353f5823b3392c1e01#file-instagram-user-tag-scraping-2020

@Yashwanthd1998

Looks like Instagram is blocking scraping via file_get_contents/curl. Has anyone got a solution? I wonder how online web scraping tools manage to work without getting blocked.


ghost commented Aug 6, 2021

Hi 'Cosmocatalano' [ nomen est omen?] :) ,
this is a very interesting solution. I have only tried it on localhost, so I have no problem with CORS. But the array key names seem to have changed completely. The only one that is still the same seems to be 'entry_data'. Is the changed response still usable with the alternative key names? That would be very interesting.

Best regards and thanks
Axel Arnold Bangert
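
One way to cope with renamed keys is to stop hard-coding the full index chain and instead probe the decoded array. The sketch below is only an illustration: `array_path` and `list_key_paths` are hypothetical helpers (not part of the gist), and `$sample` is made-up data shaped like the old layout. It does not tell you what the current key names are; it only shows how to look them up without triggering warnings.

```php
<?php
// Hypothetical helper (not part of the gist): follow a list of keys into a
// nested array and return $default as soon as one of them is missing.
function array_path(array $data, array $path, $default = null) {
    $node = $data;
    foreach ($path as $key) {
        if (!is_array($node) || !array_key_exists($key, $node)) {
            return $default;
        }
        $node = $node[$key];
    }
    return $node;
}

// Hypothetical helper: list the key paths present in the decoded array, down
// to $max_depth levels, to see what the response actually contains.
function list_key_paths(array $data, $max_depth = 3, $prefix = '') {
    $paths = array();
    foreach ($data as $key => $value) {
        $path = ($prefix === '') ? (string) $key : $prefix.'.'.$key;
        $paths[] = $path;
        if (is_array($value) && $max_depth > 1) {
            $paths = array_merge($paths, list_key_paths($value, $max_depth - 1, $path));
        }
    }
    return $paths;
}

// Made-up sample data shaped like the old layout, for illustration only.
$sample = array(
    'entry_data' => array(
        'ProfilePage' => array(
            array('user' => array('username' => 'example')),
        ),
    ),
);
```

With the real result of `scrape_insta()`, something like `print_r(list_key_paths($results_array, 4));` would show which paths exist, and `array_path()` can then read them without assuming every level is present.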

@skmachine

skmachine commented Dec 11, 2021

Looks like Instagram is blocking scraping via file_get_contents/curl. Has anyone got a solution? I wonder how online web scraping tools manage to work without getting blocked.

I guess it just takes the right amount of good proxies. I am now using https://rapidapi.com/neotank/api/simple-instagram-api to avoid dealing with proxies, because they fail all the time (for Instagram) and get a 302 redirect to the login page.
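
The 302-to-login case can at least be detected instead of being silently parsed as a profile page. This is a rough sketch, not a way around the block: `is_login_redirect` and `fetch_profile_page` are hypothetical names, the `https://www.instagram.com/` URL form is an assumption, and so is the idea that the login URL contains `/accounts/login`. It relies on PHP's HTTP stream wrapper, which fills `$http_response_header` after `file_get_contents()`.

```php
<?php
// Hypothetical helper: given raw response header lines (the format PHP puts in
// $http_response_header), report whether the response redirects to a login page.
// ASSUMPTION: the login URL contains "/accounts/login"; adjust if it differs.
function is_login_redirect(array $header_lines) {
    $is_redirect = false;
    foreach ($header_lines as $line) {
        if (strpos($line, 'HTTP/') === 0) {
            // Status line: remember whether this response is a 3xx redirect.
            $is_redirect = (bool) preg_match('#^HTTP/\S+\s+30[12378]\b#', $line);
        } elseif ($is_redirect
                && stripos($line, 'Location:') === 0
                && strpos($line, '/accounts/login') !== false) {
            return true;
        }
    }
    return false;
}

// Hypothetical wrapper: fetch a profile page without following redirects, so
// the 302 itself stays visible. Returns the HTML, or null if blocked/unreachable.
function fetch_profile_page($username) {
    $context = stream_context_create(array('http' => array(
        'follow_location' => 0,    // do not chase the redirect
        'ignore_errors'   => true, // still return a body on non-200 responses
    )));
    // ASSUMPTION: this URL form; the gist itself uses http://instagram.com/<username>.
    $url = 'https://www.instagram.com/'.rawurlencode($username).'/';
    $body = @file_get_contents($url, false, $context);
    // $http_response_header is set in this scope by PHP's HTTP wrapper.
    $headers = isset($http_response_header) ? $http_response_header : array();
    if ($body === false || is_login_redirect($headers)) {
        return null;
    }
    return $body;
}
```

A caller can treat a `null` result as "blocked or down" and back off or switch source, rather than passing a login page on to the JSON extraction step.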
