A Guide to Instagramming with Python for Data Analysis
I am writing this article to show you the basics of using Instagram in a programmatic way. You can benefit from this if you want to use it in a data analysis, computer vision, or any other cool project you can think of.
Save/Load Data to Disk
Because this request might take a long time, we don't want to run it when it's unnecessary, so it's good practice to save the results and load them when we continue working. To do so we will use the pickle module. Pickle can serialize most Python objects, save them to a file, and then load them back. Here is an example of how it works.
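A minimal sketch of the save and load round trip, assuming the posts are held in a list of dictionaries. The sample data and the file path are placeholders chosen for illustration, not values from the original request.

```python
import os
import pickle
import tempfile

# Stand-in data; in practice this would be the list of posts fetched earlier.
myposts = [{"id": 1, "like_count": 10}, {"id": 2, "like_count": 25}]

# Hypothetical file location, chosen only for this example.
path = os.path.join(tempfile.gettempdir(), "myposts.pkl")

# Save: pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(myposts, f)

# Load: read the object back, for example in a later session.
with open(path, "rb") as f:
    loaded_posts = pickle.load(f)
```

Only load pickle files you created yourself, since unpickling untrusted data can execute arbitrary code.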
Sort by Number of Likes
Now we have an ordered list of dictionaries called ‘myposts’. Since we want to sort it by a certain key inside each dictionary, we can use a lambda expression in this way:
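A small sketch of the sort, assuming each post dictionary carries its like total under a key named 'like_count'. That key name and the sample posts are assumptions for illustration; check the JSON you actually received.

```python
# Sample posts; 'like_count' is an assumed key name.
myposts = [
    {"id": "a", "like_count": 12},
    {"id": "b", "like_count": 40},
    {"id": "c", "like_count": 7},
]

# sorted() returns a new list; reverse=True puts the most liked post first.
sorted_posts = sorted(myposts, key=lambda post: post["like_count"], reverse=True)

top_ids = [post["id"] for post in sorted_posts]
```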
Then we can display them just like above:
We may want to apply some filter to our list of posts. For example, if there are videos in the posts but I only want pictures, I can filter this way:
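One way to sketch this filter, assuming each post has a 'media_type' field where 1 marks a photo and 2 marks a video. Both the field name and the numeric codes are assumptions here and should be checked against your own data.

```python
# Assumed media type codes; verify them against the JSON you receive.
PHOTO = 1
VIDEO = 2

myposts = [
    {"id": "a", "media_type": PHOTO},
    {"id": "b", "media_type": VIDEO},
    {"id": "c", "media_type": PHOTO},
]

# Keep only the posts whose media type marks them as photos.
photos_only = [post for post in myposts if post.get("media_type") == PHOTO]

photo_ids = [post["id"] for post in photos_only]
```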
Of course, you can apply filters on any variable in the result, so get creative ;)
You should see something like:
Notifications Only From One User
At this point we can manipulate and play with notifications as we wish. For example, I can get the list of notifications of only a certain user:
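A rough sketch of such a filter, assuming each notification is a dictionary whose text sits under 'args' and begins with the acting user's name. This layout and the sample entries are assumptions for illustration, and the real response may be structured differently.

```python
# Sample notifications in an assumed layout.
notifications = [
    {"args": {"text": "alice liked your photo."}},
    {"args": {"text": "bob started following you."}},
    {"args": {"text": "alice commented: nice shot!"}},
]

def notifications_from(username, items):
    """Return the notifications whose text starts with the given username."""
    return [
        item for item in items
        if item["args"]["text"].split(" ", 1)[0] == username
    ]

from_alice = notifications_from("alice", notifications)
```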
Let's try something more interesting: let's see WHEN you get the most likes, that is, the time of day when people most often press like. To do that, we will plot the time of day against the number of likes you received.
Plot Datetime of notifications:
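Before drawing anything, the likes can be bucketed by hour. The sketch below does only that counting step and prints a rough text histogram; it assumes each notification carries a Unix 'timestamp' next to its 'text', which is an illustrative layout and not a confirmed one. The resulting counts can be passed to any plotting library.

```python
from collections import Counter
from datetime import datetime, timezone

def ts(hour, minute):
    # Helper that builds sample UTC timestamps for one hypothetical day.
    return datetime(2017, 7, 1, hour, minute, tzinfo=timezone.utc).timestamp()

# Sample notifications in an assumed layout.
notifications = [
    {"args": {"text": "alice liked your photo.", "timestamp": ts(18, 5)}},
    {"args": {"text": "bob liked your photo.", "timestamp": ts(18, 40)}},
    {"args": {"text": "carol started following you.", "timestamp": ts(9, 15)}},
    {"args": {"text": "dave liked your photo.", "timestamp": ts(21, 30)}},
]

# Count like notifications per hour of day (UTC here; convert to local time if preferred).
likes_per_hour = Counter(
    datetime.fromtimestamp(n["args"]["timestamp"], tz=timezone.utc).hour
    for n in notifications
    if "liked" in n["args"]["text"]
)

# Rough text histogram, one row per hour that received likes.
for hour in sorted(likes_per_hour):
    print(f"{hour:02d}:00  {'#' * likes_per_hour[hour]}")
```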
As we can see in my case, I get the most likes between 6:00 PM and 10:00 PM. If you are into social media you know this is a peak usage time, and it is when most companies choose to post to get the most engagement.
Getting Followers and Following lists
Here I will get the list of followers and following, and perform some operations on them.
In order to use the follower and following functions (such as getUserFollowers), you will need to get the user_id first. You can get user_id this way:
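The exact lookup call depends on the client library and is not reproduced here. As a sketch under stated assumptions, suppose a user lookup returns JSON with the numeric id stored under 'user' and then 'pk'. That response shape and the sample values are hypothetical.

```python
# Assumed shape of a user lookup response; the real JSON may differ.
sample_response = {
    "user": {"pk": 1234567890, "username": "example_user"},
    "status": "ok",
}

def extract_user_id(response):
    """Return the numeric id, assumed to live under response['user']['pk']."""
    return response["user"]["pk"]

user_id = extract_user_id(sample_response)
```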
Now you can simply use the functions as follows. Note that if the number of followers is large you will need to make more than one request (more on that next). Here we made one request to get the followers / following. The JSON result contains a list of 'users' that holds all the info about each follower / following.
This result may not have the complete set if the number is large.
Getting All Followers
Getting the list of followers is similar to getting all posts. We will make one request and then iterate, using the pagination marker returned with each response, until no more results remain.
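The loop can be sketched roughly as follows. Here fetch_page is a hypothetical callable standing in for the real request, and the 'users' and 'next_max_id' keys are assumptions about the response shape. The fake pages below exist only so the sketch runs without network access.

```python
def fetch_all_followers(fetch_page):
    """Collect users across pages until no continuation marker is returned.

    fetch_page is a hypothetical callable: it takes a marker ('' for the
    first page) and returns a dict with 'users' and, optionally, 'next_max_id'.
    """
    followers = []
    next_max_id = ""
    while True:
        page = fetch_page(next_max_id)
        followers.extend(page.get("users", []))
        next_max_id = page.get("next_max_id")
        if not next_max_id:  # missing, None, or empty means this was the last page
            break
    return followers

# Fake paged responses used in place of real requests.
_PAGES = {
    "": {"users": [{"username": "alice"}, {"username": "bob"}], "next_max_id": "p2"},
    "p2": {"users": [{"username": "carol"}], "next_max_id": None},
}

def fake_fetch_page(max_id):
    return _PAGES[max_id]

all_followers = fetch_all_followers(fake_fetch_page)
```

In real use it is wise to pause between requests so the loop does not hit rate limits.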
*Thanks to Francesc Garcia for the support*
You should do the same for following, but I won't here because one request was enough to get my entire following list.
Now that we have a list of all data of following and followers in JSON format, I will convert them into a more friendly data type -- a set -- in order to perform some set operations on them.
I will only take the 'username' and make a set() out of it.
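A short sketch of that conversion, using made-up user records in place of the real JSON lists:

```python
# Made-up records standing in for the 'users' lists from the responses.
followers = [
    {"username": "alice", "full_name": "Alice A."},
    {"username": "bob", "full_name": ""},
    {"username": "carol", "full_name": "Carol C."},
]
following = [
    {"username": "bob", "full_name": ""},
    {"username": "dave", "full_name": "Dave D."},
]

# A set comprehension keeps one entry per unique username.
followers_set = {user["username"] for user in followers}
following_set = {user["username"] for user in following}
```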
Here I chose to make a set of usernames of each user. 'full_name' would work too -- and is more user friendly -- but it is not unique, and some users may not have a value for full_name.
Now that we have two sets we can do the following:
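For instance, with two small made-up sets, the usual set operators answer the common questions. The variable names are illustrative.

```python
followers_set = {"alice", "bob", "carol"}
following_set = {"bob", "carol", "dave"}

# People who follow me and whom I follow back.
mutual = followers_set & following_set

# People I follow who do not follow me back.
not_following_back = following_set - followers_set

# People who follow me but whom I do not follow.
fans = followers_set - following_set

print(len(followers_set), "followers,", len(following_set), "following")
```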
Here we have some statistics about followers. You can do many things from this point, such as saving the followers list and then comparing it at a later time to get the list of unfollowers.
Those are some of the things you can do with Instagram data. I hope you learned how you can use the Instagram API and got a basic idea of what you can do with it. Keep an eye on the original as it is still under development, and there will be more things you can do in the future. For any questions or suggestions do not hesitate to contact me.
Bio: Nour Galaby is a Data Science Enthusiast who is passionate about Data Science and Machine Learning.