
Silver Blog, Aug 2017

A Guide to Instagramming with Python for Data Analysis


 

I am writing this article to show you the basics of using Instagram programmatically. This is useful if you want to use Instagram data in data analysis, computer vision, or any other cool project you can think of.



Save/Load Data to Disk

Because this request might take a long time, we don't want to run it more often than necessary, so it's good practice to save the results and load them when we continue working. To do so we will use pickle, a standard-library module that can serialize most Python objects, save them to a file, and load them back later. Here is an example of how it works.

Save:

import pickle
filename = username + "_posts"
# Write in binary mode; the with block closes the file afterwards
with open(filename, "wb") as f:
    pickle.dump(myposts, f)


Load:

import pickle
filename = "nourgalaby_posts"
# Pickle files must be read in binary mode
with open(filename, "rb") as f:
    myposts = pickle.load(f)
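
Putting the two halves together, a small caching helper can skip the slow request whenever a saved file already exists. This is only a sketch: `load_or_fetch` and `fetch_posts` are hypothetical names, not part of the Instagram library, and the stand-in fetch function just returns made-up data.

```python
import os
import pickle
import tempfile


def load_or_fetch(filename, fetch):
    """Return cached data from `filename` if it exists; otherwise call `fetch()` and cache the result."""
    if os.path.exists(filename):
        with open(filename, "rb") as f:
            return pickle.load(f)
    data = fetch()
    with open(filename, "wb") as f:
        pickle.dump(data, f)
    return data


def fetch_posts():
    # Hypothetical stand-in for the slow request that collects all posts.
    return [{"id": 1, "like_count": 10}, {"id": 2, "like_count": 3}]


cache_file = os.path.join(tempfile.mkdtemp(), "example_posts")
first = load_or_fetch(cache_file, fetch_posts)   # runs the fetch and writes the file
second = load_or_fetch(cache_file, fetch_posts)  # reads the file, no fetch
```

Only load pickle files that you created yourself, since unpickling can execute arbitrary code.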


Sort by Number of Likes

Now we have a list of dictionaries called ‘myposts’. Since we want to sort it by a certain key inside each dictionary, we can pass a lambda expression as the sort key:

myposts_sorted = sorted(myposts, key=lambda k: k['like_count'], reverse=True)
top_posts = myposts_sorted[:10]
bottom_posts = myposts_sorted[-10:]


Then we can display them just like above:

image_urls=get_images_from_list(top_posts)
display_images_from_url(image_urls)
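
If only the extremes are needed, `heapq` from the standard library can pick them without sorting the whole list. The sketch below uses made-up sample posts and assumes every post dictionary has a 'like_count' key, as in the code above.

```python
import heapq
from operator import itemgetter

# Made-up sample data shaped like the post dictionaries above.
sample_posts = [
    {"id": "a", "like_count": 12},
    {"id": "b", "like_count": 40},
    {"id": "c", "like_count": 7},
    {"id": "d", "like_count": 25},
]

by_likes = itemgetter("like_count")
top_two = heapq.nlargest(2, sample_posts, key=by_likes)      # most liked first
bottom_two = heapq.nsmallest(2, sample_posts, key=by_likes)  # least liked first

top_ids = [p["id"] for p in top_two]
bottom_ids = [p["id"] for p in bottom_two]
```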


Filtering Photos

We may want to apply some filter to our list of posts. For example, if there are videos in the posts but I only want pictures, I can filter this way:

# media_type 1 is a photo, 2 is a video
myposts_photos = [k for k in myposts if k['media_type'] == 1]
myposts_vids = [k for k in myposts if k['media_type'] == 2]
print(len(myposts))
print(len(myposts_photos))
print(len(myposts_vids))


Of course, you can apply filters on any variable in the result, so get creative ;)
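
As one illustration of combining conditions, the sketch below counts posts per media type and keeps only photos above a like threshold. The sample posts and the threshold of 10 likes are made up for the example.

```python
from collections import Counter

# Made-up sample data; media_type 1 is a photo and 2 is a video, as above.
sample_posts = [
    {"media_type": 1, "like_count": 30},
    {"media_type": 2, "like_count": 5},
    {"media_type": 1, "like_count": 12},
    {"media_type": 1, "like_count": 2},
]

# How many posts of each media type are there?
counts = Counter(post["media_type"] for post in sample_posts)

# Photos with at least 10 likes (the threshold is arbitrary).
popular_photos = [
    p for p in sample_posts
    if p["media_type"] == 1 and p["like_count"] >= 10
]
```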

Notifications

InstagramAPI.getRecentActivity()
get_recent_activity_response = InstagramAPI.LastJson
for notification in get_recent_activity_response['old_stories']:
    print(notification['args']['text'])


You should see something like:

userohamed3 liked your post.
userhacker32 liked your post.
user22 liked your post.
userz77 liked your post.
userwww77 started following you.
user2222 liked your post.
user23553 liked your post.


Notifications Only From One User

At this point we can manipulate and play with notifications as we wish. For example, I can get the list of notifications of only a certain user:

username = "diana"
for notification in get_recent_activity_response['old_stories']:
    text = notification['args']['text']
    if username in text:
        print(text)
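
Note that a plain substring test also matches longer usernames that contain the one you want ("diana" would also match "diana_smith"). Assuming notification texts begin with the username followed by a space, as in the sample output above, a stricter check could look like the sketch below. `notifications_from` is a hypothetical helper and the sample stories are made up.

```python
def notifications_from(stories, username):
    """Return notification texts whose first word is exactly `username`."""
    matches = []
    for story in stories:
        text = story["args"]["text"]
        # Assumes the text starts with the acting user's name,
        # e.g. "user22 liked your post."
        first_word = text.split(" ", 1)[0]
        if first_word == username:
            matches.append(text)
    return matches


sample_stories = [
    {"args": {"text": "diana liked your post."}},
    {"args": {"text": "diana_smith started following you."}},
    {"args": {"text": "user22 liked your post."}},
]

from_diana = notifications_from(sample_stories, "diana")
```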


Let's try something more interesting: let's find out at what time of day you get the most likes, that is, the hours during which people press like most often. To do that, we will plot the time of day vs. the number of likes you received.

Plot Datetime of notifications:

import pandas as pd
df = pd.DataFrame({"date":dates})
df.groupby(df["date"].dt.hour).count().plot(kind="bar",title="Hour" )
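
The `dates` list is not defined in the snippet above. One way to build it, assuming each notification's 'args' dictionary carries a Unix 'timestamp' field (an assumption; check the JSON you actually receive), is sketched here with made-up data. UTC is used only to keep the example reproducible, and a `Counter` stands in for the pandas group-by.

```python
from collections import Counter
from datetime import datetime, timezone

# Made-up notifications; the 'timestamp' key is an assumed field name.
sample_stories = [
    {"args": {"text": "user22 liked your post.", "timestamp": 1502215200}},
    {"args": {"text": "userz77 liked your post.", "timestamp": 1502217000}},
    {"args": {"text": "userwww77 started following you.", "timestamp": 1502222400}},
    {"args": {"text": "user2222 liked your post.", "timestamp": 1502222400}},
]

# Keep only "liked" notifications and convert each timestamp to a datetime.
dates = [
    datetime.fromtimestamp(story["args"]["timestamp"], tz=timezone.utc)
    for story in sample_stories
    if "liked" in story["args"]["text"]
]

# Number of likes per hour of the day.
likes_per_hour = Counter(d.hour for d in dates)
```

A list built this way can be passed to `pd.DataFrame({"date": dates})` as in the plotting code.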


[Figure: bar chart of notification counts by hour of day]

As we can see, in my case I get the most likes between 6:00 PM and 10:00 PM. If you are into social media you know this is a peak usage time, and it is when most companies choose to post to get the most engagement.

Getting Followers and Following lists

Here I will get the list of followers and following, and perform some operations on them.

In order to use the two functions getUserFollowings and getUserFollowers, you will need to get the user_id first. You can get user_id this way:

Now you can simply use the functions as follows. Note that if the number of followers is large, you will need to make more than one request (more on that next). Here we make one request each to get the followers and the following. The JSON result contains a list of 'users' with all the info about each follower or following.

InstagramAPI.getUserFollowings(user_id)
print(len(InstagramAPI.LastJson['users']))
following_list = InstagramAPI.LastJson['users']

InstagramAPI.getUserFollowers(user_id)
print(len(InstagramAPI.LastJson['users']))
followers_list = InstagramAPI.LastJson['users']


This result may not have the complete set if the number is large.

Getting All Followers

Getting the list of followers is similar to getting all posts. We will make one request and then iterate using the next_max_id key:

*Thanks to Francesc Garcia for the support*

import time

followers = []
next_max_id = ''  # an empty string requests the first page
while True:
    InstagramAPI.getUserFollowers(user_id, maxid=next_max_id)
    followers.extend(InstagramAPI.LastJson.get('users', []))
    next_max_id = InstagramAPI.LastJson.get('next_max_id', '')
    if not next_max_id:
        break  # no more pages
    time.sleep(1)  # pause between requests

followers_list = followers
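
The same paging pattern can be tried without touching the network. In the sketch below, `fetch_all` is a hypothetical helper and `fake_pages` is made-up data that imitates responses carrying a 'users' list and an optional 'next_max_id' key.

```python
def fetch_all(fetch_page):
    """Collect 'users' from every page, following 'next_max_id' until it is missing."""
    users = []
    max_id = ""  # empty string stands for the first page
    while True:
        page = fetch_page(max_id)
        users.extend(page.get("users", []))
        max_id = page.get("next_max_id")
        if not max_id:
            break
    return users


# Made-up paginated responses, keyed by the max_id that requests them.
fake_pages = {
    "": {"users": [{"username": "a"}, {"username": "b"}], "next_max_id": "p2"},
    "p2": {"users": [{"username": "c"}], "next_max_id": "p3"},
    "p3": {"users": [{"username": "d"}]},  # last page: no next_max_id
}

all_users = fetch_all(lambda max_id: fake_pages[max_id])
usernames = [u["username"] for u in all_users]
```

With the real client, `fetch_page` might wrap the `getUserFollowers` call, pause briefly, and return `InstagramAPI.LastJson`.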


You should do the same for the following list, but I won't here because one request was enough to get everyone I follow.

Now that we have a list of all data of following and followers in JSON format, I will convert them into a more friendly data type -- a set -- in order to perform some set operations on them.

I will only take the 'username' and make a set() out of it.

following_set = set(user['username'] for user in following_list)
print(len(following_set))

followers_set = set(user['username'] for user in followers_list)
print(len(followers_set))


Here I chose to build each set from usernames. 'full_name' would work too -- and is more user friendly -- but it is not unique, and some users may not have a value for full_name.

Now that we have two sets we can do the following:

[Figure: statistics computed from the followers and following sets]

Here we have some statistics about followers. You can do many things from this point, such as saving the followers list and then comparing it at a later time to get the list of unfollowers.
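
A minimal sketch of such set operations, including the unfollower comparison, might look like this. All usernames are made up, and `old_followers_set` stands for a snapshot you saved earlier (for example with pickle).

```python
# Made-up usernames standing in for the real sets.
following_set = {"alice", "bob", "carol"}
followers_set = {"bob", "carol", "dave"}

mutual = following_set & followers_set              # you follow each other
not_following_back = following_set - followers_set  # you follow them, they don't follow you
fans = followers_set - following_set                # they follow you, you don't follow them

# Comparing against an earlier snapshot gives the unfollowers.
old_followers_set = {"bob", "carol", "dave", "erin"}
unfollowers = old_followers_set - followers_set
new_followers = followers_set - old_followers_set
```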

Those are some of the things you can do with Instagram data. I hope you learned how to use the Instagram API and got a basic idea of what you can do with it. Keep an eye on the original as it is still under development, and there will be more things you can do in the future. For any questions or suggestions, do not hesitate to contact me.

 
Bio: Nour Galaby is a Data Science Enthusiast who is passionate about Data Science and Machine Learning.
