The Star Wars social networks – who is the central character?
A data scientist looks at the 6 Star Wars movies to extract the social networks, within each film and across the whole Star Wars universe. The network structure reveals some surprising differences between the movies and shows who is actually the central character.
By Evelina Gabasova, U. of Cambridge.
Some of us are looking forward to Christmas, and some of us are looking forward to the new film in the Star Wars franchise, The Force Awakens. Meanwhile, I decided to look at the whole 6-movie cycle from a quantitative point of view and extract the Star Wars social networks, both within each film and across the whole Star Wars universe. Looking at the social network structure reveals some surprising differences between the original trilogy and the prequels.
If you’re interested in technical details of how I extracted the data, head down to the How I did the analysis section. But let’s start with some visualizations.
This is the social network from all 6 movies combined:
You can open the network in a full window which will show an interactive visualization of the network where you can drag individual nodes around. If you hover over the individual nodes, you’ll see the name of the corresponding character.
Here the nodes represent characters in the movies. The characters are connected by a link if they both speak in the same scene. The more the characters speak together, the thicker the link between them. The size of each node corresponds to the total number of scenes the character appears in. I made a few debatable decisions, though: Anakin and Darth Vader are represented by two separate nodes, because this distinction is important to the story. On the other hand, the Emperor node jointly represents Palpatine and Darth Sidious. I also merged Amidala with Padme.
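To make the construction above concrete, here is a minimal Python sketch of how such a network could be assembled. It is an illustration under stated assumptions, not the code used for the analysis: it assumes each scene has already been reduced to a list of speaking characters, it counts shared scenes as the link weight, it sizes a node by the number of scenes in which the character speaks, and the names `ALIASES`, `canonical` and `build_network` as well as the toy scenes are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Alias merges described in the text. The exact spellings used in the
# scripts are an assumption made for this sketch.
ALIASES = {
    "PADME": "AMIDALA",
    "PALPATINE": "EMPEROR",
    "DARTH SIDIOUS": "EMPEROR",
}


def canonical(name):
    """Normalise a speaker name and apply the alias merges."""
    name = name.strip().upper()
    return ALIASES.get(name, name)


def build_network(scenes):
    """Return (node_sizes, link_weights) for a list of scenes.

    Each scene is a list of the characters who speak in it.
    node_sizes counts the scenes in which a character speaks;
    link_weights counts the scenes shared by each pair of characters.
    """
    node_sizes = Counter()
    link_weights = Counter()
    for speakers in scenes:
        # Deduplicate so a character counts once per scene, and sort so
        # each pair is always stored in the same order.
        present = sorted({canonical(s) for s in speakers})
        node_sizes.update(present)
        link_weights.update(combinations(present, 2))
    return node_sizes, link_weights


# Hypothetical toy input, not taken from the real scripts.
toy_scenes = [
    ["Padme", "Anakin"],
    ["Amidala", "Anakin", "Palpatine"],
    ["Darth Sidious", "Darth Vader"],
]
nodes, links = build_network(toy_scenes)
```

Note that Anakin and Darth Vader are deliberately absent from the alias table, so they remain two separate nodes, while Palpatine and Darth Sidious both collapse into the Emperor node.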
The original trilogy (episodes IV, V and VI) on the right is mostly separated in the network from the prequel trilogy on the left because most characters appear in only one of the trilogies. The crucial nodes connecting the two networks are Obi-Wan Kenobi, R2-D2 and C-3PO. The robots in particular seem to play an important social function because they appear frequently across all the movies.
The structures of the two sub-networks are also different. The original trilogy has fewer important nodes (Luke, Han, Leia, Chewbacca, Darth Vader) and they are densely interconnected. The prequel trilogy has more nodes overall, with many more connections. I’ll look at individual films in more detail later in the post.
Character timelines
Many of the characters feature in multiple movies, so I also created a comparison of their timelines across the individual episodes. The following graphic shows where the individual characters are mentioned in the film scripts. In order of appearance, these are the timelines of some of the main characters:
Here I included all mentions of each character, including cases where other characters mention them by name. It is interesting to see how Anakin appears simultaneously with Darth Vader during Episode III, and then Darth Vader takes over. Anakin reappears towards the end of Episode VI, when Darth Vader turns away from the dark side.
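A timeline of this kind can be approximated by recording where in a script each name occurs. The Python sketch below is only an illustration under stated assumptions: it treats a script as a list of text lines, matches names case-insensitively as whole words, and reports each mention as a relative position between 0 and 1. The function name `mention_positions` and the toy script are hypothetical and do not come from the original analysis.

```python
import re


def mention_positions(script_lines, name):
    """Return the relative positions (0 to 1) of lines that mention name.

    Any occurrence counts, whether the character is speaking or is being
    talked about by someone else.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
    total = len(script_lines)
    return [
        index / total
        for index, line in enumerate(script_lines)
        if pattern.search(line)
    ]


# Hypothetical toy script, not a real excerpt.
toy_script = [
    "ANAKIN",
    "I will not fail you, master.",
    "PALPATINE",
    "Henceforth you shall be known as Darth Vader.",
]
anakin_timeline = mention_positions(toy_script, "Anakin")
vader_timeline = mention_positions(toy_script, "Darth Vader")
```

Running the same function over each episode's script and placing the results side by side would give one row of marks per character, which is the general shape of the timelines shown here.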
The characters that appear most consistently across all the films are the same ones that are in the centre of the social network – Obi-Wan, C-3PO and R2-D2. Yoda and the Emperor also appear across all of the films but they don’t talk directly with many people in the original trilogy, which moves them off the centre in the social network.
Networks in individual films
Now let’s look at the networks in individual films. Notice how the number of nodes and complexity of the networks change between the prequels and the original movies. Again, a link appears between characters if they speak within the same scene.