Marvel Cinematic Universe Network Analysis

If you are a villain in the Marvel Cinematic Universe (MCU), good luck surviving long enough to become a staple of the movie franchise. A select few actors have been fortunate enough to appear in 4 or more of the MCU movies, with Robert Downey Jr. (Tony Stark/Iron Man), Samuel L. Jackson (Nick Fury), and Chris Evans (Captain America) being the most prominently featured stars in the series of superhero films.
Notably missing from the network figure are Jeremy Renner’s Hawkeye, Mark Ruffalo’s Hulk, and fan favorite Loki, a villain portrayed by Tom Hiddleston. However, these three appear if the filter is lowered to allow actors with 3 or more connections.
There are also perks to being the creator of this lucrative movie series; Stan Lee, comic book creator and producer, is the only individual to have made an appearance in all 12 of the films currently released.


  • If you are a super-villain looking to do the most damage to network connections, it is not Agent Phil Coulson that you should go after (as hypothesized by Loki in “The Avengers”), but rather Tony Stark, Nick Fury, or Captain America. Those are the Avengers characters with the most ties in the Marvel Cinematic Universe (Figure 1).
    J.A.R.V.I.S., Tony Stark’s A.I. system and butler (voiced by Paul Bettany), is also a common component of the Marvel movies, appearing in 5 of the 12 films. While a super-villain might think the computer system would be an easier target for destroying network connections than one of the big-name heroes, I wouldn’t recommend this technique, as the plan didn’t go well in “Avengers: Age of Ultron.” J.A.R.V.I.S. was able to distribute his consciousness across the internet and avoid destruction, making an attack on the physical unit futile.
  • If you are an actor and have the option of picking between a superhero and a super-villain role in a Marvel film, I recommend the hero role if you want to appear in multiple movies in the franchise, or the villain role if you aren’t looking for a large commitment to the series.
  • If you are a casting director, I recommend hiring more women in recurring roles. Only two women have frequent (>3) appearances in the films: Black Widow and Pepper Potts.

The Marvel Cinematic Universe is the fictional universe in which several connected superhero films take place. Since 2008, there have been 12 films released, with 11 more planned over the next 3 years. Many of the lead actors have signed contracts to appear in multiple films, creating an opportunity to perform a network analysis of the shared actors between films. For this analysis, I created a dataset with 1,441 nodes and 1,602 edges from a list of all the actors from the currently released MCU films (Table 1).
Table 1. List of Currently Released MCU Movies
The figure below (Figure 2) uses eigenvector centrality to show the importance of each node within the network. A node scores higher when it is connected to other well-connected nodes, and the higher the centrality, the greater the influence of the node. Stan Lee is the node with the most importance and influence in the network. His node has the most connections to the different movies and actors analyzed and is represented by the large red dot in the center of the graph.

Figure 2. Network analysis graph of actors in Marvel Cinematic Universe movies. Colors are based on centrality. Node sizes are based on degree.
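
To make the centrality idea concrete, here is a minimal sketch of eigenvector centrality computed by power iteration in plain Python. It is not how Gephi produced the values behind Figure 2; the toy edge list, the function name eigenvector_centrality, and the iteration settings are all assumptions made for illustration.

```python
import math

# Hypothetical toy edge list (movie, actor); the real dataset has 1,602 edges.
edges = [
    ("Iron Man", "Robert Downey Jr."),
    ("Iron Man", "Stan Lee"),
    ("The Avengers", "Robert Downey Jr."),
    ("The Avengers", "Chris Evans"),
    ("The Avengers", "Stan Lee"),
    ("Thor", "Stan Lee"),
]


def eigenvector_centrality(edges, iterations=200, tolerance=1e-9):
    """Approximate eigenvector centrality on an undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' scores.
        # Keeping the node's own score avoids the oscillation that plain
        # power iteration shows on movie/actor (bipartite) graphs.
        updated = {
            node: score[node] + sum(score[n] for n in neighbors[node])
            for node in neighbors
        }
        # Rescale so the scores stay bounded (unit Euclidean length).
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        change = max(abs(updated[node] - score[node]) for node in neighbors)
        score = updated
        if change < tolerance:
            break
    return score


scores = eigenvector_centrality(edges)
for node, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {value:.3f}")
```

In this toy graph Stan Lee outranks the other actors because he is connected to every movie, including the best-connected one, which mirrors the pattern described for Figure 2.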

The tightly clustered yellow nodes represent actors who were only in one movie and do not have any relationships with other films.
One element that also ties many of the Marvel movies together is the post-credit scene. These scenes normally give hints about the next Marvel movie to be released. Therefore, they often contain actors from future movies, helping to create more ties between the movies.
Another observation from Figure 2 is that there are small clusters of relatively minor characters and extras who are commonly shared between movies. These clusters would be expected between sequels or other movies that share many secondary characters, but in many cases they are unexpected and the links are merely actors who played a different role in each movie. For example, actor Fred Galle played a taxi driver in Ant-Man and a White House press reporter in Iron Man 3. These links may be indicative of a relationship the actor has with one of the casting directors or other production team members for the movie. Or perhaps they have a good agent who has been able to get them multiple roles in the different movies.
An additional figure describing the connectedness of the nodes can be seen below (Figure 3). This figure labels each movie with its degree (the number of edges attached to its node), with higher numbers denoting greater connectivity. Degree is also shown by the size and color of the nodes, with larger and darker nodes denoting higher degrees. Iron Man 3 had the highest degree overall at 247, perhaps unsurprisingly as it is the third film in the only trilogy released so far. Captain America: The Winter Soldier also had a high degree, with many tie-ins to other movies. It was interesting that there were more ties between The Winter Soldier and The Avengers than between The Winter Soldier and its predecessor, Captain America: The First Avenger. Thor and Thor: The Dark World had the lowest degrees of the movies analyzed, showing that those movies have the fewest connections to the MCU at large.

Figure 3. MCU by Degree. Darker colors and larger diameters denote higher degrees.
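
As a rough illustration of what the degree scores mean, the sketch below counts the edges touching each node and applies a minimum-degree filter like the one used to thin out the network figure. The toy edge list and the helper names degree_counts and filter_by_degree are assumptions for illustration; the actual values in Figure 3 came from Gephi.

```python
from collections import Counter

# Hypothetical toy edge list (movie, actor), not the real dataset.
edges = [
    ("Iron Man", "Robert Downey Jr."),
    ("Iron Man", "Stan Lee"),
    ("The Avengers", "Robert Downey Jr."),
    ("The Avengers", "Chris Evans"),
    ("The Avengers", "Stan Lee"),
    ("Thor", "Stan Lee"),
]


def degree_counts(edges):
    """Count how many edges touch each node (movies and actors alike)."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def filter_by_degree(degree, minimum):
    """Keep only the nodes with at least `minimum` connections."""
    return {node: count for node, count in degree.items() if count >= minimum}


degree = degree_counts(edges)
print(degree.most_common())
print(filter_by_degree(degree, 3))
```

A movie's degree here is simply the size of its credited cast in the edge list, and an actor's degree is the number of movies they appear in, so raising the minimum quickly hides one-off actors.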

An analysis of the modularity of the graph can be seen in Figure 4. Modularity measures how well a network divides into communities, meaning groups of nodes that are more densely connected to each other than to the rest of the network. There were 8 different communities defined within the network (each color represents a community), and the community composition wasn’t always what would be expected. The movies that are direct sequels of one another (e.g., Iron Man, Iron Man 2, and Iron Man 3) do not necessarily cluster together.
The Thor movies did cluster together, and two of the three Iron Man movies clustered together, but the Captain America movies did not, and neither did the Avengers movies. Instead, Captain America: The Winter Soldier clustered with The Avengers. The sequel to The Avengers, Avengers: Age of Ultron, clustered more closely with Guardians of the Galaxy.

Figure 4. Modularity of the Network. Communities are represented by the color of the nodes.

Additional analyses using other Marvel Universe productions could show additional connections between the movies and other features. Marvel has a series of short features, television shows, and Netflix television productions that also tie in with the larger universe that I did not include in my analysis.
Building the Dataset
To create the dataset, a list of all of the actors from the currently released Marvel Cinematic Universe movies was pulled from IMDb (Internet Movie Database). The dataset used the movie title as the source and the actor as the target. Character name was not used as a target because many actors are credited with the same character name (e.g., “Dancer”), and it was decided that connections between the actors would better describe the connections between the movies. The dataset was created in Excel and saved as a comma-separated values (CSV) file for importing into Gephi.
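
The edge list described above could also be assembled with a short script instead of Excel. The following is a minimal sketch under stated assumptions: the casts dictionary is a hypothetical, heavily abbreviated stand-in for the IMDb cast lists, the helper names are made up, and the Source/Target header follows the column names Gephi's spreadsheet importer expects for an edge table.

```python
import csv
import io

# Hypothetical, abbreviated cast lists; the real data was pulled from IMDb.
casts = {
    "Iron Man": ["Robert Downey Jr.", "Stan Lee"],
    "The Avengers": ["Robert Downey Jr.", "Chris Evans", "Stan Lee"],
}


def build_edge_rows(casts):
    """Return one (movie, actor) row per pairing, skipping duplicates."""
    rows = []
    seen = set()
    for movie, actors in casts.items():
        for actor in actors:
            if (movie, actor) not in seen:
                seen.add((movie, actor))
                rows.append((movie, actor))
    return rows


def write_edge_csv(rows, handle):
    """Write the rows with a Source,Target header to any file-like object."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target"])
    writer.writerows(rows)


# Write to an in-memory buffer here; pass an open file to save to disk.
buffer = io.StringIO()
write_edge_csv(build_edge_rows(casts), buffer)
print(buffer.getvalue())
```

Using the actor's name as the target, as in the article's dataset, means an actor who plays different roles in two movies still links those movies.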
Descriptive Statistics
Analyses were performed in Gephi, using the statistical tools window. Layout and appearance were controlled through the appearance window. The layouts of the nodes and edges were created using all 12 available layouts and evaluated for their particular aesthetics. ForceAtlas (Figures 2 and 3) and Fruchterman Reingold (Figures 1 and 4) were the most commonly used algorithms for generating the layout. Eigenvector centrality and degree were calculated within Gephi, and the results were applied to generate colors and sizes for the nodes and edges as described for each graph.
Clustering was performed using the Modularity feature in Gephi. A resolution of 2 was set for calculating the modularity, as leaving the resolution at the default setting of 1 produced a separate community for each movie cluster and did not help to explain the relationships between clusters. A modularity value of 0.782 was calculated, which shows that there were communities present within the network, as was expected.
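
For readers curious what a modularity score measures, here is a minimal sketch that scores a given community assignment using the standard definition: for each community, the fraction of edges that fall inside it minus the fraction expected if edges were placed at random with the same node degrees. It does not detect communities and does not model Gephi's resolution setting; the toy edges, the community labels, and the function name modularity are assumptions for illustration.

```python
from collections import Counter

# Hypothetical toy edge list (movie, actor) with one shared actor.
edges = [
    ("Thor", "Chris Hemsworth"),
    ("Thor", "Natalie Portman"),
    ("Thor", "Stan Lee"),
    ("Iron Man", "Robert Downey Jr."),
    ("Iron Man", "Gwyneth Paltrow"),
    ("Iron Man", "Stan Lee"),
]

# Hypothetical community labels: one community per movie, with the
# shared actor arbitrarily placed in the Thor community.
community_of = {
    "Thor": 0, "Chris Hemsworth": 0, "Natalie Portman": 0, "Stan Lee": 0,
    "Iron Man": 1, "Robert Downey Jr.": 1, "Gwyneth Paltrow": 1,
}


def modularity(edges, community_of):
    """Score a partition of an undirected, unweighted graph."""
    edge_total = len(edges)
    internal_edges = Counter()  # edges with both ends in the same community
    degree_sum = Counter()      # sum of node degrees per community
    for a, b in edges:
        community_a, community_b = community_of[a], community_of[b]
        degree_sum[community_a] += 1
        degree_sum[community_b] += 1
        if community_a == community_b:
            internal_edges[community_a] += 1
    return sum(
        internal_edges[c] / edge_total - (degree_sum[c] / (2 * edge_total)) ** 2
        for c in degree_sum
    )


print(modularity(edges, community_of))
# Putting every node in one community scores 0 by definition.
print(modularity(edges, {node: 0 for node in community_of}))
```

Scores well above 0 indicate that the grouping captures more within-group edges than chance would predict, which is the sense in which a value like 0.782 points to real community structure.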
Social network analysis could be a powerful tool for the superhero and super-villain communities. It can provide vital information about superhero and super-villain networks, including which individuals should be targeted to have the largest impact on the social network and which individuals are the most popular in the universe. These analyses could be strengthened by adding the additional Marvel Cinematic Universe productions.
Directors can use the analyses to see where the MCU is most loosely connected and provide supplementary material to strengthen the lore about those characters and tie them in more successfully. Additional analyses could look to see whether the ties between the movies created by extras can be attributed to the director of the movie. As the MCU grows, it is likely that the network connections between the publications will grow as well.
Columnist: Cory Everington