Network Analysis

DA 101, Dr. Ladd

Week 12

What are Networks?

Networks are made up of…

  • Entities (entity = node/vertex/actor)
  • Relationships (relationship = edge/link/tie)
  • We’ll use “nodes” and “edges”

Nodes and Edges have Attributes

Node Attributes

  • numerical (size)
  • categorical (color)

Edge Attributes

Directed and Undirected Edges

Weighted and Unweighted Edges

Edge Types

Multiple Edges “in a row” Make a Path

Path & Diameter

(& Average Shortest Path Length)
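
As a rough illustration of paths, diameter, and average shortest path length, here is a minimal sketch on a made-up four-node network. The node names and edges are invented for this example and are not course data.

```r
library(igraph)

# Hypothetical toy network: A-B, B-C, C-D, plus a shortcut A-C
toy <- graph_from_literal(A - B, B - C, C - D, A - C)

# Shortest path length (counted in edges) between every pair of nodes
d <- distances(toy)

# Diameter: the longest of all the shortest paths
diam <- diameter(toy)

# Average shortest path length over all connected pairs of nodes
avg_len <- mean_distance(toy)

print(d)
print(diam)
print(avg_len)
```

Here the farthest pairs (A to D, B to D) are two steps apart, so the diameter is 2 even though the network has four nodes.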

Some special kinds of nodes

Isolates

Hubs

Bridges

Measuring a node’s “importance” with centrality

Degree

Strength

Betweenness
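
The three centrality measures can be compared side by side on a small example. This is a sketch using an invented weighted edge list (the names and weights are assumptions for illustration only).

```r
library(igraph)

# Hypothetical weighted edge list
edge_df <- data.frame(
  from   = c("A", "A", "A", "B"),
  to     = c("B", "C", "D", "C"),
  weight = c(1, 2, 3, 4)
)
toy <- graph_from_data_frame(edge_df, directed = FALSE)

# Degree: how many edges touch each node
deg <- degree(toy)

# Strength: the sum of the weights of the edges touching each node
stren <- strength(toy)

# Betweenness: how many shortest paths pass through each node
# (weights = NA tells igraph to ignore edge weights here)
btw <- betweenness(toy, weights = NA)

print(data.frame(degree = deg, strength = stren, betweenness = btw))
```

Note that C has a lower degree than A but the same strength, because its two edges carry heavier weights.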

Different kinds of entities or nodes

Unipartite/unimodal

Bipartite/bimodal

Bipartite (cont.)

Multipartite/k-partite/multimodal
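
One way to see what a bipartite network looks like in practice is to build one from two kinds of nodes and then "project" it onto a single kind. The sketch below assumes a made-up list of people and clubs; igraph marks the two node types with a logical vertex attribute called type.

```r
library(igraph)

# Hypothetical membership data: each row links a person to a club
membership <- data.frame(
  person = c("Ana", "Ana", "Ben", "Cal"),
  club   = c("Chess", "Choir", "Chess", "Choir")
)
bip <- graph_from_data_frame(membership, directed = FALSE)

# Mark the two kinds of nodes: FALSE = person, TRUE = club
V(bip)$type <- V(bip)$name %in% membership$club

# Project onto each node type; proj1 holds the type == FALSE nodes (people)
proj <- bipartite_projection(bip)
people <- proj$proj1

# Two people are linked in the projection if they share a club
print(igraph::as_data_frame(people, what = "edges"))
```

The projection turns a two-mode (person-club) network into a one-mode (person-person) network.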

Groups of nodes within a network

Connected components

Cliques and clustering

Clustering Coefficient

Image from Wikipedia
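
Components, cliques, and clustering coefficients can all be computed with igraph. The following is a minimal sketch on an invented six-node network with one triangle and one separate pair.

```r
library(igraph)

# Hypothetical network: triangle A-B-C, a tail C-D, and a separate pair E-F
toy <- graph_from_literal(A - B, B - C, A - C, C - D, E - F)

# Connected components: groups of nodes that can all reach each other
comp <- components(toy)

# Size of the largest clique (a group where everyone is connected to everyone)
largest_clique <- clique_num(toy)

# Global clustering coefficient: share of connected triples that close into triangles
global_cc <- transitivity(toy, type = "global")

# Local clustering coefficient for each node (NaN for nodes with fewer than 2 neighbors)
local_cc <- transitivity(toy, type = "local")
names(local_cc) <- V(toy)$name

print(comp$no)
print(largest_clique)
print(global_cc)
print(local_cc)
```

Node A has a local coefficient of 1 because its two neighbors are connected to each other; node C's is lower because D is not connected to A or B.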

Communities and community detection

Density

A Sparse Network

A Dense Network
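
Density is the number of edges that exist divided by the number of edges that could exist. As a sketch, compare two generated 10-node networks (these are built-in igraph shapes, not the networks pictured on the slides).

```r
library(igraph)

# A ring: each of 10 nodes is connected only to its two neighbors (10 edges)
sparse_net <- make_ring(10)

# A complete network: every pair of the 10 nodes is connected (45 edges)
dense_net <- make_full_graph(10)

# Density = existing edges / possible edges
sparse_density <- edge_density(sparse_net)
dense_density <- edge_density(dense_net)

print(sparse_density)
print(dense_density)
```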

There are many ways to visualize and represent a network

Adjacency Matrix

Tilman Piesk, via Wikipedia

Adjacency List

  • A adjacent to B,C
  • B adjacent to A,C
  • C adjacent to A,B
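
The same three-node example (A, B, and C all connected to each other) can be written both ways in igraph. This is a sketch for illustration only.

```r
library(igraph)

# The triangle from the adjacency list above
tri <- graph_from_literal(A - B, B - C, A - C)

# Adjacency matrix: a 1 means the row node and column node share an edge
adj_mat <- as_adjacency_matrix(tri, sparse = FALSE)

# Adjacency list: for each node, the set of nodes it is adjacent to
adj_list <- as_adj_list(tri)

# Look up the neighbors of a single node by name
a_neighbors <- neighbors(tri, "A")$name

print(adj_mat)
print(a_neighbors)
```

For an undirected network the adjacency matrix is symmetric: the entry for (A, B) always matches the entry for (B, A).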

Other Important Concepts

Triadic Closure

Assortative mixing/Homophily

Preferential Attachment

Weak Ties

Small World Network

  • low average path length
  • high clustering coefficient
  • often a few large hubs (a power-law degree distribution is typical of “scale-free” networks, which frequently are also small worlds)
  • low diameter (usually around “six degrees”)
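
To get a feel for these properties, one option is to generate a network with igraph's sample_smallworld() and measure it. The sketch below compares a regular ring lattice with a version where a small share of edges is randomly rewired; the sizes, rewiring probability, and seed are arbitrary choices for illustration.

```r
library(igraph)

set.seed(42)  # arbitrary seed so the random rewiring is repeatable

# Ring lattice: 100 nodes, each linked to its 3 nearest neighbors on each side
lattice <- sample_smallworld(dim = 1, size = 100, nei = 3, p = 0)

# Same lattice, but each edge is rewired at random with probability 0.05
rewired <- sample_smallworld(dim = 1, size = 100, nei = 3, p = 0.05)

# Compare average shortest path length and global clustering coefficient
comparison <- data.frame(
  network    = c("lattice", "rewired"),
  avg_path   = c(mean_distance(lattice), mean_distance(rewired)),
  clustering = c(transitivity(lattice), transitivity(rewired))
)
print(comparison)
```

A handful of random "shortcut" edges is enough to change the average path length noticeably, while most of the local triangle structure is left in place.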

Working with Networks in R

Let’s start with an example.

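We’ll need two new libraries, alongside the tidyverse, which provides read_csv() and ggplot().

library(tidyverse) # read_csv() and ggplot()

library(igraph) # Network tools and metrics

library(networkD3) # Interactive network visualizations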

We can download the data and load it into a “network object” with igraph.

First, download got-edges.csv. What does this data look like?

# Read in edgelist CSV
edges <- read_csv("got-edges.csv")

# Create igraph object from edgelist
G <- graph_from_data_frame(d = edges, directed = FALSE)
# igraph looks for a lowercase "weight" edge attribute, so copy the CSV's "Weight" column into it
E(G)$weight <- E(G)$Weight
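
If you want to try these steps before downloading the file, here is a sketch that uses a tiny made-up edge list in place of got-edges.csv. The column names Source, Target, and Weight are an assumption about the CSV's layout, and the rows are invented.

```r
library(igraph)

# Hypothetical stand-in for the CSV: one row per edge
edges_demo <- data.frame(
  Source = c("A", "A", "B"),
  Target = c("B", "C", "C"),
  Weight = c(5, 2, 1)
)

# The first two columns become the edge endpoints; other columns become edge attributes
G_demo <- graph_from_data_frame(d = edges_demo, directed = FALSE)
E(G_demo)$weight <- E(G_demo)$Weight

# Quick checks on what was loaded
n_nodes <- vcount(G_demo)       # number of nodes
n_edges <- ecount(G_demo)       # number of edges
weighted <- is_weighted(G_demo) # TRUE once a "weight" attribute exists

print(c(nodes = n_nodes, edges = n_edges))
print(weighted)
```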

We can use igraph to calculate different metrics.

# Calculate degree and add to network data
V(G)$degree <- degree(G)

# Calculate betweenness centrality and add to network data
V(G)$betweenness <- betweenness(G, normalized=TRUE)

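# Calculate the ratio of betweenness to degree
# (a high ratio marks nodes that act most like "bridges": few connections,
# but many shortest paths run through them; this uses raw, unnormalized betweenness)
V(G)$betweenness_over_degree <- betweenness(G)/degree(G)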

You can convert node/vertex data to a dataframe.

nodes <- igraph::as_data_frame(G, what="vertices")
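
Once the node data is in a dataframe, ordinary dataframe tools apply. As a sketch on an invented four-node network (not the course data), this sorts nodes from highest to lowest betweenness.

```r
library(igraph)

# Hypothetical toy network with metrics stored as node attributes
toy <- graph_from_literal(A - B, A - C, A - D, B - C)
V(toy)$degree <- degree(toy)
V(toy)$betweenness <- betweenness(toy)

# One row per node, one column per node attribute
toy_nodes <- igraph::as_data_frame(toy, what = "vertices")

# Sort so the node with the highest betweenness comes first
ranked <- toy_nodes[order(-toy_nodes$betweenness), ]
print(ranked)
```

The igraph:: prefix matters because other packages (such as dplyr) also export a function named as_data_frame.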

What happens if you try:

edges <- igraph::as_data_frame(G, what="edges")

With a dataframe, you can easily see degree and betweenness distributions.

# Degree distribution
ggplot(nodes, aes(degree)) +
     geom_histogram()

How would you make a graph of the betweenness distribution?