{ "cells": [ { "cell_type": "markdown", "id": "7ff46f8d-910d-42a9-979e-7650e3cf3e65", "metadata": {}, "source": [ "# Working with Graph Objects\n", "\n", "This includes the basics of NetworkX and how to work with its specialized Graph objects. It uses the [Marvel network](https://github.com/melaniewalsh/sample-social-network-datasets/tree/master/sample-datasets/marvel) data as an example.\n", "\n", "## Importing a Graph object" ] }, { "cell_type": "code", "execution_count": 1, "id": "a1efdb13-5b1e-40f7-abca-9c50ec879de1", "metadata": { "tags": [] }, "outputs": [], "source": [ "# Import NetworkX and key data science libraries\n", "import networkx as nx\n", "import pandas as pd\n", "import numpy as np\n", "import altair as alt" ] }, { "cell_type": "markdown", "id": "3e6cb0cd-c4f8-4861-ac5d-a61a508ee984", "metadata": {}, "source": [ "NetworkX provides [lots of different functions for importing and exporting network data](https://networkx.org/documentation/stable/reference/readwrite/index.html). Here we'll go over the three you will most commonly encounter.\n", "\n", "Network data is often stored in edge lists, plaintext files where every row contains a pair of nodes, separated by whitespace. Sometimes these files also include edge weights.\n", "\n", "NetworkX has a `read_edgelist()` function for these kinds of files. If your file has a third column for weights, you can use `read_weighted_edgelist()`." 
] }, { "cell_type": "code", "execution_count": 2, "id": "ea1b4123-228c-4cb2-9e0f-d9e0733397f8", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Graph with 327 nodes and 5818 edges\n" ] } ], "source": [ "# Import an edgelist of high school student interaction data.\n", "HS = nx.read_weighted_edgelist(\"../data/contact-high-school-proj-graph.txt\")\n", "print(HS)" ] }, { "cell_type": "markdown", "id": "a4159578-a188-4faf-80ca-85a0f4f12747", "metadata": {}, "source": [ "You can also use NetworkX to read `.gml` files, which record network data using the Graph Modelling Language." ] }, { "cell_type": "code", "execution_count": 3, "id": "3fbb8480-b090-4948-a427-042f019527d0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Graph with 115 nodes and 613 edges\n" ] } ], "source": [ "# Import a GML file of college football data.\n", "F = nx.read_gml(\"../data/football.gml\")\n", "print(F)" ] }, { "cell_type": "markdown", "id": "692bc64a-b0b1-4576-804c-ce8b18068252", "metadata": {}, "source": [ "Finally, you'll also encounter edge list data in CSV files, where the node pairs are separated by commas instead of whitespace. In these cases, it's often easier to use Pandas as an intermediary.\n", "\n", "First, read the CSV with Pandas." ] }, { "cell_type": "code", "execution_count": 4, "id": "17fad08b-a6a0-47d1-83d9-5a4d5e0fb55e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | Source | Target | Weight |
|---|---|---|---|
| 0 | Black Panther / T'chal | Loki [asgardian] | 10 |
| 1 | Black Panther / T'chal | Mantis / ? Brandt | 23 |
| 2 | Black Panther / T'chal | Iceman / Robert Bobby | 12 |
| 3 | Black Panther / T'chal | Marvel Girl / Jean Grey | 10 |
| 4 | Black Panther / T'chal | Cyclops / Scott Summer | 14 |
| ... | ... | ... | ... |
| 9886 | Mr. Fantastic Doppel | Iron Man Doppelgange | 7 |
| 9887 | Maddicks, Arthur Art | Hodge, Cameron | 18 |
| 9888 | Maddicks, Arthur Art | Leech | 55 |
| 9889 | Leech | Hodge, Cameron | 11 |
| 9890 | Mam'selle Hepzibah | Raza Longknife | 34 |

9891 rows × 3 columns
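
With the edge list in a DataFrame, one way to turn it into a Graph object is NetworkX's `from_pandas_edgelist()`. The following is a minimal sketch, assuming the `Source`, `Target`, and `Weight` column names shown in the table above; the three-row DataFrame is a hypothetical stand-in for the full CSV, and the variable names `edges` and `G` are illustrative.

```python
import networkx as nx
import pandas as pd

# Hypothetical stand-in for the full CSV: three rows copied from the table above.
edges = pd.DataFrame({
    "Source": ["Black Panther / T'chal", "Black Panther / T'chal", "Leech"],
    "Target": ["Loki [asgardian]", "Mantis / ? Brandt", "Hodge, Cameron"],
    "Weight": [10, 23, 11],
})

# Build an undirected Graph; each row becomes one edge,
# and the Weight column is stored as an edge attribute named "Weight".
G = nx.from_pandas_edgelist(edges, source="Source", target="Target", edge_attr="Weight")

print(G)
print(G["Leech"]["Hodge, Cameron"]["Weight"])
```

Passing `edge_attr="Weight"` keeps the attribute under the column's own name. If you want algorithms that look for an attribute called `weight` to pick it up, you would need to rename the column first or pass `weight="Weight"` to those functions.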
| | character | degree_centrality |
|---|---|---|
| 0 | Absorbing Man / Carl C | 0.159509 |
| 1 | Angel / Warren Kenneth | 0.518405 |
| 2 | Ant-man / Dr. Henry J. | 0.500000 |
| 3 | Ant-man Ii / Scott Har | 0.156442 |
| 4 | Apocalypse / En Sabah | 0.171779 |
| ... | ... | ... |
| 322 | Wrecker Iii / Dirk Gar | 0.165644 |
| 323 | X-man / Nathan Grey | 0.095092 |
| 324 | Zabu | 0.076687 |
| 325 | Zero | 0.095092 |
| 326 | Zeus | 0.073620 |

327 rows × 2 columns
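
The cell that produced the table above is not shown here, so the following is only a sketch of one way to build a similar two-column table. It assumes NetworkX's `degree_centrality()` and uses a hypothetical four-node toy graph in place of the Marvel network; the column names match the table above.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy graph: node "A" is connected to every other node.
G = nx.Graph()
G.add_edges_from([("A", "B"), ("A", "C"), ("A", "D")])

# degree_centrality returns a dict mapping each node to
# (number of neighbors) / (number of nodes - 1).
centrality = nx.degree_centrality(G)

# Turn the dict into a two-column DataFrame, sorted by node name.
df = pd.DataFrame(list(centrality.items()), columns=["character", "degree_centrality"])
df = df.sort_values("character").reset_index(drop=True)

print(df)
```

In this toy graph, "A" touches all three other nodes and so has a centrality of 1.0, while each of the others has 1/3.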
| | Source | Target | Weight |
|---|---|---|---|
| 0 | Absorbing Man / Carl C | Enchantress / Amora / He | 15 |
| 1 | Absorbing Man / Carl C | Fandral [asgardian] | 6 |
| 2 | Absorbing Man / Carl C | Iron Man Iv / James R. | 12 |
| 3 | Absorbing Man / Carl C | Lizard / Dr. Curtis Co | 12 |
| 4 | Absorbing Man / Carl C | Molecule Man / Owen Re | 13 |
| ... | ... | ... | ... |
| 9886 | Wrecker Iii / Dirk Gar | Titania Ii / Mary Skee | 12 |
| 9887 | Wrecker Iii / Dirk Gar | Ulik | 6 |
| 9888 | Wrecker Iii / Dirk Gar | Volcana / Marsha Rosen | 10 |
| 9889 | Zabu | High Evolutionary / He | 5 |
| 9890 | Zabu | Ka-zar / Kevin Plunder | 159 |

9891 rows × 3 columns