MakeoverMonday: Mapping a Global City Network — 5,470 nodes, 10,596 ties

MakeoverMonday
Python
Finance
Reframing the 5,470-city, 10,596-link graph (Wikidata QIDs) into a map to surface geographic hubs
Author

chokotto

Published

May 11, 2026

Overview

This week’s dataset ships as two tidy files: a city register with coordinates and a link list connecting city QIDs. Community takes on this challenge often reach for a force-directed “hairball” to show how places relate. I took a map-first angle instead: put every city where it actually sits on Earth, then let the relationships ride on top as size and color. That way the structure reads through a geographic lens, not just network physics.
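Because the two files are joined on city QIDs, it is worth confirming that every link endpoint actually appears in the city register before counting connections. The sketch below is a dependency-free illustration on invented toy IDs; `dangling_endpoints` is a hypothetical helper name, and the `(source, target)` pair shape mirrors the `source` and `target` columns used in the plotting code.

```python
def dangling_endpoints(city_ids, links):
    """Return link endpoint ids that are absent from the city register."""
    known = set(city_ids)
    missing = set()
    for source, target in links:
        if source not in known:
            missing.add(source)
        if target not in known:
            missing.add(target)
    return missing


# Toy data, invented for illustration (not taken from cities.csv or links.csv)
toy_city_ids = ["Q1", "Q2", "Q3"]
toy_links = [("Q1", "Q2"), ("Q2", "Q9")]

orphans = dangling_endpoints(toy_city_ids, toy_links)
print(sorted(orphans))
```

An empty result on the real files would mean every tie can be drawn between two mapped cities; anything else is worth listing before computing degrees.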

Why the change? A node-link diagram is great for topology, but it hides where those ties live. With lng/lat, country, and continent in hand, we can show which cities sit near each other while still surfacing connectivity. That supports questions like “which regions are overrepresented?” and “where are the big hubs?” without losing a sense of place.

A quick look at the schema shows a globally scoped graph:

  • 5,470 rows in cities.csv; 10,596 rows in links.csv
  • Longitude spans -172 to 178 and latitude -54.8 to 72.8 (median 13.3°E, 44.9°N)
  • countrycd has 7 missing values; continent is fully populated
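Numbers like these can be reproduced with a few lines. The following is a minimal sketch using only the standard library, run here on invented toy rows rather than the real file. `summarize_cities` is a hypothetical helper, and the keys follow the column names that appear in the post (`lng`, `lat`, `countrycd`, `continent`).

```python
from statistics import median


def summarize_cities(rows):
    """Return row count, coordinate ranges and medians, and missing-value counts."""
    lngs = [r["lng"] for r in rows if r.get("lng") is not None]
    lats = [r["lat"] for r in rows if r.get("lat") is not None]
    return {
        "n_rows": len(rows),
        "lng_range": (min(lngs), max(lngs)),
        "lat_range": (min(lats), max(lats)),
        "lng_median": median(lngs),
        "lat_median": median(lats),
        "missing_countrycd": sum(1 for r in rows if not r.get("countrycd")),
        "missing_continent": sum(1 for r in rows if not r.get("continent")),
    }


# Toy rows, invented for illustration (not taken from cities.csv)
toy_rows = [
    {"id": "Q1", "lng": 139.0, "lat": 36.0, "countrycd": "JP", "continent": "Asia"},
    {"id": "Q2", "lng": 13.0, "lat": 45.0, "countrycd": "IT", "continent": "Europe"},
    {"id": "Q3", "lng": -70.0, "lat": -33.0, "countrycd": None, "continent": "South America"},
]

summary = summarize_cities(toy_rows)
print(summary)
```

With the real register loaded as a list of dicts (for example via `csv.DictReader`, converting coordinates to floats first), the same function should yield the ranges, medians, and missing counts quoted above.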

The post walks through a compact, map-centered redesign and what it reveals about coverage and connectivity. See the data on data.world for context.

Original Visualization

Source: MakeoverMonday

The original visualization for this challenge was a node-link network: cities (QIDs) as circular nodes connected by straight edges, typically laid out with a force-directed algorithm. It aimed to communicate which places act as hubs by using node size/labels and the density of connections, with no geographic axes—just x/y from the layout to emphasize topology over location.

Dataset

Code
import sys
from pathlib import Path

import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

# Import posts/_mm_layout.py (the working directory is either the post's subfolder or the Quarto project root)
_p = Path.cwd()
if (_p / "_mm_layout.py").exists():
    _posts = _p
elif (_p.parent / "_mm_layout.py").exists():
    _posts = _p.parent
elif (_p / "posts" / "_mm_layout.py").exists():
    _posts = _p / "posts"
else:
    _posts = _p
sys.path.insert(0, str(_posts))
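# Assumption: PAL, THEME, add_source and assert_no_title_overlap (used in the chart code)
# are defined in the same helper module as apply_mm_layout
from _mm_layout import PAL, THEME, add_source, apply_mm_layout, assert_no_title_overlap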
Code
# Load this week's data
# df = pd.read_csv("data.csv")

My Makeover

What I Changed

I shifted from a force-directed network to a geo-anchored view so location informs the reading. With explicit lng/lat, each city lands on a map; connectivity is encoded in size and color rather than in where the algorithm chooses to place nodes. This keeps the spatial story intact while still elevating structure.

Concretely, I:

  • Chose a map with points for cities, sizing nodes by link count (degree) and coloring by continent to balance topology with geography.
  • Swapped arbitrary force x/y for actual lng/lat and used a consistent projection; adjusted size with a mild nonlinear scale to control outliers.
  • Framed the analysis around coverage and hubs by region (e.g., completeness across continents, missing country codes) rather than listing individual high-degree cities.
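The degree count and the "mild nonlinear scale" can be sketched in isolation. The snippet below is a standard-library illustration on invented toy links; `degree_counts` and `marker_size` are hypothetical helper names, while the formula itself (square root of degree, times 3, plus 4) is the one used in the chart code.

```python
from collections import Counter
import math


def degree_counts(links):
    """Count how many link endpoints touch each city id (undirected degree)."""
    counts = Counter()
    for source, target in links:
        counts[source] += 1
        counts[target] += 1
    return counts


def marker_size(degree, factor=3.0, base=4.0):
    """Square-root scale: size still grows with degree, but hubs do not swamp the map."""
    return math.sqrt(degree) * factor + base


# Toy links, invented for illustration (not taken from links.csv)
toy_links = [("Q1", "Q2"), ("Q1", "Q3"), ("Q1", "Q4"), ("Q1", "Q5"), ("Q2", "Q3")]

deg = degree_counts(toy_links)
sizes = {city: marker_size(d) for city, d in deg.items()}
print(sizes)
```

Under this scale a city with 100 links gets a marker of size 34 rather than the 304 a linear version (3 × degree + 4) would give, while an unconnected city still shows at the base size of 4.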

This combination preserves the graph’s relationships but avoids the “hairball” effect, surfaces regional patterns at a glance, and makes data quality (like the 7 missing countrycd values) immediately visible.
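The regional framing can also be checked in tabular form alongside the map. Here is a minimal sketch, assuming each city record already carries a `degree` value; `coverage_by_continent` is a hypothetical helper and the rows are invented toy data.

```python
from collections import defaultdict


def coverage_by_continent(cities):
    """Tally city count, summed degree, and missing country codes per continent."""
    table = defaultdict(lambda: {"cities": 0, "total_degree": 0, "missing_countrycd": 0})
    for city in cities:
        # Mirror the chart code: a missing continent is grouped as 'Unknown'
        row = table[city.get("continent") or "Unknown"]
        row["cities"] += 1
        row["total_degree"] += city.get("degree", 0)
        if not city.get("countrycd"):
            row["missing_countrycd"] += 1
    return dict(table)


# Toy rows, invented for illustration (not taken from cities.csv)
toy_cities = [
    {"id": "Q1", "continent": "Asia", "countrycd": "JP", "degree": 4},
    {"id": "Q2", "continent": "Asia", "countrycd": None, "degree": 1},
    {"id": "Q3", "continent": "Europe", "countrycd": "IT", "degree": 2},
    {"id": "Q4", "continent": None, "countrycd": "XX", "degree": 0},
]

coverage = coverage_by_continent(toy_cities)
for continent, stats in sorted(coverage.items()):
    print(continent, stats)
```

A table like this answers "which regions are overrepresented?" directly and shows where the missing countrycd values cluster.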

Visualization

Code
df_cities = pd.read_csv('data/cities.csv')
df_links = pd.read_csv('data/links.csv')

# Compute degree (link count) per city by counting occurrences in source and target
edges = pd.concat([df_links['source'], df_links['target']], ignore_index=True).rename('id').to_frame()
deg = edges['id'].value_counts().rename('degree').reset_index().rename(columns={'index': 'id'})

# Join degree back to city table; missing degrees mean 0
cities = df_cities.merge(deg, on='id', how='left')
cities['degree'] = cities['degree'].fillna(0).astype(int)

# Drop rows missing coordinates
cities = cities.dropna(subset=['lng', 'lat'])

# Prepare continents (treat any missing as 'Unknown')
cities['continent'] = cities['continent'].fillna('Unknown')
continents = sorted(cities['continent'].unique().tolist())

fig = go.Figure()
for i, cont in enumerate(continents):
    sub = cities[cities['continent'] == cont]
    # mild nonlinear size scale so high-degree hubs stand out without overwhelming
    sizes = (sub['degree'].astype(float).pow(0.5) * 3) + 4
    hover = sub['name'] + ' (' + sub['id'] + ')<br>Links: ' + sub['degree'].astype(str)

    fig.add_trace(go.Scattergeo(
        lon=sub['lng'],
        lat=sub['lat'],
        text=hover,
        hoverinfo='text',
        mode='markers',
        name=cont,
        marker=dict(
            size=sizes,
            color=PAL[i % len(PAL)],
            opacity=0.8,
            line=dict(width=0.2, color='rgba(0,0,0,0.2)')
        )
    ))

fig.update_layout(
    **THEME,
    # Scattergeo traces draw on the geo subplot, so Cartesian xaxis/yaxis settings are omitted
    geo=dict(
        scope='world',
        projection_type='natural earth',
        showcountries=True,
        showcoastlines=True,
        landcolor='rgba(240,240,240,0.9)'
    ),
    legend=dict(traceorder='normal')
)

apply_mm_layout(
    fig,
    'Cities positioned by real coordinates, sized by link count',
    subtitle='Each point is a city; marker size = number of connections (degree). Colors show continent.',
    legend_position='top',
    n_legend_items=len(continents),
)

add_source(fig)
assert_no_title_overlap(fig)
fig.show()

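# Static export is optional: write_image needs an image-export engine such as kaleido,
# so a failure here is ignored rather than allowed to break the render
try:
    fig.write_image('chart-1.png', width=1200, height=520, scale=2)
except Exception:
    pass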

Key Takeaways

  • Global spread is clear in the numeric bounds: longitude runs from -172 to 178 and latitude from -54.8 to 72.8. The medians (13.3°E, 44.9°N) sit well north of the equator, so at least half of the cities lie at or above 44.9°N.
  • Scale matters: there are 5,470 cities connected by 10,596 links, enough for a worldwide network while staying sparse enough that many nodes will show few ties.
  • The first five rows of cities.csv are all Japanese cities (Fukaya, Ishigaki, Tsuru, Koga, and Kakegawa), tagged Asia with countrycd JP. That is a reminder to check regional balance beyond the first look.
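The sparsity point can be made concrete with two back-of-the-envelope numbers derived from the counts above. This sketch assumes the links are undirected, that every endpoint appears in the city register, and that there are no duplicate links or self-links; none of that is verified here.

```python
# Counts reported in the post
n_cities = 5470
n_links = 10596

# Each undirected link contributes two endpoints, so mean degree = 2E / N
mean_degree = 2 * n_links / n_cities

# Share of all possible city pairs that are actually linked
# (assumes no duplicate links and no self-links)
possible_pairs = n_cities * (n_cities - 1) // 2
density = n_links / possible_pairs

print(f"mean degree: {mean_degree:.2f}")
print(f"density: {density:.6f}")
```

Under those assumptions the average city has roughly 3.9 link endpoints, and fewer than 0.1% of possible city pairs are connected, which is why most markers on the map stay small.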

This post is part of the MakeoverMonday weekly data visualization project.

Disclaimer

This analysis is for educational and practice purposes only. Data visualizations and interpretations are based on the provided dataset and may not represent complete or current information.