
Tropict: A clearer depiction of the tropics

Tropict is a set of Python and R scripts that adjust the globe to make land masses in the tropics fill up more visual real estate. It does this by exploiting the ways continents naturally “fit into” each other, splicing out wide areas of empty ocean and nestling the continents closer together.

All Tropict scripts are designed to show the region between 30°S and 30°N. In an equirectangular projection, that looks like this:

[Figure: the tropics (30°S to 30°N) in an equirectangular projection]

It is almost impossible to see what is happening on land: the oceans dominate. By removing open ocean and applying the Gall-Peters projection, we get a clearer picture:

[Figure: the same region after Tropict splicing, in the Gall-Peters projection]

There’s even a nice spot for a legend in the lower-left! Whether out of convenience or lack of time, the tools I’ve made for producing these maps are split between R and Python. Here’s a handy guide to which tool to use:

[Figure: decision guide for choosing between the Python and R tools]

(1) Supported image formats are listed in the Pillow documentation.
(2) A TSR file is a Tropict Shapefile Reinterpretation file, and includes the longitudinal shifts for each hemisphere.

Let’s say you find yourself with a NetCDF file in need of Tropiction, called bio-2.nc4. It’s already clipped to between 30°S and 30°N. The first step is to call splice_grid.py to create a Tropicted NetCDF:

python ../splice_grid.py subjects/bio-2.nc4 ../bio-2b.nc4
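Under the hood, the splice amounts to cutting out ocean-only longitude bands and shifting the remaining blocks toward one another. Here is a minimal, hypothetical sketch of that idea using xarray; the longitude bounds, the offset, and the two-block split are invented for illustration, not the shifts Tropict actually uses:

import xarray as xr

# Hypothetical sketch of a Tropict-style splice, not the actual splice_grid.py.
# The longitude bounds and the 25-degree shift below are made-up values.
ds = xr.open_dataset("subjects/bio-2.nc4")      # original 30S-30N grid, with a "lon" coordinate
west = ds.sel(lon=slice(-120, 60))              # Americas-plus-Africa block (invented bounds)
east = ds.sel(lon=slice(90, 160))               # Asia-Pacific block (invented bounds)
east = east.assign_coords(lon=east.lon - 25)    # nestle the eastern block up against the western one
spliced = xr.concat([west, east], dim="lon")    # the empty ocean between the blocks is gone
spliced.to_netcdf("bio-2b.nc4")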

But that NetCDF doesn’t show country boundaries. To show country boundaries, you can follow the example for using draw_map.R:

library(ncdf4)
library(RColorBrewer)
## splicerImage, addMap, and addSeams below come from Tropict's draw_map.R
source("draw_map.R")  ## adjust the path to wherever draw_map.R lives

## Open the Tropicted NetCDF
database <- nc_open("bio-2b.nc4")
## Extract one variable
map <- ncvar_get(database, "change")

## Identify the range of values there
maxmap <- max(abs(map), na.rm=T)

## Set up colors centered on 0
colors <- rev(brewer.pal(11,"RdYlBu"))
breaks <- seq(-maxmap, maxmap, length.out=12)

## Draw the NetCDF image as a background
splicerImage(map, colors, breaks=breaks)
## Add country boundaries
addMap(border="#00000060")
## Add seams where Tropict knits the map together
addSeams(col="#00000040")

Here’s an example of the final result, for a bit of my coffee work:

[Figure: an example final result, from my coffee work]

For more details, check out the documentation at the GitHub page!

And just for fun, here were two previous attempts of re-hashing the globe:

[Figure: first attempt, with Australia and Hawaii moved into the Indian Ocean]

I admit that moving Australia and Hawaii into the Indian Ocean was over-zealous, but they fill up the space so well!

[Figure: another attempt, with Australia split in two]

Here I can still use the slick division between Indonesia and Papua New Guinea, and Hawaii fits right on the edge, but Australia gets split in two.

Enjoy the tropics!

Redrawing boundaries for the GCP

The Global Climate Prospectus will describe impacts across the globe, at high resolution. That means choosing administrative regions that people care about, and representing impacts within countries. However, choosing relevant regions is tough work. We want to represent more regions where there are more people, but we also want to have more regions where spatial climate variability will produce different impacts.

We now have an intelligent way to do just that, presented this week at the meeting of the American Geophysical Union. It is generalizable, allowing the relative role of population, area, climate, and other factors to be adjusted while making hard decisions about what administrative units to combine.  See the poster here.

Below is the successive agglomeration of regions in the United States, balancing the effects of population, area, temperature and precipitation ranges, and compactness. The map progresses from 200 regions to ten.

[Animation: successive agglomeration of US regions, from 200 down to 10]

Across the globe, some countries are maintained at the resolution of their highest available administrative unit, while others are subjected to high levels of agglomeration.

[Figure: agglomerated regions across the globe]

The tool is generalizable, and able to take any mechanism for proposing regions and scoring them. That means that it can also be used outside of the GCP, and we welcome anyone who wants to construct regions appropriate for their analysis to contact us.
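To make that plug-in structure concrete, here is a schematic sketch, not the GCP code itself, of a greedy agglomeration loop in which the proposal and scoring rules can be swapped out:

from itertools import combinations

def agglomerate(regions, score_merge, target=10):
    """Greedily merge the best-scoring pair of regions until only `target` remain."""
    regions = list(regions)
    while len(regions) > target:
        a, b = max(combinations(regions, 2), key=lambda pair: score_merge(*pair))
        regions.remove(a)
        regions.remove(b)
        regions.append(a | b)   # here a "region" is just a set of administrative units
    return regions

# Toy example: regions start as single counties; the (made-up) score prefers merging small regions.
counties = [{i} for i in range(8)]
print(agglomerate(counties, lambda a, b: -(len(a) + len(b)), target=3))

In the actual exercise, the scoring balances population, area, temperature and precipitation ranges, and compactness, as in the US animation above.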

[Figure: the agglomeration algorithm]

Top 500: Leverage Points: Places to Intervene in a System

This is another installment of my top 500 journal articles: the papers that I keep coming back to and recommending to others.

Few papers have had a larger impact on my thinking and goals than Donella Meadows’s article Leverage Points: Places to Intervene in a System:

Folks who do systems analysis have a great belief in “leverage points.” These are places within a complex system (a corporation, an economy, a living body, a city, an ecosystem) where a small shift in one thing can produce big changes in everything.

She then explains how to understand them and where to find them, with fantastic examples from across the systems literature: global trade, ecology, urban planning, energy policy, and more. Reading it makes you feel like a kid in a candy shop, with so many leverage points to choose from. Shamelessly stealing a punch-line graphic, here are the leverage points:

[Figure: the leverage points, ranked by effectiveness]

I have a small example of this, which you can try out. Go to my Thermostat Experiment and try to stabilize the temperature at 4 °C, without clicking the “Show Graph” button until at least 30 “game minutes” have passed. Then read on.

I’ve had people get very mad at me after playing this game. Some people find it impossible, get frustrated, and want to lash out. It’s a very simple system, but you are part of the system and you’re only allowed to use the weakest level of leverage point: the parameter behind the thermostat knob. What would each of the other leverage points look like?

  • 11. Buffer sizes: you can sit at a bad temperature for longer without hurting your supplies
  • 10. Material stocks and flows: you can move all the supplies out of the broken refrigerator
  • 9. Length of delays: the delay between setting the thermostat and seeing a temperature change is shorter
  • 8. Negative feedback: you’re better at setting the temperature
  • 7. Positive feedback: the recovery from a bad temperature is faster
  • 6. Information flows: you get to use the “Show Graph” button
  • 5. Rules of the system: you can get a new job not working at a refrigerator warehouse
  • 4. Change system structure: you can modify the Thermostat experiment code
  • 3. Goals of the system: you replace the thermostat with a “fresh-o-stat” and just turn that up
  • 2. System mindset: you can close the website
  • 1. Transcending paradigms: you can close your computer
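To see why operating on the parameter alone is so maddening, here is a toy simulation (invented numbers, not the actual Thermostat Experiment code) of a fridge whose temperature responds to the knob only after a delay, so that reacting to the latest reading overshoots:

import collections

temp, knob = 10.0, 4.0
pipeline = collections.deque([4.0] * 5, maxlen=5)   # a 5-minute delay between the knob and its effect

for minute in range(30):
    pipeline.append(knob)
    felt_setting = pipeline[0]                      # the setting the fridge is responding to right now
    temp += 0.5 * (felt_setting - temp)             # the fridge drifts toward the delayed setting
    knob = 4.0 - (temp - 4.0)                       # naive reaction to the current reading
    print("minute %2d: temp = %5.2f" % (minute, temp))

Run it and the temperature overshoots and oscillates around 4 °C. The stronger leverage points above correspond to things like shortening the delay queue (9), improving the reaction rule (8), or looking at the whole graph rather than the latest reading (6).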

Resources:
Take a look at this podcast on Leverage Points from Made You Think.

Observations on US Migration

The effects of climate change on migration are a… moving concern. The news usually goes under the heading of climate refugees, like the devastated hordes emanating from Syria. But there is already a less conspicuous and more persistent flow of climate migrants: those driven by a million proximate causes related to temperature rise. These migrants are likely to ultimately represent a larger share of human loss, and produce a larger economic impact, than those with a clear crisis to flee.

In most parts of the world, we only have coarse information about where migrants move. The US census might not be representative of the rest of the world, but it’s a pool of light where we can look for our key. I matched up the ACS County-to-County Migration Data with my favorite set of county characteristics, the Area Health Resource Files from the US Department of Health and Human Services. I did not look at migration driven by temperature; rather, I wanted to know whether the patterns we had been seeing there reflected anything more than the null hypothesis. Here’s what I found.
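The matching step itself is straightforward; here is a hedged sketch of it (the file names and county-identifier columns are placeholders, not the actual ACS or AHRF field names):

import pandas as pd

# Placeholder file and column names; the real ACS and AHRF extracts are organized differently.
flows = pd.read_csv("acs_county_to_county.csv")            # one row per origin-destination county pair
ahrf = pd.read_csv("ahrf_county_characteristics.csv")      # one row per county, keyed by a "fips" column

# add_prefix("orig_") turns ahrf's "fips" into "orig_fips", so it joins against the flows table
merged = (flows
          .merge(ahrf.add_prefix("orig_"), on="orig_fips")  # attach origin-county characteristics
          .merge(ahrf.add_prefix("dest_"), on="dest_fips")) # attach destination-county characteristics
merged["income_ratio"] = merged["dest_income"] / merged["orig_income"]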

First, the distribution of the distances that people move is highly skewed. The median distance is about 500 km; the mean is almost 1000 km. Around 10% of movers don’t move more than 100 km; another 10% move more than 2500 km.

[Figure: distribution of moving distances]
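For reference, distances like these can be computed from county centroids with the haversine formula; a minimal sketch (the centroid coordinates below are just an example):

import numpy as np

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

# Roughly New York County, NY to Cook County, IL: about 1150 km
print(haversine_km(40.78, -73.97, 41.84, -87.68))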

The differences between the characteristics of the places migrants are moving from and the places they are moving to reveal an interesting fact: the US has approximate conservation of housing. The distribution of the ratio of incomes in the destination and origin counties is almost symmetric. For everyone who moves to a richer county, someone is abandoning that county for a poorer one. The same holds for the difference between the share of urban population in the destination and origin counties. These distributions are not perfectly symmetric, though. At the median, people move to counties that are 2.2% richer and 1.7% more urban.

[Figures: destination-versus-origin differences in income and urban share]

The urban share distribution tells us that most people move to a county that has about the same mix of rurality and urbanity as the one they came from. How does that stylized fact change depending on the backwardness of their origins?

[Figure: migration flows by origin and destination urbanization]

The flows in terms of numbers of people show the same symmetry as the distributions above. Note that the colors here are on a log scale, so the blue representing people moving from very rural areas to other very rural areas (lower left) is 0.4% of the light blue representing those moving from cities to cities. More patterns emerge when we condition on the flows coming out of each origin.

[Figure: the same flows, normalized within each origin]

City dwellers are least willing to move to less-urban areas. However, people from completely rural counties (< 5% urban) are more likely to move to fully urban areas than those from 10-40% urban counties. How far are these people moving? Could the pattern of migrants' urbanization be a reflection of moving to nearby counties, which have fairly similar characteristics?

[Figure: county urbanization by distance]

Just considering the pattern of counties (not their migrants) across different degrees of urbanization, how similar are counties by distance? From the top row, on average, counties within 50 km of very urban counties are only slightly less urban, while those further out are much less urban. Counties near those with 20-40% urban populations are similar to their neighbors and to the national average. More rural areas tend to also be more rural than their neighbors.

What is surprising is that these facts are almost invariant across the distance considered. If anything, rural areas are *more* rural relative to their immediate neighbors than relative to counties further away.

So, at least in the US, even if people are inching their way spatially, they can quickly find themselves in the middle of a city. People don’t change the cultural characteristics of their surroundings (in terms of urbanization and income) much, but it is again the suburbs that are stagnant, with rural people exchanging with big cities almost one-for-one.

The Society, Environment and Economics Lab


I’d like to introduce SEEL, David Anthoff’s nascent lab within the Energy and Resources Group at UC Berkeley. What was initially a ramshackle group of Ph.D. students, associated with David for little more reason than that economically minded folk in ERG’s engineering-focused community need to stick together, seems to be growing into a healthy research machine. Check out the new website for the Society, Environment and Economics Lab.

The current work centers on FUND, a widely used integrated assessment model maintained by David. For a long time, models like this have been black boxes, and FUND is one of the few with open source code. That’s changing with David’s new modeling framework, Mimi, which has allowed him to rewrite FUND as a collection of interconnected components.

I like the vision, and I think it’s implemented in a way that has real legs for making climate impact assessment a more open process. But we’ll find out soon. The National Academy of Sciences is meeting soon to discuss the future of the “social cost of carbon”, an influential quantity computed by models like FUND. David is going to try to convince them that the future of impact modeling looks like Mimi. Godspeed.

One console to rule them all

I love text consoles. The more I can do without moving a mouse or opening a new window, the better. So, when I saw XKCD’s command-line interface, I grabbed the code and started to build new features into it, as my kind of browser window to a cyber world of text.

I want to tell you about my console-based time-management system, the entertainment system, the LambdaMOO world, the integration with my fledgling single-stream analysis toolbox. But the first step was to clean out the password-protected stuff, and expose the console code for anyone who wants it.

So here it is! Feel free to play around on the public version, http://ift.tt/1PEsTtI, or clone the repository for your own.

[Figure: screenshot of the console]

Here are the major changes from the original XKCD code by Chromacode:

  • Multiple “shells”: I currently just have the JavaScript and XKCD-Shell ones exposed. The JavaScript shell gives you a developer-style JavaScript console (still buggy). You can switch between the two by typing x: and j:.
  • A bookmark system: ln URL NAME makes a new bookmark; ls lists the available bookmarks, and cd NAME opens a bookmark.
  • A login/registration system: Different users can have different bookmarks (and other stuff). Leave ‘login:’ blank the first time to create a new account.
  • Some new commands, but the only one I’m sure I left in is scholar [search terms] for a Google Scholar search.

Share, expand, and enjoy!

Labor Day 2015: More hours for everyone

In the spirit of Labor Day, I did a little research into Labor issues. I wanted to explore how much time people spent either at or in transit to work. Ever since the recession, it seems like we are asked to work longer and harder than ever before. I’m thinking particularly of my software colleagues who put in 60 hour weeks as a matter of course, and I wanted to know if it’s true across sectors. Has the relentless drive for efficiency in the US economy taken us back to the limit of work-life balance?

I headed to the IPUMS USA database and collected everything I could find on the real cost of work.

When you look at average family working hours (that is, hours averaged with a spouse’s for couples), there’s been a huge shift, from an average of 20-25 hours/week to 35-40. If those numbers seem low, note that the hours are spread across the entire year, including vacation days, and include many people who are underemployed.
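For clarity, here is a hedged sketch of that family-hours measure (the column names are placeholders, not the actual IPUMS variable codes):

import pandas as pd

# Placeholder file and column names; the real IPUMS extract uses its own variable codes.
people = pd.read_csv("ipums_extract.csv")
people["weekly_hours"] = people["annual_hours"] / 52.0              # spread over the full year, vacations included
family_hours = people.groupby("family_id")["weekly_hours"].mean()   # average each worker with their spouse
print(family_hours.mean())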

The graph below shows the shift, and that it’s not driven by specifically employees or the self-employed. The grey bands show one standard deviation, with a huge range that is even larger for the self-employed.

[Figure: family working hours over time, by class of worker]

So who has been caught up in this shift? Everyone, but some industries and occupations have seen their relative work-life balance shift quite a bit. The graph below shows a point for every occupation-and-industry combination that represents more than 0.1% of my sample.

[Figure: working hours by occupation-and-industry combination]

In 1960, you were best off as a manager in mining or construction, and worst off as a laborer in the financial sector. While that laborer position has gotten much worse, it has been superseded in hours by at least two jobs: working in the military, and the manager position in mining that once looked so good. My friends in software are under the star symbols, putting in a few more hours than the average. Some of the laboring classes are doing relatively well, but still have 5 more hours of work a week than they did 40 years ago.

We are, all of us, more laborers now than we were 60 years ago. We struggle in our few remaining hours to maintain our lives, our relationships, and our humanity. The Capital class is living large, because the rest of us have little left to live.

Economic Risks of Climate Change Book out tomorrow!

The research behind the Risky Business report will be released as a fully remastered book, tomorrow, August 11!  This was a huge collaborative effort, led by Trevor Houser, Solomon Hsiang, and Robert Kopp, and coauthored with nine others, including me:

Economic Risks of Climate Change

From the publisher’s website:

Climate change threatens the economy of the United States in myriad ways, including increased flooding and storm damage, altered crop yields, lost labor productivity, higher crime, reshaped public-health patterns, and strained energy systems, among many other effects. Combining the latest climate models, state-of-the-art econometric research on human responses to climate, and cutting-edge private-sector risk-assessment tools, Economic Risks of Climate Change: An American Prospectus crafts a game-changing profile of the economic risks of climate change in the United States.

The book combines an exciting new approach, solidly grounding its results in data, with an extensive overview of the world of climate change impacts. Take a look!

Guest Post: The trouble with anticipation (Nate Neligh)

Hello everyone, I am here to do a little guest blogging today. Instead of some useful empirical tools or interesting analysis, I want to take you on a short tour through some of the murkier aspects of economic theory: anticipation. The very idea of the ubiquitous Nash Equilibrium is rooted in anticipation. Much of behavioral economics is focused on determining how people anticipate one another’s actions. While economists have a pretty decent handle on how people will anticipate and act in repeated games (the same game played over and over) and small games with a few different decisions, not as much work has been put into studying long games with complex history dependence. To use an analogy, economists have done a lot of work on games that look like poker but much less work on games that look like chess.

One of the fundamental problems is finding a long form game that has enough mathematical coherence and deep structure to allow the game to be solved analytically. Economists like analytical solutions when they are available, but it is rare to find an interesting game that can be solved by pen and paper.

Brute force simulation can be helpful. By simply simulating all possible outcomes and using a technique called backwards induction, we can solve the game in a Nash Equilibrium sense, but this approach has drawbacks. First, the technique is limited. Even with a wonderful computer and a lot of time, there are some games that simply cannot be solved in human time due to their complexity. More importantly, any solutions that are derived are not realistic. The average person does not have the ability to perform the same computations as a supercomputer. On the other hand, people are not as simple as the mechanical actions of a physics-inspired model.
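As a concrete illustration of the anticipation that backwards induction encodes, here is a minimal sketch on a toy two-stage game; the tree and payoffs are invented, and the actual network formation game below has a vastly larger state space:

def backward_induct(node):
    """Return (payoff vector, chosen action) at this node, assuming every later mover optimizes."""
    if node["children"] is None:                       # terminal node: payoffs are given
        return node["payoffs"], None
    mover = node["player"]
    best_value, best_action = None, None
    for action, child in node["children"].items():
        value, _ = backward_induct(child)
        if best_value is None or value[mover] > best_value[mover]:
            best_value, best_action = value, action
    return best_value, best_action

# Toy two-stage game: player 0 moves first, then player 1.
game = {"player": 0, "children": {
    "left":  {"player": 1, "children": {
        "a": {"children": None, "payoffs": (2, 1)},
        "b": {"children": None, "payoffs": (0, 3)}}},
    "right": {"children": None, "payoffs": (1, 1)}}}

# Prints ((1, 1), 'right'): player 0 anticipates that player 1 would pick 'b' on the left,
# leaving player 0 with 0, and so plays 'right' instead.
print(backward_induct(game))

Player 0’s choice here depends entirely on correctly anticipating player 1; in a long-form game that anticipation has to be carried through an enormous number of such branches.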

James and I have been working on a game of strategic network formation that effectively illustrates all of these problems. The model takes two parameters (the number of nodes and the cost of making new connections) and uses them to strategically construct a network in a decentralized way. The rules are extremely simple and almost completely linear, but the complexities of backwards induction make it impossible to solve by hand for a network of any significant size (some modifications can be added which shrink the state space to the point where the game can be solved). Backwards induction doesn’t work for large networks, since the number of possible outcomes grows at a rate of (roughly) […], but what we can see is intriguing. The results seem to follow a pattern, but they are not predictable.

[Figure: equilibrium networks by number of nodes and connection cost]

Each region of a different color represents a different network (colors selected based on network properties). The y-axis is the (discrete) number of nodes in the network; the x-axis is a continuous cost parameter. Compare where the color changes as the cost parameter is varied across the different numbers of nodes. As you can see, switch points tend to be somewhat similar across network scales, but they are not completely consistent.

Currently we are exploring a number of options; I personally think that agent-based modeling is going to be the key to tackling this type of problem (and those that are even less tractable) in the future. Agent-based models and genetic algorithms have the potential to be more realistic and more tractable than any more traditional solution.

Scripts for Twitter Data

Twitter data (the endless stream of tweets, the user network, and the rise and fall of hashtags) offers a flood of insight into the minute-by-minute state of society, or at least of one self-selecting part of it. A lot of people want to use it for research, and it turns out to be pretty easy to do so.

You can either purchase Twitter data or collect it in real time. If you purchase it, the data come organized for you and are available historically, but there is basically nothing in them that you can’t get yourself by monitoring Twitter in real time. I’ve used GNIP, where the going rate was about $500 per million tweets in 2013.

There are two main ways to collect data directly from Twitter: “queries” and the “stream”. Queries let you get up to 1000 tweets at any point in time: whichever most recent tweets match your search criteria. The stream gives you a fraction of a percent of all tweets continuously, filtered by your criteria, and it adds up very quickly.

Scripts for doing these two options are below, but you need to decide on the search/streaming criteria. Typically, these are search terms and geographical constraints. See Twitter’s API documentation to decide on your search options.

Twitter uses an authentication system to identify both the individual collecting the data and the tool that is helping them do it. It is easy to register a new tool, whereby you pretend that you’re a startup with a great new app. Here are the steps:

  1. Install python’s twitter package, using “easy_install twitter” or “pip install twitter”.
  2. Create an app at http://ift.tt/1oHSTpv. Leave the callback URL blank, but fill in the rest.
  3. Set the CONSUMER_KEY and CONSUMER_SECRET in the code below to the values you get on the keys and access tokens tab of your app.
  4. Fill in the name of the application.
  5. Fill in any search terms or structured searches you like.
  6. If you’re using the downloaded scripts, which output data to a CSV file, change where the file is written, to some directory (where it says “twitter/us_”).
  7. Run the script from your computer’s terminal (e.g., python search.py).
  8. The script will pop up a browser for you to log into twitter and accept permissions from your app.
  9. Get data.

Here is what a simple script looks like:

import os, twitter

APP_NAME = "Your app name"
CONSUMER_KEY = 'Your consumer key'
CONSUMER_SECRET = 'Your consumer secret'

# Do we already have a token saved?
MY_TWITTER_CREDS = os.path.expanduser('~/.class_credentials')
if not os.path.exists(MY_TWITTER_CREDS):
    # This will ask you to accept the permissions and save the token
    twitter.oauth_dance(APP_NAME, CONSUMER_KEY, CONSUMER_SECRET,
                        MY_TWITTER_CREDS)

# Read the token
oauth_token, oauth_secret = twitter.read_token_file(MY_TWITTER_CREDS)

# Open up an API object, with the OAuth token
api = twitter.Twitter(api_version="1.1", auth=twitter.OAuth(oauth_token, oauth_secret, CONSUMER_KEY, CONSUMER_SECRET))

# Perform our query
tweets = api.search.tweets(q="risky business")

# Print the results
for tweet in tweets['statuses']:
    if 'text' not in tweet:
        continue

    print(tweet)
    break

For automating Twitter collection, I’ve put together scripts for queries (search.py), streaming (filter.py), and bash scripts that run them repeatedly (repsearch.sh and repfilter.sh). Download the scripts.
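For the streaming side, the same twitter package exposes a TwitterStream interface. Here is a minimal sketch along those lines (not the downloadable filter.py itself, which also writes its results to a CSV file), reusing the credentials saved by the search script above:

import os, twitter

CONSUMER_KEY = 'Your consumer key'
CONSUMER_SECRET = 'Your consumer secret'

# Reuse the OAuth token saved by the search script above
oauth_token, oauth_secret = twitter.read_token_file(os.path.expanduser('~/.class_credentials'))
auth = twitter.OAuth(oauth_token, oauth_secret, CONSUMER_KEY, CONSUMER_SECRET)

# Open a streaming connection filtered by keyword; this runs until you stop it
stream = twitter.TwitterStream(auth=auth)
for tweet in stream.statuses.filter(track="risky business"):
    if 'text' in tweet:
        print(tweet['text'])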

To use the repetition scripts, make them executable by running “chmod a+x repsearch.sh repfilter.sh”. Then run them by typing ./repfilter.sh or ./repsearch.sh. Note that these will create many, many files over time, which you’ll have to merge together.