__________________________________________________

A CENTURY OF SPRAWL IN THE UNITED STATES:
ROAD NETWORK GRAPH DATA

Chris Barrington-Leigh and Adam Millard-Ball
__________________________________________________


Table of Contents
_________________

1 FILE FORMAT
2 ATTRIBUTES
3 TIPS AND CAVEATS
4 CITATION
5 CONTACT


This file describes the network graphs published, along with geographic
tables described separately, as part of the 2015 PNAS article
DOI: 10.1073/pnas.1504033112. The full citation is below.

The dataset released with this article is archived independently as
DOI: 10.5061/dryad.3k502 ([http://dx.doi.org/10.5061/dryad.3k502]).

Graph files represent the network structure, but not the physical
layout, of roads.

The graph files are large and can be slow to load. They are recommended
only for advanced users who are interested in network properties. In
most cases, the .tsv and .shp files will be more useful. For these
geographic tabular data, see a separate README.


1 FILE FORMAT
=============

The collection of graph files is collated in a GNU zip compressed GNU
tar file (i.e., .tar.gz or .tgz format). Each individual graph file is
also compressed using the GNU zip (gzip) format before collation.

Once extracted and decompressed, the graphs are provided in GML format,
and were written with the Python networkx library. See:
[https://networkx.github.io/documentation/latest/reference/readwrite.gml.html]

Files are provided at the metropolitan region and state level, as
follows:
- Combined Statistical Area (CSA), for counties within a CSA
- Core-Based Statistical Area (CBSA), for counties within a CBSA but not
  a CSA
- state, for counties that are not within a CSA or CBSA

The suffix of each file name indicates the FIPS code. A two-digit code
indicates a state. Longer codes indicate a CSA or CBSA.
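The two layers of compression described above (a gzip-compressed tar
archive whose members are themselves gzip-compressed) can be unpacked
with the Python standard library alone. The following is a minimal
sketch, not part of the original release: the function name
extract_graphs and the archive and directory names in the usage note are
hypothetical, and the actual file names inside the archive are not
specified in this README.

```python
import gzip
import tarfile
from pathlib import Path


def extract_graphs(archive_path, out_dir):
    """Unpack a .tar.gz of gzip-compressed graph files into out_dir.

    Hypothetical helper: returns the list of decompressed file paths.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    # "r:gz" opens a gzip-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and special entries
            data = tar.extractfile(member).read()
            name = Path(member.name).name  # drop any directory prefix
            if name.endswith(".gz"):
                # Each graph file is itself gzip-compressed.
                data = gzip.decompress(data)
                name = name[:-3]
            target = out_dir / name
            target.write_bytes(data)
            written.append(target)
    return written
```

For example, extract_graphs("graphs.tar.gz", "graphs") would write the
decompressed files into a "graphs" directory (both names are
placeholders). Each resulting GML file can then be read with
networkx.read_gml(path). Because the files are large, decompressing one
region at a time may be preferable to unpacking everything at once.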
The FIPS code lookups are available here (we use the 2013
delineations): [http://www.census.gov/geo/reference/ansi.html]

Files are created based on the 2014 vintage of the TIGER/Line files
published by the US Census Bureau.

Nodes are collapsed according to the procedure described in Section
S1.2 and Figure S3 of the Supporting Information.


2 ATTRIBUTES
============

Three attributes are included:
- for each edge, the TLID, which references the edge to the Census
  Bureau TIGER/Line files
- for each edge, the length in meters
- for each node, whether it is within an urbanized area (defined as a
  census block group where more than half of the census blocks are
  designated urban)


3 TIPS AND CAVEATS
==================

To replicate the measures of network connectivity in the published
paper (e.g. mean nodal degree and % deadends), you will need to make
the following adjustments:
- drop or ignore nodes of degree 2
- drop or ignore nodes that are not designated as urban
- top-code nodes with degree 5+ as degree 4

Because of boundary issues, the graph summary statistics (e.g. nodal
degree) may not exactly match the published results.


4 CITATION
==========

Barrington-Leigh, Christopher and Millard-Ball, Adam (2015), "A Century
of Sprawl in the United States." Proceedings of the National Academy of
Sciences. DOI: 10.1073/pnas.1504033112


5 CONTACT
=========

For further questions, please contact:
- Chris Barrington-Leigh, McGill University:
  Chris.Barrington-Leigh@McGill.ca
- Adam Millard-Ball, University of California, Santa Cruz:
  adammb@ucsc.edu
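As an illustration of the three adjustments listed in Section 3 (TIPS
AND CAVEATS), the sketch below computes mean nodal degree and % deadends
from plain Python dictionaries. It is not the authors' code. It assumes
that degrees are computed on the full graph before any nodes are
ignored, and that "% deadends" means the share of retained nodes with
degree 1. The function name connectivity_stats and both argument names
are hypothetical.

```python
def connectivity_stats(degrees, urban):
    """Return (mean nodal degree, % deadends), or None if no nodes remain.

    degrees: dict mapping node -> degree in the full graph (assumption)
    urban:   dict mapping node -> True if the node is designated urban
    """
    kept = [
        min(deg, 4)  # top-code degree 5+ as degree 4
        for node, deg in degrees.items()
        # ignore non-urban nodes and nodes of degree 2
        if urban.get(node, False) and deg != 2
    ]
    if not kept:
        return None
    mean_degree = sum(kept) / len(kept)
    # Assumption: a deadend is a retained node with degree 1.
    pct_deadends = 100.0 * sum(1 for d in kept if d == 1) / len(kept)
    return mean_degree, pct_deadends
```

With networkx, the degrees dictionary could be built as
dict(G.degree()) for a loaded graph G. The name of the urban node
attribute in the GML files is not stated in this README, so inspect the
node data before building the urban dictionary.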