Friday, November 20, 2015

Network Analysis

Background


One negative impact that sand mining can have is the deterioration of roads caused by heavy trucks carrying frac sand.  For numerous counties, this is a serious concern because it could lead to higher road maintenance costs.  The additional costs to the counties could potentially be avoided through road upgrade maintenance agreements (RUMA) between sand mining companies and county governments (Hart, Adams, and Schwartz, 2013).
Figure 1.  Elements of a good road use agreement (Hart, Adams, and Schwartz, 2013)
 
Part of understanding these costs is understanding the routes used to transport sand from mines to rail terminals.  This can be done using network analysis, a set of tools for calculating the best routes in terms of distance, expense, or time.  Applying network analysis in this way is often referred to as logistics, and it can help companies and individuals transport goods more efficiently.
 

 

Goals and Objectives

In this lab, we will perform a network analysis to find the routes that would be used to transport sand from mines to rail terminals.  We will then use these routes to estimate how much counties would have to spend on road maintenance annually.  We will be using theoretical numbers provided by our professor for how often the roads are used and what the average cost of a truck driving over a section of road is.  Because of this, our results are not intended to be an accurate representation of actual costs.  They are meant to show the process by which network analysis could be used to generate these estimates.

The general workflow for this lab is shown in Figure 2.  We prepared the data by removing mines within 1.5 km of a railroad.  This was done using PyScripter (code can be viewed here).  Then we calculated routes based on closest facilities using network analysis, built a model to incorporate the theoretical costs given to us by our professor, and calculated our final answer.
 

Methods

 
Step 1.  Prepare data using queries
Using a Python script, mines that were within 1.5 kilometers of a rail terminal were eliminated.  This was done because these mines would likely load the sand directly onto trains, and therefore would not likely have a great impact on the roads.  The mine feature class produced by the Python script was used in the following steps and can be seen in Figure 3 below.
Figure 3.  Map of locations of Sand Mines and Rail Terminals in Wisconsin.  Mines within 1.5 km of rail terminals not shown.
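The distance filter described in Step 1 can be sketched in plain Python.  This is not the script used in the lab (which ran inside ArcGIS); it is a minimal illustration that assumes the mines and terminals are already in a projected coordinate system measured in meters.  The function name `filter_mines` and the coordinates are hypothetical.

```python
import math

def filter_mines(mines, terminals, min_distance_m=1500.0):
    """Return the mines that are farther than min_distance_m from every terminal.

    mines and terminals are lists of (name, x, y) tuples, with x and y in
    meters in a projected coordinate system (an assumption of this sketch).
    """
    kept = []
    for name, x, y in mines:
        # Straight-line distance to the nearest terminal.
        nearest = min(math.hypot(x - tx, y - ty) for _, tx, ty in terminals)
        if nearest > min_distance_m:
            kept.append((name, x, y))
    return kept

# Hypothetical coordinates, for illustration only.
terminals = [("Terminal A", 0.0, 0.0)]
mines = [("Mine 1", 1000.0, 0.0), ("Mine 2", 3000.0, 4000.0)]
remaining = filter_mines(mines, terminals)
print(remaining)
```

Here Mine 1 lies 1,000 m from the terminal and is dropped, while Mine 2 lies 5,000 m away and is kept.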
Step 2.  Calculate routes based on closest facilities
 
After enabling Network Analysis, Esri street data was imported.  Then, using the Closest Facility tool, we calculated the shortest route from each sand mine to its closest rail terminal.  The mines were loaded as incidents and the terminals were loaded as facilities.  The resulting routes can be seen in Figure 4.
Figure 4.  Map of routes from sand mines to rail terminals derived using network analysis Closest Facility tool
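To show the idea behind a closest-facility search, here is a small sketch that runs Dijkstra's shortest-path algorithm from one incident (a mine) and stops at the first facility (a terminal) it reaches.  This is only an illustration of the concept, not how the Esri tool is implemented, and the road graph, node names, and edge lengths are all hypothetical.

```python
import heapq

def closest_facility(graph, incident, facilities):
    """Return (facility, cost, path) for the facility nearest to incident.

    graph maps each node to a list of (neighbor, length) pairs.
    Returns None if no facility can be reached.
    """
    dist = {incident: 0.0}
    prev = {}
    heap = [(0.0, incident)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node in facilities:
            # The first facility popped from the heap is the closest one.
            path = [node]
            while path[-1] != incident:
                path.append(prev[path[-1]])
            return node, d, path[::-1]
        for neighbor, length in graph.get(node, []):
            nd = d + length
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                prev[neighbor] = node
                heapq.heappush(heap, (nd, neighbor))
    return None

# Hypothetical road network with lengths in miles.
graph = {
    "mine": [("a", 2.0), ("b", 5.0)],
    "a": [("mine", 2.0), ("b", 1.0), ("terminal_1", 6.0)],
    "b": [("mine", 5.0), ("a", 1.0), ("terminal_2", 2.0)],
    "terminal_1": [("a", 6.0)],
    "terminal_2": [("b", 2.0)],
}
result = closest_facility(graph, "mine", {"terminal_1", "terminal_2"})
print(result)
```

In this toy network the route mine, a, b, terminal_2 (5 miles) beats the route to terminal_1 (8 miles), so terminal_2 is returned as the closest facility.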
 


Step 3. Generate hypothetical estimates of cost per county
 
Our professor gave us the following theoretical information to use to calculate the annual cost per county that would result from hauling sand on the roads.
  • Each sand mine sends 50 truck trips per year to the rail terminal, and the truck has to return to the sand mine after each trip
  • The hypothetical cost per truck mile is 2.2 cents
ModelBuilder was used to organize the tools needed to perform the spatial and data analysis for this calculation.  The model can be seen in Figure 5.  The process is described in the numbered steps below:
 
1.  Used Closest Facility tool in Network Analysis to solve for routes.  Mines were loaded as incidents and rail terminals were loaded as facilities
 
2.  The routes were selected and copied into a geodatabase.
 
3.  The routes were projected to NAD 1983 Wisconsin TM (Meters) so that distance could be calculated
 
4.  The Intersect tool was run on the routes and the Wisconsin counties, which were in the same projection, so that each route was split at the county boundaries
 
5.  A field was added to calculate the distance of the routes in miles
 
6.  Summary statistics were used to get the total length of the routes in each county
 
7.  A new field was added and calculated to account for the number of truck trips annually
 
8.  A new field was added and calculated to account for the cost of these truck trips
Figure 5.  Model used to organize tools and estimate the hypothetical annual cost per county
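The arithmetic in steps 5 through 8 can be sketched as follows.  This is a hedged illustration rather than the model itself: it assumes the per-county route lengths are available in meters (from the projected routes), and it reads the professor's figures as 50 round trips per mine per year, so each route mile is driven 100 times.  The function name and the county name are hypothetical.

```python
METERS_PER_MILE = 1609.344
TRIPS_PER_YEAR = 50          # from the lab assumptions
COST_PER_TRUCK_MILE = 0.022  # dollars (2.2 cents), from the lab assumptions

def annual_cost_per_county(route_meters_by_county):
    """Convert total route length per county (meters) to annual cost (dollars)."""
    costs = {}
    for county, meters in route_meters_by_county.items():
        miles = meters / METERS_PER_MILE
        # Each trip is driven out and back, so the route is covered twice.
        truck_miles = miles * TRIPS_PER_YEAR * 2
        costs[county] = truck_miles * COST_PER_TRUCK_MILE
    return costs

# Hypothetical county with 10 miles of route length.
example = annual_cost_per_county({"County X": 10 * METERS_PER_MILE})
print(round(example["County X"], 2))
```

For a county with 10 route miles, this gives 10 × 100 × $0.022 = $22 per year.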

Results

The hypothetical annual road maintenance costs associated with sand mining varied greatly between counties.  Some counties with sand mines, like Buffalo and La Crosse, have less than 50 dollars in associated road costs, while other counties, like Barron and Chippewa, have over 400 dollars in associated road costs.  The differences between counties can be seen in the map in Figure 6 and the graph in Figure 7.
Figure 6.  Map showing hypothetical annual costs of road maintenance due to hauling sand in various Wisconsin Counties
Figure 7.  Graph showing hypothetical annual costs of road maintenance due to hauling sand in various Wisconsin Counties
 
 

Discussion

 
Network analysis is incredibly useful and can be used to make smart and efficient choices in transportation.  It can also help determine the impact that certain businesses might have on local roadways, and can be used to make sure that they are held accountable for it.
Model building was also useful in this activity because it helped organize the tools used, and the model could be run again even if certain parameters or data were altered.  For instance, if we were to receive an updated list of mine locations, it would not be too difficult to adjust the model and run it again.  It is a great time saver when performing a series of operations that may need to be repeated with new and updated information.
 
Conclusion
 
In this lab we prepared data using queries, calculated routes using network analysis, and built a model to derive hypothetical annual road maintenance costs associated with sand mining.  We went through the steps that could be used when researching the costs associated with sand mining.  Actual analysis of these costs is very important for counties that may need to negotiate road upgrade maintenance agreements (RUMA) with sand mining companies.
 
Sources
 
Data
Mine Locations: Wisconsin DNR
Railroad Terminals: Federal Railroad Administration
Basemaps: Esri
Streets for Network Analysis: Esri
Wisconsin Counties Feature Class: Trempealeau county geodatabase
 
Background Information
Hart, M. V., Adams, T., & Schwartz, A. (2013). Transportation Impacts of Frac Sand Mining in the MAFC Region: , CFIRE.
 
Monday, November 9, 2015

Data Normalization, Geocoding, and Error Assessment in Sand Mining Suitability Project

Goals and Objectives

The purpose of this lab was to geocode the locations of frac sand mines in Wisconsin.  The locations of the sand mines will be important in later parts of this project when we perform network analysis on the frac sand mines.

Geocoding is the process of mapping a spatial location from a description of that location, such as an address or latitude/longitude coordinates.  In this exercise, our professor gave us descriptions of the frac sand mine locations that were originally provided by the Wisconsin DNR.

A table from the Wisconsin DNR was normalized and then used to geocode the locations of frac sand mines.  This offers good experience working with data that is not in the format that we need for our project.  After geocoding, we will compare our sand mine locations with those of other members of the class, and with the accurate locations of the mines.  This will be done as a way to evaluate how the geocoding went and to calculate error.  A workflow for the activity can be seen in Figure 1.

Figure 1.  Workflow for the exercise, provided by Professor Hupy

Methods

Table Normalization
The table with descriptions of the sand mine locations needed to be normalized before geocoding could be done.  The process of normalizing the DNR table included:
  • Removing any commas between words
  • Separating the address into different fields for street name, town name, zip code, etc.
  • Separating the PLSS descriptions from the street address descriptions
Figure 2 shows a selection of addresses that were normalized.  The original table had little consistent organization, so it was necessary to go through every record and adjust the formatting.  Additionally, many of the sand mine locations did not have a street address and had only a PLSS description.
Figure 2.  Example of table normalization
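The normalization in this lab was done by hand, but the same steps could be sketched in Python.  The example below is only a rough illustration: it assumes addresses arrive as comma-separated "street, town, state zip" strings and that PLSS descriptions contain township and range codes such as T24N and R9W.  The sample records, field names, and the `normalize_location` function are hypothetical and are not taken from the DNR table.

```python
import re

# Township/range codes (e.g. T24N, R9W) mark a PLSS description.
TOWNSHIP = re.compile(r"\bT\s*\d+\s*N\b", re.IGNORECASE)
RANGE = re.compile(r"\bR\s*\d+\s*[EW]\b", re.IGNORECASE)
ZIP_CODE = re.compile(r"\b\d{5}\b")

def normalize_location(raw):
    """Split one raw location string into separate, comma-free fields."""
    record = {"street": "", "city": "", "state": "", "zip": "", "plss": ""}
    if TOWNSHIP.search(raw) and RANGE.search(raw):
        # Keep PLSS descriptions apart from street addresses.
        record["plss"] = " ".join(raw.replace(",", " ").split())
        return record
    parts = [p.strip() for p in raw.split(",") if p.strip()]
    if len(parts) >= 3:
        record["street"] = parts[0]
        record["city"] = parts[1]
        tail = " ".join(parts[2:])
        match = ZIP_CODE.search(tail)
        if match:
            record["zip"] = match.group()
            tail = tail.replace(match.group(), "")
        record["state"] = tail.strip()
    else:
        # Not enough pieces to split reliably; keep the text for manual review.
        record["street"] = " ".join(parts)
    return record

# Hypothetical records, for illustration only.
street_record = normalize_location("W1234 County Road A, Exampleville, WI 54000")
plss_record = normalize_location("NE 1/4 of SW 1/4, Sec 12, T24N, R9W")
print(street_record)
print(plss_record)
```

Records that do not fit either pattern are left mostly intact so that they can still be checked manually, which matches how messy the original table was.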
Geocoding the Mines
The first step was to log in to ArcGIS Online using the UWEC enterprise login.  Geocoding requires credits and will not run unless the user is logged in to a valid account.  After selecting the normalized table, the mines were geocoded using the World Geocode Service (ArcGIS Online).

Then, using the View Address Inspector, I checked that the matched locations were in the correct place.  Of the 18 sand mine locations I was geocoding, 13 were correctly matched automatically and 5 were tied (see Figure 3).  For many of them, it was simply a matter of moving the point so that it was near the driveway.
Figure 3. Screenshot taken after the automated geocoding process
The locations with only PLSS descriptions ended up being placed in the center of the town.  These locations had to be found without the help of the automated geocoding.

To find them, I used the PLSS quarter-quarter sections from the Wisconsin DNR 2014 geodatabase.  Query statements on these sections could be used to locate where the descriptions fell on the map.  For some of them, it was helpful to look at the same locations in Google Maps in order to locate the mines.  After I had checked the positions of all of my mines and adjusted the ones that were off, I exported them as a shapefile to be shared with the rest of the class.

Then I made a new geodatabase, and uploaded the shapefiles from my classmates, as well as the shapefile from the DNR showing the actual locations of all of the sand mines.  Before I began, I projected all of my data in NAD_1983_Wisconsin_TM_US_Ft.  I chose this because there were mines across all of Wisconsin, and this was the best projection for the given study area.  It was essential to project the data, because I would be calculating distances in the next steps.

After looking through my classmates' shapefiles, I deleted the ones without a Mine ID field, because without the mine numbers they could not be properly compared.  The mine IDs were sometimes recorded under different headings, so I created a new field titled ID in each of my classmates' tables, so that when I merged them, all of the mine IDs would be in the same field and would be easy to query.  I first merged the shapefiles of my classmates' mines together, and then I queried to find the mine IDs that I had also geocoded.

Using a query, I selected from my classmates' data the same mines that I had geocoded, and used the Generate Near Table tool to find how closely we had placed the mines to each other.  A near table was also used to compare the distance between my points and the DNR-verified sand mine locations, and to compare our entire class's estimates to the DNR locations.
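The comparison can be illustrated with a short Python sketch.  It is a simplification of what the Generate Near Table tool does: instead of searching for the nearest feature, it pairs points that share a mine ID and measures the straight-line distance between them.  It assumes projected coordinates in feet, and the point data and function names are hypothetical.

```python
import math

def distances_by_id(my_points, reference_points):
    """Distance between points that share a mine ID.

    Both inputs map mine_id -> (x, y) in a projected coordinate system
    (feet are assumed here). IDs missing from either input are skipped.
    """
    shared = sorted(set(my_points) & set(reference_points))
    return {
        mine_id: math.dist(my_points[mine_id], reference_points[mine_id])
        for mine_id in shared
    }

def average_error(distances):
    """Mean of the per-mine distances."""
    return sum(distances.values()) / len(distances)

# Hypothetical coordinates, for illustration only.
mine = {1: (0.0, 0.0), 2: (100.0, 100.0)}
reference = {1: (30.0, 40.0), 2: (100.0, 100.0), 3: (5.0, 5.0)}
errors = distances_by_id(mine, reference)
print(errors, average_error(errors))
```

Mine 3 has no match in the first set, so it is left out of the comparison, just as shapefiles without a usable mine ID were left out above.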

Results

For the most part, the geocoding done by my classmates and me had high amounts of error.  The average error (taken as the average distance in the near table) between my geocoded mines and my classmates' was 36,874 feet.  The average error between my geocoded mines and the DNR mine locations was 33,889 feet, and the average error between the entire class's geocoded mines and the DNR locations was 14,215 feet.

Figure 4. Comparison of my geocoded mine locations with my classmates'.  Example of estimates that were close together shown top right, example of locations that were far apart shown bottom right

Figure 5.  Comparison of our class's geocoded sand mine locations with the actual locations provided by the DNR.

Discussion

The following table (Figure 6) is from a class textbook and systematically lays out the different types of error possible when generating geographic data.  I believe that inherent errors were the main reason we had such large error values in our geocoding.  There were limitations in how up to date the available aerial photography was (using basemaps from either Google or Esri), and this prevented us from accurately locating many of the sand mines.  Also, the locations that were originally described using PLSS descriptions proved challenging to find and were often hard to place correctly.
Figure 6.  Table with classifications of GIS errors.

An example of a more operational error that may have occurred is that a few incorrectly geolocated mines are lowering the average accuracy.  When looking at the statistics of the near tables for the comparisons between myself and my classmates, myself and the DNR, and the entire class and the DNR, I noticed that most of the distances were actually relatively small, and that a few much larger distances were likely the reason that our average error was so high.
Figure 7.  Generate Near tables shown with frequency distribution tables to their right.  
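The effect of a few large distances on the average can be shown with a small example.  The distances below are hypothetical and are not taken from the near tables in this lab; they only illustrate how a single badly placed mine can pull the mean far above the typical error, while the median stays close to it.

```python
import statistics

# Hypothetical near distances in feet: four good matches and one outlier.
distances_ft = [120.0, 250.0, 300.0, 410.0, 95000.0]

mean_error = statistics.mean(distances_ft)
median_error = statistics.median(distances_ft)

print(mean_error, median_error)
```

Reporting the median alongside the mean is one simple way to tell whether the error comes from most points or from only a few.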

Conclusion

Locations of Sand mines in Wisconsin and Trempealeau County
As a result of this lab, we now have a sense of where the frac sand mines are located in Wisconsin.  Although we now have that spatial data from the DNR, it is important to know how to derive that information in case it is not available.  Geocoding can be a very difficult process, particularly when there are problems with the initial data sets being used.  However, it can provide extremely useful information when done correctly.  It is always good to have ways to evaluate the errors that exist in your data and to determine what the sources of those errors are.