Goals and Objectives
The purpose of this lab was to geocode the locations of frac
sand mines in Wisconsin. The locations
of the sand mines will be important in further parts of this project when we
perform network analysis on the frac sand mines.
Geocoding is the process of assigning a spatial location to a feature based on a description of that location, such as a street address or latitude/longitude coordinates. In this exercise, our professor gave us descriptions of the frac sand mine locations that were originally provided by the Wisconsin DNR.
A table from the Wisconsin DNR was normalized and then used to geocode the locations of the frac sand mines. This offers good experience working with data that is not in the format we need for our project. After geocoding, we will compare our mine locations with those produced by other members of the class, and with the actual locations of the mines. This comparison is a way to evaluate how well the geocoding went and to calculate error. A workflow for the activity can be seen in Figure 1.
Figure 1. Workflow for the exercise, provided by Professor Hupy
Methods
Table Normalization
The table with descriptions of the sand mine locations needed to be normalized before geocoding could be done. The process of normalizing the DNR table included:
- Removing any commas between words
- Separating Address into different fields for street name, town name, zip code, etc.
- Separating the PLSS descriptions from the street address descriptions
Figure 2. Example of table normalization
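The normalization steps listed above were done by hand in the table, but they can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the address, the field names (Street, City, State, Zip), and the PLSS pattern are hypothetical and are not taken from the DNR table.

```python
import re

# Hypothetical PLSS pattern such as "T22N R8W": a township followed by a range.
PLSS_PATTERN = re.compile(r"\bT\s*\d+\s*N\b.*\bR\s*\d+\s*[EW]\b", re.IGNORECASE)


def is_plss(description):
    """Return True when the text looks like a PLSS description."""
    return bool(PLSS_PATTERN.search(description))


def normalize_address(raw):
    """Split a 'street, city, state zip' string into separate fields."""
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 'street, city, state zip', got: {raw!r}")
    street, city, state_zip = parts
    match = re.fullmatch(r"([A-Za-z]{2})\s+(\d{5})", state_zip)
    if match is None:
        raise ValueError(f"could not read state and zip from: {state_zip!r}")
    return {
        "Street": street,
        "City": city,
        "State": match.group(1).upper(),
        "Zip": match.group(2),
    }


# Made-up example values, not real mine records.
row = normalize_address("N1234 County Road X, Exampleville, WI 54000")
plss_only = is_plss("NW 1/4 of SE 1/4 S12 T22N R8W")
street_only = is_plss("N1234 County Road X")
```

Splitting on the commas also removes them, which covers the first step in the list, and the PLSS check gives a simple way to route those records away from the address geocoder.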
Geocoding the Mines
The first step was to log in to ArcGIS Online using the UWEC enterprise login. Geocoding requires credits, and it will not run unless the user is logged in to an account that has them. After selecting the normalized table, I geocoded the mines using the World Geocode Service (ArcGIS Online).
Then, using the View Address Inspector, I checked that the matched mines were in the correct place. 13 of the 18 sand mine locations I was geocoding were matched automatically, and 5 of them were tied (see Figure 3). For many of them, it was simply a matter of moving the points so that they were near the driveway.
Figure 3. Screenshot taken after the automated geocoding process
Locations with only PLSS descriptions ended up being placed in the center of the town, so they had to be found without the help of the automated geocoding.
To find them, I used the PLSS quarter-quarter sections from the Wisconsin DNR 2014 geodatabase. With these, query statements could be used to locate the sections on the map. For some of them, it was helpful to look at the same locations in Google Maps in order to locate the mines. After I had checked the positions of all of my mines and adjusted the ones that were off, I exported the result as a shapefile to be shared with the rest of the class.
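A query statement for a PLSS section might be built like the sketch below. The field names TWP, RNG, DIR, and SEC are hypothetical placeholders; the actual field names in the DNR geodatabase may differ and would need to be checked in the attribute table.

```python
def plss_where_clause(township, range_number, range_direction, section):
    """Build a SQL-style where clause for one PLSS section.

    TWP, RNG, DIR, and SEC are assumed field names, not the DNR's.
    """
    direction = range_direction.upper()
    if direction not in ("E", "W"):
        raise ValueError("range_direction must be 'E' or 'W'")
    return (
        f"TWP = {int(township)} AND RNG = {int(range_number)} "
        f"AND DIR = '{direction}' AND SEC = {int(section)}"
    )


# Made-up example: Township 22 North, Range 8 West, Section 12.
clause = plss_where_clause(22, 8, "w", 12)
```

A clause like this could be pasted into a Select By Attributes dialog to zoom to the section, after which the quarter-quarter can be picked out visually.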
Then I made a new geodatabase and imported the shapefiles from my classmates, as well as the shapefile from the DNR showing the actual locations of all of the sand mines. Before I began, I projected all of my data to NAD_1983_Wisconsin_TM_US_Ft. I chose this because there were mines across all of Wisconsin, and a statewide projection was the best fit for the study area. It was essential to project the data, because I would be calculating distances in the next steps.
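To see why distances should not be measured in unprojected latitude/longitude, the short sketch below estimates how many feet one degree of longitude covers near the southern and northern edges of Wisconsin. It assumes a spherical Earth and approximate latitudes of 42.5 and 47 degrees north, so the numbers are rough; it is an illustration, not part of the lab workflow.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius, spherical approximation
METERS_PER_FOOT = 0.3048    # international foot (US survey foot is nearly identical)


def feet_per_degree_longitude(latitude_deg):
    """Approximate ground length of one degree of longitude at a latitude."""
    meters = (
        math.radians(1.0)
        * EARTH_RADIUS_M
        * math.cos(math.radians(latitude_deg))
    )
    return meters / METERS_PER_FOOT


south = feet_per_degree_longitude(42.5)  # roughly the southern edge of Wisconsin
north = feet_per_degree_longitude(47.0)  # roughly the northern edge of Wisconsin
difference = south - north
print(f"south: {south:,.0f} ft, north: {north:,.0f} ft, difference: {difference:,.0f} ft")
```

Because a degree of longitude shrinks toward the north, a distance computed in degrees means different things in different parts of the state, while a projected coordinate system in feet gives consistent units for the near table.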
After looking through my classmates' shapefiles, I deleted the ones without a mine ID field, because without the mine numbers they could not be properly compared. The mine IDs were sometimes recorded under different headings, so I created a new field titled ID in each of my classmates' tables, so that when I merged them, all of the mine IDs would be in the same field and would be easy to query. I first merged my classmates' shapefiles together, and then I queried the result to find the mine IDs that I had also geocoded.
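The logic of adding a common ID field, merging, and then querying can be sketched in plain Python with lists of dictionaries standing in for attribute tables. The heading names (Mine_ID, MineID, Unique_ID) and all of the values are made up for illustration; in the lab this was done with field calculations and the merge tool in ArcGIS.

```python
# Hypothetical headings under which classmates might have stored the mine ID.
ID_ALIASES = ("Mine_ID", "MineID", "Unique_ID")


def merge_with_common_id(tables, aliases=ID_ALIASES):
    """Merge tables, copying each row's mine ID into a shared 'ID' field.

    Rows with no recognizable ID heading are skipped, since they
    cannot be compared with anyone else's mines.
    """
    merged = []
    for table in tables:
        for row in table:
            alias = next((name for name in aliases if name in row), None)
            if alias is None:
                continue
            merged.append({**row, "ID": row[alias]})
    return merged


# Made-up sample tables, one per classmate.
classmate_a = [{"Mine_ID": 101, "x": 10.0, "y": 20.0},
               {"Mine_ID": 102, "x": 30.0, "y": 40.0}]
classmate_b = [{"MineID": 101, "x": 12.0, "y": 19.0}]
classmate_c = [{"x": 50.0, "y": 60.0}]  # no ID field, so it is dropped

merged = merge_with_common_id([classmate_a, classmate_b, classmate_c])

# Query: keep only the mines that I also geocoded.
my_ids = {101}
shared = [row for row in merged if row["ID"] in my_ids]
```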
Using a query, I selected the mines from my classmates' data that I had also geocoded, and used the Generate Near Table tool to find how closely we had placed the mines to each other. A near table was also used to compare the distance between my points and the DNR-verified sand mine locations, and to compare our entire class's estimates to the DNR locations.
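Conceptually, a near table records, for each input point, the closest point in another layer and the distance to it. The sketch below is a simplified stand-in for that idea, not the ArcGIS tool itself: it assumes points are already in a projected coordinate system measured in feet, and the coordinates and output field names (IN_ID, NEAR_ID, NEAR_DIST) are made up for the example.

```python
import math


def near_table(input_points, near_points):
    """For each input point, find the closest near point and its distance.

    Both arguments map an ID to an (x, y) tuple in projected feet.
    """
    rows = []
    for in_id, (x, y) in input_points.items():
        near_id, distance = min(
            ((nid, math.hypot(x - nx, y - ny)) for nid, (nx, ny) in near_points.items()),
            key=lambda pair: pair[1],
        )
        rows.append({"IN_ID": in_id, "NEAR_ID": near_id, "NEAR_DIST": distance})
    return rows


# Made-up coordinates in feet.
my_mines = {"mine_1": (0.0, 0.0), "mine_2": (1000.0, 1000.0)}
dnr_mines = {"dnr_a": (300.0, 400.0), "dnr_b": (1000.0, 1100.0)}

rows = near_table(my_mines, dnr_mines)
average_error = sum(row["NEAR_DIST"] for row in rows) / len(rows)
```

Averaging the distance column, as in the last line, is how an average error figure can be derived from a near table.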
Results
For the most part, the geocoding done by my classmates and me had large amounts of error. The average error (taken as the average distance from the near table) between my geocoded mines and my classmates' was 36,874 feet. The average error between my geocoded mines and the DNR mine locations was 33,889 feet, and the average error between the entire class's geocoded mines and the DNR locations was 14,215 feet.
Figure 4. Comparison of my geocoded mine locations with my classmates'. Example of estimates that were close together shown top right, example of locations that were far apart shown bottom right
Figure 5. Comparison of our class's geocoded sand mine locations with the actual locations provided by the DNR.
Discussion
The following table (Figure 6) is from a class textbook that systematically lays out the different types of error possible when generating geographic data. I believe that inherent errors were the main reason that we had such large error values in our geocoding. There were limitations on how up to date the available aerial photography was (using basemaps from either Google or ESRI), and this prevented us from accurately locating many of the sand mines. Also, the mines that were originally described using PLSS descriptions proved challenging to find and were often hard to place correctly.
Figure 6. Table with classifications of GIS errors.
An example of a more operational error that may have occurred is that a few incorrectly geolocated mines lowered the average accuracy. When looking at the statistics of the near tables for the comparisons between myself and my classmates, myself and the DNR, and the entire class and the DNR, I noticed that most of the distances were actually relatively small, and that a few of the larger distances were likely the reason that our average error was so high.
Figure 7. Generate Near tables shown with frequency distribution tables to their right.
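The effect of a few large distances on the average can be illustrated by comparing the mean with the median. The distances below are made-up values chosen only to show the pattern; they are not the values from my near tables.

```python
import statistics

# Hypothetical near distances in feet: four small errors and one badly placed mine.
distances_ft = [500.0, 800.0, 1200.0, 900.0, 150000.0]

mean_error = statistics.mean(distances_ft)
median_error = statistics.median(distances_ft)

# Dropping the single largest distance shows how much it drives the mean.
without_largest = sorted(distances_ft)[:-1]
mean_without_largest = statistics.mean(without_largest)

print(f"mean: {mean_error:,.0f} ft")
print(f"median: {median_error:,.0f} ft")
print(f"mean without the largest distance: {mean_without_largest:,.0f} ft")
```

When the mean is far above the median, as it is with these sample values, a small number of outliers is dominating the average, which matches what the frequency distributions suggested.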
Conclusion
Locations of sand mines in Wisconsin and Trempealeau County
As a result of this lab, we now have a sense of where the frac sand mines are located in Wisconsin. Although we now have that spatial data from the DNR, it is important to know how to derive that information in case it is not available. Geocoding can be a very difficult process, particularly when there are problems with the initial data sets being used. However, it can provide extremely useful information when done correctly. It is always good to have ways to evaluate the errors that exist in your data, and to determine what the sources of those errors are.