Objective:
This post takes a step away from Python coding and returns
to frac sand mining in Wisconsin. The main purpose of the assignment is to
geocode the locations of all sand mines in Wisconsin using ESRI products
such as ArcMap and ArcGIS Online. Once the mine addresses have been geocoded,
the results will be compared to the actual locations of the
mines and to the results of other students doing the same lab.
Methods:
The first step of the project was to normalize a DNR-supplied
dataset of the locations of all mines in Wisconsin. Each student in the
class was required to normalize and geocode 16 mines, so that each
individual mine would be geocoded by at least two people in the class.
This gives every student a set of results to compare against.
The DNR dataset is in poor shape, as its creators did not
follow many conventions of data normalization; the worst offence is
multiple attributes stored in the same field. New fields were added and the
data was sorted out manually, without the aid of any tools. Had the dataset
contained more than 16 mines, an automated process would have been more time
efficient.
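For a larger table, the manual sort described above could be scripted. The following is only a minimal Python sketch of that idea, not the method used in this lab. It assumes the combined field holds a street address followed by a PLSS description written like "Sec 12 T25N R10W"; that layout, the function name split_address_field, and the sample strings are all hypothetical, and the real DNR table may well be formatted differently.

```python
import re

# Hypothetical pattern for a PLSS description such as "Sec 12 T25N R10W"
# or "S12 T25N R10W". Assumes townships are written with "N" and ranges
# with "E" or "W"; the real DNR table may format these differently.
PLSS_PATTERN = re.compile(
    r"\b(?:SEC(?:TION)?\.?|S)\s*(\d{1,2})[,\s]+"
    r"T\s*(\d{1,2})\s*N[,\s]+"
    r"R\s*(\d{1,2})\s*([EW])\b",
    re.IGNORECASE,
)


def split_address_field(raw):
    """Split a combined field into (street_address, plss).

    If no PLSS description is found, the whole string is returned as the
    street address and the PLSS part is an empty string.
    """
    match = PLSS_PATTERN.search(raw)
    if match is None:
        return raw.strip(" ,;"), ""
    section, township, rng, direction = match.groups()
    # Rebuild the PLSS text in one consistent format for the new column.
    plss = f"S{int(section)} T{int(township)}N R{int(rng)}{direction.upper()}"
    # Whatever is left after removing the PLSS text is treated as the address.
    street = (raw[:match.start()] + raw[match.end():]).strip(" ,;")
    return street, plss
```

Each row of the spreadsheet would be passed through the function once, with the two returned values written to the address column and the new PLSS column. Rows that do not match the pattern come back unchanged, so they can still be checked by hand.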
Figure one. Possible address locations after geocoding function.
Figure two. Model used to query and merge class data into one dataset.
After the dataset was edited enough to be compatible with
ESRI ArcMap, it was geocoded via its addresses. As this data set had all but one
address present, it was almost a simple process to find the correct locations for
the points. When an address is geocoded, the GIS software attempts to match it
with the most correct location that it can. Unfortunately these are estimations,
and for many addresses there are several candidate points, each with a different
probability. Figure one displays where different address points could exist for
one such address. The central point is the most true to reality, so it was the
obvious choice for the address location, even if the calculated probability for
the other points is higher.
There were several address locations which did not seem to correlate with the location of any mines. It was a tedious, yet achievable, task to find the true locations of the mines with the use of Public Land Survey System (PLSS) data and aerial photography. Google Maps was heavily utilized as well.
Once the points were geocoded it was necessary to compare
individual results with the rest of the class. All student results were stored
in a shared class folder for all to access. In order to compare results it was
necessary to create a new feature class from each student's results with the
use of a SQL statement. All of these output files were then combined with the
Merge tool (see figure two for the model). This merged feature class
was then used in conjunction with the individual geocoding results in ArcMap's
Generate Near Table tool. A 1 kilometer search radius was set for each input mine
location and the planar distance method was used.
Another aspect of accuracy assessment used for this lab was
the comparison of individual geocoding results to the actual
geographic coordinates of the mines. A feature class was created using the
newly supplied coordinates for mine locations. This feature class was used in
the Generate Near Table tool in the same fashion (see tables three and four in
results).
Results:
Map one. Individually geocoded mine locations.
Table one. Source data for geocoding project. Note how the address field is populated with multiple attributes.
Table two. Source data after some normalization. Note the additional PLSS field.
Standardizing the source Excel data table was a simple yet tedious
exercise of sorting out all the information in the address column. As the data
was compiled in far too few columns (see table one), a new column was required
to rectify the issue and prepare the points for mapping within ESRI ArcMap.
After checking and fine-tuning the mine address points, several of their
locations could be matched to mines that are
visible in aerial photography. Others were not so easy to locate, as their
locations could not be determined from aerial photographs. As mentioned earlier,
imagery from Google Maps was used extensively to search out the mines'
locations.
Table three. Distance comparison between individual and class geocoding results. Distance in meters.
Figure three. Statistics and distribution for distance measurements between individual and class geocoding results.
Table four. Distance comparison between individual geocoding results and geographic coordinate locations of mines.
Figure four. Statistics and distribution for distances between individual geocoding results and geographic coordinate locations of mines.
Discussions:
From the results section it is easy to see that there are
many differences between individual, class, and actual mine location results. While
the majority of the individual points were within 200 meters of both the class points
and the actual geographic coordinates for each mine, there were several
class point outliers 4-8 kilometers from the individual
point. The individual points and coordinate points were most often closely
associated. The minor offsets seen in almost all of the points stem from
how the points were mapped: the coordinate points were at the
approximate center of the mines, while the individual points were plotted at
the entrance of the mine.
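A pattern like the one above, with most distances small and a few very large, is easier to see when the near-table distances are summarized with both a mean and a median, since a handful of outliers pull the mean up far more than the median. The sketch below shows one way that summary could be computed in Python. The function name summarize_distances and the 1000 meter outlier cutoff are assumptions made for illustration, not values taken from this lab.

```python
import statistics


def summarize_distances(distances_m, outlier_threshold_m=1000.0):
    """Summarize a list of distances (in meters) and list the outliers.

    The outlier threshold is an arbitrary illustrative cutoff, not a
    value defined by the lab.
    """
    return {
        "count": len(distances_m),
        "min": min(distances_m),
        "max": max(distances_m),
        "mean": statistics.mean(distances_m),
        "median": statistics.median(distances_m),
        # Distances above the cutoff are reported separately.
        "outliers": [d for d in distances_m if d > outlier_threshold_m],
    }
```

The input would be the distance column of a near table, one list for the individual-to-class comparison and another for the individual-to-coordinate comparison, so the two summaries can be read side by side.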
Generally, the automated plotting of the points by the
geocoding process was not as accurate as one would like; this is simply an
inherent issue with how the geocoding program estimates the
location of an address. Several other issues had to do with the quality of
the source data. Some points were initially placed far from their actual locations,
meaning they had to be manually relocated. This format translation issue was more
than likely caused by the poor normalization of the source data
used.
After the addresses were geocoded and relocated there were
still several very minor issues regarding the projections of
several feature classes. These paled in comparison to the most difficult
aspect to overcome: the temporal accuracy of the reference images. There were two
mines in particular that could not be immediately located. The first was a relatively
new mine, so ESRI's ageing aerial photographs of the area did not show it. The other
mine could not be identified even with aerial photography from
both ESRI and Google and after obtaining the coordinate location of the
mine. This is probably because the registration for the mine was recently added
to the DNR database but no mining has yet occurred on location. The point designating
the address for this mine is more than likely less correct than the other
points in the dataset.
Honestly, the only real way to know if the address location
points are correct is to test them by visiting every mine in
person. This is not a cost-effective technique, so a good alternative is
to use georectified aerial photography to associate the points with the mines. In
the above exercise this was a paramount technique for ensuring the accuracy of the
mine locations.
Conclusions:
After geocoding 16 mine addresses and comparing them with their actual
and independently geocoded counterparts, it was moderately surprising how much
variation there was among the chosen locations. When individual
results were compared to class results, most points were quite
close to one another. There were a few outliers with large distances when comparing individual and actual
locations. The pattern here is that the majority of points were off by a
larger amount than in the class comparison, but the few outliers were closer
to the mean distance than in the other comparison.