Adventures in Geocoding for Neatline

Posted on December 8, 2014 by

As part of my ongoing project, Documenting Teresa Carreño, I am plotting a sample of her concert appearances between 1862 and 1865 in a Neatline Exhibit. To do this efficiently, rather than adding a geocoded address to each Item Record by hand, I explored using the Ruby programming language. The Scholars’ Lab has two very useful tutorials on geocoding for Neatline with Ruby, which I followed and modified to meet my needs and to work with Omeka 2.1.4, the version I am using for my project.[1] Since I had 60 Item Records ready for import into Omeka, I wanted to geocode the address of each concert appearance and import the records as a batch using the CSV Plugin. It is also possible to create Item Records with only the location (coverage) information and a few other Dublin Core elements, and to import them directly into Omeka using Ruby with the Mechanize gem.

Neatline 2.0 requires the use of Well-Known Text (WKT) to represent geometric objects on a map, which is why the Ruby geocoding script is so useful: it produces the WKT for you. The values in the WKT are not the raw latitude and longitude but their projection into Web Mercator meters, with the x (longitude) value listed first. For example, the lat/lon coordinates 40.732017, -73.982069 become: POINT(-8235646.247766419, 4972894.226918806).
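The conversion behind the example above can be sketched in Ruby. This is a minimal sketch and not the Scholars’ Lab script: it assumes the projected values are Web Mercator (EPSG:3857) meters, which is what the example numbers work out to, and the names EARTH_RADIUS_M, degrees_to_meters, and to_wkt_point are hypothetical. It writes the point with a space between the two values, as standard WKT does.

```ruby
# Radius of the sphere used by Web Mercator (EPSG:3857), in meters.
# Assumption: this is the projection behind the example values in the post.
EARTH_RADIUS_M = 6378137.0

# Hypothetical helper: projects latitude/longitude in degrees
# to Web Mercator x/y in meters.
def degrees_to_meters(lat, lon)
  x = EARTH_RADIUS_M * lon * Math::PI / 180.0
  y = EARTH_RADIUS_M * Math.log(Math.tan(Math::PI / 4.0 + lat * Math::PI / 360.0))
  [x, y]
end

# Hypothetical helper: formats the projected point as WKT.
# Note the order: x (from longitude) first, then y (from latitude).
def to_wkt_point(lat, lon)
  x, y = degrees_to_meters(lat, lon)
  "POINT(#{x} #{y})"
end

puts to_wkt_point(40.732017, -73.982069)
```

Running this with the coordinates from the example should give values that agree with the ones quoted above to within rounding.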

In this post, I describe the steps and tips I discovered while attempting this myself, in the hope that they will be useful to others. I tested importing the coverage data using Ruby with the Mechanize gem. However, it seems easier to move the coverage data created when you geocode the addresses into a spreadsheet that already contains all of the Item Records and their Dublin Core elements, so that you don’t have to add metadata to each record manually. All of the elements can then be mapped into Omeka using the CSV Plugin and eventually pulled into a Neatline Exhibit.

 

How to Geocode Addresses for use in Neatline 

First, I opened a spreadsheet and created three column headers that correspond to my address data. Since I am mapping Carreño’s concert appearances, I used the following column headers:

Address | City | State

Under each column header, I entered the related data for each of the 60 appearances.

[Image: LocationsCSV]

After I added all 60 appearances in the spreadsheet, I saved it as a CSV file.

The next step is to use Ruby to geocode the addresses. If you don’t know whether you have Ruby installed on your machine, check by opening your terminal and typing: ruby -v

If Ruby is installed, the terminal will print the version running on your Mac.[2]

If you do not have it installed, go to the Ruby installation site. I have Ruby installed at the system root, so I ran the commands below in sudo mode. To use sudo mode, type: sudo bash

You’ll be prompted for your password, and then you’ll be in sudo mode. RubyGems ships with Ruby 1.9 and later, so once Ruby is available you can confirm that the gem command works with: gem -v

Once RubyGems is available, you’ll need to set up your project space. In your command line, enter:

mkdir -p ~/projects/geocode

gem install geocoder

touch ~/projects/geocode/geocoder.rb

cd ~/projects/geocode

 

Find the newly created projects folder on your computer and move the CSV file you created earlier into the geocode folder. You should also see a geocoder.rb file in this folder.

Open the geocoder.rb file in a text editor, such as TextWrangler, and add the following script:[3]

[Image: GeocoderRb]

In the above script, the fields address, city, and state directly match the column headers in the CSV spreadsheet. If you name your column headers differently, you must change the fields wherever they appear in the script. You can make the changes in a text editor, or use vim to edit geocoder.rb directly from your command line.

In line 4: LOCATIONS

  • the file name must refer to the CSV with addresses to be geocoded

 

In line 16: puts "address,city,state,point,lat,lon"

  • if you are using different column headers in your CSV, add them in the order they appear, from left to right on your spreadsheet.
  • Example: puts "street,city,state,point,lat,lon"

 

In line 21: address_string = "#{line[:address]}, #{line[:city]}, #{line[:state]}"

  • make sure the column headers in your CSV are represented.
  • Example: "#{line[:street]}, #{line[:city]}, #{line[:state]}"
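To see how the column headers feed the address string, here is a small sketch that can be run without touching the network. It is not the tutorial script: it assumes the CSV is read with symbol headers (as line[:address] suggests), the sample rows are made up, and the names SAMPLE_CSV, ADDRESS_FIELDS, and build_address_string are hypothetical.

```ruby
require 'csv'

# Made-up rows standing in for the real locations CSV.
# The first column is named "street" to mirror the renamed-header example.
SAMPLE_CSV = <<~DATA
  street,city,state
  123 Example Street,New York,NY
  45 Sample Avenue,Boston,MA
DATA

# Hypothetical constant: the header names, in the order they should be
# joined into the string that is sent to the geocoder.
ADDRESS_FIELDS = [:street, :city, :state]

# Builds "street, city, state" from one CSV row.
def build_address_string(row, fields = ADDRESS_FIELDS)
  fields.map { |field| row[field].to_s.strip }.join(', ')
end

# header_converters: :symbol turns the header "street" into :street.
rows = CSV.parse(SAMPLE_CSV, headers: true, header_converters: :symbol)
rows.each { |row| puts build_address_string(row) }
```

If a header in ADDRESS_FIELDS does not exist in the CSV, that part of the string comes out empty, which is a quick way to spot a mismatch before running the real geocoding script.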

 

After you’ve checked the script and made sure that your CSV file name and column headers are correct, go ahead and run the script. In the command line, enter:

ruby geocoder.rb > geocoded.csv

 

Once you run the script, you will have a new file in your projects folder called geocoded.csv. Open the CSV file and you will see three additional column headers: point, lat, lon, which will be populated with the WKT and coordinates.

[Image: GeocodedLocations]

You have now geocoded your addresses. The “point” column contains the WKT, which Neatline will use to plot your locations in a Neatline Exhibit. At this point, depending on your project needs, you can add the WKT column to an existing spreadsheet that contains the Item Record metadata for your Omeka site. Rename the “point” column to “coverage” so that it can be mapped to the correct Dublin Core field when you use the CSV Plugin to upload it to Omeka. When you batch upload, new Item Records will appear in Omeka with the coverage data. Once your Item Records are in Omeka, you will be able to pull them into a Neatline Exhibit.
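The renaming step can also be done with a few lines of Ruby instead of a spreadsheet program. This is only a sketch under stated assumptions: the sample row is made up (it reuses the example coordinates from earlier in the post with a placeholder address), and the names GEOCODED_SAMPLE and rename_header are hypothetical.

```ruby
require 'csv'

# Made-up stand-in for the contents of geocoded.csv.
GEOCODED_SAMPLE = <<~DATA
  address,city,state,point,lat,lon
  123 Example Street,New York,NY,POINT(-8235646.247766419 4972894.226918806),40.732017,-73.982069
DATA

# Hypothetical helper: returns the CSV text with one header renamed
# and every data row left unchanged.
def rename_header(csv_text, from, to)
  table = CSV.parse(csv_text, headers: true)
  CSV.generate do |out|
    out << table.headers.map { |header| header == from ? to : header }
    table.each { |row| out << row.fields }
  end
end

puts rename_header(GEOCODED_SAMPLE, 'point', 'coverage')
```

To work on the real file, the same helper could be given File.read('geocoded.csv') and its result written to a new file for the CSV Plugin.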

If you are only interested in uploading the geocoded addresses into Omeka, you can do so using the CSV file you created here. You can then go back and manually add additional metadata to your Item Records.

[Note: For Omeka tutorials and assistance, see the Omeka Codex.]

 

  1. Scholars’ Lab Tutorial: Geocoding for Neatline – Part I and Part II.
  2. Please note that I’m working on a Mac running OS X 10.9.5.
  3. You can copy the full script from the Scholars’ Lab Geocoding for Neatline: Part I Tutorial.