Geohumanities Tools for Investigating Charcoal and Colliers.

Geohumanities Tools- April 3, 2021

All of these were discussed (or maybe just used) in Ben Carter’s presentation, “Geohumanities tools for examining 19th century charcoal-making communities in eastern Pennsylvania,” delivered at Lafayette College on April 3, 2021. See https://sites.lafayette.edu/lvehc/2021/02/28/past-place-and-presence-in-the-lehigh-valley/

  • Pennsylvania Imagery Navigator- https://maps.psiee.psu.edu/ImageryNavigator/
    • Navigate and visualize aerial, satellite and LiDAR data for the entire state. All downloadable with a right click. Note that historic aerial images are not visualized, but are available for download.
  • Pennsylvania Spatial Data Access- https://www.pasda.psu.edu/
    • Much of the same material as above (plus more) is available here. You can find downloadable data as well as WMS (and other) links for streaming GIS data into your GIS software.
  • QGIS- https://qgis.org/en/site/
    • Full featured FOSS (Free and Open Source Software) GIS (Geographical Information System) software. Relatively user friendly “out-of-the-box” (but note that GIS programs are complicated, so you will likely need some training- see link to my videos below).
    • Can be greatly expanded with plugins (and Python scripts). My favorite plugins include:
      • QuickMapServices- provides access to a wide array of background maps, Google, OpenStreetMap, NASA, LandSat, USGS, etc.
      • Azimuth and distance- for mapping “legal descriptions” of plots of land from deeds
      • Profile Tool- for examining the profile of the landscape (I use this to double check charcoal hearths)
      • MMQGIS- a wide array of tools (some of which are already provided), including Geocoding, which converts an address into a geolocated point (latitude/ longitude)
      • Want to learn how to use QGIS? Try this- http://benjaminpcarter.com/workshops/
        • Note that the videos are organized into workshops. Start at the beginning.
  • FamilySearch.org- https://www.familysearch.org/search/
    • Census and more. Similar to, but more limited than, Ancestry.com. You must sign up for an account, but there is no cost.
  • Qfield- https://qfield.org/
    • Android app designed to take your QGIS maps, layers, etc. into the field. Data is fully editable.
  • LAStools- https://rapidlasso.com/lastools/
    • A simple set of tools for working with LiDAR data (which comes in .las files). Not all components are open source, but there are workarounds. Note that it can also be used within the QGIS environment.
  • Zenodo- https://zenodo.org/
    • Data publishing service. Data gets a DOI (and is therefore permanent), but can also be versioned. Located on the servers at CERN and available for all researchers.
      • Some of my data- https://zenodo.org/record/1255101
  • University of Michigan’s The Encyclopedia of Diderot and D’Alembert- https://quod.lib.umich.edu/d/did/
    • Images and text (most translated into English, but some still in French) of the Encyclopédie, ou dictionnaire raisonné des sciences, des arts et des métiers (English: Encyclopedia, or a Systematic Dictionary of the Sciences, Arts, and Crafts). Lots of great information about “crafts.”
  • Kemper’s book on charcoal-
  • Library of Congress Map Collection- https://www.loc.gov/maps/
  • Mask R-CNN- https://github.com/matterport/Mask_RCNN
    • You’ll need lots of programming skills for this one. Used for Deep Learning. Sharing in case anyone is interested.
  • Transkribus- https://readcoop.eu/transkribus/
    • Software for transcribing handwritten documents. I’ve only used this on a handful of deeds, but it has produced amazing results so far.
  • Is there anything that you would like to share with the group? Share in the Comments.

Why Use Geopackage?

Recently, I rediscovered GeoPackage. It’s a long story, but I tried to use it a while ago and it didn’t work so well. But, when I tried to use QField the other day to collect some data as I have done for years, it had -as you may not be surprised to hear- updated automatically. So, some things had changed – mostly for the better and easily adapted to. One of these was that I wasn’t able to load my DEM. It’s a bit of a monster (c. 20 miles long and a couple of miles wide, with a resolution of around one meter- about 1 GB) and I have always used the TIFF format (or, more precisely, GeoTIFF). But QField was now forcing me to convert to GeoPackage. So I shrugged and did it – easily in QGIS, which is where the maps that I use in QField are built.
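
For anyone who would rather script that conversion than click through QGIS, here is a rough sketch using GDAL’s Python bindings; the file and table names are placeholders, not my actual files:

    from osgeo import gdal

    gdal.UseExceptions()

    # Write the GeoTIFF DEM into a GeoPackage raster table.
    # The table name and tile format can be tuned to taste.
    gdal.Translate(
        "dem.gpkg",   # output GeoPackage (placeholder name)
        "dem.tif",    # input GeoTIFF DEM (placeholder name)
        format="GPKG",
        creationOptions=["RASTER_TABLE=dem", "TILE_FORMAT=PNG"],
    )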

But, this may have changed my life!

Alright, maybe that’s being a bit dramatic. Here’s the thing. I want to share data and publish it openly. My first experience with this was a bit of a bear (see my data in the Journal of Open Archaeology Data). What’s the problem, you ask? First, the standard file format for vectors (point, line, polygon) is the shapefile (do I hear some “boo”s?). Anyone who has used shapefiles knows that, in order to share them, you need to share a minimum of three files (and usually more). If you have ten layers you would like to share with collaborators, that means sharing around 50 files. That’s just ridiculous, and it’s an unnecessary hassle that makes it very difficult to collaborate and to version data.

One solution to that particular problem is GeoJSON. It solves a number of the problems associated with shapefiles (see these links- 1, 2, and 3- for more information on other issues with shapefiles). But this still means that I need to share a separate file for each layer, so for five layers, that’s five files. Not too bad, right? By the way, one of the major benefits of a GeoJSON file is that all of the information is internal to the file. That means, for example, I can publish the file online and stream the data (in my case, using QGIS– if you want to try it, you can use my data on charcoal hearths in PA). So, the data can live in an online repository such as Zenodo or Open Context and I can visualize that same data in a GIS program (I recommend QGIS) along with any other layers that live locally. Because the data is stored in a repository, I can rely upon it being consistent and so can my collaborators.
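
If you want to try streaming a published GeoJSON yourself, a minimal sketch for the QGIS Python console looks something like this (the URL below is a placeholder; substitute the link to the published file):

    from qgis.core import QgsVectorLayer, QgsProject

    # OGR can read GeoJSON straight over HTTP, so the data stays in the repository.
    url = "https://example.org/charcoal_hearths.geojson"  # placeholder URL
    layer = QgsVectorLayer(url, "charcoal_hearths", "ogr")
    if layer.isValid():
        QgsProject.instance().addMapLayer(layer)
    else:
        print("Layer failed to load - check the URL or your connection")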

But that still means that each layer is a separate file, and you cannot use GeoJSON for rasters (that’s not totally true, but it certainly was not designed for them). So, what would work better? How about a single file that holds all of your rasters and your vectors AND their styling? That’s what GeoPackage does. It’s actually a “container” built on an SQLite database, where each layer is a separate table. Rasters are stored as tiled JPEGs and PNGs– JPEGs provide efficient (lossy) compression, while PNGs are used at the edges because they support transparency.
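
Because a GeoPackage is just an SQLite database, you can peek inside one with nothing but Python’s standard library; the gpkg_contents table lists every layer in the file (the file name below is a placeholder):

    import sqlite3

    con = sqlite3.connect("project.gpkg")  # placeholder file name
    for name, kind in con.execute("SELECT table_name, data_type FROM gpkg_contents"):
        print(f"{name}: {kind}")   # e.g., "hearths: features", "hillshade: tiles"
    con.close()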

Imagine this. I complete an archaeological project that involves georeferenced historical data, original LiDAR data (e.g., as LAS files), derivatives from the LiDAR (such as a DEM, hillshade, slope analysis, etc.), points collected in the field, various polygons (in my case, State Game Lands boundaries, the Appalachian Trail boundary, etc.) and lines (historic and modern roads, etc.). I want to archive everything. The way I did this the last time, I archived each file separately. The only link they had was a description (see this) that discussed how each layer was derived and interconnected. But they still live as distinct, if tenuously connected, digital objects. GeoPackage, however, allows me to bundle all of this together- remember, it is a database- into a single package (i.e., file). I can then archive that file and everything REMAINS connected. So much easier for me and for any present or future collaborators, and so much better for digital preservation. If I do another project, I can either archive a new GeoPackage file or, if it is additional research using the same data, version the old one (retaining all versions, of course).
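
As a rough illustration of that bundling (with placeholder file names, not my actual data), GDAL’s Python bindings can append vector layers and a raster into a single GeoPackage like this:

    from osgeo import gdal

    gdal.UseExceptions()

    vectors = ["hearths.geojson", "game_lands.shp", "historic_roads.shp"]

    # The first layer creates the GeoPackage; the rest are added as extra tables.
    gdal.VectorTranslate("project.gpkg", vectors[0], format="GPKG")
    for src in vectors[1:]:
        gdal.VectorTranslate("project.gpkg", src, format="GPKG", accessMode="update")

    # Add a raster derivative (e.g., a hillshade) as its own tile table in the same file.
    gdal.Translate("project.gpkg", "hillshade.tif", format="GPKG",
                   creationOptions=["APPEND_SUBDATASET=YES", "RASTER_TABLE=hillshade"])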

Lastly, as I mentioned above, it is very important for me to be able to archive data in an online repository AND be able to stream that data to my workstation (in QGIS). I could do this with GeoJSON, which is why I am such a big fan. I have not yet been able to figure out how to do this with GeoPackage, but I’m still investigating.

I would also like to be able to store the files online, stream them to my workstation AND visualize them on the web. There is one tool (see this) that promises to do this with GeoPackage. You can use this link to see a test of some of my data (http://ngageoint.github.io/geopackage-js/?gpkg=http://ironallentownpa.org/Testsmay27a_4326.gpkg). Sometimes it does not load (I don’t know why), but even when it does, it does not seem to support rasters, which is a big problem.

Anyone out there with any thoughts, suggestions or recommendations please comment below!

Juxtapose Test

This is a test of Juxtapose by the Knight Lab at Northwestern University. The two images below compare the present (Jan 2019) to an aerial photo from 1938. The furnace, casting house and the “dwelling house” have all been demolished, along with other local buildings. The original buildings were identified using an application for insurance for the furnace complex (valued at $5,500) from 1828.

Final Post- FLC on Open Scholarship

Over the past year, I have coordinated a Faculty Learning Community on Open Scholarship. In this post, I briefly discuss the progress on the project associated with my participation and some of the lessons that I have learned through organizing and participating in this FLC.

First, my project has morphed significantly over the year. What began as a project to test a particular method for sharing geospatial data openly (see description here) has changed into a particular publication strategy. The essential problem is that I was hoping to solve what I now consider two separate and distinct issues: a means for collaborating openly and a means for publishing openly. Fundamentally, these two components of research should not be separated. That is, even after publication, collaboration should continue to be possible- this would be a model such as Wikipedia, where, at least theoretically, pages continue to get edited and refined to represent the best research. That does not jibe well with peer review, which is essential both for ensuring high-quality research and for punctuating a research project. Publication should occur at significant points along the course of a collaborative project (even if it is open-ended). The key, of course, is knowing what those significant points are. Ideally, these points would be part of an original research strategy. However, my project, which involves looking at the historical landscape of charcoal production in the 19th century, began organically, more from pedagogical considerations than from research. The focus was on coordinating hands-on learning experiences for my students in courses such as Field Archaeology and Historical Ecology. It has become increasingly apparent, however, that I have reached an important milestone (indeed, I probably reached it a year ago) and the work needs to be published. But the key is to publish openly.

Here’s my plan. I have completed an article for the journal Historical Archaeology, as part of its Technical Briefs series, that discusses my work using open data and open source software to analyze LiDAR data to identify and map charcoal hearths. A description of the data will be published with the Journal of Open Archaeology Data and the actual data, I hope, will be published with Open Context. All three of these are peer-reviewed. The latter two are fully open access. The first is not, but it seemed to be the best location to publish the work.

This is the first FLC that I have led. My biggest lesson may be that I tried to be too broad; open scholarship is a very wide net. This was intentional, since I wanted to be sure to include everyone who was interested in “opening” their research (and associated scholarly practices) to a broader audience. Yet that also meant that, when it came down to choosing the readings, I tried to keep them as broad as possible. It is my impression that while each reading was broadly relevant to participants, none fit anyone particularly well. This broad net also meant that everyone’s project was so different that there was limited overlap. While it appears that participants were interested in and excited about each other’s projects, it also meant that there was limited interaction about these problems and I could provide little or no support. On the flip side, I’m not sure that I am personally interested in just one aspect of open scholarship, so I am not sure how I would reduce the scope of the FLC.

Lastly, I tried to keep communication as open as possible, but that meant that there were too many channels of communication and no one central “location” for participants to communicate.

I hope that participants in the FLC will provide additional feedback about the limitations of this FLC.

Openness and predatory journals.

This morning I read the article by Gina Kolata in the New York Times entitled “Many Academics Are Eager to Publish in Worthless Journals.” The article discusses predatory journals, which are “journals” (if we bend that term beyond recognition) that are willing to publish anyone, have little to no editorial staff and do not employ peer review (even if they claim to do so). Authors usually pay a fee to publish in these journals. These journals, therefore, promote pseudoscience- there is literally NO mechanism to ensure that the scholarship is well supported.

In the environment of “publish or perish,” some academics, Kolata reports, are publishing in these journals simply to get another publication. Publication in these non-peer-reviewed, unedited, “pay-to-play” journals does not appear to hinder academics’ ability to secure tenure. Indeed, some academics who have published in predatory journals have received awards that are, at least partially, based upon those publications. Additionally, it’s not terribly surprising that academics who work at institutions with high teaching loads but limited support for research (such as community colleges or liberal arts colleges- like Muhlenberg College, where I teach), yet who are also required to publish, are particularly susceptible to these journals.

So, what does this have to do with openness? Most of these journals are “open”- that is, they have minimal overhead and, since very few (none?) produce expensive print copies, the only place you can find these publications is online. Additionally, I presume that no reputable library would purchase a subscription. Therefore, the primary (sole?) source of income is from the authors.

Predatory journals are producing articles with “research” that has been neither peer-reviewed nor published in reputable journals or books or by reputable publishers, yet the articles are available to the public. There is no way to ensure their veracity or accuracy. Some are even indexed in Google Scholar, which does not vet the journals it indexes. Not only is this poor scholarship, but it is also MORE available to the public than most peer-reviewed scholarship, which is closed up tightly behind paywalls. A side note here… I headed over to Beall’s List of Predatory Publishers and Journals to try to find some of these articles. It was actually quite opaque. Many websites were nearly non-functional and actually finding articles was difficult. The concern expressed above may be more about the future than about what is available at present.

This post, therefore, though quite brief, is a call for more openness in research and scholarship- the more reliable research that is available, the better. But it is also a recognition that availability is clearly not enough. Openness is not enough. We, both as researchers and teachers, need to think deeply about how we teach our students to recognize the difference between reliable scholarship and research that is not supported. Also, because a limited portion of the “public” goes to college, we (perhaps as institutions, not as individuals) also need to figure out ways to communicate this information outside of our physical and scholarly spaces.


A quick test of Harvard WorldMap

For a long time, I have been looking for a way to both collaborate on and publish geospatial data and maps. Harvard WorldMap may be the answer. It is certainly the best thing I have found so far. Although it is based upon GeoNode and you (perhaps with help) could get your own instance up and running, the key to Harvard WorldMap is that it also aggregates maps from other sources.

With Harvard WorldMap, users can upload layers- including vector (points, lines, polygons) and georeferenced raster (e.g., aerial photos or historic maps) layers. Formats are currently limited to shapefiles and GeoTiffs. Once uploaded, the user must add metadata. This is a very good thing and a vital step in the production and sharing of any type of data, but is often difficult or imperfect for geospatial data.

The user can manage who can view, edit and manage the layers. Until a layer is ready for sharing, the user can keep it private. To collaborate with others, they can grant only those individuals view, edit, or edit-and-manage permission.

Once added, layers can be downloaded in a number of useful formats (zipped shapefile, GML 2.0, GML 3.1.1, CSV, Excel, GeoJSON, GeoTIFF, JPEG, PDF, PNG, and KML). Layers can also be streamed to your desktop GIS program (you are using QGIS, right?) via Web Mapping Service (WMS). This means that, as you build other layers in your desktop program, you and all of your collaborators can work from the same streamed data rather than from separate files on your own computers.
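
If you are streaming into QGIS, you can also add a WMS layer from the Python console. A minimal sketch follows; the endpoint and layer name are placeholders for whatever the WMS connection details are for your layer (the exact URI parameters can also be copied from QGIS’s own connection dialog):

    from qgis.core import QgsRasterLayer, QgsProject

    uri = ("crs=EPSG:4326&format=image/png&styles="
           "&layers=my_layer_name"                    # placeholder layer name
           "&url=https://example.org/geoserver/wms")  # placeholder WMS endpoint
    wms = QgsRasterLayer(uri, "worldmap_layer", "wms")
    if wms.isValid():
        QgsProject.instance().addMapLayer(wms)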

Layers can be aggregated into maps, access to which can also be restricted (or not) in the same way as layers. You can add your own layers, but you can also use their search engine to find layers that are connected to Harvard WorldMap, such as maps from USGS or from ESRI. The selection is not yet amazing, but I was able to find a few maps for my work in Ecuador that I had not found elsewhere.

Vector layers can be styled by changing the marker shape, color, size and label.

This map can then be published. Here’s a test of some data collected by my students and me regarding charcoal production on the Blue Mountain in Pennsylvania. Take a look. Note that you can change the layers (both my uploaded layers and the basemap).

Creating DEM from PASDA las files

The following is a step-by-step guide to building digital elevation models (and their derivatives) from PASDA LiDAR data. It describes how the maps discussed in the previous post were constructed and is provided in the spirit of open access and replicability.

  • Download las tiles from PASDA.
    • Go to PASDA Imagery Navigator: http://maps.psiee.psu.edu/ImageryNavigator/
    • Zoom in on the area of interest.
    • Under the Display Tile Index drop down menu, select “Lidar Hillshade”
      • This will show you the tile index and the relevant file names
    • Place your cursor over a spot of interest and right click.
      • This will bring up a list of available data.
      • Click on the “LiDAR, Topo and DEM” tab
      • At the right, you will see a listing of “LAS” files for download.
    • Select and download all the appropriate files.
  • Convert the projection and retain only category “2” points (2 = ground return).
      • Note that Pennsylvania data MUST be converted from NAD83 PA S (feet) to NAD83 PA S (meters)
      • Open las2las.exe
        • In the upper left, find and select all of the files from the above.
          • Note that you can use the wildcard (.las, not .laz, which is the default)
        • Keep only ground points
          • Expand the “filter” menu
          • Select “keep_classification” under “by classification or return”
          • Under “number or value”, enter 2
        • Reproject from feet to meters.
          • Under “target projection” select
            • State plane 83
            • PA_S
            • Be sure “units” are in meters.

Your GUI should look something like this:

[Screenshot: las2las GUI settings]

  • Choose an output location in the upper right.
    • Click “Run” (in the lower left; you may have to minimize (click the “-“))
    • In the command line, you should see something like:
    • las2las -lof file_list.7808.txt -target_sp83 PA_S -olaz
  • You should now have reprojected las files that include only the ground return.
  • Convert las files into smaller “tiles”
    • Open “lastile.exe”
    • Add the reprojected las files (actually now they should be laz files) in the upper left.
    • Choose a tile size of 1000 (for the above this means 1000 meters)
      • Choose a buffer of 25 (you need a buffer and just need to experiment with what works best for you.)

Your GUI should look like this:

[Screenshot: lastile GUI settings]

    • Hit “Run”
    • The command line should look something like this:
      • lastile -lof file_list.1576.txt -o "tile.laz" -tile_size 1000 -buffer 25 -odir "C:\Users\Benjamin\Desktop\Working_LiDAR\Repoj_tile_las" -olaz
  • Convert tiles into DEM
    • Open “BLAST2DEM.exe”
    • Add the tiles constructed in previous section
    • Choose your output location
    • Choose “tif” for file format

Your GUI should look like this:

[Screenshot: blast2dem GUI settings]

    • Click “RUN”
    • Your command line should look like this:
      • blast2dem -lof file_list.6620.txt -elevation -odir "C:\Users\Benjamin\Desktop\Working_LiDAR\DEM_tiles" -otif
    • Your DEMs are now created.

From here, you will want to stitch the DEMs back together, but you need a GIS program for that. You can use the open source QGIS.

  • Open QGIS
  • Click on Raster- Miscellaneous- Merge.
  • Select the “choose input directory instead of files” box
  • Select the destination location and file name.
  • Click “OK”-
    • I frequently get an error here, but the results appear complete.

At this point, all of your data should be in a single Geotiff file (be sure to save it) as a digital elevation model.
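
If you would rather script the merge than use the QGIS menu, a rough sketch with GDAL’s Python bindings looks like this (the tile directory matches the output location above; the other file names are placeholders):

    import glob
    from osgeo import gdal

    gdal.UseExceptions()

    tiles = glob.glob(r"C:\Users\Benjamin\Desktop\Working_LiDAR\DEM_tiles\*.tif")

    # Build a lightweight virtual mosaic, then write it out as one GeoTIFF.
    vrt = gdal.BuildVRT("dem_mosaic.vrt", tiles)
    gdal.Translate("dem_merged.tif", vrt,
                   creationOptions=["COMPRESS=LZW", "TILED=YES"])
    vrt = None  # close the VRT so everything is flushed to disk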

In order to complete the analysis in the previous post, I converted the DEM into a slope model, which shows high slope in lighter gray and low slope in darker gray.

  • To do this, in QGIS, use Raster- Terrain analysis- Slope. The input is your DEM and the output is the new slope model (a scripted equivalent is sketched below).
  • Within QGIS, you should now be able to see maps similar to those shown in the previous post.
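
Here is the scripted equivalent mentioned above, using GDAL’s DEM tools (file names are placeholders):

    from osgeo import gdal

    gdal.UseExceptions()

    # Equivalent of Raster > Terrain analysis > Slope; output is slope in degrees.
    gdal.DEMProcessing("slope.tif", "dem_merged.tif", "slope", computeEdges=True)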

Finding Charcoal- LasTools + PASDA LiDAR data= Amazing!

For a long time, I have been interested in charcoal production on the mountains around the Lehigh Valley, which I first learned about along the Lehigh Gap Nature Center‘s Charcoal Trail. I had hiked this trail many times before I discovered what the name meant. Along the trail are flat areas (around 30 feet [10 meters] in diameter) upon which colliers (charcoal makers) piled large mounds of wood that they charred to produce charcoal. One of the primary uses of that charcoal was iron production. Indeed, the area around the Lehigh Gap Nature Center (ok, a bit farther west) was owned by the Balliet family, who owned and operated two iron furnaces, one on each side of the Blue (Kittatinny) Mountain (one in Lehigh Furnace, Washington Township and another in East Penn Township; and likely a forge in East Penn).

I became interested, but was not truly fascinated until I found and perused PASDA’s Imagery Navigator. Within the Navigator, you can view DEMs (digital elevation models) created from a LiDAR survey from around 10 years ago. To put it too simply, to collect LiDAR a plane flies over an area shooting the ground with lasers. Since the location of the plane is known (through an amazing combination of GPS and IMU) and the speed of light is known, lasers bouncing back to the plane effectively measure the distance to a “return,” which is an object such as the tree canopy, a trunk, a roof, or the ground. A DEM is then constructed from the LiDAR point cloud. I wondered if this data could show me flat areas on the sloped landscape (like those clearly visible along the LGNC’s Charcoal Trail). It could!
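
The arithmetic behind that is simple enough to write down: the round-trip travel time of the pulse, times the speed of light, divided by two, gives the range to the return. A toy example:

    C = 299_792_458.0  # speed of light in m/s

    def lidar_range(round_trip_seconds):
        """Distance to the return, in meters (half the round-trip path)."""
        return C * round_trip_seconds / 2.0

    # A return that arrives about 6.7 microseconds after the pulse left
    # comes from roughly one kilometer away.
    print(lidar_range(6.7e-6))  # ~1004 m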

I used the “hillshade,” which is a view of the landscape created by applying a light source to the DEM (digital elevation model). It’s as if all of the vegetation was removed from the landscape and it was painted gray with a sun shining on it from the NW at about 45 degrees. This way, I was able to identify over 400 charcoal pits over an area of approximately 100 square kilometers.
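
That same NW-at-45-degrees hillshade can be generated from any DEM with GDAL’s Python bindings; azimuth 315 is northwest, and the file names below are placeholders:

    from osgeo import gdal

    gdal.UseExceptions()

    # Light source from the NW (azimuth 315), 45 degrees above the horizon.
    gdal.DEMProcessing("hillshade.tif", "dem.tif", "hillshade",
                       azimuth=315, altitude=45)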

So… many years later, I am finally doing something with this. My students and I, as part of a Field Archaeology class, are investigating charcoal pits and the people who used them. More on this part of the project another time.

In the meantime, my GIS skills have dramatically increased and I was lucky enough to attend a workshop on LiDAR (funded by NSF and run by NEON and Open Topography; a special thanks to Kathy Harring, Muhlenberg’s Interim Provost). I was interested in LiDAR for a new project that I am working on, but as part of the workshop, we were to do a small “project” based upon new understandings and skills developed over the three days. I chose to download some of the original LiDAR data (props to Pennsylvania for providing all of this online) and build my own digital elevation model. The idea was that I could tweak it in order to see the landscape better. So, I started off just trying to remake the DEM provided via PASDA; that would at least show that I had developed some skills. However, even this simple exercise produced spectacular results that have changed the way I conceptualize the landscape and our project.

Most importantly, the resolution of my reconstructed DEM is much, much greater than that of the DEM provided by PASDA. It is clear why this is true (see this description), but it is not as apparent as it should be when viewing (or downloading) the PASDA DEM. PASDA provides a DEM based upon points that are categorized as “8,” which are “Model Key” points (thinned-out ground points used to generate digital elevation models and contours), not those categorized as “2,” which are “ground” points. So, I was working with all of the ground points, while the PASDA-provided DEMs were based solely upon a subsample.

Here’s what I found:

Here’s an image of one section of the area under study. This is original data from PASDA. Note the “eye-shaped” flat spots.

[Image: hillshade from the original PASDA DEM]

In the following image, I have marked all of the charcoal pits I could find with a blue dot.

[Image: original hillshade with identified charcoal pits marked by blue dots]

This image shows the hillshade made from the DEM I built. Honestly, it doesn’t look terribly different from far above. Perhaps a bit more granular.

[Image: hillshade from the rebuilt DEM]

However, here’s a zoomed in comparison of the area just NE of the point in the lower left corner in the image above:

Before:

[Image: zoomed-in view, original DEM]

After:

[Image: zoomed-in view, rebuilt DEM]

However, once I do a slope analysis, which shows flatter areas in darker gray and steeper areas in lighter gray, the charcoal pits (which are flat areas on a sloped landscape) literally leap out of the image.

[Image: slope analysis of the rebuilt DEM]

The image below shows all of the newly identified charcoal pits with red triangles. A landscape that I once thought had minimal charcoal pits (I wondered why… and was developing possible hypotheses) now appears to have been quite densely packed with them.

[Image: slope analysis with newly identified charcoal pits marked by red triangles]

Next post… details on exactly how I did the above. Hint- LasTools and QGIS.

Wanna collect data digitally?

(Note- originally posted here http://digitalarchaeology.msu.edu/wanna-collect-data-digitally/ on Sept 6, 2016)

This is my final post as a participant in the Institute for Digital Archaeology. This post serves three purposes. First, I announce a resource that I have created to enable digital data collection in archaeology. Second, I want to mention a few of my favorite aspects of the Institute. Finally, I just want to say a few thanks.

First, I announce a new resource for digital data collection in archaeology (see website). While I initially planned to make something (I didn’t even really know what… an app?), instead I have cobbled together a couple of pre-built, “off-the-shelf” tools into a loose and compartmentalized system. And… because they are all well-supported open source tools, they are also 100% free! On the website, I provide a justification for why I chose these tools, the criteria for selection and descriptions of the tools. More importantly, even though all of these have low adoption thresholds (that was one of the criteria!), I also provide documentation on the ins and outs of using them in order to support their testing, adoption and use in archaeology. This means that you can be up and running in a matter of minutes (OK, maybe more depending upon download speeds…). In her final post Anne talks about toe-dipping and cannon-balling. My goal here was to suggest tools and provide assistance so that you can either dip your toes or jump right in; either way, I think you will see a big splash. I hope this helps. PLEASE LEAVE FEEDBACK. Please.

Second, I wanted to share two of my favorite aspects of the Institute. One, my colleagues. I have been honored to be part of such an open, collaborative and supportive cohort of insightful and dedicated scholars. I learned as much simply from conversations over coffee at breakfast, Thai food at lunch and beers over dinner as I could hope to learn at any organized workshop or talk. Your struggles are as valuable to me as your final products. I want you all to know that I look forward to more conversations over beer, lunch (maybe Mexican this time?), and beer (did I write beer twice?). Two, time. I greatly appreciate the space that participating in this yearlong institute has given me. Without this institute, I think I would still be struggling away trying to put some sort of digital data collection system together in my “spare” time. No, it’s not done (is there such a thing?), but the institute and the (dreaded) posts have kept me on track even through dead ends and unexpected turns.

Third, I want to thank the entire faculty. Of course, an especially large “THANK YOU” goes to Ethan and Lynne for putting the Institute together. I have learned so much from the rest of the faculty that I would like to thank them as well for their time and effort, both during the institute weeks at MSU and during the year in between. I understand the amorphous, complex, ugly (i.e. coding) world of digital archaeology much better than I ever thought I would. Thank you, Terry, Kathleen, Catherine, Brian, Shawn, Eric, Dan and Christine.

Lastly, a satisfied smile goes out to the NEH for supporting the Institute. Good decision! Amazing results! Just look.

Kobo Toolbox in the field- limitations? and solutions.

(Note: originally posted here: http://digitalarchaeology.msu.edu/kobo-toolbox-in-the-field-limitations-and-solutions/ on Aug 6, 2016)

This is a field report of efforts to develop a plan for low cost, digital data collection. Here’s what I have tried, what worked well, what did not and how those limitations were addressed.

First, a description of the conditions. We live in two locations in Ecuador. The first is the field center established and currently run by Maria Masucci, Drew University. It has many of the conveniences needed for digital data collection, such as reliable electricity, surge protectors, etc. It has neither internet nor a strong cellular data signal. We are largely here only on weekends. During the week, we reside in rather cramped conditions in rented space in a much more remote location, where amenities (digital and otherwise) are minimal. There is limited cellular data signal (if you stand on the water tower, which is in the center of town and the highest point even though it is only one story tall, you can get a weak cellular data signal; enough for texts and receiving emails, but not enough for internet use or sending emails) and there is no other access to internet. We also take minimal electronic equipment into the field for the week (e.g., my laptop does not travel), so everything needs to be set up prior to arrival. The idea, therefore, is to use minimal electronic equipment in the field; I tried to use only one device (while also experimenting with others) for this reason. My device of choice (or, honestly, by default) is my iPhone 5s.

The central component of this attempt at digital data collection is Kobo Toolbox (see my earlier posts for more details… here, here, here and here), an open-source, web-browser-based form creation, deployment and collection tool. Kobo Toolbox’s primary benefit is that, because it is browser-based, it is platform independent. You can use an iPad or an iPhone just as well as an Android device or a Mac or PC computer. This means that data can be collected on devices that are already owned or that can be bought cheaply (e.g., a lower-level Android device vs. an iPad). Forms are created through Kobo’s online tools and can be fairly elaborate, with skip logic and validation criteria. Once the form is deployed and you have an internet connection, you load the form into a browser on your device. You need to save the link so that it can be used without a data connection. On my iPhone 5s, I simply saved the link to the home screen. A couple of quick caveats are important here. I was able to load the form onto an iPhone 4s (but only using Chrome, not Safari), but was unable to save it, so I lost it once the phone was offline. I was unable to load the form at all on an iPhone 4 (even in Chrome). Therefore, although ideally the form should work in any browser, the reality is that it makes use of a number of HTML5 features that are not necessarily present in older browsers. Of course, as time goes on, phones and browsers will incorporate more HTML5 components and, therefore, this will be less of an issue.

Once the form is deployed and saved on your device, you can collect data offline. When the device comes back online, it will synchronize the data you have collected with Kobo’s server (note that you can install Kobo Toolbox on a local server, but at your own risk). Then, you can download your data from their easy-to-use website.

For the first week, I set up a basic form that collected largely numerical, text and locational data. We were performing a basic survey and recording sites. Outside of our normal methods of recording sites and locations, I recorded sites with Kobo Toolbox in order to determine its efficacy under rather difficult “real-world” conditions. I collected data for 5 days and Kobo Toolbox worked like a dream. It easily stored the data offline and, once I had access to a data signal, all the queued data was quickly uploaded. I had to open the form for this to occur. I was unable to upload with a weak cellular data signal; the upload only completed once I had access to WiFi (late on Friday night). However, it synchronized nicely and I was able to then download the data (as a CSV file) and quickly pull it into QGIS.
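
For anyone curious how that last step might look, here is a sketch for the QGIS Python console that loads a downloaded CSV as a point layer; the path and the latitude/longitude column names are placeholders for whatever your form uses:

    from qgis.core import QgsVectorLayer, QgsProject

    uri = ("file:///C:/fieldwork/kobo_sites.csv"            # placeholder path
           "?delimiter=,&xField=longitude&yField=latitude"  # placeholder column names
           "&crs=EPSG:4326")
    layer = QgsVectorLayer(uri, "kobo_sites", "delimitedtext")
    if layer.isValid():
        QgsProject.instance().addMapLayer(layer)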

The single biggest problem that I discovered in the field was that I needed to be able to see the locations of the sites recorded with Kobo Toolbox on a dynamic map. Although Kobo Toolbox recorded each location nicely, you cannot see the point on a map, so I had to use another method to visualize what I was recording. The only way to see the recorded data is by downloading it from Kobo Toolbox, which requires a data connection. You can see and edit the data only if you submit it as a draft. Once the data is submitted, however, you cannot edit it in the field (this was true of other field collection systems that I have used, e.g., FileMaker Go). Yet, I still needed a way to visualize site locations (so I could determine distances, relationships to geographic features and other sites, etc. while in the field).

For this purpose I used iGIS, a free iOS app (see below for limitations; a subscription allows additional options). Although this is an iOS app with no Android version, there are Android apps that function similarly. With this app, I was able to load my own data as shapefiles (created in QGIS) of topographic lines, previous sites and other vector data, as well as use a web-based background map (which seemed to work, even with a very minimal data connection). Raster data is possible, but it needs to be converted into tiles (the iGIS website suggests MapTiler, but this can also be done in QGIS). Although you can load data via multiple methods (e.g., wifi using Dropbox), I was able to quickly load the data into the app using iTunes. Once this data is in the app on the phone, an internet connection is no longer needed.

As I collected data with Kobo Toolbox, I also collected a point with iGIS (with a label matching the label used in Kobo), so that I could see the relationship between sites and the environment. Importantly, I was also able to record polygons and lines, which you cannot do with Kobo Toolbox. Larger sites are better represented as polygons, rather than points (recognizing the c. 5-10m accuracy of the iPhone GPS). The collection of polygons is a bit trickier, but it works. Polygons and lines can later be exported as shapefiles and loaded into a GIS program. By using equivalent naming protocols between Kobo Toolbox and iGIS, one can ensure that the data from the two sources can be quickly and easily associated.

The greatest benefit of iGIS is seeing the location of data points (and lines and polygons) in the field and being able to load custom maps (vector and raster) into the app and view them without a data connection. Although this is possible with paper maps (by printing custom maps, etc.), the ability to zoom in and out increases the value of this app greatly. Getting vector data in and out of iGIS is quite easy and straightforward. iGIS is limited in a few ways, nearly all of which are resolved with a subscription, which I avoided. Here’s a brief list of limitations:
– All points (even on different layers) appear exactly the same (same size, shape, color; fully editable with subscription). This can make it very difficult to distinguish a town from a site from a geographic location
- Like points, all lines and polygons appear the same (also remedied with a subscription). It was particularly difficult to tell the difference between the many uploaded topolines and the collected polygons.
– Limited editing capabilities (can edit location of points, but not nodes of lines; can edit selected data).
- Limited entry fields (remedied with a subscription, but perhaps this is not necessary if it can be connected to data collected with Kobo Toolbox).
– Unable to collect “tracks” as with a traditional GPS device (Edit- OK, so I was wrong about this! You can collect GPS tracks in iGIS, even though this is not as obvious as one might like).

The final limitation of iGIS involves something that was not originally desired, but that became incredibly useful for collecting survey data, especially negative results (positive results were recorded as described above). Our survey employed a “stratified opportunistic” strategy. We largely relied upon local knowledge and previous archaeological identification to locate sites, but also wanted to sample the highest peaks, mid-level areas and valley bottoms. In order to do this, we used three different strategies. First, we utilized knowledgeable community members to take us to places they recognized as archaeological sites. Second, we followed selected paths (also chosen by local experts). Third, we chose a few points (especially in the higher peaks, c. 200-300 meters above the valley floor). One of the most important aspects of this type of survey was recording our “tracks” so that we would know where we had traveled. This is commonly done with GPS units, but I was able to collect these using MotionX-GPS on the iPhone already in use. The GPS “tracks” (which are really just lines) as well as “waypoints” (i.e., points) were easily exported and loaded into QGIS. This allowed us to easily collect data about where we traveled but did not find archaeological sites. (Edit- Note that you can use iGIS for this function! MotionX-GPS is not needed, therefore. It is great for recording mountain biking and hiking, however!)
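
Getting those exported tracks into a GIS-friendly format can also be scripted. Here is a rough sketch using GDAL’s Python bindings; “tracks” is one of the standard layers the OGR GPX driver exposes, and the file names are placeholders:

    from osgeo import gdal

    gdal.UseExceptions()

    # Convert the GPX track log to GeoJSON so it can be shared and versioned easily.
    gdal.VectorTranslate("survey_tracks.geojson", "gps_export.gpx",
                         format="GeoJSON", layers=["tracks"])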

One final comment will suffice here. I just discovered a new app that may be able to replace iGIS. QField is specifically designed to work with the open source GIS program QGIS. Although it is still new and definitely still in development, it promises to be an excellent open source solution for offline digital data collection- though limited to Android devices!