Geospatial Data Management & Metadata

This Presentation:

  1. Data Management and Best Practices
  2. Overview of Geospatial Metadata

to view speaker notes for this presentation, press the >> s << key

a new window will open and you can view the presentation in speaker mode

What do you mean, "Data Management"?

you plan, find, create/edit, analyze, describe, and share/preserve data

throughout these processes you make decisions about how to manage your data

Have you ever thought:

  • How should I name my files?
  • Where can I store my data?
  • What should I keep track of when I make changes?
  • How will I explain my data to others?

Data Workflows

GIS Analyst ⮕ IT Specialist ⮕ Community Member

Grad RA ⮕ Research Cluster ⮕ Funder

Workflows can break down

inadequately described data versions

meaningless filenames

changes in storage locations

...what else?

Data management plans

detailed data management plans are often necessary for funding proposals and satisfying grant requirements

data management planning tools:

DMPtool.org

DMP Assistant

A few basic geodata management best practices

File Naming Guidelines

best practices include being consistent, and keeping file names short and descriptive.

File Naming - Dates

Use YYYYMMDD format

Do: homework_20200319.txt

Don't: homework_19032020.txt
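
A minimal sketch of why YYYYMMDD helps, assuming Python and a hypothetical helper called dated_name: names built this way fall into chronological order with a plain alphabetical sort.

```python
from datetime import date


def dated_name(stem, day, ext):
    """Build a file name such as homework_20200319.txt (hypothetical helper)."""
    return f"{stem}_{day.strftime('%Y%m%d')}.{ext}"


names = [
    dated_name("homework", date(2020, 3, 19), "txt"),
    dated_name("homework", date(2019, 12, 1), "txt"),
    dated_name("homework", date(2020, 1, 5), "txt"),
]

# An alphabetical sort is also a chronological sort for YYYYMMDD names.
for name in sorted(names):
    print(name)
```

With DDMMYYYY, the same sort would group files by day of the month instead of by year.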

File Naming - Identifiers

Use unique abbreviations for project names or grants

Do: fhabc_notes.txt

Don't: forest_history_association_of_BC_notes.txt

File Naming - Descriptors

Descriptor should be minimal but unique

Do: fhabc_grantProposal.pdf

Don't: fhabc.pdf

File Naming - Delimiters

Use _ or - to divide your filename elements

Do: fhabc_grantProposal_v01.pdf

Don't: fhabc, grant proposal -->[v01].pdf


In ArcGIS only use _

File Naming - Versions

Number versions sequentially, or use a unique date and time

Do: NRC_userGuidelines_v04.doc

Do: MSL-fraserRiverSamples-20200319-0900.csv

Don't: userGuidelines_final_edits_2_forreal.doc
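
Both "Do" patterns can be generated rather than typed by hand. This is a sketch only; the function names versioned_name and timestamped_name are made up for illustration.

```python
from datetime import datetime


def versioned_name(project, descriptor, version, ext):
    """Sequential version, zero-padded so v04 sorts before v10."""
    return f"{project}_{descriptor}_v{version:02d}.{ext}"


def timestamped_name(project, descriptor, when, ext):
    """Date and time (YYYYMMDD-HHMM) as the version marker."""
    return f"{project}-{descriptor}-{when.strftime('%Y%m%d-%H%M')}.{ext}"


print(versioned_name("NRC", "userGuidelines", 4, "doc"))
print(timestamped_name("MSL", "fraserRiverSamples", datetime(2020, 3, 19, 9, 0), "csv"))
```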

File Naming - Other things

  • don't start filenames with a number or underscore
  • be aware of character limits
  • never ever ever use spaces as delimiters

More info can be found using UBC Library's research data planning guidelines.
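
The file naming guidelines above can be turned into a quick check. This is a rough sketch, not an official rule set: the function name check_filename and the 64-character limit are assumptions, so adjust them to the limits of your own systems.

```python
import re

MAX_LENGTH = 64  # assumed limit for this sketch; real limits vary by system


def check_filename(name, max_length=MAX_LENGTH):
    """Return a list of guideline problems found in a file name."""
    problems = []
    if " " in name:
        problems.append("contains spaces")
    # Anything other than letters, digits, _ - . (spaces are reported above).
    if re.search(r"[^A-Za-z0-9_. -]", name):
        problems.append("uses characters other than letters, digits, _ - and .")
    if name[:1].isdigit() or name.startswith("_"):
        problems.append("starts with a number or underscore")
    if len(name) > max_length:
        problems.append(f"longer than {max_length} characters")
    return problems


for candidate in ["fhabc_grantProposal_v01.pdf", "fhabc, grant proposal -->[v01].pdf"]:
    print(candidate, check_filename(candidate))
```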

Attribute Naming Guidelines

best practices again include being consistent, and keeping field names short and descriptive.

it's difficult to briefly describe the output of one or many calculations!

start a codebook if you need to abbreviate

Attribute Naming - Character Length

be aware of limits: Shapefile field names are limited to 10 characters

Do: POPDEN_20

Don't: population_density_2020
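
A small sketch for catching over-long field names before exporting to a Shapefile. The helper name too_long is hypothetical; the limit of 10 comes from the slide above.

```python
SHAPEFILE_FIELD_LIMIT = 10  # characters, per the Shapefile limit noted above


def too_long(field_names, limit=SHAPEFILE_FIELD_LIMIT):
    """Return the field names that exceed the character limit."""
    return [name for name in field_names if len(name) > limit]


fields = ["POPDEN_20", "population_density_2020"]
print(too_long(fields))
```

Checking first lets you choose the abbreviation yourself instead of leaving it to whatever the export software does with long names.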

Attribute Naming - Delimiters

use camelCase when necessary to divide field elements

Do: fieldName

Don't: thisismyfieldname

Attribute Naming - Codebooks

list your field names and labels

provide description and info about each one

describe how values are coded or recorded

keep it up-to-date
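
A codebook can be as simple as a CSV with one row per field. The sketch below assumes four columns (field, label, description, coding); the column names and the placeholder text are illustrative, not a standard.

```python
import csv
import io

COLUMNS = ["field", "label", "description", "coding"]  # assumed layout

codebook = [
    {
        "field": "POPDEN_20",
        "label": "Population density, 2020",
        "description": "placeholder: say how the value was calculated",
        "coding": "placeholder: units, or what each code means",
    },
]


def codebook_to_csv(entries):
    """Serialize codebook entries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


text = codebook_to_csv(codebook)
print(text)
```

Saving this text next to the data, and updating it whenever a field is added or renamed, is one way to keep the codebook current.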

Structuring Directories

folders organize data for you AND for others

✔️logical

✔️predictable
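
One way to keep folders predictable is to create the same skeleton for every project. The folder names below are only an example layout, and make_project is a hypothetical helper; the sketch builds the structure in a temporary directory so it is safe to run.

```python
import tempfile
from pathlib import Path

# Example layout only; pick names that make sense for your own projects.
FOLDERS = ["data/raw", "data/processed", "docs", "scripts", "outputs"]


def make_project(root):
    """Create the folder skeleton under root and list what now exists."""
    root = Path(root)
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    created = make_project(Path(tmp) / "fhabc")
    print(created)
```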

README files

text files explaining a project or parts of a project so others know what it is

found in top-level directories of projects

can link to other docs or relevant information

makeareadme.com

Version Control

version control systems keep track of file changes

essentially a database of changes

Version Control

different types of systems for different industries

git is very common and widely integrated

GeoGig is emerging but specific to geodata

Data Preservation

data preservation ensures long-term access to and use of data – beyond limits of media

includes procedures regarding file formats, copyright and permissions, persistent storage and geographic location, and metadata.

Data Preservation - File formats

decide which file formats are the most reliable and persistent for your data

prioritize platform-independent, character-based formats

prioritize UTF-8 character encoding
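
A short sketch of saving a table in a platform-independent, character-based format (CSV) with UTF-8 encoding stated explicitly. The sample rows are invented for illustration.

```python
import csv
import tempfile
from pathlib import Path

rows = [["name", "type"], ["Québec", "province"]]  # made-up sample data

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "places.csv"

    # State the encoding explicitly rather than relying on the platform default.
    with open(path, "w", encoding="utf-8", newline="") as f:
        csv.writer(f).writerows(rows)

    with open(path, "r", encoding="utf-8", newline="") as f:
        read_back = list(csv.reader(f))

print(read_back)
```

Reading the file back with the same encoding returns the accented characters unchanged.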

Now let's talk about metadata

metadata describes your data so it can be used, shared, and understood widely

Metadata in Plain Language

Questions you need to be prepared to answer about your data:

USGS Metadata in Plain Language

Examples

metadata formatted for web discovery

xml-encoded metadata

Difficulties

frankly, metadata is pretty boring

it takes a lot of time

lots of standards, no clear best choice

bad metadata negatively affects:

  • integrity
  • discoverability
  • preservability
  • usability

4 main metadata "types"

  1. descriptive
  2. technical
  3. discovery
  4. administrative

Descriptive Metadata

includes things like:

  • abstract/methodology
  • attribute descriptions
  • purpose
  • uncertainty and errors
  • access

Technical Metadata

includes things like:

  • CRS / projection / datum
  • attribute data types
  • software used
  • character encoding

Discovery Metadata

includes things like:

  • title
  • date
  • keywords
  • geographic extent

Administrative Metadata

includes things like:

  • copyright
  • contact info
  • status
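
The four types can be pictured as four sections of one record. The sketch below is not ISO-formatted metadata or any other standard: the keys, the placeholder values, and the missing_sections helper are all assumptions for illustration.

```python
import json

SECTIONS = ("descriptive", "technical", "discovery", "administrative")

# Placeholder values only; replace with real information about your data.
record = {
    "descriptive": {"abstract": "placeholder", "purpose": "placeholder"},
    "technical": {"crs": "EPSG:4326", "character_encoding": "UTF-8"},
    "discovery": {"title": "placeholder", "date": "2020-03-19", "keywords": ["placeholder"]},
    "administrative": {"copyright": "placeholder", "contact": "placeholder", "status": "draft"},
}


def missing_sections(rec):
    """List the metadata types that are absent or empty in a record."""
    return [section for section in SECTIONS if not rec.get(section)]


print(missing_sections(record))
print(json.dumps(record, indent=2))
```

A check like this only confirms that each section has something in it; whether the content is useful still takes a human reader.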

Metadata Standards

why have metadata standards?

  • ease transformation/conversion
  • ensure proper interpretation

Metadata Standards

2 main geospatial metadata standards

Metadata Standard - ISO

flexible and internationally recognized

generally recommended

complex

documentation costs money

don't worry! there are several tools to help you create and edit metadata!

Metadata tools and editors

ArcGIS Pro!

catMDEdit

mdEditor.org (beta)

GeoNetwork

and more!!

creating metadata can be tedious. But remember: metadata will make your data more reproducible, sharable, and impactful.

motivational quote:

metadata is a love note to the future

Thanks!

Evan Thornberry

evan.thornberry@ubc.ca