@lukecampbell
Last active September 10, 2015 17:41
GLOS Metadata

Metadata

References

Identifier

We need a unique key to associate with this metadata. Unfortunately, I don't think geonetwork (or any other CSW, for that matter) ensures identifier uniqueness, which I will need to bring up with IOOS at some point.

./gmd:fileIdentifier/gco:CharacterString/text()[1]
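Since uniqueness isn't guaranteed, it is probably worth checking harvested records for collisions. Here is a minimal sketch using Python's standard-library ElementTree. ElementTree only supports a subset of XPath (no `text()`), so the sketch calls `findtext` on the element path instead. The sample record, its identifier and the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical record, trimmed to the one element this sketch reads.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:fileIdentifier>
    <gco:CharacterString>example-record-001</gco:CharacterString>
  </gmd:fileIdentifier>
</gmd:MD_Metadata>"""


def file_identifier(record):
    """Return the fileIdentifier text of an MD_Metadata element, or None."""
    text = record.findtext("./gmd:fileIdentifier/gco:CharacterString", namespaces=NS)
    return text.strip() if text and text.strip() else None


def duplicate_identifiers(records):
    """Return the identifiers that appear on more than one record."""
    counts = Counter(file_identifier(r) for r in records)
    return sorted(i for i, n in counts.items() if i is not None and n > 1)


record = ET.fromstring(SAMPLE)
print(file_identifier(record))
print(duplicate_identifiers([record, record]))
```

Records without an identifier come back as `None` and are left out of the duplicate report, so they would need to be handled separately.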

References

Title

A title describing what this dataset is.

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString/text()[1]

References

Abstract

A summary of the dataset.

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:abstract/gco:CharacterString/text()[1]

References

Data Provider

There are limitless ways to refer to data providers, so I'm going to look at the data distributors.

/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString/text()[1]

or

/gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:organisationName/gco:CharacterString/text()[1] | ./gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor/gmd:distributorContact/gmd:CI_ResponsibleParty/gmd:individualName/gco:CharacterString/text()[1]
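An XPath union returns every match in document order, so something still has to pick one name. Here is a rough sketch of one way to do that with the standard-library ElementTree, which has no union operator. It assumes the organisation name is preferred and the individual name is only a fallback; that preference, the sample names and the helper names are my assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

PARTY = (
    "./gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor/gmd:MD_Distributor"
    "/gmd:distributorContact/gmd:CI_ResponsibleParty"
)

# Hypothetical record skeleton; {names} is filled with the name elements.
TEMPLATE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
<gmd:distributionInfo><gmd:MD_Distribution><gmd:distributor><gmd:MD_Distributor>
<gmd:distributorContact><gmd:CI_ResponsibleParty>{names}</gmd:CI_ResponsibleParty>
</gmd:distributorContact>
</gmd:MD_Distributor></gmd:distributor></gmd:MD_Distribution></gmd:distributionInfo>
</gmd:MD_Metadata>"""

PERSON = ("<gmd:individualName><gco:CharacterString>Jane Doe"
          "</gco:CharacterString></gmd:individualName>")
ORG = ("<gmd:organisationName><gco:CharacterString>Example Lakes Lab"
       "</gco:CharacterString></gmd:organisationName>")


def data_provider(record):
    """Return the first distributor name found, preferring the organisation."""
    for party in record.findall(PARTY, NS):
        for tag in ("gmd:organisationName", "gmd:individualName"):
            text = party.findtext(tag + "/gco:CharacterString", namespaces=NS)
            if text and text.strip():
                return text.strip()
    return None


both = ET.fromstring(TEMPLATE.format(names=PERSON + ORG))
person_only = ET.fromstring(TEMPLATE.format(names=PERSON))
print(data_provider(both))
print(data_provider(person_only))
```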

References

Extents

Geo Extents

The standard supports 3 types of geo extent descriptions: EX_BoundingPolygon, EX_GeographicBoundingBox and EX_GeographicDescription. I've never actually seen an example of EX_GeographicDescription, so I don't know how to code it, and the spec is as vague as they come. I will support polygon and bounding box. Currently only bounding box is supported, but if we need polygon support, I can add it.

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:westBoundLongitude/gco:Decimal/text()[1]
/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:eastBoundLongitude/gco:Decimal/text()[1]
/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:northBoundLatitude/gco:Decimal/text()[1]
/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:geographicElement/gmd:EX_GeographicBoundingBox/gmd:southBoundLatitude/gco:Decimal/text()[1]
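The four paths above can be read in one pass. This is a minimal sketch with the standard-library ElementTree; the coordinates in the sample record are invented, and returning `None` for an incomplete box is my choice, not something the standard dictates.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

BBOX = (
    "./gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent"
    "/gmd:geographicElement/gmd:EX_GeographicBoundingBox"
)

# Hypothetical record with invented coordinates.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
<gmd:identificationInfo><gmd:MD_DataIdentification><gmd:extent><gmd:EX_Extent>
<gmd:geographicElement><gmd:EX_GeographicBoundingBox>
  <gmd:westBoundLongitude><gco:Decimal>-92.5</gco:Decimal></gmd:westBoundLongitude>
  <gmd:eastBoundLongitude><gco:Decimal>-75.5</gco:Decimal></gmd:eastBoundLongitude>
  <gmd:southBoundLatitude><gco:Decimal>40.5</gco:Decimal></gmd:southBoundLatitude>
  <gmd:northBoundLatitude><gco:Decimal>49.5</gco:Decimal></gmd:northBoundLatitude>
</gmd:EX_GeographicBoundingBox></gmd:geographicElement>
</gmd:EX_Extent></gmd:extent></gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""

EDGES = {
    "west": "gmd:westBoundLongitude",
    "east": "gmd:eastBoundLongitude",
    "north": "gmd:northBoundLatitude",
    "south": "gmd:southBoundLatitude",
}


def bounding_box(record):
    """Return the first bounding box as a dict of floats, or None if incomplete."""
    box = record.find(BBOX, NS)
    if box is None:
        return None
    result = {}
    for name, tag in EDGES.items():
        text = box.findtext(tag + "/gco:Decimal", namespaces=NS)
        if not text or not text.strip():
            return None
        result[name] = float(text)
    return result


record = ET.fromstring(SAMPLE)
print(bounding_box(record))
```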

References

Start Time

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:beginPosition

End Time

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod/gml:endPosition
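Here is a sketch of reading both positions with the standard-library ElementTree. It makes a few assumptions that should be checked against real records: the `gml` prefix is bound to `http://www.opengis.net/gml` (some records use `http://www.opengis.net/gml/3.2` instead, so the URI is a parameter), the values are ISO 8601 strings, and an empty end position means the dataset is still being collected.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

PERIOD = (
    "./gmd:identificationInfo/gmd:MD_DataIdentification/gmd:extent/gmd:EX_Extent"
    "/gmd:temporalElement/gmd:EX_TemporalExtent/gmd:extent/gml:TimePeriod"
)

# Hypothetical record; the end position is left open on purpose.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gml="http://www.opengis.net/gml">
<gmd:identificationInfo><gmd:MD_DataIdentification><gmd:extent><gmd:EX_Extent>
<gmd:temporalElement><gmd:EX_TemporalExtent><gmd:extent>
<gml:TimePeriod gml:id="t1">
  <gml:beginPosition>2010-01-01T00:00:00Z</gml:beginPosition>
  <gml:endPosition indeterminatePosition="now"/>
</gml:TimePeriod>
</gmd:extent></gmd:EX_TemporalExtent></gmd:temporalElement>
</gmd:EX_Extent></gmd:extent></gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def parse_position(text):
    """Parse an ISO 8601 position; empty or missing values become None."""
    if not text or not text.strip():
        return None
    # fromisoformat only accepts a trailing "Z" on newer Python versions.
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def time_extent(record, gml_ns="http://www.opengis.net/gml"):
    """Return (start, end) for the first TimePeriod, or (None, None)."""
    ns = {"gmd": "http://www.isotc211.org/2005/gmd", "gml": gml_ns}
    period = record.find(PERIOD, ns)
    if period is None:
        return None, None
    start = parse_position(period.findtext("gml:beginPosition", namespaces=ns))
    end = parse_position(period.findtext("gml:endPosition", namespaces=ns))
    return start, end


start, end = time_extent(ET.fromstring(SAMPLE))
print(start, end)
```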

References

Services

Service metadata tells us how to get the data that the document describes. Unfortunately, this version of geonetwork follows a very old model of ISO-19115, while the IOOS practice is to use ISO-19115-2. More information.

To identify service types within the document:

/gmd:identificationInfo/srv:SV_ServiceIdentification/srv:serviceType/gco:LocalName/text()[1]

but that doesn't work with Geonetwork's ISO representation so I use

/gmd:distributionInfo/gmd:MD_Distribution/gmd:transferOptions/gmd:MD_DigitalTransferOptions/gmd:onLine/gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString/text()[1]

to determine what services are available and then I map those to the URLs

/gmd:distributionInfo/gmd:MD_Distribution/gmd:transferOptions/gmd:MD_DigitalTransferOptions/gmd:onLine/gmd:CI_OnlineResource/gmd:linkage/gmd:URL/text()[1]
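Reading the protocol and the URL from the same CI_OnlineResource element keeps the two paired up, which two separate queries would not guarantee. This is a minimal sketch with the standard-library ElementTree; the protocol strings and URLs in the sample are invented, and skipping resources that lack either value is my assumption.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

ONLINE = (
    "./gmd:distributionInfo/gmd:MD_Distribution/gmd:transferOptions"
    "/gmd:MD_DigitalTransferOptions/gmd:onLine/gmd:CI_OnlineResource"
)

# Hypothetical record: two usable resources and one without a protocol.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
<gmd:distributionInfo><gmd:MD_Distribution><gmd:transferOptions>
<gmd:MD_DigitalTransferOptions>
  <gmd:onLine><gmd:CI_OnlineResource>
    <gmd:linkage><gmd:URL>http://example.org/wms</gmd:URL></gmd:linkage>
    <gmd:protocol><gco:CharacterString>OGC:WMS</gco:CharacterString></gmd:protocol>
  </gmd:CI_OnlineResource></gmd:onLine>
  <gmd:onLine><gmd:CI_OnlineResource>
    <gmd:linkage><gmd:URL>http://example.org/sos</gmd:URL></gmd:linkage>
    <gmd:protocol><gco:CharacterString>OGC:SOS</gco:CharacterString></gmd:protocol>
  </gmd:CI_OnlineResource></gmd:onLine>
  <gmd:onLine><gmd:CI_OnlineResource>
    <gmd:linkage><gmd:URL>http://example.org/about</gmd:URL></gmd:linkage>
  </gmd:CI_OnlineResource></gmd:onLine>
</gmd:MD_DigitalTransferOptions>
</gmd:transferOptions></gmd:MD_Distribution></gmd:distributionInfo>
</gmd:MD_Metadata>"""


def services(record):
    """Map each protocol string to the list of URLs that advertise it."""
    found = {}
    for resource in record.findall(ONLINE, NS):
        protocol = resource.findtext("gmd:protocol/gco:CharacterString", namespaces=NS)
        url = resource.findtext("gmd:linkage/gmd:URL", namespaces=NS)
        if protocol and protocol.strip() and url and url.strip():
            found.setdefault(protocol.strip(), []).append(url.strip())
    return found


record = ET.fromstring(SAMPLE)
print(services(record))
```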

Categories

This has been a big sticking point for us for a while, since categories are a feature that's absolutely exclusive to Geonetwork's internal metadata structure. Ben crammed them into the ISO-19115 records in BaseX, but those records are actually illegal now and they won't be harvested by third parties, including the NGDC Geoportal. It's also extremely slow to request the categories from geonetwork, and it takes 4 requests to get the categories for a single record.

I propose that we incorporate these categories into the actual metadata record (it will have to be by hand). That way the categories become part of the metadata and are harvestable by outside consumers, and coding against them and incorporating them into applications gets significantly easier.

What I would like to do is match the thesaurus name to "GLOS Categories" as our official source of category names. Each entry in the comma-separated category list should be one of our approved categories:

  • beach_health
  • binational
  • climatology
  • environmental
  • habs
  • hydrologic
  • invasiveSpecies
  • models
  • nearshore
  • nutrients
  • observationsBuoys
  • otherResources
  • qa_qc
  • satellite
  • temperature
  • water_quality
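Because the categories will be entered by hand, a small check against the approved list could catch typos early. This is only a sketch; the function name and the choice to return the unknown names rather than raise an error are mine.

```python
# The approved GLOS category names, copied from the list above.
APPROVED = frozenset({
    "beach_health", "binational", "climatology", "environmental", "habs",
    "hydrologic", "invasiveSpecies", "models", "nearshore", "nutrients",
    "observationsBuoys", "otherResources", "qa_qc", "satellite",
    "temperature", "water_quality",
})


def parse_categories(value):
    """Split a comma-separated list and report names that are not approved.

    Returns (names, unknown): every non-empty name in order, and the subset
    of those names that is missing from APPROVED.
    """
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in APPROVED]
    return names, unknown


names, unknown = parse_categories("habs, nutrients,water_qualty")
print(names)
print(unknown)
```

The comparison is case-sensitive, so `invasivespecies` would be reported as unknown.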

I've recently found that I'm not the first person to arrive at this solution. This was a reply to Kathy on a post about geonetwork:

Hi Kathy, 

Not an expert on this, but I believe GN categories is something specific 
to GN, while CSW is an open standard. Thus, GN categories wouldn't be 
available through CSW. 
A workaround would be to quit using GN categories, and replace it by a 
hand-made thesaurus, in which you put your "categories". 
Categories would then be with the other keywords and available with CSW. 
Hope this helps, 

Jean

With the categories stored as keywords under the "GLOS Categories" thesaurus, they can be selected with:

/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:descriptiveKeywords/gmd:MD_Keywords[gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString='GLOS Categories']/gmd:keyword/gco:CharacterString/text()
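Here is a rough sketch of the same selection with the standard-library ElementTree. ElementTree's XPath subset can't express the nested thesaurus predicate, so the filter on the thesaurus title is done in Python. The sample record and the function name are invented, and splitting each keyword on commas assumes the categories are stored as one comma-separated string.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

KEYWORDS = (
    "./gmd:identificationInfo/gmd:MD_DataIdentification"
    "/gmd:descriptiveKeywords/gmd:MD_Keywords"
)
TITLE = "gmd:thesaurusName/gmd:CI_Citation/gmd:title/gco:CharacterString"

# Hypothetical record: one plain keyword block and one GLOS Categories block.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
<gmd:identificationInfo><gmd:MD_DataIdentification>
<gmd:descriptiveKeywords><gmd:MD_Keywords>
  <gmd:keyword><gco:CharacterString>lake</gco:CharacterString></gmd:keyword>
</gmd:MD_Keywords></gmd:descriptiveKeywords>
<gmd:descriptiveKeywords><gmd:MD_Keywords>
  <gmd:keyword><gco:CharacterString>habs, nutrients</gco:CharacterString></gmd:keyword>
  <gmd:keyword><gco:CharacterString>temperature</gco:CharacterString></gmd:keyword>
  <gmd:thesaurusName><gmd:CI_Citation><gmd:title>
    <gco:CharacterString>GLOS Categories</gco:CharacterString>
  </gmd:title></gmd:CI_Citation></gmd:thesaurusName>
</gmd:MD_Keywords></gmd:descriptiveKeywords>
</gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def glos_categories(record, thesaurus="GLOS Categories"):
    """Return category names from keyword blocks whose thesaurus title matches."""
    names = []
    for block in record.findall(KEYWORDS, NS):
        title = block.findtext(TITLE, default="", namespaces=NS)
        if title.strip() != thesaurus:
            continue
        for keyword in block.findall("gmd:keyword/gco:CharacterString", NS):
            text = keyword.text or ""
            names.extend(n.strip() for n in text.split(",") if n.strip())
    return names


record = ET.fromstring(SAMPLE)
print(glos_categories(record))
```

This needs no extra requests to geonetwork, since everything is read from the record itself.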