add SHACL shape to validate SO namespace #59

Closed
mbjones opened this issue Dec 2, 2019 · 12 comments · Fixed by #103

@mbjones
Collaborator

mbjones commented Dec 2, 2019

On today's call, we agreed to update the guidance docs to recommend the https variant of the schema.org namespace with a trailing slash (/). We also discussed whether the trailing slash should be required, meaning it would be validated with a SHACL shape. This request is to add a SHACL shape that tests whether the namespace has a trailing slash and raises an error if it does not. With this, we would have:

  • https://schema.org/: VALID
  • http://schema.org/: ?
  • https://schema.org: INVALID
  • http://schema.org: INVALID
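
The list above can be read as a small lookup. The following is only a sketch of that reading, not an agreed rule: the function name and the `UNDECIDED` and `NOT_SCHEMA_ORG` labels are assumptions made for illustration, with `UNDECIDED` standing in for the "?" entry.

```python
def classify_namespace(ns: str) -> str:
    """Classify a namespace string following the list above (illustrative only)."""
    if ns == "https://schema.org/":
        return "VALID"
    if ns == "http://schema.org/":
        # Marked "?" in the list: has the trailing slash but uses http.
        return "UNDECIDED"
    if ns in ("https://schema.org", "http://schema.org"):
        # Missing trailing slash.
        return "INVALID"
    # Hypothetical label for anything that is not a schema.org namespace.
    return "NOT_SCHEMA_ORG"


for ns in ("https://schema.org/", "http://schema.org/",
           "https://schema.org", "http://schema.org"):
    print(f"{ns}: {classify_namespace(ns)}")
```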
@mbjones
Collaborator Author

mbjones commented Dec 2, 2019

After further discussion on the call, we identified that SHACL may not work for this validation because it will treat the two namespace strings with and without a slash as different namespaces. We really need more of a pre-parser step that checks the namespace in @vocab before it is handed off for SHACL validation. @fils do you have thoughts on what would be best to handle this in Fence or other tools?
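
A pre-parser step of the kind described here might look like the sketch below. It is not how Fence works; the function names are hypothetical, and it assumes the JSON-LD document is a single top-level object whose `@context` is inline (remote context URLs are not fetched or inspected). It also shows why the two strings behave as different namespaces: JSON-LD expands a term by concatenating it onto `@vocab`, so a missing slash yields a different class IRI.

```python
import json

# schema.org namespaces written without the trailing slash.
BARE_NAMESPACES = {"https://schema.org", "http://schema.org"}


def find_vocabs(context):
    """Collect every @vocab string from an inline JSON-LD @context."""
    if isinstance(context, dict):
        vocab = context.get("@vocab")
        return [vocab] if isinstance(vocab, str) else []
    if isinstance(context, list):
        found = []
        for item in context:
            found.extend(find_vocabs(item))
        return found
    # A bare string is a remote context URL; this sketch does not fetch it.
    return []


def check_vocab(jsonld_text):
    """Return a list of problems; an empty list means no bare @vocab was found."""
    doc = json.loads(jsonld_text)  # assumption: top-level JSON object
    problems = []
    for vocab in find_vocabs(doc.get("@context")):
        if vocab in BARE_NAMESPACES:
            # Term expansion is plain concatenation: @vocab + term.
            expanded = vocab + "Dataset"
            problems.append(
                f"@vocab {vocab!r} lacks a trailing slash; "
                f"'Dataset' would expand to {expanded!r}"
            )
    return problems


good = '{"@context": {"@vocab": "https://schema.org/"}, "@type": "Dataset"}'
bad = '{"@context": {"@vocab": "https://schema.org"}, "@type": "Dataset"}'
print(check_vocab(good))
print(check_vocab(bad))
```

Only documents that return an empty list would then be handed off for SHACL validation.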

@ashepherd ashepherd added the enhancement New feature or request label Dec 2, 2019
@datadavev
Collaborator

datadavev commented Dec 3, 2019

Here's a brute-force solution using SHACL that only tests for SO:Dataset. Basically, if it finds any node in a graph with a bad namespace, it fails validation.

I expect there's a more elegant way to do it:

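@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix d1: <http://ns.dataone.org/schema/SO#> .

# Each shape targets instances of a Dataset class IRI built from a bad
# namespace. Every such instance has an rdf:type, so the sh:not constraint
# always fails and the message is reported.

# @vocab "https://schema.org" (no trailing slash) expands "Dataset" to
# <https://schema.orgDataset>.
d1:DatasetBad1Shape
    a sh:NodeShape ;
    sh:targetClass <https://schema.orgDataset> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <https://schema.org>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .

# @vocab "http://schema.org/" (http instead of https).
d1:DatasetBad2Shape
    a sh:NodeShape ;
    sh:targetClass <http://schema.org/Dataset> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <http://schema.org/>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .

# @vocab "http://schema.org" (http and no trailing slash) expands "Dataset"
# to <http://schema.orgDataset>.
d1:DatasetBad3Shape
    a sh:NodeShape ;
    sh:targetClass <http://schema.orgDataset> ;
    sh:message "Expecting SO namespace of <https://schema.org/> not <http://schema.org>" ;
    sh:not [
        sh:path rdf:type ;
        sh:minCount 1 ;
    ] .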

Edit:
Here's a worked example using pyshacl:
https://so-tools.readthedocs.io/en/latest/test_namespace.html

@mbjones
Collaborator Author

mbjones commented Dec 3, 2019

That looks great @datadavev. @fils can you add this to Fence too? Where is the collection of definitive shapes we're using for validation?

@rduerr
Collaborator

rduerr commented Dec 3, 2019

+1 on @mbjones's comments

@danbri

danbri commented Dec 31, 2019

Would this complain if people used external extensions to make richer schema.org Dataset descriptions?

@datadavev
Collaborator

The SHACL looks for http://schema.orgDataset, https://schema.orgDataset, or http://schema.org/Dataset and complains if any of them is found. It is agnostic to all other constructs, so it will not complain about external extensions.
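
To make that behavior concrete, here is a plain-Python emulation of what the three shapes check. It is an illustration only, not the SHACL engine or the pyshacl API. The bad IRIs are the ones a JSON-LD processor produces by concatenating a bad namespace with "Dataset"; the function name and the example.org extension type are hypothetical.

```python
# Dataset class IRIs that result from a bad namespace, mapped to the
# namespace named in the corresponding sh:message.
BAD_DATASET_TYPES = {
    "https://schema.orgDataset": "<https://schema.org>",
    "http://schema.org/Dataset": "<http://schema.org/>",
    "http://schema.orgDataset": "<http://schema.org>",
}


def namespace_violations(type_iris):
    """Return one message per rdf:type IRI that matches a known-bad Dataset IRI."""
    return [
        f"Expecting SO namespace of <https://schema.org/> not {BAD_DATASET_TYPES[iri]}"
        for iri in type_iris
        if iri in BAD_DATASET_TYPES
    ]


# A correct Dataset type plus a hypothetical external extension type:
# neither matches a bad IRI, so nothing is reported.
extended = ["https://schema.org/Dataset", "https://example.org/ext/SampleCollection"]
print(namespace_violations(extended))

# A Dataset typed via the http namespace is reported.
print(namespace_violations(["http://schema.org/Dataset"]))
```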

@danbri

danbri commented Dec 31, 2019 via email

@mbjones mbjones added this to the v1.2 milestone Feb 28, 2020
@mbjones
Collaborator Author

mbjones commented Feb 28, 2020

Seems like this SHACL shape is ready to be added to the guidelines. Should this be in v1.1 or v1.2? For now I will put it in v1.2 to avoid delaying the 1.1 release, but feel free to move it up if you know how it should be incorporated @datadavev @fils

@fils
Collaborator

fils commented Mar 2, 2020

@mbjones I'm fine with 1.2 personally. There are some improvements to the way "recommendations" can be done in a SHACL shape.

Also, I'm doing some updates and will be talking with @datadavev about them today. So based on that we might make changes which would further support 1.2 as a target.

I'm also adding some points about frames, which I think will be important alongside shapes. So again, this gives us time to review and include that.

I've updated Fence at https://fence.gleaner.io/ as part of getting ready to chat with Dave.
I added framing as a test option. It can now pull the geospatial elements properly based on the current Science on Schema guidance. Other code then routes these into base geometries from which KML, GeoJSON, WKT, etc. can be pulled.

@mbjones mbjones linked a pull request Jan 22, 2021 that will close this issue
@mbjones
Collaborator Author

mbjones commented Jan 22, 2021

Looks like initial SHACL support is ready to go. Can you please merge the PR @datadavev if you agree?

@datadavev
Collaborator

@mbjones It's probably ok. My hesitation is that it should really be accompanied by a bunch of test cases since SHACL is sufficiently complicated that non-obvious errors and omissions may be present.

@mbjones
Collaborator Author

mbjones commented Jan 28, 2021

OK, I merged PR #103 with the initial SHACL support. While this is neither fully documented nor a complete service, it will likely be useful to many groups. The shape files do not fully specify a conformance suite for the guidelines, but they are a good start.

I think we should pick up the SHACL work for v1.3 to get agreement on the conformance shapes for various use cases, and provide documentation on usage. So, while PR #103 is closed in the v1.2 release, we should open new issues in v1.3 for the documentation and shape changes we want to see. Thanks @datadavev !

6 participants