RDF to ArangoDB (RPT) --------------------- `Virtualizng ArangoDB as a Triple Store` What is RPT? ============ **The RDF-topology Preserving Transformation (RPT) algorithm preserves the RDF graph structure by transforming each RDF statement into an edge in the Property Graph (PG).** Consider the following RDF Graph: .. code-block:: turtle @prefix ex: . @prefix xsd: . ex:book ex:publish_date "1963-03-22"^^xsd:date . ex:book ex:pages "100"^^xsd:integer . ex:book ex:cover 20 . ex:book ex:index 55 . RPT converts the triple ``(ex:book, ex:index, 55)`` into two nodes ``(ex:book)`` and ``(55)``, connected by an edge ``(ex:index)``. All other triples involving RDF resources, blank nodes, or literal values can be transformed in a similar way so that we obtain the Property Graph below: .. image:: _static/rpt.png :width: 300px :alt: RPT Graph The Algorithm below formalizes the RPT approach. For each triple, create a node for the subject (line 3) and the object (line 5), with an edge connecting them (line 12) - of course avoiding duplicate nodes for the same IRIs. .. image:: _static/rpt_algorithm.png :width: 500px :alt: RPT Algorithm Now, consider the following RDF-star Graph: .. code-block:: @prefix ex: . << ex:Mary ex:likes ex:Matt >> ex:certainty 0.5 . << ex:Mary ex:age 28 >> ex:certainty 1 . ArangoRDF's RPT transformation for RDF-star Graphs is slightly different from the transformation proposed in the paper. In order to preserve the concept of **virtualizing ArangoDB as a Triple Store**, conversion for RDF-star statements is identical to RDF triples. This is not the case for the RPT transformation proposed in the paper, which proposes to add the `ex:certainty 1` as an edge attribute to the edge connecting the nodes `ex:Mary` and `28`. Instead, ArangoRDF's RPT transformation expresses ``(..., ex:certainty, 1)`` as its own edge: .. image:: _static/rpt_star.png :width: 400px :alt: RPT Star Graph **Please Note:** The ``rdflib`` python package hasn't yet introduced support for `Quoted Triples `_, so ArangoRDF's support for RDF-star is based on `Triple Reification `_. As a result, the RDF-star Graph above can be processed with ArangoRDF as follows: .. code-block:: python from rdflib import Graph from arango import ArangoClient from arango_rdf import ArangoRDF data = """ @prefix ex: . @prefix rdf: . # << ex:Mary ex:likes ex:Matt >> ex:certainty 0.5 . # << ex:Mary ex:age 28 >> ex:certainty 1 . [] a rdf:Statement; rdf:subject ex:Mary; rdf:predicate ex:likes; rdf:object ex:Matt ; ex:certainty 0.5 . [] a rdf:Statement; rdf:subject ex:Mary; rdf:predicate ex:age; rdf:object 28 ; ex:certainty 1 . """ rdf_graph = Graph() rdf_graph.parse(data=data, format="turtle") db = ArangoClient().db() adbrdf = ArangoRDF(db) adbrdf.rdf_to_arangodb_by_rpt(name="DataRPT", rdf_graph=rdf_graph, overwrite_graph=True) RPT Collection Mapping Process ============================== The **RPT Collection Mapping Process** is defined as the algorithm used to map RDF Resources to ArangoDB Collections. In RPT, the ArangoDB Collections generated are consistent: 1. ``{Name}_URIRef``: The Vertex collection for ``rdflib.term.URIRef`` resources. 2. ``{Name}_BNode``: The Vertex collection for ``rdflib.term.BNode`` resources. 3. ``{Name}_Literal``: The Vertex collection for ``rdflib.term.Literal`` resources. 4. ``{Name}_Statement``: The Edge collection for all triples/quads. Using the python example from above, the RDF Resources of your RDF Graph would be stored under the following ArangoDB Collections: 1. DataRPT_URIRef - ``ex:Mary`` - ``ex:Matt`` 2. DataRPT_BNode - ``[]`` (1) - ``[]`` (2) 3. DataRPT_Literal - ``0.5`` - ``1`` 4. DataRPT_Statement - ``ex:Mary -> ex:likes -> ex:Matt`` - ``ex:Mary -> ex:age -> 28`` - ``(ex:Mary -> ex:likes -> ex:Matt) -> ex:certainty -> 0.5`` - ``(ex:Mary -> ex:age -> 28) -> ex:certainty -> 1`` This is the consistent naming scheme for all ArangoRDF RPT transformations. The name of the RDF Graph is used as a prefix for the 3 Vertex Collections and the 1 Edge Collection. Supported Cases =============== **Note**: RDF-to-ArangoDB functionality has been implemented using concepts described in the paper `Transforming RDF-star to Property Graphs: A Preliminary Analysis of Transformation Approaches `_. The paper presents a systematic list of test cases that transformation approaches need to fulfill. These test cases range from simple RDF Graphs to complex RDF-star Graphs. ArangoRDF's RPT interface can be observed `here <./specs.html#arango_rdf.main.ArangoRDF.rdf_to_arangodb_by_rpt>`_. `View how ArangoRDF's RPT transformation approach performs on these test cases in Colab `_. .. image:: _static/cases.png :width: 500px :alt: Cases