API Specification
This page contains the specification for all classes and methods available in ArangoRDF.
ArangoRDF
- class arango_rdf.main.ArangoRDF(db: arango.database.StandardDatabase, controller: arango_rdf.controller.ArangoRDFController = <arango_rdf.controller.ArangoRDFController object>, logging_lvl: str | int = 20, rdf_attribute_prefix: str = '_', insert_async: bool = False, enable_pgt_cache: bool = False)
ArangoRDF: Transform RDF Graphs into ArangoDB Graphs & vice-versa.
Implemented using concepts described in https://arxiv.org/abs/2210.05781.
- Parameters:
db (arango.database.Database) – A python-arango database instance
logging_lvl (str | int) – Defaults to logging.INFO. Other useful options are logging.DEBUG (more verbose), and logging.WARNING (less verbose).
rdf_attribute_prefix (str) – The prefix for RDF attributes (e.g., _uri, _value, _rdftype). Defaults to “_”. NOTE: With an underscore prefix, these attributes are treated as ArangoDB system attributes. “$” is an alternative, non-system prefix.
insert_async (bool) – If True, will insert documents asynchronously. Defaults to False.
enable_pgt_cache (bool) – If True, will enable the PGT term metadata cache to avoid repeated computations. Defaults to False. The cache offers little benefit when terms are rarely repeated in the RDF graph.
- Raises:
TypeError – On invalid parameter types
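As a rough illustration of the constructor options above, the sketch below collects them in a dictionary and defers creating the ArangoRDF object to a helper. The helper name make_adbrdf and the chosen values are assumptions made for illustration; db is expected to be an existing python-arango database handle.

```python
import logging

# Illustrative option values (assumptions, not recommendations).
ADBRDF_OPTIONS = {
    "logging_lvl": logging.DEBUG,  # more verbose than the default logging.INFO (20)
    "rdf_attribute_prefix": "$",   # avoids the system-attribute prefix "_"
    "insert_async": False,
    "enable_pgt_cache": True,
}


def make_adbrdf(db):
    """Hypothetical helper: build an ArangoRDF instance from a database handle."""
    # Imported lazily so this module can be loaded without the package installed.
    from arango_rdf import ArangoRDF

    return ArangoRDF(db, **ADBRDF_OPTIONS)
```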
- arangodb_to_rdf(name: str, rdf_graph: Graph, metagraph: Dict[str, Dict[str, Set[str]]], explicit_metagraph: bool = True, list_conversion_mode: str = 'static', dict_conversion_mode: str = 'static', infer_type_from_adb_v_col: bool = False, include_adb_v_col_statements: bool = False, include_adb_v_key_statements: bool = False, include_adb_e_key_statements: bool = False, namespace_collection_name: str | None = None, ignored_attributes: Set[str] | None = None, **adb_export_kwargs: Any) -> Graph
Create an RDF Graph from an ArangoDB Graph via its Metagraph.
- Parameters:
name (str) – The name of the ArangoDB Graph
rdf_graph (rdflib.graph.Graph) – The target RDF Graph to insert into.
metagraph (arango_rdf.typings.ADBMetagraph) – A dictionary of dictionaries defining the ArangoDB Vertex & Edge Collections whose entries will be inserted into the RDF Graph.
explicit_metagraph (bool) – If True, only the document attributes specified in metagraph are kept when exporting to RDF. Otherwise, all document attributes are included. Defaults to True.
list_conversion_mode (str) – Specify how ArangoDB JSON lists within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON lists will be serialized into RDF Literals. If “collection”, ArangoDB lists will be processed using the RDF Collection structure. If “container”, ArangoDB lists will be processed using the RDF Container structure. If “static”, elements within lists will be processed as individual statements. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
dict_conversion_mode (str) – Specify how ArangoDB JSON Objects within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON Objects will be serialized into RDF Literals. If “static”, elements within dictionaries will be processed as individual statements with the help of BNodes. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
infer_type_from_adb_v_col (bool) – Specify whether rdf:type statements of the form resource rdf:type adb_v_col . should be inferred upon transferring ArangoDB Vertices into RDF.
include_adb_v_col_statements (bool) – Specify whether adb:collection statements of the form adb_vertex adb:collection adb_v_col . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document collections when a user is interested in round-tripping.
include_adb_v_key_statements (bool) – Specify whether adb:key statements of the form adb_vertex adb:key adb_vertex[“key”] . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document keys when a user is interested in round-tripping.
include_adb_e_key_statements (bool) – Specify whether adb:key statements of the form adb_edge adb:key adb_edge[“key”] . should be generated upon transferring ArangoDB Edges into RDF. This can be used to maintain edge keys when a user is interested in round-tripping. NOTE: Enabling this option will impose Triple Reification on all ArangoDB Edges.
namespace_collection_name (str | None) – The name of the ArangoDB Collection to store the namespace prefixes of rdf_graph. Useful for re-constructing the original RDF Graph from the ArangoDB Graph. Defaults to None, which means that the namespace prefixes will not be stored.
ignored_attributes (Set[str] | None) – The set of ArangoDB Document attributes to ignore when transferring ArangoDB Documents into RDF. Defaults to None, which means that all attributes will be transferred. Cannot be used if explicit_metagraph is True.
adb_export_kwargs (Any) – Keyword arguments to specify AQL query options when fetching documents from the ArangoDB instance. Full parameter list: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute
- Returns:
The RDF representation of the ArangoDB Graph.
- Return type:
rdflib.graph.Graph
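To make the metagraph shape (Dict[str, Dict[str, Set[str]]]) more concrete, here is a minimal sketch. The top-level keys "vertexCollections" and "edgeCollections" are an assumption about the ADBMetagraph layout, and the collection and attribute names are hypothetical; the checker function is only an illustration and not part of ArangoRDF.

```python
from typing import Dict, Set

# Assumed layout: each top-level entry maps a collection name to the set of
# attribute names to export. All names below are hypothetical.
metagraph: Dict[str, Dict[str, Set[str]]] = {
    "vertexCollections": {"Person": {"name", "age"}},
    "edgeCollections": {"knows": {"since"}},
}


def is_metagraph_shaped(value: object) -> bool:
    """Return True if value is a dict of dicts of string sets."""
    if not isinstance(value, dict):
        return False
    for collections in value.values():
        if not isinstance(collections, dict):
            return False
        for attributes in collections.values():
            if not isinstance(attributes, set):
                return False
            if not all(isinstance(a, str) for a in attributes):
                return False
    return True
```

A dictionary of this shape would be passed as the metagraph argument; with explicit_metagraph=True, only the listed attributes are exported.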
- arangodb_collections_to_rdf(name: str, rdf_graph: Graph, v_cols: Set[str], e_cols: Set[str], list_conversion_mode: str = 'static', dict_conversion_mode: str = 'static', infer_type_from_adb_v_col: bool = False, include_adb_v_col_statements: bool = False, include_adb_v_key_statements: bool = False, include_adb_e_key_statements: bool = False, namespace_collection_name: str | None = None, ignored_attributes: Set[str] | None = None, **adb_export_kwargs: Any) -> Graph
Create an RDF Graph from an ArangoDB Graph via its Collection Names.
- Parameters:
name (str) – The name of the ArangoDB Graph
rdf_graph (rdflib.graph.Graph) – The target RDF Graph to insert into.
v_cols (Set[str]) – The set of ArangoDB Vertex Collections to import to RDF.
e_cols (Set[str]) – The set of ArangoDB Edge Collections to import to RDF.
list_conversion_mode (str) – Specify how ArangoDB JSON lists within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON lists will be serialized into RDF Literals. If “collection”, ArangoDB lists will be processed using the RDF Collection structure. If “container”, ArangoDB lists will be processed using the RDF Container structure. If “static”, elements within lists will be processed as individual statements. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
dict_conversion_mode (str) – Specify how ArangoDB JSON Objects within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON Objects will be serialized into RDF Literals. If “static”, elements within dictionaries will be processed as individual statements with the help of BNodes. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
infer_type_from_adb_v_col (bool) – Specify whether rdf:type statements of the form resource rdf:type adb_v_col . should be inferred upon transferring ArangoDB Vertices into RDF.
include_adb_v_col_statements (bool) – Specify whether adb:collection statements of the form adb_vertex adb:collection adb_v_col . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document collections when a user is interested in round-tripping.
include_adb_v_key_statements (bool) – Specify whether adb:key statements of the form adb_vertex adb:key adb_vertex[“key”] . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document keys when a user is interested in round-tripping.
include_adb_e_key_statements (bool) – Specify whether adb:key statements of the form adb_edge adb:key adb_edge[“key”] . should be generated upon transferring ArangoDB Edges into RDF. This can be used to maintain edge keys when a user is interested in round-tripping. NOTE: Enabling this option will impose Triple Reification on all ArangoDB Edges.
namespace_collection_name (str | None) – The name of the ArangoDB Collection to store the namespace prefixes of rdf_graph. Useful for re-constructing the original RDF Graph from the ArangoDB Graph. Defaults to None, which means that the namespace prefixes will not be stored.
ignored_attributes (Set[str] | None) – The set of ArangoDB Document attributes to ignore when transferring ArangoDB Documents into RDF. Defaults to None, which means that all attributes will be transferred.
adb_export_kwargs (Any) – Keyword arguments to specify AQL query options when fetching documents from the ArangoDB instance. Full parameter list: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute
- Returns:
The RDF representation of the ArangoDB Graph.
- Return type:
rdflib.graph.Graph
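The difference between the “static” and “serialize” list conversion modes can be pictured with a small in-memory sketch. This is a conceptual illustration only: statements are modelled as plain tuples rather than rdflib terms, the function name is hypothetical, and the “collection” and “container” modes are left out.

```python
import json
from typing import Any, List, Tuple

Statement = Tuple[str, str, Any]


def list_to_statements(
    subject: str, predicate: str, values: List[Any], mode: str = "static"
) -> List[Statement]:
    """Conceptual model of how a JSON list attribute becomes RDF statements."""
    if mode == "static":
        # One statement per list element.
        return [(subject, predicate, value) for value in values]
    if mode == "serialize":
        # A single statement whose object is the JSON-encoded list.
        return [(subject, predicate, json.dumps(values))]
    raise ValueError(f"mode not covered by this sketch: {mode}")


static = list_to_statements("bob", "nickname", ["Bobby", "Rob"])
serialized = list_to_statements("bob", "nickname", ["Bobby", "Rob"], "serialize")
```

In the serialized form the list survives as one value, which is why that mode is the one suggested for round-tripping.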
- arangodb_graph_to_rdf(name: str, rdf_graph: Graph, list_conversion_mode: str = 'static', dict_conversion_mode: str = 'static', infer_type_from_adb_v_col: bool = False, include_adb_v_col_statements: bool = False, include_adb_v_key_statements: bool = False, include_adb_e_key_statements: bool = False, namespace_collection_name: str | None = None, ignored_attributes: Set[str] | None = None, **adb_export_kwargs: Any) -> Graph
Create an RDF Graph from an ArangoDB Graph via its Graph Name.
- Parameters:
name (str) – The name of the ArangoDB Graph
rdf_graph (rdflib.graph.Graph) – The target RDF Graph to insert into.
list_conversion_mode (str) – Specify how ArangoDB JSON lists within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON lists will be serialized into RDF Literals. If “collection”, ArangoDB lists will be processed using the RDF Collection structure. If “container”, ArangoDB lists will be processed using the RDF Container structure. If “static”, elements within lists will be processed as individual statements. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
dict_conversion_mode (str) – Specify how ArangoDB JSON Objects within an ArangoDB Document are processed into the RDF Graph. If “serialize”, JSON Objects will be serialized into RDF Literals. If “static”, elements within dictionaries will be processed as individual statements with the help of BNodes. Defaults to “static”. NOTE: “serialize” is recommended if round-tripping is desired, but only if round-tripping via PGT.
infer_type_from_adb_v_col (bool) – Specify whether rdf:type statements of the form resource rdf:type adb_v_col . should be inferred upon transferring ArangoDB Vertices into RDF.
include_adb_v_col_statements (bool) – Specify whether adb:collection statements of the form adb_vertex adb:collection adb_v_col . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document collections when a user is interested in round-tripping.
include_adb_v_key_statements (bool) – Specify whether adb:key statements of the form adb_vertex adb:key adb_vertex[“key”] . should be generated upon transferring ArangoDB Documents into RDF. This can be used to maintain document keys when a user is interested in round-tripping.
include_adb_e_key_statements (bool) – Specify whether adb:key statements of the form adb_edge adb:key adb_edge[“key”] . should be generated upon transferring ArangoDB Edges into RDF. This can be used to maintain edge keys when a user is interested in round-tripping. NOTE: Enabling this option will impose Triple Reification on all ArangoDB Edges.
namespace_collection_name (str | None) – The name of the ArangoDB Collection to store the namespace prefixes of rdf_graph. Useful for re-constructing the original RDF Graph from the ArangoDB Graph. Defaults to None, which means that the namespace prefixes will not be stored.
ignored_attributes (Set[str] | None) – The set of ArangoDB Document attributes to ignore when transferring ArangoDB Documents into RDF. Defaults to None, which means that all attributes will be transferred.
adb_export_kwargs (Any) – Keyword arguments to specify AQL query options when fetching documents from the ArangoDB instance. Full parameter list: https://docs.python-arango.com/en/main/specs.html#arango.aql.AQL.execute
- Returns:
The RDF representation of the ArangoDB Graph.
- Return type:
rdflib.graph.Graph
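One possible way to combine the round-tripping options described above is sketched below. The constant and helper names are hypothetical, and the option values simply follow the parameter notes (“serialize” modes plus the collection and key statements); whether they suit a given graph is an assumption to verify. The caller supplies both the ArangoRDF instance and the target rdflib graph.

```python
# Hypothetical bundle of options aimed at round-tripping via PGT.
ROUND_TRIP_OPTIONS = {
    "list_conversion_mode": "serialize",
    "dict_conversion_mode": "serialize",
    "include_adb_v_col_statements": True,  # keep document collections
    "include_adb_v_key_statements": True,  # keep document keys
}


def export_for_round_trip(adbrdf, name, rdf_graph):
    """Hypothetical helper: export an ArangoDB graph with round-trip options."""
    return adbrdf.arangodb_graph_to_rdf(name, rdf_graph, **ROUND_TRIP_OPTIONS)
```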
- rdf_to_arangodb_by_rpt(name: str, rdf_graph: Graph, contextualize_graph: bool = False, flatten_reified_triples: bool = True, use_hashed_literals_as_keys: bool = True, overwrite_graph: bool = False, batch_size: int | None = None, **adb_import_kwargs: Any) -> Graph
Create an ArangoDB Graph from an RDF Graph using the RDF-topology Preserving Transformation (RPT) Algorithm.
RPT preserves the RDF Graph structure by transforming each RDF statement into a Property Graph Edge. More info on RPT can be found in the package’s README file, or in the following paper: https://arxiv.org/pdf/2210.05781.pdf.
This method will store the RDF Resources of rdf_graph under the following ArangoDB Collections:
{Name}_URIRef: Vertex collection for rdflib.term.URIRef resources.
{Name}_BNode: Vertex collection for rdflib.term.BNode resources.
{Name}_Literal: Vertex collection for rdflib.term.Literal resources.
{Name}_Statement: Edge collection for all triples/quads.
- Parameters:
name (str) – The name of the RDF Graph
rdf_graph – The RDF Graph object. NOTE: This object is modified in-place during the transformation. Do not expect the original state of rdf_graph to be preserved.
contextualize_graph (bool) –
A work-in-progress flag that seeks to enhance the Terminology Box of rdf_graph by providing the following features:
Loading Meta Ontologies (e.g., OWL, RDF, RDFS) into the RDF Graph
Providing Domain & Range Inference
Providing Domain & Range Introspection
flatten_reified_triples (bool) – If set to False, will preserve the RDF structure of reified triples. If set to True, will convert any reified triple into a “regular” Property Graph Edge. Defaults to True.
use_hashed_literals_as_keys (bool) – If set to False, will not use the hashed value of an RDF Literal as its ArangoDB Document Key (i.e., a randomly generated key will be used instead). If set to True, all RDF Literals with the same value will be represented as one single ArangoDB Document. Defaults to True.
overwrite_graph (bool) – Overwrites the ArangoDB graph identified by name if it already exists, and drops its associated collections. Defaults to False.
batch_size (int | None) – If specified, runs the ArangoDB Data Ingestion process for every batch_size RDF triples/quads within rdf_graph. Defaults to len(rdf_graph).
adb_import_kwargs (Any) – Keyword arguments to specify additional parameters for ArangoDB document insertion. Full parameter list: https://docs.python-arango.com/en/main/specs.html#arango.collection.Collection.insert_many
- Type:
rdf_graph: rdflib.graph.Graph
- Returns:
The ArangoDB Graph API wrapper.
- Return type:
arango.graph.Graph
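The idea behind RPT, with every statement becoming an edge between vertices stored in the four collections listed above, can be sketched in memory as follows. This is not the library's implementation: the Term type, the "value" and "predicate" field names, and the use of MD5 for keys are stand-ins chosen for illustration.

```python
import hashlib
from typing import Dict, List, NamedTuple, Tuple


class Term(NamedTuple):
    kind: str   # "URIRef", "BNode" or "Literal"
    value: str


def _key(term: Term) -> str:
    # Stand-in hash for illustration; ArangoRDF documents FarmHash instead.
    return hashlib.md5(term.value.encode("utf-8")).hexdigest()


def rpt_sketch(name: str, triples: List[Tuple[Term, Term, Term]]):
    """Map each triple to one edge; subjects and objects become vertices."""
    vertices: Dict[str, Dict[str, dict]] = {}
    edges: List[dict] = []
    for s, p, o in triples:
        for term in (s, o):
            collection = f"{name}_{term.kind}"
            vertices.setdefault(collection, {})[_key(term)] = {"value": term.value}
        edges.append(
            {
                "_from": f"{name}_{s.kind}/{_key(s)}",
                "_to": f"{name}_{o.kind}/{_key(o)}",
                "predicate": p.value,  # hypothetical attribute name
            }
        )
    return vertices, {f"{name}_Statement": edges}
```

Because keys are derived from the term value, two literals with the same value collapse into one vertex, which mirrors the behaviour described for use_hashed_literals_as_keys=True.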
- rdf_to_arangodb_by_pgt(name: str, rdf_graph: Graph, adb_col_statements: Graph | None = None, write_adb_col_statements: bool = True, contextualize_graph: bool = False, flatten_reified_triples: bool = True, overwrite_graph: bool = False, batch_size: int | None = None, namespace_collection_name: str | None = None, uri_map_collection_name: str | None = None, resource_collection_name: str | None = None, predicate_collection_name: str | None = None, **adb_import_kwargs: Any) -> Graph
Create an ArangoDB Graph from an RDF Graph using the Property Graph Transformation (PGT) Algorithm.
PGT ensures that datatype property statements (i.e statements whose objects are Literals) are mapped to document properties in the Property Graph. Learn more about PGT here.
Contrary to RPT, this method will rely on the nature of the RDF Resource/Statement to determine which ArangoDB Collection it belongs to. This process is referred to as the ArangoDB Collection Mapping Process. Learn more about the PGT ArangoDB Collection Mapping Process here.
Contrary to RPT, regardless of whether contextualize_graph is set to True or not, all RDF Predicates within every RDF Statement in rdf_graph will be processed as their own ArangoDB Document, and will be stored under the “Property” Vertex Collection.
- Parameters:
name (str) – The name of the RDF Graph
rdf_graph – The RDF Graph object. NOTE: This object is modified in-place in order for PGT to work. Do not expect the original state of rdf_graph to be preserved.
adb_col_statements (rdflib.graph.Graph | None) – An optional RDF Graph containing ArangoDB Collection statements of the form adb_vertex http://arangodb/collection “adb_v_col” .. Useful for creating a custom ArangoDB Collection mapping of RDF Resources within rdf_graph. Defaults to None. NOTE: Cannot be used in conjunction with collection statements in rdf_graph.
write_adb_col_statements (bool) – Run the ArangoDB Collection Mapping Process for rdf_graph to write the ArangoDB Collection statements of the form adb_vertex http://arangodb/collection “adb_v_col” . into adb_col_statements. This parameter is ignored if contextualize_graph is set to True, as the ArangoDB Collection Mapping Process is required for Graph Contextualization. See write_adb_col_statements() for more information. Defaults to True.
contextualize_graph (bool) –
A work-in-progress flag that seeks to enhance the Terminology Box of rdf_graph by providing the following features:
Loading Meta Ontologies (e.g., OWL, RDF, RDFS) into the RDF Graph
Providing Domain & Range Inference
Providing Domain & Range Introspection
flatten_reified_triples (bool) –
If set to False, will preserve the RDF structure of any Reified Triple. If set to True, will “flatten” any reified triples into a regular Property Graph Edge. Defaults to True.
overwrite_graph (bool) – Overwrites the ArangoDB graph identified by name if it already exists, and drops its associated collections. Defaults to False.
batch_size (int | None) – If specified, runs the ArangoDB Data Ingestion process for every batch_size RDF triples/quads within rdf_graph. Defaults to None.
namespace_collection_name (str | None) – The name of the ArangoDB Collection to store the namespace prefixes of rdf_graph. Useful for re-constructing the original RDF Graph from the ArangoDB Graph. Defaults to None, which means that the namespace prefixes will not be stored. Not included in the ArangoDB Graph Edge Definitions.
uri_map_collection_name (str | None) – If specified, in addition to storing the URIs of rdf_graph in their respective collection, the URIs will also be stored in the specified ArangoDB Collection to map to the collection name they correspond to. This could be then used for multi-file imports, allowing ArangoRDF to check if the URIs of rdf_graph have already been imported into the ArangoDB Graph to avoid going through the ArangoDB Collection Mapping Process (for that URI) again. Not included in the ArangoDB Graph Edge Definitions. Cannot be used in conjunction with resource_collection_name.
resource_collection_name (str | None) – If specified, will use this name as the ArangoDB Collection to store all RDF Resources, except Class and Property. This is useful for cases where you want to combine both RPT and PGT behavior, where rdf:type statements are stored as both edges and optionally as a property (i.e _types list), but not used for the ArangoDB Collection Mapping Process. Defaults to None. Cannot be used in conjunction with uri_map_collection_name.
predicate_collection_name (str | None) – If specified, will use this name as the ArangoDB Collection to store all Edges. This is useful for cases where you want to combine both RPT and PGT behavior, where the predicate label is not used as the ArangoDB Collection name, but rather as a property of the Edge. Defaults to None.
adb_import_kwargs (Any) – Keyword arguments to specify additional parameters for the ArangoDB Data Ingestion process. The full parameter list is here.
- Type:
rdf_graph: rdflib.graph.Graph
- Returns:
The ArangoDB Graph API wrapper.
- Return type:
arango.graph.Graph
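The core PGT rule, that statements with Literal objects become document properties while the remaining statements become edges, can be pictured with the following in-memory sketch. It deliberately omits the Collection Mapping Process and the “Property” collection; the tuple layout and the "label" field are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# (subject, predicate, object, object_is_literal)
Statement = Tuple[str, str, Any, bool]


def pgt_sketch(statements: List[Statement]):
    """Literal-valued statements become properties; the rest become edges."""
    documents: Dict[str, Dict[str, Any]] = {}
    edges: List[Dict[str, str]] = []
    for subject, predicate, obj, obj_is_literal in statements:
        document = documents.setdefault(subject, {})
        if obj_is_literal:
            document[predicate] = obj        # datatype property -> attribute
        else:
            documents.setdefault(obj, {})    # object is a resource of its own
            edges.append({"_from": subject, "_to": obj, "label": predicate})
    return documents, edges


documents, edges = pgt_sketch(
    [("Bob", "name", "Bob Smith", True), ("Bob", "knows", "Alice", False)]
)
```

Compared with RPT, the literal "Bob Smith" never becomes a vertex here; it lives on the document for Bob.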
- write_adb_col_statements(rdf_graph: Graph, adb_col_statements: Graph | None = None, uri_map_collection_name: str | None = None) -> Graph
RDF -> ArangoDB (PGT): Run the ArangoDB Collection Mapping Process for rdf_graph to map RDF Resources to their respective ArangoDB Collection.
The PGT Algorithm relies on the ArangoDB Collection Mapping Process to identify the ArangoDB Collection of every RDF Resource. Using this method prior to running rdf_to_arangodb_by_pgt() allows you to visualize and modify the mapping. Learn more about the PGT ArangoDB Collection Mapping Process here.
NOTE: Running this method prior to rdf_to_arangodb_by_pgt() is unnecessary if the user is not interested in viewing/modifying the ArangoDB Mapping.
NOTE: There can only be 1 adb:collection statement associated to each RDF Resource.
- Parameters:
rdf_graph (rdflib.graph.Graph) – The RDF Graph object.
adb_col_statements (Optional[rdflib.graph.Graph]) – An existing RDF Graph containing adb:collection statements. If not provided, a new RDF Graph will be created. Defaults to None. NOTE: The ArangoDB Collection Mapping Process relies heavily on mapping certain RDF Resources to the “Class” and “Property” ArangoDB Collections. Therefore, it is currently not possible to overwrite any RDF Resources that belong to these collections.
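The review-then-import workflow described above might be wired together as in the sketch below. The helper name and the review callback are hypothetical; only the two documented methods and the adb_col_statements parameter are used, and all other PGT options are left at their defaults.

```python
def pgt_with_reviewed_mapping(adbrdf, name, rdf_graph, review):
    """Hypothetical helper: inspect or adjust the mapping before running PGT.

    review is a caller-supplied function that receives the graph of
    adb:collection statements and may modify it in place.
    """
    # Step 1: compute the RDF Resource -> ArangoDB Collection mapping.
    adb_col_statements = adbrdf.write_adb_col_statements(rdf_graph)

    # Step 2: let the caller view or modify the mapping.
    review(adb_col_statements)

    # Step 3: run PGT with the reviewed mapping.
    return adbrdf.rdf_to_arangodb_by_pgt(
        name, rdf_graph, adb_col_statements=adb_col_statements
    )
```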
- migrate_unknown_resources(graph_name: str, uri_map_collection_name: str, **kwargs: Any) -> Tuple[int, int]
RDF -> ArangoDB (PGT): Migrate all UnknownResource statements to their respective ArangoDB Collection.
NOTE: This method is only available if the user has passed a value to the uri_map_collection_name parameter of the rdf_to_arangodb_by_pgt() method.
This method will migrate all UnknownResource statements to their respective ArangoDB Collection based on whether the same RDF Resource exists in the uri_map_collection_name collection.
It is recommended to run this method after rdf_to_arangodb_by_pgt() if the user is not interested in maintaining the UnknownResource statements.
- Parameters:
graph_name (str) – The name of the graph to migrate the Unknown Resources from.
uri_map_collection_name (str) – The name of the URI collection to migrate the Unknown Resources to.
kwargs (Any) – Keyword arguments passed to the AQL Query execution.
- Returns:
The number of Unknown Resources migrated and the number of edges updated.
- Return type:
Tuple[int, int]
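A multi-file import followed by this migration step could look roughly like the sketch below. The helper name and the collection name "URI_MAP" are hypothetical; the sketch assumes every file is imported into the same graph with the same uri_map_collection_name.

```python
def import_in_parts(adbrdf, name, rdf_graphs, uri_map_collection_name="URI_MAP"):
    """Hypothetical helper: import several RDF graphs, then migrate leftovers."""
    for rdf_graph in rdf_graphs:
        # Each import records its URIs in the shared URI map collection.
        adbrdf.rdf_to_arangodb_by_pgt(
            name, rdf_graph, uri_map_collection_name=uri_map_collection_name
        )

    # Returns (unknown resources migrated, edges updated).
    return adbrdf.migrate_unknown_resources(name, uri_map_collection_name)
```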
- migrate_edges_to_attributes(graph_name: str, edge_collection_name: str, attribute_name: str | None = None, edge_direction: str = 'OUTBOUND', sort_clause: str | None = None, return_clause: str | None = None, filter_clause: str | None = None) -> int
RDF -> ArangoDB (PGT): Migrate all edges in the specified edge collection to attributes. This method is useful when combined with the resource_collection_name parameter of the rdf_to_arangodb_by_pgt() method.
NOTE: It is recommended to run this method with edge_collection_name set to “type” after rdf_to_arangodb_by_pgt() if the user has set the resource_collection_name parameter.
- Parameters:
graph_name (str) – The name of the graph to migrate the edges from.
edge_collection_name (str) – The name of the edge collection to migrate.
attribute_name (Optional[str]) – The name of the attribute to migrate the edges to. Defaults to edge_collection_name, prefixed with the rdf_attribute_prefix parameter set in the constructor.
edge_direction (str) – The direction of the edges to migrate. Defaults to OUTBOUND.
sort_clause (Optional[str]) – A SORT statement to order the traversed vertices. Defaults to f"v.{self.__rdf_attribute_prefix}label". If set to None, the vertex values will be ordered based on their traversal order.
return_clause (str) – A RETURN statement to return the specific value to add as an attribute from the traversed vertices. Defaults to f"v.{self.__rdf_attribute_prefix}label". Another option can be f"v.{self.__rdf_attribute_prefix}uri".
filter_clause (Optional[str]) – A FILTER statement to filter the traversed edges & target vertices. Defaults to None.
- Returns:
The number of documents updated.
- Return type:
int
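For intuition, the effect of migrating edges to attributes can be modelled in memory as below. The real method operates on ArangoDB collections; this sketch only covers the OUTBOUND direction, always sorts by the target label, and uses hypothetical document keys and a "_label" field.

```python
from typing import Dict, List


def edges_to_attribute(
    documents: Dict[str, dict], edges: List[dict], attribute_name: str
) -> int:
    """Store the labels of each document's outbound neighbours as a list."""
    updated = 0
    for key, document in documents.items():
        labels = sorted(
            documents[edge["_to"]]["_label"]
            for edge in edges
            if edge["_from"] == key
        )
        if labels:
            document[attribute_name] = labels
            updated += 1
    return updated  # number of documents updated


docs = {
    "bob": {"_label": "Bob"},
    "person": {"_label": "Person"},
    "human": {"_label": "Human"},
}
type_edges = [{"_from": "bob", "_to": "person"}, {"_from": "bob", "_to": "human"}]
updated_count = edges_to_attribute(docs, type_edges, "_type")
```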
- rdf_to_arangodb_by_lpg(name: str, rdf_graph: Graph, resource_collection_name: str = 'Node', predicate_collection_name: str = 'Edge', **pgt_kwargs: Any) -> Graph
RDF -> ArangoDB (LPG): Convert an RDF Graph into an ArangoDB Graph using the Labeled Property Graph (LPG) model.
NOTE: It is highly recommended to use the migrate_edges_to_attributes() method after this function to apply the RDF type statements as attributes to the ArangoDB Documents in order to follow the LPG model.

    from arango_rdf import ArangoRDF

    adbrdf = ArangoRDF(db)
    adbrdf.rdf_to_arangodb_by_lpg("Test", rdf_graph)

    # Traverse all edges in the "Edge" collection labeled as "type",
    # and apply the RDF type statements as a list of strings to the
    # ArangoDB Documents.
    adbrdf.migrate_edges_to_attributes(
        "Test", "Edge", "_type", filter_clause="e._label == 'type'"
    )

This function is just a wrapper around the rdf_to_arangodb_by_pgt() method, but with the following differences:
Parameter resource_collection_name is required, defaults to “Node”
Parameter predicate_collection_name is required, defaults to “Edge”
- Parameters:
name (str) – The name of the ArangoDB Graph.
rdf_graph (RDFGraph) – The RDF Graph to convert.
resource_collection_name (str) – The name of the ArangoDB Collection to store the RDF Resources in.
predicate_collection_name (str) – The name of the ArangoDB Collection to store the RDF Predicates in.
pgt_kwargs (Any) – Keyword arguments to pass to the rdf_to_arangodb_by_pgt() method.
- Returns:
The ArangoDB Graph.
- Return type:
arango.graph.Graph
- rdf_id_to_adb_key(rdf_id: str, rdf_term: URIRef | BNode | Literal | None = None) -> str
RDF -> ArangoDB: Convert an RDF Resource ID string into an ArangoDB Key via some hashing function.
If rdf_term is provided, then the value of the statement rdf_term adb:key “<ArangoDB Document Key>” . will be used as the ArangoDB Key (assuming that said statement exists).
Current hashing function used: FarmHash
- Parameters:
rdf_id (str) – The string representation of an RDF Resource
rdf_term (Optional[URIRef | BNode | Literal]) – The optional RDF Term to check if it has an adb:key statement associated to it.
- Returns:
The ArangoDB _key equivalent of rdf_id
- Return type:
str
- hash(rdf_id: str) -> str
RDF -> ArangoDB: Hash an RDF Resource ID string into an ArangoDB Key via some hashing function.
Current hashing function used: FarmHash
List of hashing functions tested & benchmarked:
Built-in hash() function
Hashlib MD5
xxHash
MurmurHash
CityHash
FarmHash
- Parameters:
rdf_id (str) – The string representation of an RDF Resource
- Returns:
The ArangoDB _key equivalent of rdf_id
- Return type:
str
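The general idea of deriving a document key from an RDF Resource ID is sketched below. FarmHash is not in the Python standard library, so this sketch swaps in hashlib's MD5 (one of the benchmarked alternatives listed above); the keys it produces will therefore differ from those ArangoRDF generates.

```python
import hashlib


def hash_rdf_id(rdf_id: str) -> str:
    """Stand-in for the key hash: same input string, same key, every time."""
    return hashlib.md5(rdf_id.encode("utf-8")).hexdigest()


key_a = hash_rdf_id("http://example.com/Bob")
key_b = hash_rdf_id("http://example.com/Bob")
key_c = hash_rdf_id("http://example.com/Alice")
```

Determinism is the property that matters here: the same URI always maps to the same key, so repeated imports address the same document.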
- rdf_id_to_adb_label(rdf_id: str) -> str
RDF -> ArangoDB: Return the suffix of an RDF URI.
The suffix can (1) be used as an ArangoDB Collection name, or (2) be used as the _label property value for an ArangoDB Document.
For example:
http://example.com/Person -> “Person”
http://example.com/Person#Bob -> “Bob”
http://example.com/Person:Bob -> “Bob”
- Parameters:
rdf_id (str) – The string representation of a URIRef
- Returns:
The suffix of the RDF URI string
- Return type:
str
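The three documented examples are consistent with splitting the URI on "/", "#" and ":" and keeping the last piece. The sketch below reimplements that behaviour under this assumption; it is not the library's code, and the fallback for an empty suffix is a choice made here.

```python
import re


def uri_suffix(rdf_id: str) -> str:
    """Return the text after the last '/', '#' or ':' in rdf_id."""
    suffix = re.split(r"[/#:]", rdf_id)[-1]
    # Fall back to the full string if the URI ends with a delimiter.
    return suffix or rdf_id
```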
- extract_adb_col_statements(rdf_graph: Graph, keep_adb_col_statements_in_rdf_graph: bool = False) -> Graph
ArangoDB <-> RDF: Extracts adb:collection statements from an RDF Graph.
- Parameters:
rdf_graph (rdflib.graph.Graph) – The RDF Graph to extract the statements from.
keep_adb_col_statements_in_rdf_graph (bool) – Keeps the ArangoDB Collection statements in the original graph once extracted. Defaults to False.
- Returns:
The ArangoDB Collection Mapping graph.
- Return type:
rdflib.graph.Graph
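Conceptually, extraction partitions the statements by predicate. The sketch below models this with plain tuples instead of an rdflib graph; the function name is hypothetical, and the predicate URI is taken from the adb_col_statements description of rdf_to_arangodb_by_pgt().

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]
ADB_COLLECTION = "http://arangodb/collection"


def extract_statements(
    triples: List[Triple], predicate: str, keep: bool = False
) -> List[Triple]:
    """Return the triples using predicate; drop them from triples unless keep."""
    extracted = [t for t in triples if t[1] == predicate]
    if not keep:
        # Mirror the default behaviour: remove extracted statements in place.
        triples[:] = [t for t in triples if t[1] != predicate]
    return extracted


graph = [
    ("http://example.com/Bob", ADB_COLLECTION, "Person"),
    ("http://example.com/Bob", "http://example.com/name", "Bob"),
]
mapping = extract_statements(graph, ADB_COLLECTION)
```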
- extract_adb_key_statements(rdf_graph: Graph, keep_adb_key_statements_in_rdf_graph: bool = False) -> Graph
ArangoDB <-> RDF: Extracts the adb:key statements from an RDF Graph.
- Parameters:
rdf_graph (rdflib.graph.Graph) – The RDF Graph to extract the statements from.
keep_adb_key_statements_in_rdf_graph (bool) – Keeps the adb:key statements in the original graph once extracted. Defaults to False.
- Returns:
A graph containing the extracted adb:key statements.
- Return type:
rdflib.graph.Graph
ArangoRDFController
- class arango_rdf.controller.ArangoRDFController
Controller used in RDF-to-ArangoDB (PGT).
Responsible for handling how the ArangoDB Collection Mapping Process identifies the “ideal RDFS Class” among a selection of RDFS Classes for a given RDF Resource.
The “ideal RDFS Class” is defined as an RDFS Class whose local name best represents the RDF Resource in question. This local name will be used as the ArangoDB Collection name that will store rdf_resource.
Read more about how the PGT ArangoDB Collection Mapping Process works here.
- identify_best_class(rdf_resource: URIRef | BNode | Literal, class_set: Set[str], subclass_tree: Tree) -> str
Find the ideal RDFS Class among a selection of RDFS Classes. Essential for the ArangoDB Collection Mapping Process used in RDF-to-ArangoDB (PGT).
Read more about how the PGT ArangoDB Collection Mapping Process works here.
The “ideal RDFS Class” is defined as an RDFS Class whose local name best represents the RDF Resource in question. This local name will be used as the ArangoDB Collection name that will store rdf_resource.
This system is a work-in-progress. Users are welcome to override this method via their own implementation of the ArangoRDFController Class. Users are able to access the RDF Graph of the current RDF-to-ArangoDB transformation via self.rdf_graph, and the database instance via self.db.
- Parameters:
rdf_resource (URIRef | BNode) – The RDF Resource in question.
class_set (Set[str]) – A set of RDFS Class URIs that are associated to rdf_resource via the RDF.Type relationship, either via explicit definition or via domain/range inference.
subclass_tree (arango_rdf.utils.Tree) – The Tree data structure representing the RDFS subClassOf Taxonomy. See arango_rdf.main.ArangoRDF.__build_subclass_tree() for more info.
- Returns:
The string representation of the URI of the most suitable RDFS Class URI among the set of RDFS Classes to use as the ArangoDB Document Collection name for rdf_resource.
- Return type:
str
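A custom controller might be put together as in the sketch below. The selection rule (take the lexicographically first class URI) is purely illustrative and ignores subclass_tree; the names pick_first_class, make_controller_class and FirstClassController are hypothetical. The subclass is created inside a function so the module loads even when arango_rdf is not installed.

```python
from typing import Set


def pick_first_class(class_set: Set[str]) -> str:
    """Illustrative rule: choose the lexicographically first class URI."""
    return min(class_set)


def make_controller_class():
    """Hypothetical factory: build an ArangoRDFController subclass lazily."""
    from arango_rdf.controller import ArangoRDFController

    class FirstClassController(ArangoRDFController):
        def identify_best_class(self, rdf_resource, class_set, subclass_tree):
            # Deterministic, but ignores the subClassOf taxonomy entirely.
            return pick_first_class(class_set)

    return FirstClassController
```

An instance of such a class would be handed to ArangoRDF through its controller constructor parameter.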