srtk.knowledge_graph package

srtk.knowledge_graph.graph_base module

Provides a protocol for different kinds of knowledge graphs.

class srtk.knowledge_graph.graph_base.KnowledgeGraphBase

Bases: object

Knowledge graph base class.

abstract deduce_leaves(src: str, path: List[str], limit: int) → List[str]

Deduce leaf entities from the source entity by following the path.

Parameters:
  • src (str) – source entity

  • path (tuple[str]) – path from source entity to destination entity

  • limit (int, optional) – limit of the number of leaves.

Returns:

list of leaves. Each leaf is a QID.

Return type:

list[str]
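As an illustration of what "following the path" means, here is a minimal sketch over a hypothetical in-memory graph. The TOY_EDGES data and this standalone function are assumptions made for the example; real subclasses implement the method against their own backend.

```python
from typing import Dict, List, Tuple

# Hypothetical toy graph: (subject, relation) -> objects. Not srtk data.
TOY_EDGES: Dict[Tuple[str, str], List[str]] = {
    ("Q1", "P10"): ["Q2", "Q3"],
    ("Q2", "P20"): ["Q4"],
    ("Q3", "P20"): ["Q4", "Q5"],
}


def deduce_leaves(src: str, path: List[str], limit: int = 2000) -> List[str]:
    """Follow `path` one relation at a time from `src`; return the end entities."""
    frontier = [src]
    for relation in path:
        next_frontier: List[str] = []
        for entity in frontier:
            for obj in TOY_EDGES.get((entity, relation), []):
                if obj not in next_frontier:  # keep order, drop duplicates
                    next_frontier.append(obj)
        frontier = next_frontier
    return frontier[:limit]


leaves = deduce_leaves("Q1", ["P10", "P20"])
```

In this toy graph the two-relation path from Q1 ends at Q4 and Q5, and limit simply truncates the result.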

get_entity_label(entity: str) → str

Get the label of an entity. Defaults to get_label.

Parameters:

entity (str) – entity identifier

Returns:

label of the entity

Return type:

str
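The fallback to get_label can be pictured with a small sketch. LabelledGraph and ToyGraph below are hypothetical stand-ins written for this example, not the srtk classes; they only show how a subclass might inherit one default and override the other.

```python
from abc import ABC, abstractmethod


class LabelledGraph(ABC):
    """Hypothetical stand-in for the base class; only label methods are shown."""

    @abstractmethod
    def get_label(self, identifier: str) -> str:
        ...

    def get_entity_label(self, entity: str) -> str:
        # Default behaviour: fall back to the generic lookup.
        return self.get_label(entity)

    def get_relation_label(self, relation: str) -> str:
        # Default behaviour: fall back to the generic lookup.
        return self.get_label(relation)


class ToyGraph(LabelledGraph):
    # Made-up labels, for illustration only.
    LABELS = {"Q1": "example entity", "P10": "example relation"}

    def get_label(self, identifier: str) -> str:
        return self.LABELS[identifier]

    def get_relation_label(self, relation: str) -> str:
        # Override: useful when the identifier is already human-readable.
        return relation


graph = ToyGraph()
entity_label = graph.get_entity_label("Q1")
relation_label = graph.get_relation_label("P10")
```

Here the entity label comes from the inherited default, while the relation label comes from the override.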

abstract get_label(identifier: str) → str

Get label of an entity or a relation.

Parameters:

identifier (str) – entity or relation identifier

Returns:

label of the entity or the relation

Return type:

str

abstract get_neighbor_relations(src: str, hop: int, limit: int) → List[str]

Get n-hop neighbor relations of src.

Parameters:
  • src (str) – source entity

  • hop (int, optional) – hop of the relations. Defaults to 1.

  • limit (int, optional) – limit of the number of relations.

Returns:

list of relations (one-hop) or list of tuples of relations (multi-hop)

Return type:

list[str] | list[tuple[str]]
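The two return shapes can be sketched on a hypothetical in-memory graph. TOY_GRAPH and this standalone function are assumptions for illustration; the concrete classes obtain the relations from their own backend.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical toy graph: subject -> list of (relation, object).
TOY_GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "Q1": [("P10", "Q2"), ("P11", "Q3")],
    "Q2": [("P20", "Q4")],
    "Q3": [("P20", "Q5"), ("P21", "Q6")],
}


def get_neighbor_relations(
    src: str, hop: int = 1, limit: int = 100
) -> Union[List[str], List[Tuple[str, ...]]]:
    # Each state is (relation path so far, entity reached).
    states = [((), src)]
    for _ in range(hop):
        states = [
            (rels + (rel,), obj)
            for rels, entity in states
            for rel, obj in TOY_GRAPH.get(entity, [])
        ]
    # De-duplicate relation paths while keeping first-seen order.
    paths = list(dict.fromkeys(rels for rels, _ in states))[:limit]
    if hop == 1:
        return [rels[0] for rels in paths]  # plain strings for one hop
    return paths  # tuples of relations for multi-hop


one_hop = get_neighbor_relations("Q1")
two_hop = get_neighbor_relations("Q1", hop=2)
```

With hop=1 the sketch returns plain relation strings; with hop=2 each item is a tuple holding one relation per hop.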

get_relation_label(relation: str) → str

Get the label of a relation. Defaults to get_label.

Parameters:

relation (str) – relation identifier

Returns:

label of the relation

Return type:

str

abstract search_one_hop_relations(src: str, dst: str) → List[List[str]]

Search one hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of PIDs

Return type:

list[list[str]]

abstract search_two_hop_relations(src: str, dst: str) → List[List[str]]

Search two hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of PIDs

Return type:

list[list[str]]
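To make the shape of the returned paths concrete, here is a minimal sketch of both searches over a hypothetical list of triples. TOY_TRIPLES and the standalone functions are assumptions for the example only.

```python
from typing import List, Tuple

# Hypothetical (subject, relation, object) triples. Not srtk data.
TOY_TRIPLES: List[Tuple[str, str, str]] = [
    ("Q1", "P10", "Q2"),
    ("Q1", "P11", "Q2"),
    ("Q2", "P20", "Q4"),
    ("Q1", "P12", "Q3"),
    ("Q3", "P21", "Q4"),
]


def search_one_hop_relations(src: str, dst: str) -> List[List[str]]:
    """Every relation that links src directly to dst, as single-item paths."""
    return [[rel] for subj, rel, obj in TOY_TRIPLES if subj == src and obj == dst]


def search_two_hop_relations(src: str, dst: str) -> List[List[str]]:
    """Every relation pair that links src to dst through one intermediate entity."""
    paths: List[List[str]] = []
    for subj1, rel1, mid in TOY_TRIPLES:
        if subj1 != src:
            continue
        for subj2, rel2, obj in TOY_TRIPLES:
            if subj2 == mid and obj == dst and [rel1, rel2] not in paths:
                paths.append([rel1, rel2])
    return paths


one_hop_paths = search_one_hop_relations("Q1", "Q2")
two_hop_paths = search_two_hop_relations("Q1", "Q4")
```

Each path is a list with one relation per hop, so one-hop results contain single-element lists and two-hop results contain pairs.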

srtk.knowledge_graph.freebase module

class srtk.knowledge_graph.freebase.Freebase(endpoint, prepend_prefixes=True)

Bases: KnowledgeGraphBase

PREFIXES: str = '\n        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n        PREFIX ns: <http://rdf.freebase.com/ns/>\n        '

deduce_leaves(src, path, limit=2000)

Deduce leaf entities from the source entity by following the path.

Parameters:
  • src (str) – source entity

  • path (tuple[str]) – path from source entity to destination entity

  • limit (int, optional) – limit of the number of leaves. Defaults to 2000.

Returns:

list of leaves. Each leaf is a Freebase entity identifier.

Return type:

list[str]

deduce_leaves_from_multiple_srcs(srcs, path, limit=2000)

Deduce leaf entities from multiple source entities by following the path.

Parameters:
  • srcs (list[str]) – list of source entities

  • path (list[str]) – path from source entity to destination entity

  • limit (int, optional) – limit of the number of leaves. Defaults to 2000.

Returns:

list of leaves. Each leaf is a Freebase entity identifier.

Return type:

list[str]

static get_id_from_uri(uri)

Get the identifier from a URI.
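A minimal sketch of what such a helper could look like, assuming identifiers are the part of the URI that follows the ns: namespace listed in PREFIXES. The function body and the sample identifier are assumptions for illustration, not the library's implementation.

```python
# Namespace taken from the ns: entry of the Freebase PREFIXES string.
FREEBASE_NS = "http://rdf.freebase.com/ns/"


def get_id_from_uri(uri: str) -> str:
    """Hypothetical helper: strip the namespace and keep the trailing identifier."""
    if uri.startswith(FREEBASE_NS):
        return uri[len(FREEBASE_NS):]
    # Fallback for other URIs: keep whatever follows the last slash.
    return uri.rsplit("/", 1)[-1]


# "m.0abc" is a made-up identifier used only for this example.
entity_id = get_id_from_uri("http://rdf.freebase.com/ns/m.0abc")
```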

get_label(identifier)

Get label of an entity or a relation.

Parameters:

identifier (str) – entity or relation identifier

Returns:

label of the entity or the relation

Return type:

str

get_neighbor_relations(src, hop=1, limit=100)

Get all relations connected to an entity.

Parameters:
  • src (str) – source entity

  • hop (int, optional) – hop of the relations. Defaults to 1.

  • limit (int, optional) – limit of the number of relations. Defaults to 100.

Returns:

list of relations. Each relation is a relation identifier or a tuple of relation identifiers.

Return type:

list[str] | list[tuple[str]]

get_relation_label(relation)

For Freebase, the relation label is the same as the relation identifier.

queryFreebase(query)

search_one_hop_relations(src, dst)

Search one hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of relation identifiers

Return type:

list[list[str]]

search_two_hop_relations(src, dst)

Search two hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of relation identifiers

Return type:

list[list[str]]

srtk.knowledge_graph.wikidata module

class srtk.knowledge_graph.wikidata.Wikidata(endpoint, prepend_prefixes=False, exclude_qualifiers=True)

Bases: KnowledgeGraphBase

ENTITY_PREFIX: str = 'http://www.wikidata.org/entity/Q'
PREFIXES: str = 'PREFIX wd: <http://www.wikidata.org/entity/>\n        PREFIX wds: <http://www.wikidata.org/entity/statement/>\n        PREFIX wdv: <http://www.wikidata.org/value/>\n        PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n        PREFIX wikibase: <http://wikiba.se/ontology#>\n        PREFIX p: <http://www.wikidata.org/prop/>\n        PREFIX ps: <http://www.wikidata.org/prop/statement/>\n        PREFIX pq: <http://www.wikidata.org/prop/qualifier/>\n        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n        PREFIX bd: <http://www.bigdata.com/rdf#>\n        PREFIX schema: <http://schema.org/>\n        '

deduce_leaves(src, path, limit=2000)

Deduce leaf entities from the source entity by following the path.

Parameters:
  • src (str) – source entity

  • path (tuple[str]) – path from source entity to destination entity

  • limit (int, optional) – limit of the number of leaves. Defaults to 2000.

Returns:

list of leaves. Each leaf is a QID.

Return type:

list[str]

deduce_leaves_from_multiple_srcs(srcs, path, limit=2000)

Deduce leaf entities from multiple source entities by following the path.

Parameters:
  • srcs (list[str]) – list of source entities

  • path (list[str]) – path from source entity to destination entity

  • limit (int, optional) – limit of the number of leaves. Defaults to 2000.

Returns:

list of leaves. Each leaf is a QID.

Return type:

list[str]
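One way to picture a multi-source lookup is a single SPARQL query that binds all sources with a VALUES clause and walks the path as a property path. The sketch below only builds such a query string; build_leaves_query is a hypothetical helper, and the query the library actually sends may be structured differently.

```python
from typing import List


def build_leaves_query(srcs: List[str], path: List[str], limit: int = 2000) -> str:
    """Hypothetical helper: build one SPARQL query covering all source entities."""
    values = " ".join(f"wd:{src}" for src in srcs)            # e.g. wd:Q1 wd:Q2
    property_path = "/".join(f"wdt:{pid}" for pid in path)    # e.g. wdt:P10/wdt:P20
    return (
        "SELECT DISTINCT ?leaf WHERE {\n"
        f"  VALUES ?src {{ {values} }}\n"
        f"  ?src {property_path} ?leaf .\n"
        "}\n"
        f"LIMIT {limit}"
    )


# Q1, Q2, P10 and P20 are placeholder identifiers for this example.
query = build_leaves_query(["Q1", "Q2"], ["P10", "P20"], limit=5)
```

The wd: and wdt: prefixes used here are the ones declared in PREFIXES, so the query would need those declarations prepended before being sent to an endpoint.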

get_description(identifier)

Get description of an entity or a relation. If no description is found, return None.

Parameters:

identifier (str) – entity or relation, a QID or a PID

Returns:

description of the entity or relation

Return type:

str | None

get_label(identifier)

Get label of an entity or a relation. If no label is found, return None.

Parameters:

identifier (str) – entity or relation, a QID or a PID

Returns:

label of the entity or relation

Return type:

str | None

get_neighbor_relations(src, hop=1, limit=100)

Get all relations connected to an entity. The relations are limited to direct relations (those with wdt: prefix).

Parameters:
  • src (str) – source entity

  • hop (int, optional) – hop of the relations. Defaults to 1.

  • limit (int, optional) – limit of the number of relations. Defaults to 100.

Returns:

list of relations. Each relation is a PID or a tuple of PIDs.

Return type:

list[str] | list[tuple[str]]

static get_pid_from_uri(uri)

Get the property ID from a URI.

get_quantifier_filter(var_name)

Get the qualifier filter string, which restricts the variable to entities. If exclude_qualifiers is set to False, return an empty string.

Note: in Wikidata, entities are prefixed with “http://www.wikidata.org/entity/Q”, while qualifiers are non-entity (and mostly string) values.
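A filter of this kind could be expressed with the SPARQL STRSTARTS function applied to the variable's URI string. The sketch below is an assumption about the shape of the returned string, written as a standalone function with exclude_qualifiers passed explicitly; it is not the library's implementation.

```python
# Same value as the ENTITY_PREFIX class attribute documented for Wikidata.
ENTITY_PREFIX = "http://www.wikidata.org/entity/Q"


def get_quantifier_filter(var_name: str, exclude_qualifiers: bool = True) -> str:
    """Hypothetical sketch: restrict ?var_name to URIs that look like entities."""
    if not exclude_qualifiers:
        return ""  # no restriction requested
    return f'FILTER(STRSTARTS(STR(?{var_name}), "{ENTITY_PREFIX}"))'


entity_filter = get_quantifier_filter("leaf")
no_filter = get_quantifier_filter("leaf", exclude_qualifiers=False)
```

The resulting string is meant to be placed inside the WHERE block of a query that binds the named variable.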

static is_pid(pid)

Check if pid is a valid Wikidata property id.

static is_qid(qid)

Check if qid is a valid Wikidata entity id.
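Wikidata item IDs have the form Q followed by a positive integer, and property IDs have the form P followed by a positive integer. A minimal sketch of both checks with regular expressions follows; the exact rule the library applies is an assumption and may be looser or stricter.

```python
import re

# Patterns assumed for this sketch: a letter followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")
PID_PATTERN = re.compile(r"P[1-9][0-9]*")


def is_qid(qid: object) -> bool:
    """True if qid looks like a Wikidata entity ID such as 'Q42'."""
    return isinstance(qid, str) and QID_PATTERN.fullmatch(qid) is not None


def is_pid(pid: object) -> bool:
    """True if pid looks like a Wikidata property ID such as 'P31'."""
    return isinstance(pid, str) and PID_PATTERN.fullmatch(pid) is not None
```

Using fullmatch rather than match rejects strings with trailing characters, such as a full URI or an ID followed by whitespace.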

queryWikidata(query)

search_one_hop_relations(src, dst)

Search one hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of PIDs

Return type:

list[list[str]]

search_two_hop_relations(src, dst)

Search two hop relations between src and dst.

Parameters:
  • src (str) – source entity

  • dst (str) – destination entity

Returns:

list of paths, each path is a list of PIDs

Return type:

list[list[str]]