Complex Logical Query Answering & Neural Graph Databases

A collection of resources on the topic of Complex Logical Query Answering accompanying the paper Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases. Feel free to open PRs and issues to add new papers, datasets, and implementations!

This repo follows the Neural Query Engine taxonomy proposed in the paper (Figure 9).

📜 Categorization of papers

Graphs | Modalities

Triple-based KGs (44)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
CGA, K-CAP 2019
TractOR, UAI 2020
Query2Box, ICLR 2020
BetaE, NeurIPS 2020
EmQL, NeurIPS 2020
MPQE, ICML 2020 Workshop
RotatE-Box, AKBC 2021
BiQE, AAAI 2021
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
CQD, ICLR 2021
HypE, WWW 2021
NewLook, KDD 2021
ConE, NeurIPS 2021
PERM, NeurIPS 2021
Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
LogicE, arxiv 2021
MLPMix, ICLR 2022
FuzzQE, AAAI 2022
GNN-QE, ICML 2022
SMORE, KDD 2022
kgTransformer, KDD 2022
LinE, KDD 2022
Query2Particles, NAACL 2022
TAR, arxiv 2022
TeMP, arxiv 2022
FLEX, arxiv 2022
TFLEX, arxiv 2022
GNNQ, ISWC 2022
ENeSy, NeurIPS 2022
NodePiece-QE, NeurIPS 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022
SignalE, KSEM 2022
Query2Geom, AICS 2022
LMPNN, ICLR 2023
QTO, arxiv 2023
Var2Vec, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
SQE, TMLR 2023
NRN, KDD 2023
FIT, arxiv 2023
WFRE, ACL 2023

Hyper-relational KGs (2)

StarQE, ICLR 2022
NQE, AAAI 2023

Hyper-graphs and Multi-modal graphs (0)

None as of March 2023

Graphs | Reasoning Domain

Discrete (Entities only) (45)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
CGA, K-CAP 2019
TractOR, UAI 2020
Query2Box, ICLR 2020
BetaE, NeurIPS 2020
EmQL, NeurIPS 2020
MPQE, ICML 2020 Workshop
RotatE-Box, AKBC 2021
BiQE, AAAI 2021
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
CQD, ICLR 2021
HypE, WWW 2021
NewLook, KDD 2021
ConE, NeurIPS 2021
PERM, NeurIPS 2021
Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
LogicE, arxiv 2021
MLPMix, ICLR 2022
StarQE, ICLR 2022
FuzzQE, AAAI 2022
GNN-QE, ICML 2022
CBR-SUBG, ICML 2022
SMORE, KDD 2022
kgTransformer, KDD 2022
LinE, KDD 2022
Query2Particles, NAACL 2022
TAR, arxiv 2022
TeMP, arxiv 2022
FLEX, arxiv 2022
GNNQ, ISWC 2022
ENeSy, NeurIPS 2022
NodePiece-QE, NeurIPS 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022
SignalE, KSEM 2022
Query2Geom, AICS 2022
LMPNN, ICLR 2023
QTO, arxiv 2023
Var2Vec, AAAI 2023
NQE, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
SQE, TMLR 2023
FIT, arxiv 2023
WFRE, ACL 2023

Discrete Temporal (Entities + Dates) (1)

TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022

Discrete + Continuous (Entities + string/numerical Literals) (0)

None as of March 2023

Graphs | Background Semantics

Facts-only (ABOX) (42)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
TractOR, UAI 2020
Query2Box, ICLR 2020
BetaE, NeurIPS 2020
EmQL, NeurIPS 2020
MPQE, ICML 2020 Workshop
RotatE-Box, AKBC 2021
BiQE, AAAI 2021
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
CQD, ICLR 2021
HypE, WWW 2021
NewLook, KDD 2021
ConE, NeurIPS 2021
PERM, NeurIPS 2021
LogicE, arxiv 2021
MLPMix, ICLR 2022
FuzzQE, AAAI 2022
GNN-QE, ICML 2022
CBR-SUBG, ICML 2022
SMORE, KDD 2022
kgTransformer, KDD 2022
LinE, KDD 2022
Query2Particles, NAACL 2022
FLEX, arxiv 2022
TFLEX, arxiv 2022
GNNQ, ISWC 2022
ENeSy, NeurIPS 2022
NodePiece-QE, NeurIPS 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022
SignalE, KSEM 2022
Query2Geom, AICS 2022
LMPNN, ICLR 2023
QTO, arxiv 2023
Var2Vec, AAAI 2023
NQE, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
SQE, TMLR 2023
NRN, KDD 2023
FIT, arxiv 2023
WFRE, ACL 2023

Class Hierarchy (3)

CGA, K-CAP 2019
TeMP, arxiv 2022
TAR, arxiv 2022

Complex axioms (TBOX) (1)

Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021

Modeling | Encoder

Shallow Embedding (32)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
CGA, K-CAP 2019
TractOR, UAI 2020
Query2Box, ICLR 2020
BetaE, NeurIPS 2020
EmQL, NeurIPS 2020
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
RotatE-Box, AKBC 2021
Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
HypE, WWW 2021
NewLook, KDD 2021
CQD, ICLR 2021
ConE, NeurIPS 2021
PERM, NeurIPS 2021
LogicE, arxiv 2021
FuzzQE, AAAI 2022
SMORE, KDD 2022
LinE, KDD 2022
TAR, arxiv 2022
Query2Particles, NAACL 2022
FLEX, arxiv 2022
TFLEX, arxiv 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022
SignalE, KSEM 2022
Query2Geom, AICS 2022
QTO, arxiv 2023
Var2Vec, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
FIT, arxiv 2023
WFRE, ACL 2023

Transductive Encoder (9)

MPQE, ICML 2020 Workshop
BiQE, AAAI 2021
kgTransformer, KDD 2022
MLPMix, ICLR 2022
StarQE, ICLR 2022
ENeSy, NeurIPS 2022
LMPNN, ICLR 2023
NQE, AAAI 2023
SQE, TMLR 2023

Inductive Encoder (4)

(TeMP) Type-aware embeddings for multi-hop reasoning over knowledge graphs, arxiv 2022
(GNN-QE) Neural-Symbolic Models for Logical Queries on Knowledge Graphs, ICML 2022
(GNNQ) GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs, ISWC 2022
(NodePiece-QE) Inductive Logical Query Answering in Knowledge Graphs, NeurIPS 2022

Modeling | Processor

Any Processor (2)

TeMP, arxiv 2022
NodePiece-QE, NeurIPS 2022

End-to-end Neural (15)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
CGA, K-CAP 2019
MPQE, ICML 2020 Workshop
BiQE, AAAI 2021
MLPMix, ICLR 2022
StarQE, ICLR 2022
kgTransformer, KDD 2022
Query2Particles, NAACL 2022
SMORE, KDD 2022
GNNQ, ISWC 2022
SignalE, KSEM 2022
LMPNN, ICLR 2023
SQE, TMLR 2023
WFRE, ACL 2023

Neuro-Symbolic | Geometric (8)

Query2Box, ICLR 2020
Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
RotatE-Box, AKBC 2021
NewLook, KDD 2021
HypE, WWW 2021
ConE, NeurIPS 2021
Query2Geom, AICS 2022

Neuro-Symbolic | Probabilistic (5)

BetaE, NeurIPS 2020
PERM, NeurIPS 2021
LinE, KDD 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022

Neuro-Symbolic | Fuzzy Logic (16)

EmQL, NeurIPS 2020
TractOR, UAI 2020
CQD, ICLR 2021
LogicE, arxiv 2021
FuzzQE, AAAI 2022
TAR, arxiv 2022
FLEX, arxiv 2022
TFLEX, arxiv 2022
GNN-QE, ICML 2022
ENeSy, NeurIPS 2022
QTO, arxiv 2023
NQE, AAAI 2023
Var2Vec, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
FIT, arxiv 2023
WFRE, ACL 2023

Modeling | Decoder

Non-Parametric (all)

All existing models up to March 2023

Parametric (0)

None as of March 2023

Queries | Query Operators

Progressive scale of supported operators. That is, all models listed under the "NOT" category also support JOIN and UNION.
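As a point of reference for the operator scale, the sketch below runs PROJECTION, JOIN, UNION, and NOT symbolically as plain set operations over a toy triple store. The graph, the entity and relation names, and the helper functions are all made up for illustration and are not taken from any listed paper; the models in this section approximate these operations in a learned space so that missing edges can be tolerated.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

# Toy graph; every entity and relation name here is hypothetical.
TRIPLES: Set[Triple] = {
    ("turing", "field", "cs"),
    ("hinton", "field", "cs"),
    ("bengio", "field", "cs"),
    ("hinton", "award", "turing_award"),
    ("bengio", "award", "turing_award"),
    ("hinton", "born_in", "uk"),
    ("turing", "born_in", "uk"),
}
ENTITIES: Set[str] = {e for h, _, t in TRIPLES for e in (h, t)}


def project(sources: Iterable[str], relation: str, inverse: bool = False) -> Set[str]:
    """PROJECTION: follow `relation` from the source set (backwards if inverse)."""
    src = set(sources)
    if inverse:
        return {h for h, r, t in TRIPLES if r == relation and t in src}
    return {t for h, r, t in TRIPLES if r == relation and h in src}


def negate(answers: Set[str]) -> Set[str]:
    """NOT: complement with respect to all entities in the graph."""
    return ENTITIES - answers


in_cs = project({"cs"}, "field", inverse=True)
awarded = project({"turing_award"}, "award", inverse=True)
born_uk = project({"uk"}, "born_in", inverse=True)

join_answers = in_cs & awarded           # JOIN (intersection)
union_answers = born_uk | awarded        # UNION
not_answers = in_cs & negate(awarded)    # NOT, used inside an intersection
```

Negation is applied inside an intersection here because a bare complement returns almost the whole entity set.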

PROJECTION + JOIN (intersection) (10)

GQE, NeurIPS 2018
GQE+hashing, ICDM 2019
CGA, K-CAP 2019
TractOR, UAI 2020
MPQE, ICML 2020 Workshop
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
BiQE, AAAI 2021
StarQE, ICLR 2022
SMORE, KDD 2022
GNNQ, ISWC 2022

+ UNION (9)

Query2Box, ICLR 2020
EmQL, NeurIPS 2020
HypE, WWW 2021
NewLook, KDD 2021
PERM, NeurIPS 2021
CQD, ICLR 2021
Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
kgTransformer, KDD 2022
Query2Geom, AICS 2022

+ NOT (negation) (23)

BetaE, NeurIPS 2020
ConE, NeurIPS 2021
LogicE, arxiv 2021
MLPMix, ICLR 2022
FuzzQE, AAAI 2022
GNN-QE, ICML 2022
LinE, KDD 2022
Query2Particles, NAACL 2022
GammaE, EMNLP 2022
NMP-QEM, EMNLP 2022
TAR, arxiv 2022
FLEX, arxiv 2022
TFLEX, arxiv 2022
ENeSy, NeurIPS 2022
SignalE, KSEM 2022
QTO, arxiv 2023
LMPNN, ICLR 2023
NQE, AAAI 2023
Var2Vec, AAAI 2023
CQD$^{\mathcal{A}}$, arxiv 2023
SQE, TMLR 2023
FIT, arxiv 2023
WFRE, ACL 2023

Kleene Plus (1)

RotatE-Box, AKBC 2021

FILTER (0)

None as of March 2023

AGGREGATIONS (GROUP BY, ORDER BY, etc) (0)

None as of March 2023

Queries | Query Patterns

Tree-structured (47)

All existing processors as of March 2023

Arbitrary DAGs (1)

FIT, arxiv 2023

Cyclic Queries (1)

FIT, arxiv 2023

Queries | Projected Variables

Zero Projected Vars (ASK queries) (0)

None as of March 2023

One Projected Variable (all)

All processors as of March 2023

Multiple Projected Variables (1)

(EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation, arxiv 2023

Metrics

Original metrics: ROC AUC and Average Percentile Rank over 1000 negative samples.
- Proposed by original GQE (NeurIPS 2018), used by GQE+hashing, CGA, and TractOR. Not used after.
Generalization: predicting hard answers (MRR / Hits@k).
- Introduced by Query2Box (ICLR 2020). Standard metric.
Generalization: from ranking to binary classification
- Proposed in Approximate knowledge graph query answering: from ranking to binary classification
Entailment: faithfulness - ability to recover easy answers (no link prediction) (MRR / Hits@k)
- Proposed by EmQL (NeurIPS 2020)
Estimating the cardinality of the answer set (Spearman's rank correlation, MAPE)
- Used in GNN-QE, QTO
Predicting easy answers before hard answers (ROC-AUC)
- Used in NodePiece-QE
Multi-variable queries: (multiply / marginal / joint) x (MRR / Hits@k)
- Proposed in EFOk-CQA arxiv 2023
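
The hard-answer ranking metrics (MRR / Hits@k) can be sketched as follows. This is a minimal illustration that assumes the commonly used filtered protocol: each hard answer is ranked only against non-answer entities, so easy answers and the other hard answers are masked out. The function names and the toy scores are hypothetical.

```python
from typing import Dict, Set


def filtered_ranks(scores: Dict[str, float], easy: Set[str], hard: Set[str]) -> Dict[str, int]:
    """Rank each hard answer against non-answer entities only (filtered setting)."""
    answers = easy | hard
    negatives = [s for e, s in scores.items() if e not in answers]
    # Rank = 1 + number of non-answers scored strictly higher than the answer.
    return {a: 1 + sum(s > scores[a] for s in negatives) for a in hard}


def mrr(ranks: Dict[str, int]) -> float:
    """Mean reciprocal rank over the hard answers of one query."""
    return sum(1.0 / r for r in ranks.values()) / len(ranks)


def hits_at_k(ranks: Dict[str, int], k: int) -> float:
    """Fraction of hard answers ranked within the top k."""
    return sum(r <= k for r in ranks.values()) / len(ranks)


# Toy model scores for five entities; "a" is an easy answer, "c" and "e" are hard.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.1}
ranks = filtered_ranks(scores, easy={"a"}, hard={"c", "e"})
query_mrr = mrr(ranks)
query_hits2 = hits_at_k(ranks, k=2)
```

In this toy case the easy answer "a" scores highest but does not push the hard answers down, which is the point of filtering.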

📈 Datasets and Benchmarking

Inference (datasets)

Transductive datasets (15)

(GQE datasets) GQE, NeurIPS 2018
(Query2Box datasets) Query2Box, ICLR 2020
(BetaE datasets) BetaE, NeurIPS 2020
(Regex datasets) Regex Queries, AKBC 2021
(BiQE dataset) BiQE, AAAI 2021
(Query2Onto datasets) Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
(EFO-1 dataset) Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)
(SMORE datasets) SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs, KDD 2022
(StarQE dataset) Query Embedding on Hyper-relational Knowledge Graphs, ICLR 2022
(TFLEX dataset) TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022
(WD50K-NFOL dataset) NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs, AAAI 2023
(SQE dataset) Sequential Query Encoding For Complex Query Answering on Knowledge Graphs, TMLR 2023
(Real EFO-1) On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023
(Numerical CQA dataset) Knowledge Graph Reasoning over Entities and Numerical Values, KDD 2023
(EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation, arxiv 2023

Inductive datasets (3)

(TeMP datasets) Type-aware embeddings for multi-hop reasoning over knowledge graphs, arxiv 2022
(InductiveQE datasets) Inductive Logical Query Answering in Knowledge Graphs, NeurIPS 2022
(GNNQ dataset) GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs, ISWC 2022

GQE Datasets

Are Bio and Reddit available at all? Introduced in GQE, used in 4 papers overall (GQE, GQE+hashing, CGA, TractOR).

BetaE Datasets

The main difference from the Query2Box datasets: queries in the BetaE datasets have fewer than 100 answers, and the datasets include queries with negation.

Introduced in Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs, NeurIPS 2020

Graphs

Dataset	Entities	Relations	Training Edges	Validation Edges	Test Edges	Total Edges
FB15k	14,951	1,345	483,142	50,000	59,071	592,213
FB15k237	14,505	237	272,115	17,526	20,438	310,079
NELL995	63,361	200	114,213	14,324	14,267	142,804

Queries

Queries	Training	Training	Validation	Validation	Test	Test
Dataset	1p/2p/3p/2i/3i	2in/3in/inp/pin/pni	1p	others	1p	others
FB15k	273,710	27,371	59,097	8,000	67,016	8,000
FB15k237	149,689	14,968	20,101	5,000	22,812	5,000
NELL995	107,982	10,798	16,927	4,000	17,034	4,000

Average Number of Answers

Dataset	1p	2p	3p	2i	3i	ip	pi	2u	up	2in	3in	inp	pin	pni
FB15k	1.7	19.6	24.4	8.0	5.2	18.3	12.5	18.9	23.8	15.9	14.6	19.8	21.6	16.9
FB15k237	1.7	17.3	24.3	6.9	4.5	17.7	10.4	19.6	24.3	16.3	13.4	19.5	21.7	18.2
NELL995	1.6	14.9	17.5	5.7	6.0	17.4	11.9	14.9	19.0	12.9	11.1	12.9	16.0	13.0

Query2Box Datasets

Introduced in Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings, ICLR 2020.

These EPFO queries are considered easier than those in the BetaE datasets. The datasets contain no queries with negation.

Graphs

Dataset	Entities	Relations	Training Edges	Validation Edges	Test Edges	Total Edges
FB15k	14,951	1,345	483,142	50,000	59,071	592,213
FB15k237	14,505	237	272,115	17,526	20,438	310,079
NELL995	63,361	200	114,213	14,324	14,267	142,804

Queries

Queries	Training	Training	Validation	Validation	Test	Test
Dataset	1p	others	1p	others	1p	others
FB15k	273,710	273,710	59,097	8,000	67,016	8,000
FB15k237	149,689	149,689	20,101	5,000	22,812	5,000
NELL995	107,982	107,982	16,927	4,000	17,034	4,000

Average Number of Answers

Dataset	1p	2p	3p	2i	3i	ip	pi	2u	up
FB15k	10.8	255.6	250.0	90.3	64.1	593.8	190.1	27.8	227.0
FB15k237	13.3	131.4	215.3	69.0	48.9	593.8	257.7	35.6	127.7
NELL995	8.5	56.6	65.3	30.3	15.9	310.0	144.9	14.4	62.5

CGA Datasets

GQE-like patterns mined on subsets of DBpedia and Wikidata. The datasets are DB18 and WikiGeo19, introduced in CGA, K-CAP 2019.

As of Sept 2022: not available.

Regex Queries

Queries emulate SPARQL property paths with variable-length relation paths (up to length 5). All queries are EPFO queries, i.e., without negation. The new operators over relations resemble those of SPARQL:

$r_1 / r_2 / \dots$ - relational path, aka classic projection queries
$r_1 \lor r_2$ - a union of decomposed patterns $(e, r_1, ?) \lor (e, r_2, ?)$
Kleene plus $r^{+}$ - one or more occurrences of relation $r$, e.g., $r_1/r_2^{+}$ corresponds to $r_1 / r_2$, $r_1 / r_2 / r_2$, $r_1 / r_2 / r_2 / r_2 / \dots$ up to some final depth. Those can be cyclic patterns.
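
To illustrate how a Kleene plus unrolls into ordinary projection paths up to a fixed depth, here is a small sketch. The input encoding (a list of `(relation, has_plus)` pairs) and the function name are assumptions made for this example; it covers relational paths and Kleene plus only, not the union operator, and it is not how RotatE-Box itself handles these operators.

```python
from itertools import product
from typing import List, Tuple


def expand_path(steps: List[Tuple[str, bool]], max_len: int = 5) -> List[Tuple[str, ...]]:
    """Unroll a path expression into concrete relation paths of length <= max_len.

    Each step is (relation, has_plus): a step with Kleene plus may repeat
    one or more times, a plain step occurs exactly once.
    """
    options = [range(1, max_len + 1) if plus else (1,) for _, plus in steps]
    paths = []
    for counts in product(*options):
        if sum(counts) <= max_len:
            # Repeat each relation as many times as chosen for its step.
            path = tuple(rel for (rel, _plus), n in zip(steps, counts) for _ in range(n))
            paths.append(path)
    return sorted(paths, key=len)


# r1 / r2+ unrolled up to total path length 4.
unrolled = expand_path([("r1", False), ("r2", True)], max_len=4)
```

Each unrolled path is a classic multi-hop projection query, and the answer set of the regex query is the union of their answer sets.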

Two datasets:

FB15k-Regex is based on Freebase; queries have fewer than 50 answers; 21 query types.
Wiki100-Regex is based on query logs from the official Wikidata SPARQL endpoint, 5 query types.

Introduced in RotatE-Box, AKBC 2021.

Repo: GitHub - no actual data dumps are present :(

Graphs

Dataset	Entities	Relations	Training Edges	Validation Edges	Test Edges	Total Edges
FB15k	14,951	1,345	483,142	50,000	59,071	592,213
Wiki100	41,291	100	389,795	21,655	21,656	433,106

Queries

FB15k-Regex

Query type	Train	Valid	Test
$(e_1, r_1^+, ?)$	24,476	4,614	8,405
$(e_1, r_1/r_2, ?)$	25,378	4,927	8,844
$(e_1, r_1^+/r_2^+, ?)$	26,391	4,978	9,028
$(e_1, r_1^+/r_2^+/r_3^+, ?)$	25,470	4,878	8,816
$(e_1, r_1/r_2^+, ?)$	26,335	5,007	9,062
$(e_1, r_1^+/r_2, ?)$	27,614	5,229	9,429
$(e_1, r_1^+/r_2^+/r_3, ?)$	27,865	5,283	9,509
$(e_1, r_1^+/r_2/r_3^+, ?)$	26,366	5,058	9,159
$(e_1, r_1/r_2^+/r_3^+, ?)$	26,366	5,045	9,099
$(e_1, r_1/r_2/r_3^+, ?)$	26,703	5,155	9,313
$(e_1, r_1/r_2^+/r_3, ?)$	28,005	5,380	9,688
$(e_1, r_1^+/r_2/r_3, ?)$	27,884	5,338	9,632
$(e_1, r_1\lor r_2, ?)$	30,080	5,828	9,664
$(e_1, (r_1\lor r_2)/r_3, ?)$	31,559	6,606	10,974
$(e_1, r_1/(r_2\lor r_3), ?)$	41,886	7,755	13,611
$(e_1, r_1^+\lor r_2^+, ?)$	23,109	4,469	8,367
$(e_1, (r_1\lor r_2)/r_3^+, ?)$	27,658	5,738	9,711
$(e_1, (r_1^+\lor r_2^+)/r_3, ?)$	24,462	4,865	8,863
$(e_1, r_1^+/(r_2\lor r_3), ?)$	27,676	5,340	9,267
$(e_1, r_1/(r_2^+\lor r_3^+), ?)$	28,542	5,475	9,436
$(e_1, (r_1\lor r_2)^+, ?)$	26,260	5,523	10,360
Total	580,085	112,491	200,237

Wiki100-Regex

Query type	Train	Valid	Test
$(e_1, r_1^+, ?)$	490,562	24,878	23,443
$(e_1, r_1^+/r_2^+, ?)$	6,945	620	772
$(e_1, r_1/r_2^+, ?)$	85,253	10,013	8,377
$(e_1, r_1\lor r_2, ?)$	274,012	14,900	14,915
$(e_1, (r_1\lor r_2)^+, ?)$	348,274	15,720	15,311
Total	1,205,046	66,131	62,818

DAG Queries

Conjunctive queries (w/o union) not limited to 9 patterns from Query2Box/BetaE datasets. The task is to predict all intermediate entities, not just final leaf nodes. Query depth: 2-5; max 3 intersecting branches.

Introduced in Answering complex queries in knowledge graphs with bidirectional sequence encoders, AAAI 2021.

New FB15K-237-CQ and WN18RR-CQ datasets have two variations:

CQ (conjunctive queries) - Training on triples + paths + DAGs, Validation/Test on DAGs only;
Paths - Training on triples + paths, Validation/Test on paths only

Sept 2022: the datasets are not publicly available.

Graphs

Dataset	FB15K-237-CQ	FB15K-237-CQ	FB15K-237-CQ	WN18RR-CQ	WN18RR-CQ	WN18RR-CQ
Dataset	Train	Validation	Test	Train	Validation	Test
Entities	14,505	-	-	40,943	-	-
Relations	237	237	237	11	11	11
Triples	272,115	-	-	86,835	-	-
Paths	50,000	-	-	10,000	-	-
DAGs	48,865	2,785	2,599	9,465	112	95
Avg Masks	1.86	5.91	6.05	1.84	5.13	4.91
Avg Query Len (Tokens)	152	460	479	71	198	199

Queries

No detailed breakdown by query type is available, only the DAGs stats from the main table.

Dataset	FB15K-237-CQ	FB15K-237-CQ	FB15K-237-CQ	WN18RR-CQ	WN18RR-CQ	WN18RR-CQ
Dataset	Train	Validation	Test	Train	Validation	Test
Paths	50,000	-	-	10,000	-	-
DAGs	48,865	2,785	2,599	9,465	112	95
Avg Masks	1.86	5.91	6.05	1.84	5.13	4.91
Avg Query Len (Tokens)	152	460	479	71	198	199

EFO-1 Queries

Existential First-Order queries with Single Free Variable, extended from BetaE. The goal is to evaluate the combinatorial generalizability.

Introduced in Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)

Graphs

Queries	Training	Training	Validation	Validation	Test	Test
Dataset	1p/2p/3p/2i/3i	2in/3in/inp/pin/pni	1p	others	1p	others
FB15k	273,710	27,371	59,097	8,000	67,016	8,000
FB15k237	149,689	14,968	20,101	5,000	22,812	5,000
NELL995	107,982	10,798	16,927	4,000	17,034	4,000

Queries

The 301 query types are too many to list here. Details can be found in a summary Excel file here.

Real EFO-1 dataset

Rethinks the EFO-1 formulation by introducing leaf nodes, multi-edges, and cycles. For standard FB15k, FB15k-237, and NELL, there are 9 new query types (10 with the reworked pni type), including:

l - queries with existentially quantified variables as leaf nodes (2il, 3il)
m - queries with multiple relation projection edges from one variable to another (2m, 2nm, 3mp, 3pm, im)
c - queries with cycles (3c, cm)

Each new query type has 5,000 instances on the three KGs. Introduced in On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023. The dataset can be downloaded from here.

EPFO queries with Literals

Based on a variation of the FB15k-237 dataset with entity attributes (12,390 entities, 237 relations, 115 attributes, 29,229 (?) triples). Literals are restricted to numerical values, with three additional filter functions (less than, equal, greater than).

The dataset includes standard 9 EPFO query types and adds 8 more variations of those patterns enriched with literals:

5 query types where literals are in queries, but the answer is an entity (ai, 2ai, pai, aip, au)
3 query types where literals are in queries, and the answer is a mean of relevant literal values (1ap, 2ap, 3ap)
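
The two kinds of literal-enriched queries can be illustrated symbolically on a toy attribute table: one filters entities by a numeric condition and returns entities, the other returns the mean of the relevant attribute values. The attribute table, the operator keys, and both function names are hypothetical; how the individual query types combine these steps with projections, and how LitCQD scores them, is not shown.

```python
import operator
from statistics import mean
from typing import Optional, Set

# Toy (entity, attribute) -> numeric value table; all values are made up.
ATTRS = {
    ("film_a", "runtime"): 95.0,
    ("film_b", "runtime"): 120.0,
    ("film_c", "runtime"): 150.0,
    ("film_a", "year"): 1999.0,
}

# The three filter functions: less than, equal, greater than.
FILTERS = {"lt": operator.lt, "eq": operator.eq, "gt": operator.gt}


def filter_entities(attribute: str, op: str, value: float) -> Set[str]:
    """Entities whose `attribute` value satisfies the filter; the answer is an entity set."""
    return {e for (e, a), v in ATTRS.items() if a == attribute and FILTERS[op](v, value)}


def mean_attribute(entities: Set[str], attribute: str) -> Optional[float]:
    """Mean of the known `attribute` values over a set of entities; the answer is a number."""
    values = [ATTRS[(e, attribute)] for e in entities if (e, attribute) in ATTRS]
    return mean(values) if values else None


long_films = filter_entities("runtime", "gt", 100.0)
avg_runtime = mean_attribute({"film_a", "film_b"}, "runtime")
```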

Introduced in LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals, arxiv 2023

SQE Queries

Existential First-Order queries aimed at evaluating compositional generalization to OOD query patterns (29 in-distribution types, 29 out-of-distribution). In contrast to the BetaE datasets, there is no restriction on the number of answers per query, so long-tailed answer-set sizes are possible.

Introduced in Sequential Query Encoding For Complex Query Answering on Knowledge Graphs

Graphs

Queries	Training	Training	Validation	Test
Dataset	1p	others	all	all
FB15k	273,710	821,130	8,000	8,000
FB15k237	149,689	449,067	5,000	5,000
NELL995	107,982	323,946	4,000	4,000

Queries

58 query types, refer to Appendix A in the paper for the full list of patterns.

EFOk queries

Introduced by (EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation with 741 query types in total.

Features:

existential first-order queries with multiple variables
combinatorial space with multi-edge and cyclic queries

Numerical CQA Queries

The Numerical CQA queries include both entities and typed numerical attribute values.

Introduced by Knowledge Graph Reasoning over Entities and Numerical Values

Graphs

Graphs	Data Split	1p	2p	2i	3i	pi	ip	2u	up	All
FB15K	Training	304,633	138,192	226,729	288,874	260,057	233,834	284,301	284,931	2,021,551
	Validation	8,271	15,860	23,359	28,836	25,081	22,930	29,187	29,210	182,734
	Testing	7,969	15,431	23,346	28,865	24,810	22,232	29,212	29,274	181,139
DB15K	Training	124,851	99,698	140,427	190,413	171,353	163,687	190,364	194,244	1,275,037
	Validation	3,529	10,388	9,792	13,817	14,594	16,651	19,512	19,792	108,075
	Testing	3,387	10,047	9,914	14,603	14,642	15,897	19,504	19,773	107,767
YAGO15K	Training	84,014	76,238	136,282	183,850	162,712	145,994	183,963	183,459	1,156,512
	Validation	2,833	7,986	10,757	16,884	13,485	13,899	18,444	19,105	103,393
	Testing	2,713	7,949	10,935	17,171	13,481	13,526	18,433	18,997	103,205

Type-Aware Datasets

In addition to a normal graph of entities (instances) à la BetaE datasets, the type-aware datasets offer an additional set of classes, a class hierarchy (from a pre-existing ontology), and instanceOf links between entities and classes.

Those datasets might include an additional task of predicting types of answer entities (Concept Retrieval).

LUBM, introduced in Neuro-Symbolic Ontology-Mediated Query Answering, OpenReview 2021
NELL, introduced in Neuro-Symbolic Ontology-Mediated Query Answering, OpenReview 2021. The base graph is the same as in the BetaE datasets, but a few ontological axioms were added.
YAGO 4, introduced in TAR: Neural Logical Reasoning across TBox and ABox
DBpedia, introduced in TAR: Neural Logical Reasoning across TBox and ABox

LUBM and NELL employ ontological axioms of the DL-Lite (R) family of Description Logics.

Graphs

TODO Figure out Concept Retrieval edges in TAR

Dataset	Entities	Relations	Axioms	Base Graph	Materialized Graph
LUBM	55,684	28	68	284k	565k
NELL	63,361	400	307	285k	497k

Axioms breakdown in ontologies for LUBM and NELL

Rules	LUBM	NELL
$\mathcal{O}$ (Total)	68	307
$A \sqsubseteq A'$ (Subclass)	13	-
$p \sqsubseteq s$	5	92
$p^{-} \sqsubseteq s$	28	215
$\exists p \sqsubseteq A$	11	-
$\exists p^{-} \sqsubseteq A$	11	-
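
The gap between the base and materialized graph sizes comes from applying such axioms as forward rules until nothing new is derived. The sketch below does this for the five axiom shapes in the table, assuming their standard DL-Lite reading; the axiom encoding as Python pairs, the `type` relation name, and the toy facts are assumptions for illustration, not the format used by the datasets.

```python
def materialize(triples, role_incl=(), inv_role_incl=(), domain=(), range_=(), subclass=()):
    """Apply axioms to a set of (h, r, t) triples until a fixpoint is reached.

    role_incl:     (p, s) pairs, p is a sub-role of s:          (x p y) => (x s y)
    inv_role_incl: (p, s) pairs, inverse of p is a sub-role:    (x p y) => (y s x)
    domain:        (p, A) pairs, anything with an outgoing p:   (x p y) => (x type A)
    range_:        (p, A) pairs, anything with an incoming p:   (x p y) => (y type A)
    subclass:      (A, B) pairs, A is a subclass of B:          (x type A) => (x type B)
    """
    facts = set(triples)
    while True:
        new = set()
        for h, r, t in facts:
            new |= {(h, s, t) for p, s in role_incl if p == r}
            new |= {(t, s, h) for p, s in inv_role_incl if p == r}
            new |= {(h, "type", a) for p, a in domain if p == r}
            new |= {(t, "type", a) for p, a in range_ if p == r}
            if r == "type":
                new |= {(h, "type", b) for a, b in subclass if a == t}
        if new <= facts:  # nothing new derived: fixpoint
            return facts
        facts |= new


base = {("alice", "headOf", "dept1")}
materialized = materialize(
    base,
    role_incl=[("headOf", "worksFor")],
    inv_role_incl=[("worksFor", "hasEmployee")],
    domain=[("worksFor", "Employee")],
    subclass=[("Employee", "Person")],
)
```

One base fact yields four derived facts here, which mirrors how a 284k-edge base graph can roughly double in size after materialization.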

Dataset	Entities	Relations	Classes	Training Edges	Validation Edges	Test Edges	Entity-Class Edges	Class Hierarchy Edges	Total Edges
YAGO 4	32,465	75	8,382	101,417	1,000	1,000	83,291	16,644	184,708
DBpedia	28,824	327	981	136,821	1,000	1,000	225,436	2,582	362,257

Queries

Dataset	Train / Test	1p	2p	3p	2i	3i	ip	pi	2u	up
LUBM	Plain (Train)	110,000	110,000	110,000	110,000	110,000	-	-	-	-
LUBM	Generalized (Train)	117,124	136,731	150,653	181,234	208,710	-	-	-	-
LUBM	Specialized (Train)	117,780	154,851	173,678	271,532	230,085	-	-	-	-
LUBM	Ontological (Train)	116,893	166,159	333,406	212,718	491,707	-	-	-	-
LUBM	Induction (w/ missing links in queries) (Val/Test)	8,000	8,000	8,000	8,000	8,000	8,000	8,000	8,000	8,000
LUBM	Deduction (w/o missing link in training) (Val/Test)	1,241	4,701	6,472	3,829	4,746	7,393	7,557	4,986	7,122
LUBM	Induction + Deduction (Val/Test)	8,000	8,000	8,000	8,000	8,000	8,000	8,000	7,986	8,000
NELL	Plain (Train)	107,982	107,982	107,982	107,982	107,982	-	-	-	-
NELL	Generalized (Train)	174,310	408,842	864,268	398,412	930,787	-	-	-	-
NELL	Specialized (Train)	174,310	419,664	906,609	401,954	936,537	-	-	-	-
NELL	Ontological (Train)	114,614	542,923	864,268	629,144	930,787	-	-	-	-
NELL	Induction (w/ missing links in queries) (Val/Test)	15,688	3,910	3,918	3,828	3,786	3,932	3,895	3,940	3,966
NELL	Deduction (w/o missing link in training) (Val/Test)	346	4,461	4,294	4,842	5,996	7,295	5,862	5,646	6,894
NELL	Induction + Deduction (Val/Test)	8,000	8,000	8,000	8,000	8,000	8,000	8,000	7,990	8,000

Queries	Training	Training	Validation	Validation	Test	Test
Dataset	1p	others	1p	others	1p	others
YAGO 4 (Concept Retrieval)	189,338	10,000	1,000	1,000	1,000	1,000
YAGO 4 (Entity Only)	101,417	10,000	1,000	1,000	1,000	1,000
YAGO 4 (Entity + Instantiations)	184,708	10,000	1,000	1,000	1,000	1,000
DBpedia (Concept Retrieval)	473,924	10,000	1,000	1,000	1,000	1,000
DBpedia (Entity Only)	136,821	10,000	1,000	1,000	1,000	1,000
DBpedia (Entity + Instantiations)	362,257	10,000	1,000	1,000	1,000	1,000

Very Large Datasets

Introduced in SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs, KDD 2022.

Training queries are sampled on-the-fly during training due to the huge size of underlying graphs.
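
A rough idea of what on-the-fly sampling means: instead of reading queries from a pre-generated file, each training step draws a fresh query from the graph and computes its answers. The sketch below samples a 2p query by a two-step random walk over a toy graph. It is a simplified illustration with hypothetical names, not SMORE's actual sampler.

```python
import random
from collections import defaultdict


def build_index(triples):
    """Adjacency index: head -> relation -> set of tails."""
    index = defaultdict(lambda: defaultdict(set))
    for h, r, t in triples:
        index[h][r].add(t)
    return index


def sample_2p(index, rng):
    """Sample a 2p query (anchor, r1, r2) by a random walk and compute its answers."""
    anchor = rng.choice(sorted(index))
    r1 = rng.choice(sorted(index[anchor]))
    mid = rng.choice(sorted(index[anchor][r1]))
    if mid not in index:  # dead end: the intermediate node has no outgoing edges
        return None
    r2 = rng.choice(sorted(index[mid]))
    # Answers follow r1 then r2 from the anchor through every intermediate node.
    answers = {t for m in index[anchor][r1] for t in index.get(m, {}).get(r2, ())}
    return (anchor, r1, r2), answers


# Toy graph with hypothetical names; every node has at least one outgoing edge.
TOY = {("a", "r", "b"), ("b", "s", "c"), ("b", "s", "d"), ("c", "t", "a"), ("d", "t", "a")}
toy_index = build_index(TOY)
samples = [sample_2p(toy_index, random.Random(seed)) for seed in range(10)]
```

On very large graphs the answer computation is the expensive part, so a real sampler would bound or approximate it.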

The underlying graphs are FB400k (400K nodes), WikiKG 2 (2.5M nodes) (from OGB), and full Freebase (86M nodes). TODO: confirm with Hongyu the number of validation / test queries.

Graphs

Dataset	Entities	Relations	Training Edges	Validation Edges	Test Edges	Total Edges
FB400k	409,829	918	1,075,837	537,917	537,917	2,151,671
WikiKG2	2,500,604	535	16,109,182	429,456	598,543	17,137,181
Freebase	86,054,361	14,824	304,727,650	16,929,318	16,929,308	338,586,276

Queries

Queries	Validation	Validation	Test	Test
Dataset	1p	others	1p	others
FB400k	TODO	TODO	TODO	TODO
WikiKG2	TODO	TODO	TODO	TODO
Freebase	TODO	TODO	TODO	TODO

Hyper-Relational Datasets

The main difference of hyper-relational datasets is that edges are no longer plain triples $(h, r, t)$ but statements (in terms of Wikidata or RDF-Star) $\Big(h, r, t, (q_{ri}, q_{ei})_i\Big)$ with key-value (relation:entity) qualifiers $(q_{r}, q_{e})$ over the main triple. For example, in the statement (Albert Einstein, educated at, ETH Zurich, (degree, Bachelor)), the main triple is Albert Einstein, educated at, ETH Zurich and its unique qualifier is (degree, Bachelor). Qualifiers provide additional context to the edge: the tail node might change with another qualifier, e.g., (Albert Einstein, educated at, University of Zurich, (degree, Doctorate)).

Entities and relations in qualifiers are still legit entities and relations which could be present in main triples. Some entities and relations can be found only in qualifiers.
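
As a data-structure illustration, a statement can be modeled as a main triple plus a set of qualifier pairs, and a qualifier in the query narrows the projection result. The sketch reuses the Albert Einstein example from the text; the `Statement` class and the `project` helper are hypothetical names, not the StarQE implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Statement:
    """A hyper-relational edge: main triple plus (relation, entity) qualifiers."""
    head: str
    relation: str
    tail: str
    qualifiers: FrozenSet[Tuple[str, str]] = frozenset()


STATEMENTS = [
    Statement("Albert Einstein", "educated at", "ETH Zurich",
              frozenset({("degree", "Bachelor")})),
    Statement("Albert Einstein", "educated at", "University of Zurich",
              frozenset({("degree", "Doctorate")})),
]


def project(head: str, relation: str, qualifiers: Iterable[Tuple[str, str]] = ()) -> Set[str]:
    """Tails of statements matching (head, relation) that carry all requested qualifiers."""
    wanted = frozenset(qualifiers)
    return {
        s.tail
        for s in STATEMENTS
        if s.head == head and s.relation == relation and wanted <= s.qualifiers
    }


all_schools = project("Albert Einstein", "educated at")
doctorate_school = project("Albert Einstein", "educated at", [("degree", "Doctorate")])
```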

The WD50K dataset has only conjunctive queries (projection + intersection), neither union nor negation.

Introduced in Query Embedding on Hyper-Relational Knowledge Graphs, ICLR 2022

The WD50K-NFOL dataset, introduced in NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs, adds unions and negations, as well as the possibility of variables at qualifier entity positions. (As of Nov 2022, not openly available.)

Graph

The original WD50K graph from the StarE paper by Galkin et al.

Dataset	Entities	Relations	Qualifier-only Entities	Qualifier-only Relations	Training Edges	Validation Edges	Test Edges	Total Edges
WD50K	47,156	532	5,460	45	166,435	23,913	46,159	236,508

32,167 edges have at least one key-value (relation:entity) qualifier.

Queries

Split	1p	2p	3p	2i	3i	ip	pi
train	24,819	313,088	5,950,990	48,513	318,735	306,022	1,088,539
validation	4,100	100,706	2,968,315	15,648	169,195	169,438	569,957
test	7,716	202,045	6,433,476	38,207	547,272	445,007	1,267,452

WD50K-NFOL stats are not yet available

Inductive Datasets

As of March 2023, there are no existing purely inductive datasets such that the training and validation/test graphs are different (validation and test containing new unseen entities) and predictions should only rely on the graph structure w/o external data.

Type-based Inductive

As a bridge between shallow transductive models and inductive inference, Type-aware Embeddings for Multi-Hop Reasoning over Knowledge Graphs proposes to mine entity types as the invariant that remains the same at training and inference.

As a result, the following datasets assume a class hierarchy (or a graph of classes) that exists and is known in advance. Technically, those can be put in the Type-Aware Datasets category. The query datasets only have EPFO queries (no negation).

Inductive splits have been published; see the GitHub issue.

Graphs

The underlying graphs are FB15k-237-V2 and NELL995-V3 from Inductive relation prediction by subgraph reasoning by Teru et al, ICML 2020. The original repo and other datasets are here.

Training is performed on the Train Graph, but at validation/test time the model is fed with a new Inference Graph with completely new nodes. The Inference Graph has missing edges that have to be predicted at validation or test time.
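
A quick sanity check for such a split is that the two graphs share no entities. The sketch below also reports relations that appear only in the Inference Graph, on the assumption (not stated above) that the relation vocabulary is meant to be shared between the two graphs. Function names and toy triples are hypothetical.

```python
def entities(triples):
    """All head and tail entities of a set of (h, r, t) triples."""
    return {e for h, _, t in triples for e in (h, t)}


def relations(triples):
    """All relation types of a set of (h, r, t) triples."""
    return {r for _, r, _ in triples}


def check_inductive_split(train_graph, inference_graph):
    """Return (entities shared by both graphs, relations unseen during training)."""
    shared = entities(train_graph) & entities(inference_graph)
    unseen = relations(inference_graph) - relations(train_graph)
    return shared, unseen


train_graph = {("a", "likes", "b"), ("b", "knows", "c")}
inference_graph = {("x", "likes", "y"), ("y", "knows", "z")}
shared_entities, unseen_relations = check_inductive_split(train_graph, inference_graph)
```

Both returned sets are empty for a clean split under that assumption.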

Dataset	Relations	Types	Train Graph	Train Graph	Inference Graph	Inference Graph	Inference Graph	Inference Graph
			Train Entities	Train Edges	Inference Entities	Inference Edges	Validation Edges	Test Edges
FB15k-237-V2	203	3,851	3,000	4,245	2,000	4,145	469	478
NELL995-V3	142	267	4,647	16,393	4,921	8,048	811	809

The type hierarchy created for those datasets remains unknown.

Queries

Queries	Training	Validation	Validation	Test	Test
Dataset	1p/2p/3p/2i/3i	1p	others	1p	others
FB15k-237-V2	9,964	1,738	2,000	791	1,000
NELL995-V3	12,010	2,197	2,000	1,167	1,500

Tree-like Conjunctive Inductive

The dataset proposed in GNNQ frames query answering as node classification. The dataset has 9 tree-like conjunctive queries (6 synthetic from WatDiv and 3 from FB15k237), with neither unions nor negations. For each query, there are P KGs with an answer entity satisfying the query and N KGs with negative samples where an answer does not satisfy the query. Test splits have graphs with new entities (but the same query shapes).

Graphs

Many - each WatDiv query has 2K positive GRAPHS and 700K negative GRAPHS (each of about 100K triples); each FB15k237 query has about 1K positive GRAPHS and 1K negative GRAPHS (each of about 10K triples)

Queries

Each query in the table has many associated graphs where one node is an answer (positive graph sample) and where nodes are not answers (negative graph samples)

Query	Relations	Num atoms / tree depth	Train: pos/neg	Test: pos/neg
WatDiv-Q1	158	8 / 4	2114 / 699699	1085 / 349877
WatDiv-Q2	158	8 / 3	3258 / 698396	1769 / 349119
WatDiv-Q3	158	8 / 3	1520 / 700276	798 / 350165
WatDiv-Q4	158	10 / 4	2397 / 698986	1226 / 349546
WatDiv-Q5	158	10 / 4	6338 / 693988	2866 / 347570
WatDiv-Q6	158	10 / 4	7545 / 692439	3744 / 346290
FB15k237-Q1	237	7 / 4	1185 / 1180	395 / 395
FB15k237-Q2	237	7 / 4	650 / 660	220 / 220
FB15k237-Q3	237	5 / 4	860 / 870	290 / 290

Temporal Datasets

Introduced in TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022.

Based on FOL operators, the dataset focuses on temporal reasoning, which includes after, before and between on any timestamp set.

Graphs

Dataset	Entities	Relations	Timestamps	Training Edges	Validation Edges	Test Edges	Total Edges
ICEWS14	7,128	230	365	72,826	8,941	8,963	90,730
ICEWS05-15	10,488	251	4,017	386,962	46,275	46,092	479,329
GDELT-500	500	20	366	2,735,685	341,961	341,961	3,419,607

Queries

Query Name	ICEWS14-Train	Validation	Test	ICEWS05-15-Train	Validation	Test	GDELT-500-Train	Validation	Test
Pe2	72826	3482	4037	368962	10000	10000	2215309	10000	10000
Pe3	72826	3492	4083	368962	10000	10000	2215309	10000	10000
Pe_Pt	7282	3385	3638	36896	10000	10000	221530	10000	10000
e2i	72826	3305	3655	368962	10000	10000	2215309	10000	10000
e3i	72826	2966	3023	368962	10000	10000	2215309	10000	10000
e2i_Pe	-	2913	2913	-	10000	10000	-	10000	10000
Pe_e2i	-	2913	2913	-	10000	10000	-	10000	10000
Pe_t2i	-	2913	2913	-	10000	10000	-	10000	10000
e2i_NPe	7282	3061	3192	36896	10000	10000	221530	10000	10000
e2i_peN	7282	2971	3031	36896	10000	10000	221530	10000	10000
Pe_e2i_Pe_NPe	7282	2968	3012	36896	10000	10000	221530	10000	10000
e2i_N	7282	2949	2975	36896	10000	10000	221530	10000	10000
e3i_N	7282	2913	2914	36896	10000	10000	221530	10000	10000
e2u	-	2913	2913	-	10000	10000	-	10000	10000
Pe_e2u	-	2913	2913	-	10000	10000	-	10000	10000
Pt_lPe	7282	4976	5608	36896	10000	10000	221530	10000	10000
Pt_rPe	7282	3321	3621	36896	10000	10000	221530	10000	10000
t2i	72826	5112	6631	368962	10000	10000	2215309	10000	10000
t3i	72826	3094	3296	368962	10000	10000	2215309	10000	10000
t2i_Pe	-	2913	2913	-	10000	10000	-	10000	10000
Pt_le2i	7282	3226	3466	36896	10000	10000	221530	10000	10000
Pt_re2i	7282	3236	3485	36896	10000	10000	221530	10000	10000
t2i_NPt	7282	4873	5464	36896	10000	10000	221530	10000	10000
t2i_PtN	7282	3300	3609	36896	10000	10000	221530	10000	10000
Pe_t2i_PtPe_NPt	7282	3031	3127	36896	10000	10000	221530	10000	10000
t2i_N	7282	3135	3328	36896	10000	10000	221530	10000	10000
t3i_N	7282	2924	2944	36896	10000	10000	221530	10000	10000
t2u	-	2913	2913	-	10000	10000	-	10000	10000
Pe_t2u	-	2913	2913	-	10000	10000	-	10000	10000
Pe_aPt	7282	4134	4733	68262	10000	10000	221530	10000	10000
Pe_bPt	7282	3970	4565	36896	10000	10000	221530	10000	10000
Pe_at2i	7282	4607	5338	36896	10000	10000	221530	10000	10000
Pe_bt2i	7282	4583	5386	36896	10000	10000	221530	10000	10000
between	7282	2913	2913	36896	10000	10000	221530	10000	10000

Dataset tools

Graph Query Sampler: Not a method, rather a dataset generator
EFO-1-QA-benchmark: Generating combinatorial tree-formed query types and sampling the data.
EFOk-CQA: Generating combinatorial existential first order query types with multiple (k) variables and sampling the data.

🔧 Implementations

KGReasoning: GQE, Query2Box, BetaE
CQD: GQE, Query2Box, BetaE, CQD
EFO-1-QA-benchmark: Query2Box, BetaE, LogicE, NewLook, ConE, FuzzQE
Query2particles
StarQE: StarQE
SMORE: GQE, Query2Box, BetaE + Very Large Datasets
GNN-QE: GNN-QE
InductiveQE: Inductive QE with NodePiece and GNN-QE
TAR: TAR
QE-TeMP: TeMP (based on KGReasoning)
GNNQ: GNNQ
SE-KGE: GQE, CGA, and geospatial model
LARK: LARK (uses Huggingface LLMs)
WFRE: WFRE
FIT: FIT
SQE: SQE with Transformer/LSTM/GRU/TCN, Tree-LSTM, Tree-RNN, BetaE, BiQE, ConE, FuzzQE, GQE, HypE, NeuralMLP (Mixer), Query2Box, Query2Particles
NRN: NRN with GQE, Query2Box, Query2Particles
EFOk-CQA: EFOk

All Papers on Complex Logical Query Answering (54)

(GQE) Embedding Logical Queries on Knowledge Graphs NeurIPS 2018
(GQE + hashing) Learning to Hash for Efficient Search over Incomplete Knowledge Graphs ICDM 2019
(CGA) Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs K-CAP 2019, GQE + self-attention instead of DeepSet
(TractOR) Symbolic querying of vector spaces: Probabilistic databases meets relational embeddings, UAI 2020
(Query2Box) Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings ICLR 2020
(BetaE) Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs NeurIPS 2020
(EmQL) Faithful embeddings for knowledge base queries NeurIPS 2020
(MPQE) Message Passing Query Embedding ICML’20 Workshop
(RotatE-Box) Regex Queries over Incomplete Knowledge Bases AKBC’21
(BiQE) Answering complex queries in knowledge graphs with bidirectional sequence encoders, AAAI’21
Approximate knowledge graph query answering: from ranking to binary classification
Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding arxiv, 2021
(ConE) ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs NeurIPS’21
(PERM) Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs (improvement over BetaE) NeurIPS’21
(CQD) Complex Query Answering with Neural Link Predictors ICLR’21
(HypE) Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs, WWW 2021
(NewLook) Neural-Answering Logical Queries on Knowledge Graphs (KDD’21)
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)
Neuro-Symbolic Ontology-Mediated Query Answering OpenReview 2021
(LogicE) Logic Embeddings for Complex Query Answering arxiv 2021
(StarQE) Query Embedding on Hyper-relational Knowledge Graphs ICLR 2022
(MLPMix) Neural Methods for Logical Reasoning over Knowledge Graphs ICLR 2022
(FuzzQE) Fuzzy Logic Based Logical Query Answering on Knowledge Graphs, AAAI 2022
(GNN-QE) Neural-Symbolic Models for Logical Queries on Knowledge Graphs, ICML 2022
(SMORE) SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs KDD 2022
(kgTransformer) Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries KDD 2022
(Query2Particles) Query2Particles: Knowledge Graph Reasoning with Particle Embeddings, Findings NAACL’22
(TAR) TAR: Neural Logical Reasoning across TBox and ABox (arxiv, 2022)
(TeMP) Type-aware embeddings for multi-hop reasoning over knowledge graphs (IJCAI-ECAI 2022)
(FLEX) FLEX: Feature-Logic Embedding Framework for CompleX Knowledge Graph Reasoning (arxiv 2022)
(TFLEX) TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph (arxiv, 2022)
(LinE) LinE: Logical Query Reasoning over Hierarchical Knowledge Graphs KDD 2022
(GNNQ) GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs ISWC 2022
(ENeSy) Neural-Symbolic Entangled Framework for Complex Query Answering NeurIPS 2022
(NodePiece-QE, InductiveQE) Inductive Logical Query Answering in Knowledge Graphs NeurIPS 2022
(RoMA) Reasoning over Multi-view Knowledge Graphs arxiv 2022, some new datasets, but no code/data published
(LMPNN) Logical Message Passing Networks With One-Hop Inference On Atomic Formulas ICLR'23
(GammaE) GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs EMNLP 2022
(NMP-QEM) Neural-based Mixture Probabilistic Query Embedding for Answering FOL queries on Knowledge Graphs, EMNLP 2022
(NQE) NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs AAAI 2023
(QTO) Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization, ICML'23 submission
(SignalE) Signal Embeddings for Complex Logical Reasoning in Knowledge Graphs, KSEM'22
(Var2Vec) Efficient Embeddings of Logical Variables for Query Answering over Incomplete Knowledge Graphs, AAAI'23
(CQD-A) Adapting Neural Link Predictors for Complex Query Answering
(Query2Geom) Analysis of Attention Mechanisms in Box-Embedding Systems, 2023
(SQE) Sequential Query Encoding For Complex Query Answering on Knowledge Graphs, TMLR 2023
(CylE) CylE: Cylinder Embeddings for Multi-hop Reasoning over Knowledge Graphs, EACL 2023
(RoConE) Modeling Relational Patterns for Logical Query Answering over Knowledge Graphs
(FIT) On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023
(LitCQD) LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals, arxiv 2023
(LARK) Complex Logical Reasoning over Knowledge Graphs using Large Language Models, arxiv 2023
(WFRE) Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport, arxiv 2023
(NRN) Knowledge Graph Reasoning over Entities and Numerical Values KDD 2023
(EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation arxiv 2023

Application Papers (6)

(SE-KGE) SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting, Transactions in GIS 2020, GQE with scalar (x,y) coordinate prediction / encoding
(LEGO) Lego: Latent execution-guided reasoning for multi-hop question answering on knowledge graphs, ICML 2021
(CBR-SubG) Knowledge base question answering by case-based reasoning over subgraphs ICML 2022, application to Question Answering, entailment only, custom datasets
(LogiRec) Towards High-Order Complementary Recommendation via Logical Reasoning Network, arxiv 2022, application of BetaE to recommender systems
Context-aware explainable recommendation based on domain knowledge graph, Big Data and Cognitive Computing, 2022
(PLM4CLQA) Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning, arxiv 2023

Potentially Relevant

Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V, ISWC 2022
Combining RDF Graph Data and Embedding Models for an Augmented Knowledge Graph, BigNet 2018 Workshop @ WWW'18
TrQuery: An Embedding-based Framework for Recommending SPARQL Queries, 2018
Towards Empty Answers in SPARQL: Approximating Querying with RDF Embedding, ISWC 2018

Citation

If you find this work useful, please cite the original paper:

@article{ren2023ngdb,
    title={Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases},
    author={Hongyu Ren and Mikhail Galkin and Michael Cochez and Zhaocheng Zhu and Jure Leskovec},
    year={2023},
    eprint={2303.14617},
    archivePrefix={arXiv},
}
