neuralgraphdatabases/awesome-logical-query

Complex Logical Query Answering & Neural Graph Databases

A collection of resources on the topic of Complex Logical Query Answering accompanying the paper Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases. Feel free to open PRs and issues to add new papers, datasets, and implementations!

This repo follows the Neural Query Engine taxonomy proposed in the paper (Figure 9).

[Figure: taxonomy hierarchy]

📜 Categorization of papers

Graphs | Modalities

Triple-based KGs (44)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. CGA, K-CAP 2019
  4. TractOR, UAI 2020
  5. Query2Box, ICLR 2020
  6. BetaE, NeurIPS 2020
  7. EmQL, NeurIPS 2020
  8. MPQE, ICML 2020 Workshop
  9. RotatE-Box, AKBC 2021
  10. BiQE, AAAI 2021
  11. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  12. CQD, ICLR 2021
  13. HypE, WWW 2021
  14. NewLook, KDD 2021
  15. ConE, NeurIPS 2021
  16. PERM, NeurIPS 2021
  17. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  18. LogicE, arxiv 2021
  19. MLPMix, ICLR 2022
  20. FuzzQE, AAAI 2022
  21. GNN-QE, ICML 2022
  22. SMORE, KDD 2022
  23. kgTransformer, KDD 2022
  24. LinE, KDD 2022
  25. Query2Particles, NAACL 2022
  26. TAR, arxiv 2022
  27. TeMP, arxiv 2022
  28. FLEX, arxiv 2022
  29. TFLEX, arxiv 2022
  30. GNNQ, ISWC 2022
  31. ENeSy, NeurIPS 2022
  32. NodePiece-QE, NeurIPS 2022
  33. GammaE, EMNLP 2022
  34. NMP-QEM, EMNLP 2022
  35. SignalE, KSEM 2022
  36. Query2Geom, AICS 2022
  37. LMPNN, ICLR 2023
  38. QTO, arxiv 2023
  39. Var2Vec, AAAI 2023
  40. CQD$^{\mathcal{A}}$, arxiv 2023
  41. SQE, TMLR 2023
  42. NRN, KDD 2023
  43. FIT, arxiv 2023
  44. WFRE, ACL 2023
Hyper-relational KGs (2)
  1. StarQE, ICLR 2022
  2. NQE, AAAI 2023
Hyper-graphs and Multi-modal graphs (0)
  1. None as of March 2023

Graphs | Reasoning Domain

Discrete (Entities only) (45)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. CGA, K-CAP 2019
  4. TractOR, UAI 2020
  5. Query2Box, ICLR 2020
  6. BetaE, NeurIPS 2020
  7. EmQL, NeurIPS 2020
  8. MPQE, ICML 2020 Workshop
  9. RotatE-Box, AKBC 2021
  10. BiQE, AAAI 2021
  11. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  12. CQD, ICLR 2021
  13. HypE, WWW 2021
  14. NewLook, KDD 2021
  15. ConE, NeurIPS 2021
  16. PERM, NeurIPS 2021
  17. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  18. LogicE, arxiv 2021
  19. MLPMix, ICLR 2022
  20. StarQE, ICLR 2022
  21. FuzzQE, AAAI 2022
  22. GNN-QE, ICML 2022
  23. CBR-SUBG, ICML 2022
  24. SMORE, KDD 2022
  25. kgTransformer, KDD 2022
  26. LinE, KDD 2022
  27. Query2Particles, NAACL 2022
  28. TAR, arxiv 2022
  29. TeMP, arxiv 2022
  30. FLEX, arxiv 2022
  31. GNNQ, ISWC 2022
  32. ENeSy, NeurIPS 2022
  33. NodePiece-QE, NeurIPS 2022
  34. GammaE, EMNLP 2022
  35. NMP-QEM, EMNLP 2022
  36. SignalE, KSEM 2022
  37. Query2Geom, AICS 2022
  38. LMPNN, ICLR 2023
  39. QTO, arxiv 2023
  40. Var2Vec, AAAI 2023
  41. NQE, AAAI 2023
  42. CQD$^{\mathcal{A}}$, arxiv 2023
  43. SQE, TMLR 2023
  44. FIT, arxiv 2023
  45. WFRE, ACL 2023
Discrete Temporal (Entities + Dates) (1)
  1. TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022
Discrete + Continuous (Entities + string/numerical Literals) (0)
  1. None as of March 2023

Graphs | Background Semantics

Facts-only (ABOX) (42)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. TractOR, UAI 2020
  4. Query2Box, ICLR 2020
  5. BetaE, NeurIPS 2020
  6. EmQL, NeurIPS 2020
  7. MPQE, ICML 2020 Workshop
  8. RotatE-Box, AKBC 2021
  9. BiQE, AAAI 2021
  10. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  11. CQD, ICLR 2021
  12. HypE, WWW 2021
  13. NewLook, KDD 2021
  14. ConE, NeurIPS 2021
  15. PERM, NeurIPS 2021
  16. LogicE, arxiv 2021
  17. MLPMix, ICLR 2022
  18. FuzzQE, AAAI 2022
  19. GNN-QE, ICML 2022
  20. CBR-SUBG, ICML 2022
  21. SMORE, KDD 2022
  22. kgTransformer, KDD 2022
  23. LinE, KDD 2022
  24. Query2Particles, NAACL 2022
  25. FLEX, arxiv 2022
  26. TFLEX, arxiv 2022
  27. GNNQ, ISWC 2022
  28. ENeSy, NeurIPS 2022
  29. NodePiece-QE, NeurIPS 2022
  30. GammaE, EMNLP 2022
  31. NMP-QEM, EMNLP 2022
  32. SignalE, KSEM 2022
  33. Query2Geom, AICS 2022
  34. LMPNN, ICLR 2023
  35. QTO, arxiv 2023
  36. Var2Vec, AAAI 2023
  37. NQE, AAAI 2023
  38. CQD$^{\mathcal{A}}$, arxiv 2023
  39. SQE, TMLR 2023
  40. NRN, KDD 2023
  41. FIT, arxiv 2023
  42. WFRE, ACL 2023
Class Hierarchy (3)
  1. CGA, K-CAP 2019
  2. TeMP, arxiv 2022
  3. TAR, arxiv 2022
Complex axioms (TBOX) (1)
  1. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021

Modeling | Encoder

Shallow Embedding (32)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. CGA, K-CAP 2019
  4. TractOR, UAI 2020
  5. Query2Box, ICLR 2020
  6. BetaE, NeurIPS 2020
  7. EmQL, NeurIPS 2020
  8. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  9. RotatE-Box, AKBC 2021
  10. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  11. HypE, WWW 2021
  12. NewLook, KDD 2021
  13. CQD, ICLR 2021
  14. ConE, NeurIPS 2021
  15. PERM, NeurIPS 2021
  16. LogicE, arxiv 2021
  17. FuzzQE, AAAI 2022
  18. SMORE, KDD 2022
  19. LinE, KDD 2022
  20. TAR, arxiv 2022
  21. Query2Particles, NAACL 2022
  22. FLEX, arxiv 2022
  23. TFLEX, arxiv 2022
  24. GammaE, EMNLP 2022
  25. NMP-QEM, EMNLP 2022
  26. SignalE, KSEM 2022
  27. Query2Geom, AICS 2022
  28. QTO, arxiv 2023
  29. Var2Vec, AAAI 2023
  30. CQD$^{\mathcal{A}}$, arxiv 2023
  31. FIT, arxiv 2023
  32. WFRE, ACL 2023
Transductive Encoder (9)
  1. MPQE, ICML 2020 Workshop
  2. BiQE, AAAI 2021
  3. kgTransformer, KDD 2022
  4. MLPMix, ICLR 2022
  5. StarQE, ICLR 2022
  6. ENeSy, NeurIPS 2022
  7. LMPNN, ICLR 2023
  8. NQE, AAAI 2023
  9. SQE, TMLR 2023
Inductive Encoder (4)
  1. (TeMP) Type-aware embeddings for multi-hop reasoning over knowledge graphs, arxiv 2022
  2. (GNN-QE) Neural-Symbolic Models for Logical Queries on Knowledge Graphs, ICML 2022
  3. (GNNQ) GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs, ISWC 2022
  4. (NodePiece-QE) Inductive Logical Query Answering in Knowledge Graphs, NeurIPS 2022

Modeling | Processor

Any Processor (2)
  1. TeMP, arxiv 2022
  2. NodePiece-QE, NeurIPS 2022
End-to-end Neural (15)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. CGA, K-CAP 2019
  4. MPQE, ICML 2020 Workshop
  5. BiQE, AAAI 2021
  6. MLPMix, ICLR 2022
  7. StarQE, ICLR 2022
  8. kgTransformer, KDD 2022
  9. Query2Particles, NAACL 2022
  10. SMORE, KDD 2022
  11. GNNQ, ISWC 2022
  12. SignalE, KSEM 2022
  13. LMPNN, ICLR 2023
  14. SQE, TMLR 2023
  15. WFRE, ACL 2023
Neuro-Symbolic | Geometric (8)
  1. Query2Box, ICLR 2020
  2. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  3. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  4. RotatE-Box, AKBC 2021
  5. NewLook, KDD 2021
  6. HypE, WWW 2021
  7. ConE, NeurIPS 2021
  8. Query2Geom, AICS 2022
Neuro-Symbolic | Probabilistic (5)
  1. BetaE, NeurIPS 2020
  2. PERM, NeurIPS 2021
  3. LinE, KDD 2022
  4. GammaE, EMNLP 2022
  5. NMP-QEM, EMNLP 2022
Neuro-Symbolic | Fuzzy Logic (16)
  1. EmQL, NeurIPS 2020
  2. TractOR, UAI 2020
  3. CQD, ICLR 2021
  4. LogicE, arxiv 2021
  5. FuzzQE, AAAI 2022
  6. TAR, arxiv 2022
  7. FLEX, arxiv 2022
  8. TFLEX, arxiv 2022
  9. GNN-QE, ICML 2022
  10. ENeSy, NeurIPS 2022
  11. QTO, arxiv 2023
  12. NQE, AAAI 2023
  13. Var2Vec, AAAI 2023
  14. CQD$^{\mathcal{A}}$, arxiv 2023
  15. FIT, arxiv 2023
  16. WFRE, ACL 2023

Modeling | Decoder

Non-Parametric (all)

All existing models up to March 2023

Parametric (0)
  1. None as of March 2023

Queries | Query Operators

Progressive scale of supported operators. That is, all models listed under the "NOT" category also support JOIN and UNION.

PROJECTION + JOIN (intersection) (10)
  1. GQE, NeurIPS 2018
  2. GQE+hashing, ICDM 2019
  3. CGA, K-CAP 2019
  4. TractOR, UAI 2020
  5. MPQE, ICML 2020 Workshop
  6. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding, arxiv 2021
  7. BiQE, AAAI 2021
  8. StarQE, ICLR 2022
  9. SMORE, KDD 2022
  10. GNNQ, ISWC 2022
+ UNION (9)
  1. Query2Box, ICLR 2020
  2. EmQL, NeurIPS 2020
  3. HypE, WWW 2021
  4. NewLook, KDD 2021
  5. PERM, NeurIPS 2021
  6. CQD, ICLR 2021
  7. Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  8. kgTransformer, KDD 2022
  9. Query2Geom, AICS 2022
+ NOT (negation) (23)
  1. BetaE, NeurIPS 2020
  2. ConE, NeurIPS 2021
  3. LogicE, arxiv 2021
  4. MLPMix, ICLR 2022
  5. FuzzQE, AAAI 2022
  6. GNN-QE, ICML 2022
  7. LinE, KDD 2022
  8. Query2Particles, NAACL 2022
  9. GammaE, EMNLP 2022
  10. NMP-QEM, EMNLP 2022
  11. TAR, arxiv 2022
  12. FLEX, arxiv 2022
  13. TFLEX, arxiv 2022
  14. ENeSy, NeurIPS 2022
  15. SignalE, KSEM 2022
  16. QTO, arxiv 2023
  17. LMPNN, ICLR 2023
  18. NQE, AAAI 2023
  19. Var2Vec, AAAI 2023
  20. CQD$^{\mathcal{A}}$, arxiv 2023
  21. SQE, TMLR 2023
  22. FIT, arxiv 2023
  23. WFRE, ACL 2023
Kleene Plus (1)
  1. RotatE-Box, AKBC 2021
FILTER (0)
  1. None as of March 2023
AGGREGATIONS (GROUP BY, ORDER BY, etc) (0)
  1. None as of March 2023
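
To make the operator scale concrete, the sketch below executes projection, intersection, union, and negation symbolically as set operations over a toy graph. The triples, entity names, and helper functions (`project`, `intersect`, `union`, `negate`) are illustrative assumptions and are not taken from any of the listed models; neural processors approximate operations like these in a latent space rather than running them exactly.

```python
# Toy knowledge graph as a set of (head, relation, tail) triples (made-up data).
TRIPLES = {
    ("alice", "field", "cs"),
    ("bob", "field", "physics"),
    ("carol", "field", "physics"),
    ("alice", "born_in", "uk"),
    ("bob", "born_in", "uk"),
    ("carol", "born_in", "poland"),
}
ENTITIES = {e for h, _, t in TRIPLES for e in (h, t)}


def project(sources, relation, inverse=False):
    """PROJECTION: follow `relation` (or its inverse) from every entity in `sources`."""
    if inverse:
        return {h for h, r, t in TRIPLES if r == relation and t in sources}
    return {t for h, r, t in TRIPLES if r == relation and h in sources}


def intersect(*answer_sets):
    """JOIN (intersection) of intermediate answer sets."""
    return set.intersection(*answer_sets)


def union(*answer_sets):
    """UNION of intermediate answer sets."""
    return set.union(*answer_sets)


def negate(answer_set):
    """NOT: complement with respect to all entities in the graph."""
    return ENTITIES - answer_set


born_in_uk = project({"uk"}, "born_in", inverse=True)
works_in_cs = project({"cs"}, "field", inverse=True)
works_in_physics = project({"physics"}, "field", inverse=True)

q_2i = intersect(works_in_cs, born_in_uk)           # CS AND born in the UK
q_2u = union(works_in_cs, works_in_physics)         # CS OR physics
q_2in = intersect(born_in_uk, negate(works_in_cs))  # born in the UK AND NOT CS
```

Each later category in the scale adds one more of these functions on top of the previous ones.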

Queries | Query Patterns

Tree-structured (47)
  1. All existing processors as of March 2023
Arbitrary DAGs (1)
  1. FIT, arxiv 2023
Cyclic Queries (1)
  1. FIT, arxiv 2023

Queries | Projected Variables

Zero Projected Vars (ASK queries) (0)
  1. None as of March 2023
One Projected Variable (all)
  1. All processors as of March 2023
Multiple Projected Variables (1)
  1. (EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation, arxiv 2023

Metrics

  • Original metrics: ROC AUC and Average Percentile Rank over 1000 negative samples.
  • Generalization: predicting hard answers (MRR / Hits@k).
    • Introduced by Query2Box (ICLR 2020). Standard metric.
  • Generalization: from ranking to binary classification
  • Entailment: faithfulness - ability to recover easy answers (no link prediction) (MRR / Hits@k)
    • Proposed by EmQL (NeurIPS 2020)
  • Estimating the cardinality of the answer set (Spearman's rank correlation, MAPE)
  • Predicting easy answers before hard answers (ROC-AUC)
  • Multiple variable queries, (multiply / marginal / joint) x (MRR / Hits@k)
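
As a rough illustration of the MRR / Hits@k metrics on hard answers, the sketch below ranks each hard answer against non-answer entities only (the usual filtered setting). The function names and the toy scores are assumptions made for this example, not code from any benchmark.

```python
def filtered_ranks(scores, hard_answers, easy_answers):
    """Rank each hard answer among non-answer entities (filtered ranking).

    scores: dict mapping entity -> float, higher means more likely an answer.
    hard_answers: answers that require predicting missing links.
    easy_answers: answers reachable by traversing existing edges.
    """
    known = set(hard_answers) | set(easy_answers)
    ranks = []
    for answer in hard_answers:
        # Other known answers are filtered out, so they never push the rank down.
        better = sum(
            1 for entity, score in scores.items()
            if entity not in known and score > scores[answer]
        )
        ranks.append(better + 1)
    return ranks


def mrr(ranks):
    """Mean reciprocal rank."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of answers ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy scores for five entities; "a" is an easy answer, "c" and "e" are hard answers.
scores = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.4, "e": 0.1}
ranks = filtered_ranks(scores, hard_answers=["c", "e"], easy_answers=["a"])
```

The faithfulness (entailment) setting uses the same ranking but evaluates easy answers instead of hard ones.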

📈 Datasets and Benchmarking

Inference (datasets)

Transductive datasets (15)
  1. (GQE datasets) GQE, NeurIPS 2018
  2. (Query2Box datasets) Query2Box, ICLR 2020
  3. (BetaE datasets) BetaE, NeurIPS 2020
  4. (Regex datasets) Regex Queries, AKBC 2021
  5. (BiQE dataset) BiQE, AAAI 2021
  6. (Query2Onto datasets) Neural-symbolic Approach for Ontology-mediated Query Answering, arxiv 2021
  7. (EFO-1 dataset) Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)
  8. (SMORE datasets) SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs, KDD 2022
  9. (StarQE dataset) Query Embedding on Hyper-relational Knowledge Graphs, ICLR 2022
  10. (TFLEX dataset) TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022
  11. (WD50K-NFOL dataset) NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs, AAAI 2023
  12. (SQE dataset) Sequential Query Encoding For Complex Query Answering on Knowledge Graphs
  13. (Real EFO-1) On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023
  14. (Numerical CQA dataset) Knowledge Graph Reasoning over Entities and Numerical Values, KDD 2023
  15. (EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation, arxiv 2023
Inductive datasets (3)
  1. (TeMP datasets) Type-aware embeddings for multi-hop reasoning over knowledge graphs, arxiv 2022
  2. (InductiveQE datasets) Inductive Logical Query Answering in Knowledge Graphs, NeurIPS 2022
  3. (GNNQ dataset) GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs, ISWC 2022

GQE Datasets

Are Bio and Reddit available at all? Introduced in GQE, used in 4 papers overall (GQE, GQE+hashing, CGA, TractOR).

BetaE Datasets

The main difference from the Query2Box datasets: queries in the BetaE datasets have fewer than 100 answers. The datasets also include queries with negation.

Introduced in Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs, NeurIPS 2020

Graphs
Dataset Entities Relations Training Edges Validation Edges Test Edges Total Edges
FB15k 14,951 1,345 483,142 50,000 59,071 592,213
FB15k237 14,505 237 272,115 17,526 20,438 310,079
NELL995 63,361 200 114,213 14,324 14,267 142,804
Queries
| Dataset | Training: 1p/2p/3p/2i/3i | Training: 2in/3in/inp/pin/pni | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|---|---|
| FB15k | 273,710 | 27,371 | 59,097 | 8,000 | 67,016 | 8,000 |
| FB15k237 | 149,689 | 14,968 | 20,101 | 5,000 | 22,812 | 5,000 |
| NELL995 | 107,982 | 10,798 | 16,927 | 4,000 | 17,034 | 4,000 |
Average Number of Answers
Dataset 1p 2p 3p 2i 3i ip pi 2u up 2in 3in inp pin pni
FB15k 1.7 19.6 24.4 8.0 5.2 18.3 12.5 18.9 23.8 15.9 14.6 19.8 21.6 16.9
FB15k237 1.7 17.3 24.3 6.9 4.5 17.7 10.4 19.6 24.3 16.3 13.4 19.5 21.7 18.2
NELL995 1.6 14.9 17.5 5.7 6.0 17.4 11.9 14.9 19.0 12.9 11.1 12.9 16.0 13.0

Query2Box Datasets

Introduced in Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings, ICLR 2020.

These EPFO queries are considered easier than those in the BetaE datasets. There are no queries with negation.

Graphs
Dataset Entities Relations Training Edges Validation Edges Test Edges Total Edges
FB15k 14,951 1,345 483,142 50,000 59,071 592,213
FB15k237 14,505 237 272,115 17,526 20,438 310,079
NELL995 63,361 200 114,213 14,324 14,267 142,804
Queries
| Dataset | Training: 1p | Training: others | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|---|---|
| FB15k | 273,710 | 273,710 | 59,097 | 8,000 | 67,016 | 8,000 |
| FB15k237 | 149,689 | 149,689 | 20,101 | 5,000 | 22,812 | 5,000 |
| NELL995 | 107,982 | 107,982 | 16,927 | 4,000 | 17,034 | 4,000 |
Average Number of Answers
Dataset 1p 2p 3p 2i 3i ip pi 2u up
FB15k 10.8 255.6 250.0 90.3 64.1 593.8 190.1 27.8 227.0
FB15k237 13.3 131.4 215.3 69.0 48.9 593.8 257.7 35.6 127.7
NELL995 8.5 56.6 65.3 30.3 15.9 310.0 144.9 14.4 62.5

CGA Datasets

GQE-like patterns mined on subsets of DBpedia and Wikidata. The datasets are DB18 and WikiGeo19, introduced in CGA, K-CAP 2019.

As of Sept 2022: not available.

Regex Queries

Queries emulating property paths in SPARQL with variable length of relation paths (up to length 5). Queries are EPFO queries, i.e., no negation. New operators over relations resemble those from SPARQL:

  • $r_1 / r_2 / \dots$ - relational path, aka classic projection queries
  • $r_1 \lor r_2$ - a union of decomposed patterns $(e, r_1, ?) \lor (e, r_2, ?)$
  • Kleene plus $r^{+}$ - one or more occurrences of relation $r$, e.g., $r_1/r_2^{+}$ corresponds to $r_1 / r_2$, $r_1 / r_2 / r_2$, $r_1 / r_2 / r_2 / r_2 / \dots$ up to some final depth. Those can be cyclic patterns.
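
The Kleene plus expansion described above can be sketched as follows. This is only an illustration: the `expand` helper, the pattern encoding, and the `max_repeat` cap are assumptions for this example and are not taken from the RotatE-Box code.

```python
from itertools import product


def expand(pattern, max_repeat=3):
    """Expand a path pattern with Kleene plus into concrete relation paths.

    pattern: list of (relation, is_plus) pairs,
             e.g. [("r1", False), ("r2", True)] encodes r1/r2+.
    max_repeat: hypothetical cap on how many times a `+` relation repeats.
    """
    options = [
        [[rel] * n for n in range(1, max_repeat + 1)] if is_plus else [[rel]]
        for rel, is_plus in pattern
    ]
    # Pick one repetition count per step and concatenate the pieces.
    return ["/".join(sum(choice, [])) for choice in product(*options)]


paths = expand([("r1", False), ("r2", True)])
print(paths)
```

A union $r_1 \lor r_2$ could be handled the same way by expanding each branch separately and merging the resulting answer sets.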

Two datasets:

  • FB15k-Regex is based on Freebase, queries have fewer than 50 answers, 21 query types
  • Wiki100-Regex is based on query logs from the official Wikidata SPARQL endpoint, 5 query types.

Introduced in RotatE-Box, AKBC 2021.

Repo: GitHub - no actual data dumps are present :(

Graphs
Dataset Entities Relations Training Edges Validation Edges Test Edges Total Edges
FB15k 14,951 1,345 483,142 50,000 59,071 592,213
Wiki100 41,291 100 389,795 21,655 21,656 433,106
Queries

FB15k-Regex

Query type Train Valid Test
$(e_1, r_1^+, ?)$ 24,476 4,614 8,405
$(e_1, r_1/r_2, ?)$ 25,378 4,927 8,844
$(e_1, r_1^+/r_2^+, ?)$ 26,391 4,978 9,028
$(e_1, r_1^+/r_2^+/r_3^+, ?)$ 25,470 4,878 8,816
$(e_1, r_1/r_2^+, ?)$ 26,335 5,007 9,062
$(e_1, r_1^+/r_2, ?)$ 27,614 5,229 9,429
$(e_1, r_1^+/r_2^+/r_3, ?)$ 27,865 5,283 9,509
$(e_1, r_1^+/r_2/r_3^+, ?)$ 26,366 5,058 9,159
$(e_1, r_1/r_2^+/r_3^+, ?)$ 26,366 5,045 9,099
$(e_1, r_1/r_2/r_3^+, ?)$ 26,703 5,155 9,313
$(e_1, r_1/r_2^+/r_3, ?)$ 28,005 5,380 9,688
$(e_1, r_1^+/r_2/r_3, ?)$ 27,884 5,338 9,632
$(e_1, r_1\lor r_2, ?)$ 30,080 5,828 9,664
$(e_1, (r_1\lor r_2)/r_3, ?)$ 31,559 6,606 10,974
$(e_1, r_1/(r_2\lor r_3), ?)$ 41,886 7,755 13,611
$(e_1, r_1^+\lor r_2^+, ?)$ 23,109 4,469 8,367
$(e_1, (r_1\lor r_2)/r_3^+, ?)$ 27,658 5,738 9,711
$(e_1, (r_1^+\lor r_2^+)/r_3, ?)$ 24,462 4,865 8,863
$(e_1, r_1^+/(r_2\lor r_3), ?)$ 27,676 5,340 9,267
$(e_1, r_1/(r_2^+\lor r_3^+), ?)$ 28,542 5,475 9,436
$(e_1, (r_1\lor r_2)^+, ?)$ 26,260 5,523 10,360
Total 580,085 112,491 200,237

Wiki100-Regex

Query type Train Valid Test
$(e_1, r_1^+, ?)$ 490,562 24,878 23,443
$(e_1, r_1^+/r_2^+, ?)$ 6,945 620 772
$(e_1, r_1/r_2^+, ?)$ 85,253 10,013 8,377
$(e_1, r_1\lor r_2, ?)$ 274,012 14,900 14,915
$(e_1, (r_1\lor r_2)^+, ?)$ 348,274 15,720 15,311
Total 1,205,046 66,131 62,818

DAG Queries

Conjunctive queries (w/o union) not limited to 9 patterns from Query2Box/BetaE datasets. The task is to predict all intermediate entities, not just final leaf nodes. Query depth: 2-5; max 3 intersecting branches.

Introduced in Answering complex queries in knowledge graphs with bidirectional sequence encoders, AAAI 2021.

New FB15K-237-CQ and WN18RR-CQ datasets have two variations:

  • CQ (conjunctive queries) - Training on triples + paths + DAGs, Validation/Test on DAGs only;
  • Paths - Training on triples + paths, Validation/Test on paths only

Sept 2022: the datasets are not publicly available.

Graphs
| Statistic | FB15K-237-CQ Train | FB15K-237-CQ Validation | FB15K-237-CQ Test | WN18RR-CQ Train | WN18RR-CQ Validation | WN18RR-CQ Test |
|---|---|---|---|---|---|---|
| Entities | 14,505 | - | - | 40,943 | - | - |
| Relations | 237 | 237 | 237 | 11 | 11 | 11 |
| Triples | 272,115 | - | - | 86,835 | - | - |
| Paths | 50,000 | - | - | 10,000 | - | - |
| DAGs | 48,865 | 2,785 | 2,599 | 9,465 | 112 | 95 |
| Avg Masks | 1.86 | 5.91 | 6.05 | 1.84 | 5.13 | 4.91 |
| Avg Query Len (Tokens) | 152 | 460 | 479 | 71 | 198 | 199 |
Queries

No detailed breakdown by query type is available, only the DAGs stats from the main table.

| Statistic | FB15K-237-CQ Train | FB15K-237-CQ Validation | FB15K-237-CQ Test | WN18RR-CQ Train | WN18RR-CQ Validation | WN18RR-CQ Test |
|---|---|---|---|---|---|---|
| Paths | 50,000 | - | - | 10,000 | - | - |
| DAGs | 48,865 | 2,785 | 2,599 | 9,465 | 112 | 95 |
| Avg Masks | 1.86 | 5.91 | 6.05 | 1.84 | 5.13 | 4.91 |
| Avg Query Len (Tokens) | 152 | 460 | 479 | 71 | 198 | 199 |

EFO-1 Queries

Existential First-Order queries with Single Free Variable, extended from BetaE. The goal is to evaluate the combinatorial generalizability.

Introduced in Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)

Graphs
| Dataset | Training: 1p/2p/3p/2i/3i | Training: 2in/3in/inp/pin/pni | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|---|---|
| FB15k | 273,710 | 27,371 | 59,097 | 8,000 | 67,016 | 8,000 |
| FB15k237 | 149,689 | 14,968 | 20,101 | 5,000 | 22,812 | 5,000 |
| NELL995 | 107,982 | 10,798 | 16,927 | 4,000 | 17,034 | 4,000 |
Queries

The 301 query types are too many to list here. Details can be found in a summary Excel file here.

Real EFO-1 dataset

Rethinks the EFO-1 formulation by introducing leaf nodes, multi-edges, and cycles. For the standard FB15k, FB15k-237, and NELL graphs, there are 9 new query types (10 with the reworked pni type), including:

  • l - queries with existentially quantified variables as leaf nodes (2il, 3il)
  • m - queries with multiple relation projection edges from one variable to another (2m, 2nm, 3mp, 3pm, im)
  • c - queries with cycles (3c, cm)

Each new query type has 5000 instances in the three KGs. Introduced in On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023. The dataset can be downloaded from here.

EPFO queries with Literals

Based on a variation of the FB15k-237 dataset with entity attributes (12,390 entities, 237 relations, 115 attributes, 29,229 (?) triples). Literals are restricted to numerical values, with three additional filter functions (less than, equal, greater than).

The dataset includes standard 9 EPFO query types and adds 8 more variations of those patterns enriched with literals:

  • 5 query types where literals are in queries, but the answer is an entity (ai, 2ai, pai, aip, au)
  • 3 query types where literals are in queries, and the answer is a mean of relevant literal values (1ap, 2ap, 3ap)
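
To give a feel for the two families of literal queries, here is a small symbolic sketch with one query whose answer is a set of entities selected by a numeric filter and one whose answer is the mean of attribute values. All data, attribute names, and helper functions are hypothetical, and the sketch does not reproduce the exact semantics of the listed query types.

```python
import operator
from statistics import mean

# Hypothetical attribute triples: (entity, attribute, numeric value).
ATTRIBUTES = [
    ("film_a", "release_year", 1994),
    ("film_b", "release_year", 2008),
    ("film_c", "release_year", 2010),
]
# Hypothetical relational triples: (head, relation, tail).
TRIPLES = [
    ("director_x", "directed", "film_a"),
    ("director_x", "directed", "film_b"),
    ("director_y", "directed", "film_c"),
]

# The three filter functions: less than, equal, greater than.
FILTERS = {"lt": operator.lt, "eq": operator.eq, "gt": operator.gt}


def attribute_filter(attribute, op, value):
    """Entities whose `attribute` value satisfies the filter (entity answer)."""
    return {e for e, a, v in ATTRIBUTES if a == attribute and FILTERS[op](v, value)}


def attribute_mean(entities, attribute):
    """Mean of `attribute` over `entities` (literal answer)."""
    return mean(v for e, a, v in ATTRIBUTES if a == attribute and e in entities)


# Literal in the query, entity answer: films released after 2000.
recent = attribute_filter("release_year", "gt", 2000)

# Literal answer: mean release year of films directed by director_x.
films_of_x = {t for h, r, t in TRIPLES if h == "director_x" and r == "directed"}
avg_year = attribute_mean(films_of_x, "release_year")
```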

Introduced in LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals, arxiv 2023

SQE Queries

Existential First-Order queries aimed at evaluating compositional generalization to OOD query patterns (29 in-distribution types, 29 out-of-distribution). In contrast to BetaE datasets, does not have restrictions on the number of answers per query, long tails are possible.

Introduced in Sequential Query Encoding For Complex Query Answering on Knowledge Graphs

Graphs
| Dataset | Training: 1p | Training: others | Validation: all | Test: all |
|---|---|---|---|---|
| FB15k | 273,710 | 821,130 | 8,000 | 8,000 |
| FB15k237 | 149,689 | 449,067 | 5,000 | 5,000 |
| NELL995 | 107,982 | 323,946 | 4,000 | 4,000 |
Queries

58 query types, refer to Appendix A in the paper for the full list of patterns.

EFOk queries

Introduced by (EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation with 741 query types in total.

Features:

  • existential first-order queries with more than one variable
  • combinatorial space with multi-edge and cyclic queries

Numerical CQA Queries

The Numerical CQA queries include both entities and typed numerical attribute values.

Introduced by Knowledge Graph Reasoning over Entities and Numerical Values

Graphs
| Graph | Data Split | 1p | 2p | 2i | 3i | pi | ip | 2u | up | All |
|---|---|---|---|---|---|---|---|---|---|---|
| FB15K | Training | 304,633 | 138,192 | 226,729 | 288,874 | 260,057 | 233,834 | 284,301 | 284,931 | 2,021,551 |
| FB15K | Validation | 8,271 | 15,860 | 23,359 | 28,836 | 25,081 | 22,930 | 29,187 | 29,210 | 182,734 |
| FB15K | Testing | 7,969 | 15,431 | 23,346 | 28,865 | 24,810 | 22,232 | 29,212 | 29,274 | 181,139 |
| DB15K | Training | 124,851 | 99,698 | 140,427 | 190,413 | 171,353 | 163,687 | 190,364 | 194,244 | 1,275,037 |
| DB15K | Validation | 3,529 | 10,388 | 9,792 | 13,817 | 14,594 | 16,651 | 19,512 | 19,792 | 108,075 |
| DB15K | Testing | 3,387 | 10,047 | 9,914 | 14,603 | 14,642 | 15,897 | 19,504 | 19,773 | 107,767 |
| YAGO15K | Training | 84,014 | 76,238 | 136,282 | 183,850 | 162,712 | 145,994 | 183,963 | 183,459 | 1,156,512 |
| YAGO15K | Validation | 2,833 | 7,986 | 10,757 | 16,884 | 13,485 | 13,899 | 18,444 | 19,105 | 103,393 |
| YAGO15K | Testing | 2,713 | 7,949 | 10,935 | 17,171 | 13,481 | 13,526 | 18,433 | 18,997 | 103,205 |

Type-Aware Datasets

In addition to a normal graph of entities (instances) à la BetaE datasets, the type-aware datasets offer an additional set of classes, a class hierarchy (from a pre-existing ontology), and instanceOf links between entities and classes.

Those datasets might include an additional task of predicting types of answer entities (Concept Retrieval).

LUBM and NELL employ ontological axioms of the DL-Lite (R) family of Description Logics.

Graphs

TODO Figure out Concept Retrieval edges in TAR

Dataset Entities Relations Axioms Base Graph Materialized Graph
LUBM 55,684 28 68 284k 565k
NELL 63,361 400 307 285k 497k

Axioms breakdown in ontologies for LUBM and NELL

Rules LUBM NELL
$\mathcal{O}$ (Total) 68 307
$A \sqsubseteq A'$ (Subclass) 13 -
$p \sqsubseteq s$ 5 92
$p^{-} \sqsubseteq s$ 28 215
$\exists p \sqsubseteq A$ 11 -
$\exists p^{-} \sqsubseteq A$ 11 -

| Dataset | Entities | Relations | Classes | Training Edges | Validation Edges | Test Edges | Entity-Class Edges | Class Hierarchy Edges | Total Edges |
|---|---|---|---|---|---|---|---|---|---|
| YAGO 4 | 32,465 | 75 | 8,382 | 101,417 | 1,000 | 1,000 | 83,291 | 16,644 | 184,708 |
| DBpedia | 28,824 | 327 | 981 | 136,821 | 1,000 | 1,000 | 225,436 | 2,582 | 362,257 |

Queries
| Dataset | Train / Test | 1p | 2p | 3p | 2i | 3i | ip | pi | 2u | up |
|---|---|---|---|---|---|---|---|---|---|---|
| LUBM | Plain (Train) | 110,000 | 110,000 | 110,000 | 110,000 | 110,000 | - | - | - | - |
| LUBM | Generalized (Train) | 117,124 | 136,731 | 150,653 | 181,234 | 208,710 | - | - | - | - |
| LUBM | Specialized (Train) | 117,780 | 154,851 | 173,678 | 271,532 | 230,085 | - | - | - | - |
| LUBM | Ontological (Train) | 116,893 | 166,159 | 333,406 | 212,718 | 491,707 | - | - | - | - |
| LUBM | Induction (w/ missing links in queries) (Val/Test) | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 |
| LUBM | Deduction (w/o missing link in training) (Val/Test) | 1,241 | 4,701 | 6,472 | 3,829 | 4,746 | 7,393 | 7,557 | 4,986 | 7,122 |
| LUBM | Induction + Deduction (Val/Test) | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 7,986 | 8,000 |
| NELL | Plain (Train) | 107,982 | 107,982 | 107,982 | 107,982 | 107,982 | - | - | - | - |
| NELL | Generalized (Train) | 174,310 | 408,842 | 864,268 | 398,412 | 930,787 | - | - | - | - |
| NELL | Specialized (Train) | 174,310 | 419,664 | 906,609 | 401,954 | 936,537 | - | - | - | - |
| NELL | Ontological (Train) | 114,614 | 542,923 | 864,268 | 629,144 | 930,787 | - | - | - | - |
| NELL | Induction (w/ missing links in queries) (Val/Test) | 15,688 | 3,910 | 3,918 | 3,828 | 3,786 | 3,932 | 3,895 | 3,940 | 3,966 |
| NELL | Deduction (w/o missing link in training) (Val/Test) | 346 | 4,461 | 4,294 | 4,842 | 5,996 | 7,295 | 5,862 | 5,646 | 6,894 |
| NELL | Induction + Deduction (Val/Test) | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 8,000 | 7,990 | 8,000 |

| Dataset | Training: 1p | Training: others | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|---|---|
| YAGO 4 (Concept Retrieval) | 189,338 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| YAGO 4 (Entity Only) | 101,417 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| YAGO 4 (Entity + Instantiations) | 184,708 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| DBpedia (Concept Retrieval) | 473,924 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| DBpedia (Entity Only) | 136,821 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |
| DBpedia (Entity + Instantiations) | 362,257 | 10,000 | 1,000 | 1,000 | 1,000 | 1,000 |

Very Large Datasets

Introduced in SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs, KDD 2022.

Training queries are sampled on-the-fly during training due to the huge size of underlying graphs.
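
As a rough idea of what sampling queries on the fly can look like, the sketch below lazily generates multi-hop path queries by random walks instead of materializing a query file. It is a simplified illustration under assumed names (`sample_path_queries`, `answers`) and is not the SMORE sampler.

```python
import random
from collections import defaultdict


def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = defaultdict(list)
    for h, r, t in triples:
        adj[h].append((r, t))
    return dict(adj)


def answers(adj, anchor, path):
    """All entities reachable from `anchor` by following `path`."""
    frontier = {anchor}
    for rel in path:
        frontier = {t for n in frontier for r, t in adj.get(n, []) if r == rel}
    return frontier


def sample_path_queries(triples, hops, seed=0):
    """Lazily yield (anchor, relation_path, answer_set) tuples.

    Walks that hit a dead end before `hops` steps are discarded, so the
    generator only terminates a draw once a full-length path is found.
    """
    rng = random.Random(seed)
    adj = build_adjacency(triples)
    heads = sorted(adj)
    while True:
        anchor = rng.choice(heads)
        node, path = anchor, []
        for _ in range(hops):
            edges = adj.get(node, [])
            if not edges:
                break  # dead end: resample a new anchor
            rel, node = rng.choice(edges)
            path.append(rel)
        else:
            yield anchor, tuple(path), answers(adj, anchor, path)


toy = [("a", "r", "b"), ("b", "s", "c"), ("b", "s", "d")]
query = next(sample_path_queries(toy, hops=2))
```

A training loop would simply keep pulling from the generator, so memory use stays independent of the number of queries seen.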

The underlying graphs are FB400k (400K nodes), WikiKG 2 (2.5M nodes) (from OGB), and full Freebase (86M nodes).

TODO: confirm with Hongyu the number of validation / test queries.

Graphs
Dataset Entities Relations Training Edges Validation Edges Test Edges Total Edges
FB400k 409,829 918 1,075,837 537,917 537,917 2,151,671
WikiKG2 2,500,604 535 16,109,182 429,456 598,543 17,137,181
Freebase 86,054,361 14,824 304,727,650 16,929,318 16,929,308 338,586,276
Queries
| Dataset | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|
| FB400k | TODO | TODO | TODO | TODO |
| WikiKG2 | TODO | TODO | TODO | TODO |
| Freebase | TODO | TODO | TODO | TODO |

Hyper-Relational Datasets

The main difference in hyper-relational datasets is that edges are no longer plain triples $(h, r, t)$ but statements (in Wikidata or RDF-Star terms) $\Big(h, r, t, (q_{ri}, q_{ei})_i\Big)$ with key-value (relation:entity) qualifiers $(q_{r}, q_{e})$ over the main triple. For example, in the statement (Albert Einstein, educated at, ETH Zurich, (degree, Bachelor)), the main triple is (Albert Einstein, educated at, ETH Zurich) and its only qualifier is (degree, Bachelor). Qualifiers provide additional context to the edge: the tail node might change with another qualifier, e.g., (Albert Einstein, educated at, University of Zurich, (degree, Doctorate)).

Entities and relations in qualifiers are still legit entities and relations which could be present in main triples. Some entities and relations can be found only in qualifiers.
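
One possible in-memory representation of such statements is sketched below, using the Einstein example from the text. The `Statement` class and the `tails` helper are illustrative assumptions and do not mirror the WD50K file format or the StarQE code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Statement:
    """A main triple plus zero or more (qualifier relation, qualifier entity) pairs."""
    head: str
    relation: str
    tail: str
    qualifiers: Tuple[Tuple[str, str], ...] = ()

    def main_triple(self):
        return (self.head, self.relation, self.tail)


STATEMENTS = [
    Statement("Albert Einstein", "educated at", "ETH Zurich",
              (("degree", "Bachelor"),)),
    Statement("Albert Einstein", "educated at", "University of Zurich",
              (("degree", "Doctorate"),)),
]


def tails(head, relation, qualifiers=()):
    """Answer (head, relation, ?) over statements that carry all given qualifiers."""
    return {
        s.tail for s in STATEMENTS
        if s.head == head and s.relation == relation
        and set(qualifiers) <= set(s.qualifiers)
    }


all_schools = tails("Albert Einstein", "educated at")
doctorate = tails("Albert Einstein", "educated at", (("degree", "Doctorate"),))
```

Without qualifiers the projection returns both universities; adding the (degree, Doctorate) qualifier narrows the answer to a single tail.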

The WD50K dataset has only conjunctive queries (projection + intersection), neither union nor negation.

Introduced in Query Embedding on Hyper-Relational Knowledge Graphs, ICLR 2022

The WD50K-NFOL dataset introduced in NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs adds unions and negations, as well as the possibility of variables at qualifier entity positions. (As of Nov 2022, not openly available.)

Graph

The original WD50K graph from the StarE paper by Galkin et al.

Dataset Entities Relations Qualifier-only Entities Qualifier-only Relations Training Edges Validation Edges Test Edges Total Edges
WD50K 47,156 532 5460 45 166,435 23,913 46,159 236,508

32,167 edges have at least one key-value (relation:entity) qualifier.

Queries
Split 1p 2p 3p 2i 3i ip pi
train 24,819 313,088 5,950,990 48,513 318,735 306,022 1,088,539
validation 4,100 100,706 2,968,315 15,648 169,195 169,438 569,957
test 7,716 202,045 6,433,476 38,207 547,272 445,007 1,267,452

WD50K-NFOL stats are not yet available

Inductive Datasets

As of March 2023, there are no existing purely inductive datasets such that the training and validation/test graphs are different (validation and test containing new unseen entities) and predictions should only rely on the graph structure w/o external data.

Type-based Inductive

As a bridge between shallow transductive models and inductive inference, Type-aware Embeddings for Multi-Hop Reasoning over Knowledge Graphs proposes to mine entity types as the invariant that remains the same at training and inference.

As a result, the following datasets assume a class hierarchy (or a graph of classes) that exists and is known in advance. Technically, those can be put in the Type-Aware Datasets category. The query datasets only have EPFO queries (no negation).

Inductive splits have been published, see the GitHub issue

Graphs

The underlying graphs are FB15k-237-V2 and NELL995-V3 from Inductive relation prediction by subgraph reasoning by Teru et al, ICML 2020. The original repo and other datasets are here.

Training is performed on the Train Graph, but at validation/test time the model is fed with a new Inference Graph with completely new nodes. The Inference Graph has missing edges that have to be predicted at validation or test time.

| Dataset | Relations | Types | Train Graph: Entities | Train Graph: Edges | Inference Graph: Entities | Inference Graph: Edges | Inference Graph: Validation Edges | Inference Graph: Test Edges |
|---|---|---|---|---|---|---|---|---|
| FB15k-237-V2 | 203 | 3,851 | 3,000 | 4,245 | 2,000 | 4,145 | 469 | 478 |
| NELL995-V3 | 142 | 267 | 4,647 | 16,393 | 4,921 | 8,048 | 811 | 809 |

The type hierarchy created for those datasets remains unknown.

Queries
| Dataset | Training: 1p/2p/3p/2i/3i | Validation: 1p | Validation: others | Test: 1p | Test: others |
|---|---|---|---|---|---|
| FB15k-237-V2 | 9,964 | 1,738 | 2,000 | 791 | 1,000 |
| NELL995-V3 | 12,010 | 2,197 | 2,000 | 1,167 | 1,500 |

Tree-like Conjunctive Inductive

The dataset proposed in GNNQ frames query answering as node classification. The dataset has 9 tree-like conjunctive queries (6 synthetic from WatDiv and 3 from FB15k237), with no unions or negations. For each query, there are P KGs with an answer entity satisfying the query and N KGs with negative samples where an answer does not satisfy the query. Test splits have graphs with new entities (but the same query shapes).

Graphs

Many: each WatDiv query has 2K positive GRAPHS and 700K negative GRAPHS (each of about 100K triples); each FB15k237 query has about 1K positive GRAPHS and 1K negative GRAPHS (each of about 10K triples).

Queries

Each query in the table has many associated graphs where one node is an answer (positive graph sample) and where nodes are not answers (negative graph samples)

Query Relations Num atoms / tree depth Train: pos/neg Test: pos/neg
WatDiv-Q1 158 8 / 4 2114 / 699699 1085 / 349877
WatDiv-Q2 158 8 / 3 3258 / 698396 1769 / 349119
WatDiv-Q3 158 8 / 3 1520 / 700276 798 / 350165
WatDiv-Q4 158 10 / 4 2397 / 698986 1226 / 349546
WatDiv-Q5 158 10 / 4 6338 / 693988 2866 / 347570
WatDiv-Q6 158 10 / 4 7545 / 692439 3744 / 346290
FB15k237-Q1 237 7 / 4 1185 / 1180 395 / 395
FB15k237-Q2 237 7 / 4 650 / 660 220 / 220
FB15k237-Q3 237 5 / 4 860 / 870 290 / 290

Temporal Datasets

Introduced in TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph, arxiv 2022.

Based on FOL operators, the dataset focuses on temporal reasoning, which includes after, before and between on any timestamp set.

Graphs
Dataset Entities Relations Timestamps Training Edges Validation Edges Test Edges Total Edges
ICEWS14 7,128 230 365 72,826 8,941 8,963 90,730
ICEWS05-15 10,488 251 4,017 386,962 46,275 46,092 479,329
GDELT-500 500 20 366 2,735,685 341,961 341,961 3,419,607
Queries
Query Name ICEWS14-Train Validation Test ICEWS05-15-Train Validation Test GDELT-500-Train Validation Test
Pe2 72826 3482 4037 368962 10000 10000 2215309 10000 10000
Pe3 72826 3492 4083 368962 10000 10000 2215309 10000 10000
Pe_Pt 7282 3385 3638 36896 10000 10000 221530 10000 10000
e2i 72826 3305 3655 368962 10000 10000 2215309 10000 10000
e3i 72826 2966 3023 368962 10000 10000 2215309 10000 10000
e2i_Pe - 2913 2913 - 10000 10000 - 10000 10000
Pe_e2i - 2913 2913 - 10000 10000 - 10000 10000
Pe_t2i - 2913 2913 - 10000 10000 - 10000 10000
e2i_NPe 7282 3061 3192 36896 10000 10000 221530 10000 10000
e2i_peN 7282 2971 3031 36896 10000 10000 221530 10000 10000
Pe_e2i_Pe_NPe 7282 2968 3012 36896 10000 10000 221530 10000 10000
e2i_N 7282 2949 2975 36896 10000 10000 221530 10000 10000
e3i_N 7282 2913 2914 36896 10000 10000 221530 10000 10000
e2u - 2913 2913 - 10000 10000 - 10000 10000
Pe_e2u - 2913 2913 - 10000 10000 - 10000 10000
Pt_lPe 7282 4976 5608 36896 10000 10000 221530 10000 10000
Pt_rPe 7282 3321 3621 36896 10000 10000 221530 10000 10000
t2i 72826 5112 6631 368962 10000 10000 2215309 10000 10000
t3i 72826 3094 3296 368962 10000 10000 2215309 10000 10000
t2i_Pe - 2913 2913 - 10000 10000 - 10000 10000
Pt_le2i 7282 3226 3466 36896 10000 10000 221530 10000 10000
Pt_re2i 7282 3236 3485 36896 10000 10000 221530 10000 10000
t2i_NPt 7282 4873 5464 36896 10000 10000 221530 10000 10000
t2i_PtN 7282 3300 3609 36896 10000 10000 221530 10000 10000
Pe_t2i_PtPe_NPt 7282 3031 3127 36896 10000 10000 221530 10000 10000
t2i_N 7282 3135 3328 36896 10000 10000 221530 10000 10000
t3i_N 7282 2924 2944 36896 10000 10000 221530 10000 10000
t2u - 2913 2913 - 10000 10000 - 10000 10000
Pe_t2u - 2913 2913 - 10000 10000 - 10000 10000
Pe_aPt 7282 4134 4733 68262 10000 10000 221530 10000 10000
Pe_bPt 7282 3970 4565 36896 10000 10000 221530 10000 10000
Pe_at2i 7282 4607 5338 36896 10000 10000 221530 10000 10000
Pe_bt2i 7282 4583 5386 36896 10000 10000 221530 10000 10000
between 7282 2913 2913 36896 10000 10000 221530 10000 10000

Dataset tools

  • Graph Query Sampler: Not a method, rather a dataset generator
  • EFO-1-QA-benchmark: Generating combinatorial tree-formed query types and sampling the data.
  • EFOk-CQA: Generating combinatorial existential first order query types with multiple (k) variables and sampling the data.

🔧 Implementations

  • KGReasoning: GQE, Query2Box, BetaE
  • CQD: GQE, Query2Box, BetaE, CQD
  • EFO-1-QA-benchmark: Query2Box, BetaE, LogicE, NewLook, ConE, FuzzQE
  • Query2particles
  • StarQE: StarQE
  • SMORE: GQE, Query2Box, BetaE + Very Large Datasets
  • GNN-QE: GNN-QE
  • InductiveQE: Inductive QE with NodePiece and GNN-QE
  • TAR: TAR
  • QE-TeMP: TeMP (based on KGReasoning)
  • GNNQ: GNNQ
  • SE-KGE: GQE, CGA, and geospatial model
  • LARK: LARK (uses Huggingface LLMs)
  • WFRE: WFRE
  • FIT: FIT
  • SQE: SQE with Transformer/LSTM/GRU/TCN, Tree-LSTM, Tree-RNN, BetaE, BiQE, ConE, FuzzQE, GQE, HypE, NeuralMLP (Mixer), Query2Box, Query2Particles
  • NRN: NRN with GQE, Query2Box, Query2Particles
  • EFOk-CQA: EFOk

All Papers on Complex Logical Query Answering (54)

  1. (GQE) Embedding Logical Queries on Knowledge Graphs NeurIPS 2018
  2. (GQE + hashing) Learning to Hash for Efficient Search over Incomplete Knowledge Graphs ICDM 2019
  3. (CGA) Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs K-CAP 2019, GQE + self-attention instead of DeepSet
  4. (TractOR) Symbolic querying of vector spaces: Probabilistic databases meets relational embeddings, UAI 2020
  5. (Query2Box) Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings ICLR 2020
  6. (BetaE) Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs NeurIPS 2020
  7. (EmQL) Faithful embeddings for knowledge base queries NeurIPS 2020
  8. (MPQE) Message Passing Query Embedding ICML’20 Workshop
  9. (RotatE-Box) Regex Queries over Incomplete Knowledge Bases AKBC’21
  10. (BiQE) Answering complex queries in knowledge graphs with bidirectional sequence encoders, AAAI’21
  11. Approximate knowledge graph query answering: from ranking to binary classification
  12. Knowledge Sheaves: A Sheaf-Theoretic Framework for Knowledge Graph Embedding arxiv, 2021
  13. (ConE) Cone: Cone embeddings for multi-hop reasoning over knowledge graphs NeurIPS’21
  14. (PERM) Probabilistic entity representation model for reasoning over knowledge graphs (improv over BetaE) NeurIPS’21
  15. (CQD) Complex Query Answering with Neural Link Predictors ICLR’21
  16. (HypE) Self-Supervised Hyperboloid Representations from Logical Queries over Knowledge Graphs, WWW 2021
  17. (NewLook) Neural-Answering Logical Queries on Knowledge Graphs (KDD’21)
  18. Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs, NeurIPS 2021 (Datasets and Benchmarks)
  19. Neuro-Symbolic Ontology-Mediated Query Answering OpenReview 2021
  20. (LogicE) Logic Embeddings for Complex Query Answering arxiv 2021
  21. (StarQE) Query Embedding on Hyper-relational Knowledge Graphs ICLR 2022
  22. (MLPMix) Neural Methods for Logical Reasoning over Knowledge Graphs ICLR 2022
  23. (FuzzQE) Fuzzy Logic Based Logical Query Answering on Knowledge Graphs, AAAI 2022
  24. (GNN-QE) Neural-Symbolic Models for Logical Queries on Knowledge Graphs, ICML 2022
  25. (SMORE) SMORE: Knowledge Graph Completion and Multi-hop Reasoning in Massive Knowledge Graphs KDD 2022
  26. (kgTransformer) Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries KDD 2022
  27. (Query2Particles) Query2Particles: Knowledge Graph Reasoning with Particle Embeddings, Findings NAACL’22
  28. (TAR) TAR: Neural Logical Reasoning across TBox and ABox (arxiv, 2022)
  29. (TeMP) Type-aware embeddings for multi-hop reasoning over knowledge graphs (IJCAI-ECAI 2022)
  30. (FLEX) FLEX: Feature-Logic Embedding Framework for CompleX Knowledge Graph Reasoning (arxiv 2022)
  31. (TFLEX) TFLEX: Temporal Feature-Logic Embedding Framework for Complex Reasoning over Temporal Knowledge Graph (arxiv, 2022)
  32. (LinE) LinE: Logical Query Reasoning over Hierarchical Knowledge Graphs KDD 2022
  33. GNNQ: A Neuro-Symbolic Approach for Query Answering over Incomplete Knowledge Graphs ISWC 2022
  34. (ENeSy) Neural-Symbolic Entangled Framework for Complex Query Answering NeurIPS 2022
  35. (NodePiece-QE, InductiveQE) Inductive Logical Query Answering in Knowledge Graphs NeurIPS 2022
  36. (RoMA) Reasoning over Multi-view Knowledge Graphs arxiv 2022, some new datasets, but no code/data published
  37. (LMPNN) Logical Message Passing Networks With One-Hop Inference On Atomic Formulas ICLR'23
  38. (GammaE) GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs EMNLP 2022
  39. (NMP-QEM) Neural-based Mixture Probabilistic Query Embedding for Answering FOL queries on Knowledge Graphs, EMNLP 2022
  40. (NQE) NQE: N-ary Query Embedding for Complex Query Answering over Hyper-relational Knowledge Graphs AAAI 2023
  41. (QTO) Answering Complex Logical Queries on Knowledge Graphs via Query Computation Tree Optimization, ICML'23 submission
  42. (SignalE) Signal Embeddings for Complex Logical Reasoning in Knowledge Graphs, KSEM'22
  43. (Var2Vec) Efficient Embeddings of Logical Variables for Query Answering over Incomplete Knowledge Graphs, AAAI'23
  44. (CQD-A) Adapting Neural Link Predictors for Complex Query Answering
  45. (Query2Geom) Analysis of Attention Mechanisms in Box-Embedding Systems, 2023
  46. (SQE) Sequential Query Encoding For Complex Query Answering on Knowledge Graphs, TMLR 2023
  47. (CylE) CylE: Cylinder Embeddings for Multi-hop Reasoning over Knowledge Graphs, EACL 2023
  48. (RoConE) Modeling Relational Patterns for Logical Query Answering over Knowledge Graphs
  49. (FIT) On Existential First Order Queries Inference on Knowledge Graphs, arxiv 2023
  50. (LitCQD) LitCQD: Multi-Hop Reasoning in Incomplete Knowledge Graphs with Numeric Literals, arxiv 2023
  51. (LARK) Complex Logical Reasoning over Knowledge Graphs using Large Language Models, arxiv 2023
  52. (WFRE) Wasserstein-Fisher-Rao Embedding: Logical Query Embeddings with Local Comparison and Global Transport, arxiv 2023
  53. (NRN) Knowledge Graph Reasoning over Entities and Numerical Values KDD 2023
  54. (EFOk-CQA) EFOk-CQA: Towards Knowledge Graph Complex Query Answering beyond Set Operation arxiv 2023

Application Papers (7)

  1. (SE-KGE) SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting, Transactions in GIS 2020, GQE with scalar (x,y) coordinate prediction / encoding
  2. (LEGO) Lego: Latent execution-guided reasoning for multi-hop question answering on knowledge graphs, ICML 2021
  3. (CBR-SubG) Knowledge base question answering by case-based reasoning over subgraphs ICML 2022, application to Question Answering, entailment only, custom datasets
  4. (LogiRec) Towards High-Order Complementary Recommendation via Logical Reasoning Network Application: BetaE in RecSys, arxiv 2022
  5. Context-aware explainable recommendation based on domain knowledge graph, Big Data and Cognitive Computing, 2022
  6. (PLM4CLQA) Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning, arxiv 2023
  7. Unifying structure reasoning and language model pre-training for complex reasoning, arxiv 2023

Potentially Relevant

  1. Hybrid Structured and Similarity Queries over Wikidata plus Embeddings with Kypher-V, ISWC 2022
  2. Combining RDF Graph Data and Embedding Models for an Augmented Knowledge Graph, BigNet 2018 Workshop @ WWW'18
  3. TrQuery: An Embedding-based Framework for Recommanding SPARQL Queries, 2018
  4. Towards Empty Answers in SPARQL: Approximating Querying with RDF Embedding, ISWC 2018

Citation

If you find this work useful, please cite the original paper:

@article{ren2023ngdb,
    title={Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases},
    author={Hongyu Ren and Mikhail Galkin and Michael Cochez and Zhaocheng Zhu and Jure Leskovec},
    year={2023},
    eprint={2303.14617},
    archivePrefix={arXiv},
}
