Our current UNION-based bind join algorithm (see PhysicalOpBindJoinWithUNION and ExecOpBindJoinSPARQLwithUNION) creates a UNION pattern with FILTERs added into each UNION part. That's unnecessarily complex for the accessed SPARQL endpoint because the given graph pattern before the FILTER is repeated within each of the UNION parts.
A better idea is to create a UNION pattern where each part of the UNION is a version of the given graph pattern in which the join variables have been substituted by applying one of the solutions of the current input batch. To figure out which of the UNION parts a retrieved solution mapping comes from and, thus, which of the input solutions it has to be joined with, each UNION part needs to be extended with a BIND clause of the form BIND( x AS ?cnt ), where ?cnt is a new variable (it needs to be the same in all UNION parts) and x is an integer that is different for each UNION part.
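To illustrate the shape of the query this would produce, here is a minimal Python sketch (not HeFQUIN code). The helper names `substitute`, `build_union_with_bind`, and `join_back`, the string-level substitution, and the result-row format (a dict from variable name to term) are all assumptions made for illustration. A real implementation would operate on the parsed graph pattern rather than on query strings.

```python
import re

# Matches a SPARQL variable such as ?s (simplified; ignores the $s form).
VAR = re.compile(r"\?([A-Za-z_][A-Za-z0-9_]*)")

def substitute(pattern: str, solution: dict) -> str:
    """Replace every join variable bound in `solution` by its term.

    Assumption: the pattern contains no literals or IRIs that include
    text looking like a variable; this is purely textual substitution.
    """
    return VAR.sub(lambda m: solution.get(m.group(1), m.group(0)), pattern)

def build_union_with_bind(pattern: str, batch: list, cnt_var: str = "cnt") -> str:
    """Build one UNION part per input solution, tagged with BIND( i AS ?cnt )."""
    parts = []
    for i, solution in enumerate(batch):
        body = substitute(pattern, solution)
        parts.append("{ " + body + " BIND( " + str(i) + " AS ?" + cnt_var + " ) }")
    return "SELECT * WHERE { " + " UNION ".join(parts) + " }"

def join_back(row: dict, batch: list, cnt_var: str = "cnt") -> dict:
    """Merge a retrieved row with the input solution its ?cnt value points to."""
    merged = dict(batch[int(row[cnt_var])])
    merged.update({k: v for k, v in row.items() if k != cnt_var})
    return merged
```

For a pattern `?s <http://ex.org/p> ?o .` and a batch of two solutions for `?s`, this yields two UNION parts that differ only in the substituted term and in the integer of the BIND clause; `join_back` then uses the value of `?cnt` to pick the input solution to join with.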
Actually, instead of adding BIND clauses, one of the variables that remain after the substitution (and which, thus, is not a join variable) should be renamed in each of the UNION parts such that it is a different variable in each UNION part (e.g., ?x is renamed to ?x1, ?x2, etc.; but careful, there must not be any other variable with one of these new names). Then, depending on which of these new variables is bound in a given solution mapping retrieved from the SPARQL endpoint, it is possible to figure out which of the input solution mappings is the corresponding join partner. This idea is called bound join in the FedX paper and, in comparison to both unions with filters and unions with BIND clauses, it reduces both the size of the request queries and the size of the responses to these queries (i.e., the amount of data that is shipped both ways between HeFQUIN and the SPARQL endpoint).
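The renaming idea could be sketched roughly as follows, again in Python and again only as an illustration (not HeFQUIN code and not the FedX implementation). The function names, the choice of numbering suffixes, and the row format (a dict from variable name to term) are hypothetical. The sketch assumes that the caller passes a `marker` variable that occurs in the pattern and is not a join variable, and it skips any candidate name that already occurs in the pattern.

```python
import re

# Matches a SPARQL variable such as ?x (simplified; ignores the $x form).
VAR = re.compile(r"\?([A-Za-z_][A-Za-z0-9_]*)")

def build_union_with_renaming(pattern: str, batch: list, marker: str):
    """Return (query, names); names[i] is the fresh variable used in part i."""
    taken = set(VAR.findall(pattern))      # names that must not be reused
    parts, names, suffix = [], [], 0
    for solution in batch:
        if marker in solution:
            raise ValueError("marker must not be a join variable")
        suffix += 1
        while marker + str(suffix) in taken:   # avoid clashes such as ?x1
            suffix += 1
        fresh = marker + str(suffix)
        taken.add(fresh)
        names.append(fresh)
        # One pass: join variables become terms, the marker gets its new name.
        mapping = dict(solution)
        mapping[marker] = "?" + fresh
        body = VAR.sub(lambda m: mapping.get(m.group(1), m.group(0)), pattern)
        parts.append("{ " + body + " }")
    return "SELECT * WHERE { " + " UNION ".join(parts) + " }", names

def join_back(row: dict, batch: list, names: list, marker: str):
    """Find the join partner via the bound renamed variable and undo the renaming."""
    for i, name in enumerate(names):
        if name in row:
            merged = dict(batch[i])
            for var, term in row.items():
                merged[marker if var == name else var] = term
            return merged
    return None  # no renamed variable bound; should not happen for valid rows
```

Compared to the BIND variant, the request carries no extra clause per UNION part and the response carries no extra `?cnt` column; the identity of the UNION part is encoded in the name of the bound variable instead.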