Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: overriding intermediate data when no outputs asked #25

Open
ankostis opened this issue Oct 2, 2019 · 2 comments
Open

BUG: overriding intermediate data when no outputs asked #25

ankostis opened this issue Oct 2, 2019 · 2 comments

Comments

@ankostis
Copy link

ankostis commented Oct 2, 2019

In the following diagram, all data are given, and asked is different, depending on whether we expicitly ask it in the outputs:

  • when asked output asked, it checks overriden is already given and dose not recompute it,
  • without outputs, it overrides it.

Code to reproduce it:

def test_pruning_not_overrides_given_intermediate():
    # Test #25: not overriding intermediate data when an output is not asked
    graph = compose(name="graph")(
        operation(name="unjustly run", needs=["a"], provides=["overriden"])(lambda a: a),
        operation(name="op", needs=["overriden", "c"], provides=["asked"])(add),
    )
    graph.net.plot('t.png')
    assert graph({'a': 5, 'overriden': 1, "c": 2}, ['asked']) == {'asked': 3}  # that's ok
    assert graph({'a': 5, 'overriden': 1, "c": 2}) == {'a': 5, 'overriden': 1, "c": 2, 'asked': 3}  # FAILs

Root cause:

  • The code that prunes the predecessors of intermediate data runs only when outputs asked.
  • The existing "pruning" TCs were too simple to catch this.

Note that the pruning code in v1.2.4 is buggy (#24), so it cannot be used as is.

ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 2, 2019
needed to refactor the new pruning algorithm.

- expected 2 TCs fail for yet-to-be-solved yahoo#24 & yahoo#25 bugs.
@ankostis
Copy link
Author

ankostis commented Oct 2, 2019

Another "too pruning" but more complex example is when a multi-output operation MUST NOT run, not to override a given intermediate input, but it MUST run, to provide other outputs.

In the diagram below, must run is needed for e but must not override overriden.
t

Code to reproduce:

def test_pruning_multiouts_not_override_intermediates():
    # Test #25: v.1.2.4 overrides intermediate data when a previous operation
    # must run for its other outputs (outputs asked or not)
    netop = compose(name="netop")(
        operation(name="must run", needs=["a"], provides=["overriden", "e"])
        (lambda x: (x, 2 * x)),
        operation(name="op1", needs=["overriden", "c"], provides=["d"])(add),
        operation(name="op2", needs=["d", "e"], provides=["asked"])(lambda x, y: x * y),
    )

    # FAILs
    # - on v1.2.4 with KeyError: 'e',
    # - # - on #18(unsatisfied) + #23(ordered-sets) with empty result.
    assert netop({"a": 5, "overriden": 1, "c": 2}, ["asked"]) == {"asked": 3}
    # FAILs
    # - on v1.2.4 with (overriden, asked) = (5, 70) instead of (1, 13)
    # - # - on #18(unsatisfied) + #23(ordered-sets) like v1.2.4.
    assert (
        netop({"a": 5, "overriden": 1, "c": 2})
        ==
        {"a": 5, "overriden": 1, "c": 2, "asked": 3})

@ankostis
Copy link
Author

ankostis commented Oct 2, 2019

Or this even simpler one:
t

def test_pruning_multiouts_not_override_intermediates1():
    # Test #25: v.1.2.4 overrides intermediate data when a previous operation
    # must run for its other outputs (outputs asked or not)
    netop = compose(name="netop")(
        operation(name="must run", needs=["a"], provides=["overriden", "calced"])
        (lambda x: (x, 2 * x)),
        operation(name="add", needs=["overriden", "calced"], provides=["asked"])(add),
    )
    netop.net.plot('t.png')
    # FAILs
    # - on v1.2.4 with KeyError: 'e',
    # - on #18(unsatisfied) + #23(ordered-sets) with empty result.
    assert netop({"a": 5, "overriden": 1, "c": 2}, ["asked"]) == {"asked": 3}
    # FAILs
    # - on v1.2.4 with (overriden, asked) = (5, 15) instead of (1, 1)
    # - on #18(unsatisfied) + #23(ordered-sets) like v1.2.4.
    assert (
        netop({"a": 5, "overriden": 1})
        ==
        {"a": 5, "overriden": 1, "calced": 10, "asked": 3})

ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x2 TCs, in yahoo#24 and 1st in yahoo#25 now PASS.
- 2x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x2 TCs, in yahoo#24 and 1st in yahoo#25 now PASS.
- 2x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x2 TCs, in yahoo#24 and 1st in yahoo#25 now PASS.
- 2x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x2 TCs, in yahoo#24 and 1st in yahoo#25 now PASS.
- 2x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x3 TCs PASS, x1 in yahoo#24 and the first x2 in yahoo#25.
- 1x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x3 TCs PASS, x1 in yahoo#24 and the first x2 in yahoo#25.
- 1x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Pruning behaves correctly also when outputs given;
  this happens by breaking incoming provide-links
  to any given intermedediate inputs.
+ Unsatisfied detection now includes those without outputs
  due to broken links (above).
+ Remove some uneeded "glue" from unsatisfied-detection code,
  leftover from previous compile() refactoring.
+ Renamed satisfiable --> satisfied.
+ Improved unknown output requested raise-message.
+ x3 TCs PASS, x1 in yahoo#24 and the first x2 in yahoo#25.
- 1x TCs in yahoo#25 still FAIL, and need "Pinning" of given-inputs
  (the operation MUST and MUST NOT run in these cases).
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 3, 2019
+ Insert "PinInstructions" in the execution-plan to avoid overwritting.
+ Add `_overwrite_collector` in `compose()` to collect re-calculated values.
+ FIX the last TC in yahoo#25.
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 4, 2019
…must run

+ Insert "PinInstructions" in the execution-plan to avoid overwritting.
+ Add `_overwrite_collector` in `compose()` to collect re-calculated values.
+ FIX the last TC in yahoo#25.
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 4, 2019
…must run

- WIP: PARALLEL execution not adding PINS!
+ Insert "PinInstructions" in the execution-plan to avoid overwritting.
+ Add `_overwrite_collector` in `compose()` to collect re-calculated values.
+ FIX the last TC in yahoo#25.
ankostis added a commit to ankostis/graphtik that referenced this issue Oct 4, 2019
+ ALL TCs OK after ~10 fails with yahoo#25, yahoo#25 bugs, BUT...
- WIP: STILL no PIN on PARALLEL.
+ move check if value in asked outputs before cache-evicting it
  in build-execution-plan time - compute methods
  don't need outputs anymore.
+ test: speed up parallel/multihtread TCs
  by reducing delays & repetitions.
+ refact: network rightfully adopted stray functions
  for parallel processing - they all worke on the net.graph,
+ upd: networkx api by indexing on `dag.nodes` views.
+ enh: add log message when deleting in parallel
  (in par with sequential code).
+ refact: var-renames, if-then-else simplifications, pythonisms.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant