Change into a python example
martamaja10 committed Sep 27, 2024
1 parent f34ada0 commit 991d20f
Showing 1 changed file with 14 additions and 12 deletions.
26 changes: 14 additions & 12 deletions tree/dataframe/src/RDataFrame.cxx
@@ -456,12 +456,14 @@ executed at the moment they are called, but they are **lazy**, i.e. delayed unti
accessed through the smart pointer. At that time, the event loop is triggered and *all* results are produced
simultaneously.
### Properly exploiting RDataFrame laziness
For yet another example of the difference between the correct and incorrect running of the event-loop, see the following
two code snippets. We assume our ROOT file has branches a, b and c.
The correct way - the dataset is only processed once.
~~~{.cpp}
ROOT::RDataFrame df_correct(treename, filename);
~~~{.py}
df_correct = ROOT.RDataFrame(treename, filename)
h_a = df_correct.Histo1D("a")
h_b = df_correct.Histo1D("b")
@@ -471,23 +473,23 @@ h_a_val = h_a.GetValue()
h_b_val = h_b.GetValue()
h_c_val = h_c.GetValue()
std::cout << "How many times was the dataset processed? " << df_correct.GetNRuns() << " time." << std::endl; // The answer will be 1 time.
print(f"How many times was the data set processed? {df_correct.GetNRuns()} time.") # The answer will be 1 time.
~~~
An incorrect way - the dataset is processed three times.
~~~{.cpp}
ROOT::RDataFrame df_incorrect(treename, filename);
~~~{.py}
df_incorrect = ROOT.RDataFrame(treename, filename)
auto h_a = df_incorrect.Histo1D("a");
auto h_a_val = h_a.GetValue();
h_a = df_incorrect.Histo1D("a")
h_a_val = h_a.GetValue()
auto h_b = df_incorrect.Histo1D("b");
auto h_b_val = h_b.GetValue();
h_b = df_incorrect.Histo1D("b")
h_b_val = h_b.GetValue()
auto h_c = df_incorrect.Histo1D("c");
auto h_c_val = h_c.GetValue();
h_c = df_incorrect.Histo1D("c")
h_c_val = h_c.GetValue()
std::cout << "How many times was the dataset processed? " << df_incorrect.GetNRuns() << " times." << std::endl; // The answer will be 3 times.
print(f"How many times was the data set processed? {df_incorrect.GetNRuns()} times.") # The answer will be 3 times.
~~~
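The difference between the two snippets can also be seen without a ROOT installation. The following is a minimal toy model in pure Python, not ROOT's implementation: `ToyFrame`, `ToyResult`, `Sum` and `count_runs` are hypothetical names invented for this sketch, and a simple column sum stands in for `Histo1D`. It only illustrates the idea that booked results are filled together in one pass the first time any of them is accessed.

```python
# Toy model of lazy booking. This is NOT how ROOT implements RDataFrame;
# every class and function name here is hypothetical.

class ToyResult:
    """Stand-in for a lazy result handle (loosely analogous to a result pointer)."""

    def __init__(self, frame, column):
        self.frame = frame
        self.column = column
        self._value = 0
        self._ready = False

    def GetValue(self):
        # Accessing a result that is not ready yet triggers one pass over the data.
        if not self._ready:
            self.frame._run()
        return self._value


class ToyFrame:
    """Stand-in for a data frame that counts how often it loops over its rows."""

    def __init__(self, rows):
        self.rows = rows      # list of dicts, one dict per "event"
        self.n_runs = 0       # number of passes over the data so far
        self._pending = []    # results booked but not yet computed

    def Sum(self, column):
        # Booking is lazy: nothing is computed here.
        result = ToyResult(self, column)
        self._pending.append(result)
        return result

    def _run(self):
        # A single pass fills every result that is pending at this moment.
        self.n_runs += 1
        for row in self.rows:
            for result in self._pending:
                result._value += row[result.column]
        for result in self._pending:
            result._ready = True
        self._pending = []


def count_runs(book_all_first):
    """Return (values, number of passes) for the two booking orders."""
    rows = [{"a": 1, "b": 10, "c": 100}, {"a": 2, "b": 20, "c": 200}]
    frame = ToyFrame(rows)
    if book_all_first:
        # Book everything, then access: one pass serves all three results.
        results = [frame.Sum(col) for col in ("a", "b", "c")]
        values = [r.GetValue() for r in results]
    else:
        # Book and access one at a time: each access triggers its own pass.
        values = [frame.Sum(col).GetValue() for col in ("a", "b", "c")]
    return values, frame.n_runs
```

In this sketch both orders produce the same values; only the number of passes over the rows differs, which mirrors what `GetNRuns()` reports in the snippets above.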
It is therefore good practice to declare all your transformations and actions *before* accessing their results, allowing