
Forecast not being generated when horizon longer than 48h #1245

Open
devansh287 opened this issue Nov 19, 2024 · 13 comments

@devansh287
Contributor

My goal is to make a forecast with a horizon of five days. However, no forecast is generated, even though the log claims that one was. You can reproduce the problem with the steps below.

Steps taken so far

I edited the list of acceptable horizons in the supported_horizons() function in flexmeasures/utils/time_utils.py to include 5 days.

from datetime import timedelta

def supported_horizons() -> list[timedelta]:
    return [
        timedelta(hours=1),
        timedelta(hours=6),
        timedelta(hours=24),
        timedelta(hours=48),
        timedelta(days=5),
    ]

I also modified the CLI verification in the create_forecasts function in flexmeasures/cli/data_add.py to add support for 5 days (which is 120 hours).

@click.option(
    "--horizon",
    "horizons_as_hours",
    multiple=True,
    type=click.Choice(["1", "6", "24", "48", "120"]),
    default=["1", "6", "24", "48", "120"],
    help="Forecasting horizon in hours. This argument can be given multiple times. Defaults to all possible horizons.",
)

I added a sensor with a daily resolution. You can use the asset_id of any asset you have created:

flexmeasures add sensor --name 'grid' --unit kW --event-resolution 'P1D' --timezone 'UTC' --asset <asset_id> --attributes '{"capacity_in_mw": 0.5}'

Write the following data to a CSV file:

echo "event_start,event_value
2024-08-31T00:00:00+00:00,10.6761666667
2024-09-01T00:00:00+00:00,11.583923034
2024-09-02T00:00:00+00:00,11.4842008797
2024-09-03T00:00:00+00:00,10.962666667
2024-09-04T00:00:00+00:00,10.864166667
2024-09-05T00:00:00+00:00,10.3890000003
2024-09-06T00:00:00+00:00,10.53233334
2024-09-07T00:00:00+00:00,10.1858233333
2024-09-08T00:00:00+00:00,4.9081
2024-09-09T00:00:00+00:00,11.20635
2024-09-10T00:00:00+00:00,7.383595
2024-09-11T00:00:00+00:00,7.2467288136
2024-09-12T00:00:00+00:00,9.1068333333
2024-09-13T00:00:00+00:00,9.75605
2024-09-14T00:00:00+00:00,8.8384333333
2024-09-15T00:00:00+00:00,6.32779661
2024-09-16T00:00:00+00:00,9.2901
2024-09-17T00:00:00+00:00,5.2769333333
2024-09-18T00:00:00+00:00,9.3428333333
2024-09-19T00:00:00+00:00,9.084866667
2024-09-20T00:00:00+00:00,8.471466667
2024-09-21T00:00:00+00:00,7.5002333333
2024-09-22T00:00:00+00:00,4.7099333333
2024-09-23T00:00:00+00:00,7.406195254
2024-09-24T00:00:00+00:00,3.7123783333
2024-09-25T00:00:00+00:00,7.8368233333
2024-09-26T00:00:00+00:00,6.918516667
2024-09-27T00:00:00+00:00,9.9498833333
2024-09-28T00:00:00+00:00,5.9200033333
2024-09-29T00:00:00+00:00,6.6264166667
2024-09-30T00:00:00+00:00,7.2758333333
2024-10-01T00:00:00+00:00,3.3952
2024-10-02T00:00:00+00:00,1.9717333333
2024-10-03T00:00:00+00:00,11.2791666667
2024-10-04T00:00:00+00:00,1.5925672172
2024-10-05T00:00:00+00:00,1.9095242373
2024-10-06T00:00:00+00:00,3.8564
2024-10-07T00:00:00+00:00,9.5928333333
2024-10-08T00:00:00+00:00,11.2720666667
2024-10-09T00:00:00+00:00,7.0110666667
2024-10-10T00:00:00+00:00,10.8362383333
2024-10-11T00:00:00+00:00,7.049716667
2024-10-12T00:00:00+00:00,5.6846333333
2024-10-13T00:00:00+00:00,11.4374833333
2024-10-14T00:00:00+00:00,3.237644048
2024-10-15T00:00:00+00:00,1.7008166667
2024-10-16T00:00:00+00:00,2.8421833333
2024-10-17T00:00:00+00:00,4.91585
2024-10-18T00:00:00+00:00,11.1368333333
2024-10-19T00:00:00+00:00,11.1151666667
2024-10-20T00:00:00+00:00,7.34765
2024-10-21T00:00:00+00:00,4.7092333333
2024-10-22T00:00:00+00:00,7.406195254
2024-10-23T00:00:00+00:00,3.7123783333
2024-10-24T00:00:00+00:00,7.8368233333
2024-10-25T00:00:00+00:00,6.918516667
2024-10-26T00:00:00+00:00,9.9498833333
2024-10-27T00:00:00+00:00,5.9200033333
2024-10-28T00:00:00+00:00,6.6264166667
2024-10-29T00:00:00+00:00,7.2758333333
2024-10-30T00:00:00+00:00,9.2856666667
2024-10-31T00:00:00+00:00,10.4966666667
2024-11-01T00:00:00+00:00,11.1850000000
2024-11-02T00:00:00+00:00,12.1566666667
2024-11-03T00:00:00+00:00,11.7691666667
2024-11-04T00:00:00+00:00,12.3788333333
2024-11-05T00:00:00+00:00,10.4931666667
2024-11-06T00:00:00+00:00,9.5936666667
2024-11-07T00:00:00+00:00,8.4711666667
2024-11-08T00:00:00+00:00,9.2856666667
2024-11-09T00:00:00+00:00,10.4966666667
2024-11-10T00:00:00+00:00,11.1850000000
2024-11-11T00:00:00+00:00,12.1566666667
2024-11-12T00:00:00+00:00,11.7691666667
2024-11-13T00:00:00+00:00,12.3788333333
2024-11-14T00:00:00+00:00,10.4931666667" > sensor_data_full_utc.csv

Add the CSV data to the sensor, replacing <sensor_id> with the id you got when creating the sensor:

flexmeasures add beliefs --sensor <sensor_id> --source 'actuals' sensor_data_full_utc.csv --timezone 'UTC'

As you can see in the CSV, we have actuals through November 14th. Let's try making a forecast for Nov 19th, which we should be able to do since our horizon has been expanded to five days.

flexmeasures add forecasts --sensor <sensor_id> --from-date '2024-11-19' --to-date '2024-11-19' --horizon 120 --as-job

This is what we get in the log:

2024-11-18 20:07:26 Running Forecasting Job ebf719f6-9e05-423a-b2c9-b6aab1040fe0: grid for 5 days, 0:00:00 on model 'linear-OLS', from 2024-11-19 00:00:00+00:00 to 2024-11-20 00:00:00+00:00
2024-11-18 20:07:27 Job ebf719f6-9e05-423a-b2c9-b6aab1040fe0 made 1 forecasts.

However, as we can see in the screenshot, no forecast has actually been made.

image

I tried downloading the data as a CSV, and it does show a belief for Nov 19th, but the value is empty.

image

Note: the values are all shifted by five hours and thirty minutes as I am in India, which is UTC+5:30.

Is there a misconfiguration preventing the 5-day forecast from being generated/saved properly? Are additional modifications required to support a 5-day horizon beyond supported_horizons() and CLI verification?

@Flix6x
Contributor

Flix6x commented Nov 19, 2024

Potential fix

Thank you for taking the time to write a clear issue. I believe your problem has to do with the belief time of your actuals. Could you try adding the actuals using:

flexmeasures add beliefs --sensor <sensor_id> --source 'actuals' sensor_data_full_utc.csv --timezone 'UTC' --horizon 0

and then run the forecaster again?

Explanation

From your last screenshot, it became clear to me that your actuals are currently recorded with the belief time set to the moment you ran the CLI command.

Subsequently, the forecaster will ignore any data that shouldn't be considered to have been known at the time of creating the 5-day forecast.

The --horizon 0 option I use above indicates that the actuals should be treated as if they had been recorded right after they occurred. Therefore, the forecaster will treat these records as if they were known at the time of creating the forecast.
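The belief-time bookkeeping described above can be sketched as follows. This is an illustrative helper, not the FlexMeasures API, following the timely_beliefs convention that belief_time = event_end - belief_horizon:

```python
from datetime import datetime, timedelta, timezone

def belief_time(event_start: datetime, event_resolution: timedelta,
                belief_horizon: timedelta) -> datetime:
    """When a belief about an event is (treated as) recorded."""
    return event_start + event_resolution - belief_horizon

# An actual recorded with --horizon 0 counts as known the moment the event ends:
actual_bt = belief_time(
    datetime(2024, 11, 14, tzinfo=timezone.utc), timedelta(days=1), timedelta(0)
)  # 2024-11-15 00:00 UTC

# A 120-hour-ahead forecast for the Nov 19 event is believed at:
forecast_bt = belief_time(
    datetime(2024, 11, 19, tzinfo=timezone.utc), timedelta(days=1),
    timedelta(hours=120)
)  # also 2024-11-15 00:00 UTC

# The forecaster only uses data already known at its own belief time:
usable = actual_bt <= forecast_bt  # True with --horizon 0
```

With the default belief time (the moment the CLI command ran, Nov 18), the comparison fails and the actuals are excluded from the regression.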

@devansh287
Contributor Author

devansh287 commented Nov 20, 2024

I really appreciate your response. I blew away everything on the sensor and added the actuals again as you told me to, but I am still getting no predictions for Nov 19th. Attaching a screenshot here:
image

I hid the rows before Nov 14th for ease of viewing. As we can see, the belief time is still unchanged for the actuals.

@Ragnar-the-mighty
Contributor

> Subsequently, the forecaster will ignore any data that shouldn't be considered to have been known at the time of creating the 5-day forecast.
>
> The --horizon 0 option I use above indicates that the actuals should be treated as if they had been recorded right after they occurred. Therefore, the forecaster will treat these records as if they were known at the time of creating the forecast.

@Flix6x
I am confused by this explanation.
In the above example, a horizon of 1 day or 5 days does not affect whether the data is known or not.

if to_date - horizon < last_actual_data_date
then you should be able to make a forecast

In fact, the longer the horizon, the further back in time the data used for the forecast reaches, assuming the to-date stays the same.
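The check above can be written out as a small sketch (names are illustrative, not FlexMeasures code; the boundary is taken as inclusive):

```python
from datetime import date, timedelta

to_date = date(2024, 11, 19)
horizon = timedelta(days=5)
last_actual_data_date = date(2024, 11, 14)

# to_date - horizon falls exactly on the last day for which actuals exist,
# so on this reasoning the forecast should be feasible.
can_forecast = to_date - horizon <= last_actual_data_date
print(can_forecast)  # True
```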

@BelhsanHmida
Contributor

Hello @devansh287, can you help clear something up? I notice that in the second picture the belief time for event_start: 2024-11-14T05:30:00+05:30 is still 2024-11-18T20:06:42.083+05:30 (the belief time from when you added the beliefs without setting the --horizon parameter).

This leads me to suspect a mix-up with old values on that sensor, because if you use the command on a fresh sensor:
flexmeasures add beliefs --sensor <sensor_id> --source 'actuals' sensor_data_full_utc.csv --timezone 'UTC' --horizon 0

the result would be:

event_start                  belief_time
2024-11-14T05:30:00+05:30    2024-11-15T05:30:00+05:30

So can you try again with a new sensor and see if the same error persists?

@devansh287
Contributor Author

Appreciate your response, @BelhsanHmida. I did try what you suggested, but I am still not getting the desired forecast. Here is a snapshot of the sensor's CSV. As you can see, the belief time is off by one day for the actuals.

image

I am also attaching some additional messages I got from the worker log when it claimed that the forecast was generated:

2024-11-26 21:50:34 Running Forecasting Job ba141a3e-dc8e-4f03-a5be-57bb5a77e5bc: grid for 5 days, 0:00:00 on model 'linear-OLS', from 2024-11-19 00:00:00+00:00 to 2024-11-20 00:00:00+00:00
2024-11-26 21:50:35 Job ba141a3e-dc8e-4f03-a5be-57bb5a77e5bc made 1 forecasts.
2024-11-26 21:50:35 /usr/local/lib/python3.10/dist-packages/timely_beliefs/beliefs/classes.py:633: SAWarning: Object of type <TimedBelief> not in session, add operation along 'Sensor.beliefs' will not proceed (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
2024-11-26 21:50:35   df = pd.DataFrame(session.execute(q))
2024-11-26 21:50:35 /usr/local/lib/python3.10/dist-packages/timely_beliefs/beliefs/classes.py:633: SAWarning: Object of type <TimedBelief> not in session, add operation along 'DataSource.beliefs' will not proceed (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
2024-11-26 21:50:35   df = pd.DataFrame(session.execute(q))
2024-11-26 21:50:35 /usr/local/lib/python3.10/dist-packages/timely_beliefs/beliefs/classes.py:332: SAWarning: Object of type <TimedBelief> not in session, add operation along 'Sensor.beliefs' will not proceed (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
2024-11-26 21:50:35   session.execute(smt)
2024-11-26 21:50:35 /usr/local/lib/python3.10/dist-packages/timely_beliefs/beliefs/classes.py:332: SAWarning: Object of type <TimedBelief> not in session, add operation along 'DataSource.beliefs' will not proceed (This warning originated from the Session 'autoflush' process, which was invoked automatically in response to a user-initiated operation.)
2024-11-26 21:50:35   session.execute(smt) 

Hope this is helpful.

@BelhsanHmida
Contributor

Hello @devansh287, thanks for the detailed response. Just wanted to let you know this is currently being looked into.

@BelhsanHmida
Contributor

Hello @devansh287,
Apologies for the delay in getting back to you; I hope this issue is still relevant. Here's an update: we've successfully recreated the issue you encountered and traced it to missing data in the lags generated by the timetomodel package.

@devansh287
Contributor Author

Dear @BelhsanHmida,
Wishing you and your family a happy, healthy, and prosperous New Year!
Yes, this issue is still relevant. How difficult do you think the fix is?

@BelhsanHmida
Contributor

Thanks for the kind words @devansh287 – I appreciate it, and I wish the same for you! I'm glad to hear the issue is still relevant.

As for the fix, it's difficult to estimate since we've traced the issue to the timetomodel package, which has been challenging to debug. That said, we're currently working on a potential workaround to enable forecasting for up to 5 days. I'll keep you updated as soon as we make progress.

@devansh287
Contributor Author

@BelhsanHmida, we want to do a 30-day forecast. Is there anything we can do to help?

@nhoening nhoening changed the title Forecast not being generated Forecast not being generated when horizon longer than 48h Jan 20, 2025
@nhoening
Contributor

Hi @devansh287, we're trying to find time for this again this week, to find out why lags are not properly created in the timetomodel package, or in the interplay between the FlexMeasures code in data/models/forecasting and timetomodel. That is basically debugging work, in code we ourselves have not updated in more than a year.

Hopefully it will be an easy fix once we find out where it goes wrong. In the meantime, we are working on a new implementation of the way FlexMeasures makes forecasts, but that will not be released very soon: we're testing it for simulations, and then will implement it for use in FlexMeasures directly.

Do you believe a 30-day forecast will be very useful?

@devansh287
Contributor Author

Hi @nhoening,

Thank you for your continued efforts on this issue.

To clarify our use case, our sensor operates with a daily resolution, not hourly. As a result, any forecasts involving this sensor will inherently span multiple days. With the introduction of peak power tariffs, the ability to forecast daily peak power consumption over a 30-day period will be particularly valuable for planning and cost optimization.

Looking forward to your thoughts on this and any guidance you can provide on potential next steps.

@BelhsanHmida
Contributor

Hello @devansh287,

We managed to trace the issue to timetomodel's featuring.py::construct_features. The problem arises because date_indices and outcome_var_df have different timestamp offsets, which leads to a mismatched concatenation in:

df = pd.concat([df, outcome_var_df], axis=1)

This results in missing values in the DataFrame passed for creating lags, ultimately leading to no prediction output.
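The misalignment can be reproduced in a few lines of pandas. This is an illustrative sketch, not the actual timetomodel code: the lag index is assumed to sit at local midnight (e.g. with FLEXMEASURES_TIMEZONE set to Asia/Kolkata), while the outcome data sits at UTC midnight, so the calendar dates match but the instants differ by 5:30:

```python
import pandas as pd

# Same calendar dates, different instants (offsets +05:30 vs +00:00).
date_indices = pd.date_range("2024-11-14", periods=3, freq="D", tz="Asia/Kolkata")
outcome_var_df = pd.DataFrame(
    {"outcome": [10.0, 11.0, 12.0]},
    index=pd.date_range("2024-11-14", periods=3, freq="D", tz="UTC"),
)

# Concatenating on misaligned indexes interleaves the rows instead of
# matching them, leaving NaNs wherever the offsets disagree.
df = pd.DataFrame(index=date_indices)
df = pd.concat([df, outcome_var_df], axis=1)
print(df["outcome"].isna().sum())  # 3 of the 6 resulting rows are empty
```

Those NaN rows are what starve the lag construction and leave the regression with nothing to fit.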

Solution:
To bypass this issue, we recommend changing the timezone configuration to UTC by setting FLEXMEASURES_TIMEZONE="UTC". Once this is done, forecasts should run correctly.

Steps to Reproduce:

  1. Add a sensor:
    flexmeasures add sensor --name 'grid' --unit kW --event-resolution 'P1D' --timezone 'UTC' --asset <asset_id> --attributes '{"capacity_in_mw": 0.5}'

  2. Add beliefs:
    flexmeasures add beliefs --sensor <sensor_id> --source 'actuals' sensor_data_full_utc.csv --timezone 'UTC' --horizon 0

  3. Generate forecasts:
    flexmeasures add forecasts --sensor 67624 --from-date '2024-11-16' --to-date '2024-11-19' --horizon 120 --as-job

Outcome:
This process will result in forecasts up to 5 days.

Image

Note: For 30-day forecasting, more testing is needed, as it is currently not stable. However, I suggest testing 30-day forecasts yourself, incrementally changing the --from-date and --to-date values from dates that work.
Additionally, the necessary changes should be made to:

  • flexmeasures/utils/time_utils.py : supported_horizons()
  • flexmeasures/cli/data_add.py : create_forecasts()

Hope this message is of help, and I look forward to making the 30-day forecast work correctly!
