Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fake data #354

Open
peterdudfield opened this issue Aug 14, 2024 · 13 comments
Open

Fake data #354

peterdudfield opened this issue Aug 14, 2024 · 13 comments
Labels
good first issue Good for newcomers

Comments

@peterdudfield
Copy link
Collaborator

Detailed Description

It would be great to run the api with fake data. This means users could run this locally without having to connect to a database.
The FE users could then use this API with fake data.

Context

  • To run this API locally right now, you need access to OCF database

Possible Implementation

  • add FAKE as a envvionrment
  • add if statement in each route, to produce fake data.
  • Usefuly to have a general solar profile a bit like this
  • Need to match return types for each route
  • Could do one route at a time, doesnt need to be one big PR
  • Might want to add some noise on the forecast, so the actual is exactly the same
@peterdudfield peterdudfield added the good first issue Good for newcomers label Aug 14, 2024
@VikramsDataScience
Copy link
Contributor

Hello there @peterdudfield. This definitely looks to be a more complex and interesting change. If possible I'd like to attempt a contribution to this change?

If so, could I just clarify the following points, please (apologies if these questions are a bit dumb/obvious!):

  • Would creating the new FAKE environment be under the shared workflows repo?
  • Would the new fake data be a modification to the nowcasting repo? Namely the fake.py module? Or something that's entirely new?
  • Could you also clarify the statement regarding:

add if statement in each route, to produce fake data.

What is meant by the route (again, sorry if these are obvious/dumb questions to ask)?

Need to match return types for each route

  • Would the return types be in reference to matching the dtypes that the DummyDBPredictedPowerProduction() class returns found in the _models.py module?

Might want to add some noise on the forecast, so the actual is exactly the same

  • Looking at your implementation in the _basicSolarPowerProductionFunc() function in the client.py this seems to address that problem? Unless you're looking to improve this function further? In which case I could potentially look to conduct some research on how that might be improved (for instance, Meta did quite a cool implementation of a fourier series for their Prophet algorithm to address seasonality that might prove useful)? I can't promise I'll be successful, but if you'd like, I can certainly give it a try! Please do let me know how you'd like me to proceed, and I'll do my best.

@peterdudfield
Copy link
Collaborator Author

Thanks @VikramsDataScience for getting involved.

  1. I was thinking just adding FAKE as a new environmental variable. Which can be turn on and off
  2. I would try to only modify this repo, to keep it simple
  3. So, I would add something like this in each route
if os.enviorn['FAKE'].lower() = 'true':
     return make_fake_data(....)
  1. in the api, there are an number of different routes or urls. For example here

  2. Id try to keep the same return objects we give already. Not creating new ones

  3. Interesting idea. I would first use this one, and we can always update it later.

Hope this helps, and please do ask more questions

@peterdudfield
Copy link
Collaborator Author

This repo and code is how I was thinking it could be done

@VikramsDataScience
Copy link
Contributor

Apologies @peterdudfield! I've been a little time poor, and distracted lately. I erroneously created a PR (openclimatefix/india-api#76) in the india-api repo to address this issue, but I think I made the changes to the incorrect repo! I've since closed it, because I think we're looking to make similar changes but to this repo? If so, which module should I be looking to modify in this repo?

@peterdudfield
Copy link
Collaborator Author

No problem. See point 4 above, but we should try to fake all the 'routes' of the api

@VikramsDataScience
Copy link
Contributor

VikramsDataScience commented Sep 23, 2024

Hey @peterdudfield. I've made the changes to what I think are the correct modules, but I'm running python 3.11.5 on my local machine and venv, but it looks like the .pre-commit-config.yaml requires python 3.9? Is there any way around this, as I'm really not too keen on downgrading my existing python version?

I've also tried to install 3.9 separately to avoid downgrading my system version. And, from there creating a venv that's built from 3.9, but I'm having a variety of challenges!

@peterdudfield
Copy link
Collaborator Author

yea, i would stick to python 3.11.

Does it stop you submitting code?

@VikramsDataScience
Copy link
Contributor

Yeah, exactly! It prevents the commit from going through. Are you okay with me modifying the default_language_version in the .pre-commit-config.yaml?

@peterdudfield
Copy link
Collaborator Author

yea, feel free too

@VikramsDataScience
Copy link
Contributor

Cool. Thank you.

Now that I've got the changes made to the .pre-commit-config.yaml file, it looks like my changes are failing some of the pre-commit checks. It's a bit late here, but if its cool with you, I'll take a gander at resolving these issues over tomorrow/day after, and report back if I'm having any additional challenges?

Otherwise, if all is good, I'll create a PR, and we can work together to get it right :).

@peterdudfield
Copy link
Collaborator Author

yea, of course,
Thanks so much for helping out on this

VikramsDataScience added a commit to VikramsDataScience/uk-pv-national-gsp-api that referenced this issue Sep 24, 2024
VikramsDataScience added a commit to VikramsDataScience/uk-pv-national-gsp-api that referenced this issue Sep 24, 2024
VikramsDataScience added a commit to VikramsDataScience/uk-pv-national-gsp-api that referenced this issue Sep 24, 2024
@VikramsDataScience
Copy link
Contributor

Sorry @peterdudfield about all the commits! That was me having fun and playing around with the pre-commit library 😂

@VikramsDataScience
Copy link
Contributor

Hey @peterdudfield. I've created the PR. It was initially failing the CI checks, but I've modified the .pre-commit-config.yaml again to use a more flexible python3 build over the python3.11.5. It looks like using the 3.11.5 was too narrow a declaration in the CI workflow, and was raising errors when trying to build the environment.

I also quite foolishly left the modified relative imports that I used for testing in my push, not realising that the dockerfile already sets the PYTHONPATH.

The CI tests seem to be passing now. When you've got some time, could you review the changes in the PR, and let me know if I'm on the right track, please?

peterdudfield pushed a commit that referenced this issue Dec 20, 2024
* Fake data Issue #354

* Fake data Issue #354

* Fake data Issue #354

* Changed relative imports and removed pre-commit from requirements.txt

* Using more flexible python3 version to attempt to fix pre-commit.ci build issue

* Written new Unit Test for fake forecast with specified GSP ID

* Written unit tests for all endpoints/routes and modified gsp.py to support new tests

* Removed duplicated Test Cases

* 1st test case for fake environment

* Fixed accidental activation of fake environment in gsp.py module

* Written and tested remaining test cases for is_fake

* Moved isintance check to prior to is_fake condition

* Added 2 Tests to test_national.py and cleaned up some logic in test_gsp test cases

* Modified gsp.py and national.py modules and accompanying test cases to address feedback

* Fixed incorrect for loop iteration through list (should be singular ForecastSQL object) to forecasts object in the test_national.py test cases

* Modified test cases to use NationalForecastValue, ForecastValue, and the ManyForecasts as the return objects

* Modified test cases to uss pytest.fixture() to yield values from db_session

* Possible fix for test_read_latest_all_gsp_normalized() and test_read_latest_all_gsp()

* 1st experiment for test_read_truth_national_gsp() and test_read_forecast_values_gsp()

* 1st experiment with make_fake_gsp_yields()

* 2nd experiment with make_fake_gsp_yields() - modified test_gsp routes

* 3rd experiment with make_fake_gsp_yields() - modified List Comprehension

* 4th experiment with make_fake_gsp_yields() - hard coded _gsp_id_

* Removed yield and fixture

* Experiment: Create a separate tests/fake/test_gsp_fake.py test case module
peterdudfield added a commit that referenced this issue Dec 20, 2024
* Fake data Issue #354 (#360)

* Fake data Issue #354

* Fake data Issue #354

* Fake data Issue #354

* Changed relative imports and removed pre-commit from requirements.txt

* Using more flexible python3 version to attempt to fix pre-commit.ci build issue

* Written new Unit Test for fake forecast with specified GSP ID

* Written unit tests for all endpoints/routes and modified gsp.py to support new tests

* Removed duplicated Test Cases

* 1st test case for fake environment

* Fixed accidental activation of fake environment in gsp.py module

* Written and tested remaining test cases for is_fake

* Moved isintance check to prior to is_fake condition

* Added 2 Tests to test_national.py and cleaned up some logic in test_gsp test cases

* Modified gsp.py and national.py modules and accompanying test cases to address feedback

* Fixed incorrect for loop iteration through list (should be singular ForecastSQL object) to forecasts object in the test_national.py test cases

* Modified test cases to use NationalForecastValue, ForecastValue, and the ManyForecasts as the return objects

* Modified test cases to uss pytest.fixture() to yield values from db_session

* Possible fix for test_read_latest_all_gsp_normalized() and test_read_latest_all_gsp()

* 1st experiment for test_read_truth_national_gsp() and test_read_forecast_values_gsp()

* 1st experiment with make_fake_gsp_yields()

* 2nd experiment with make_fake_gsp_yields() - modified test_gsp routes

* 3rd experiment with make_fake_gsp_yields() - modified List Comprehension

* 4th experiment with make_fake_gsp_yields() - hard coded _gsp_id_

* Removed yield and fixture

* Experiment: Create a separate tests/fake/test_gsp_fake.py test case module

* split national tests

* role back changes in tests

* fix national test

* rename

* is_fake to is_fake()

* add to readme, don't use any caching in tests

* role back,

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tests run locally

---------

Co-authored-by: Vikram Pande <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants