Addition of Validated entity state #2011

Open
NSUWAL123 opened this issue Dec 20, 2024 · 13 comments
Labels
backend Related to backend code enhancement New feature or request

Comments

@NSUWAL123
Contributor

Is your feature request related to a problem? Please describe.
We have 4 statuses for an entity:

READY = 0
OPENED_IN_ODK = 1
SURVEY_SUBMITTED = 2
MARKED_BAD = 6

Currently, when a submission is marked bad, the entity status is updated to MARKED_BAD, but there is no VALIDATED state. SURVEY_SUBMITTED only works as a final status if the entity was never marked BAD.
When a new submission is made for a BAD entity and that submission is approved, what should the entity status be? 1. Fall back to SURVEY_SUBMITTED, which doesn't make sense, or 2. VALIDATED.

Describe the solution you'd like
So, to solve the problem we should add a new status called VALIDATED.

@NSUWAL123 NSUWAL123 added the enhancement New feature or request label Dec 20, 2024
@NSUWAL123 NSUWAL123 added the backend Related to backend code label Dec 20, 2024
@spwoodcock
Member

spwoodcock commented Dec 20, 2024

SURVEY_SUBMITTED is essentially an indicator to say the mapping has been done.

The validation status is handled elsewhere: internally in the ODK database, so we can access the validation state from the ODK API (avoiding the need to store this twice).

The BAD status was added here to make it easier to identify the features that failed validation, without having to call the ODK API directly (it's a small amount of necessary duplication).

But when the user submits another submission to override / correct the previous one, this submission will have to be validated.

It makes sense to me to mark the status as SURVEY_SUBMITTED once again, then afterwards either leave it as it is, or mark BAD if it fails validation again.

Does that make sense?

@Sujanadh
Collaborator

I think we need a status for validated features; from a validator perspective, I wouldn't know which feature is validated since every feature would have status survey_submitted whether they are just submitted or have been validated. If there is a distinct status, then I would know which feature to pick during validation. We only have validation status, which is approved in individual submissions. If validators approve the submission, then we can update the status of the entity to validated. When all features are validated within a task, then we can auto-change the task status to validated.
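The task-level rule suggested here could be sketched roughly as below. This is only an illustration: the function name `task_is_validated` and the use of plain status strings are assumptions, not existing FMTM code, and treating a task with no entities as not validated is also an assumption.

```python
from typing import Iterable

# Hypothetical status name, matching the VALIDATED state proposed in this issue.
VALIDATED = "VALIDATED"


def task_is_validated(entity_statuses: Iterable[str]) -> bool:
    """Return True only if the task has entities and every one is VALIDATED."""
    statuses = list(entity_statuses)
    # Assumption: an empty task is not considered validated.
    return bool(statuses) and all(status == VALIDATED for status in statuses)
```

A caller would run this check after each entity status update and change the task status only when it returns True.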

@spwoodcock
Member

But there is a validation field built into ODK right?

It seems like duplication to ignore that and put the validation in FMTM instead.

Plus validation should apply to submissions for features, not to the features themselves.

You mention a workflow where we 'pick features' for validation. Instead, I think we should base it on whether there are submissions. If we have an unreviewed submission, we prompt for it to be reviewed, and it is then marked as validated in ODK.

@spwoodcock
Member

spwoodcock commented Dec 23, 2024

Essentially what I'm trying to convey is that all the 'feature' level statuses were designed around mapping.

The only time validation is involved is for marking the feature bad for the mapper to see, then re-map. The mapper doesn't care if it's validated, only if it's adequately mapped.

The actual validation state for submissions of features is stored in ODK.

@NSUWAL123
Contributor Author

You are right in the context of the mapper frontend. But when the validator validates an entity from the map on the project page, how do they know which features they have already validated?
For example, there could be thousands of entities in a single task. If both the survey_submitted and validated statuses are shown in green, wouldn't it be hard for a validator navigating through thousands of entities to find the next one to validate, since they would all be the same color?

@spwoodcock
Member

Yeah you make a good point!

This all depends on if we want to store the feature level validation state in ODK, or in FMTM.

ODK manages the validation status per submission, while we would manage it per entity in FMTM.

@manjitapandey
Contributor

With #1946, we are changing the validation workflow to go directly to the validation page from the project details page. It will be essential for validators to know which features are already validated and which are ready for validation.

So we need to show different colours for validated and ready-to-validate features on the project details page. Can we obtain that using the feature-level validation state in ODK?

@spwoodcock
Member

Depends on if there is an API to get the review state of all submissions on a project - not at my computer to check.

But the fallback is to add a new state for entities, as everyone has mentioned. The main downside there is that we can't handle multiple submissions, as we will do entity-based validation instead of submission-based validation.

@Sujanadh
Collaborator

This issue is all related to the map view. Retrieving all review states would also require the entity ID in the response. There isn't any API available to get just the review states of all submissions. We can get them from the API that fetches all submission metadata, but there won't be any entity IDs linked. So this approach falls back to fetching all submission data and extracting the review states and entity IDs, which is a very slow and complicated process.

@spwoodcock
Member

spwoodcock commented Jan 10, 2025

Thanks for the info 👍

So just to clarify, we are suggesting:

  • Add a VALIDATED status for the features / entities in our db.

  • Still use the ODK review state for submissions.

  • This way it's possible to review multiple submissions for an entity; however, the entity status will become VALIDATED after one submission review is done.

Am I correct so far?

Question

  • The odk_entities table was made to make the entity statuses more easily available in the frontend via electric.

  • However, now we also have a table to track 'new' geometries (and 'bad' geometries, but these are duplicated).

How would we manage the VALIDATED state of the new geometries / entities?

Follow Up Question

  • Adding a new geometry via ODK should trigger the creation of a new entity, right?

  • So on the next call of the sync status API, the entity should be duplicated in the odk_entities table, I would think.

This raises the question: do we need the extra geometrylog table we just created, or can we simply track all the geometry / entity statuses in the odk_entities table? 😅

We could add a new, optional column for geojson. We would only fill it for new entities created by the user.

@Sujanadh we should think this through together

@Sujanadh
Collaborator

Thanks for the detailed breakdown.

You are correct: we use the review state of the submission for validation, and mark the feature as validated if any one of its submissions is approved.
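A minimal sketch of that rule, assuming the review states of an entity's submissions have already been fetched from ODK. The function name `entity_is_validated` is hypothetical; `"approved"` is the ODK Central review state for an accepted submission, and unreviewed submissions have no review state (`None` here).

```python
from typing import Iterable, Optional

# ODK Central review state for an accepted submission.
APPROVED = "approved"


def entity_is_validated(review_states: Iterable[Optional[str]]) -> bool:
    """Return True if at least one submission for the entity is approved."""
    return any(state == APPROVED for state in review_states)
```

With this rule, an entity whose submissions have the states `[None, "hasIssues", "approved"]` counts as validated, while `[None, "rejected"]` does not.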

Regarding your question on how we manage the validated state for new geometries:
We will create the entity once the geometry is drawn, so that it is loaded in ODK Collect. New geometries are also saved in the geometrylog table. As far as I know, the purpose of the geometrylog table for bad and new geoms is to provide an instant pulse effect and visualization in the outermost map layer via Electric SQL. Without the table, we would need to load the entire fgb and all entity statuses to check for bad geoms. Once the new geom is created as an entity in ODK, the validation workflow will be the same as for all other geoms. When the user clicks a new or bad geom, we load that task's entities and allow them to validate it.

In the case of a bad geom, the entry will be duplicated: there will be a record in both the odk_entities and geometrylog tables. I like the idea you proposed of adding a geojson column to odk_entities. In the case of a new geom, the question is: when a user draws a polygon, should it be visible instantly to all users? If not, we can create the entity, load it by syncing the status, and use Electric SQL to fetch only new or bad geoms, which might not be real time. We would also need to add a new entity status, "NEW". This approach will help the validation process, as we won't have to load the task's entities; the user can just click the new or bad feature and start validation.

@spwoodcock
Member

Thanks, that helps clarify things!

So we are proposing two things:

New Statuses For Entities

READY = 0
OPENED_IN_ODK = 1
SURVEY_SUBMITTED = 2
NEW_GEOM = 3
# 4 available for any future status
VALIDATED = 5
MARKED_BAD = 6
  • Where NEW_GEOM and MARKED_BAD also imply that the survey was submitted (alternative values to SURVEY_SUBMITTED = 2).
  • The VALIDATED status can be applied to existing and new geoms, right? But I'm uncertain about bad geoms, depending on the workflow we choose.

Existing geom: SURVEY_SUBMITTED --> VALIDATED ✅
New geom: NEW_GEOM --> VALIDATED ✅
Bad geom: MARKED_BAD --> VALIDATED ❌

Either of these approaches:

  • Bad geoms remain, but the project manager corrects the geometry with field feedback to generate a corrected data extract. Another project is made with the new corrected data for the bad geoms.
  • The 'bad' geom is checked by the project manager, who then marks it as VALIDATED if the user already submitted a new geometry for it.
    I guess this comes down to what the project manager advises the mapper to do in the field (collect new geoms in the first pass, or revisit a second time after the geoms are corrected).
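As a rough sketch, the proposed statuses and the validation transitions listed above could look like this in the backend. The numeric values are the ones proposed in this comment; `VALIDATION_SOURCES` and `can_mark_validated` are hypothetical names used only for illustration.

```python
from enum import IntEnum


class EntityState(IntEnum):
    """Entity statuses as proposed in this thread."""

    READY = 0
    OPENED_IN_ODK = 1
    SURVEY_SUBMITTED = 2
    NEW_GEOM = 3
    # 4 is left free for any future status
    VALIDATED = 5
    MARKED_BAD = 6


# Hypothetical: states from which an entity may move to VALIDATED.
# MARKED_BAD is left out because that workflow is still undecided.
VALIDATION_SOURCES = {EntityState.SURVEY_SUBMITTED, EntityState.NEW_GEOM}


def can_mark_validated(current: EntityState) -> bool:
    """Return True if an entity in `current` state may become VALIDATED."""
    return current in VALIDATION_SOURCES
```

If the second approach is chosen for bad geoms, adding `EntityState.MARKED_BAD` to the set would be the only change needed in this sketch.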

The geometrylog table

  • We started down this route, let's keep it for now.
  • Once I have the webhook implementation done, I don't think we will need this table anymore though.
  • Proposed alternative approach:
    • Users collecting a new geometry should have an entity created for them in ODK.
    • Users marking a geometry as bad will update status=6 in the entity.
    • In both cases, this triggers the webhook, meaning the new entity is inserted in odk_entities, or the entity status is updated.
    • In either case (new or bad), we also insert the geometry as either the JavaRosa string or a geojson JSONB. We should compare the difference in performance between them. The JavaRosa string is less data, but the JSONB might have other advantages such as better indexing.
    • When the data is updated in odk_entities, the data will be reflected in real time in all users' browsers.
    • We can then delete the geometrylog table?
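The alternative approach above could be sketched as a single upsert step. This is an in-memory illustration only: the `EntityRow` shape, the `apply_entity_event` function, and its arguments are all assumptions, and a plain dict stands in for the odk_entities table. The real webhook payload is not described in this thread.

```python
from dataclasses import dataclass
from typing import Dict, Optional

NEW_GEOM = 3    # status value proposed in this thread
MARKED_BAD = 6  # existing status value


@dataclass
class EntityRow:
    """Hypothetical shape of a row in odk_entities."""

    entity_id: str
    status: int
    geometry: Optional[str] = None  # only filled for new or bad geometries


def apply_entity_event(
    table: Dict[str, EntityRow],
    entity_id: str,
    status: int,
    geometry: Optional[str] = None,
) -> EntityRow:
    """Insert the entity if it is unknown, otherwise update its status."""
    row = table.get(entity_id)
    if row is None:
        row = EntityRow(entity_id=entity_id, status=status)
        table[entity_id] = row
    else:
        row.status = status
    # Store the geometry only for new or bad geoms, as suggested above.
    if status in (NEW_GEOM, MARKED_BAD) and geometry is not None:
        row.geometry = geometry
    return row
```

Whether the geometry column holds a JavaRosa string or geojson JSONB is left open here, since that depends on the performance comparison mentioned above.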

@spwoodcock
Member

The actionable item here is adding two additional enum states to EntityState (both backend and frontend):

  • NEW_GEOM = 3
  • VALIDATED = 5

It might be nice to remove these numeric values in future too, simply setting NEW_GEOM = "NEW_GEOM", but that's a problem for another day too!
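For reference, a string-valued version might look something like the sketch below. The `LEGACY_INT_TO_STATE` mapping and `from_legacy` helper are hypothetical, included only to show how existing numeric values could be translated during a migration.

```python
from enum import Enum


class EntityState(str, Enum):
    """String-valued variant of the entity statuses."""

    READY = "READY"
    OPENED_IN_ODK = "OPENED_IN_ODK"
    SURVEY_SUBMITTED = "SURVEY_SUBMITTED"
    NEW_GEOM = "NEW_GEOM"
    VALIDATED = "VALIDATED"
    MARKED_BAD = "MARKED_BAD"


# Hypothetical mapping from the current numeric values to the string states.
LEGACY_INT_TO_STATE = {
    0: EntityState.READY,
    1: EntityState.OPENED_IN_ODK,
    2: EntityState.SURVEY_SUBMITTED,
    3: EntityState.NEW_GEOM,
    5: EntityState.VALIDATED,
    6: EntityState.MARKED_BAD,
}


def from_legacy(value: int) -> EntityState:
    """Translate a numeric status into the string enum (KeyError if unknown)."""
    return LEGACY_INT_TO_STATE[value]
```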
