-
Notifications
You must be signed in to change notification settings - Fork 552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add upsert behavior for doc_ids (not just condition) #394
Comments
Submitted PR #395. Gardening and sleep lend clarity. Instead of adding a new argument to upsert, I followed the pattern established by insert and respected Document.doc_id. |
First, thanks for your work @benetherington! As you described your motivation and general design in this issue and also opened a Pull Request, I'll answer regarding the more general questions here and then take a look a the PR and give feedback on the code over there 🙂
Interesting! This difference didn't occur to me when adding Long story short, I agree that this is a weird spot in TinyDB's API.
I agree 100%
I think you found a nice solution in your PR, I'll comment on that over there 😉 Again, thanks for your work and the thought you put into this, it's very much appreciated! |
Closing this as #395 has been merged |
PR forthcoming, but I wanted to talk alternatives out first.
Upsert is one of my favorite tricks implemented by TinyDB. I'm currently working on an API where I'd really like to duplicate server db IDs in my local db. Fewer mistakes that way, I figure. Currently, Table.insert and Table.update handle doc_ids in very different ways. Insert doesn't take a doc_id argument, but instead requires a pre-built Document. Update, on the other hand, takes doc_ids as an argument. I suspect this has to do with the way Query() works, and that unifying this behavior would require a major version (maybe two! 😄)
The simplest solution would be to add a new method, maybe Table.upsert_by_id, but instead, I'd like to make upsert smarter to begin with. There are a few details to iron out, though.
Upsert bridges an odd gap, in that updating changes one or may records, but inserting only changes one. Upsert already wraps an insert-generated doc_id in a list to mimic a single updated doc_id's return value. This is good. An ID-aware upsert should accept doc_ids (plural). If it accepted doc_id (singular), it would be expected to return a single ID, not a list. That would be confusing.
Simple is better. When I started working on this, my first instinct was to break upsert into two branches of an if statement. One branch would handle cond but not doc_ids, the other would handle doc_ids but not cond. I added complexity by asserting only one argument was specified. I added further complexity when I had to loop over multiple inserts. I added EVEN FURTHER complexity when I tried to handle mixed existing and not-existing IDs. I stopped writing code when I was considering breaking off the doc_ids branch into a private method that upsert could call multiple times. It wasn't pretty, folks.
That's where I'm at right now, and I need to break off into a different task. When I come back, my plan is to keep the
if doc_ids is not None
branching structure, but A) group missing and existing IDs into two separate lists and handle them separately, B) change update_multiple to take a doc_ids argument and handle some of this logic, and maybe make it more useful to someone else down the line.Does this all sound like a good plan of attack? Is the juice not worth the squeeze, and would Table.upsert_by_id be a better solution all round?
Thanks, folks, I really love this library, and I hope I can make it better. It fills a fantastic niche!
EDIT: AHHH should have checked open PRs first! There's already work to change update_multiple! Okay, I think that's really going to be the direction to take this.
The text was updated successfully, but these errors were encountered: