-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add FromSql/ToSql<Timestamp> to Pg backend for OffsetDateTime #3973
Conversation
0ce6ea6
to
d53143c
Compare
d53143c
to
5d6beda
Compare
Thanks for opening this PR. This is really appreciated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not impl ToSql/FromSql<Timestamp, _> for OffsetDateTime
, see below for in-depth explanations.
I would however be ok with an impl ToSql/FromSql<Timestamptz, _> for OffsetDateTime
@@ -44,6 +49,23 @@ impl ToSql<Timestamp, Pg> for PrimitiveDateTime { | |||
} | |||
} | |||
|
|||
// Delegate offset datetimes in terms of UTC primitive datetimes; this stores everything in the DB as UTC | |||
#[cfg(all(feature = "time", feature = "postgres_backend"))] | |||
impl ToSql<Timestamp, Pg> for OffsetDateTime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OffsetDateTime
is timezone-aware, Timestamp
isn't.
OffsetDateTime
should be allowed to map to Timestamptz
.
PrimitiveDateTime
should be allowed to map to Timestamp
.
However, implementing ToSql<Timestamp, _> for OffsetDateTime
would let users implicitly drop timezone information by writing table::non_timezone_aware_field.eq(timezone_aware_data)
, which is bug-prone.
This goes strongly against the general principle of non-implicit-cast otherwise used in Rust.
=> I would prefer that we do not do this, and let users write table::non_timezone_aware_field.eq(timezone_aware_data.forget_offset())
if they really want to - which they probably don't: they probably want to update their database schema to be timezone-aware instead (that is, by using the Timestamptz
type instead of Timestamp
, if they are feeding actually feeding timezone-aware data to their database).
NB: I couldn't find an actual OffsetDateTime::forget_offset
function in time
(or one named differently that would do that), which I find very surprising (I don't use time), but that would be an issue to solve via a PR in time
, not in diesel
. (But then maybe such a function doesn't exist because the behavior of a function with a such signature may be unclear because choosing whether to convert by going to UTC or by forgetting offset is arbitrary.) (In any case you can write PrimitiveDateTime::new(dt.date(), dt.time())
until that gets merged)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While strictly true, this winds up being incredibly verbose in practice. In particular where users of diesel
don't control their own schema, it's convenient to specify the UTC convention at the row mapping. This is probably why the MySql impl exists.
When deriving Queryable
for large structs on Pg
you currently have to write the field mapping as PrimitiveDateTime
. When doing almost anything with the data later (including serializing for REST output) you wind up with a lot of assume_utc()
calls. This becomes confusing to read, because "hey this Postgres time is in UTC" annotation is separated from the database call that populated the PrimitiveDateTime
.
In the other direction when storing rows, you wind up having to write wrapper fns to manually convert OffsetDateTime
into PrimitiveDateTime
for each and every insertion. This is easy to get wrong: if you simply call date()
and time()
without first explicitly converting to UTC you can easily wind up inadvertently storing an incorrect timestamp. There is no impl in time
to perform this conversion, making it almost deliberately awkward. I assume the time
crate author has a philosophical reasoning here, in that most people should store and operate on OffsetDateTime
instead of PrimitiveDateTime
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case where you don't control your database schema, I think you have two reasonable options:
- Use
serialize_as
/serialize_fn
/deserialize_as
/deserialize_fn
on your Insertable/Queryable struct - this removes the need for a conversion at every struct construction (resp. usage) - Since it turns out that the actual wire serialization for the
Timestamptz
type is precisely that of serializing aTimestamp
asUtc
(cf the module this PR updates), you could update yourschema.rs
to replaceTimestamp
byTimestamptz
there after the CLI generates it, in effect asserting that when aTimestamp
is used in this table, it is indeed expected to be anUTC
timestamp. You would then appropriately use timezone-aware throughout your rust application, guaranteeing that these are read and inserted as such, and avoiding non-timezone-aware types in your application (which is best in any case).
This is probably why the MySql impl exists
As outlined by weiznich the reason is more likely that MySQL has a different meaning than Postgres for the Timestamp
type, where it is actually timezone-aware (in that it converts provided time from local to UTC for storage):
https://dev.mysql.com/doc/refman/8.0/en/datetime.html
So it looks like on the MySQL side, Timestamp
should be allowed to map from timezone-aware types, and DateTime
should be allowed to map from non-timezone-aware types. (It looks like it may be possible to extract the appropriate information to behave properly in every case from the MysqlTime
that gets deserialized. I'm surprised that we assume utc instead of reading time_zone_displacement (doc: https://dev.mysql.com/worklog/task/?id=10828) but that may be correct.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. I didn't realize the Timestamp behavior was that different, but I guess it does make sense here.
you could update your schema.rs to replace Timestamp by Timestamptz there after the CLI generates it, in effect asserting that when a Timestamp is used in this table, it is indeed expected to be an UTC timestamp.
This is a really clever approach. I was able to make this switch in my own code, and things generally worked as you've described. The only place I ran into issues was with a gt
comparison against now
... to me that indicates the NOW()
typing in Postgres actually isn't correct either. The Postgres documentation seems to indicate that the type returned should be Timestamptz
and not naked Timestamp
. I was able to work around it by doing a local now_utc
and doing a local subtraction, but probably something to add to a corrections list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC now
has AsExpression
implementations for both Timestamp
and Timestamptz
, so one shouldn't hit any error if .into_sql()
ing it to Timestamptz
. I'm surprised that doesn't work implicitly anymore... It's maybe a matter of Nullable/not Nullable...
@@ -17,6 +17,11 @@ use crate::sql_types::{Date, Time, Timestamp, Timestamptz}; | |||
// Postgres timestamps start from January 1st 2000. | |||
const PG_EPOCH: PrimitiveDateTime = datetime!(2000-1-1 0:00:00); | |||
|
|||
fn to_primitive_datetime(dt: OffsetDateTime) -> PrimitiveDateTime { | |||
let dt = dt.to_offset(UtcOffset::UTC); | |||
PrimitiveDateTime::new(dt.date(), dt.time()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems arbitrary.
I think I would have expected the one that just drops timezone information (turns into naive) without converting to UTC, but arguably that's arbitrary as well.
In fact, it seems that there's really no non-arbitrary way to implicitly drop timezone information.
=> I would prefer that we do not do this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is arbitrary, but this is also just mirroring the current behavior of the existing MySql
implementation.
If all your timestamps are sanitized internally to UTC it's reasonably common for applications to store without timezone to save space. Since local tz obviously doesn't work, the only reasonable assumption is UTC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main difference to mysql here is that postgresql has a distinct timestamp type (Timestamp with time zone
), which does exactly what you describe. So it seems to be a bit odd to reimplement that behavior here.
Given that there is no non-arbitrary way to implicitly drop timezone information I personally would prefer to keep that step explicit for the user. I would go as far and argue that accepting the mysql impl was a mistake back then. Maybe we should put a #[deprecated]
on that impl instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does trait implementation deprecation work now? 🤩
(Or do we just want to use the without-deprecated
feature?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if rustc will show the deprecation warning, but we have the without-deprectaed
feature flag for such cases.
Commented inline above, but I'm more or less copying the existing code from the MySql impl that assumes UTC. It's a reasonable default, as long as it's consistent on both insertion and selection. Existing code: |
Ok(prim.assume_utc()) | ||
} | ||
} | ||
|
||
#[cfg(all(feature = "time", feature = "postgres_backend"))] | ||
impl FromSql<Timestamptz, Pg> for PrimitiveDateTime { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: I see that we already do implement this, both here and in the chrono
module, and I don't like it for the same reasons.
But here we probably don't want to remove it until the next major release, because that would otherwise be breaking.
In that case, you may want to PR the removal of the existing MySql impl here: |
Maybe a reasonable compromise here would be to gate the "assume UTC on |
That may be a reasonable compromise, depending on whether other solutions turn out to be acceptable without introducing additional maintenance complexity with feature flags. |
Similarly I don't think these impls: diesel/diesel/src/pg/types/date_and_time/time.rs Lines 47 to 59 in 877611a
or these impls: diesel/diesel/src/pg/types/date_and_time/chrono.rs Lines 50 to 62 in 877611a
should exist (especially the Mainly for me the risk is that people start mistakenly inserting non timezone-aware datetimes as timezone-aware data in a |
I'm going to withdraw this PR. I think the plan to deprecate the mismatched stuff makes a lot more sense. Let's see if there's a good way to document that it's possible on |
@Ten0 to follow up, modifying |
You should use the schema patch feature where you can set a patch file for the schema in the diesel.toml, I would expect that locked-schema makes its comparison after. Or you can avoid locked-schema and run the diff check yourself in CI, after e.g. |
A patch file is the correct solution. The lock check is performed after modifying the generated schema via the patch. See the relevant guide for details. |
I considered a patch file, but it doesn't scale when folks on the team add new tables with |
There is currently no way to replace all occurrence of one type with another while generating your schema.rs file. I think it might be possible for you to just add that a script check (via grep or whatever) to your CI setup and just reject any As a more general note: From time to time I think about introducing a more flexible overwrite mechanism that can be configured via the |
What we do is we only So you could do: diesel migration run
sed -i 's/Timestamp /Timestamptz /' src/schema.rs
git diff --exit-code src/schema.rs That doesn't require any additional Diesel feature, and does make the CI fail if the schema has changed (as a bonus, it prints out the diff between the expected and actual schema). |
Yeah, I think I'll set up a @weiznich I don't wanna make more work for something that's likely only a problem for me. The existing patch functionality should be enough. |
Addresses #3581