Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resetting Schema and Catalog, when connection returend to pool #104

Conversation

rPraml
Copy link
Contributor

@rPraml rPraml commented Oct 22, 2024

When a connection is taken from the pool and schema and/or catalog will be changed, the schema/catalog is not reset. So the next one will get a connection, that is not in the expected schema/catalog and might read the wrong data.

I've also refactored the change in #5 a bit that it also handles the current catalog.

Sidenotes:
Currently, these setters are reset when connection is put back to pool:

  • setAutocommit (already handled)
  • setReadonly (already handled)
  • setTransactionIsolation (already handled)
  • setCatalog (handled in this PR)
  • setSchema (handled in this PR)

Generally, we should think about all setters on the connection, if we either reset them or if we throw an exception, when someone tries to use them.

  • setTypeMap
  • setHoldability
  • setNetworkTimeout
  • setClientInfo
  • setShardingKey (since java 9)

When a connection is taken from the pool and schema and/or catalog will
be changed, the schema/catalog is not reset. So the next one will get a
connection, that is not in the expected schema/catalog
@rbygrave rbygrave added this to the 9.1 milestone Oct 23, 2024
@rbygrave rbygrave merged commit 73b2204 into ebean-orm:master Oct 23, 2024
1 check passed
@@ -115,6 +122,8 @@ final class PooledConnection extends ConnectionDelegator {
this.maxStackTrace = pool.maxStackTraceSize();
this.creationTime = System.currentTimeMillis();
this.lastUseTime = creationTime;
this.currentSchema = this.originalSchema = connection.getSchema();
this.currentCatalog = this.originalCatalog = connection.getCatalog();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @rbygrave thanks for merging,
this PR solves our current problem.

But I want to share a few more thoughts that occurred to me about this PR:

getSchema and getCatalog may communicate with the database. In other words, we send some TCP/IP packets over the network. Since this code is only called when the pool grows, which should rarely happen, it will not cost much runtime.

But what is your philosophy here?

  • be as fast as possible (then we might read the current value once before changing it)
  • trade off between performance and complexity/security
  • be as secure as possible (then we might restore the current value always and not only when resetXXX is set)

The latter point in particular is giving me a headache: as long as you use the setSchema and setCatalog methods, everything works correctly. But as soon as someone would manually execute a statement like set schema xyz (or use database xyz), the pool doesn't recognize a schema change and the connection points to the wrong database.

How I discovered this:

You may remember our problem that we currently have a separate connection pool for each tenant, which leads to a resource problem for many tenants. (Ebean runs with TenantMode.DB_WITH_MASTER)

We have now experimented to see whether we can use a connection pool and simply switch the schema beforehand. (We still use DB_WITH_MASTER but our TenantDataSourceProvider only provides a simple delegate data source that takes care of the schema change)
In concrete terms, there is a new SchemaSwitchingDataSource(config.getDataSource(), tenantSchema) for each tenant, which then points to the MASTER db and uses the config.getDataSource() pool from ebean.

This also worked well for the tenants, as long as the SchemaSwitchingDataSource is always used and a schema change is carried out beforehand.

The code that runs in our management tenant (=MASTER) uses plain config.getDataSource() and here I noticed that I was sometimes in the wrong schema (fixed by this PR)

As mentioned above, as long as setSchema / setCatalog is used, everything works now. But if some bad written code will invoke a set schema xyz statement, the management tenant may point to the wrong schema when it picks that connection, because it does no schema switch here. (By the way, the same problem may exist also for autocommit and isolation level - but it is not as bad as if incorrect data is accessed, as would be the case here.)

So I'm wondering,

  • how and if we should prevent this in the DataSourcePool.
  • whether I prevent this in my application. E.g. using DataSourcePoolListener.onAfterBorrowConnection that checks the connection or use also a SchemaSwitchingDataSource for the master DB (I think I will take this option)
  • or rely that every developer writes correct code 😉

I don't think there is any need for action on your part for now, but I wanted to share this with you.

Roland

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants