presto other database support #831

Open
kfox1111 opened this issue Jul 11, 2019 · 4 comments

Comments

@kfox1111

Would it be possible to point Presto directly at MySQL/PostgreSQL and support disabling the Hive deployment? That could be much simpler for those who don't have large clusters.

@chancez
Contributor

chancez commented Jul 11, 2019

Supporting other databases is in the backlog and is something I've wanted for a while, but it's lower priority right now because we're working on a GA release. Adding support for other databases is relatively simple, but removing Hive is trickier.

Removing Hive is somewhat difficult because we use the map data type in Presto/Hive for Prometheus metric data, and right now Hive is basically the only Presto connector that supports it. We would need a way to configure a prometheusMetricImporterDataSource with a flat column structure instead of maps, mapping labels from the results into columns via configuration in the datasource. Then we would need to update some of the ReportQueries to handle the new data model, but that would actually be fairly easy since we already model the tables this way using views. After that, we could use MySQL/PostgreSQL for storing metric data.
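To illustrate the flat-column idea, here is a minimal Python sketch. It is not code from this project: the sample shape (`timestamp`, `amount`, `labels`), the `flatten_sample` helper, and the `label_columns` mapping are all assumptions made up for the example, standing in for whatever the datasource configuration would actually look like.

```python
from typing import Any, Dict, Optional


def flatten_sample(
    sample: Dict[str, Any],
    label_columns: Dict[str, str],
) -> Dict[str, Optional[Any]]:
    """Turn one metric sample with a label map into a flat row.

    `label_columns` is a hypothetical config: output column name -> label key.
    Labels not listed in the config are dropped; configured labels that are
    missing from the sample become None (NULL in a SQL table).
    """
    labels = sample.get("labels", {})
    row: Dict[str, Optional[Any]] = {
        "timestamp": sample["timestamp"],
        "amount": sample["amount"],
    }
    for column, label_key in label_columns.items():
        row[column] = labels.get(label_key)
    return row


# Hypothetical configuration and sample, for illustration only.
label_columns = {"pod": "pod", "namespace": "namespace", "node": "node"}
sample = {
    "timestamp": "2019-07-11T00:00:00Z",
    "amount": 0.25,
    "labels": {"pod": "web-1", "namespace": "default", "container": "nginx"},
}
flat_row = flatten_sample(sample, label_columns)
```

A row shaped like this has only scalar columns, so it could be written to a MySQL/PostgreSQL table without needing a map type. The trade-off is that the set of labels has to be fixed up front in configuration.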

We're also exploring options like a native Prometheus connector for Presto, which could let us stop importing metrics altogether and make this a lot easier. In that case, we would have dataSources that map directly to Prometheus time series tables in Presto, we would write our reportQueries against those tables, and reports could then store their data in MySQL/PostgreSQL.

@mmariani
Contributor

mmariani commented Jul 11, 2019

If you plan to fully support Postgres for storing metrics (jsonb could be an alternative to the map type), consider the TimescaleDB extension, which handles table partitioning transparently under the hood. It has some limitations (no surrogate primary keys, etc.), but otherwise it's 98% compatible, with much better performance.

@kfox1111
Author

Interesting. Thanks for the info.

@chancez
Contributor

chancez commented Jul 11, 2019

@mmariani jsonb isn't supported by Presto, and TimescaleDB wouldn't help much, since Presto would likely be unable to push down much of the filtering today.


3 participants