csvz-meta-columns column types #9

Open
MarkPflug opened this issue Sep 10, 2020 · 2 comments
Labels
help wanted Extra attention is needed

Comments

@MarkPflug

The spec for this file feels rather useless unless some minimal set of standard types is defined. A well-defined schema would allow a database import tool to construct the appropriate table in the database. Without a standard set, a tool would have to fall back to "string" whenever it encountered an unknown type.

I would propose as a minimum:

  • boolean (true/false, 0/1)
  • int (byte/short/long?)
  • float (float/double)
  • date (datetime)
  • string (might be worth specifying ascii vs unicode)
  • binary (Base64)

Possibly also include:

  • guid
  • time (timespan/duration)
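
To make the proposal concrete, here is a minimal Python sketch of how a reader tool might turn cell text into values for the types listed above. Nothing here is part of any csvz spec: the names `PARSERS` and `parse_value` are hypothetical, ISO 8601 is assumed for `date`, and `time` is left out because its text format is still open. Unknown type names fall back to plain strings.

```python
import base64
import uuid
from datetime import datetime


def parse_boolean(text):
    """Accept the two spellings proposed above: true/false and 0/1."""
    lowered = text.strip().lower()
    if lowered in ("true", "1"):
        return True
    if lowered in ("false", "0"):
        return False
    raise ValueError(f"not a boolean: {text!r}")


# Hypothetical mapping from proposed type names to parser functions.
PARSERS = {
    "boolean": parse_boolean,
    "int": int,
    "float": float,
    "date": datetime.fromisoformat,  # assumption: ISO 8601 text
    "string": str,
    "binary": base64.b64decode,      # Base64, as proposed
    "guid": uuid.UUID,
}


def parse_value(type_name, text):
    """Parse one cell; unknown type names are treated as strings."""
    parser = PARSERS.get(type_name, str)
    return parser(text)
```

For example, `parse_value("binary", "aGk=")` yields the two bytes `b"hi"`, while a cell whose column type is not in the table comes back unchanged as text.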
@secretGeek secretGeek added the help wanted Extra attention is needed label Sep 11, 2020
@secretGeek
Owner

Agreed.

Data types to be defined in a set of further spec fragments.

Need to pull together basic datatypes from existing meta formats (xml/json/sql)

@secretGeek
Owner

secretGeek commented Sep 11, 2020

One thought.... (and this covers both 'basic' types and 'user defined' types...)

The columns.csv might say that a column has a datatype of "boolean"

How do we know what boolean means?

We check if there is a file called:

  _meta\types.csv

....if there is....

we check if there is a "boolean" type defined in there.

...if there is no boolean in there, then we (the author of a tool for doing csvz stuff.....) check if "boolean" has a default meaning in csvz land.

We find it does! Ah... it means....

boolean: a binary value that is by default encoded as the characters "0" (ASCII 48*) and "1" (ASCII 49), with 0 traditionally meaning false and 1 meaning true.

(* just a made-up example; there is not yet a spec for any fundamental types.) So then the tool creator can work out for themselves the appropriate way to store that (e.g. if they're writing a csvz -> sql server tool they might use 'bit', and if they're creating a csvz -> json tool they'll use something else, but they'll match the semantics described therein).
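
As a rough illustration of the csvz -> sql server case, a tool could keep its own table from type names to column types and build a `CREATE TABLE` statement from the column list. Only `bit` for boolean comes from this thread; every other SQL Server type chosen below, and the names `SQL_SERVER_TYPES` and `create_table_sql`, are assumptions made for the sketch.

```python
# Hypothetical mapping; only "boolean" -> "bit" is suggested in the thread.
SQL_SERVER_TYPES = {
    "boolean": "bit",
    "int": "bigint",
    "float": "float",
    "date": "datetime2",
    "string": "nvarchar(max)",
    "binary": "varbinary(max)",
    "guid": "uniqueidentifier",
}


def create_table_sql(table_name, columns):
    """Build a CREATE TABLE statement.

    columns is a list of (column_name, type_name) pairs.
    Unknown type names are stored as text, mirroring the string fallback.
    Identifiers are not escaped here; a real tool would need to do that.
    """
    parts = []
    for name, type_name in columns:
        sql_type = SQL_SERVER_TYPES.get(type_name, "nvarchar(max)")
        parts.append(f"[{name}] {sql_type}")
    return f"CREATE TABLE [{table_name}] (" + ", ".join(parts) + ");"
```

A csvz -> json tool would carry a different table (for instance mapping boolean to JSON `true`/`false`) while keeping the same semantics.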

... or perhaps we did find it in the "types.csv" file.... wherein it said..... other type details... encoding, minimum, maximum, etc... enough to specify it in terms of the basic types..... it could even refer one to custom encoders/decoders .....

... or perhaps we didn't find it in types.csv, nor was it a fundamental type defined in the csvz spec. Then it's a string, with the default expectations of a string: it can be very large, and thus it is transmitted intact, and the person at the other end can make sense of it if they need to (e.g. it might be a custom formatted string, and the way to decode it is transmitted "out of band").
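
The lookup order described in this comment (types.csv first, then the spec's fundamental types, then string) could be sketched in Python roughly as follows. The layout of types.csv is not defined anywhere yet, so the `name`, `base`, `minimum` and `maximum` columns are purely hypothetical, as are `BUILT_IN_TYPES`, `load_custom_types` and `resolve_type`.

```python
import csv
import io

# Assumption: the fundamental types the spec might eventually define.
BUILT_IN_TYPES = {"boolean", "int", "float", "date", "string", "binary"}


def load_custom_types(types_csv_text):
    """Read the text of _meta\\types.csv into a dict keyed by type name.

    Assumes a header row with a 'name' column; pass "" if the file is absent.
    """
    reader = csv.DictReader(io.StringIO(types_csv_text))
    return {row["name"]: row for row in reader}


def resolve_type(type_name, custom_types):
    """Apply the three-step lookup and report where the type was found."""
    if type_name in custom_types:
        return ("custom", custom_types[type_name])
    if type_name in BUILT_IN_TYPES:
        return ("built-in", type_name)
    # Neither defined by the user nor by the spec: treat it as a string.
    return ("fallback", "string")
```

With a types.csv containing a row such as `percent,int,0,100`, `resolve_type("percent", ...)` returns the custom definition, `resolve_type("boolean", ...)` returns the built-in type, and any other name resolves to the string fallback.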
