MLDB-1829 jseval rows #548

Status: Open. Wants to merge 2 commits into base: master.
23 changes: 23 additions & 0 deletions container_files/public_html/doc/builtin/lang/Javascript.md
@@ -196,6 +196,29 @@ MLDB's atomic types are represented in Javascript as follows:
- An array will be used as a compound path with the elements as specified.


### ExpressionValue object

MLDB's non-atomic types are represented in Javascript by the
`ExpressionValue` class, which represents the various atomic and
structured values of MLDB, plus their associated timestamps.

That object has the following methods:

- `toJs()` will return a Javascript representation of the current
values, stripping off the timestamps. Structures and arrays
are supported.
- `when()` will return the timestamp associated with a value.
- `at()` will return a new ExpressionValue with the timestamp
modified to happen at the given point in time. This can be
used to put timestamps back on values which have been processed
with Javascript code.
- `columns()` returns an object with the column names as keys and
their ExpressionValue values as values. Note that this method
only un-nests by one level.

ExpressionValue objects are primarily used by the `jseval` function
with simplified arguments off.

Review comment (Contributor):
Consider adding a link to the description of simplified arguments.



### Filesystem access

25 changes: 23 additions & 2 deletions container_files/public_html/doc/builtin/sql/ValueExpression.md
@@ -639,7 +639,7 @@ The SQL function `jseval` allows for the inline definition of functions using Ja

1. A text string containing the text of the function to be evaluated. This
must be a valid Javascript function, which will return with the `return`
function. For example, `return x + y`. This must be a constant string,
keyword. For example, `return x + y`. This must be a constant string,
it cannot be an expression that is evaluated at run time.
2. A text string containing the names of all of the parameters that will be
passed to the function, as they are referred to within the function. For
@@ -649,13 +649,21 @@ The SQL function `jseval` allows for the inline definition of functions using Ja
any SQL expressions and will be bound to the parameters passed in to the
function.

There are two ways that values in arguments can be represented in Javascript:
simplified (the default), and a non-simplified representation that is accessed
by adding a `!` character to the parameter name argument (for example, `!x,y`
instead of `x,y`).

Review comment (Member):
Ouch. My brow furrows. Using "!", which generally means "not"? I'm not saying it's bad; I'm wondering whether we're sure it's the right thing to do.

Review comment (Member):
An alternative would be a different function name: jseval<something>.

Review comment (Contributor):
I also raised an eyebrow when I saw the "!"...

Review comment (Contributor Author):
The goal was to avoid breaking existing uses of jseval, without cluttering up the syntax. Should we use jseval2? Or jsexec? Any other thoughts? Because it's more complex to write a js function with ExpressionValues as inputs, I didn't want to make this the default way of doing things.

Review comment (Contributor):
Why not just add an optional parameter?
jseval('return row;', 'row', {*}, fancyStructuredThingy=true)
(with right syntax...)

Review comment (Member):
The js way:

jseval('return row;', 'row', {*}, {fancyStructuredThingy :true})

I think that's the best solution so far.

Review comment (Contributor Author):
That doesn't work because it is possible to pass more than one argument to jseval.

Review comment (Member):
That's right... From the doc:

As many argument as are listed in part 2, in the same order. These can be any SQL expressions and will be bound to the parameters passed in to the function.

Then we are back to the alternate function name.


### Simplified arguments

The result of the function will be the result of calling the function on the
supplied arguments. This will be converted into a result as follows:

- A `null` will remain a `null`
- A Javascript number, string or `Date` will be converted to the equivalent
MLDB number, string or timestamp;
- An object (dictionary) will be converted to a row
- An object (dictionary) or array will be converted to a row, with each
element represented as a `[column name, value, timestamp]` tuple.

In all cases, the timestamp on the output will be equal to the latest of the
timestamps on the arguments passed in to the function.
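
The timestamp rule stated above can be sketched as a plain maximum over the argument timestamps. This is only an illustration: `Timestamp` and `latestTimestamp` are hypothetical names, timestamps are modeled as seconds since the epoch rather than MLDB's own date type, and the behavior with zero arguments is an assumption of the sketch, not something the documentation states.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Assumption: timestamps are modeled as seconds since the epoch.
using Timestamp = double;

// Hypothetical helper: returns the latest timestamp among the arguments.
// With no arguments there is nothing to take the maximum of, so this sketch
// returns negative infinity; the source does not say what MLDB does here.
Timestamp latestTimestamp(const std::vector<Timestamp> & argTimestamps)
{
    Timestamp latest = -std::numeric_limits<Timestamp>::infinity();
    for (Timestamp ts : argTimestamps)
        latest = std::max(latest, ts);
    return latest;
}
```

Under this model, a function called with arguments stamped 10, 30 and 20 would produce an output stamped 30.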
@@ -695,3 +703,16 @@ log to the console to aid debugging. Documentation for this object can be found

You can also take a look at the ![](%%nblink _tutorials/Executing JavaScript Code Directly in SQL Queries Using the jseval Function Tutorial) for examples of how to use the `jseval` function.

### Non-simplified arguments

If the first character of the argument string is `!`, then non-simplified
arguments are used. These are harder to work with in Javascript, but allow
for the entire set of values in MLDB to be represented, especially structured
values or those with repeated columns or multiple timestamps per value.
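
As a rough illustration of the `!` convention described above, the following sketch splits a parameter-name string into a flag and a list of names. It is not MLDB's parser: `ParamSpec` and `parseParamSpec` are hypothetical names, and the sketch assumes names are separated by plain commas with no whitespace handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type, not part of MLDB.
struct ParamSpec {
    bool nonSimplified = false;      // true when the string started with '!'
    std::vector<std::string> names;  // parameter names, in order
};

// Hypothetical helper: splits "x,y" or "!x,y" into a flag plus names.
// Assumption: no whitespace trimming; empty names are skipped.
ParamSpec parseParamSpec(const std::string & spec)
{
    ParamSpec result;
    std::size_t pos = 0;
    if (!spec.empty() && spec[0] == '!') {
        result.nonSimplified = true;
        pos = 1;  // skip the marker
    }
    while (pos <= spec.size()) {
        std::size_t comma = spec.find(',', pos);
        if (comma == std::string::npos)
            comma = spec.size();
        if (comma > pos)
            result.names.push_back(spec.substr(pos, comma - pos));
        pos = comma + 1;
    }
    return result;
}
```

With this sketch, `parseParamSpec("!x,y")` sets the flag and yields the names `x` and `y`, while `parseParamSpec("x,y")` yields the same names with the flag unset.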

For example, the following query yields the same as `(SELECT x:1, y:2)`,
in other words it doesn't mess around with the values:

```sql
SELECT jseval('return row;', '!row', {*}) AS * FROM (SELECT x:1, y:2)
```

Review comment (Contributor):
This example seems artificial. If it is, I would consider adding an example that is more realistic (e.g. the use case that forces us to add the non-simplified argument).

Review comment (Contributor Author):
It's an artificial example, but it's also the only way to return a real structure from jseval. We can add a better use-case.

4 changes: 2 additions & 2 deletions plugins/lang/js/dataset_js.cc
@@ -9,7 +9,7 @@

#include "dataset_js.h"
#include "mldb/core/dataset.h"
#include "mldb/types/js/id_js.h"
#include "id_js.h"


using namespace std;
@@ -25,7 +25,7 @@ namespace MLDB {

v8::Handle<v8::Object>
DatasetJS::
create(std::shared_ptr<Dataset> dataset, JsPluginContext * context)
create(std::shared_ptr<Dataset> dataset, JsThreadContext * context)
{
auto obj = context->Dataset->GetFunction()->NewInstance();
auto * wrapped = new DatasetJS();
2 changes: 1 addition & 1 deletion plugins/lang/js/dataset_js.h
@@ -25,7 +25,7 @@ struct DatasetJS: public JsObjectBase {
std::shared_ptr<Dataset> dataset;

static v8::Handle<v8::Object>
create(std::shared_ptr<Dataset> dataset, JsPluginContext * context);
create(std::shared_ptr<Dataset> dataset, JsThreadContext * context);

static Dataset *
getShared(const v8::Handle<v8::Object> & val);
4 changes: 2 additions & 2 deletions plugins/lang/js/function_js.cc
@@ -7,7 +7,7 @@

#include "function_js.h"
#include "mldb/core/function.h"
#include "mldb/types/js/id_js.h"
#include "id_js.h"


using namespace std;
@@ -23,7 +23,7 @@ namespace MLDB {

v8::Handle<v8::Object>
FunctionJS::
create(std::shared_ptr<Function> function, JsPluginContext * context)
create(std::shared_ptr<Function> function, JsThreadContext * context)
{
auto obj = context->Function->GetFunction()->NewInstance();
auto * wrapped = new FunctionJS();
2 changes: 1 addition & 1 deletion plugins/lang/js/function_js.h
@@ -25,7 +25,7 @@ struct FunctionJS: public JsObjectBase {
std::shared_ptr<Function> function;

static v8::Handle<v8::Object>
create(std::shared_ptr<Function> function, JsPluginContext * context);
create(std::shared_ptr<Function> function, JsThreadContext * context);

static Function *
getShared(const v8::Handle<v8::Object> & val);
2 changes: 1 addition & 1 deletion types/js/id_js.h → plugins/lang/js/id_js.h
@@ -6,7 +6,7 @@

#pragma once

#include "mldb/soa/js/js_utils.h"
#include "mldb/plugins/lang/js/js_utils.h"
#include "mldb/types/id.h"

namespace Datacratic {
83 changes: 65 additions & 18 deletions plugins/lang/js/js_common.cc
@@ -98,7 +98,7 @@ CellValue from_js(const JS::JSValue & value, CellValue *)
return CellValue(Date::fromSecondsSinceEpoch(value->NumberValue() / 1000.0));
else if (value->IsObject()) {
// Look if it's already a CellValue
JsPluginContext * cxt = JsContextScope::current();
JsThreadContext * cxt = JsContextScope::current();
if (cxt->CellValue->HasInstance(value)) {
return CellValueJS::getShared(value);
}
@@ -126,8 +126,9 @@ void to_js(JS::JSValue & value, const CellValue & val)
to_js(value, val.toString());
}
else {
cerr << endl << endl << endl << "((((((( CELLVALUE ))))))))" << endl << endl;
// Get our context so we can return a proper object
JsPluginContext * cxt = JsContextScope::current();
JsThreadContext * cxt = JsContextScope::current();
value = CellValueJS::create(val, cxt);
}
}
@@ -178,18 +179,63 @@ void to_js(JS::JSValue & value, const Path & val)

void to_js(JS::JSValue & value, const ExpressionValue & val)
{
to_js(value, val.getAtom());
JsThreadContext * cxt = JsContextScope::current();
value = ExpressionValueJS::create(val, cxt);
}

ExpressionValue from_js(const JS::JSValue & value, ExpressionValue *)
{
// NOTE: we currently pretend that CellValue and ExpressionValue
// are the same thing; they are not. We will eventually need to
// allow proper JS access to full-blown ExpressionValue objects,
// backed with a JS object.
if (value->IsNull() || value->IsUndefined())
return ExpressionValue::null(Date::notADate());
else if (value->IsNumber())
return ExpressionValue(value->NumberValue(), Date::notADate());
else if (value->IsDate())
return ExpressionValue(Date::fromSecondsSinceEpoch(value->NumberValue() / 1000.0),
Date::notADate());
else if (value->IsArray()) {
// It must be an embedding
StructValue result;

auto arrPtr = v8::Array::Cast(*value);
for(size_t i=0; i<arrPtr->Length(); ++i) {
PathElement key(i);
v8::Local<v8::Value> val = arrPtr->Get(i);
ExpressionValue ev = from_js(val, (ExpressionValue *)0);
result.emplace_back(std::move(key), std::move(ev));
}

return std::move(result);
}
else if (value->IsObject()) {
// Look if it's already an ExpressionValue
JsThreadContext * cxt = JsContextScope::current();
if (cxt->ExpressionValue->HasInstance(value)) {
return ExpressionValueJS::getShared(value);
}

CellValue val = from_js(value, (CellValue *)0);
return ExpressionValue(val, Date::notADate());
// Look if it's already a CellValue
if (cxt->CellValue->HasInstance(value)) {
return ExpressionValue(CellValueJS::getShared(value),
Date::notADate());
}

auto objPtr = v8::Object::Cast(*value);

// It must be a nested structure
StructValue result;

v8::Local<v8::Array> properties = objPtr->GetOwnPropertyNames();

for (size_t i=0; i<properties->Length(); ++i) {
v8::Local<v8::Value> key = properties->Get(i);
v8::Local<v8::Value> val = objPtr->Get(key);
ExpressionValue ev = from_js(val, (ExpressionValue *)0);
result.emplace_back(PathElement(JS::utf8str(key)), std::move(ev));
}

return std::move(result);
}
else return ExpressionValue(JS::utf8str(value), Date::notADate());
}
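
Note: the recursive shape of the new `from_js` overload (arrays become structures keyed by index, objects become structures keyed by property name, everything else becomes an atom) can be sketched without V8. This is a simplified model only: `JsValue`, `Value` and `fromJs` are hypothetical stand-ins, index keys are stored as strings rather than `PathElement`, and dates, timestamps and already-wrapped `ExpressionValue` or `CellValue` instances are left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Javascript value (not a V8 type).
struct JsValue {
    enum Kind { Null, Number, String, Array, Object } kind = Null;
    double number = 0;
    std::string text;
    std::vector<JsValue> elements;   // array elements or object values
    std::vector<std::string> keys;   // property names, parallel to elements
};

// Hypothetical stand-in for the converted value (atom or structure).
struct Value {
    enum Kind { Null, Number, String, Struct } kind = Null;
    double number = 0;
    std::string text;
    std::vector<std::string> names;  // column names when kind == Struct
    std::vector<Value> values;       // column values, parallel to names
};

// Recursively converts the stand-in JS value into the stand-in structure.
Value fromJs(const JsValue & js)
{
    Value out;
    switch (js.kind) {
    case JsValue::Null:
        break;
    case JsValue::Number:
        out.kind = Value::Number;
        out.number = js.number;
        break;
    case JsValue::String:
        out.kind = Value::String;
        out.text = js.text;
        break;
    case JsValue::Array:
        // Arrays are keyed by element index.
        out.kind = Value::Struct;
        for (std::size_t i = 0; i < js.elements.size(); ++i) {
            out.names.push_back(std::to_string(i));
            out.values.push_back(fromJs(js.elements[i]));
        }
        break;
    case JsValue::Object:
        // Objects are keyed by property name.
        out.kind = Value::Struct;
        for (std::size_t i = 0; i < js.elements.size(); ++i) {
            out.names.push_back(js.keys.at(i));
            out.values.push_back(fromJs(js.elements[i]));
        }
        break;
    }
    return out;
}
```

In this model, an object such as `{a: 1, b: [2, "x"]}` converts to a structure with columns `a` and `b`, where `b` is itself a structure with columns `0` and `1`.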

ScriptStackFrame
@@ -337,18 +383,18 @@ JsObjectBase::
js_object_.Clear();
}

JsPluginContext *
JsThreadContext *
JsObjectBase::
getContext(const v8::Handle<v8::Object> & val)
{
return reinterpret_cast<JsPluginContext *>
return reinterpret_cast<JsThreadContext *>
(v8::Handle<v8::External>::Cast
(val->GetInternalField(1))->Value());
}

void
JsObjectBase::
wrap(v8::Handle<v8::Object> handle, JsPluginContext * context)
wrap(v8::Handle<v8::Object> handle, JsThreadContext * context)
{
ExcAssert(js_object_.IsEmpty());

@@ -420,9 +466,10 @@ garbageCollectionCallback(v8::Persistent<v8::Value> value, void *data)
/*****************************************************************************/

JsContextScope::
JsContextScope(JsPluginContext * context)
JsContextScope(JsThreadContext * context)
: context(context)
{
ExcAssert(context);
enter(context);
}

@@ -439,9 +486,9 @@ JsContextScope::
exit(context);
}

static __thread std::vector<JsPluginContext *> * jsContextStack = nullptr;
static __thread std::vector<JsThreadContext *> * jsContextStack = nullptr;

JsPluginContext *
JsThreadContext *
JsContextScope::
current()
{
@@ -452,16 +499,16 @@ current()

void
JsContextScope::
enter(JsPluginContext * context)
enter(JsThreadContext * context)
{
if (!jsContextStack)
jsContextStack = new std::vector<JsPluginContext *>();
jsContextStack = new std::vector<JsThreadContext *>();
jsContextStack->push_back(context);
}

void
JsContextScope::
exit(JsPluginContext * context)
exit(JsThreadContext * context)
{
if (current() != context)
throw ML::Exception("JS context stack consistency error");
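
Note: the enter/exit/current pattern in this hunk is a per-thread stack of contexts managed by a scope object. A minimal standalone sketch of that pattern follows. It is not the MLDB code: `Context` and `ContextScope` are hypothetical names, the sketch uses a function-local `thread_local` vector in place of the lazily allocated `__thread` pointer, it assumes `current()` returns null when no scope is active, and it does not reproduce the consistency check that throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Placeholder for the per-thread JS context; the real JsThreadContext is
// not reproduced here.
struct Context { int id; };

class ContextScope {
public:
    // Pushes the context on this thread's stack for the scope's lifetime.
    explicit ContextScope(Context * context) : context_(context)
    {
        if (!context_)
            throw std::invalid_argument("null context");
        stack().push_back(context_);
    }

    ~ContextScope()
    {
        // Scopes on one thread are destroyed in reverse order of
        // construction, so the top of the stack is this scope's context.
        stack().pop_back();
    }

    ContextScope(const ContextScope &) = delete;
    ContextScope & operator = (const ContextScope &) = delete;

    // Innermost active context on this thread, or nullptr if there is none
    // (an assumption of this sketch).
    static Context * current()
    {
        return stack().empty() ? nullptr : stack().back();
    }

private:
    // One stack per thread, created on first use.
    static std::vector<Context *> & stack()
    {
        thread_local std::vector<Context *> contexts;
        return contexts;
    }

    Context * context_;
};
```

Nesting two scopes makes `current()` return the inner context until the inner scope ends, after which it returns the outer one again.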