Convert dt.isna() to FExpr #3444

Open
wants to merge 54 commits into
base: main
Changes from 45 commits
4c4b05d
add header for isna
samukweku Mar 12, 2023
c609de1
implemenation of isna fexpr
samukweku Mar 13, 2023
7acfade
update docs
samukweku Mar 13, 2023
d69b35f
modify docs
samukweku Mar 13, 2023
adfd351
fixes
samukweku Mar 13, 2023
629df72
add method for isna
samukweku Mar 15, 2023
f34944f
add tests for isna method
samukweku Mar 15, 2023
4c5351c
format with clang-format
samukweku Mar 15, 2023
59fc3f6
indent
samukweku Mar 15, 2023
18d1f97
update fexpr_isna
samukweku Mar 15, 2023
d30c226
add to api/fexpr.rst
samukweku Mar 15, 2023
a704458
fix spelling
samukweku Mar 15, 2023
0fcbd68
fix docs failing
samukweku Mar 15, 2023
ae3ee4b
Update docs/api/math/isna.rst
samukweku Mar 23, 2023
af6be42
Update docs/api/math/isna.rst
samukweku Mar 23, 2023
ff06759
Update docs/api/math/isna.rst
samukweku Mar 23, 2023
adb4cf4
Update src/core/expr/funary/isna/fexpr_isna.cc
samukweku Mar 25, 2023
64b86db
ditch isna folder
samukweku Mar 25, 2023
d95f0c1
update fexpr
samukweku Mar 25, 2023
0733560
simplify logic for isna
samukweku Mar 26, 2023
80b687f
further simplify isna logic
samukweku Mar 26, 2023
d87df85
cleanup fillna.cc
samukweku Mar 26, 2023
7a382b8
use ternary operator where possible
samukweku Mar 26, 2023
781c2c3
fix indent
samukweku Mar 26, 2023
9037d0b
use FExpr_FuncUnary for isna
samukweku Mar 29, 2023
8704c50
updates based on feedback
samukweku Apr 17, 2023
890e56d
fix test fails
samukweku Apr 21, 2023
2a43959
updates based on feedback
samukweku Apr 22, 2023
b58e930
restore newline
samukweku Apr 24, 2023
4c90f41
updates
samukweku Apr 24, 2023
8e78f88
move make_isna_col to isna.h
samukweku Apr 25, 2023
ef1a9cb
update based on feedback
samukweku Apr 28, 2023
8fdec20
enhance examples
samukweku Apr 28, 2023
1c7098a
move fexpr_isna to main folder
samukweku Apr 28, 2023
ed687fe
add isna to docs
samukweku Apr 28, 2023
c7dcddf
rename doc_isna to doc_dt_isna
samukweku Apr 28, 2023
b32397d
doc_dt_isna
samukweku Apr 28, 2023
a734bcf
Implement slicing for categorical columns (#3379)
oleksiyskononenko Apr 25, 2023
f59c1b6
Minor refactoring of methods to get the underlying column type (#3458)
oleksiyskononenko Apr 28, 2023
3df983f
Update docs/api/dt/isna.rst
samukweku May 2, 2023
d41570f
Update docs/api/fexpr.rst
samukweku May 2, 2023
55efdfe
Update docs/api/math/isna.rst
samukweku May 2, 2023
46535b3
implemenation of isna fexpr
samukweku Mar 13, 2023
8856726
update based on feedback
samukweku Apr 28, 2023
3a6d83b
fixes based on feedback
samukweku May 3, 2023
b06f1f5
Update src/core/column/isna.h
samukweku May 5, 2023
95590b6
Update src/core/expr/fexpr_isna.cc
samukweku May 5, 2023
7e70e25
remove irrelevant import
samukweku May 5, 2023
aa2ee23
updates based on code review
samukweku May 5, 2023
05f51ad
fix for isna with non fexprs
samukweku May 6, 2023
aae69c8
import isna from a single point
samukweku May 6, 2023
aaec02e
copyright date updates
samukweku May 6, 2023
46f53fe
update copyright dates
samukweku May 6, 2023
a193d14
Merge branch 'main' into samukweku/fexpr_isna
samukweku May 17, 2023
50 changes: 50 additions & 0 deletions docs/api/dt/isna.rst
@@ -0,0 +1,50 @@

.. xfunction:: datatable.isna
:src: src/core/expr/fexpr_isna.cc pyfn_isna
:tests: tests/math/test-isna.py
:cvar: doc_dt_isna
:signature: isna(cols)

Test if the column elements are missing values.

Parameters
----------
cols: FExpr
Input columns.

return: FExpr
f-expression that returns `0` for valid elements and `1` otherwise.
All the resulting columns will have `bool8` stypes
and as many rows/columns as there are in `cols`.

Examples
--------

.. code-block:: python

    >>> from datatable import dt, f
    >>> from datetime import datetime
    >>> DT = dt.Frame({'age': [5.0, 6.0, None],
    ...                'born': [None,
    ...                         datetime(1939, 5, 27, 0, 0),
    ...                         datetime(1940, 4, 25, 0, 0)],
    ...                'name': ['Alfred', 'Batman', ''],
    ...                'toy': [None, 'Batmobile', 'Joker']})
    >>> DT
       |     age  born                 name    toy
       | float64  time64               str32   str32
    -- + -------  -------------------  ------  ---------
     0 |       5  NA                   Alfred  NA
     1 |       6  1939-05-27T00:00:00  Batman  Batmobile
     2 |      NA  1940-04-25T00:00:00          Joker
    [3 rows x 4 columns]
    >>> DT[:, dt.isna(f[:])]
       |   age   born   name    toy
       | bool8  bool8  bool8  bool8
    -- + -----  -----  -----  -----
     0 |     0      1      0      1
     1 |     0      0      0      0
     2 |     1      0      0      0
    [3 rows x 4 columns]
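
The documented behaviour in the example above can be modelled without datatable. The following is only a plain-Python sketch: the `isna` helper and the dict-of-lists `frame` are stand-ins invented for illustration, and the sketch treats only `None` as missing. It shows the point the example makes with the `name` column: an empty string is a valid value, not an NA.

```python
from datetime import datetime


def isna(column):
    """Return 1 for missing (None) elements and 0 otherwise.

    Hypothetical helper: a pure-Python stand-in, not datatable's function.
    """
    return [int(value is None) for value in column]


# Same data as in the documentation example, stored as a dict of lists.
frame = {
    "age": [5.0, 6.0, None],
    "born": [None, datetime(1939, 5, 27), datetime(1940, 4, 25)],
    "name": ["Alfred", "Batman", ""],
    "toy": [None, "Batmobile", "Joker"],
}

# One flag column per input column, with the same number of rows.
flags = {name: isna(values) for name, values in frame.items()}
```

Each flag column has as many rows as its input, which matches the "as many rows/columns as there are in `cols`" statement in the docstring.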


8 changes: 6 additions & 2 deletions docs/api/fexpr.rst
@@ -63,8 +63,8 @@
- Remove columns from the ``FExpr``.


-Arithmeritc operators
----------------------
+Arithmetic operators
+--------------------

.. list-table::
:widths: auto
@@ -190,6 +190,9 @@
* - :meth:`.first()`
- Same as :func:`dt.first()`.

* - :meth:`.isna()`
- Same as :func:`dt.isna()`.

* - :meth:`.last()`
- Same as :func:`dt.last()`.

@@ -320,6 +323,7 @@
.extend() <fexpr/extend>
.fillna() <fexpr/fillna>
.first() <fexpr/first>
.isna() <fexpr/isna>
.last() <fexpr/last>
.len() <fexpr/len>
.max() <fexpr/max>
7 changes: 7 additions & 0 deletions docs/api/fexpr/isna.rst
@@ -0,0 +1,7 @@

.. xmethod:: datatable.FExpr.isna
:src: src/core/expr/fexpr.cc PyFExpr::isna
:cvar: doc_FExpr_isna
:signature: isna()

Equivalent to :func:`dt.isna(cols)`.
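
The method form adds no behaviour of its own; in this PR the C++ method `PyFExpr::isna` simply looks up the `isna` function and calls it with `this`. A minimal Python sketch of that delegation pattern follows. The `Expr` class and its `repr` format are hypothetical and exist only to make the equivalence visible; they are not datatable's classes.

```python
class Expr:
    """Hypothetical stand-in for an f-expression: it only records an operation tree."""

    def __init__(self, op, arg=None):
        self.op = op
        self.arg = arg

    def isna(self):
        # The method holds no logic: it forwards to the module-level function.
        return isna(self)

    def __repr__(self):
        return self.op if self.arg is None else f"{self.op}({self.arg!r})"


def isna(cols):
    """Module-level entry point: wrap the argument in an 'isna' node."""
    return Expr("isna", cols)


col = Expr("f.age")
method_form = col.isna()
function_form = isna(col)
```

Because the method only forwards, both spellings build the same expression tree, so any later change to the function applies to the method as well.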
3 changes: 3 additions & 0 deletions docs/api/index-api.rst
@@ -179,6 +179,8 @@ Functions
- Calculate covariance between two columns
* - :func:`fillna()`
- Impute missing values
* - :func:`isna()`
- Test for missing values
* - :func:`max()`
- Find the largest element per column
* - :func:`mean()`
@@ -266,6 +268,7 @@ Other
init_styles() <dt/init_styles>
intersect() <dt/intersect>
iread() <dt/iread>
isna() <dt/isna>
join() <dt/join>
last() <dt/last>
max() <dt/max>
10 changes: 6 additions & 4 deletions docs/api/math/isna.rst
@@ -1,8 +1,10 @@

.. xfunction:: datatable.math.isna
-:src: src/core/expr/funary/floating.cc resolve_op_isna
+:src: src/core/expr/fexpr_isna.cc pyfn_isna
:tests: tests/math/test-isna.py
-:cvar: doc_math_isna
-:signature: isna(x)
+:cvar: doc_dt_isna
+:signature: isna(cols)

+Same as :func:`dt.isna()`.


-Returns `True` if the argument is NA, and `False` otherwise.
20 changes: 19 additions & 1 deletion src/core/column/isna.h
@@ -21,6 +21,7 @@
//------------------------------------------------------------------------------
#ifndef dt_COLUMN_ISNA_h
#define dt_COLUMN_ISNA_h
#include "column/const.h"
#include "column/virtual.h"
#include "stype.h"
namespace dt {
@@ -62,7 +63,24 @@ class Isna_ColumnImpl : public Virtual_ColumnImpl {
};



static Column make_isna_col(Column&& col) {
  switch (col.stype()) {
    case SType::VOID: return Const_ColumnImpl::make_bool_column(col.nrows(), true);
    case SType::BOOL:
    case SType::INT8: return Column(new Isna_ColumnImpl<int8_t>(std::move(col)));
    case SType::INT16: return Column(new Isna_ColumnImpl<int16_t>(std::move(col)));
    case SType::DATE32:
    case SType::INT32: return Column(new Isna_ColumnImpl<int32_t>(std::move(col)));
    case SType::TIME64:
    case SType::INT64: return Column(new Isna_ColumnImpl<int64_t>(std::move(col)));
    case SType::FLOAT32: return Column(new Isna_ColumnImpl<float>(std::move(col)));
    case SType::FLOAT64: return Column(new Isna_ColumnImpl<double>(std::move(col)));
    case SType::STR32:
    case SType::STR64: return Column(new Isna_ColumnImpl<CString>(std::move(col)));
    default: throw RuntimeError();
  }
}

} // namespace dt
#endif
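
The dispatch in `make_isna_col` groups stypes by the element type used for the NA check and short-circuits void columns to a constant. A rough Python model of that shape is given below. The `ELEMENT_TYPE` table, the lower-case stype names as dictionary keys, and the error message are assumptions made for this sketch, and the result is computed eagerly instead of through a virtual column.

```python
# Hypothetical table mirroring the case labels of the C++ switch:
# stype name -> element type the NA check would be instantiated with.
ELEMENT_TYPE = {
    "bool8": "int8_t", "int8": "int8_t",
    "int16": "int16_t",
    "date32": "int32_t", "int32": "int32_t",
    "time64": "int64_t", "int64": "int64_t",
    "float32": "float", "float64": "double",
    "str32": "CString", "str64": "CString",
}


def make_isna_col(stype, values):
    """Eager model of the dispatch; `values` is a list with None for NA."""
    if stype == "void":
        # Every element of a void column is NA, so the answer is constant True.
        return [True] * len(values)
    if stype not in ELEMENT_TYPE:
        # Corresponds to the `default:` branch of the switch.
        raise RuntimeError(f"isna is not supported for stype {stype!r}")
    # All supported stypes share the same check; only the element type differs.
    return [value is None for value in values]


void_flags = make_isna_col("void", [None, None])
int_flags = make_isna_col("int32", [1, None, 3])
```

Sharing one element type between, for example, `date32` and `int32` reflects that both are stored as 32-bit integers, so a single instantiation serves both.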

3 changes: 2 additions & 1 deletion src/core/documentation.h
@@ -45,6 +45,7 @@ extern const char* doc_dt_ifelse;
extern const char* doc_dt_init_styles;
extern const char* doc_dt_intersect;
extern const char* doc_dt_iread;
extern const char* doc_dt_isna;
extern const char* doc_dt_join;
extern const char* doc_dt_last;
extern const char* doc_dt_max;
@@ -112,7 +113,6 @@ extern const char* doc_math_hypot;
extern const char* doc_math_isclose;
extern const char* doc_math_isfinite;
extern const char* doc_math_isinf;
-extern const char* doc_math_isna;
extern const char* doc_math_ldexp;
extern const char* doc_math_lgamma;
extern const char* doc_math_log10;
@@ -297,6 +297,7 @@ extern const char* doc_FExpr_cumsum;
extern const char* doc_FExpr_extend;
extern const char* doc_FExpr_fillna;
extern const char* doc_FExpr_first;
extern const char* doc_FExpr_isna;
extern const char* doc_FExpr_last;
extern const char* doc_FExpr_max;
extern const char* doc_FExpr_mean;
11 changes: 10 additions & 1 deletion src/core/expr/fexpr.cc
@@ -290,7 +290,6 @@ DECLARE_METHOD(&PyFExpr::re_match)




//------------------------------------------------------------------------------
// Miscellaneous
//------------------------------------------------------------------------------
@@ -480,6 +479,16 @@ DECLARE_METHOD(&PyFExpr::first)
->docs(dt::doc_FExpr_first);


oobj PyFExpr::isna(const XArgs&) {
  auto isnaFn = oobj::import("datatable", "math", "isna");
  return isnaFn.call({this});
}

DECLARE_METHOD(&PyFExpr::isna)
    ->name("isna")
    ->docs(dt::doc_FExpr_isna);


oobj PyFExpr::last(const XArgs&) {
auto lastFn = oobj::import("datatable", "last");
return lastFn.call({this});
1 change: 1 addition & 0 deletions src/core/expr/fexpr.h
@@ -192,6 +192,7 @@ class PyFExpr : public py::XObject<PyFExpr> {
py::oobj extend(const py::XArgs&);
py::oobj fillna(const py::XArgs&);
py::oobj first(const py::XArgs&);
py::oobj isna(const py::XArgs&);
py::oobj last(const py::XArgs&);
py::oobj max(const py::XArgs&);
py::oobj mean(const py::XArgs&);
21 changes: 1 addition & 20 deletions src/core/expr/fexpr_fillna.cc
@@ -183,26 +183,6 @@ class FExpr_FillNA : public FExpr_Func {

return wf;
}


static Column make_isna_col(Column&& col) {
switch (col.stype()) {
case SType::VOID: return Const_ColumnImpl::make_bool_column(col.nrows(), true);
case SType::BOOL:
case SType::INT8: return Column(new Isna_ColumnImpl<int8_t>(std::move(col)));
case SType::INT16: return Column(new Isna_ColumnImpl<int16_t>(std::move(col)));
case SType::DATE32:
case SType::INT32: return Column(new Isna_ColumnImpl<int32_t>(std::move(col)));
case SType::TIME64:
case SType::INT64: return Column(new Isna_ColumnImpl<int64_t>(std::move(col)));
case SType::FLOAT32: return Column(new Isna_ColumnImpl<float>(std::move(col)));
case SType::FLOAT64: return Column(new Isna_ColumnImpl<double>(std::move(col)));
case SType::STR32:
case SType::STR64: return Column(new Isna_ColumnImpl<CString>(std::move(col)));
default: throw RuntimeError();
}
}

};


@@ -236,3 +216,4 @@ DECLARE_PYFN(&pyfn_fillna)


}} // dt::expr

64 changes: 64 additions & 0 deletions src/core/expr/fexpr_isna.cc
@@ -0,0 +1,64 @@
//------------------------------------------------------------------------------
// Copyright 2022-2023 H2O.ai
//
// Permission is hereby granted, free of charge, to any person obtaining a
// copy of this software and associated documentation files (the "Software"),
// to deal in the Software without restriction, including without limitation
// the rights to use, copy, modify, merge, publish, distribute, sublicense,
// and/or sell copies of the Software, and to permit persons to whom the
// Software is furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
// FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
// IN THE SOFTWARE.
//------------------------------------------------------------------------------
#include "column/const.h"
#include "column/isna.h"
#include "expr/fexpr_column.h"
#include "documentation.h"
#include "expr/fexpr_func_unary.h"
#include "expr/eval_context.h"
#include "expr/workframe.h"
#include "python/xargs.h"
#include "stype.h"
namespace dt {
namespace expr {


class FExpr_ISNA : public FExpr_FuncUnary {
  public:
    using FExpr_FuncUnary::FExpr_FuncUnary;


    std::string name() const override {
      return "isna";
    }

    Column evaluate1(Column&& col) const override{
      return make_isna_col(std::move(col));
    }
};


static py::oobj pyfn_isna(const py::XArgs &args) {
  auto isna = args[0].to_oobj();
  return PyFExpr::make(new FExpr_ISNA(as_fexpr(isna)));
}

DECLARE_PYFN(&pyfn_isna)
    ->name("isna")
    ->docs(doc_dt_isna)
    ->arg_names({"cols"})
    ->n_positional_args(1)
    ->n_required_args(1);


}} // dt::expr
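
`FExpr_ISNA` only supplies a name and a per-column `evaluate1`; iterating over the input columns is left to the `FExpr_FuncUnary` base class. The sketch below models that split in Python. The `FuncUnary` base class shown here, its `evaluate_n` method, and the dict-of-lists input are assumptions made for illustration and are not taken from this diff.

```python
class FuncUnary:
    """Hypothetical base class: applies evaluate1 to every input column."""

    def __init__(self, arg):
        self.arg = arg  # here: a dict mapping column name -> list of values

    def evaluate_n(self):
        # The base class owns the loop; subclasses only transform one column.
        return {name: self.evaluate1(col) for name, col in self.arg.items()}

    def evaluate1(self, col):
        raise NotImplementedError


class IsNA(FuncUnary):
    def name(self):
        return "isna"

    def evaluate1(self, col):
        # Per-column work only: flag the missing elements.
        return [value is None for value in col]


expr = IsNA({"a": [1, None], "b": ["x", "y"]})
result = expr.evaluate_n()
```

Keeping the loop in the base class is what lets the new file stay short: a unary function needs only its name and its single-column transformation.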

6 changes: 3 additions & 3 deletions src/core/expr/fnary/rowcount.cc
@@ -20,11 +20,11 @@
// IN THE SOFTWARE.
//------------------------------------------------------------------------------
#include <algorithm>
#include "column/isna.h"
#include "column/const.h"
#include "column/func_nary.h"
#include "documentation.h"
#include "expr/fnary/fnary.h"
#include "expr/funary/umaker.h"
#include "python/xargs.h"
namespace dt {
namespace expr {
Expand All @@ -49,7 +49,6 @@ static bool op_rowcount(size_t i, int32_t* out, const colvec& columns) {
return true;
}


Column FExpr_RowCount::apply_function(colvec&& columns,
const size_t nrows,
const size_t) const
@@ -59,7 +58,8 @@ Column FExpr_RowCount::apply_function(colvec&& columns,
}
for (size_t i = 0; i < columns.size(); ++i) {
xassert(columns[i].nrows() == nrows);
-columns[i] = unaryop(Op::ISNA, std::move(columns[i]));
+Column coli = columns[i];
+columns[i] = make_isna_col(std::move(coli));
}
return Column(new FuncNary_ColumnImpl<int32_t>(
std::move(columns), op_rowcount, nrows, SType::INT32));
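
The hunk above first converts every input column to its isna column and then hands the set to `op_rowcount`. The body of `op_rowcount` is not shown in this diff, so the following Python sketch assumes it counts the non-missing values in each row; the `isna` and `rowcount` helpers are simplified stand-ins that operate on plain lists.

```python
def isna(col):
    """Stand-in for make_isna_col: True where the element is missing."""
    return [value is None for value in col]


def rowcount(columns):
    """Count non-missing values per row (assumed behaviour of op_rowcount)."""
    if not columns:
        return []
    nrows = len(columns[0])
    # Mirrors the xassert in the C++ loop: all columns must have nrows rows.
    assert all(len(col) == nrows for col in columns)
    # Step 1: replace each column with its isna column.
    na_flags = [isna(col) for col in columns]
    # Step 2: for each row, count the columns whose flag is False.
    return [sum(not flags[i] for flags in na_flags) for i in range(nrows)]


counts = rowcount([[1, None, 3], ["a", "b", None], [None, None, None]])
```

Routing through `make_isna_col` rather than `unaryop(Op::ISNA, ...)` means rowcount and `dt.isna()` now share a single definition of what counts as missing.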
46 changes: 0 additions & 46 deletions src/core/expr/funary/floating.cc
@@ -20,7 +20,6 @@
// IN THE SOFTWARE.
//------------------------------------------------------------------------------
#include <cmath>
#include "column/isna.h"
#include "documentation.h"
#include "expr/funary/pyfn.h"
#include "expr/funary/umaker.h"
@@ -124,51 +123,6 @@ umaker_ptr resolve_op_sign(SType stype) {
}
}




//------------------------------------------------------------------------------
// Op::ISNA
//------------------------------------------------------------------------------

py::PKArgs args_isna(1, 0, 0, false, false, {"x"}, "isna", dt::doc_math_isna);


template <typename T>
class isna_umaker : public umaker {
public:
Column compute(Column&& col) const override {
return Column(new Isna_ColumnImpl<T>(std::move(col)));
}
};


umaker_ptr resolve_op_isna(SType stype) {
switch (stype) {
case SType::VOID: {
return umaker_ptr(new umaker_const(
Const_ColumnImpl::make_bool_column(1, true)));
}
case SType::BOOL:
case SType::INT8: return umaker_ptr(new isna_umaker<int8_t>());
case SType::INT16: return umaker_ptr(new isna_umaker<int16_t>());
case SType::DATE32:
case SType::INT32: return umaker_ptr(new isna_umaker<int32_t>());
case SType::TIME64:
case SType::INT64: return umaker_ptr(new isna_umaker<int64_t>());
case SType::FLOAT32: return umaker_ptr(new isna_umaker<float>());
case SType::FLOAT64: return umaker_ptr(new isna_umaker<double>());
case SType::STR32:
case SType::STR64: return umaker_ptr(new isna_umaker<CString>());
default:
throw TypeError() << "Function `isna` cannot be applied to a "
"column of type `" << stype << "`";
}
}




//------------------------------------------------------------------------------
// Op::ISINF
//------------------------------------------------------------------------------