Releases: Zeroto521/my-data-toolkit

v0.0.20

30 Dec 13:11
6e407f7

Highlights of this release

Extensive support for H3 (Hexagonal hierarchical geospatial indexing system) via .to_h3 and .H3.*.

>>> import dtoolkit.geoaccessor
>>> import pandas as pd
>>> df = pd.DataFrame({"x": [122, 100], "y": [55, 1]}).from_xy('x', 'y', crs=4326)
>>> df
     x   y                    geometry
0  122  55  POINT (122.00000 55.00000)
1  100   1   POINT (100.00000 1.00000)

# GeoDataFrame -> h3 cell

>>> df_with_h3 = df.to_h3(8)
>>> df_with_h3
                      x   y                    geometry
612845052823076863  122  55  POINT (122.00000 55.00000)
614269156845420543  100   1   POINT (100.00000 1.00000)

# Calculate h3 cell area

>>> df_with_h3.h3.area
612845052823076863    710781.770906
614269156845420543    852134.191671
dtype: float64

# h3 cell -> GeoDataFrame

>>> df_parent_cell = df_with_h3.h3.to_parent()
>>> df_parent_cell
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00000 55.00000)
609765557230632959  100   1   POINT (100.00000 1.00000)
>>> df_parent_cell.h3.to_points()
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00991 55.00606)
609765557230632959  100   1   POINT (100.00504 0.99852)

New features and improvements

  • #739, #800, #817, #825: New geoaccessor dtoolkit.geoaccessor.geoseries.to_h3 to convert geometry to h3 index.
  • #778: Speed up dtoolkit.accessor.series.textdistance_matrix.
  • #779, #811, #819: New geoaccessor dtoolkit.geoaccessor.dataframe.H3 to handle h3's geohash.
  • #784: New accessor dtoolkit.accessor.series.to_zh.
  • #794, #797: New geoaccessor for GeoDataFrame dtoolkit.geoaccessor.geodataframe.xy.
  • #801: New accessor for Series dtoolkit.accessor.series.invert_or_not.
  • #803: New geoaccessor dtoolkit.geoaccessor.geoseries.select_geom_type.
  • #804: New geoaccessor dtoolkit.geoaccessor.geoseries.radius.
  • #809: New accessor for Index dtoolkit.accessor.index.len.

Small bug-fix

  • #780: Fix dtoolkit.geoaccessor.dataframe.to_geoframe when geometry is a GeoSeries.
  • #816: Fix missing CRS in the result of dtoolkit.geoaccessor.dataframe.to_geoframe.
  • #822: dtoolkit.geoaccessor.dataframe.to_geoframe supports replacing old geometry.
  • #824: Fix dtoolkit.accessor.dataframe.repeat returning a DataFrame when the input is a GeoDataFrame.

API changes

  • #807: dtoolkit.geoaccessor.geodataframe.get_coordinates -> dtoolkit.geoaccessor.geodataframe.coordinates.
  • #814: Drop keyword argument drop.

v0.0.20rc1

28 Dec 13:11
46683a9
Pre-release

Highlights of this release

Extensive support for H3 (Hexagonal hierarchical geospatial indexing system) via .to_h3 and .H3.*.

>>> import dtoolkit.geoaccessor
>>> import pandas as pd
>>> df = pd.DataFrame({"x": [122, 100], "y": [55, 1]}).from_xy('x', 'y', crs=4326)
>>> df
     x   y                    geometry
0  122  55  POINT (122.00000 55.00000)
1  100   1   POINT (100.00000 1.00000)

# GeoDataFrame -> h3 cell

>>> df_with_h3 = df.to_h3(8)
>>> df_with_h3
                      x   y                    geometry
612845052823076863  122  55  POINT (122.00000 55.00000)
614269156845420543  100   1   POINT (100.00000 1.00000)

# Calculate h3 cell area

>>> df_with_h3.h3.area
612845052823076863    710781.770906
614269156845420543    852134.191671
dtype: float64

# h3 cell -> GeoDataFrame

>>> df_parent_cell = df_with_h3.h3.to_parent()
>>> df_parent_cell
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00000 55.00000)
609765557230632959  100   1   POINT (100.00000 1.00000)
>>> df_parent_cell.h3.to_points()
                      x   y                    geometry
608341453197803519  122  55  POINT (122.00991 55.00606)
609765557230632959  100   1   POINT (100.00504 0.99852)

New features and improvements

  • #739, #800, #817, #825: New geoaccessor dtoolkit.geoaccessor.geoseries.to_h3 to convert geometry to h3 index.
  • #778: Speed up dtoolkit.accessor.series.textdistance_matrix.
  • #779, #811, #819: New geoaccessor dtoolkit.geoaccessor.dataframe.H3 to handle h3's geohash.
  • #784: New accessor dtoolkit.accessor.series.to_zh.
  • #794, #797: New geoaccessor for GeoDataFrame dtoolkit.geoaccessor.geodataframe.xy.
  • #801: New accessor for Series dtoolkit.accessor.series.invert_or_not.
  • #803: New geoaccessor dtoolkit.geoaccessor.geoseries.select_geom_type.
  • #804: New geoaccessor dtoolkit.geoaccessor.geoseries.radius.
  • #809: New accessor for Index dtoolkit.accessor.index.len.

Small bug-fix

  • #780: Fix dtoolkit.geoaccessor.dataframe.to_geoframe when geometry is a GeoSeries.
  • #816: Fix missing CRS in the result of dtoolkit.geoaccessor.dataframe.to_geoframe.
  • #822: dtoolkit.geoaccessor.dataframe.to_geoframe supports replacing old geometry.
  • #824: Fix dtoolkit.accessor.dataframe.repeat returning a DataFrame when the input is a GeoDataFrame.

API changes

  • #807: dtoolkit.geoaccessor.geodataframe.get_coordinates -> dtoolkit.geoaccessor.geodataframe.coordinates.
  • #814: Drop keyword argument drop.

v0.0.19

11 Dec 02:38
3eb6824

Highlights of this release:

  • #574, #752, #757, #758: Support Python 3.11.
  • #772: Simplify importing: import dtoolkit is now equivalent to import dtoolkit.accessor.

New features and improvements:

  • #724: New accessor for Series to calculate text distance dtoolkit.accessor.series.textdistance.
  • #745: dtoolkit.geoaccessor.geodataframe.duplicated_geometry's predicate supports comparing values directly.
  • #748: dtoolkit.geoaccessor.geoseries.xy supports returning a DataFrame.
  • #760: dtoolkit.accessor.dataframe.repeat supports using a column as the input.
  • #768: New accessor dtoolkit.accessor.dataframe.change_axis_type.
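
To make "text distance" (#724) concrete, here is a minimal sketch of one common metric, the Levenshtein edit distance. The release notes do not say which metric dtoolkit.accessor.series.textdistance uses, so the function name levenshtein and the choice of metric are assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or substitutions
    needed to turn ``a`` into ``b`` (hypothetical helper, not dtoolkit API)."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] is the distance between the processed prefix of ``a``
    # and the first j characters of ``b``.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A distance of 0 means the two strings are identical; larger values mean more edits are needed.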

Small bug-fix:

  • #576: Fix DataFrame.append's FutureWarning.
  • #765: Fix sklearn pipeline visualization failing to print OneHotEncoder.
  • #776: Fix the tarball file missing from the GitHub release page since v0.0.17.

API changes:

  • #762: Drop the columns argument of error_report.

v0.0.19rc3

10 Dec 09:56
cc1113c
Pre-release

Highlights of this release:

  • #574, #752, #757, #758: Support Python 3.11.
  • #772: Simplify importing: import dtoolkit is now equivalent to import dtoolkit.accessor.

New features and improvements:

  • #724: New accessor for Series to calculate text distance dtoolkit.accessor.series.textdistance.
  • #745: dtoolkit.geoaccessor.geodataframe.duplicated_geometry's predicate supports comparing values directly.
  • #748: dtoolkit.geoaccessor.geoseries.xy supports returning a DataFrame.
  • #760: dtoolkit.accessor.dataframe.repeat supports using a column as the input.
  • #768: New accessor dtoolkit.accessor.dataframe.change_axis_type.

Small bug-fix:

  • #576: Fix DataFrame.append's FutureWarning.
  • #765: Fix sklearn pipeline visualization failing to print OneHotEncoder.
  • #776: Fix the tarball file missing from the GitHub release page since v0.0.17.

API changes:

  • #762: Drop the columns argument of error_report.

v0.0.18

14 Oct 05:14
d6a90cd

Highlights of this release

Pandas accessors

  • #715: New accessor dtoolkit.accessor.series.equal to compare pandas-object with other.

GeoPandas accessors

  • #699, #701, #702, #704, #705, #706, #707, #735: New geoaccessor to generate great circle distances, dtoolkit.geoaccessor.geoseries.geodistance and dtoolkit.geoaccessor.geoseries.geodistance_matrix.
  • #696: New geoaccessor to handle China webmap offset problem, dtoolkit.geoaccessor.geoseries.cncrs_offset.
  • #691, #703: New geoaccessor to filter geometry via spatial relationship, dtoolkit.geoaccessor.geoseries.filter_geometry.
  • #679, #680, #682: New geoaccessors to check whether a Polygon has holes and to count them, dtoolkit.geoaccessor.geoseries.has_hole and dtoolkit.geoaccessor.geoseries.hole_counts.
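
As background for the great circle distance accessors above, the following is a minimal sketch of the haversine formula, one common way to compute such distances on a sphere. The release notes do not state which formula geodistance and geodistance_matrix use; the names haversine, distance_matrix, and EARTH_RADIUS_M are hypothetical and the Earth radius is an assumed constant.

```python
import math

# Assumption: mean Earth radius in metres; real libraries may use an ellipsoid.
EARTH_RADIUS_M = 6_371_008.8


def haversine(lon1, lat1, lon2, lat2, radius=EARTH_RADIUS_M):
    """Great-circle distance in metres between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point rounding near antipodes.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


def distance_matrix(points):
    """Pairwise distances for a list of (lon, lat) tuples."""
    return [[haversine(*p, *q) for q in points] for p in points]


points = [(122, 55), (100, 1)]
matrix = distance_matrix(points)
```

The matrix is symmetric with zeros on the diagonal, because the distance from a point to itself is zero and the formula does not depend on argument order.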

Pipeline

  • #688: New accessor dtoolkit.accessor.dataframe.weighted_mean for DataFrame.
  • #685: Let Pipeline's fit_predict and predict support outputting DataFrame.
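
For readers unfamiliar with the term, a weighted mean divides the sum of value times weight by the sum of the weights. The sketch below is plain Python and only illustrates the arithmetic; the function name and signature are hypothetical and are not the signature of dtoolkit.accessor.dataframe.weighted_mean, which the release notes do not describe.

```python
def weighted_mean(values, weights):
    """Return sum(v * w) / sum(w) for two equal-length sequences."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# (1*1 + 2*1 + 3*2) / (1 + 1 + 2) = 9 / 4
print(weighted_mean([1, 2, 3], [1, 1, 2]))  # 2.25
```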

API changes

  • #694, #695: pygeos isn't an optional dependency anymore.
  • #665: Drop dtoolkit.geoaccessor.geoseries.utm_crs.

Small bug-fix

  • #714, #716: Fix dtoolkit.accessor.dataframe.decompose failing to collapse dict.
  • #692: Reset non-monotonic index.

v0.0.18rc3

10 Oct 12:06
10579f7
Pre-release

New features and improvements

  • #721: New accessor for Series to convert datetime type, dtoolkit.accessor.series.to_datetime.
  • #715: New accessor dtoolkit.accessor.series.equal to compare pandas-object with other.
  • #712: Support using a DataFrame column as the distance for dtoolkit.geoaccessor.geodataframe.geobuffer.
  • #711, #713: New geoaccessor for GeoSeries to return tuple of coordinates (x, y), dtoolkit.geoaccessor.geoseries.xy.
  • #701, #704, #705, #706: New geoaccessor to generate great circle distances matrix, dtoolkit.geoaccessor.geoseries.geodistance_matrix.
  • #699, #702, #707: New geoaccessor to calculate two coordinates distance on earth, dtoolkit.geoaccessor.geoseries.geodistance.
  • #696: New geoaccessor to handle China webmap offset problem, dtoolkit.geoaccessor.geoseries.cncrs_offset.
  • #691, #703: New geoaccessor to filter geometry via spatial relationship, dtoolkit.geoaccessor.geoseries.filter_geometry.
  • #688: New accessor dtoolkit.accessor.dataframe.weighted_mean for DataFrame.
  • #685: Let Pipeline's fit_predict and predict support outputting DataFrame.
  • #680, #682: New geoaccessor to check whether a Polygon has holes, dtoolkit.geoaccessor.geoseries.has_hole.
  • #679: New geoaccessor to count the number of holes in a Polygon, dtoolkit.geoaccessor.geoseries.hole_counts.
  • #668: Add a new option dropna for dtoolkit.accessor.series.values_to_dict to handle NaN values.
  • #667: New accessor dtoolkit.accessor.series.dropna_index.

API changes

  • #694, #695: pygeos isn't an optional dependency anymore.
  • #665: Drop dtoolkit.geoaccessor.geoseries.utm_crs.

Small bug-fix

  • #714, #716: Fix dtoolkit.accessor.dataframe.decompose failing to collapse dict.
  • #692: Reset non-monotonic index.

v0.0.17

15 Aug 02:37
fa6ec3f

Highlights of this release

  • Speed up geoaccessor geobuffer via UTM CRS (#638).
  • Require Python 3.8 or later (#554).
  • eval and query work for Series now (#492, #551).
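
UTM coordinate systems are metre-based, which is why buffering in a UTM CRS avoids degree-to-metre guesswork. As an illustration only, the sketch below picks the EPSG code of the WGS 84 / UTM zone for a longitude/latitude pair using the standard 6-degree zone rule. The helper name utm_epsg is hypothetical, the special zones around Norway and Svalbard are ignored, and the release notes do not say how geobuffer selects its CRS.

```python
import math


def utm_epsg(lon: float, lat: float) -> int:
    """EPSG code of the WGS 84 / UTM zone containing (lon, lat) in degrees.

    Simplified sketch: ignores the Norway/Svalbard zone exceptions.
    """
    # Zones are 6 degrees wide and numbered 1..60 starting at 180 degrees west.
    zone = int(math.floor((lon + 180) / 6)) % 60 + 1
    # EPSG 326xx is the northern hemisphere, 327xx the southern hemisphere.
    base = 32600 if lat >= 0 else 32700
    return base + zone


print(utm_epsg(122, 55))  # 32651
print(utm_epsg(100, 1))   # 32647
```

The two sample points are the ones used in the v0.0.20 H3 example.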

New features and improvements

  • New geoaccessor to compute geographic area: geoarea (#640).
  • Syntactic sugar to parallelize multiple jobs: parallelize (#635, #641).
  • New geoaccessors to label / drop duplicate geometry: duplicated_geometry_groups, duplicated_geometry, and drop_duplicates_geometry (#631, #632).
  • New accessor for Series swap_index_values (#630).
  • New accessor to group by index: groupby_index (#625).
  • New geoaccessor for GeoDataFrame toposimplify (#624, #649, #651).
  • to_series also returns a Series from a DataFrame when given only value_column (#620).
  • New accessors for Series jenks_bin and jenks_breaks (#618, #629).
  • New accessor for Series filter_in (#614).
  • New geoaccessor for GeoDataFrame to_geoseries (#609).
  • New geoaccessor to remove the active geometry: drop_geometry (#599).
  • New geoaccessor for Series from_wkt (#596).
  • New geoaccessors to get coordinates from addresses (geocode) and addresses from coordinates (reverse_geocode) (#591, #594, #643, #636, #652).
  • New level option for Index accessor to_set (#586).
  • Speed up Series accessor to_set (#585).
  • New geoaccessor from_wkb (#584, #598).
  • New geoaccessor to_geoframe (#568, #642, #646).

Small bug-fix

  • Avoid GeoDataFrame constructor mutating the original (inputting) DataFrame (#644).
  • Avoid fillna_regression mutating the original dataframe (#622).
  • Compatible with sklearn 1.2's stricter class parameter checking (#602).
  • geobuffer uses the active geometry to generate buffers (#583).
  • Hook accessor method's attrs into both class and instance (#580).

API changes

  • Add deprecation warning for utm_crs (#637, #645).
  • Remove warning message and drop inplace option (#555).
  • Use positional-only arguments (/) to limit name (#435).
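
For reference, the positional-only marker (/) is standard Python 3.8+ syntax: parameters before the slash cannot be passed by keyword. The sketch below uses hypothetical functions scale and tag, not dtoolkit functions, to show the effect, including how it frees parameter names for use in **kwargs.

```python
def scale(value, factor, /, *, precision=2):
    """``value`` and ``factor`` are positional-only; ``precision`` is keyword-only."""
    return round(value * factor, precision)


def tag(record, /, **attrs):
    """Because ``record`` is positional-only, callers may reuse that name in attrs."""
    return record, attrs


print(scale(2, 1.5))         # 3.0
print(tag(1, record="x"))    # (1, {'record': 'x'})

try:
    scale(value=2, factor=1.5)  # keywords are rejected for positional-only parameters
except TypeError as error:
    print("TypeError:", error)
```

A side benefit is that positional-only parameter names are not part of the public interface, so they can be renamed later without breaking callers.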

v0.0.17rc1

13 Aug 12:35
5f1e029
Pre-release

Highlights of this release

  • Speed up geoaccessor geobuffer via UTM CRS (#638).
  • Require Python 3.8 or later (#554).
  • eval and query work for Series now (#492, #551).

New features and improvements

  • New geoaccessor to compute geographic area: geoarea (#640).
  • Syntactic sugar to parallelize multiple jobs: parallelize (#635, #641).
  • New geoaccessors to label / drop duplicate geometry: duplicated_geometry_groups, duplicated_geometry, and drop_duplicates_geometry (#631, #632).
  • New accessor for Series swap_index_values (#630).
  • New accessor to group by index: groupby_index (#625).
  • New geoaccessor for GeoDataFrame toposimplify (#624, #649, #651).
  • to_series also returns a Series from a DataFrame when given only value_column (#620).
  • New accessors for Series jenks_bin and jenks_breaks (#618, #629).
  • New accessor for Series filter_in (#614).
  • New geoaccessor for GeoDataFrame to_geoseries (#609).
  • New geoaccessor to remove the active geometry: drop_geometry (#599).
  • New geoaccessor for Series from_wkt (#596).
  • New geoaccessors to get coordinates from addresses (geocode) and addresses from coordinates (reverse_geocode) (#591, #594, #643, #636, #652).
  • New level option for Index accessor to_set (#586).
  • Speed up Series accessor to_set (#585).
  • New geoaccessor from_wkb (#584, #598).
  • New geoaccessor to_geoframe (#568, #642, #646).

Small bug-fix

  • Avoid GeoDataFrame constructor mutating the original (inputting) DataFrame (#644).
  • Avoid fillna_regression mutating the original dataframe (#622).
  • Compatible with sklearn 1.2's stricter class parameter checking (#602).
  • geobuffer uses the active geometry to generate buffers (#583).
  • Hook accessor method's attrs into both class and instance (#580).

API changes

  • Add deprecation warning for utm_crs (#637, #645).
  • Remove warning message and drop inplace option (#555).
  • Use positional-only arguments (/) to limit name (#435).

v0.0.16

30 May 00:15
289c080

New features and improvements

  • New accessor dtoolkit.accessor.dataframe.fillna_regression (#556, #567).
  • New unique option for dtoolkit.accessor.dataframe.values_to_dict (#548).
  • Speed up dtoolkit.util._exception.find_stack_level (#546).
  • dtoolkit.accessor.dataframe.filter_in's how only works on condition DataFrame's columns (#545).
  • dtoolkit.accessor.series.to_set is faster, especially on large data (#542, #543).
  • dtoolkit.accessor.dataframe.drop_inf's inf option supports + and - (#539).
  • New accessor dtoolkit.accessor.dataframe.boolean for DataFrame (#537, #538).
  • New complement option for dtoolkit.accessor.dataframe.filter_in (#533).
  • New Index method dtoolkit.accessor.index.to_set (#529).
  • New method dtoolkit.accessor.dataframe.decompose for DataFrame (#488, #573).
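
To illustrate what selecting positive or negative infinity separately can look like (#539), here is a plain-Python sketch over a list of floats. The option values "all", "+", and "-" and the function names are assumptions for illustration; the release notes only say that the inf option of dtoolkit.accessor.dataframe.drop_inf supports + and -.

```python
import math


def is_inf(x: float, inf: str = "all") -> bool:
    """Hypothetical helper: test for infinity of the requested sign."""
    if inf == "all":
        return math.isinf(x)
    if inf == "+":
        return x == math.inf
    if inf == "-":
        return x == -math.inf
    raise ValueError(f"invalid inf option: {inf!r}")


def drop_inf(values, inf="all"):
    """Return the values with the selected kind of infinity removed."""
    return [v for v in values if not is_inf(v, inf)]


data = [1.0, math.inf, -math.inf, 2.0]
print(drop_inf(data))           # [1.0, 2.0]
print(drop_inf(data, inf="+"))  # [1.0, -inf, 2.0]
print(drop_inf(data, inf="-"))  # [1.0, inf, 2.0]
```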

API changes

  • Add deprecation warning for dtoolkit.transformer.pipeline (#558).
  • Split dtoolkit.transformer scripts into sub-packages (#557).
  • Drop inplace for dtoolkit.accessor.dataframe.drop_inf (#540).
  • Drop generic package (#535).
  • Drop inplace option of dtoolkit.accessor.dataframe.filter_in (#518, #531, #559).

v0.0.16rc2

27 May 02:35
6323df1
Pre-release

New features and improvements

  • New accessor dtoolkit.accessor.dataframe.fillna_regression (#556, #567).
  • New unique option for dtoolkit.accessor.dataframe.values_to_dict (#548).
  • Speed up dtoolkit.util._exception.find_stack_level (#546).
  • dtoolkit.accessor.dataframe.filter_in's how only works on condition DataFrame's columns (#545).
  • dtoolkit.accessor.series.to_set is faster, especially on large data (#542, #543).
  • dtoolkit.accessor.dataframe.drop_inf's inf option supports + and - (#539).
  • New accessor dtoolkit.accessor.dataframe.boolean for DataFrame (#537, #538).
  • New complement option for dtoolkit.accessor.dataframe.filter_in (#533).
  • New Index method dtoolkit.accessor.index.to_set (#529).
  • New method dtoolkit.accessor.dataframe.decompose for DataFrame (#488).

API changes

  • Add deprecation warning for dtoolkit.transformer.pipeline (#558).
  • Split dtoolkit.transformer scripts into sub-packages (#557).
  • Drop inplace for dtoolkit.accessor.dataframe.drop_inf (#540).
  • Drop generic package (#535).
  • Drop inplace option of dtoolkit.accessor.dataframe.filter_in (#518, #531, #559).