Commit

Merge pull request #31 from abstractqqq/docs
Docs
abstractqqq authored Dec 17, 2023
2 parents df7ff06 + ef54be6 commit d54a0f2
Showing 70 changed files with 23,579 additions and 121 deletions.
2 changes: 0 additions & 2 deletions Cargo.toml
@@ -17,8 +17,6 @@ num = "0.4.1"
faer = {version = "0.15", features = ["ndarray", "nightly"]}
ndarray = "0.15.6" # see if we can get rid of this
hashbrown = {version = "0.14.2", features=["nightly"]}
# Try realfft instead, which is a wrapper around rustfft but specializes for reals and seems to have better perf
# rustfft = "6.1.0"
itertools = "0.12.0"
aho-corasick = "1.1"
rand = {version = "0.8.5"} # Simd support feature seems to be broken atm
3 changes: 3 additions & 0 deletions docs/docs/complex_ext.md
@@ -0,0 +1,3 @@
## Extension for Complex Numbers

::: polars_ds.complex_ext
44 changes: 44 additions & 0 deletions docs/docs/index.md
@@ -0,0 +1,44 @@
# Polars-ds

A Polars plugin that aims to simplify common numerical and string data analysis procedures. The most basic data science, statistics, and NLP tasks can be done natively inside a dataframe, without leaving the dataframe world. For simple data pipelines, this also means you do not need to install NumPy, SciPy, or scikit-learn, which saves a lot of space and helps in resource-constrained environments.

Its goal is NOT to replace SciPy or NumPy. Rather, it tries to reduce dependencies for simple analysis and to cut down on Python-side code and UDFs, which are often performance bottlenecks.

## Getting Started
```bash
pip install polars_ds
```

and

```python
import polars_ds
```
when you are using the namespaces provided by the package.

## Examples

Generate random numbers, then run a t-test and a normality test inside a dataframe:
```python
import polars as pl
import polars_ds  # importing the package makes the stats_ext namespace available

# `df` is assumed to be an existing pl.DataFrame with a column "a"
df.with_columns(
    pl.col("a").stats_ext.sample_normal(mean = 0.5, std = 1.).alias("test1")
    , pl.col("a").stats_ext.sample_normal(mean = 0.5, std = 2.).alias("test2")
).select(
    pl.col("test1").stats_ext.ttest_ind(pl.col("test2"), equal_var = False).alias("t-test")
    , pl.col("test1").stats_ext.normal_test().alias("normality_test")
).select(
    # Both tests return a struct; unpack the fields into separate columns
    pl.col("t-test").struct.field("statistic").alias("t-tests: statistics")
    , pl.col("t-test").struct.field("pvalue").alias("t-tests: pvalue")
    , pl.col("normality_test").struct.field("statistic").alias("normality_test: statistics")
    , pl.col("normality_test").struct.field("pvalue").alias("normality_test: pvalue")
)
```
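
For readers unfamiliar with `equal_var = False`: in the usual statistical sense this selects Welch's t-test, which does not pool the two sample variances. The following is a minimal standard-library sketch of that statistic, not the plugin's implementation. The function names `welch_t_statistic` and `welch_degrees_of_freedom` are hypothetical, and it is an assumption that `ttest_ind` follows the same textbook definition.

```python
import math
import statistics


def welch_t_statistic(x, y):
    """Welch's t statistic for two independent samples (hypothetical helper)."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.fmean(x), statistics.fmean(y)
    # statistics.variance is the sample variance (divides by n - 1)
    v1, v2 = statistics.variance(x), statistics.variance(y)
    # Standard error without pooling the variances
    standard_error = math.sqrt(v1 / n1 + v2 / n2)
    return (m1 - m2) / standard_error


def welch_degrees_of_freedom(x, y):
    """Welch-Satterthwaite approximation of the degrees of freedom."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1
    b = statistics.variance(y) / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


sample_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_2 = [2.0, 4.0, 6.0, 8.0, 10.0]
print(welch_t_statistic(sample_1, sample_2))
print(welch_degrees_of_freedom(sample_1, sample_2))
```

The p-value is then obtained from a Student's t distribution with that number of degrees of freedom, which is omitted here because it needs a t-distribution CDF that the standard library does not provide.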

Blazingly fast string similarity comparisons. (Thanks to [RapidFuzz](https://docs.rs/rapidfuzz/latest/rapidfuzz/))
```python
df2.select(
    pl.col("word").str_ext.levenshtein("world", return_sim = True)
).head()
```
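
To make the meaning of a Levenshtein similarity concrete, here is a small pure-Python sketch of the edit distance and one common way to normalize it into a score between 0 and 1. It is illustrative only: the helper names are hypothetical, and it is an assumption that `return_sim = True` uses this particular normalization (one minus the distance divided by the longer string's length).

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(a, b) / longest


print(levenshtein_similarity("word", "world"))
```

Under this definition, "word" and "world" differ by one insertion out of five characters, giving a similarity of 0.8. A row-by-row Python loop like this is exactly the kind of UDF the plugin is meant to replace with a native expression.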

And a lot more!
3 changes: 3 additions & 0 deletions docs/docs/num_ext.md
@@ -0,0 +1,3 @@
## Extension for General Numerical Features/Metrics/Quantities

::: polars_ds.num_ext
3 changes: 3 additions & 0 deletions docs/docs/stats_ext.md
@@ -0,0 +1,3 @@
## Extension for Statistical Tests and Samples

::: polars_ds.stats_ext
3 changes: 3 additions & 0 deletions docs/docs/str_ext.md
@@ -0,0 +1,3 @@
## Extension for String Manipulation and Metrics

::: polars_ds.str_ext
20 changes: 20 additions & 0 deletions docs/mkdocs.yml
@@ -0,0 +1,20 @@
site_name: Polars-ds Docs

nav:
  - Home: index.md
  - Complex Extension: complex_ext.md
  - Numerical Extension: num_ext.md
  - Stats Extension: stats_ext.md
  - String Extension: str_ext.md

theme:
  name: material

plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          paths: [../python]
          selection:
            docstring_style: numpy
4 changes: 4 additions & 0 deletions docs/requirements-docs.txt
@@ -0,0 +1,4 @@
mkdocs
mkdocstrings[python]
mkdocs-material
pytkdocs[numpy-style]