added docs
abstractqqq committed Dec 18, 2023
1 parent d54a0f2 commit e73c8d0
Showing 61 changed files with 23,465 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/complex_ext.md
@@ -0,0 +1,3 @@
## Extension for Complex Numbers

::: polars_ds.complex_ext
44 changes: 44 additions & 0 deletions docs/index.md
@@ -0,0 +1,44 @@
# Polars-ds

A Polars plugin that aims to simplify common numerical and string data analysis procedures. The most basic data science, statistics, and NLP tasks can be done natively inside a dataframe, without leaving the dataframe world. For simple data pipelines, this also means you do not need to install NumPy, SciPy, or scikit-learn, which saves a lot of space and helps when resources are constrained.

Its goal is NOT to replace SciPy or NumPy. Rather, it tries to reduce dependencies for simple analysis and to reduce Python-side code and UDFs, which are often performance bottlenecks.

## Getting Started
```bash
pip install polars_ds
```

Then import the package wherever you use the namespaces it provides:

```python
import polars_ds
```

## Examples

Generate random numbers, then run a t-test and a normality test inside a dataframe:
```python
import polars as pl
import polars_ds  # noqa: F401  (needed for the stats_ext namespace)

# `df` is assumed to be an existing DataFrame with a numeric column "a".
df.with_columns(
    # Draw two normal samples with the same mean and different spreads.
    pl.col("a").stats_ext.sample_normal(mean=0.5, std=1.0).alias("test1"),
    pl.col("a").stats_ext.sample_normal(mean=0.5, std=2.0).alias("test2"),
).select(
    # Each test returns a struct with "statistic" and "pvalue" fields.
    pl.col("test1").stats_ext.ttest_ind(pl.col("test2"), equal_var=False).alias("t-test"),
    pl.col("test1").stats_ext.normal_test().alias("normality_test"),
).select(
    # Unpack the struct fields into separate columns.
    pl.col("t-test").struct.field("statistic").alias("t-tests: statistics"),
    pl.col("t-test").struct.field("pvalue").alias("t-tests: pvalue"),
    pl.col("normality_test").struct.field("statistic").alias("normality_test: statistics"),
    pl.col("normality_test").struct.field("pvalue").alias("normality_test: pvalue"),
)
```
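
For readers unfamiliar with the `equal_var = False` option, the following is a minimal pure-Python sketch of the statistic that an unequal-variance (Welch) t-test computes. It is an illustration only, not the plugin's implementation; the function name `welch_t_statistic` is hypothetical, and the p-value computation is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(x: list[float], y: list[float]) -> float:
    """Welch's t statistic for two independent samples (hypothetical helper)."""
    # Standard error built from each sample's own variance (ddof = 1),
    # rather than from a pooled variance.
    standard_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error


sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat = welch_t_statistic(sample_a, sample_b)
```

A negative value here means the first sample's mean is below the second's. Doing this per group on the Python side is the kind of code the plugin tries to keep inside the dataframe.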

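Blazingly fast string similarity comparisons, thanks to [RapidFuzz](https://docs.rs/rapidfuzz/latest/rapidfuzz/):
```python
import polars as pl
import polars_ds  # noqa: F401  (needed for the str_ext namespace)

# `df2` is assumed to be an existing DataFrame with a string column "word".
df2.select(
    pl.col("word").str_ext.levenshtein("world", return_sim=True)
).head()
```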
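
As a rough picture of what a Levenshtein comparison measures, here is a minimal pure-Python sketch. It is not how the plugin or RapidFuzz computes the result. The `similarity` helper assumes the common normalization `1 - distance / max(len(a), len(b))`, which may differ from what `return_sim = True` returns; both function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    # Keep the shorter string in the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1] (assumed convention, see lead-in)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


distance = levenshtein("word", "world")
score = similarity("word", "world")
```

The row-by-row loop above runs in Python for every pair of strings, which is the per-row cost that a native expression avoids.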

And a lot more!
3 changes: 3 additions & 0 deletions docs/num_ext.md
@@ -0,0 +1,3 @@
## Extension for General Numerical Features/Metrics/Quantities

::: polars_ds.num_ext
3 changes: 3 additions & 0 deletions docs/stats_ext.md
@@ -0,0 +1,3 @@
## Extension for Statistical Tests and Samples

::: polars_ds.stats_ext
3 changes: 3 additions & 0 deletions docs/str_ext.md
@@ -0,0 +1,3 @@
## Extension for String Manipulation and Metrics

::: polars_ds.str_ext
20 changes: 20 additions & 0 deletions mkdocs.yml
@@ -0,0 +1,20 @@
site_name: Polars-ds Docs

nav:
  - Home: index.md
  - Complex Extension: complex_ext.md
  - Numerical Extension: num_ext.md
  - Stats Extension: stats_ext.md
  - String Extension: str_ext.md

theme:
  name: material

plugins:
  - search
  - mkdocstrings:
      handlers:
        python:
          paths: [../python]
          selection:
            docstring_style: numpy
4 changes: 4 additions & 0 deletions requirements-docs.txt
@@ -0,0 +1,4 @@
mkdocs
mkdocstrings[python]
mkdocs-material
pytkdocs[numpy-style]