Merge pull request #7 from TidierOrg/updated-docs-without-tilde
Removed tilde and updated dependencies
Karandeep Singh authored Aug 7, 2023
2 parents 4a29c1f + 52a30b3 commit 0405447
Showing 10 changed files with 68 additions and 72 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/Documenter.yml
@@ -21,7 +21,7 @@ jobs:
- uses: julia-actions/setup-julia@v1
- uses: julia-actions/cache@v1
with:
cache-registries: "true"
cache-registries: "false"
- name: Install documentation dependencies
run: julia --project=docs -e 'using Pkg; pkg"dev ."; Pkg.instantiate()'
- name: Build and deploy
7 changes: 7 additions & 0 deletions NEWS.md
@@ -0,0 +1,7 @@
# TidierCats.jl updates

## v0.1.1 - 2023-08-06
- Added the `TidierCats.jl` functions to the `TidierData.jl` list of `not_vectorized[]` functions, which means that the user does *not* need to explicitly prefix them with a `~` when used inside of a `@mutate()` within `TidierData.jl`. Thus, all the `~` prefixes have been removed from the examples.

## v0.1.0 - Initial commit
- Released to Julia general registry
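
The v0.1.1 entry above can be illustrated with a minimal sketch. It assumes TidierData >= 0.9.2 (the bound set in `docs/Project.toml` in this commit) and TidierCats >= 0.1.1 are installed; the small data frame is made up for illustration.

```julia
using TidierData   # assumed >= 0.9.2; provides DataFrame, @chain, and @mutate
using TidierCats   # assumed >= 0.1.1; re-exports CategoricalArrays

# Illustrative data, not taken from the package documentation
df = DataFrame(CatVar = categorical(["Low", "High", "Low", "Zilch"],
                                    levels = ["High", "Low", "Zilch"]))

# Before v0.1.1 the call needed the prefix: @mutate(CatVar = ~cat_rev(CatVar))
# With the functions listed in `not_vectorized[]`, the plain call is enough.
reversed = @chain df begin
    @mutate(CatVar = cat_rev(CatVar))
end

print(levels(df.CatVar))        # original level order
print(levels(reversed.CatVar))  # level order after cat_rev()
```

The same change applies to every function listed in the README; only the `~` is dropped, and the arguments stay as they were.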
3 changes: 1 addition & 2 deletions Project.toml
@@ -1,14 +1,13 @@
name = "TidierCats"
uuid = "79ddc9fe-4dbf-4a56-a832-df41fb326d23"
authors = ["Daniel Rizk"]
version = "0.1.0"
version = "0.1.1"

[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
TidierData = "fe2206b3-d496-4ee9-a338-6a095c4ece80"

[compat]
CategoricalArrays = "0.10"
22 changes: 10 additions & 12 deletions README.md
@@ -13,7 +13,6 @@

`TidierCats.jl` has one main goal: to bring forcats's straightforward syntax and ease of use to Julia users working with categorical variables. While this package was developed to work seamlessly with `Tidier.jl` functions and macros, it can also work independently as a standalone package. This package is powered by CategoricalArrays.jl.


## What functions does TidierCats.jl support?

- `cat_rev()`
@@ -25,7 +24,6 @@
- `cat_lump_prop()`
- `as_categorical()`


## Installation

For the development version:
@@ -59,7 +57,7 @@ This function changes the order of levels in a categorical variable. It accepts

```julia
custom_order = @chain df begin
@mutate(CatVar = ~cat_relevel(CatVar, ["Zilch", "Medium", "High", "Low"]))
@mutate(CatVar = cat_relevel(CatVar, ["Zilch", "Medium", "High", "Low"]))
end

print(levels(df[!,:CatVar]))
@@ -76,7 +74,7 @@ This function reverses the order of levels in a categorical variable. It only re

```julia
reversed_order = @chain df begin
@mutate(CatVar = ~cat_rev(CatVar))
@mutate(CatVar = cat_rev(CatVar))
end

print(levels(df[!,:CatVar]))
@@ -109,7 +107,7 @@ end

```julia
orderedbyfrequency = @chain df begin
@mutate(CatVar = ~cat_infreq(CatVar))
@mutate(CatVar = cat_infreq(CatVar))
end

print(levels(df[!,:CatVar]))
@@ -126,7 +124,7 @@ This function lumps the least frequent levels into a new "Other" level. It accep

```julia
lumped_cats = @chain df begin
@mutate(CatVar = ~cat_lump(CatVar,2))
@mutate(CatVar = cat_lump(CatVar,2))
end

print(levels(df[!,:CatVar]))
@@ -149,11 +147,11 @@ df3 = DataFrame(
)

df4 = @chain df3 begin
@mutate(cat_var= ~cat_reorder(cat_var, order_var, "median" ))
@mutate(cat_var= cat_reorder(cat_var, order_var, "median" ))
end

@chain df3 begin
@mutate(catty = ~as_categorical(cat_var))
@mutate(catty = as_categorical(cat_var))
@group_by(cat_var)
@summarise(median = median(order_var))
end
@@ -179,7 +177,7 @@ This function collapses levels in a categorical variable according to a specifie

```julia
df5 = @chain df begin
@mutate(CatVar = ~cat_collapse(CatVar, Dict("Low" => "bad", "Zilch" => "bad")))
@mutate(CatVar = cat_collapse(CatVar, Dict("Low" => "bad", "Zilch" => "bad")))
end

@chain df begin
@@ -215,7 +213,7 @@ This function converts a standard Julia array to a categorical array. The only a
test = DataFrame( w = ["A", "B", "C", "D"])

@chain test begin
@mutate(w = ~as_categorical(w))
@mutate(w = as_categorical(w))
end
```

@@ -234,7 +232,7 @@ This function will lump any category with less than the minimum number of entries

```julia
lumpedbymin = @chain df begin
@mutate(CatVar = ~cat_lump_min(CatVar, 14))
@mutate(CatVar = cat_lump_min(CatVar, 14))
end

print(levels(df[!,:CatVar]))
@@ -252,7 +250,7 @@ This function will lump any category with less than the minimum proportion and rec

```julia
lumpedbyprop = @chain df begin
@mutate(CatVar = ~cat_lump_prop(CatVar, .25, "new name"))
@mutate(CatVar = cat_lump_prop(CatVar, .25, "new name"))
end

print(levels(df[!,:CatVar]))
8 changes: 4 additions & 4 deletions docs/Project.toml
@@ -1,10 +1,10 @@
[deps]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterMarkdown = "997ab1e6-3595-5248-9280-8efb232c3433"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
Tidier = "f0413319-3358-4bb0-8e7c-0c83523a93bd"
TidierData = "fe2206b3-d496-4ee9-a338-6a095c4ece80"
TidierCats = "79ddc9fe-4dbf-4a56-a832-df41fb326d23"

[compat]
TidierData = ">=0.9.2"
57 changes: 27 additions & 30 deletions docs/examples/UserGuide/supported_functions.jl
@@ -1,6 +1,5 @@
using Tidier
using TidierData
using TidierCats
using CategoricalArrays
using Random

Random.seed!(10)
@@ -9,7 +8,6 @@ categories = ["High", "Medium", "Low", "Zilch"]

random_indices = rand(1:length(categories), 57)


df = DataFrame(
ID = 1:57,
CatVar = categorical([categories[i] for i in random_indices], levels = categories)
@@ -20,27 +18,27 @@ first(df, 5)
# This function changes the order of levels in a categorical variable. It accepts two arguments - a column name and an array of levels in the desired order.

custom_order = @chain df begin
@mutate(CatVar = ~cat_relevel(CatVar, ["Zilch", "Medium", "High", "Low"]))
@mutate(CatVar = cat_relevel(CatVar, ["Zilch", "Medium", "High", "Low"]))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(custom_order[!,:CatVar]))
print(levels(custom_order.CatVar))


# ## `cat_rev()`
# This function reverses the order of levels in a categorical variable. It only requires one argument - the column name whose levels are to be reversed
reversed_order = @chain df begin
@mutate(CatVar = ~cat_rev(CatVar))
@mutate(CatVar = cat_rev(CatVar))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(reversed_order[!,:CatVar]))
print(levels(reversed_order.CatVar))

# ## `cat_infreq()`
# This function reorders levels of a categorical variable based on their frequencies, with most frequent level first. The single argument is column name
@@ -50,14 +48,14 @@ print(levels(reversed_order[!,:CatVar]))
end

orderedbyfrequency = @chain df begin
@mutate(CatVar = ~cat_infreq(CatVar))
@mutate(CatVar = cat_infreq(CatVar))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(orderedbyfrequency[!,:CatVar]))
print(levels(orderedbyfrequency.CatVar))


@chain df begin
@@ -68,14 +66,14 @@ end
# This function lumps the least frequent levels into a new "Other" level. It accepts two arguments - a column name and an integer specifying the number of levels to keep.

lumped_cats = @chain df begin
@mutate(CatVar = ~cat_lump(CatVar,2))
@mutate(CatVar = cat_lump(CatVar,2))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(lumped_cats[!,:CatVar]))
print(levels(lumped_cats.CatVar))


@chain lumped_cats begin
@@ -91,43 +89,42 @@ df3 = DataFrame(
)

df4 = @chain df3 begin
@mutate(cat_var= ~cat_reorder(cat_var, order_var, "median" ))
@mutate(cat_var= cat_reorder(cat_var, order_var, "median" ))
end


print(levels(df3[!,:cat_var]))
print(levels(df3.cat_var))

# and

print(levels(df4[!,:cat_var]))
print(levels(df4.cat_var))


@chain df3 begin
@mutate(catty = ~as_categorical(cat_var))
@mutate(catty = as_categorical(cat_var))
@group_by(catty)
#@summarise(median = median(order_var))
end

# ## `cat_collapse()`
# This function collapses levels in a categorical variable according to a specified mapping. It requires two arguments - a categorical column and a dictionary that maps original levels to new ones.

df5 = @chain df begin
@mutate(CatVar = ~cat_collapse(CatVar, Dict("Low" => "bad", "Zilch" => "bad")))
@mutate(CatVar = cat_collapse(CatVar, Dict("Low" => "bad", "Zilch" => "bad")))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(df5[!,:CatVar]))
print(levels(df5.CatVar))

# ## `as_categorical()`
# This function converts a standard Julia array to a categorical array. The only argument it needs is the column name to be converted.

test = DataFrame( w = ["A", "B", "C", "D"])

@chain test begin
@mutate(w = ~as_categorical(w))
@mutate(w = as_categorical(w))
end

# ## `cat_lump_min()`
@@ -137,28 +134,28 @@ end
@count(CatVar)
end
lumpedbymin = @chain df begin
@mutate(CatVar = ~cat_lump_min(CatVar, 14))
@mutate(CatVar = cat_lump_min(CatVar, 14))
end

print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(lumpedbymin[!,:CatVar]))
print(levels(lumpedbymin.CatVar))

# ## `cat_lump_prop()`
# This function will lump any category with less than the minimum proportion and recategorize it as "Other" by default, or as a category name chosen by the user.

lumpedbyprop = @chain df begin
@mutate(CatVar = ~cat_lump_prop(CatVar, .25, "wow"))
@mutate(CatVar = cat_lump_prop(CatVar, .25, "wow"))
end


print(levels(df[!,:CatVar]))
print(levels(df.CatVar))

# and

print(levels(lumpedbyprop[!,:CatVar]))
print(levels(lumpedbyprop.CatVar))


# ## `cat_na_value_to_level()`
6 changes: 3 additions & 3 deletions docs/make.jl
@@ -1,10 +1,10 @@
using Documenter, DocumenterMarkdown
using Tidier, TidierCats
using CategoricalArrays
using TidierCats

DocTestMeta = quote
using Tidier, TidierCats, DataFrames, Chain, Statistics
using TidierData, TidierCats, Statistics
end

DocMeta.setdocmeta!(TidierCats,
:DocTestSetup,
DocTestMeta;
25 changes: 13 additions & 12 deletions docs/src/index.md
@@ -1,18 +1,19 @@
<img src="assets/TidierCats\_logo.png" align="left" style="padding-right:10px"; width="150"></img>

## TidierCats
## TidierCats.jl

The goal of this package is to bring the convenience and simple usability of Forcats in R to Julia. This package was designed to work with Tidier.jl, but can also work independently.
The goal of this package is to bring the convenience and simple usability of `forcats` in R to Julia. This package was designed to work with `Tidier.jl` but can also work independently.

This package re-exports `CategoricalArrays.jl`.

This package includes:
In addition, this package includes:

- `cat_rev`
- `cat_relevel`
- `cat_infreq`
- `cat_lump`
- `cat_reorder`
- `cat_collapse`
- `cat_lump_min`
- `cat_lump_prop`
- `as_categorical`
- `cat_rev()`
- `cat_relevel()`
- `cat_infreq()`
- `cat_lump()`
- `cat_reorder()`
- `cat_collapse()`
- `cat_lump_min()`
- `cat_lump_prop()`
- `as_categorical()`
3 changes: 2 additions & 1 deletion docs/src/reference.md
@@ -1,5 +1,6 @@
```@meta
DocTestSetup= quote
DocTestSetup = quote
using TidierData
using TidierCats
end
```
7 changes: 0 additions & 7 deletions src/TidierCats.jl
@@ -10,13 +10,6 @@ using Reexport
export cat_rev, cat_relevel, cat_infreq, cat_lump, cat_reorder, cat_collapse, cat_lump_min, cat_lump_prop, as_categorical
include("catsdocstrings.jl")

function __init__()
try
append!(Main.TidierData.not_vectorized[], [:cat_rev, :cat_relevel, :cat_infreq, :cat_lump, :cat_reorder, :cat_collapse, :cat_lump_min, :cat_lump_prop, :as_categorical])
catch
end
end
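
The removed `__init__()` appended the package's function names to a list held in a `Ref`, silently doing nothing if `TidierData` was not loaded in `Main`. As a rough, package-free sketch of that registration pattern: the `not_vectorized` stand-in, its starting contents, and the `is_auto_vectorized` helper below are all hypothetical and are not TidierData's actual internals.

```julia
# Hypothetical stand-in for a registry such as TidierData.not_vectorized;
# the starting contents are invented for illustration.
const not_vectorized = Ref(Symbol[:mean, :median])

# Hypothetical helper: a function is auto-vectorized unless its name is registered.
is_auto_vectorized(fn::Symbol) = fn ∉ not_vectorized[]

# The names that the removed __init__() used to register
cat_functions = [:cat_rev, :cat_relevel, :cat_infreq, :cat_lump, :cat_reorder,
                 :cat_collapse, :cat_lump_min, :cat_lump_prop, :as_categorical]

vectorized_before = is_auto_vectorized(:cat_rev)   # not yet registered
append!(not_vectorized[], cat_functions)           # same call shape as the removed code
vectorized_after = is_auto_vectorized(:cat_rev)    # now registered

println((vectorized_before, vectorized_after))     # prints (true, false)
```

Per the NEWS.md entry in this commit, the names now live in TidierData's own list, so the runtime registration (and its dependence on `Main.TidierData` being defined) is no longer needed here.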

"""
$docstring_cat_rev
"""

2 comments on commit 0405447

@kdpsingh

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/89173

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.1.1 -m "<description of version>" 040544712b18524c5c67cc7399954110605a33ad
git push origin v0.1.1
