Forked from Tidier.jl to TidierData.jl, assigned new UUID, bumped version to 0.8.0
Karandeep Singh committed Jul 28, 2023
1 parent 37c992f commit 05c3af3
Showing 34 changed files with 182 additions and 180 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/Documenter.yml
@@ -23,7 +23,7 @@ jobs:
with:
cache-registries: "true"
- name: Install documentation dependencies
run: julia --project=docs -e 'using Pkg; pkg"dev ."; Pkg.instantiate()'
run: julia --project=docs -e 'using Pkg; pkg"latest ."; Pkg.instantiate()'
- name: Build and deploy
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # For authentication with GitHub Actions token
5 changes: 4 additions & 1 deletion NEWS.md
@@ -1,4 +1,7 @@
# Tidier.jl updates
# TidierData.jl updates

## v0.8.0 - 2023-07-28
- `Tidier.jl` cloned and changed to `TidierData.jl`

## v0.7.7 - 2023-07-15
- Added documentation on how to interpolate variables inside of `for` loops. Note: `!!` interpolation doesn't work inside of `for` loops because macros are expanded during parsing and not at runtime.
6 changes: 3 additions & 3 deletions Project.toml
@@ -1,7 +1,7 @@
name = "Tidier"
uuid = "f0413319-3358-4bb0-8e7c-0c83523a93bd"
name = "TidierData"
uuid = "fe2206b3-d496-4ee9-a338-6a095c4ece80"
authors = ["Karandeep Singh"]
version = "0.7.7"
version = "0.8.0"

[deps]
Chain = "8be319e6-bccf-4806-a6f7-6fae938471bc"
55 changes: 27 additions & 28 deletions README.md
@@ -1,32 +1,31 @@
# Tidier.jl
# TidierData.jl

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/TidierOrg/Tidier.jl/blob/main/LICENSE)
[![Docs: Latest](https://img.shields.io/badge/Docs-Latest-blue.svg)](https://tidierorg.github.io/Tidier.jl/dev)
[![Build Status](https://github.com/TidierOrg/Tidier.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/TidierOrg/Tidier.jl/actions/workflows/CI.yml?query=branch%3Amain)
[![Downloads](https://shields.io/endpoint?url=https://pkgs.genieframework.com/api/v1/badge/Tidier&label=Downloads)](https://pkgs.genieframework.com?packages=Tidier)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/TidierOrg/TidierData.jl/blob/main/LICENSE)
[![Docs: Latest](https://img.shields.io/badge/Docs-Latest-blue.svg)](https://tidierorg.github.io/TidierData.jl/dev)
[![Build Status](https://github.com/TidierOrg/TidierData.jl/actions/workflows/CI.yml/badge.svg?branch=main)](https://github.com/TidierOrg/TidierData.jl/actions/workflows/CI.yml?query=branch%3Amain)
[![Downloads](https://shields.io/endpoint?url=https://pkgs.genieframework.com/api/v1/badge/TidierData&label=Downloads)](https://pkgs.genieframework.com?packages=TidierData)

<img src="/docs/src/assets/Tidier_jl_logo.png" align="right" style="padding-left:10px;" width="150"/>

## What is Tidier.jl?
## What is TidierData.jl?

Tidier.jl is a 100% Julia implementation of the R tidyverse
mini-language in Julia. Powered by the DataFrames.jl package and Julia’s
extensive meta-programming capabilities, Tidier.jl is an R user’s love
TidierData.jl is a 100% Julia implementation of the dplyr and tidyr R packages. Powered by the DataFrames.jl package and Julia’s
extensive meta-programming capabilities, TidierData.jl is an R user’s love
letter to data analysis in Julia.

`Tidier.jl` has three goals, which differentiate it from other data analysis
`TidierData.jl` has three goals, which differentiate it from other data analysis
meta-packages in Julia:

1. **Stick as closely to tidyverse syntax as possible:** Whereas other
1. **Stick as closely to dplyr and tidyr syntax as possible:** Whereas other
meta-packages introduce Julia-centric idioms for working with
DataFrames, this package’s goal is to reimplement parts of tidyverse
in Julia. This means that `Tidier.jl` uses *tidy expressions* as opposed
DataFrames, this package’s goal is to reimplement dplyr and tidyr
in Julia. This means that `TidierData.jl` uses *tidy expressions* as opposed
to idiomatic Julia expressions. An example of a tidy expression is
`a = mean(b)`.

2. **Make broadcasting mostly invisible:** Broadcasting trips up many R
users switching to Julia because R users are used to most functions
being vectorized. `Tidier.jl` currently uses a lookup table to decide
being vectorized. `TidierData.jl` currently uses a lookup table to decide
which functions *not* to vectorize; all other functions are
automatically vectorized. Read the documentation page on "Autovectorization"
to read about how this works, and how to override the defaults.
@@ -36,17 +35,17 @@ meta-packages in Julia:
The first argument in the first instance above is treated as a scalar,
whereas the second instance is treated as a tuple. This can be very confusing
to R users because `1 == c(1)` is `TRUE` in R, whereas in Julia `1 == (1,)`
evaluates to `false`. The design philosophy in `Tidier.jl` is that the user
evaluates to `false`. The design philosophy in `TidierData.jl` is that the user
should feel free to provide a scalar or a tuple as they see fit anytime
multiple values are considered valid for a given argument, such as in
`across()`, and `Tidier.jl` will figure out how to dispatch it.
`across()`, and `TidierData.jl` will figure out how to dispatch it.
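
As a rough illustration of the scalar-versus-tuple point in plain Julia, the sketch below normalizes either form so that later code can treat both the same way. The helper name `as_tuple` is hypothetical and is not claimed to be how `TidierData.jl` dispatches internally.

```julia
# Hypothetical helper (an assumption for illustration, not TidierData.jl code):
# wrap a scalar in a one-element tuple and pass tuples through unchanged.
as_tuple(x) = (x,)
as_tuple(x::Tuple) = x

# In Julia, a scalar and a one-element tuple are different values.
scalar_equals_tuple = (1 == (1,))

# After normalization, both spellings of the argument compare equal.
same_after_normalizing = as_tuple(:a) == as_tuple((:a,))
```

With a helper like this, a caller can pass either `:a` or `(:a, :b)` and the receiving code only ever has to handle tuples.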

## Installation

For the stable version:

```
] add Tidier
] add TidierData
```

The `]` character starts the Julia [package manager](https://docs.julialang.org/en/v1/stdlib/Pkg/). Press the backspace key to return to the Julia prompt.
@@ -56,27 +55,27 @@ or

```julia
using Pkg
Pkg.add("Tidier")
Pkg.add("TidierData")
```

For the newest version:

```
] add Tidier#main
] add TidierData#main
```

or

```julia
using Pkg
Pkg.add(url="https://github.com/TidierOrg/Tidier.jl")
Pkg.add(url="https://github.com/TidierOrg/TidierData.jl")
```

## What functions does Tidier.jl support?
## What functions does TidierData.jl support?

To support R-style programming, Tidier.jl is implemented using macros.
To support R-style programming, TidierData.jl is implemented using macros.

Tidier.jl currently supports the following top-level macros:
TidierData.jl currently supports the following top-level macros:

- `@glimpse()`
- `@select()`, `@rename()`, and `@distinct()`
@@ -93,7 +92,7 @@ Tidier.jl currently supports the following top-level macros:
- `@drop_na()`
- `@clean_names()` (as in R's `janitor::clean_names()` function)

Tidier.jl also supports the following helper functions:
TidierData.jl also supports the following helper functions:

- `across()`
- `desc()`
@@ -104,14 +103,14 @@ Tidier.jl also supports the following helper functions:
- `starts_with()`, `ends_with()`, `matches()`, and `contains()`
- `as_float()`, `as_integer()`, and `as_string()`

See the documentation [Home](https://tidierorg.github.io/Tidier.jl/dev/) page for a guide on how to get started, or the [Reference](https://tidierorg.github.io/Tidier.jl/dev/reference/) page for a detailed guide to each of the macros and functions.
See the documentation [Home](https://tidierorg.github.io/TidierData.jl/latest/) page for a guide on how to get started, or the [Reference](https://tidierorg.github.io/TidierData.jl/latest/reference/) page for a detailed guide to each of the macros and functions.

## Example

Let's select the first five movies in our dataset whose budget exceeds the mean budget. Unlike in R, where we pass an `na.rm = TRUE` argument to remove missing values, in Julia we wrap the variable with a `skipmissing()` to remove the missing values before the `mean()` is calculated.

```julia
using Tidier
using TidierData
using RDatasets

movies = dataset("ggplot2", "movies");
@@ -138,8 +137,8 @@ end

## What’s new

See [NEWS.md](https://github.com/TidierOrg/Tidier.jl/blob/main/NEWS.md) for the latest updates.
See [NEWS.md](https://github.com/TidierOrg/TidierData.jl/blob/main/NEWS.md) for the latest updates.

## What's missing

Is there a tidyverse feature missing that you would like to see in Tidier.jl? Please file a GitHub issue. Because Tidier.jl primarily wraps DataFrames.jl, our decision to integrate a new feature will be guided by how well-supported it is within DataFrames.jl and how likely other users are to benefit from it.
Is there a tidyverse feature missing that you would like to see in TidierData.jl? Please file a GitHub issue. Because TidierData.jl primarily wraps DataFrames.jl, our decision to integrate a new feature will be guided by how well-supported it is within DataFrames.jl and how likely other users are to benefit from it.
2 changes: 1 addition & 1 deletion docs/Project.toml
@@ -6,4 +6,4 @@ Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterMarkdown = "997ab1e6-3595-5248-9280-8efb232c3433"
Literate = "98b081ad-f1c9-55d3-8b20-4c87d4299306"
RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
Tidier = "f0413319-3358-4bb0-8e7c-0c83523a93bd"
TidierData = "fe2206b3-d496-4ee9-a338-6a095c4ece80"
4 changes: 2 additions & 2 deletions docs/examples/Contributors/Howto.jl
@@ -1,11 +1,11 @@
# ## Contribute to Documentation
# Contributing with examples can be done by first creating a new file example
# [here](https://tidierorg.github.io/Tidier.jl/tree/main/docs/examples/UserGuide)
# [here](https://tidierorg.github.io/TidierData.jl/tree/main/docs/examples/UserGuide)

# !!! info
# - `your_new_file.jl` at `docs/examples/UserGuide/`

# Once this is done you need to add a new entry [here](https://tidierorg.github.io/Tidier.jl/blob/main/docs/mkdocs.yml)
# Once this is done you need to add a new entry [here](https://tidierorg.github.io/TidierData.jl/blob/main/docs/mkdocs.yml)
# at the bottom and the appropriate level.

# !!! info
2 changes: 1 addition & 1 deletion docs/examples/UserGuide/across.jl
@@ -1,6 +1,6 @@
# `across()` is a helper function that is typically used inside `@mutate()` or `@summarize()` to operate on multiple columns and/or multiple functions. Notice that `across()` accepts two arguments, a set of variables and a set of functions. If providing multiple variables or functions, these should be provided as a tuple -- in other words, wrapped in parentheses and separated by commas. If you want to skip missing values, you can "fuse" the summary function (such as `mean()`) with the `skipmissing()` function by using the function composition operator, which you can type out in Julia by typing `\circ` and then pressing `[Tab]` such that it reads `mean∘skipmissing`.

using Tidier
using TidierData
using RDatasets

movies = dataset("ggplot2", "movies");
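
The `mean∘skipmissing` idiom described above can be tried in plain Julia, independent of `across()`. This is a minimal sketch; the `budget` vector is made-up data, not taken from the `movies` dataset.

```julia
using Statistics

# Made-up data for illustration only.
budget = [10.0, missing, 30.0]

# `∘` composes functions: (mean ∘ skipmissing)(x) is mean(skipmissing(x)).
mean_skipping_missing = mean ∘ skipmissing

plain_mean = mean(budget)                      # propagates `missing`
composed_mean = mean_skipping_missing(budget)  # ignores the missing entry
```

Because the composed function is an ordinary function value, it can be passed anywhere a summary function such as `mean` is accepted.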
4 changes: 2 additions & 2 deletions docs/examples/UserGuide/arrange.jl
@@ -1,8 +1,8 @@
# Arranging is the way to sort a data frame. `@arrange()` can take multiple arguments. Arguments refer to columns that are sorted in ascending order by default. If you want to sort in descending order, make sure to wrap the column name in `desc()` as shown below.

# `DataFrames.jl` does not currently support the `sort()` function on grouped data frames. In order to make this work in `Tidier.jl`, if you apply `@arrange()` to a GroupedDataFrame, `@arrange()` will temporarily ungroup the data, perform the `sort()`, and then re-group by the original grouping variables.
# `DataFrames.jl` does not currently support the `sort()` function on grouped data frames. In order to make this work in `TidierData.jl`, if you apply `@arrange()` to a GroupedDataFrame, `@arrange()` will temporarily ungroup the data, perform the `sort()`, and then re-group by the original grouping variables.

using Tidier
using TidierData
using RDatasets

movies = dataset("ggplot2", "movies");
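
The ungroup, sort, and re-group sequence described above can be sketched directly with DataFrames.jl. This is an illustration under stated assumptions, not the package's source: the function name `sort_grouped` and the toy data are hypothetical, while `parent`, `sort`, `groupcols`, and `groupby` are DataFrames.jl functions.

```julia
using DataFrames

# Toy data for illustration only.
df = DataFrame(g = ["b", "a", "b", "a"], x = [4, 3, 2, 1])
gdf = groupby(df, :g)

# Hypothetical helper: sort the underlying data, then restore the grouping.
function sort_grouped(gdf::GroupedDataFrame, cols)
    sorted = sort(parent(gdf), cols)        # parent() gives the ungrouped data frame
    return groupby(sorted, groupcols(gdf))  # regroup by the original grouping columns
end

regrouped = sort_grouped(gdf, :x)
```

The result is again a `GroupedDataFrame` with the same grouping columns, so later grouped operations keep working on the sorted rows.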
8 changes: 4 additions & 4 deletions docs/examples/UserGuide/autovec.jl
@@ -1,8 +1,8 @@
# In general, Tidier.jl uses a lookup table to decide which functions *not* to vectorize. For example, `mean()` is listed as a function that should never be vectorized. Also, any function used inside of `@summarize()` is also never automatically vectorized. Any function that is not included in this list *and* is used in a context other than `@summarize()` is automatically vectorized.
# In general, TidierData.jl uses a lookup table to decide which functions *not* to vectorize. For example, `mean()` is listed as a function that should never be vectorized. Also, any function used inside of `@summarize()` is also never automatically vectorized. Any function that is not included in this list *and* is used in a context other than `@summarize()` is automatically vectorized.

# This "auto-vectorization" makes working with Tidier.jl more R-like and convenient. However, if you ever define your own function and try to use it, Tidier.jl may unintentionally vectorize it for you. To prevent auto-vectorization, you can prefix your function with a `~`.
# This "auto-vectorization" makes working with TidierData.jl more R-like and convenient. However, if you ever define your own function and try to use it, TidierData.jl may unintentionally vectorize it for you. To prevent auto-vectorization, you can prefix your function with a `~`.

using Tidier
using TidierData
using RDatasets

df = DataFrame(a = repeat('a':'e', inner = 2), b = [1,1,1,2,2,2,3,3,3,4], c = 11:20)
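
To make the distinction concrete, here is roughly what the two cases look like when the broadcasting is written by hand in plain Julia. This is a sketch of the idea only; it does not show the code the macros actually generate.

```julia
using Statistics

c = collect(11:20)

# Hand-written equivalent of `d = c - mean(c)`: the subtraction is broadcast
# over the column, while `mean` is applied once to the whole column.
centered = c .- mean(c)

# Broadcasting `mean` itself takes the mean of each element on its own,
# so every value comes back numerically unchanged.
elementwise = mean.(c)
```

This is why `mean()` belongs on the do-not-vectorize list: vectorizing it silently turns a column summary into a no-op.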
@@ -35,7 +35,7 @@ end
@mutate(d = c - ~mean(c))
end

# If for some crazy reason, you *did* want to vectorize `mean()`, you are always allowed to vectorize it, and Tidier.jl won't un-vectorize it.
# If for some crazy reason, you *did* want to vectorize `mean()`, you are always allowed to vectorize it, and TidierData.jl won't un-vectorize it.

@chain df begin
@mutate(d = c - mean.(c))
4 changes: 2 additions & 2 deletions docs/examples/UserGuide/binding.jl
@@ -1,8 +1,8 @@
# Whereas joins are useful for combining data frames based on matching keys, another way to combine data frames is to bind them together, which can be done either by rows or by columns. `Tidier.jl` implements these actions using `@bind_rows()` and `@bind_cols()`, respectively.
# Whereas joins are useful for combining data frames based on matching keys, another way to combine data frames is to bind them together, which can be done either by rows or by columns. `TidierData.jl` implements these actions using `@bind_rows()` and `@bind_cols()`, respectively.

# Let's generate three data frames to combine.

using Tidier
using TidierData

df1 = DataFrame(a=1:3, b=1:3);

12 changes: 6 additions & 6 deletions docs/examples/UserGuide/column_names.jl
@@ -1,15 +1,15 @@
# When referring to column names, Tidier.jl is a bit unusual for a Julia package in that it does not use symbols. This is because Tidier.jl uses *tidy expressions*, which in R lingo equates to a style of programming referred to as "non-standard evaluation." If you are creating a new column `a` containing a value that is the mean of column `b`, you would simply write `a = mean(b)`.
# When referring to column names, TidierData.jl is a bit unusual for a Julia package in that it does not use symbols. This is because TidierData.jl uses *tidy expressions*, which in R lingo equates to a style of programming referred to as "non-standard evaluation." If you are creating a new column `a` containing a value that is the mean of column `b`, you would simply write `a = mean(b)`.

# However, there may be times when you wish to create or refer to a column containing a space in it. Let's start by creating some column names containing a space in their name.

using Tidier
using TidierData

df = DataFrame(var"my name" = ["Ada", "Twist"],
var"my age" = [40, 50])

# To create a column name containing a space, we used the `var"column name"` notation. Because `DataFrame()` is a regular Julia function, this is the standard way to refer to a variable containing a space, which is why we need to use this here.

# This notation *also* works inside of Tidier.jl.
# This notation *also* works inside of TidierData.jl.
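
The `var"..."` syntax is a general Julia feature rather than something specific to data frames. A minimal sketch using a named tuple (the `row` value is made up for illustration):

```julia
# `var"..."` lets an identifier contain spaces; the name is an ordinary Symbol.
row = (var"my name" = "Ada", var"my age" = 40)

column_names = keys(row)

# The same field can be looked up with an explicitly constructed Symbol.
age_in_10_years = row[Symbol("my age")] + 10
```

This shows that `var"my age"` and `Symbol("my age")` name the same field, which is why the notation carries over to column names.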

# ## `var"column name"` notation

@@ -19,7 +19,7 @@ df = DataFrame(var"my name" = ["Ada", "Twist"],
@mutate(var"age in 10 years" = var"my age" + 10)
end

# However, typing out the `var"column name"` can become cumbersome. Tidier.jl also supports another shorthand notation to refer to column names containing spaces or other special characters: backticks.
# However, typing out the `var"column name"` can become cumbersome. TidierData.jl also supports another shorthand notation to refer to column names containing spaces or other special characters: backticks.

# ## Backtick notation

@@ -29,11 +29,11 @@ end
@mutate(`age in 10 years` = `my age` + 10)
end

# Backticks are an R convention. While they are not specific to tidyverse, they are a convenient way to refer to column names that otherwise would not parse correctly as a single entity. Backticks are supported in *all* Tidier.jl functions where column names may be referenced.
# Backticks are an R convention. While they are not specific to tidyverse, they are a convenient way to refer to column names that otherwise would not parse correctly as a single entity. Backticks are supported in *all* TidierData.jl functions where column names may be referenced.

# ## Cleaning up column names

# Another option is to clean up the column names so that you do not have spaces to begin with. In R, this is usually accomplished using the `janitor` package. In Julia, the Cleaner.jl package provides this functionality, which we have wrapped inside of Tidier.jl.
# Another option is to clean up the column names so that you do not have spaces to begin with. In R, this is usually accomplished using the `janitor` package. In Julia, the Cleaner.jl package provides this functionality, which we have wrapped inside of TidierData.jl.

@chain df begin
@clean_names