diff --git a/README.md b/README.md
index 2be3d610..57ee1300 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 # MathOptAI.jl

 [![Build Status](https://github.com/lanl-ansi/MathOptAI.jl/workflows/CI/badge.svg)](https://github.com/lanl-ansi/MathOptAI.jl/actions?query=workflow%3ACI)
-[![codecov](https://codecov.io/gh/lanl-ansi/MathOptAI.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/lanl-ansi/MathOptAI.jl)
+[![Code coverage](https://codecov.io/gh/lanl-ansi/MathOptAI.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/lanl-ansi/MathOptAI.jl)

 MathOptAI.jl is a [JuMP](https://jump.dev) extension for embedding trained AI,
 machine learning, and statistical learning models into a JuMP optimization
diff --git a/docs/make.jl b/docs/make.jl
index b4309f56..c83565e6 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -57,7 +57,7 @@ end
 _literate_directory(joinpath(@__DIR__, "src", "tutorials"))

 # ==============================================================================
-# makedocs
+# Build the documentation
 # ==============================================================================

 Documenter.makedocs(;
diff --git a/docs/src/developers/design_principles.md b/docs/src/developers/design_principles.md
index 288b765b..14da5a4a 100644
--- a/docs/src/developers/design_principles.md
+++ b/docs/src/developers/design_principles.md
@@ -28,7 +28,7 @@ MathOptAI chooses to use "predictor" as the synonym for the machine learning
 model. Hence, we have `AbstractPredictor`, `add_predictor`, and
 `build_predictor`.

-In contrast, gurob-machinelearning tends to use "regression model" and OMLT
+In contrast, gurobi-machinelearning tends to use "regression model" and OMLT
 uses "formulation."

 We choose "predictor" because all models we implement are of the form
@@ -36,7 +36,7 @@ We do not use "machine learning model" because we have support for the linear
 and logistic regression models of classical statistical fitting. We could have
-used "regression model", but we find that models like neural networks and
+used "regression model," but we find that models like neural networks and
 binary decision trees are not commonly thought of as regression models.

 ## Inputs are vectors
@@ -50,7 +50,7 @@ In our opinion, Julia libraries often take a laissez-faire approach to the
 types that they support. In the optimistic case, this can lead to novel behavior
 by combining two packages that the package author had previously not considered
 or tested. In the pessimistic case, this can lead to incorrect results or cryptic
-error messagges.
+error messages.

 Exceptions to the `Vector` rule will be carefully considered and tested.
@@ -175,7 +175,7 @@ elementwise activation functions, like `sigmoid_activation_function`.

 The downside to this approach is that it treats activation functions as special,
 leading to issues such as [OMLT#125](https://github.com/cog-imperial/OMLT/issues/125).

-In constrast, MathOptAI treats activation functions as a vector-valued predictor
+In contrast, MathOptAI treats activation functions as a vector-valued predictor
 like any other:
 ```julia
 y, formulation = MathOptAI.add_predictor(model, MathOptAI.ReLU(), x)
 ```
@@ -210,7 +210,7 @@ In contrast, MathOptAI tries to take a maximally modular approach, where the
 user can control how the layers are formulated at runtime, including using a
 custom formulation that is not defined in MathOptAI.jl.
-Currently, we achive this with a `config` dictionary, which maps the various
+Currently, we achieve this with a `config` dictionary, which maps the various
 neural network layers to an [`AbstractPredictor`](@ref). For example:
 ```julia
 chain = Flux.Chain(Flux.Dense(1 => 16, Flux.relu), Flux.Dense(16 => 1));
diff --git a/docs/src/manual/AbstractGPs.md b/docs/src/manual/AbstractGPs.md
index f1882603..6633952b 100644
--- a/docs/src/manual/AbstractGPs.md
+++ b/docs/src/manual/AbstractGPs.md
@@ -1,7 +1,7 @@
 # AbstractGPs.jl

 [AbstractGPs.jl](https://github.com/JuliaGaussianProcesses/AbstractGPs.jl) is a
-library for fittinng Gaussian Processes in Julia.
+library for fitting Gaussian Processes in Julia.

 ## Basic example

diff --git a/docs/src/manual/predictors.md b/docs/src/manual/predictors.md
index 45f02f59..492c1f5a 100644
--- a/docs/src/manual/predictors.md
+++ b/docs/src/manual/predictors.md
@@ -42,7 +42,7 @@ linear unit (ReLU).

 * [`ReLU`](@ref): requires the solver to support the `max` nonlinear operator.
 * [`ReLUBigM`](@ref): requires the solver to support mixed-integer linear
-  programs, and requires the user to have a priori knowledge of a suitable
+  programs, and requires the user to have prior knowledge of a suitable
   value for the "big-M" parameter.
 * [`ReLUQuadratic`](@ref): requires the solver to support quadratic equality
   constraints
diff --git a/docs/src/tutorials/decision_trees.jl b/docs/src/tutorials/decision_trees.jl
index bacacff7..7eabb4ea 100644
--- a/docs/src/tutorials/decision_trees.jl
+++ b/docs/src/tutorials/decision_trees.jl
@@ -40,7 +40,7 @@ function read_df(filename)
 end

 # There are two important files. The first, `college_student_enroll-s1-1.csv`,
-# contains historial admissions data on anonymized students, their SAT score,
+# contains historical admissions data on anonymized students, their SAT score,
 # their GPA, their merit scholarships, and whether the enrolled in the college.

 train_df = read_df("college_student_enroll-s1-1.csv")
@@ -76,8 +76,8 @@ evaluate_df

 model = Model()

-# First, we add a new columnn to `evaluate_df`, with one JuMP decision variable
-# for each row. It is important the the `.merit` column name in `evaluate_df`
+# First, we add a new column to `evaluate_df`, with one JuMP decision variable
+# for each row. It is important that the `.merit` column name in `evaluate_df`
 # matches the name in `train_df`.

 evaluate_df.merit = @variable(model, 0 <= x_merit[1:n_students] <= 2.5);
diff --git a/docs/src/tutorials/mnist.jl b/docs/src/tutorials/mnist.jl
index 6b3ed436..e1dbab52 100644
--- a/docs/src/tutorials/mnist.jl
+++ b/docs/src/tutorials/mnist.jl
@@ -73,7 +73,7 @@ function data_loader(data; batchsize, shuffle = false)
 end

 # and here is a function to score the percentage of correct labels, where we
-# assign a label by choosing the label of the highest softmax in the final
+# assign a label by choosing the label of the highest `softmax` in the final
 # layer.

 function score_model(predictor, data)
diff --git a/docs/src/tutorials/mnist_lux.jl b/docs/src/tutorials/mnist_lux.jl
index 296c0158..cd41799e 100644
--- a/docs/src/tutorials/mnist_lux.jl
+++ b/docs/src/tutorials/mnist_lux.jl
@@ -81,7 +81,7 @@ function data_loader(data; batchsize, shuffle = false)
 end

 # and here is a function to score the percentage of correct labels, where we
-# assign a label by choosing the label of the highest softmax in the final
+# assign a label by choosing the label of the highest `softmax` in the final
 # layer.

 function score_model(predictor, data)
diff --git a/docs/src/tutorials/pytorch.jl b/docs/src/tutorials/pytorch.jl
index bde80403..2f1d1490 100644
--- a/docs/src/tutorials/pytorch.jl
+++ b/docs/src/tutorials/pytorch.jl
@@ -17,14 +17,14 @@
 # See [CondaPkg.jl](https://github.com/JuliaPy/CondaPkg.jl) for more control
 # over how to link Julia to an existing Python environment. For example, if you
 # have an existing Python installation (with PyTorch installed), and it is
-# available in the current conda environment, set:
+# available in the current Conda environment, set:
 #
 # ```julia
 # ENV["JULIA_CONDAPKG_BACKEND"] = "Current"
 # ```
 #
 # before importing PythonCall.jl. If the Python installation can be found on
-# the path and it is not in a conda environment, set:
+# the path and it is not in a Conda environment, set:
 #
 # ```julia
 # ENV["JULIA_CONDAPKG_BACKEND"] = "Null"
 # ```
@@ -58,7 +58,7 @@ import Plots

 # The model is unimportant, but for this example, we are trying to fit noisy
 # observations of the function ``f(x) = x^2 - 2x``.
-# In Python, I ran:
+# In Python, we ran:
 # ```python
 # #!/usr/bin/python3
 # import torch
diff --git a/docs/src/tutorials/student_enrollment.jl b/docs/src/tutorials/student_enrollment.jl
index fc1497ca..af0574d7 100644
--- a/docs/src/tutorials/student_enrollment.jl
+++ b/docs/src/tutorials/student_enrollment.jl
@@ -39,7 +39,7 @@ function read_df(filename)
 end

 # There are two important files. The first, `college_student_enroll-s1-1.csv`,
-# contains historial admissions data on anonymized students, their SAT score,
+# contains historical admissions data on anonymized students, their SAT score,
 # their GPA, their merit scholarships, and whether the enrolled in the college.

 train_df = read_df("college_student_enroll-s1-1.csv")
@@ -58,7 +58,7 @@ n_students = size(evaluate_df, 1)

 # The first step is to train a logistic regression model to predict the Boolean
 # `enroll` column based on the `SAT`, `GPA`, and `merit` columns.
-model_glm = GLM.glm(
+predictor = GLM.glm(
     GLM.@formula(enroll ~ 0 + SAT + GPA + merit),
     train_df,
     GLM.Bernoulli(),
@@ -75,19 +75,19 @@ evaluate_df

 model = Model()

-# First, we add a new columnn to `evaluate_df`, with one JuMP decision variable
-# for each row. It is important the the `.merit` column name in `evaluate_df`
+# First, we add a new column to `evaluate_df`, with one JuMP decision variable
+# for each row. It is important that the `.merit` column name in `evaluate_df`
 # matches the name in `train_df`.

 evaluate_df.merit = @variable(model, 0 <= x_merit[1:n_students] <= 2.5);
 evaluate_df

-# Then, we use [`MathOptAI.add_predictor`](@ref) to embed `model_glm` into the
+# Then, we use [`MathOptAI.add_predictor`](@ref) to embed `predictor` into the
 # JuMP `model`. [`MathOptAI.add_predictor`](@ref) returns a vector of variables,
 # one for each row inn `evaluate_df`, corresponding to the output `enroll` of
 # our logistic regression.

-evaluate_df.enroll, _ = MathOptAI.add_predictor(model, model_glm, evaluate_df);
+evaluate_df.enroll, _ = MathOptAI.add_predictor(model, predictor, evaluate_df);
 evaluate_df

 # The `.enroll` column name in `evaluate_df` is just a name.
It doesn't have to diff --git a/ext/MathOptAIDecisionTreeExt.jl b/ext/MathOptAIDecisionTreeExt.jl index fd5e0dc3..8c2d8a8a 100644 --- a/ext/MathOptAIDecisionTreeExt.jl +++ b/ext/MathOptAIDecisionTreeExt.jl @@ -34,7 +34,7 @@ julia> size(features) julia> labels = truth.(Vector.(eachrow(features))); -julia> ml_model = DecisionTree.build_tree(labels, features) +julia> predictor = DecisionTree.build_tree(labels, features) Decision Tree Leaves: 3 Depth: 2 @@ -43,7 +43,7 @@ julia> model = Model(); julia> @variable(model, 0 <= x[1:2] <= 1); -julia> y, _ = MathOptAI.add_predictor(model, ml_model, x); +julia> y, _ = MathOptAI.add_predictor(model, predictor, x); julia> y 1-element Vector{VariableRef}: @@ -80,12 +80,12 @@ julia> size(features) julia> labels = truth.(Vector.(eachrow(features))); -julia> ml_model = DecisionTree.build_tree(labels, features) +julia> tree = DecisionTree.build_tree(labels, features) Decision Tree Leaves: 3 Depth: 2 -julia> MathOptAI.build_predictor(ml_model) +julia> predictor = MathOptAI.build_predictor(tree) BinaryDecisionTree{Float64,Int64} [leaves=3, depth=2] ``` """ diff --git a/ext/MathOptAIGLMExt.jl b/ext/MathOptAIGLMExt.jl index ee8b799f..7df66683 100644 --- a/ext/MathOptAIGLMExt.jl +++ b/ext/MathOptAIGLMExt.jl @@ -27,13 +27,13 @@ julia> using GLM, JuMP, MathOptAI julia> X, Y = rand(10, 2), rand(10); -julia> model_glm = GLM.lm(X, Y); +julia> predictor = GLM.lm(X, Y); julia> model = Model(); julia> @variable(model, x[1:2]); -julia> y, _ = MathOptAI.add_predictor(model, model_glm, x); +julia> y, _ = MathOptAI.add_predictor(model, predictor, x); julia> y 1-element Vector{VariableRef}: @@ -65,9 +65,9 @@ julia> using GLM, MathOptAI julia> X, Y = rand(10, 2), rand(10); -julia> model_glm = GLM.lm(X, Y); +julia> model = GLM.lm(X, Y); -julia> MathOptAI.build_predictor(model_glm) +julia> predictor = MathOptAI.build_predictor(model) Affine(A, b) [input: 2, output: 1] ``` """ @@ -99,7 +99,7 @@ julia> using GLM, JuMP, MathOptAI julia> X, Y = rand(10, 2), rand(Bool, 10); -julia> model_glm = GLM.glm(X, Y, GLM.Bernoulli()); +julia> predictor = GLM.glm(X, Y, GLM.Bernoulli()); julia> model = Model(); @@ -107,7 +107,7 @@ julia> @variable(model, x[1:2]); julia> y, _ = MathOptAI.add_predictor( model, - model_glm, + predictor, x; sigmoid = MathOptAI.Sigmoid(), ); @@ -154,9 +154,9 @@ julia> using GLM, MathOptAI julia> X, Y = rand(10, 2), rand(Bool, 10); -julia> model_glm = GLM.glm(X, Y, GLM.Bernoulli()); +julia> model = GLM.glm(X, Y, GLM.Bernoulli()); -julia> MathOptAI.build_predictor(model_glm) +julia> predictor = MathOptAI.build_predictor(model) Pipeline with layers: * Affine(A, b) [input: 2, output: 1] * Sigmoid() diff --git a/src/MathOptAI.jl b/src/MathOptAI.jl index 2df1e53a..2c630fed 100644 --- a/src/MathOptAI.jl +++ b/src/MathOptAI.jl @@ -155,7 +155,7 @@ function add_predictor end add_predictor(model::JuMP.AbstractModel, predictor, x::Matrix) Return a `Matrix`, representing `y` such that `y[:, i] = predictor(x[:, i])` for -each columnn `i`. +each column `i`. ## Example diff --git a/src/predictors/BinaryDecisionTree.jl b/src/predictors/BinaryDecisionTree.jl index 3c04a37d..95da4e84 100644 --- a/src/predictors/BinaryDecisionTree.jl +++ b/src/predictors/BinaryDecisionTree.jl @@ -21,7 +21,7 @@ An [`AbstractPredictor`](@ref) that represents a binary decision tree. To represent the tree `x[1] <= 0.0 ? -1 : (x[1] <= 1.0 ? 0 : 1)`, do: -```jldoctest doc_decision_tree +```jldoctest julia> using JuMP, MathOptAI julia> model = Model();