Machine Learning with R by Brett Lantz is a book that provides an introduction to machine learning using R. As far as I can tell, Packt Publishing does not make its datasets available online unless you buy the book and create a user account, which can be a problem if you are checking the book out from the library or borrowing it from a friend. All of these datasets are in the public domain; they simply needed some cleaning up and recoding to match the format in the book.
- In your Mac or Linux environment, open a terminal and change to the directory where you want the data to be downloaded.
- Go to the GitHub page of the file you want to download (for example, the challenger data in chapter 6: https://github.com/stedy/Machine-Learning-with-R-datasets/blob/master/challenger.csv)
- On the right side, you will find a button called "Raw". Click on it.
- Copy the URL of the page that opens (in our example, https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/challenger.csv)
- In the terminal, run `wget name_of_url`, replacing name_of_url with the URL you copied.
In our example, the command is:
wget https://raw.githubusercontent.com/stedy/Machine-Learning-with-R-datasets/master/challenger.csv
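The copy-the-raw-URL steps above can also be done in the shell. The following is a minimal sketch, not part of the original instructions: `to_raw_url` is a hypothetical helper name, and it assumes the GitHub page URL has the same `github.com/.../blob/...` shape as the example above.

```shell
# Hypothetical helper (not from the book or this repository):
# turn a GitHub file-page ("blob") URL into its raw-file URL.
# Assumes the URL looks like https://github.com/USER/REPO/blob/BRANCH/FILE.
to_raw_url() {
  printf '%s\n' "$1" | sed \
    -e 's#^https://github\.com/#https://raw.githubusercontent.com/#' \
    -e 's#/blob/#/#'
}

page_url="https://github.com/stedy/Machine-Learning-with-R-datasets/blob/master/challenger.csv"
raw_url="$(to_raw_url "$page_url")"

# Show the URL that would be downloaded.
echo "$raw_url"

# Uncomment to download the file into the current directory:
# wget "$raw_url"
```

The download line is left commented out so the snippet can be pasted and checked before anything is fetched.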
No datasets used
usedcars.csv could not be found online
wisc_bc_data.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/
sms_spam.csv from http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/
credit.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/
mushrooms.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/
challenger.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/space-shuttle/
insurance.csv could not be found online
whitewines.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/
concrete.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/concrete/compressive/
letterdata.csv from https://archive.ics.uci.edu/ml/machine-learning-databases/letter-recognition/
groceries.csv is from the arules package, but it is probably easier to just call `library(arules); data(Groceries)`
snsdata.csv could not be found online
sms_results.csv is likely from the sms_test_pred object in Chapter 4, but it is difficult to be sure.
credit.csv is likely the same file from Chapter 5.
credit.csv from Chapter 5 is reused.
No datasets used