Commit
1 parent f6daf20, commit 2a0cc7f
Showing 1 changed file with 1 addition and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"name":"python","version":"3.7.12","mimetype":"text/x-python","codemirror_mode":{"name":"ipython","version":3},"pygments_lexer":"ipython3","nbconvert_exporter":"python","file_extension":".py"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"markdown","source":"**This notebook is an exercise in the [Pandas](https://www.kaggle.com/learn/pandas) course. You can reference the tutorial at [this link](https://www.kaggle.com/residentmario/indexing-selecting-assigning).**\n\n---\n","metadata":{}},{"cell_type":"markdown","source":"# Introduction\n\nIn this set of exercises we will work with the [Wine Reviews dataset](https://www.kaggle.com/zynicide/wine-reviews). ","metadata":{}},{"cell_type":"markdown","source":"Run the following cell to load your data and some utility functions (including code to check your answers).","metadata":{}},{"cell_type":"code","source":"import pandas as pd\n\nreviews = pd.read_csv(\"../input/wine-reviews/winemag-data-130k-v2.csv\", index_col=0)\npd.set_option(\"display.max_rows\", 5)\n\nfrom learntools.core import binder; binder.bind(globals())\nfrom learntools.pandas.indexing_selecting_and_assigning import *\nprint(\"Setup complete.\")","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:12.936486Z","iopub.execute_input":"2022-04-05T10:57:12.937314Z","iopub.status.idle":"2022-04-05T10:57:14.081492Z","shell.execute_reply.started":"2022-04-05T10:57:12.937265Z","shell.execute_reply":"2022-04-05T10:57:14.080572Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"Look at an overview of your data by running the following 
line.","metadata":{}},{"cell_type":"code","source":"reviews.head()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.08455Z","iopub.execute_input":"2022-04-05T10:57:14.084768Z","iopub.status.idle":"2022-04-05T10:57:14.102906Z","shell.execute_reply.started":"2022-04-05T10:57:14.084742Z","shell.execute_reply":"2022-04-05T10:57:14.101759Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Exercises","metadata":{}},{"cell_type":"markdown","source":"## 1.\n\nSelect the `description` column from `reviews` and assign the result to the variable `desc`.","metadata":{}},{"cell_type":"code","source":"# Your code here\ndesc = reviews ['description']\n\n# Check your answer\nq1.check()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.104443Z","iopub.execute_input":"2022-04-05T10:57:14.104747Z","iopub.status.idle":"2022-04-05T10:57:14.12082Z","shell.execute_reply.started":"2022-04-05T10:57:14.104706Z","shell.execute_reply":"2022-04-05T10:57:14.119816Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"Follow-up question: what type of object is `desc`? 
If you're not sure, you can check by calling Python's `type` function: `type(desc)`.","metadata":{}},{"cell_type":"code","source":"#q1.hint()\n#q1.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.124105Z","iopub.execute_input":"2022-04-05T10:57:14.124589Z","iopub.status.idle":"2022-04-05T10:57:14.13125Z","shell.execute_reply.started":"2022-04-05T10:57:14.124552Z","shell.execute_reply":"2022-04-05T10:57:14.130308Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 2.\n\nSelect the first value from the description column of `reviews`, assigning it to variable `first_description`.","metadata":{}},{"cell_type":"code","source":"first_description = reviews.description.iloc[0]\n\n# Check your answer\nq2.check()\nfirst_description","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.132526Z","iopub.execute_input":"2022-04-05T10:57:14.132743Z","iopub.status.idle":"2022-04-05T10:57:14.151147Z","shell.execute_reply.started":"2022-04-05T10:57:14.132709Z","shell.execute_reply":"2022-04-05T10:57:14.150321Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q2.hint()\n#q2.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.152791Z","iopub.execute_input":"2022-04-05T10:57:14.153606Z","iopub.status.idle":"2022-04-05T10:57:14.158918Z","shell.execute_reply.started":"2022-04-05T10:57:14.153558Z","shell.execute_reply":"2022-04-05T10:57:14.158052Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 3. 
\n\nSelect the first row of data (the first record) from `reviews`, assigning it to the variable `first_row`.","metadata":{}},{"cell_type":"code","source":"first_row = reviews.iloc[0]\n\n# Check your answer\nq3.check()\nfirst_row","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.160456Z","iopub.execute_input":"2022-04-05T10:57:14.161377Z","iopub.status.idle":"2022-04-05T10:57:14.181819Z","shell.execute_reply.started":"2022-04-05T10:57:14.161334Z","shell.execute_reply":"2022-04-05T10:57:14.180927Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q3.hint()\n#q3.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.183119Z","iopub.execute_input":"2022-04-05T10:57:14.183432Z","iopub.status.idle":"2022-04-05T10:57:14.188019Z","shell.execute_reply.started":"2022-04-05T10:57:14.183392Z","shell.execute_reply":"2022-04-05T10:57:14.186901Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 4.\n\nSelect the first 10 values from the `description` column in `reviews`, assigning the result to variable `first_descriptions`.\n\nHint: format your output as a pandas Series.","metadata":{}},{"cell_type":"code","source":"first_descriptions = reviews.description.head(10)\n\n# Check your 
answer\nq4.check()\nfirst_descriptions","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.18958Z","iopub.execute_input":"2022-04-05T10:57:14.190443Z","iopub.status.idle":"2022-04-05T10:57:14.211322Z","shell.execute_reply.started":"2022-04-05T10:57:14.190399Z","shell.execute_reply":"2022-04-05T10:57:14.210606Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q4.hint()\n#q4.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.213584Z","iopub.execute_input":"2022-04-05T10:57:14.214043Z","iopub.status.idle":"2022-04-05T10:57:14.217102Z","shell.execute_reply.started":"2022-04-05T10:57:14.214008Z","shell.execute_reply":"2022-04-05T10:57:14.216445Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 5.\n\nSelect the records with index labels `1`, `2`, `3`, `5`, and `8`, assigning the result to the variable `sample_reviews`.\n\nIn other words, generate the following DataFrame:\n\n![](https://i.imgur.com/sHZvI1O.png)","metadata":{}},{"cell_type":"code","source":"a = [1, 2, 3, 5, 8]\nsample_reviews = reviews.loc[a]\n\n# Check your answer\nq5.check()\nsample_reviews","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.218294Z","iopub.execute_input":"2022-04-05T10:57:14.218511Z","iopub.status.idle":"2022-04-05T10:57:14.253469Z","shell.execute_reply.started":"2022-04-05T10:57:14.218485Z","shell.execute_reply":"2022-04-05T10:57:14.252827Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q5.hint()\n#q5.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.272969Z","iopub.execute_input":"2022-04-05T10:57:14.273651Z","iopub.status.idle":"2022-04-05T10:57:14.277734Z","shell.execute_reply.started":"2022-04-05T10:57:14.273602Z","shell.execute_reply":"2022-04-05T10:57:14.276763Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 6.\n\nCreate a variable 
`df` containing the `country`, `province`, `region_1`, and `region_2` columns of the records with the index labels `0`, `1`, `10`, and `100`. In other words, generate the following DataFrame:\n\n![](https://i.imgur.com/FUCGiKP.png)","metadata":{}},{"cell_type":"code","source":"a = [0, 1, 10, 100]\nb = ['country', 'province', 'region_1', 'region_2']\ndf = reviews.loc[a, b]\n\n# Check your answer\nq6.check()\ndf","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.421196Z","iopub.execute_input":"2022-04-05T10:57:14.421489Z","iopub.status.idle":"2022-04-05T10:57:14.441431Z","shell.execute_reply.started":"2022-04-05T10:57:14.421458Z","shell.execute_reply":"2022-04-05T10:57:14.440458Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q6.hint()\n#q6.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.557653Z","iopub.execute_input":"2022-04-05T10:57:14.558546Z","iopub.status.idle":"2022-04-05T10:57:14.562047Z","shell.execute_reply.started":"2022-04-05T10:57:14.558501Z","shell.execute_reply":"2022-04-05T10:57:14.561158Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 7.\n\nCreate a variable `df` containing the `country` and `variety` columns of the first 100 records. \n\nHint: you may use `loc` or `iloc`. When working on the answer to this question and several of the ones that follow, keep in mind the following \"gotcha\" described in the tutorial:\n\n> `iloc` uses the Python stdlib indexing scheme, where the first element of the range is included and the last one excluded. \n`loc`, meanwhile, indexes inclusively. \n\n> This is particularly confusing when the DataFrame index is a simple numerical list, e.g. `0,...,1000`. In this case `df.iloc[0:1000]` will return 1000 entries, while `df.loc[0:1000]` returns 1001 of them! To get 1000 elements using `loc`, you will need to go one lower and ask for `df.loc[0:999]`. 
","metadata":{}},{"cell_type":"code","source":"a = ['country','variety']\ndf = reviews.loc[:99, a]\n\n# Check your answer\nq7.check()\ndf","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.652048Z","iopub.execute_input":"2022-04-05T10:57:14.652789Z","iopub.status.idle":"2022-04-05T10:57:14.669421Z","shell.execute_reply.started":"2022-04-05T10:57:14.652741Z","shell.execute_reply":"2022-04-05T10:57:14.668671Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q7.hint()\n#q7.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.720188Z","iopub.execute_input":"2022-04-05T10:57:14.720503Z","iopub.status.idle":"2022-04-05T10:57:14.723892Z","shell.execute_reply.started":"2022-04-05T10:57:14.720471Z","shell.execute_reply":"2022-04-05T10:57:14.72331Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 8.\n\nCreate a DataFrame `italian_wines` containing reviews of wines made in `Italy`. 
Hint: `reviews.country` equals what?","metadata":{}},{"cell_type":"code","source":"italian_wines = reviews.loc[reviews.country == 'Italy']\n\n# Check your answer\nq8.check()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.845959Z","iopub.execute_input":"2022-04-05T10:57:14.846596Z","iopub.status.idle":"2022-04-05T10:57:14.8838Z","shell.execute_reply.started":"2022-04-05T10:57:14.846558Z","shell.execute_reply":"2022-04-05T10:57:14.883204Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q8.hint()\n#q8.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:14.980435Z","iopub.execute_input":"2022-04-05T10:57:14.980852Z","iopub.status.idle":"2022-04-05T10:57:14.983924Z","shell.execute_reply.started":"2022-04-05T10:57:14.980814Z","shell.execute_reply":"2022-04-05T10:57:14.98328Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"## 9.\n\nCreate a DataFrame `top_oceania_wines` containing all reviews with at least 95 points (out of 100) for wines from Australia or New Zealand.","metadata":{}},{"cell_type":"code","source":"top_oceania_wines = reviews.loc[(reviews.country.isin(['Australia', 'New Zealand'])) & (reviews.points >= 95)]\n\n# Check your 
answer\nq9.check()\ntop_oceania_wines","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:15.119265Z","iopub.execute_input":"2022-04-05T10:57:15.119615Z","iopub.status.idle":"2022-04-05T10:57:15.153807Z","shell.execute_reply.started":"2022-04-05T10:57:15.119581Z","shell.execute_reply":"2022-04-05T10:57:15.15317Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"#q9.hint()\n#q9.solution()","metadata":{"execution":{"iopub.status.busy":"2022-04-05T10:57:15.330681Z","iopub.execute_input":"2022-04-05T10:57:15.331107Z","iopub.status.idle":"2022-04-05T10:57:15.335065Z","shell.execute_reply.started":"2022-04-05T10:57:15.331068Z","shell.execute_reply":"2022-04-05T10:57:15.334284Z"},"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Keep going\n\nMove on to learn about **[summary functions and maps](https://www.kaggle.com/residentmario/summary-functions-and-maps)**.","metadata":{}},{"cell_type":"markdown","source":"---\n\n\n\n\n*Have questions or comments? Visit the [course discussion forum](https://www.kaggle.com/learn/pandas/discussion) to chat with other learners.*","metadata":{}}]} |
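
The `loc`/`iloc` "gotcha" quoted in exercise 7 can be checked on a small frame. The following is a minimal sketch, not part of the committed notebook: `toy` and its values are made up for illustration and stand in for `reviews`, assuming only a default integer index like the one the exercise relies on.

```python
import pandas as pd

# Hypothetical stand-in for `reviews`: ten made-up rows with the default 0..9 index.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
    "points": [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
})

# iloc slices by position: the end of the range is excluded, as in plain Python.
by_position = toy.iloc[0:5]

# loc slices by label: both endpoints are included.
by_label_inclusive = toy.loc[0:5]

# To get the same five rows with loc, stop one label earlier.
by_label_matched = toy.loc[0:4]

print(len(by_position), len(by_label_inclusive), len(by_label_matched))
```

With a default integer index, positions and labels coincide, which is why `reviews.loc[:99, a]` in exercise 7 selects the first 100 records. On a frame whose index is not `0, 1, 2, ...`, the two accessors would generally select different rows.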