diff --git a/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png b/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png new file mode 100644 index 000000000..0cbe43657 Binary files /dev/null and b/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png differ diff --git a/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png b/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png deleted file mode 100644 index 0304f6c65..000000000 Binary files a/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png and /dev/null differ diff --git a/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png b/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png deleted file mode 100644 index b2441a397..000000000 Binary files a/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png and /dev/null differ diff --git a/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png b/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png new file mode 100644 index 000000000..3dc94794f Binary files /dev/null and b/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png differ diff --git a/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png b/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png new file mode 100644 index 000000000..426535975 Binary files /dev/null and b/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png differ diff --git a/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png b/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png deleted file mode 100644 index 80ca184f9..000000000 Binary files a/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png and /dev/null differ diff 
--git a/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png b/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png new file mode 100644 index 000000000..5f4eb0c63 Binary files /dev/null and b/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png differ diff --git a/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png b/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png deleted file mode 100644 index a6347a4ba..000000000 Binary files a/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png and /dev/null differ diff --git a/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png b/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png deleted file mode 100644 index 5f6e3c818..000000000 Binary files a/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png and /dev/null differ diff --git a/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png b/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png new file mode 100644 index 000000000..0a50d3c4c Binary files /dev/null and b/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png differ diff --git a/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png b/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png deleted file mode 100644 index a2770f39f..000000000 Binary files a/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png and /dev/null differ diff --git a/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png b/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png new file mode 100644 index 000000000..7a7e718bc Binary files /dev/null and b/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png differ diff --git 
a/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png b/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png deleted file mode 100644 index d5bf469dd..000000000 Binary files a/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png and /dev/null differ diff --git a/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png b/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png new file mode 100644 index 000000000..4828dff23 Binary files /dev/null and b/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png differ diff --git a/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png b/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png deleted file mode 100644 index 81052be0f..000000000 Binary files a/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png and /dev/null differ diff --git a/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png b/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png new file mode 100644 index 000000000..e4695248e Binary files /dev/null and b/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png differ diff --git a/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png b/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png new file mode 100644 index 000000000..9cfb65f0a Binary files /dev/null and b/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png differ diff --git a/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png b/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png deleted file mode 100644 index 40ed13a87..000000000 Binary files a/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png and /dev/null differ diff --git 
a/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png b/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png deleted file mode 100644 index 1a08da07f..000000000 Binary files a/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png and /dev/null differ diff --git a/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png b/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png new file mode 100644 index 000000000..559ec0d4e Binary files /dev/null and b/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png differ diff --git a/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png b/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png deleted file mode 100644 index 64d0298f6..000000000 Binary files a/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png and /dev/null differ diff --git a/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png b/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png new file mode 100644 index 000000000..818820552 Binary files /dev/null and b/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png differ diff --git a/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png b/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png new file mode 100644 index 000000000..25158ba0a Binary files /dev/null and b/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png differ diff --git a/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png b/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png deleted file mode 100644 index 776aed298..000000000 Binary files a/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png and /dev/null differ diff --git 
a/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png b/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png deleted file mode 100644 index fcc92febd..000000000 Binary files a/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png and /dev/null differ diff --git a/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png b/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png deleted file mode 100644 index 7b759e080..000000000 Binary files a/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png and /dev/null differ diff --git a/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png b/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png new file mode 100644 index 000000000..368f5d680 Binary files /dev/null and b/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png differ diff --git a/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png b/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png deleted file mode 100644 index 870a23ae1..000000000 Binary files a/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png and /dev/null differ diff --git a/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png b/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png new file mode 100644 index 000000000..81c0ccd66 Binary files /dev/null and b/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png differ diff --git a/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png b/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png deleted file mode 100644 index 4cecfd183..000000000 Binary files a/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png and /dev/null differ diff --git 
a/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png b/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png new file mode 100644 index 000000000..52efa7794 Binary files /dev/null and b/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png differ diff --git a/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png b/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png new file mode 100644 index 000000000..549e05e60 Binary files /dev/null and b/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png differ diff --git a/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png b/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png deleted file mode 100644 index 70b01f434..000000000 Binary files a/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png and /dev/null differ diff --git a/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png b/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png new file mode 100644 index 000000000..48b617e54 Binary files /dev/null and b/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png differ diff --git a/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png b/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png new file mode 100644 index 000000000..149a0b9d9 Binary files /dev/null and b/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png differ diff --git a/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png b/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png new file mode 100644 index 000000000..795acb58e Binary files /dev/null and b/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png differ diff --git 
a/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png b/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png new file mode 100644 index 000000000..74f4f9056 Binary files /dev/null and b/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png differ diff --git a/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png b/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png deleted file mode 100644 index 0c414c36d..000000000 Binary files a/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png and /dev/null differ diff --git a/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png b/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png deleted file mode 100644 index e88a5053c..000000000 Binary files a/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png and /dev/null differ diff --git a/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png b/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png new file mode 100644 index 000000000..915ae8bef Binary files /dev/null and b/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png differ diff --git a/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png b/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png deleted file mode 100644 index 4c1b3e9a7..000000000 Binary files a/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png and /dev/null differ diff --git a/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png b/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png deleted file mode 100644 index 4e6ecd577..000000000 Binary files a/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png and /dev/null differ diff --git 
a/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png b/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png deleted file mode 100644 index 23df08a5f..000000000 Binary files a/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png and /dev/null differ diff --git a/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png b/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png deleted file mode 100644 index 98f75f0b0..000000000 Binary files a/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png and /dev/null differ diff --git a/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png b/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png deleted file mode 100644 index 3e9aa1f03..000000000 Binary files a/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png and /dev/null differ diff --git a/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png b/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png deleted file mode 100644 index 93f6ab442..000000000 Binary files a/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png and /dev/null differ diff --git a/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png b/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png deleted file mode 100644 index 68c34052b..000000000 Binary files a/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png and /dev/null differ diff --git a/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png b/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png new file mode 100644 index 000000000..c21746c56 Binary files /dev/null and b/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png differ diff --git 
a/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png b/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png deleted file mode 100644 index 1db5c9138..000000000 Binary files a/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png and /dev/null differ diff --git a/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png b/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png deleted file mode 100644 index 19d359efc..000000000 Binary files a/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png and /dev/null differ diff --git a/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png b/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png new file mode 100644 index 000000000..ba8cc2a69 Binary files /dev/null and b/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png differ diff --git a/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png b/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png new file mode 100644 index 000000000..a668d265b Binary files /dev/null and b/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png differ diff --git a/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png b/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png deleted file mode 100644 index 38d2cda34..000000000 Binary files a/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png and /dev/null differ diff --git a/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png b/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png new file mode 100644 index 000000000..994bcc37c Binary files /dev/null and b/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png differ diff --git 
a/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png b/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png new file mode 100644 index 000000000..6a041498b Binary files /dev/null and b/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png differ diff --git a/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png b/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png new file mode 100644 index 000000000..bb9e8c512 Binary files /dev/null and b/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png differ diff --git a/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png b/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png new file mode 100644 index 000000000..f8c0f6d42 Binary files /dev/null and b/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png differ diff --git a/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png b/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png deleted file mode 100644 index c605fb6a8..000000000 Binary files a/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png and /dev/null differ diff --git a/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png b/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png new file mode 100644 index 000000000..253ea7fef Binary files /dev/null and b/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png differ diff --git a/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png b/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png new file mode 100644 index 000000000..6dcad1aa5 Binary files /dev/null and b/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png differ diff --git 
a/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png b/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png new file mode 100644 index 000000000..450d4182a Binary files /dev/null and b/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png differ diff --git a/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png b/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png new file mode 100644 index 000000000..3317e4d1f Binary files /dev/null and b/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png differ diff --git a/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png b/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png deleted file mode 100644 index 14a781a1e..000000000 Binary files a/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png and /dev/null differ diff --git a/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png b/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png new file mode 100644 index 000000000..c3fe7189b Binary files /dev/null and b/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png differ diff --git a/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png b/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png new file mode 100644 index 000000000..2b5f2b4b0 Binary files /dev/null and b/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png differ diff --git a/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png b/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png deleted file mode 100644 index ac99797b1..000000000 Binary files a/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png and /dev/null differ diff --git 
a/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png b/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png new file mode 100644 index 000000000..5a735250c Binary files /dev/null and b/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png differ diff --git a/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png b/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png deleted file mode 100644 index 462db4023..000000000 Binary files a/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png and /dev/null differ diff --git a/_images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png b/_images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png similarity index 82% rename from _images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png rename to _images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png index ced5f2c6f..aa42d95ca 100644 Binary files a/_images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png and b/_images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png differ diff --git a/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png b/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png deleted file mode 100644 index debadc870..000000000 Binary files a/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png and /dev/null differ diff --git a/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png b/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png deleted file mode 100644 index fed347835..000000000 Binary files a/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png and /dev/null differ diff --git a/_sources/python_scripts/cross_validation_stratification.py 
b/_sources/python_scripts/cross_validation_stratification.py
index ad39bdb3f..d1f8abc95 100644
--- a/_sources/python_scripts/cross_validation_stratification.py
+++ b/_sources/python_scripts/cross_validation_stratification.py
@@ -36,10 +36,11 @@
 model = make_pipeline(StandardScaler(), LogisticRegression())
 
 # %% [markdown]
-# Once we created our model, we will use the cross-validation framework to
-# evaluate it. We will use the `KFold` cross-validation strategy. We will define
-# a dataset with nine samples and repeat the cross-validation three times (i.e.
-# `n_splits`).
+# Once the model is created, we can evaluate it using cross-validation. We start
+# by using the `KFold` strategy.
+#
+# Let's review how this strategy works. For this purpose, we define a dataset
+# with nine samples and split the dataset into three folds (i.e. `n_splits=3`).
 
 # %%
 import numpy as np
@@ -51,12 +52,12 @@
     print("TRAIN:", train_index, "TEST:", test_index)
 
 # %% [markdown]
-# By defining three splits, we will use three samples for testing and six for
-# training each time. `KFold` does not shuffle by default. It means that it will
-# select the three first samples for the testing set at the first split, then
-# the next three samples for the second split, and the three next for the
-# last split. In the end, all samples have been used in testing at least once
-# among the different splits.
+# By defining three splits, we use three samples (1-fold) for testing and six
+# (2-folds) for training each time. `KFold` does not shuffle by default. It
+# means that the first three samples are selected for the testing set at the
+# first split, then the next three samples for the second split, and the
+# last three for the last split. In the end, all samples have been used in
+# testing at least once among the different splits.
 #
 # Now, let's apply this strategy to check the generalization performance of our
 # model.
@@ -73,8 +74,8 @@
 
 # %% [markdown]
 # It is a real surprise that our model cannot correctly classify any sample in
-# any cross-validation split. We will now check our target's value to understand
-# the issue.
+# any cross-validation split. We now check our target's value to understand the
+# issue.
 
 # %%
 import matplotlib.pyplot as plt
@@ -86,18 +87,17 @@
 _ = plt.title("Class value in target y")
 
 # %% [markdown]
-# We see that the target vector `target` is ordered. It will have some
-# unexpected consequences when using the `KFold` cross-validation. To illustrate
-# the consequences, we will show the class count in each fold of the
-# cross-validation in the train and test set.
+# We see that the target vector `target` is ordered. This has some unexpected
+# consequences when using the `KFold` cross-validation. To illustrate the
+# consequences, we show the class count in each fold of the cross-validation in
+# the train and test set.
 #
 # Let's compute the class counts for both the training and testing sets using
 # the `KFold` cross-validation, and plot this information in a bar plot.
 #
-# We will iterate given the number of split and check how many samples of each
-# are present in the training and testing set. We will store the information
-# into two distincts lists; one for the training set and one for the testing
-# set.
+# We iterate over the splits and check how many samples of each class are
+# present in the training and testing set. We then store the information into
+# two distinct lists: one for the training set and one for the testing set.
 
 # %%
 import pandas as pd
@@ -114,8 +114,8 @@
     test_cv_counts.append(target_test.value_counts())
 
 # %% [markdown]
-# To plot the information on a single figure, we will concatenate the
-# information regarding the fold within the same dataset.
+# To plot the information on a single figure, we concatenate the information
+# regarding the fold within the same dataset.
 
 # %%
 train_cv_counts = pd.concat(
@@ -138,13 +138,13 @@
 train_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts")
 
 # %%
 test_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Test set")
+_ = plt.title("Test set class counts")
 
 # %% [markdown]
 # We can confirm that in each fold, only two of the three classes are present in
@@ -168,7 +168,7 @@
 # 90%. Now that we solved our first issue, it would be interesting to check if
 # the class frequency in the training and testing set is equal to our original
 # set's class frequency. It would ensure that we are training and testing our
-# model with a class distribution that we will encounter in production.
+# model with a class distribution that we would encounter in production.
 
 # %%
 train_cv_counts = []
@@ -191,13 +191,13 @@
 train_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts\n(with shuffling)")
 
 # %%
 test_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Test set")
+_ = plt.title("Test set class counts\n(with shuffling)")
 
 # %% [markdown]
 # We see that neither the training nor the testing sets have the same class
@@ -242,18 +242,27 @@
 train_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts\n(with stratifying)")
 
 # %%
 test_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Test set")
+_ = plt.title("Test set class counts\n(with stratifying)")
 
 # %% [markdown]
 # In this case, we observe that the class counts are very close both in the
 # train set and the test set. The difference is due to the small number of
 # samples in the iris dataset.
 #
-# In conclusion, this is a good practice to use stratification within the
-# cross-validation framework when dealing with a classification problem.
+# In other words, stratifying is more effective than just shuffling when it
+# comes to making sure that the distributions of classes in all the folds are
+# representative of the entire dataset. As training and testing folds have
+# similar class distributions, stratifying leads to a more realistic measure of
+# the model’s ability to generalize. This is especially important when the
+# performance metrics depend on the proportion of the positive class, as we will
+# see in a future notebook.
+#
+# The interested reader can learn about other stratified cross-validation
+# techniques in the [scikit-learn user
+# guide](https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators-with-stratification-based-on-class-labels).
diff --git a/appendix/notebook_timings.html b/appendix/notebook_timings.html
index d84635b8a..4b470dcd2 100644
--- a/appendix/notebook_timings.html
+++ b/appendix/notebook_timings.html
@@ -668,555 +668,555 @@

 Notebook timings
 python_scripts/01_tabular_data_exploration
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-9.52
+7.58
 python_scripts/01_tabular_data_exploration_ex_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-0.82
+0.75
 python_scripts/01_tabular_data_exploration_sol_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-3.73
+3.78
 python_scripts/02_numerical_pipeline_cross_validation
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-2.08
+2.19
 python_scripts/02_numerical_pipeline_ex_00
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-1.65
+1.55
 python_scripts/02_numerical_pipeline_ex_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-1.51
+1.67
 python_scripts/02_numerical_pipeline_hands_on
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-1.92
+1.84
 python_scripts/02_numerical_pipeline_introduction
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-5.01
+4.99
 python_scripts/02_numerical_pipeline_scaling
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-3.3
+3.19
 python_scripts/02_numerical_pipeline_sol_00
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-5.51
+5.28
 python_scripts/02_numerical_pipeline_sol_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-1.79
+1.58
 python_scripts/03_categorical_pipeline
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-2.98
+2.82
 python_scripts/03_categorical_pipeline_column_transformer
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-4.38
+4.16
 python_scripts/03_categorical_pipeline_ex_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-1.69
+1.5
 python_scripts/03_categorical_pipeline_ex_02
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-6.22
+5.82
 python_scripts/03_categorical_pipeline_sol_01
-2024-05-17 08:43
+2024-05-17 08:45
 cache
-4.08
+4.53
 python_scripts/03_categorical_pipeline_sol_02
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-27.34
+27.14
 python_scripts/03_categorical_pipeline_visualization
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-1.97
+1.82
 python_scripts/cross_validation_baseline
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-9.92
+9.71
 python_scripts/cross_validation_ex_01
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-1.16
+1.3
 python_scripts/cross_validation_ex_02
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-1.32
+1.12
 python_scripts/cross_validation_grouping
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-6.28
+6.01
 python_scripts/cross_validation_learning_curve
-2024-05-17 08:44
+2024-05-17 08:46
 cache
-12.75
+12.59
 python_scripts/cross_validation_nested
-2024-05-17 08:45
+2024-05-17 08:47
 cache
-20.76
+20.96
 python_scripts/cross_validation_sol_01
-2024-05-17 08:45
+2024-05-17 08:47
 cache
-10.23
+10.16
 python_scripts/cross_validation_sol_02
-2024-05-17 08:45
+2024-05-17 08:47
 cache
-7.78
+7.89
 python_scripts/cross_validation_stratification
-2024-05-17 08:45
+2024-05-17 08:47
 cache

-

3.18

+

3.02

python_scripts/cross_validation_time

-

2024-05-17 08:45

+

2024-05-17 08:47

cache

-

6.19

+

6.1

python_scripts/cross_validation_train_test

-

2024-05-17 08:45

+

2024-05-17 08:47

cache

-

10.34

+

10.27

python_scripts/cross_validation_validation_curve

-

2024-05-17 08:46

+

2024-05-17 08:48

cache

-

18.6

+

18.11

python_scripts/datasets_adult_census

-

2024-05-17 08:46

+

2024-05-17 08:48

cache

-

2.94

+

2.78

python_scripts/datasets_ames_housing

-

2024-05-17 08:46

+

2024-05-17 08:48

cache

-

9.5

+

9.04

python_scripts/datasets_bike_rides

-

2024-05-17 08:46

+

2024-05-17 08:48

cache

-

9.26

+

8.88

python_scripts/datasets_blood_transfusion

-

2024-05-17 08:46

+

2024-05-17 08:48

cache

-

5.83

+

5.7

python_scripts/datasets_california_housing

-

2024-05-17 08:47

+

2024-05-17 08:48

cache

-

15.09

+

14.99

python_scripts/dev_features_importance

-

2024-05-17 08:47

+

2024-05-17 08:49

cache

-

53.73

+

53.47

python_scripts/ensemble_adaboost

-

2024-05-17 08:47

+

2024-05-17 08:49

cache

-

3.11

+

3.14

python_scripts/ensemble_bagging

-

2024-05-17 08:48

+

2024-05-17 08:49

cache

4.76

python_scripts/ensemble_ex_01

-

2024-05-17 08:48

+

2024-05-17 08:49

cache

-

1.65

+

1.5

python_scripts/ensemble_ex_02

-

2024-05-17 08:48

+

2024-05-17 08:49

cache

-

1.58

+

1.61

python_scripts/ensemble_ex_03

-

2024-05-17 08:48

+

2024-05-17 08:49

cache

-

1.62

+

1.45

python_scripts/ensemble_ex_04

-

2024-05-17 08:48

+

2024-05-17 08:50

cache

-

1.61

+

1.48

python_scripts/ensemble_gradient_boosting

-

2024-05-17 08:49

+

2024-05-17 08:51

cache

-

63.09

+

63.07

python_scripts/ensemble_hist_gradient_boosting

-

2024-05-17 08:49

+

2024-05-17 08:51

cache

-

41.73

+

41.13

python_scripts/ensemble_hyperparameters

-

2024-05-17 08:50

+

2024-05-17 08:52

cache

-

58.14

+

57.66

python_scripts/ensemble_introduction

-

2024-05-17 08:51

+

2024-05-17 08:53

cache

-

23.07

+

22.77

python_scripts/ensemble_random_forest

-

2024-05-17 08:51

+

2024-05-17 08:53

cache

-

21.45

+

21.28

python_scripts/ensemble_sol_01

-

2024-05-17 08:51

+

2024-05-17 08:53

cache

-

21.19

+

20.39

python_scripts/ensemble_sol_02

-

2024-05-17 08:52

+

2024-05-17 08:53

cache

-

2.23

+

2.21

python_scripts/ensemble_sol_03

-

2024-05-17 08:53

+

2024-05-17 08:54

cache

-

60.68

+

65.12

python_scripts/ensemble_sol_04

-

2024-05-17 08:53

+

2024-05-17 08:55

cache

-

40.78

+

40.4

python_scripts/feature_selection_ex_01

-

2024-05-17 08:53

+

2024-05-17 08:55

cache

-

1.5

+

1.48

python_scripts/feature_selection_introduction

-

2024-05-17 08:53

+

2024-05-17 08:55

cache

-

12.71

+

12.5

python_scripts/feature_selection_limitation_model

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

45.12

+

44.74

python_scripts/feature_selection_sol_01

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

5.96

+

6.07

python_scripts/linear_models_ex_01

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

1.3

+

1.33

python_scripts/linear_models_ex_02

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

1.2

+

1.16

python_scripts/linear_models_ex_03

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

1.23

+

1.22

python_scripts/linear_models_ex_04

-

2024-05-17 08:54

+

2024-05-17 08:56

cache

-

2.14

+

2.13

python_scripts/linear_models_feature_engineering_classification

-

2024-05-17 08:55

+

2024-05-17 08:56

cache

-

7.06

+

7.12

python_scripts/linear_models_regularization

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

9.13

+

9.05

python_scripts/linear_models_sol_01

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

2.0

+

2.15

python_scripts/linear_models_sol_02

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

6.08

+

5.86

python_scripts/linear_models_sol_03

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

15.12

+

15.27

python_scripts/linear_models_sol_04

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

5.44

+

5.21

python_scripts/linear_regression_in_sklearn

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

2.28

+

2.29

python_scripts/linear_regression_non_linear_link

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

3.51

+

3.52

python_scripts/linear_regression_without_sklearn

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

2.68

+

2.64

python_scripts/logistic_regression

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

2.96

+

3.12

python_scripts/metrics_classification

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

3.12

+

2.93

python_scripts/metrics_ex_01

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

1.72

+

1.7

python_scripts/metrics_ex_02

-

2024-05-17 08:55

+

2024-05-17 08:57

cache

-

1.34

+

1.09

python_scripts/metrics_regression

-

2024-05-17 08:56

+

2024-05-17 08:57

cache

-

2.61

+

2.59

python_scripts/metrics_sol_01

-

2024-05-17 08:56

+

2024-05-17 08:57

cache

-

2.49

+

2.25

python_scripts/metrics_sol_02

-

2024-05-17 08:56

+

2024-05-17 08:57

cache

-

4.83

+

4.89

python_scripts/parameter_tuning_ex_02

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

1.76

+

1.63

python_scripts/parameter_tuning_ex_03

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

1.51

+

1.46

python_scripts/parameter_tuning_grid_search

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

10.29

+

10.34

python_scripts/parameter_tuning_manual

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

4.54

+

4.4

python_scripts/parameter_tuning_nested

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

24.96

+

24.99

python_scripts/parameter_tuning_parallel_plot

-

2024-05-17 08:56

+

2024-05-17 08:58

cache

-

2.89

+

2.78

python_scripts/parameter_tuning_randomized_search

-

2024-05-17 08:57

+

2024-05-17 08:59

cache

-

40.18

+

25.58

python_scripts/parameter_tuning_sol_02

-

2024-05-17 08:57

+

2024-05-17 08:59

cache

-

5.95

+

6.23

python_scripts/parameter_tuning_sol_03

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

19.67

+

19.72

python_scripts/trees_classification

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

2.71

+

2.68

python_scripts/trees_dataset

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

2.93

+

3.0

python_scripts/trees_ex_01

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

1.8

+

1.48

python_scripts/trees_ex_02

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

1.29

+

1.11

python_scripts/trees_hyperparameters

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

5.42

+

5.3

python_scripts/trees_regression

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

3.16

+

2.91

python_scripts/trees_sol_01

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

2.7

+

2.67

python_scripts/trees_sol_02

-

2024-05-17 08:58

+

2024-05-17 08:59

cache

-

2.37

+

2.47

diff --git a/python_scripts/02_numerical_pipeline_cross_validation.html b/python_scripts/02_numerical_pipeline_cross_validation.html index bbd4f1e5d..fb915da54 100644 --- a/python_scripts/02_numerical_pipeline_cross_validation.html +++ b/python_scripts/02_numerical_pipeline_cross_validation.html @@ -799,12 +799,12 @@

The need for cross-validation -
CPU times: user 464 ms, sys: 271 ms, total: 735 ms
-Wall time: 413 ms
+
CPU times: user 481 ms, sys: 238 ms, total: 719 ms
+Wall time: 403 ms
 
-
{'fit_time': array([0.0630393 , 0.05985069, 0.05934858, 0.05829835, 0.05691195]),
- 'score_time': array([0.01378393, 0.01406264, 0.01310515, 0.01331496, 0.01302361]),
+
{'fit_time': array([0.06043458, 0.05871677, 0.05862284, 0.05633068, 0.05741453]),
+ 'score_time': array([0.01347184, 0.01342702, 0.01305532, 0.01317239, 0.01284552]),
  'test_score': array([0.79557785, 0.80049135, 0.79965192, 0.79873055, 0.80456593])}
 
diff --git a/python_scripts/02_numerical_pipeline_scaling.html b/python_scripts/02_numerical_pipeline_scaling.html index 4b67bfa6c..fe3747a24 100644 --- a/python_scripts/02_numerical_pipeline_scaling.html +++ b/python_scripts/02_numerical_pipeline_scaling.html @@ -2067,7 +2067,7 @@

Model fitting with preprocessing -
The accuracy using a LogisticRegression is 0.807 with a fitting time of 0.134 seconds in 60 iterations
+
The accuracy using a LogisticRegression is 0.807 with a fitting time of 0.138 seconds in 60 iterations
 
diff --git a/python_scripts/03_categorical_pipeline.html b/python_scripts/03_categorical_pipeline.html index f6d7387b6..80cc61665 100644 --- a/python_scripts/03_categorical_pipeline.html +++ b/python_scripts/03_categorical_pipeline.html @@ -2078,8 +2078,8 @@

Evaluate our predictive pipeline -
{'fit_time': array([0.18344617, 0.16978073, 0.1794219 , 0.18088317, 0.17538881]),
- 'score_time': array([0.02263308, 0.0238204 , 0.02229023, 0.02480221, 0.02325177]),
+
{'fit_time': array([0.1810813 , 0.16806316, 0.17796063, 0.18232203, 0.16931438]),
+ 'score_time': array([0.02264929, 0.02345228, 0.02224159, 0.02423429, 0.02271461]),
  'test_score': array([0.83232675, 0.83570478, 0.82831695, 0.83292383, 0.83497133])}
 
diff --git a/python_scripts/03_categorical_pipeline_column_transformer.html b/python_scripts/03_categorical_pipeline_column_transformer.html index a15f2b151..5af0c93a0 100644 --- a/python_scripts/03_categorical_pipeline_column_transformer.html +++ b/python_scripts/03_categorical_pipeline_column_transformer.html @@ -1571,8 +1571,8 @@

Evaluation of the model with cross-validation -
{'fit_time': array([0.25613689, 0.25960398, 0.22366261, 0.24478149, 0.27019453]),
- 'score_time': array([0.0276804 , 0.0271585 , 0.02736831, 0.02919555, 0.02713871]),
+
{'fit_time': array([0.25266504, 0.25575709, 0.22375464, 0.24230623, 0.26495242]),
+ 'score_time': array([0.02791929, 0.0277667 , 0.02767706, 0.02842665, 0.02699661]),
  'test_score': array([0.85116184, 0.84993346, 0.8482801 , 0.85257985, 0.85544636])}
 
@@ -1644,8 +1644,8 @@

Fitting a more powerful model -
CPU times: user 689 ms, sys: 11.7 ms, total: 701 ms
-Wall time: 700 ms
+
CPU times: user 673 ms, sys: 11.9 ms, total: 685 ms
+Wall time: 684 ms
 
@@ -1657,7 +1657,7 @@

Fitting a more powerful model -
-
-
The mean cross-validation accuracy is: 0.874 ± 0.002 with a fitting time of 4.252
+
The mean cross-validation accuracy is: 0.873 ± 0.002 with a fitting time of 4.264
 
@@ -835,7 +835,7 @@

Scaling numerical features -
The mean cross-validation accuracy is: 0.874 ± 0.003 with a fitting time of 4.295
+
The mean cross-validation accuracy is: 0.873 ± 0.002 with a fitting time of 4.257
 
@@ -891,7 +891,7 @@

One-hot encoding of categorical variables -
The mean cross-validation accuracy is: 0.873 ± 0.003 with a fitting time of 17.117
+
The mean cross-validation accuracy is: 0.873 ± 0.003 with a fitting time of 16.863
 
diff --git a/python_scripts/cross_validation_baseline.html b/python_scripts/cross_validation_baseline.html index ec0c1360f..217fb1cc4 100644 --- a/python_scripts/cross_validation_baseline.html +++ b/python_scripts/cross_validation_baseline.html @@ -743,13 +743,13 @@

Comparing model performance with a simple baseline
count    30.000000
-mean     45.781621
-std       1.165621
-min      43.343258
-25%      44.934040
-50%      45.870101
-75%      46.681409
-max      48.122259
+mean     45.739021
+std       1.199512
+min      43.009317
+25%      44.852893
+50%      45.789081
+75%      46.483567
+max      48.333765
 Name: Decision tree regressor, dtype: float64
 
@@ -826,152 +826,152 @@

Comparing model performance with a simple baseline
0 - 46.849316 + 46.498200 90.713153
1 - 47.060510 + 46.462715 90.539353
2 - 44.386111 + 44.428989 91.941912
3 - 43.792578 + 43.865327 90.213912
4 - 47.649948 + 48.333765 92.015862
5 - 44.809461 + 45.125717 90.542490
6 - 44.321769 + 44.725138 89.757566
7 - 44.841198 + 44.901204 92.477244
8 - 44.978043 + 44.532377 90.947952
9 - 45.148078 + 45.114938 91.991373
10 - 46.579575 + 46.760074 92.023571
11 - 46.405369 + 46.490518 90.556965
12 - 45.550643 + 45.809124 91.539567
13 - 45.250819 + 44.977102 91.185225
14 - 46.994709 + 47.010698 92.298971
15 - 45.011550 + 44.646918 91.084639
16 - 46.329916 + 46.440914 90.984471
17 - 46.915084 + 46.380367 89.981744
18 - 44.919372 + 44.836789 90.547140
19 - 47.127851 + 47.515037 89.820219
20 - 43.343258 + 43.009317 91.768721
21 - 45.316584 + 46.358493 92.305556
22 - 46.297425 + 45.232018 90.503017
23 - 46.715354 + 46.545270 92.147974
24 - 46.357748 + 46.236355 91.386320
25 - 46.003656 + 45.769037 90.815660
26 - 44.587669 + 44.745129 92.216574
27 - 46.046236 + 46.231509 90.107460
28 - 45.736545 + 45.337221 90.620318
29 - 48.122259 + 47.850384 91.165331
@@ -992,7 +992,7 @@

Comparing model performance with a simple baseline

-../_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png +../_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png

We see that the generalization performance of our decision tree is far from diff --git a/python_scripts/cross_validation_grouping.html b/python_scripts/cross_validation_grouping.html index 59c04e0ad..f651a35b8 100644 --- a/python_scripts/cross_validation_grouping.html +++ b/python_scripts/cross_validation_grouping.html @@ -753,7 +753,7 @@

Sample grouping -

The cross-validation testing error that uses the shuffling has less variance @@ -1014,7 +1014,7 @@

Sample grouping -../_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png +../_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png

As a conclusion, it is really important to take any sample grouping pattern diff --git a/python_scripts/cross_validation_learning_curve.html b/python_scripts/cross_validation_learning_curve.html index 966ee22eb..fb2720821 100644 --- a/python_scripts/cross_validation_learning_curve.html +++ b/python_scripts/cross_validation_learning_curve.html @@ -794,7 +794,7 @@

Learning curve -../_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png +../_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png

Looking at the training error alone, we see that we get an error of 0 k$. It diff --git a/python_scripts/cross_validation_sol_01.html b/python_scripts/cross_validation_sol_01.html index e6c38c12f..c0757b1a0 100644 --- a/python_scripts/cross_validation_sol_01.html +++ b/python_scripts/cross_validation_sol_01.html @@ -789,62 +789,62 @@

📃 Solution for Exercise M2.01 -../_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png +../_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png

We see that using strategy="stratified", the results are much worse than diff --git a/python_scripts/cross_validation_stratification.html b/python_scripts/cross_validation_stratification.html index aa3f527ea..9ee7a97c6 100644 --- a/python_scripts/cross_validation_stratification.html +++ b/python_scripts/cross_validation_stratification.html @@ -718,10 +718,10 @@

StratificationKFold cross-validation strategy. We will define -a dataset with nine samples and repeat the cross-validation three times (i.e. -n_splits).

+

Once the model is created, we can evaluate it using cross-validation. We start +by using the KFold strategy.

+

Let’s review how this strategy works. For such purpose, we define a dataset +with nine samples and split the dataset into three folds (i.e. n_splits=3).

import numpy as np
@@ -742,12 +742,12 @@ 

StratificationKFold does not shuffle by default. It means that it will -select the three first samples for the testing set at the first split, then -the next three samples for the second split, and the three next for the -last split. In the end, all samples have been used in testing at least once -among the different splits.

+

By defining three splits, we use three samples (1-fold) for testing and six +(2-folds) for training each time. KFold does not shuffle by default. It +means that the first three samples are selected for the testing set at the +first split, then the next three samples for the second split, and the +last three for the last split. In the end, every sample has been used in +testing exactly once among the different splits.
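The behavior described above can be checked with a minimal sketch. It assumes a placeholder nine-sample array (`data_toy` and `test_folds` are hypothetical names, not taken from the notebook) and relies on `KFold` keeping its default `shuffle=False`.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical toy dataset with nine samples; the values are placeholders.
data_toy = np.arange(9).reshape(-1, 1)

# shuffle=False is the default, so the folds follow the sample order.
cv = KFold(n_splits=3)

test_folds = []
for train_index, test_index in cv.split(data_toy):
    # Each split keeps three samples for testing and six for training.
    test_folds.append(test_index.tolist())
    print("TRAIN:", train_index, "TEST:", test_index)
```

Without shuffling, the testing indices are contiguous blocks, so each sample lands in a testing set exactly once.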

Now, let’s apply this strategy to check the generalization performance of our model.

@@ -770,8 +770,8 @@

Stratification
-

We see that the target vector target is ordered. It will have some -unexpected consequences when using the KFold cross-validation. To illustrate -the consequences, we will show the class count in each fold of the -cross-validation in the train and test set.

+

We see that the target vector target is ordered. This has some unexpected +consequences when using the KFold cross-validation. To illustrate the +consequences, we show the class count in each fold of the cross-validation in +the train and test set.

Let’s compute the class counts for both the training and testing sets using the KFold cross-validation, and plot this information in a bar plot.

-

We will iterate given the number of split and check how many samples of each -are present in the training and testing set. We will store the information -into two distincts lists; one for the training set and one for the testing -set.

+

We iterate over the splits and check how many samples of each class are +present in the training and testing sets. We then store the information in +two distinct lists: one for the training set and one for the testing set.
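As a rough sketch of that counting loop, assuming the iris dataset loaded through `load_iris` and plain `collections.Counter` objects (the notebook itself may build its counts differently, for instance as a dataframe for plotting):

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold

# The iris target is ordered: 50 samples of class 0, then class 1, then class 2.
data, target = load_iris(return_X_y=True)
cv = KFold(n_splits=3)

# One list for the training sets and one for the testing sets.
train_cv_counts = []
test_cv_counts = []
for train_index, test_index in cv.split(data, target):
    train_cv_counts.append(Counter(target[train_index].tolist()))
    test_cv_counts.append(Counter(target[test_index].tolist()))

for fold, (train, test) in enumerate(zip(train_cv_counts, test_cv_counts)):
    print(f"Fold #{fold}: train={dict(train)} test={dict(test)}")
```

Because the target is ordered and `KFold` does not shuffle, each testing fold contains a single class that is absent from the matching training fold.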

We see that neither the training and testing sets have the same class @@ -1104,12 +1103,12 @@

Stratification
train_cv_counts.plot.bar()
 plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
 plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts\n(with stratifying)")
 

-../_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png +../_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png

-../_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png +../_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png

In this case, we observe that the class counts are very close both in the train set and the test set. The difference is due to the small number of samples in the iris dataset.

-

In conclusion, this is a good practice to use stratification within the -cross-validation framework when dealing with a classification problem.

+

In other words, stratifying is more effective than just shuffling when it +comes to making sure that the distributions of classes in all the folds are +representative of the entire dataset. As training and testing folds have +similar class distributions, stratifying leads to a more realistic measure of +the model’s ability to generalize. This is especially important when the +performance metrics depend on the proportion of the positive class, as we will +see in a future notebook.
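To make the comparison concrete, here is a small sketch contrasting the testing-fold class counts of `KFold` and `StratifiedKFold` on the iris dataset. The helper `count_test_classes` is a hypothetical name introduced only for this illustration.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, StratifiedKFold

data, target = load_iris(return_X_y=True)


def count_test_classes(cv):
    """Return the per-class sample counts of each testing fold."""
    return [
        dict(sorted(Counter(target[test_index].tolist()).items()))
        for _, test_index in cv.split(data, target)
    ]


# Hypothetical helper applied to both strategies, without shuffling.
kfold_counts = count_test_classes(KFold(n_splits=3))
stratified_counts = count_test_classes(StratifiedKFold(n_splits=3))

print("KFold:          ", kfold_counts)
print("StratifiedKFold:", stratified_counts)
```

With 50 samples per class and three folds, the stratified testing folds cannot be perfectly balanced, which is the small residual difference mentioned above.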

+

The interested reader can learn about other stratified cross-validation +techniques in the scikit-learn user +guide.

-
-
-