diff --git a/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png b/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png new file mode 100644 index 000000000..0cbe43657 Binary files /dev/null and b/_images/01e66fdece1c06d9d2ab36de01d04ac8d4eb1db75d9752af1f821e93bea1604e.png differ diff --git a/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png b/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png deleted file mode 100644 index 0304f6c65..000000000 Binary files a/_images/027d5a00539b4be3504a0168d4c00400f61aa083ba2a69dfc31e6ea8fef57d6b.png and /dev/null differ diff --git a/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png b/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png deleted file mode 100644 index b2441a397..000000000 Binary files a/_images/0522866330231fa2080e854e9d5fdbc6e63c618e7d2b6881af2bfb6eabe79a90.png and /dev/null differ diff --git a/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png b/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png new file mode 100644 index 000000000..3dc94794f Binary files /dev/null and b/_images/0ad8ea179db874928d64f02173c404c55183b9117f6c438a9fa6be030dfc5975.png differ diff --git a/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png b/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png new file mode 100644 index 000000000..426535975 Binary files /dev/null and b/_images/0b6958ae440d33aebbcd79653f1003606d4b3118fe5e1f3bb86006816a3e703a.png differ diff --git a/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png b/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png deleted file mode 100644 index 80ca184f9..000000000 Binary files a/_images/0c7550d535f0278175b8c1c49432fc2772621bd1cb93228ba19e3c566e32ae1a.png and /dev/null differ diff 
--git a/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png b/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png new file mode 100644 index 000000000..5f4eb0c63 Binary files /dev/null and b/_images/0fe01dcff6e97df32e78bb68708a64215a4a9bdf5112ad8146b0358cc0f62b55.png differ diff --git a/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png b/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png deleted file mode 100644 index a6347a4ba..000000000 Binary files a/_images/1027374749a31f12f13f0f993153c8cd74c24838f73f5d092005820a60d842df.png and /dev/null differ diff --git a/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png b/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png deleted file mode 100644 index 5f6e3c818..000000000 Binary files a/_images/116104ce35df99d16650676c641dd050503c92673eb5ea03715e120513cd3f0f.png and /dev/null differ diff --git a/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png b/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png new file mode 100644 index 000000000..0a50d3c4c Binary files /dev/null and b/_images/124ca54eb9e5f8a334a26e971836ff232a64d732c55089ffdc7d92ed1fcd948a.png differ diff --git a/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png b/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png deleted file mode 100644 index a2770f39f..000000000 Binary files a/_images/1937fc917f9ec711cabb21379467e2581c2d20c05ed7bb1762efa2a637a8d09f.png and /dev/null differ diff --git a/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png b/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png new file mode 100644 index 000000000..7a7e718bc Binary files /dev/null and b/_images/1e335f3000191c4aaa033f400119af5c3b9ccd86318ef3c9365f4262de5fdb68.png differ diff --git 
a/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png b/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png deleted file mode 100644 index d5bf469dd..000000000 Binary files a/_images/1e64029747a1ef073c224724d20b3e12ebf1ead6392d1004b2935ecadd3723bd.png and /dev/null differ diff --git a/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png b/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png new file mode 100644 index 000000000..4828dff23 Binary files /dev/null and b/_images/1ee8b865671a2ec2a79d7e3397c5d58cdcbd7475eff44511d0f699d8707382f8.png differ diff --git a/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png b/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png deleted file mode 100644 index 81052be0f..000000000 Binary files a/_images/1f50d1286f8f618a8d4b50066e2edaf4e537e35b44e900b17277493f67bd9daf.png and /dev/null differ diff --git a/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png b/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png new file mode 100644 index 000000000..e4695248e Binary files /dev/null and b/_images/208c57ab761416b84edfc78cde51afa79934fe92c36787f1d0b5917737f9d357.png differ diff --git a/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png b/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png new file mode 100644 index 000000000..9cfb65f0a Binary files /dev/null and b/_images/258959bcadd4c804a7311d08c7c7fc55e615d53c3b96dc817578d68a131c8842.png differ diff --git a/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png b/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png deleted file mode 100644 index 40ed13a87..000000000 Binary files a/_images/2fb0b0a4b02019d117adbabee3d5d0507a9c94d60fc5c58400149f32d481fe16.png and /dev/null differ diff --git 
a/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png b/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png deleted file mode 100644 index 1a08da07f..000000000 Binary files a/_images/363de2948bcaab1634dfcb8a787e2d6eb215733fb7260a5241a417937cb93207.png and /dev/null differ diff --git a/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png b/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png new file mode 100644 index 000000000..559ec0d4e Binary files /dev/null and b/_images/38adfa6c210945b5616ca3b825b0db39808af1cb6b4c133d9ff6faa4d02c8bed.png differ diff --git a/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png b/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png deleted file mode 100644 index 64d0298f6..000000000 Binary files a/_images/3a99f568b3347b7f38a410fc4b078ea630d1a785ae744a15d4c8eb61ab71d9f2.png and /dev/null differ diff --git a/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png b/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png new file mode 100644 index 000000000..818820552 Binary files /dev/null and b/_images/3e082a1a3664385fde73ceb17ed1b99d566808a050788c128d05b5820d5d9da7.png differ diff --git a/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png b/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png new file mode 100644 index 000000000..25158ba0a Binary files /dev/null and b/_images/426fe7b4cff8a531c7c310e5b714884e9d19f1d6deaf987ef8c71b8fa7f0c08d.png differ diff --git a/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png b/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png deleted file mode 100644 index 776aed298..000000000 Binary files a/_images/42b4a8274e883eb742e53763ba1f450032402d4e8e3a28ec87cbb78485593404.png and /dev/null differ diff --git 
a/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png b/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png deleted file mode 100644 index fcc92febd..000000000 Binary files a/_images/4498752f6f0677a4d3f997446130d7bfc11dd91048abbf8743290725450eb42f.png and /dev/null differ diff --git a/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png b/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png deleted file mode 100644 index 7b759e080..000000000 Binary files a/_images/52d8fb26eb9eb7c7d183411bb8bc681925cb28cee33462a27890b58679eda629.png and /dev/null differ diff --git a/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png b/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png new file mode 100644 index 000000000..368f5d680 Binary files /dev/null and b/_images/58b76fc1ccab386c7a34e3413fca8faa3393472db13d34ef91c7015dcbe5fbe7.png differ diff --git a/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png b/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png deleted file mode 100644 index 870a23ae1..000000000 Binary files a/_images/5a0443b022db383942a27c20acdd6b8c7ed0258039a05147d6891df5fdc5f52f.png and /dev/null differ diff --git a/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png b/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png new file mode 100644 index 000000000..81c0ccd66 Binary files /dev/null and b/_images/5fd74df20ad59564762e3cdd603f458571ae4794eac7e01b83f55c25fe8047bc.png differ diff --git a/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png b/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png deleted file mode 100644 index 4cecfd183..000000000 Binary files a/_images/62010fb41643ba01d43e8d7e5c0f5376497d8312bb172f3f19643e4dabfa78e3.png and /dev/null differ diff --git 
a/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png b/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png new file mode 100644 index 000000000..52efa7794 Binary files /dev/null and b/_images/64c942d42075f7c44b2a146ea90c6f9b41ce7824a1fa39babb9bb046fc9ec361.png differ diff --git a/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png b/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png new file mode 100644 index 000000000..549e05e60 Binary files /dev/null and b/_images/68d93051103d363584f935ee7294e702987106093edccc5ebee9e62f0c80fdfc.png differ diff --git a/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png b/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png deleted file mode 100644 index 70b01f434..000000000 Binary files a/_images/6e35dffb680833c1b6fed9ac74960ab1c19475f9df827ecd06318c7b954b6c77.png and /dev/null differ diff --git a/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png b/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png new file mode 100644 index 000000000..48b617e54 Binary files /dev/null and b/_images/6fe2431ab2860d68a5cbbd3b8bae1522332cf168bb5e477404facc6791d88f13.png differ diff --git a/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png b/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png new file mode 100644 index 000000000..149a0b9d9 Binary files /dev/null and b/_images/7827a94926af2af5a5913097aeb103df4d37efd4838fae9928ae6cce894ab514.png differ diff --git a/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png b/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png new file mode 100644 index 000000000..795acb58e Binary files /dev/null and b/_images/7b6ceb31af3987cc9d9bf2e29bb967316133e4c3999dfc9ac8a9b4f674914eff.png differ diff --git 
a/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png b/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png new file mode 100644 index 000000000..74f4f9056 Binary files /dev/null and b/_images/804f03ccd90f611cf09d0a2877ff4b38a034653f1561809ace702c35ddbeb643.png differ diff --git a/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png b/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png deleted file mode 100644 index 0c414c36d..000000000 Binary files a/_images/8727e9f63828f06e8362a1525dfc9eeafb49514a173961af9e1f12b49f15dbb8.png and /dev/null differ diff --git a/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png b/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png deleted file mode 100644 index e88a5053c..000000000 Binary files a/_images/903ca253ee93134930868fd7b7f6740bfdddb6017f4644af9c480e43ed4b67c1.png and /dev/null differ diff --git a/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png b/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png new file mode 100644 index 000000000..915ae8bef Binary files /dev/null and b/_images/92e36ced055bb12ff5e4930b06dc91a85c90c7a63a066571e6a79696032beb18.png differ diff --git a/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png b/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png deleted file mode 100644 index 4c1b3e9a7..000000000 Binary files a/_images/9590555d4c3e191e67e5e104078c6f7ce5036ad3eb6fad6d07023fd3a1e1cb80.png and /dev/null differ diff --git a/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png b/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png deleted file mode 100644 index 4e6ecd577..000000000 Binary files a/_images/969c0ed009e2ffa918a559366f1855ffb6e3e2992ab113f89dd262f3cacce599.png and /dev/null differ diff --git 
a/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png b/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png deleted file mode 100644 index 23df08a5f..000000000 Binary files a/_images/976653ac5d2b6a6282a8c46257fa3f0d30fc3317073b45c08cf5ae24dba79e9e.png and /dev/null differ diff --git a/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png b/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png deleted file mode 100644 index 98f75f0b0..000000000 Binary files a/_images/9a237d36bd97b228b1ebd0367fc6c50b8aacf9bb59c447cef28c1acd45dbf8ea.png and /dev/null differ diff --git a/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png b/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png deleted file mode 100644 index 3e9aa1f03..000000000 Binary files a/_images/9cd0c17f5a84349390570a10cfb7edabfefa8f6b1e2d21ad2ec1b8952a43b2d0.png and /dev/null differ diff --git a/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png b/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png deleted file mode 100644 index 93f6ab442..000000000 Binary files a/_images/9e703116f3f72d9227a61ed67899b9b1d77ef20a62f729a7c75202e0c7cd78bd.png and /dev/null differ diff --git a/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png b/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png deleted file mode 100644 index 68c34052b..000000000 Binary files a/_images/9fff407a25907bd41a1abd6a92210a69b26e39b6d9e72324c3faeb4260092f22.png and /dev/null differ diff --git a/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png b/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png new file mode 100644 index 000000000..c21746c56 Binary files /dev/null and b/_images/a02686147976a4974ba8e5701f00ef2d150f2c7a16c06c9e0bcb02ee6187f747.png differ diff --git 
a/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png b/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png deleted file mode 100644 index 1db5c9138..000000000 Binary files a/_images/a07028c89eb4196e8bd7e4d4666e80806c04e00602ce204cc6944fddb83b7e9c.png and /dev/null differ diff --git a/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png b/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png deleted file mode 100644 index 19d359efc..000000000 Binary files a/_images/a20a795700d05cb881645f8d4da00285671fdb94ac13116c389d399f5e75ed7b.png and /dev/null differ diff --git a/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png b/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png new file mode 100644 index 000000000..ba8cc2a69 Binary files /dev/null and b/_images/a731d1b3b5433b622817b78a3b695950dab7c395dd97b2a7786b42116069a906.png differ diff --git a/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png b/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png new file mode 100644 index 000000000..a668d265b Binary files /dev/null and b/_images/b2056fa661f8c25fdd8991d19559b9ea2c9f2e0de82302657b07b187b861a355.png differ diff --git a/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png b/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png deleted file mode 100644 index 38d2cda34..000000000 Binary files a/_images/b36df54e4cbfc212054ec9c356041684b296051613764d63dcd190509a585a9c.png and /dev/null differ diff --git a/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png b/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png new file mode 100644 index 000000000..994bcc37c Binary files /dev/null and b/_images/b5752918a7baa2d398fc96edf81ee6b215d72d9fdfb3610a52e52f97b6e63f4d.png differ diff --git 
a/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png b/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png new file mode 100644 index 000000000..6a041498b Binary files /dev/null and b/_images/c1b539edf1e6733760f4d74e58256b1f57414f27c01788d439ade461e98f39c4.png differ diff --git a/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png b/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png new file mode 100644 index 000000000..bb9e8c512 Binary files /dev/null and b/_images/c3d51314392ba6f390c6be666b7ab333bed32939a482b60843df13799983f5e7.png differ diff --git a/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png b/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png new file mode 100644 index 000000000..f8c0f6d42 Binary files /dev/null and b/_images/caed91b90bf1059987a6b8faaf83c6a78ab5c33a0805f2711ad205e0f6c1489d.png differ diff --git a/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png b/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png deleted file mode 100644 index c605fb6a8..000000000 Binary files a/_images/cf7a7673ca0704a9847d2c4ec02f69c554ce1e324bf452b8a8817c8f235fadd3.png and /dev/null differ diff --git a/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png b/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png new file mode 100644 index 000000000..253ea7fef Binary files /dev/null and b/_images/cfacddd703cf16c0c65beb747de7780e425df44ad48f8000c7ece25270bf240c.png differ diff --git a/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png b/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png new file mode 100644 index 000000000..6dcad1aa5 Binary files /dev/null and b/_images/d2dc1eaec0de1e6210091c07f1e1dbd9eb70b95593a74b392935c24d8f505242.png differ diff --git 
a/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png b/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png new file mode 100644 index 000000000..450d4182a Binary files /dev/null and b/_images/da7614d8d35733e9f32602a53fd2d6110b4a817ebc1b7c889cf783b7d4dc4e2d.png differ diff --git a/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png b/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png new file mode 100644 index 000000000..3317e4d1f Binary files /dev/null and b/_images/e64bf19e7fb3d9c1bad7a5b685cc4ccdd0ebee67898b1f4dd7b770db1d98e0ed.png differ diff --git a/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png b/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png deleted file mode 100644 index 14a781a1e..000000000 Binary files a/_images/ed062b50214a192a401ba32bd20cdcdcb82875d0a6b7aa23aa0918dd41c62494.png and /dev/null differ diff --git a/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png b/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png new file mode 100644 index 000000000..c3fe7189b Binary files /dev/null and b/_images/ed236e08f361b99be15fb0aebb313b4a614465f6a03b7f25ad4b8b98b38ef653.png differ diff --git a/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png b/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png new file mode 100644 index 000000000..2b5f2b4b0 Binary files /dev/null and b/_images/ee0e802a4a632e9671ffb3ae723408df8516630503c49c9ee2336efe6f39c9ea.png differ diff --git a/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png b/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png deleted file mode 100644 index ac99797b1..000000000 Binary files a/_images/eed795e853302faf5dce9966892359d12bb14142d8db73070de7029531c1f5b7.png and /dev/null differ diff --git 
a/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png b/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png new file mode 100644 index 000000000..5a735250c Binary files /dev/null and b/_images/f6775c9cfacd000140139bba907417d558bfbe112c49635f08a71427b9b14cd2.png differ diff --git a/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png b/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png deleted file mode 100644 index 462db4023..000000000 Binary files a/_images/f7c7861e22b2b54fd6dbdbba5ac136870baed455ff926a540fc2cb70d73a2f2e.png and /dev/null differ diff --git a/_images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png b/_images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png similarity index 82% rename from _images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png rename to _images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png index ced5f2c6f..aa42d95ca 100644 Binary files a/_images/b44bdaee47507fab314d6b806af33cb0478e2dc48cf2717353811942e003c15e.png and b/_images/f967e8e0388c93d375fb7565d7710df30536775bae32fba02bcffaf286210e22.png differ diff --git a/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png b/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png deleted file mode 100644 index debadc870..000000000 Binary files a/_images/fc51a1271daac754bf679fbfc4e72961c1b7cb54c5451f13fbae17f7e75c428f.png and /dev/null differ diff --git a/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png b/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png deleted file mode 100644 index fed347835..000000000 Binary files a/_images/fd04fad067c0f4fa7da82726bfb7e50570c6204ec1717fafba326d13a0274032.png and /dev/null differ diff --git a/_sources/python_scripts/cross_validation_stratification.py 
b/_sources/python_scripts/cross_validation_stratification.py index ad39bdb3f..d1f8abc95 100644 --- a/_sources/python_scripts/cross_validation_stratification.py +++ b/_sources/python_scripts/cross_validation_stratification.py @@ -36,10 +36,11 @@ model = make_pipeline(StandardScaler(), LogisticRegression()) # %% [markdown] -# Once we created our model, we will use the cross-validation framework to -# evaluate it. We will use the `KFold` cross-validation strategy. We will define -# a dataset with nine samples and repeat the cross-validation three times (i.e. -# `n_splits`). +# Once the model is created, we can evaluate it using cross-validation. We start +# by using the `KFold` strategy. +# +# Let's review how this strategy works. For this purpose, we define a dataset +# with nine samples and split the dataset into three folds (i.e. `n_splits=3`). # %% import numpy as np @@ -51,12 +52,12 @@ print("TRAIN:", train_index, "TEST:", test_index) # %% [markdown] -# By defining three splits, we will use three samples for testing and six for -# training each time. `KFold` does not shuffle by default. It means that it will -# select the three first samples for the testing set at the first split, then -# the next three samples for the second split, and the three next for the -# last split. In the end, all samples have been used in testing at least once -# among the different splits. +# By defining three splits, we use three samples (one fold) for testing and six +# (two folds) for training each time. `KFold` does not shuffle by default. It +# means that the first three samples are selected for the testing set at the +# first split, then the next three samples for the second split, and the +# last three for the last split. In the end, each sample has been used for +# testing exactly once across the different splits. # # Now, let's apply this strategy to check the generalization performance of our # model. 
@@ -73,8 +74,8 @@ # %% [markdown] # It is a real surprise that our model cannot correctly classify any sample in -# any cross-validation split. We will now check our target's value to understand -# the issue. +# any cross-validation split. We now check our target's value to understand the +# issue. # %% import matplotlib.pyplot as plt @@ -86,18 +87,17 @@ _ = plt.title("Class value in target y") # %% [markdown] -# We see that the target vector `target` is ordered. It will have some -# unexpected consequences when using the `KFold` cross-validation. To illustrate -# the consequences, we will show the class count in each fold of the -# cross-validation in the train and test set. +# We see that the target vector `target` is ordered. This has some unexpected +# consequences when using the `KFold` cross-validation. To illustrate the +# consequences, we show the class count in each fold of the cross-validation in +# the train and test set. # # Let's compute the class counts for both the training and testing sets using # the `KFold` cross-validation, and plot this information in a bar plot. # -# We will iterate given the number of split and check how many samples of each -# are present in the training and testing set. We will store the information -# into two distincts lists; one for the training set and one for the testing -# set. +# We iterate over the splits and check how many samples of each class are +# present in the training and testing sets. We then store the information in +# two distinct lists: one for the training set and one for the testing set. # %% import pandas as pd @@ -114,8 +114,8 @@ test_cv_counts.append(target_test.value_counts()) # %% [markdown] -# To plot the information on a single figure, we will concatenate the -# information regarding the fold within the same dataset. +# To plot the information on a single figure, we concatenate the information +# regarding the fold within the same dataset. 
# %% train_cv_counts = pd.concat( @@ -138,13 +138,13 @@ train_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Training set") +_ = plt.title("Training set class counts") # %% test_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Test set") +_ = plt.title("Test set class counts") # %% [markdown] # We can confirm that in each fold, only two of the three classes are present in @@ -168,7 +168,7 @@ # 90%. Now that we solved our first issue, it would be interesting to check if # the class frequency in the training and testing set is equal to our original # set's class frequency. It would ensure that we are training and testing our -# model with a class distribution that we will encounter in production. +# model with a class distribution that we would encounter in production. # %% train_cv_counts = [] @@ -191,13 +191,13 @@ train_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Training set") +_ = plt.title("Training set class counts\n(with shuffling)") # %% test_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Test set") +_ = plt.title("Test set class counts\n(with shuffling)") # %% [markdown] # We see that neither the training nor the testing sets have the same class @@ -242,18 +242,27 @@ train_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Training set") +_ = plt.title("Training set class counts\n(with stratifying)") # %% test_cv_counts.plot.bar() plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left") plt.ylabel("Count") -_ = plt.title("Test set") +_ = plt.title("Test set class counts\n(with stratifying)") # %% [markdown] # In this case, we observe that the class counts are very close both in the # train set and the test set. 
The difference is due to the small number of # samples in the iris dataset. # -# In conclusion, this is a good practice to use stratification within the -# cross-validation framework when dealing with a classification problem. +# In other words, stratifying is more effective than just shuffling when it +# comes to making sure that the distributions of classes in all the folds are +# representative of the entire dataset. As training and testing folds have +# similar class distributions, stratifying leads to a more realistic measure of +# the model’s ability to generalize. This is especially important when the +# performance metrics depend on the proportion of the positive class, as we will +# see in a future notebook. +# +# The interested reader can learn about other stratified cross-validation +# techniques in the [scikit-learn user +# guide](https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators-with-stratification-based-on-class-labels). diff --git a/appendix/notebook_timings.html b/appendix/notebook_timings.html index d84635b8a..4b470dcd2 100644 --- a/appendix/notebook_timings.html +++ b/appendix/notebook_timings.html @@ -668,555 +668,555 @@
2024-05-17 08:43
2024-05-17 08:45
cache
9.52
7.58
✅
2024-05-17 08:43
2024-05-17 08:45
cache
0.82
0.75
✅
2024-05-17 08:43
2024-05-17 08:45
cache
3.73
3.78
✅
2024-05-17 08:43
2024-05-17 08:45
cache
2.08
2.19
✅
2024-05-17 08:43
2024-05-17 08:45
cache
1.65
1.55
✅
2024-05-17 08:43
2024-05-17 08:45
cache
1.51
1.67
✅
2024-05-17 08:43
2024-05-17 08:45
cache
1.92
1.84
✅
2024-05-17 08:43
2024-05-17 08:45
cache
5.01
4.99
✅
2024-05-17 08:43
2024-05-17 08:45
cache
3.3
3.19
✅
2024-05-17 08:43
2024-05-17 08:45
cache
5.51
5.28
✅
2024-05-17 08:43
2024-05-17 08:45
cache
1.79
1.58
✅
2024-05-17 08:43
2024-05-17 08:45
cache
2.98
2.82
✅
2024-05-17 08:43
2024-05-17 08:45
cache
4.38
4.16
✅
2024-05-17 08:43
2024-05-17 08:45
cache
1.69
1.5
✅
2024-05-17 08:43
2024-05-17 08:45
cache
6.22
5.82
✅
2024-05-17 08:43
2024-05-17 08:45
cache
4.08
4.53
✅
2024-05-17 08:44
2024-05-17 08:46
cache
27.34
27.14
✅
2024-05-17 08:44
2024-05-17 08:46
cache
1.97
1.82
✅
2024-05-17 08:44
2024-05-17 08:46
cache
9.92
9.71
✅
2024-05-17 08:44
2024-05-17 08:46
cache
1.16
1.3
✅
2024-05-17 08:44
2024-05-17 08:46
cache
1.32
1.12
✅
2024-05-17 08:44
2024-05-17 08:46
cache
6.28
6.01
✅
2024-05-17 08:44
2024-05-17 08:46
cache
12.75
12.59
✅
2024-05-17 08:45
2024-05-17 08:47
cache
20.76
20.96
✅
2024-05-17 08:45
2024-05-17 08:47
cache
10.23
10.16
✅
2024-05-17 08:45
2024-05-17 08:47
cache
7.78
7.89
✅
2024-05-17 08:45
2024-05-17 08:47
cache
3.18
3.02
✅
2024-05-17 08:45
2024-05-17 08:47
cache
6.19
6.1
✅
2024-05-17 08:45
2024-05-17 08:47
cache
10.34
10.27
✅
2024-05-17 08:46
2024-05-17 08:48
cache
18.6
18.11
✅
2024-05-17 08:46
2024-05-17 08:48
cache
2.94
2.78
✅
2024-05-17 08:46
2024-05-17 08:48
cache
9.5
9.04
✅
2024-05-17 08:46
2024-05-17 08:48
cache
9.26
8.88
✅
2024-05-17 08:46
2024-05-17 08:48
cache
5.83
5.7
✅
2024-05-17 08:47
2024-05-17 08:48
cache
15.09
14.99
✅
2024-05-17 08:47
2024-05-17 08:49
cache
53.73
53.47
✅
2024-05-17 08:47
2024-05-17 08:49
cache
3.11
3.14
✅
2024-05-17 08:48
2024-05-17 08:49
cache
4.76
✅
2024-05-17 08:48
2024-05-17 08:49
cache
1.65
1.5
✅
2024-05-17 08:48
2024-05-17 08:49
cache
1.58
1.61
✅
2024-05-17 08:48
2024-05-17 08:49
cache
1.62
1.45
✅
2024-05-17 08:48
2024-05-17 08:50
cache
1.61
1.48
✅
2024-05-17 08:49
2024-05-17 08:51
cache
63.09
63.07
✅
2024-05-17 08:49
2024-05-17 08:51
cache
41.73
41.13
✅
2024-05-17 08:50
2024-05-17 08:52
cache
58.14
57.66
✅
2024-05-17 08:51
2024-05-17 08:53
cache
23.07
22.77
✅
2024-05-17 08:51
2024-05-17 08:53
cache
21.45
21.28
✅
2024-05-17 08:51
2024-05-17 08:53
cache
21.19
20.39
✅
2024-05-17 08:52
2024-05-17 08:53
cache
2.23
2.21
✅
2024-05-17 08:53
2024-05-17 08:54
cache
60.68
65.12
✅
2024-05-17 08:53
2024-05-17 08:55
cache
40.78
40.4
✅
2024-05-17 08:53
2024-05-17 08:55
cache
1.5
1.48
✅
2024-05-17 08:53
2024-05-17 08:55
cache
12.71
12.5
✅
2024-05-17 08:54
2024-05-17 08:56
cache
45.12
44.74
✅
2024-05-17 08:54
2024-05-17 08:56
cache
5.96
6.07
✅
2024-05-17 08:54
2024-05-17 08:56
cache
1.3
1.33
✅
2024-05-17 08:54
2024-05-17 08:56
cache
1.2
1.16
✅
2024-05-17 08:54
2024-05-17 08:56
cache
1.23
1.22
✅
2024-05-17 08:54
2024-05-17 08:56
cache
2.14
2.13
✅
python_scripts/linear_models_feature_engineering_classification
2024-05-17 08:55
2024-05-17 08:56
cache
7.06
7.12
✅
2024-05-17 08:55
2024-05-17 08:57
cache
9.13
9.05
✅
2024-05-17 08:55
2024-05-17 08:57
cache
2.0
2.15
✅
2024-05-17 08:55
2024-05-17 08:57
cache
6.08
5.86
✅
2024-05-17 08:55
2024-05-17 08:57
cache
15.12
15.27
✅
2024-05-17 08:55
2024-05-17 08:57
cache
5.44
5.21
✅
2024-05-17 08:55
2024-05-17 08:57
cache
2.28
2.29
✅
2024-05-17 08:55
2024-05-17 08:57
cache
3.51
3.52
✅
2024-05-17 08:55
2024-05-17 08:57
cache
2.68
2.64
✅
2024-05-17 08:55
2024-05-17 08:57
cache
2.96
3.12
✅
2024-05-17 08:55
2024-05-17 08:57
cache
3.12
2.93
✅
2024-05-17 08:55
2024-05-17 08:57
cache
1.72
1.7
✅
2024-05-17 08:55
2024-05-17 08:57
cache
1.34
1.09
✅
2024-05-17 08:56
2024-05-17 08:57
cache
2.61
2.59
✅
2024-05-17 08:56
2024-05-17 08:57
cache
2.49
2.25
✅
2024-05-17 08:56
2024-05-17 08:57
cache
4.83
4.89
✅
2024-05-17 08:56
2024-05-17 08:58
cache
1.76
1.63
✅
2024-05-17 08:56
2024-05-17 08:58
cache
1.51
1.46
✅
2024-05-17 08:56
2024-05-17 08:58
cache
10.29
10.34
✅
2024-05-17 08:56
2024-05-17 08:58
cache
4.54
4.4
✅
2024-05-17 08:56
2024-05-17 08:58
cache
24.96
24.99
✅
2024-05-17 08:56
2024-05-17 08:58
cache
2.89
2.78
✅
2024-05-17 08:57
2024-05-17 08:59
cache
40.18
25.58
✅
2024-05-17 08:57
2024-05-17 08:59
cache
5.95
6.23
✅
2024-05-17 08:58
2024-05-17 08:59
cache
19.67
19.72
✅
2024-05-17 08:58
2024-05-17 08:59
cache
2.71
2.68
✅
2024-05-17 08:58
2024-05-17 08:59
cache
2.93
3.0
✅
2024-05-17 08:58
2024-05-17 08:59
cache
1.8
1.48
✅
2024-05-17 08:58
2024-05-17 08:59
cache
1.29
1.11
✅
2024-05-17 08:58
2024-05-17 08:59
cache
5.42
5.3
✅
2024-05-17 08:58
2024-05-17 08:59
cache
3.16
2.91
✅
2024-05-17 08:58
2024-05-17 08:59
cache
2.7
2.67
✅
2024-05-17 08:58
2024-05-17 08:59
cache
2.37
2.47
✅
CPU times: user 464 ms, sys: 271 ms, total: 735 ms
-Wall time: 413 ms
+CPU times: user 481 ms, sys: 238 ms, total: 719 ms
+Wall time: 403 ms
-{'fit_time': array([0.0630393 , 0.05985069, 0.05934858, 0.05829835, 0.05691195]),
- 'score_time': array([0.01378393, 0.01406264, 0.01310515, 0.01331496, 0.01302361]),
+{'fit_time': array([0.06043458, 0.05871677, 0.05862284, 0.05633068, 0.05741453]),
+ 'score_time': array([0.01347184, 0.01342702, 0.01305532, 0.01317239, 0.01284552]),
'test_score': array([0.79557785, 0.80049135, 0.79965192, 0.79873055, 0.80456593])}
diff --git a/python_scripts/02_numerical_pipeline_scaling.html b/python_scripts/02_numerical_pipeline_scaling.html
index 4b67bfa6c..fe3747a24 100644
--- a/python_scripts/02_numerical_pipeline_scaling.html
+++ b/python_scripts/02_numerical_pipeline_scaling.html
@@ -2067,7 +2067,7 @@ Model fitting with preprocessing
-The accuracy using a LogisticRegression is 0.807 with a fitting time of 0.134 seconds in 60 iterations
+The accuracy using a LogisticRegression is 0.807 with a fitting time of 0.138 seconds in 60 iterations
diff --git a/python_scripts/03_categorical_pipeline.html b/python_scripts/03_categorical_pipeline.html
index f6d7387b6..80cc61665 100644
--- a/python_scripts/03_categorical_pipeline.html
+++ b/python_scripts/03_categorical_pipeline.html
@@ -2078,8 +2078,8 @@ Evaluate our predictive pipeline
-{'fit_time': array([0.18344617, 0.16978073, 0.1794219 , 0.18088317, 0.17538881]),
- 'score_time': array([0.02263308, 0.0238204 , 0.02229023, 0.02480221, 0.02325177]),
+{'fit_time': array([0.1810813 , 0.16806316, 0.17796063, 0.18232203, 0.16931438]),
+ 'score_time': array([0.02264929, 0.02345228, 0.02224159, 0.02423429, 0.02271461]),
'test_score': array([0.83232675, 0.83570478, 0.82831695, 0.83292383, 0.83497133])}
diff --git a/python_scripts/03_categorical_pipeline_column_transformer.html b/python_scripts/03_categorical_pipeline_column_transformer.html
index a15f2b151..5af0c93a0 100644
--- a/python_scripts/03_categorical_pipeline_column_transformer.html
+++ b/python_scripts/03_categorical_pipeline_column_transformer.html
@@ -1571,8 +1571,8 @@ Evaluation of the model with cross-validation
-{'fit_time': array([0.25613689, 0.25960398, 0.22366261, 0.24478149, 0.27019453]),
- 'score_time': array([0.0276804 , 0.0271585 , 0.02736831, 0.02919555, 0.02713871]),
+{'fit_time': array([0.25266504, 0.25575709, 0.22375464, 0.24230623, 0.26495242]),
+ 'score_time': array([0.02791929, 0.0277667 , 0.02767706, 0.02842665, 0.02699661]),
'test_score': array([0.85116184, 0.84993346, 0.8482801 , 0.85257985, 0.85544636])}
@@ -1644,8 +1644,8 @@ Fitting a more powerful model
-CPU times: user 689 ms, sys: 11.7 ms, total: 701 ms
-Wall time: 700 ms
+CPU times: user 673 ms, sys: 11.9 ms, total: 685 ms
+Wall time: 684 ms
@@ -1657,7 +1657,7 @@ Fitting a more powerful model
-0.8794529522561625
+0.8792891655065105
diff --git a/python_scripts/03_categorical_pipeline_ex_02.html b/python_scripts/03_categorical_pipeline_ex_02.html
index 2d11a1194..1dd4a2123 100644
--- a/python_scripts/03_categorical_pipeline_ex_02.html
+++ b/python_scripts/03_categorical_pipeline_ex_02.html
@@ -782,7 +782,7 @@ Reference pipeline (no numerical scaling and integer-coded categories)
-The mean cross-validation accuracy is: 0.873 ± 0.002 with a fitting time of 4.327
+The mean cross-validation accuracy is: 0.874 ± 0.002 with a fitting time of 4.259
diff --git a/python_scripts/03_categorical_pipeline_sol_02.html b/python_scripts/03_categorical_pipeline_sol_02.html
index de271403e..9b376292b 100644
--- a/python_scripts/03_categorical_pipeline_sol_02.html
+++ b/python_scripts/03_categorical_pipeline_sol_02.html
@@ -788,7 +788,7 @@ Reference pipeline (no numerical scaling and integer-coded categories)
-The mean cross-validation accuracy is: 0.874 ± 0.002 with a fitting time of 4.252
+The mean cross-validation accuracy is: 0.873 ± 0.002 with a fitting time of 4.264
@@ -835,7 +835,7 @@ Scaling numerical features
-The mean cross-validation accuracy is: 0.874 ± 0.003 with a fitting time of 4.295
+The mean cross-validation accuracy is: 0.873 ± 0.002 with a fitting time of 4.257
@@ -891,7 +891,7 @@ One-hot encoding of categorical variables
-The mean cross-validation accuracy is: 0.873 ± 0.003 with a fitting time of 17.117
+The mean cross-validation accuracy is: 0.873 ± 0.003 with a fitting time of 16.863
diff --git a/python_scripts/cross_validation_baseline.html b/python_scripts/cross_validation_baseline.html
index ec0c1360f..217fb1cc4 100644
--- a/python_scripts/cross_validation_baseline.html
+++ b/python_scripts/cross_validation_baseline.html
@@ -743,13 +743,13 @@ Comparing model performance with a simple baseline
count 30.000000
-mean 45.781621
-std 1.165621
-min 43.343258
-25% 44.934040
-50% 45.870101
-75% 46.681409
-max 48.122259
+mean 45.739021
+std 1.199512
+min 43.009317
+25% 44.852893
+50% 45.789081
+75% 46.483567
+max 48.333765
Name: Decision tree regressor, dtype: float64
@@ -826,152 +826,152 @@ Comparing model performance with a simple baseline
0
- 46.849316
+ 46.498200
90.713153
1
- 47.060510
+ 46.462715
90.539353
2
- 44.386111
+ 44.428989
91.941912
3
- 43.792578
+ 43.865327
90.213912
4
- 47.649948
+ 48.333765
92.015862
5
- 44.809461
+ 45.125717
90.542490
6
- 44.321769
+ 44.725138
89.757566
7
- 44.841198
+ 44.901204
92.477244
8
- 44.978043
+ 44.532377
90.947952
9
- 45.148078
+ 45.114938
91.991373
10
- 46.579575
+ 46.760074
92.023571
11
- 46.405369
+ 46.490518
90.556965
12
- 45.550643
+ 45.809124
91.539567
13
- 45.250819
+ 44.977102
91.185225
14
- 46.994709
+ 47.010698
92.298971
15
- 45.011550
+ 44.646918
91.084639
16
- 46.329916
+ 46.440914
90.984471
17
- 46.915084
+ 46.380367
89.981744
18
- 44.919372
+ 44.836789
90.547140
19
- 47.127851
+ 47.515037
89.820219
20
- 43.343258
+ 43.009317
91.768721
21
- 45.316584
+ 46.358493
92.305556
22
- 46.297425
+ 45.232018
90.503017
23
- 46.715354
+ 46.545270
92.147974
24
- 46.357748
+ 46.236355
91.386320
25
- 46.003656
+ 45.769037
90.815660
26
- 44.587669
+ 44.745129
92.216574
27
- 46.046236
+ 46.231509
90.107460
28
- 45.736545
+ 45.337221
90.620318
29
- 48.122259
+ 47.850384
91.165331
@@ -992,7 +992,7 @@ Comparing model performance with a simple baseline
-
+
We see that the generalization performance of our decision tree is far from
diff --git a/python_scripts/cross_validation_grouping.html b/python_scripts/cross_validation_grouping.html
index 59c04e0ad..f651a35b8 100644
--- a/python_scripts/cross_validation_grouping.html
+++ b/python_scripts/cross_validation_grouping.html
@@ -753,7 +753,7 @@
Sample grouping
-The average accuracy is 0.963 ± 0.006
+The average accuracy is 0.963 ± 0.009
@@ -787,7 +787,7 @@ Sample grouping
-
+
The cross-validation testing error that uses the shuffling has less variance
@@ -1014,7 +1014,7 @@
Sample grouping
-
+
As a conclusion, it is really important to take any sample grouping pattern
diff --git a/python_scripts/cross_validation_learning_curve.html b/python_scripts/cross_validation_learning_curve.html
index 966ee22eb..fb2720821 100644
--- a/python_scripts/cross_validation_learning_curve.html
+++ b/python_scripts/cross_validation_learning_curve.html
@@ -794,7 +794,7 @@
Learning curve
-
+
Looking at the training error alone, we see that we get an error of 0 k$. It
diff --git a/python_scripts/cross_validation_sol_01.html b/python_scripts/cross_validation_sol_01.html
index e6c38c12f..c0757b1a0 100644
--- a/python_scripts/cross_validation_sol_01.html
+++ b/python_scripts/cross_validation_sol_01.html
@@ -789,62 +789,62 @@
📃 Solution for Exercise M2.01
-
+
We see that using strategy="stratified"
, the results are much worse than
diff --git a/python_scripts/cross_validation_stratification.html b/python_scripts/cross_validation_stratification.html
index aa3f527ea..9ee7a97c6 100644
--- a/python_scripts/cross_validation_stratification.html
+++ b/python_scripts/cross_validation_stratification.html
@@ -718,10 +718,10 @@
Stratification
KFold cross-validation strategy. We will define
-a dataset with nine samples and repeat the cross-validation three times (i.e.
-n_splits
).
+Once the model is created, we can evaluate it using cross-validation. We start
+by using the KFold
strategy.
+Let’s review how this strategy works. For this purpose, we define a dataset
+with nine samples and split the dataset into three folds (i.e. n_splits=3
).
import numpy as np
@@ -742,12 +742,12 @@ StratificationKFold does not shuffle by default. It means that it will
-select the three first samples for the testing set at the first split, then
-the next three samples for the second split, and the three next for the
-last split. In the end, all samples have been used in testing at least once
-among the different splits.
+By defining three splits, we use three samples (one fold) for testing and six
+(two folds) for training each time. KFold
does not shuffle by default. It
+means that the first three samples are selected for the testing set at the
+first split, then the next three samples for the second split, and the
+last three for the last split. In the end, every sample has been used for
+testing exactly once across the different splits.
Now, let’s apply this strategy to check the generalization performance of our
model.
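The unshuffled splitting described in this hunk can be sketched as follows. This is a minimal illustration, not the notebook's own cell: the array `data_toy` and the list `test_folds` are hypothetical names introduced here, and only `KFold(n_splits=3)` and its `split` method are taken from scikit-learn.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical toy data: nine samples with a single feature.
data_toy = np.arange(9).reshape(-1, 1)

# KFold does not shuffle by default, so folds follow the sample order.
cv = KFold(n_splits=3)

test_folds = []
for split_idx, (train_indices, test_indices) in enumerate(cv.split(data_toy)):
    test_folds.append(test_indices.tolist())
    print(
        f"Split {split_idx}: "
        f"train={train_indices.tolist()} test={test_indices.tolist()}"
    )
```

Each split should use three consecutive samples for testing and the remaining six for training, so every sample lands in a testing set exactly once.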
@@ -770,8 +770,8 @@ Stratification
-We see that the target vector target
is ordered. It will have some
-unexpected consequences when using the KFold
cross-validation. To illustrate
-the consequences, we will show the class count in each fold of the
-cross-validation in the train and test set.
+We see that the target vector target
is ordered. This has some unexpected
+consequences when using the KFold
cross-validation. To illustrate the
+consequences, we show the class count in each fold of the cross-validation in
+the train and test set.
Let’s compute the class counts for both the training and testing sets using
the KFold
cross-validation, and plot this information in a bar plot.
-We will iterate given the number of split and check how many samples of each
-are present in the training and testing set. We will store the information
-into two distincts lists; one for the training set and one for the testing
-set.
+We iterate over the splits and check how many samples of each class are
+present in the training and testing sets. We then store the information in
+two distinct lists: one for the training set and one for the testing set.
import pandas as pd
@@ -816,8 +815,8 @@ Stratification
train_cv_counts = pd.concat(
@@ -950,12 +949,12 @@ Stratification
train_cv_counts.plot.bar()
plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts")
-
+
-
+
We can confirm that in each fold, only two of the three classes are present in
@@ -998,7 +997,7 @@
Stratification
train_cv_counts = []
@@ -1025,12 +1024,12 @@ Stratification
train_cv_counts.plot.bar()
plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts\n(with shuffling)")
-
+
-
+
We see that neither the training nor the testing sets have the same class
@@ -1104,12 +1103,12 @@
Stratification
train_cv_counts.plot.bar()
plt.legend(bbox_to_anchor=(1.05, 0.8), loc="upper left")
plt.ylabel("Count")
-_ = plt.title("Training set")
+_ = plt.title("Training set class counts\n(with stratifying)")
-
+
-
+
In this case, we observe that the class counts are very close both in the
train set and the test set. The difference is due to the small number of
samples in the iris dataset.
-In conclusion, this is a good practice to use stratification within the
-cross-validation framework when dealing with a classification problem.
+In other words, stratifying is more effective than just shuffling when it
+comes to making sure that the distributions of classes in all the folds are
+representative of the entire dataset. As training and testing folds have
+similar class distributions, stratifying leads to a more realistic measure of
+the model’s ability to generalize. This is especially important when the
+performance metrics depend on the proportion of the positive class, as we will
+see in a future notebook.
+The interested reader can learn about other stratified cross-validation
+techniques in the scikit-learn user
+guide.
-
-
-
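The comparison between plain, shuffled, and stratified splitting made in the last hunk can be reproduced with a short sketch. It assumes the iris dataset bundled with scikit-learn; the helper `class_counts_per_test_fold`, the `placeholder` feature array, and the `random_state=0` seed are hypothetical choices made for this illustration and do not come from the notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, StratifiedKFold

# The iris target is ordered by class: 50 samples of each of the 3 classes.
_, target = load_iris(return_X_y=True)
# Splitters only need the number of samples, so dummy features are enough.
placeholder = np.zeros((len(target), 1))


def class_counts_per_test_fold(cv):
    """Return, for each split, the number of samples per class in the test fold."""
    return [
        np.bincount(target[test_indices], minlength=3).tolist()
        for _, test_indices in cv.split(placeholder, target)
    ]


strategies = {
    "KFold": KFold(n_splits=3),
    "KFold (shuffled)": KFold(n_splits=3, shuffle=True, random_state=0),
    "StratifiedKFold": StratifiedKFold(n_splits=3),
}
counts = {name: class_counts_per_test_fold(cv) for name, cv in strategies.items()}

for name, fold_counts in counts.items():
    print(name, fold_counts)
```

Without shuffling, each testing fold should contain a single class. Shuffling mixes the classes but leaves the proportions to chance, while stratifying keeps the per-class counts within one sample of each other across folds.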