diff --git a/.nojekyll b/.nojekyll index 900409e..5377691 100644 --- a/.nojekyll +++ b/.nojekyll @@ -1 +1 @@ -8c34a58a \ No newline at end of file +43ae9686 \ No newline at end of file diff --git a/blogs/index.html b/blogs/index.html index 2abde31..6ee5334 100644 --- a/blogs/index.html +++ b/blogs/index.html @@ -238,7 +238,7 @@

Data Science Blog

-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+
-
+

diff --git a/blogs/posts/2024-05-22-storing-data-safely/azure_python.out.ipynb b/blogs/posts/2024-05-22-storing-data-safely/azure_python.out.ipynb index c7929a5..a3fdaa7 100644 --- a/blogs/posts/2024-05-22-storing-data-safely/azure_python.out.ipynb +++ b/blogs/posts/2024-05-22-storing-data-safely/azure_python.out.ipynb @@ -6,7 +6,7 @@ "source": [ "#" ], - "id": "46f05c65-ba75-44af-833e-475408b687d2" + "id": "7a1025e7-9762-406c-9d8c-12e260b37e26" }, { "cell_type": "code", @@ -59,7 +59,7 @@ "Install then run `az login` in your terminal. Once you have logged in\n", "with your browser try the `DefaultAzureCredential()` again!" ], - "id": "13c8a940-39f3-40fc-9d4a-8b2eb7af1184" + "id": "0899f787-142e-4fbc-b96c-741306947d48" }, { "cell_type": "code", diff --git a/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html b/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html index 56e4599..aff1c15 100644 --- a/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html +++ b/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html @@ -554,7 +554,7 @@ code span.vs { color: #20794d; } code span.wa { color: #5e5e5e; font-style: italic; } - + + + + + + + + + + + + +
+
+ +
+

Computer Vision

+

Is this AI?

+ +
+ +

10 October 2024

+
+
+

What is Computer Vision (CV)

+
+
+

Computer vision is a field of computer science that focuses on enabling computers to identify and understand objects and people in images and videos.

+
+
+
+
+

Like other types of AI, computer vision seeks to perform and automate tasks that replicate human capabilities.

+
+
+
+
+

In this case, computer vision seeks to replicate both the way humans see, and the way humans make sense of what they see.

+
+
+ +
+
+
+

What tasks can CV be used for?

+ +
+
+

Classification

+
+

Assign a single label to each image

+
+
+
+
+

+
+
+

Dog

+
+
+

Welsh Spaniel

+
+
+

Animal in water

+
+
+ +
+
+

+
+
+

Dog

+
+
+

Sussex Spaniel

+
+
+

Animal on land

+
+
+ +
+
+

Multi-Classification

+
+

Assign one or more labels to each image
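
The difference between assigning a single label and assigning one or more labels can be sketched in a few lines. This is a minimal illustration only: the scores are made-up numbers standing in for a model's output, and `classify`, `multi_classify` and the 0.5 threshold are hypothetical names and values chosen for the example, not anything taken from the slides.

```python
# Hypothetical per-label scores a model might output for one image (made-up numbers).
scores = {
    "Dog": 0.97,
    "Welsh Spaniel": 0.81,
    "Animal in water": 0.64,
    "Cat": 0.02,
}


def classify(scores):
    """Classification: return the single highest-scoring label."""
    return max(scores, key=scores.get)


def multi_classify(scores, threshold=0.5):
    """Multi-classification: return every label scoring at or above the threshold."""
    return [label for label, score in scores.items() if score >= threshold]


single_label = classify(scores)
multiple_labels = multi_classify(scores)

print(single_label)
print(multiple_labels)
```

The same scores give one label in the first case and three in the second; only the decision rule changes.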

+
+
+
+

+
+

Dog, Welsh Spaniel, Animal in water

+
+
+ +
+

+
+

Dog, Sussex Spaniel, Animal on land

+
+
+ +
+
+

Object Detection

+ + +
+
+

Event Detection

+ + +
+
+

How does CV work?

+

for classification tasks

+
+
    +
  • get a very large corpus of labelled images
  • +
  • convert the images to a form that the computer can work with (tensors)
  • +
  • feed into a convolutional neural network
  • +
  • profit?
  • +
+
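
The "feed into a convolutional neural network" step above rests on one repeated operation: sliding a small grid of weights (a kernel) over the image and summing the products. The sketch below is a plain-Python illustration of that operation under simple assumptions (no padding, stride of 1, a hand-picked kernel). The function name `convolve2d`, the toy image and the kernel are all hypothetical; a real network learns its kernels from the labelled images rather than having them written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1

    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            # Multiply the kernel with the patch of the image it currently covers.
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows)
                for j in range(k_cols)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written kernel that responds where brightness increases from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, edge_kernel)
for row in feature_map:
    print(row)
```

The feature map is non-zero only in the column where the dark and bright halves meet, which is the sense in which a kernel "detects" a vertical edge.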
+
+
+

Large corpus of labelled images

+

ImageNet

+
    +
  • a large visual database
  • +
  • over 14 million hand-annotated images
  • +
  • more than 20,000 categories
  • +
  • each category has “several hundred” images
  • +
  • started in 2006
  • +
+
+
+

Convert the images to tensors

+
+
+

A fancy way of saying:

+

turn the images into a 2d table

+

of values between 0 and 1
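
As a rough sketch of what that conversion looks like, assuming an 8-bit greyscale image where each pixel is an integer from 0 (black) to 255 (white). The pixel values and the helper name `to_tensor` are made up for illustration; a colour image would typically add a third dimension for its red, green and blue channels.

```python
# A hypothetical 3x3 greyscale image: 0 is black, 255 is white.
pixels = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]


def to_tensor(pixels):
    """Scale 8-bit pixel values into the range 0 to 1, keeping the 2D layout."""
    return [[value / 255 for value in row] for row in pixels]


tensor = to_tensor(pixels)
for row in tensor:
    print([round(value, 2) for value in row])
```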

+
+
+
+
+
+
+
+
+

+
+
+
+
+
+
+
+
+
+
+
+
+

+
+
+
+
+
+
+
+
+
+
+

(Convolutional) Neural Networks

+ + +
+
+

Use pre-trained models

+

Models that have already been trained on image datasets and can be downloaded and used

+
+ +
+
+

Transfer learning is the concept of taking a pre-trained model as a basis, then fine-tuning it to classify based on your own images.

+
+
+
+

How can CV be used in Healthcare?

+
    +
  • classification
  • +
  • multi-classification
  • +
  • object detection
  • +
  • event detection
  • +
+
+
+

How can CV be used in Healthcare?

+
+
+
    +
  • detecting disease or injury
  • +
  • monitoring patients' vitals, e.g. respiratory rate
  • +
  • detecting bounds of a tumour when planning radiotherapy
  • +
  • automating cell counting
  • +
  • calculating the grade of a cancer
  • +
  • monitoring for long-term changes, e.g. AAA
  • +
  • devices for patients with vision impairments
  • +
  • detecting when patients move (fall prevention)
  • +
  • monitoring the hygiene of a ward
  • +
+
+
+
+
+

Issues with Computer Vision

+
+
+

+
+
+
+

+
+
+
+
+
+

Issues with Computer Vision

+
+

… one neural network learned to differentiate between dogs and wolves. It didn’t learn the differences between dogs and wolves, but instead learned that wolves were on snow in their picture and dogs were on grass.

+
+ +
+
+

Issues with Computer Vision

+ + +
+
+

Issues with Computer Vision

+
+
    +
  • algorithm trained at Mount Sinai Hospital, New York City
  • +
  • Busy ICU, many elderly patients
  • +
  • 34% of their x-rays came from patients with pneumonia
  • +
  • 93% accuracy
  • +
+
+
+
    +
  • tested at other sites, pneumonia ~1% of x-rays
  • +
  • accuracy dropped to 73%-80%
  • +
+
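
One reason accuracy figures are hard to compare between sites is that accuracy depends on how common the condition is. The worked example below is a sketch under stated assumptions, not a reconstruction of the study: it uses only the two prevalence figures quoted above (34% and about 1%) and a deliberately useless hypothetical classifier that always answers "no pneumonia". The function name `accuracy` and its parameters are chosen for illustration.

```python
def accuracy(prevalence, sensitivity, specificity):
    """Overall accuracy as a prevalence-weighted mix of sensitivity and specificity."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# A hypothetical classifier that always predicts "no pneumonia":
# it finds no true cases (sensitivity 0) and never raises a false alarm (specificity 1).
always_negative_training_site = accuracy(prevalence=0.34, sensitivity=0.0, specificity=1.0)
always_negative_other_sites = accuracy(prevalence=0.01, sensitivity=0.0, specificity=1.0)

print(f"34% prevalence: {always_negative_training_site:.0%} accurate")
print(f" 1% prevalence: {always_negative_other_sites:.0%} accurate")
```

The same do-nothing classifier scores very differently at the two prevalence levels, so a single accuracy number says little without knowing the case mix it was measured on.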
+ +
+
+

Issues with Computer Vision

+
+

At Mount Sinai, many of the infected patients were too sick to get out of bed, and so doctors used a portable chest x-ray machine. Portable x-ray images look very different from those created when a patient is standing up. Because of what it learned from Mount Sinai’s x-rays, the algorithm began to associate a portable x-ray with illness. It also anticipated a high rate of pneumonia.

+
+ +
+
+

The Unique Problems of Medical Computer Vision

+
+
+

+
+
+
+

This is the very unique problem of medical computer vision: we are attempting to solve a small signal on the background of small noise whereas standard computer vision’s problem is a large signal on the background of large noise.

+
+
+
+ +
+
+

The Unique Problems of Medical Computer Vision

+ +
+
+

Annotating images of cats & dogs is cheaper than reviewing medical scans/slides. The latter places an additional burden on health systems.

+
+
+
+
+

Other issues with CV

+
+
    +
  • Early detection vs overdiagnosis
  • +
  • Adversarial attacks can trick CV algorithms into incorrectly classifying images
  • +
  • Computational power (environmental impact)
  • +
  • Governance: have we got consent to use images?
  • +
  • Explainability of Neural Networks
  • +
+
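
The adversarial attack point above can be illustrated with a toy model. This is a minimal sketch, assuming a hypothetical four-pixel "image" and a hand-written linear classifier whose weights, bias and labels are invented for the example. It nudges every pixel by a small amount in the direction that lowers the model's score, which is the idea behind gradient-sign attacks; real attacks target deep networks rather than a four-weight linear model.

```python
def score(weights, bias, pixels):
    """Linear score: positive means "wolf", otherwise "dog"."""
    return sum(w * p for w, p in zip(weights, pixels)) + bias


def predict(weights, bias, pixels):
    return "wolf" if score(weights, bias, pixels) > 0 else "dog"


def sign(x):
    return (x > 0) - (x < 0)


def perturb(weights, pixels, epsilon):
    """Move each pixel by at most `epsilon` against the gradient of the score.

    For a linear model the gradient of the score with respect to a pixel is
    simply that pixel's weight. Values are clipped to stay between 0 and 1.
    """
    return [
        min(1.0, max(0.0, p - epsilon * sign(w)))
        for w, p in zip(weights, pixels)
    ]


# Hypothetical model parameters and image (made-up numbers).
weights = [0.9, -0.4, 0.7, -0.2]
bias = -0.5
image = [0.6, 0.5, 0.5, 0.5]

adversarial_image = perturb(weights, image, epsilon=0.1)

original_label = predict(weights, bias, image)
adversarial_label = predict(weights, bias, adversarial_image)
print(original_label, "->", adversarial_label)
```

No pixel changes by more than 0.1 on a 0 to 1 scale, yet the predicted label flips.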
+ +
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + \ No newline at end of file diff --git a/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-1-1.png b/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-1-1.png new file mode 100644 index 0000000..7aa2680 Binary files /dev/null and b/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-1-1.png differ diff --git a/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-2-1.png b/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-2-1.png new file mode 100644 index 0000000..d9956af Binary files /dev/null and b/presentations/2024-10-10_what-is-ai-tom/index_files/figure-revealjs/unnamed-chunk-2-1.png differ diff --git a/presentations/index.html b/presentations/index.html index c38aa22..5ca3a9d 100644 --- a/presentations/index.html +++ b/presentations/index.html @@ -174,23 +174,23 @@

Presentations

-
- @@ -639,111 +639,116 @@

Presentations

2024-10-10 +Computer Vision: Is this AI? +Tom Jemmett +2024-10-10 + + Identifying patients at risk: Is this AI? YiWen Hon 2024-10-10 - + Using R and Python to model future hospital activity: EARL Conference 2024 YiWen Hon, Matt Dray, Tom Jemmett 2024-09-05 - + Agile and scrum working Chris Beeley 2024-08-22 - + Open source licensing: Or: how I learned to stop worrying and love openness Chris Beeley 2024-05-30 - + GitHub as a team sport: DfT QA Month Matt Dray 2024-05-23 - + Store Data Safely: Coffee & Coding YiWen Hon, Matt Dray 2024-05-16 - + Coffee and Coding: Making my analytical workflow more reproducible with {targets} Jacqueline Grout 2024-01-25 - + Conference Check-in App: NHS-R/NHS.pycom 2023 Tom Jemmett 2023-10-17 - + System Dynamics in health and care: fitting square data into round models Sally Thompson 2023-10-09 - + Repeating Yourself with Functions: Coffee and Coding Sally Thompson 2023-09-07 - + Coffee and Coding: Working with Geospatial Data in R Tom Jemmett 2023-08-24 - + Unit testing in R: NHS-R Community Webinar Tom Jemmett 2023-08-23 - + Everything you ever wanted to know about data science: but were too afraid to ask Chris Beeley 2023-08-02 - + Travels with R and Python: the power of data science in healthcare Chris Beeley 2023-08-02 - + An Introduction to the New Hospital Programme Demand Model: HACA 2023 Tom Jemmett 2023-07-11 - + What good data science looks like Chris Beeley 2023-05-23 - + Text mining of patient experience data Chris Beeley 2023-05-15 - + Coffee and Coding: {targets} Tom Jemmett 2023-03-23 - + Collaborative working Chris Beeley 2023-03-23 - + Coffee and Coding: Good Coding Practices Tom Jemmett 2023-03-09 - + RAP: what is it and how can my team start using it effectively? 
Chris Beeley 2023-03-09 - + Coffee and coding: Intro session Chris Beeley 2023-02-23 diff --git a/presentations/su_presentation.scss b/presentations/su_presentation.scss index 5545823..001adf8 100644 --- a/presentations/su_presentation.scss +++ b/presentations/su_presentation.scss @@ -194,6 +194,10 @@ $presentation-heading-color: $su-charcoal; font-size: 1.2rem; } +.medium { + font-size: 2rem; +} + .yellow { color: $su-yellow; } @@ -212,4 +216,4 @@ $presentation-heading-color: $su-charcoal; .center { text-align: center; -} +} \ No newline at end of file diff --git a/search.json b/search.json index a64bf2b..1a3b480 100644 --- a/search.json +++ b/search.json @@ -428,1033 +428,1089 @@ "text": "Summary\n\nAI is lots of things and has been called lots of things\nEven some apparently “intelligent” tasks are not really so intelligent when you know how they work\nMy own view is that a useful definition of AI includes:\n\nModelling complex nonlinear systems without an explicit model\nDeep learning predictive algorithms meet this criterion, as do LLMs\n\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#section", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#section", - "title": "Conference Check-in App", - "section": "", - "text": "digital.library.unt.edu/ark:/67531/metadc1039451/m1/1/\n\n\nClark, Junebug. [Registration Desk for the LPC Conference], photograph, 2016-03-17/2016-03-19; (https://digital.library.unt.edu/ark:/67531/metadc1039451/m1/1/: accessed October 16, 2023), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Special Collections." 
+ "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#what-is-computer-vision-cv", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#what-is-computer-vision-cv", + "title": "Computer Vision", + "section": "What is Computer Vision (CV)", + "text": "What is Computer Vision (CV)\n\n\nComputer vision is a field of computer science that focuses on enabling computers to identify and understand objects and people in images and videos.\n\n\n\n\nLike other types of AI, computer vision seeks to perform and automate tasks that replicate human capabilities.\n\n\n\n\nIn this case, computer vision seeks to replicate both the way humans see, and the way humans make sense of what they see.\n\n\n\nSource: What is computer vision? Microsoft" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#qr-codes-are-great", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#qr-codes-are-great", - "title": "Conference Check-in App", - "section": "QR codes are great", - "text": "QR codes are great" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#classification", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#classification", + "title": "Computer Vision", + "section": "Classification", + "text": "Classification\n\nAssign a single label to each image\n\n\n\n\n\n\n\nDog\n\n\nWelsh Spaniel\n\n\nAnimal in water\n\n\n\n\n\n\n\n\nDog\n\n\nSussex Spaniel\n\n\nAnimal on land\n\n\n\nImages from imagenet" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#and-can-be-easily-generated-in-r", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#and-can-be-easily-generated-in-r", - "title": "Conference Check-in App", - "section": "and can be easily generated in R", - "text": "and can be easily generated in R\ninstall.packages(\"qrcode\")\nlibrary(qrcode)\n\nqr_code(\"https://www.youtube.com/watch?v=dQw4w9WgXcQ\")" + "objectID": 
"presentations/2024-10-10_what-is-ai-tom/index.html#multi-classification", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#multi-classification", + "title": "Computer Vision", + "section": "Multi-Classification", + "text": "Multi-Classification\n\nAssign one or more labels to each image\n\n\n\n\n\nDog, Welsh Spaniel, Animal in water\n\n\n\n\n\n\nDog, Sussex Spaniel, Animal on land\n\n\n\nImages from imagenet" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#why-not", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#why-not", - "title": "Conference Check-in App", - "section": "Why not?", - "text": "Why not?\n\n{shiny} would be doing all the processing on the server side\nwe would need to read from a camera client side\nthen stream video to the server for {shiny} to detect and decode the QR codes" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#object-detection", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#object-detection", + "title": "Computer Vision", + "section": "Object Detection", + "text": "Object Detection\n\n\nImage sourced from Wikimedia" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work", - "title": "Conference Check-in App", - "section": "How does this work?", - "text": "How does this work?\n\n\nFront-end\n\n\nuses the React JavaScript framework\n@yidel/react-qr-scanner\nApp scan’s a QR code, then sends this to our backend\nA window pops up to say who has checked in, or shows an error message" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#event-detection", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#event-detection", + "title": "Computer Vision", + "section": "Event Detection", + "text": "Event Detection\n\n\nIs camera-only the future of self-driving cars?" 
}, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-1", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-1", - "title": "Conference Check-in App", - "section": "How does this work?", - "text": "How does this work?\nBack-end\nUses the {plumber} R package to build the API, with endpoints for\n\ngetting the list of all of the attendees for that day\nuploading a list of attendees in bulk\nadding an attendee individually\ngetting an attendee\nchecking the attendee in" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#how-does-cv-work", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#how-does-cv-work", + "title": "Computer Vision", + "section": "How does CV work?", + "text": "How does CV work?\nfor classification tasks\n\n\nget a very large corpus or labelled images\nconvert the images to a form that the computer can work with (tensors)\nfeed into a convolutional neural network\nprofit?" 
}, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-2", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-2", - "title": "Conference Check-in App", - "section": "How does this work?", - "text": "How does this work?\nMore Back-end Stuff\n\nuses a simple SQLite DB that will be thrown away at the end of the conference\nwe send personalised emails using {blastula} to the attendees with their QR codes\nthe QR codes are just random ids (UUIDs) that identify each attendee\nuses websockets to update all of the clients when a user checks in (to update the list of attendees)" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#large-corpus-of-labelled-images", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#large-corpus-of-labelled-images", + "title": "Computer Vision", + "section": "Large corpus of labelled images", + "text": "Large corpus of labelled images\nImageNet\n\na large visual database\nover 14 million hand-annotated images\nmore than 20,000 categories\neach category has “several hundred” images\nstarted in 2006" }, { - "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#learning-different-tools-can-show-you-the-light", - "href": "presentations/2023-10-17_conference-check-in-app/index.html#learning-different-tools-can-show-you-the-light", - "title": "Conference Check-in App", - "section": "Learning different tools can show you the light", - "text": "Learning different tools can show you the light\n\nunsplash.com/photos/tMGMINwFOtI" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#convert-the-images-to-tensors", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#convert-the-images-to-tensors", + "title": "Computer Vision", + "section": "Convert the images to tensors", + "text": "Convert the images to tensors\n\n\nA fancy way of saying:\nturn the images into a 2d table\nof values between 0 and 1" }, { - 
"objectID": "presentations/2023-05-15_text-mining/index.html#patient-experience", - "href": "presentations/2023-05-15_text-mining/index.html#patient-experience", - "title": "Text mining of patient experience data", - "section": "Patient experience", - "text": "Patient experience\n\nThe NHS collects a lot of patient experience data\nRate the service 1-5 (Very poor… Excellent) but also give written feedback\n\n“Parking was difficult”\n“Doctor was rude”\n“You saved my life”\n\nMany organisations lack the staffing to read all of the feedback in a systematic way" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#convolutional-neural-networks", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#convolutional-neural-networks", + "title": "Computer Vision", + "section": "(Convolutional) Neural Networks", + "text": "(Convolutional) Neural Networks\n\n\n3Blue1Brown, YouTube Channel" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#text-mining", - "href": "presentations/2023-05-15_text-mining/index.html#text-mining", - "title": "Text mining of patient experience data", - "section": "Text mining", - "text": "Text mining\n\nWe have built an algorithm to read it\n\nTheme\n“Criticality”\n\nFits alongside other work happening within NHSE\n\nA framework for understanding patient experience" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#use-pre-trained-models", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#use-pre-trained-models", + "title": "Computer Vision", + "section": "Use pre-trained models", + "text": "Use pre-trained models\nModels that have been pre-trained on some image datasets which can be downloaded and used\n\n\nResNet (Microsoft Research)\nInception (Google)\nTrending classifiers from Hugging Face\n\n\n\nTransfer learning is the concept of taking a pre-trained model as a basis, then fine-tuning it to classify based on your own images." 
}, { - "objectID": "presentations/2023-05-15_text-mining/index.html#patient-experience-101", - "href": "presentations/2023-05-15_text-mining/index.html#patient-experience-101", - "title": "Text mining of patient experience data", - "section": "Patient experience 101", - "text": "Patient experience 101\n\nTick box scoring is not useful (or accurate)\nText based data is complex and built on human experience\nWe’re not making word clouds!\nWe’re not classifying movie reviews or Reddit posts\nThe tool should enhance, not replace, human understanding\n“A recommendation engine for feedback data”" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#how-can-cv-be-used-in-healthcare", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#how-can-cv-be-used-in-healthcare", + "title": "Computer Vision", + "section": "How can CV be used in Healthcare?", + "text": "How can CV be used in Healthcare?\n\nclassification\nmulti-classification\nobject detection\nevent detection" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#everything-open-all-the-time", - "href": "presentations/2023-05-15_text-mining/index.html#everything-open-all-the-time", - "title": "Text mining of patient experience data", - "section": "Everything open, all the time", - "text": "Everything open, all the time\n\nThis project was coded in the open and is MIT licensed\nEngage with the organisations as we find them\n\nDo they want code or a docker image?\nDo they want to fetch their own themes from an API?\nDo they want to use our dashboard?" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#how-can-cv-be-used-in-healthcare-1", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#how-can-cv-be-used-in-healthcare-1", + "title": "Computer Vision", + "section": "How can CV be used in Healthcare?", + "text": "How can CV be used in Healthcare?\n\n\n\ndetecting disease or injury\nmonitoring patients vitals, e.g. 
respiratory rate\ndetecting bounds of a tumour when planning radiotherapy\nautomating cell counting\ncalculating the grade of a cancer\nmonitor for long-term changes, e.g. AAA\ndevices for patients with vision impairments\ndetecting when patients move (fall prevention)\nmonitoring the hygiene of a ward" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#phase-1", - "href": "presentations/2023-05-15_text-mining/index.html#phase-1", - "title": "Text mining of patient experience data", - "section": "Phase 1", - "text": "Phase 1\n\n10 categories and moderate performance on criticality analysis\nscikit-learn\nShiny\nReticulate\nR package of Python code" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision", + "title": "Computer Vision", + "section": "Issues with Computer Vision", + "text": "Issues with Computer Vision" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#golem-all-the-things", - "href": "presentations/2023-05-15_text-mining/index.html#golem-all-the-things", - "title": "Text mining of patient experience data", - "section": "Golem all the things!", - "text": "Golem all the things!\n\nOpinionated way of building Shiny\nAllows flexibility in deployed versions using YAML\nAgnostic to deployment\nEmphasises dependency management and testing\nSeparate “reactive” and “business” logic (see the accompanying book)" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-1", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-1", + "title": "Computer Vision", + "section": "Issues with Computer Vision", + "text": "Issues with Computer Vision\n\n… one neural network learned to differentiate between dogs and wolves. 
It didn’t learn the differences between dogs and wolves, but instead learned that wolves were on snow in their picture and dogs were on grass.\n\n\nDogs, Wolves, Data Science, and Why Machines Must Learn Like Humans Do (2017)" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#phase-2", - "href": "presentations/2023-05-15_text-mining/index.html#phase-2", - "title": "Text mining of patient experience data", - "section": "Phase 2", - "text": "Phase 2\n\n30-50 categories and excellent criticality performance\nscikit-learn/ BERT\nMore Shiny\nSeparate the code bases\nFastAPI\nInspired by the Royal College of Paediatrics and Child Health API\nDocumentation, documentation, documentation" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-2", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-2", + "title": "Computer Vision", + "section": "Issues with Computer Vision", + "text": "Issues with Computer Vision\n\n\nArtificial intelligence could revolutionize medical care. 
But don’t trust it to read your x-ray just yet (2019)" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#making-it-useful", - "href": "presentations/2023-05-15_text-mining/index.html#making-it-useful", - "title": "Text mining of patient experience data", - "section": "Making it useful", - "text": "Making it useful\n\nAccurately rating low frequency categories\nPer category precision and recall\nSpeed versus accuracy\nRepresenting the thematic structure" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-3", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-3", + "title": "Computer Vision", + "section": "Issues with Computer Vision", + "text": "Issues with Computer Vision\n\n\nalgorithm trained at Mount Sinai Hospital, New York City\nBusy ICU, many elderly patients\n34% of their x-rays came from patients with pneumonia\n93% accuracy\n\n\n\n\ntested at other sites, pneumonia ~1% of x-rays\naccuracy dropped to 73%-80%\n\n\n\nArtificial intelligence could revolutionize medical care. 
But don’t trust it to read your x-ray just yet (2019)" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#the-future", - "href": "presentations/2023-05-15_text-mining/index.html#the-future", - "title": "Text mining of patient experience data", - "section": "The future", - "text": "The future\n\nOff the shelf, proprietary data collection systems dominate\nThey often offer bundled analytic products of low quality\nThe DS time can’t and doesn’t want to offer a complete data system\nHow can we best contribute to improving patient experience for patients in the NHS?\n\nIf the patient experience data won’t come to the mountain…" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-4", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#issues-with-computer-vision-4", + "title": "Computer Vision", + "section": "Issues with Computer Vision", + "text": "Issues with Computer Vision\n\nAt Mount Sinai, many of the infected patients were too sick to get out of bed, and so doctors used a portable chest x-ray machine. Portable x-ray images look very different from those created when a patient is standing up. Because of what it learned from Mount Sinai’s x-rays, the algorithm began to associate a portable x-ray with illness. It also anticipated a high rate of pneumonia.\n\n\nArtificial intelligence could revolutionize medical care. 
But don’t trust it to read your x-ray just yet (2019)" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#open-source-ftw", - "href": "presentations/2023-05-15_text-mining/index.html#open-source-ftw", - "title": "Text mining of patient experience data", - "section": "Open source FTW!", - "text": "Open source FTW!\n\nOften individuals in the NHS don’t want private companies to “benefit” from open code\nBut if they make their products better with open code the patients win\nBest practice as code" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#the-unique-problems-of-medical-computer-vision", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#the-unique-problems-of-medical-computer-vision", + "title": "Computer Vision", + "section": "The Unique Problems of Medical Computer Vision", + "text": "The Unique Problems of Medical Computer Vision\n\n\n\n\n\n\nThis is the very unique problem of medical computer vision: we are attempting to solve a small signal on the background of small noise whereas standard computer vision’s problem is a large signal on the background of large noise.\n\n\n\n\nThe Unique Problems of Medical Computer Vision" }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#the-projects", - "href": "presentations/2023-05-15_text-mining/index.html#the-projects", - "title": "Text mining of patient experience data", - "section": "The projects", - "text": "The projects\n\nhttps://github.com/CDU-data-science-team/pxtextmining\nhttps://github.com/CDU-data-science-team/experiencesdashboard\nhttps://github.com/CDU-data-science-team/PatientExperience-QDC" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#the-unique-problems-of-medical-computer-vision-1", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#the-unique-problems-of-medical-computer-vision-1", + "title": "Computer Vision", + "section": "The Unique Problems of Medical Computer Vision", + "text": "The Unique Problems of 
Medical Computer Vision\n\n\n\n\nIs this a dog?\n\n\n\n\n\nEnglish Springer Spaniel in a Field, Wikipedia\n\n\n\n\n\n\n\nSmall-cell carcinoma of the lung, Wikipedia\n\n\n\n\n\n\nAnnotations of cats & dogs is cheaper than reviewing medical scans/slides. The latter adds an additional burden on health systems." }, { - "objectID": "presentations/2023-05-15_text-mining/index.html#the-team", - "href": "presentations/2023-05-15_text-mining/index.html#the-team", - "title": "Text mining of patient experience data", - "section": "The team", - "text": "The team\n\nYiWen Hon (Python & Machine learning)\nOluwasegun Apejoye (Shiny)\n\nContact:\n\nchris.beeley1@nhs.net\nhttps://fosstodon.org/@chrisbeeley\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2024-10-10_what-is-ai-tom/index.html#other-issues-with-cv", + "href": "presentations/2024-10-10_what-is-ai-tom/index.html#other-issues-with-cv", + "title": "Computer Vision", + "section": "Other issues with CV", + "text": "Other issues with CV\n\n\nEarly detection vs over diagnosis\nAdversarial attacks can trick CV algorithms to incorrectly classify images\nComputational power (environmental impact)\nGovernance: have we got consent to use images?\nExplainability of Neural Networks\n\n\n\n\n\nLearn more about The Strategy Unit" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#how-did-we-get-here", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#how-did-we-get-here", - "title": "Agile and scrum working", - "section": "How did we get here?", - "text": "How did we get here?\n\nWaterfall approaches were used in the early days of software development\n\nRequirements; Design; Development; Integration; Testing; Deployment\n\nYou only move to the next stage when the first one is complete\n(although actually it turns out you kind of don’t…)" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#what-is-testing", + "href": 
"presentations/2023-08-23_nhs-r_unit-testing/index.html#what-is-testing", + "title": "Unit testing in R", + "section": "What is testing?", + "text": "What is testing?\n\nSoftware testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation\nwikipedia" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-road-to-agile", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-road-to-agile", - "title": "Agile and scrum working", - "section": "The road to agile", - "text": "The road to agile\n\nSome of the ideas for agile floated around in the 20th century\nShewart’s Plan-Do-Study-Act cycle\nThe New New Product Development Game in 1986\nScrum (which we’ll return to) was proposed in 1993\nIn 2001 the Manifesto for Agile Software Development was published" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code", + "title": "Unit testing in R", + "section": "How can we test our code?", + "text": "How can we test our code?\n\n\nStatically\n\n\n(without executing the code)\nhappens constantly, as we are writing code\nvia code reviews\ncompilers/interpreters/linters statically analyse the code for syntax errors\n\n\n\n\n\nDynamically" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-manifesto", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-manifesto", - "title": "Agile and scrum working", - "section": "The agile manifesto", - "text": "The agile manifesto\n\nCopyright © 2001 Kent Beck, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, 
Jon Kern, Brian Marick\nRobert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland, Dave Thomas\nthis declaration may be freely copied in any form, but only in its entirety through this notice." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code-1", + "title": "Unit testing in R", + "section": "How can we test our code?", + "text": "How can we test our code?\n\n\nStatically\n\n(without executing the code)\nhappens constantly, as we are writing code\nvia code reviews\ncompilers/interpreters/linters statically analyse the code for syntax errors\n\n\n\n\nDynamically\n\n\n(by executing the code)\nsplit into functional and non-functional testing\ntesting can be manual, or automated\n\n\n\n\n\nnon-functional testing covers things like performance, security, and usability testing" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--software-and-the-mvp", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--software-and-the-mvp", - "title": "Agile and scrum working", - "section": "Agile principles- software and the MVP", - "text": "Agile principles- software and the MVP\n\nOur highest priority is to satisfy the customer through early and continuous delivery of valuable software.\nDeliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.\nWorking software is the primary measure of progress.\n\n(these principles and those on following slides copyright Ibid.)" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests", + "title": "Unit testing in R", + "section": "Different types of functional tests", + "text": "Different types of functional tests\nUnit 
Testing checks each component (or unit) for accuracy independently of one another.\n\nIntegration Testing integrates units to ensure that the code works together.\n\n\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\n\n\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements." }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--working-with-customers", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--working-with-customers", - "title": "Agile and scrum working", - "section": "Agile principles- working with customers", - "text": "Agile principles- working with customers\n\nWelcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.\nBusiness people and developers must work together daily throughout the project." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-1", + "title": "Unit testing in R", + "section": "Different types of functional tests", + "text": "Different types of functional tests\nUnit Testing checks each component (or unit) for accuracy independently of one another.\nIntegration Testing integrates units to ensure that the code works together.\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\n\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements.\n\n\nUnit, Integration, and E2E testing are all things we can automate in code, whereas UAT testing is going to be manual" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--teamwork", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--teamwork", - "title": "Agile and scrum working", - "section": "Agile 
principles- teamwork", - "text": "Agile principles- teamwork\n\nBuild projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.\nThe most efficient and effective method of conveying information to and within a development team is face-to-face conversation.\nThe best architectures, requirements, and designs emerge from self-organizing teams.\nAt regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-2", + "title": "Unit testing in R", + "section": "Different types of functional tests", + "text": "Different types of functional tests\nUnit Testing checks each component (or unit) for accuracy independently of one another.\n\nIntegration Testing integrates units to ensure that the code works together.\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements.\n\n\nOnly focussing on unit testing in this talk, but the techniques/packages could be extended to integration testing. Often other tools (potentially specific tools) are needed for E2E testing." }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--project-management", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--project-management", - "title": "Agile and scrum working", - "section": "Agile principles- project management", - "text": "Agile principles- project management\n\nAgile processes promote sustainable development. 
The sponsors, developers, and users should be able to maintain a constant pace indefinitely.\nContinuous attention to technical excellence and good design enhances agility.\nSimplicity–the art of maximizing the amount of work not done–is essential." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#example", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#example", + "title": "Unit testing in R", + "section": "Example", + "text": "Example\nWe have a {shiny} app which grabs some data from a database, manipulates the data, and generates a plot.\n\n\nwe would write unit tests to check the data manipulation and plot functions work correctly (with pre-created sample/simple datasets)\nwe would write integration tests to check that the data manipulation function works with the plot function (with similar data to what we used for the unit tests)\nwe would write e2e tests to ensure that from start to finish the app grabs the data and produces a plot as required\n\n\n\nsimple (unit tests) to complex (e2e tests)" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-advantage", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-advantage", - "title": "Agile and scrum working", - "section": "The agile advantage", - "text": "The agile advantage\n\nBetter use of fixed resources to deliver an unknown outcome, rather than unknown resources to deliver a fixed outcome\nContinuous delivery" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-pyramid", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-pyramid", + "title": "Unit testing in R", + "section": "Testing Pyramid", + "text": "Testing Pyramid\n\n\nImage source: The Testing Pyramid: Simplified for One and All headspin.io" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#feature-creep", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#feature-creep", - 
"title": "Agile and scrum working", - "section": "Feature creep", - "text": "Feature creep\n\nUsers ask for: everything they need, everything they think they may need, everything they want, everything they think they may want\n\n“every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can”\n\nZawinski’s Law- Source" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function", + "title": "Unit testing in R", + "section": "Let’s create a simple function…", + "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#regular-stakeholder-feedback", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#regular-stakeholder-feedback", - "title": "Agile and scrum working", - "section": "Regular stakeholder feedback", - "text": "Regular stakeholder feedback\n\nAgile teams are very responsive to product feedback\nThe project we’re curently working on is very agile whether we like it or not\nOur customers never know what they want until we show them something they don’t want" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-1", + "title": "Unit testing in R", + "section": "Let’s create a simple function…", + "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = 
length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#more-agile-advantages", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#more-agile-advantages", - "title": "Agile and scrum working", - "section": "More agile advantages", - "text": "More agile advantages\n\nEarly and cheap failure\nContinuous testing and QA\nReduction in unproductive work\nTeam can improve regularly, not just the product" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-2", + "title": "Unit testing in R", + "section": "Let’s create a simple function…", + "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}\n\n\nThe Ten Rules of Defensive Programming in R" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-methods", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-methods", - "title": "Agile and scrum working", - "section": "Agile methods", - "text": "Agile methods\n\nThere are lots of agile methodologies\nI’m not going to embarrass myself by pretending to understand them\nExamples include Lean, Crystal, and Extreme Programming" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test", + "title": "Unit testing in R", + "section": "… and create our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n 
my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#scrum", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum", - "title": "Agile and scrum working", - "section": "Scrum", - "text": "Scrum\n\nScrum is the agile methodology we have adopted\nDespite dire warnings to the contrary we have not adopted it wholesale but most of its principles\nThe fundamental organising principle of work in scrum is a sprint lasting 1-4 weeks\nEach sprint finishes with a defined and useful piece of software that can be shown to/ used by customers" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-1", + "title": "Unit testing in R", + "section": "… and create our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#product-owner", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#product-owner", - "title": "Agile and scrum working", - "section": "Product owner", - "text": "Product owner\n\nThis person is responsible for the backlog- what goes in to the sprint\nThe backlog should be inclusive of all of the things that customers want or might want\nThe backlog should be prioritised\nThe product owner does this through deep and frequent conversations with customers" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-2", + "title": "Unit testing in R", + "section": "… and create 
our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-master-helps-the-scrum-team", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-master-helps-the-scrum-team", - "title": "Agile and scrum working", - "section": "Scrum master helps the scrum team", - "text": "Scrum master helps the scrum team\n\n“By coaching the team members in self-management and cross-functionality\nFocus on creating high-value Increments that meet the Definition of Done\nInfluence the removal of impediments to the Scrum Team’s progress\nEnsure that all Scrum events take place and are positive, productive, and kept within the timebox.”\n\nSource" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-3", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-3", + "title": "Unit testing in R", + "section": "… and create our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-backlog", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-backlog", - "title": "Agile and scrum working", - "section": "The backlog", - "text": "The backlog\n\nHaving an accurate and well prioritised backlog is key\nDon’t estimate the backlog in hours- use “T shirt sizes” or “points”\nPeople are terrible at estimating how long things take- particularly in software\nEverything in the backlog needs a defined “Done” state" - 
}, + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-4", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-4", + "title": "Unit testing in R", + "section": "… and create our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-planning", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-planning", - "title": "Agile and scrum working", - "section": "Sprint planning", - "text": "Sprint planning\n\nThe team, the product owner, and the scrum master plan the sprint\nSprints should be a fixed length of time less than one month\nThe sprint cannot be changed or added to (we break this rule)\nThe team works autonomously in the sprint- nobody decides who does what except the team\nCan take three hours and should if it needs to" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-5", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-5", + "title": "Unit testing in R", + "section": "… and create our first test", + "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})\n\nTest passed 😸" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#standup", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#standup", - "title": "Agile and scrum working", - "section": "Standup", - "text": "Standup\n\nEvery day, for no more than 15 minutes (teams often stand up to reinforce 
this rule) team and scrum master meet\nEach person answers three questions\n\nWhat did you do yesterday to help the team finish the sprint?\nWhat will you do today to help the team finish the sprint?\nIs there an obstacle blocking you or the team from achieveing the sprint goal" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions", + "title": "Unit testing in R", + "section": "other expect_*() functions…", + "text": "other expect_*() functions…\n\ntest_that(\"my_function correctly divides values\", {\n expect_lt(\n my_function(4, 2),\n 10\n )\n expect_gt(\n my_function(1, 4),\n 0.2\n )\n expect_length(\n my_function(c(4, 1), c(2, 4)),\n 2\n )\n})\n\nTest passed 🎉\n\n\n\n{testthat} documentation" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-retro", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-retro", - "title": "Agile and scrum working", - "section": "Sprint retro", - "text": "Sprint retro\n\nWhat went well, what could have gone better, and what to improve next time\nLooking at process, not blaming individuals\nRequires maturity and trust to bring up issues, and to respond to them in a constructive way\nShould agree at the end on one process improvement which goes in the next sprint\nWe’ve had some really, really good retros and I think it’s a really important process for a team" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert", + "title": "Unit testing in R", + "section": "Arrange, Act, Assert", + "text": "Arrange, Act, Assert\n\n\n\n\n\ntest_that(\"my_function works\", {\n # arrange\n # \n #\n #\n\n # act\n #\n\n # assert\n #\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#team-perspective", - "href": 
"presentations/2024-08-22_agile-and-scrum/index.html#team-perspective", - "title": "Agile and scrum working", - "section": "Team perspective", - "text": "Team perspective\n\nProduct owner- that’s me\n\nFocus, clarity and transparency, team delivery, clear and appropriate responsibilities\n\nScrum master- YiWen\nTeam member- Matt\nTeam member- Rhian" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-1", + "title": "Unit testing in R", + "section": "Arrange, Act, Assert", + "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\n\n\nto create sample values\ncreate fake/temporary files\nset random seed\nset R options/environment variables\n\n\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n #\n\n # assert\n #\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-values", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-values", - "title": "Agile and scrum working", - "section": "Scrum values", - "text": "Scrum values\n\nCourage\nFocus\nCommitment\nRespect\nOpenness" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-2", + "title": "Unit testing in R", + "section": "Arrange, Act, Assert", + "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\nwe act by calling the function\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n #\n})" }, { - "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#using-agile-outside-of-software", - "href": "presentations/2024-08-22_agile-and-scrum/index.html#using-agile-outside-of-software", - "title": "Agile 
and scrum working", - "section": "Using agile outside of software", - "text": "Using agile outside of software\n\nData science is outside of software (IMHO)\n\nWe don’t have daily standups and some of our processes run longer than in software development\n\nYou can build cars with Agile\nMarketing and UX design\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-3", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-3", + "title": "Unit testing in R", + "section": "Arrange, Act, Assert", + "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\nwe act by calling the function\nwe assert that the actual results match our expected results\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected)\n})" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#the-team", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#the-team", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "The team", - "text": "The team" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#our-test-failed", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#our-test-failed", + "title": "Unit testing in R", + "section": "Our test failed!?! 😢", + "text": "Our test failed!?! 😢\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected)\n})\n\n── Failure: my_function works ──────────────────────────────────────────────────\n`actual` not equal to `expected`.\n1/1 mismatches\n[1] 0.714 - 0.714 == 7.14e-07\n\n\nError:\n! 
Test failed" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#a-hospital-is-a-place-where-you-can-find-people", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#a-hospital-is-a-place-where-you-can-find-people", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "A hospital is a place where you can find people…", - "text": "A hospital is a place where you can find people…\n\n\nhaving the best day of their life,\nthe worst day of their life,\nthe first day of their life,\nand the last day of their life." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#tolerance-to-the-rescue", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#tolerance-to-the-rescue", + "title": "Unit testing in R", + "section": "Tolerance to the rescue 🙂", + "text": "Tolerance to the rescue 🙂\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected, tolerance = 1e-6)\n})\n\nTest passed 🎊\n\n\n\n(this is a slightly artificial example, usually the default tolerance is good enough)" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#planning-is-hard", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#planning-is-hard", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Planning is hard", - "text": "Planning is hard\n\n\n\n\n\nbuilt with enough capacity to replace the existing school\nfailed to take into account a new housing estate\nlikely needs double the number of spaces within the next decade\n\nBBC article" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-edge-cases", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-edge-cases", + "title": "Unit testing in R", + "section": "Testing edge cases", + "text": "Testing edge 
cases\n\n\nRemember the validation steps we built into our function to handle edge cases?\n\nLet’s write tests for these edge cases:\nwe expect errors\n\n\ntest_that(\"my_function works\", {\n expect_error(my_function(5, 0))\n expect_error(my_function(\"a\", 3))\n expect_error(my_function(3, \"a\"))\n expect_error(my_function(1:2, 4))\n})\n\nTest passed 🎊" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Review of existing models", - "text": "Review of existing models\n\nSteven Wyatt - NHS-R 2022" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example", + "title": "Unit testing in R", + "section": "Another (simple) example", + "text": "Another (simple) example\n\n\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\nConsider this function - there is branched logic, so we need to carefully design tests to validate the logic works as intended." 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Review of existing models", - "text": "Review of existing models\n\nlots of models\nlots of external consultancies\nlots of similarities\n\n\n\nlots of repetition/duplication\nsufficiently different that comparing results is difficult\nmethodological progress slow\nno base to build from\n\n\n\nconsultancies don’t tend to offer products, but services\ndifficult to compare different models to understand if differences are methodological or due to assumptions\nsame issues seen 20/30 years ago\nlearning and expertise gathered tends to be trapped within trusts, or kept secret by consultancies" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example-1", + "title": "Unit testing in R", + "section": "Another (simple) example", + "text": "Another (simple) example\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\n\ntest_that(\"it returns 'x' if x is bigger than y\", {\n expect_equal(my_new_function(4, 3), \"x\")\n})\n\nTest passed 🎉\n\ntest_that(\"it returns 'y' if y is bigger than x\", {\n expect_equal(my_new_function(3, 4), \"y\")\n expect_equal(my_new_function(3, 3), \"y\")\n})\n\nTest passed 🥳" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#common-issues", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#common-issues", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Common issues", - "text": "Common issues\n\nhandling uncertainty\nunnecessary/early aggregation\npoor coverage of some changes\nlack of ownership & auditability 
of assumptions\nconflating demand forecasting with affordability\n\n\n\nmost models handle changes like demographic changes and the impact of changes in occupancy rates\nbut few try to handle addressing inequities, health status adjustment" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-to-design-good-tests", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-to-design-good-tests", + "title": "Unit testing in R", + "section": "How to design good tests", + "text": "How to design good tests\na non-exhaustive list\n\nconsider all the functions arguments,\nwhat are the expected values for these arguments?\nwhat are unexpected values, and are they handled?\nare there edge cases that need to be handled?\nhave you covered all of the different paths in your code?\nhave you managed to create tests that check the range of results you expect?" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#our-model", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#our-model", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Our model", - "text": "Our model\n\nopen source (not quite yet…)\nuses standard, well-known datasets (e.g. 
HES, ONS population projections)\ncurrently handles Inpatient admissions, Outpatient attendances, and A&E arrivals\nextensible and adaptable\ncovering all of the change factors\nstochastic Monte-Carlo model to handle uncertainty" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#but-why-create-tests", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#but-why-create-tests", + "title": "Unit testing in R", + "section": "But, why create tests?", + "text": "But, why create tests?\nanother non-exhaustive list\n\ngood tests will help you uncover existing issues in your code\nwill defend you from future changes that break existing functionality\nwill alert you to changes in dependencies that may have changed the functionality of your code\ncan act as documentation for other developers" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#project-structure", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#project-structure", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Project Structure", - "text": "Project Structure\n\n\n\nData Extraction (R + {targets} & Sql)\nInputs App (R + {shiny})\nOutputs App (R + {shiny})\nModel Engine (Python & Docker)\nAzure Infrastructure (VM/ACR/ACI/Storage Accounts)\nAll of the code is stored on GitHub (currently, private repos 😔)\n\n\n\n\n\n\n\nflowchart TB\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n\n A[Data Extraction]\n B[Inputs App]\n C[Model]\n D[Outputs App]\n\n\n SB[(input app data)]\n SC[(model data)]\n SD[(results data)]\n\n A ---> SB\n A ---> SC\n \n SB ---> B\n SC ---> C\n\n B ---> C\n\n C ---> SD\n SD ---> D\n\n B -.-> D\n\n class A,B,C,D orange\n class SB,SC,SD lightslate" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-complex-functions", + "href": 
"presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-complex-functions", + "title": "Unit testing in R", + "section": "Testing complex functions", + "text": "Testing complex functions\n\n\n\nmy_big_function <- function(type) {\n con <- dbConnect(RSQLite::SQLite(), \"data.db\")\n df <- tbl(con, \"data_table\") |>\n collect() |>\n mutate(across(date, lubridate::ymd))\n\n conditions <- read_csv(\n \"conditions.csv\", col_types = \"cc\"\n ) |>\n filter(condition_type == type)\n\n df |>\n semi_join(conditions, by = \"condition\") |>\n count(date) |>\n ggplot(aes(date, n)) +\n geom_line() +\n geom_point()\n}\n\n\nWhere do you even begin to start writing tests for something so complex?\n\n\nNote: to get the code on the left to fit on one page, I skipped including a few library calls\n\nlibrary(tidyverse)\nlibrary(DBI)" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-overview", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-overview", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Overview", - "text": "Model Overview\n\n\nthe baseline data is a year worth of a provider’s HES data\neach row in the baseline data is run through a series of steps\neach step creates a factor that says how many times (on average) to sample that row\nthe factors are multiplied together and used to create a random Poisson value\nwe resample the rows using this random values\nefficiencies are then applied, e.g. 
LoS reductions, type conversions\n\n\n\n\nIP/OP/A&E data\ncomplex, but not complicated" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions", + "title": "Unit testing in R", + "section": "Split the logic into smaller functions", + "text": "Split the logic into smaller functions\nFunction to get the data from the database\n\nget_data_from_sql <- function() {\n con <- dbConnect(RSQLite::SQLite(), \"data.db\")\n tbl(con, \"data_table\") |>\n collect() |>\n mutate(across(date, lubridate::ymd))\n}" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Diagram", - "text": "Model Diagram\n\n\n\n\n\nflowchart TB\n classDef blue fill:#5881c1,stroke:#2c2825,color:#2c2825;\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n classDef slate fill:#e0e2e3,stroke:#2c2825,color:#2c2825;\n\n S[Baseline Activity]\n T[Future Activity]\n\n class S,T red\n\n subgraph rr[Row Resampling]\n direction LR\n\n subgraph pop[Population Changes]\n direction TB\n pop_p[Population Growth]\n pop_a[Age/Sex Structure]\n pop_h[Population Specific Health Status]\n\n class pop_p,pop_a,pop_h orange\n\n pop_p --- pop_a --- pop_h\n end\n\n subgraph dsi[Demand Supply Imbalances]\n direction TB\n dsi_w[Waiting List Adjustment]\n dsi_r[Repatriation/Expatriation]\n dsi_p[Private Healthcare Dynamics]\n\n class dsi_w,dsi_r,dsi_p orange\n\n dsi_w --- dsi_r --- dsi_p\n end\n\n subgraph nsi[Need Supply Imbalances]\n direction TB\n nsi_g[Gaps in Care]\n nsi_i[Inequalities]\n nsi_t[Threshold Imbalances]\n\n 
class nsi_g,nsi_i,nsi_t orange\n\n nsi_g --- nsi_i --- nsi_t\n end\n\n subgraph nda [Non-Demographic Adjustment]\n direction TB\n nda_m[Medical Interventions]\n nda_c[Changes to National Standards]\n nda_p[Patient Expectations]\n\n class nda_m,nda_c,nda_p orange\n\n nda_m --- nda_c --- nda_p\n end\n\n subgraph mit[Activity Mitigators]\n direction TB\n mit_a[Activity Avoidance]\n mit_t[Type Conversion]\n mit_e[Efficiencies]\n \n class mit_a,mit_t,mit_e orange\n\n mit_a --- mit_t --- mit_e\n end\n\n pop --- dsi --- nsi --- nda --- mit\n\n class dsi,nsi,pop,nda,mit lightslate\n end\n\n class rr slate\n \n S --> rr --> T\n\n\n\n\n\n\n\n\nuses either patient-level data, or minimal aggregation\nrow resampling grouped into 5 broad groups\n\npopulation changes address the changes to the structure of the population and health status over the medium term\ndemand supply imbalances: hospitals are currently struggling to keep pace with demand, so we correct for this to not carry forwards these into the future\nneed supply imbalance: addressing gaps in care that currently exist\nnon-demographic: such as the development of new medical technologies\nactivity mitigators: strategies trusts adopt for reducing activity, or delivering activity more efficiently\n\nsome assumptions set nationally, such as population growth via ONS population projections\nother assumptions set locally, with support from a Shiny app" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-1", + "title": "Unit testing in R", + "section": "Split the logic into smaller functions", + "text": "Split the logic into smaller functions\nFunction to get the relevant conditions\n\nget_conditions <- function(type) {\n read_csv(\n \"conditions.csv\", col_types = \"cc\"\n ) |>\n filter(condition_type == type)\n}" }, { - "objectID": 
"presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Diagram", - "text": "Model Diagram\n\n\n\n\n\nflowchart TB\n classDef blue fill:#5881c1,stroke:#2c2825,color:#2c2825;\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n classDef slate fill:#e0e2e3,stroke:#2c2825,color:#2c2825;\n\n S[Baseline Activity]\n T[Future Activity]\n\n ORANGE[Implemented]\n BLUE[Not yet implemented]\n\n class ORANGE orange\n class BLUE blue\n\n class S,T red\n\n subgraph rr[Row Resampling]\n direction LR\n\n subgraph pop[Population Changes]\n direction TB\n pop_p[Population Growth]\n pop_a[Age/Sex Structure]\n pop_h[Population Specific Health Status]\n\n class pop_p,pop_a,pop_h orange\n\n pop_p --- pop_a --- pop_h\n end\n\n subgraph dsi[Demand Supply Imbalances]\n direction TB\n dsi_w[Waiting List Adjustment]\n dsi_r[Repatriation/Expatriation]\n dsi_p[Private Healthcare Dynamics]\n\n class dsi_w,dsi_r orange\n class dsi_p blue\n\n dsi_w --- dsi_r --- dsi_p\n end\n\n subgraph nsi[Need Supply Imbalances]\n direction TB\n nsi_g[Gaps in Care]\n nsi_i[Inequalities]\n nsi_t[Threshold Imbalances]\n\n class nsi_g,nsi_i,nsi_t blue\n\n nsi_g --- nsi_i --- nsi_t\n end\n\n subgraph nda [Non-Demographic Adjustment]\n direction TB\n nda_m[Medical Interventions]\n nda_c[Changes to National Standards]\n nda_p[Patient Expectations]\n\n class nda_m,nda_c,nda_p blue\n\n nda_m --- nda_c --- nda_p\n end\n\n subgraph mit[Activity Mitigators]\n direction TB\n mit_a[Activity Avoidance]\n mit_t[Type Conversion]\n mit_e[Efficiencies]\n \n class mit_a,mit_t,mit_e orange\n\n mit_a --- mit_t --- mit_e\n end\n\n pop --- dsi --- nsi --- nda --- mit\n\n class dsi,nsi,pop,nda,mit 
lightslate\n end\n\n class rr slate\n \n S --> rr --> T" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-2", + "title": "Unit testing in R", + "section": "Split the logic into smaller functions", + "text": "Split the logic into smaller functions\nFunction to combine the data and create a count by date\n\nsummarise_data <- function(df, conditions) {\n df |>\n semi_join(conditions, by = \"condition\") |>\n count(date)\n}" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#monte-carlo-simulation", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#monte-carlo-simulation", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Monte Carlo Simulation", - "text": "Monte Carlo Simulation\n\n\n\nWe run the model N times, varying the input parameters each time slightly to handle the uncertainty.\nThe results of the model are aggregated at the end of each model run\nThe aggregated results are combined at the end into a single file\n\n\n\n\n\n\n\nflowchart LR\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n \n A[Baseline Activity]\n Ba[Model Run 0]\n Bb[Model Run 1]\n Bc[Model Run 2]\n Bd[Model Run 3]\n Bn[Model Run n]\n C[Results]\n\n A ---> Ba ---> C\n A ---> Bb ---> C\n A ---> Bc ---> C\n A ---> Bd ---> C\n A ---> Bn ---> C\n \n class A,C red\n class Ba,Bb,Bc,Bd,Bn orange\n \n\n\n\n\n\n\n\nInspired by\n\nMapReduce (Google, 2004)\nSplit, Apply, Combine (H. 
Wickham, 2011)" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-3", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-3", + "title": "Unit testing in R", + "section": "Split the logic into smaller functions", + "text": "Split the logic into smaller functions\nFunction to generate a plot from the summarised data\n\ncreate_plot <- function(df) {\n df |>\n ggplot(aes(date, n)) +\n geom_line() +\n geom_point()\n}" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Parameters", - "text": "Model Parameters\n\nWe ask users to provide parameters in the form of 90% confidence intervals\nWe can then convert these confidence intervals into distributions\nDuring the model we sample values from these distributions for each model parameter\nAll of the parameters represent the average rate to sample a row of data from the baseline" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-4", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-4", + "title": "Unit testing in R", + "section": "Split the logic into smaller functions", + "text": "Split the logic into smaller functions\nThe original function refactored to use the new functions\n\nmy_big_function <- function(type) {\n conditions <- get_conditions(type)\n\n get_data_from_sql() |>\n summarise_data(conditions) |>\n create_plot()\n}\n\n\nThis is going to be significantly easier to test, because we now can verify that the individual components work correctly, rather than having to consider all of the possibilities at once." 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Parameters", - "text": "Model Parameters\n\n“We expect in the future to see between a 25% reduction and a 25% increase in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\nsummarise_data <- function(df, conditions) {\n df |>\n semi_join(conditions, by = \"condition\") |>\n count(date)\n}" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-2", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-2", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Parameters", - "text": "Model Parameters\n\n“We expect in the future to see between a 20% reduction and a 90% reduction in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-1", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\ntest_that(\"it summarises the data\", {\n # arrange\n \n\n\n\n\n\n\n \n\n \n # act\n \n # assert\n \n})" 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-3", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-3", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Parameters", - "text": "Model Parameters\n\n“We expect in the future to see between a 2% reduction and an 18% reduction in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-2", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-2", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n \n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n \n\n\n\n\n # act\n \n # assert\n \n})\n\nGenerate some random data to build a reasonably sized data frame.\nYou could also create a table manually, but part of the trick of writing good tests for this function is to make it so the dates don’t all have the same count.\nThe reason for this is it’s harder to know for sure that the count worked if every row returns the same value.\nWe don’t need the values to be exactly like they are in the real data, just close enough. Instead of dates, we can use numbers, and instead of actual conditions, we can use letters." 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Run Example (1)", - "text": "Model Run Example (1)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n1.00\n\n\n2\n50\nm\n110\n3\n1.00\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n1.00\n\n\n5\n50\nf\n110\n2\n1.00\n\n\n6\n52\nf\n120\n0\n1.00\n\n\n\n\n\n\n\n\n\nStart with baseline data - we are going to sample each row exactly once (column f)." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-3", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-3", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n \n\n\n\n\n # act\n \n # assert\n \n})\n\nTests need to be reproducible, and generating our table at random will give us unpredictable results.\nSo, we need to set the random seed; now every time this test runs we will generate the same data." 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-2", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-2", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Run Example (2)", - "text": "Model Run Example (2)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n1.00\n\n\n2\n50\nm\n110\n3\n1.00\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n1.00\n\n\n5\n50\nf\n110\n2\n1.00\n\n\n6\n52\nf\n120\n0\n1.00\n\n\n\n\n\n\n\nage\nsex\nf\n\n\n\n\n50\nm\n0.90\n\n\n51\nm\n1.10\n\n\n52\nm\n1.20\n\n\n50\nf\n0.80\n\n\n51\nf\n0.70\n\n\n52\nf\n1.30\n\n\n\n\n\n\n\nf\n\n\n\n\n1.00 × 0.90 = 0.90\n\n\n1.00 × 0.90 = 0.90\n\n\n1.00 × 1.10 = 1.10\n\n\n1.00 × 0.80 = 0.80\n\n\n1.00 × 0.80 = 0.80\n\n\n1.00 × 1.30 = 1.30\n\n\n\n\n\nWe perform a step where we join based on age and sex, then update the f column." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-4", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-4", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n \n\n\n\n # act\n \n # assert\n \n})\n\nCreate the conditions table. We don’t need all of the columns that are present in the real csv, just the ones that will make our code work.\nWe also need to test that the filtering join (semi_join) is working, so we want to use a subset of the conditions that were used in df." 
}, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-3", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-3", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Run Example (3)", - "text": "Model Run Example (3)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n0.90\n\n\n2\n50\nm\n110\n3\n0.90\n\n\n3\n51\nm\n120\n5\n1.10\n\n\n4\n50\nf\n100\n1\n0.80\n\n\n5\n50\nf\n110\n2\n0.80\n\n\n6\n52\nf\n120\n0\n1.30\n\n\n\n\n\n\n\nspecialty\nf\n\n\n\n\n100\n0.90\n\n\n110\n1.10\n\n\n\n\n\n\n\nf\n\n\n\n\n0.90 × 0.90 = 0.81\n\n\n0.90 × 1.10 = 0.99\n\n\n1.10 × 1.00 = 1.10\n\n\n0.80 × 0.90 = 0.72\n\n\n0.80 × 1.10 = 0.88\n\n\n1.30 × 1.00 = 1.30\n\n\n\n\n\nThe next step joins on the specialty column, again updating f. Note, if there is no value to join on, then we multiply by 1." + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-5", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-5", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n \n \n\n \n # act\n actual <- summarise_data(df, conditions)\n # assert\n \n})\n\nBecause we are generating df randomly, to figure out what our “expected” results are, I simply ran the code inside of the test to generate the “actual” results.\nGenerally, this isn’t a good idea. You are creating the results of your test from the code; ideally, you want to be thinking about what the results of your function should be.\nImagine your function doesn’t work as intended, there is some subtle bug that you are not yet aware of. 
By writing tests “backwards” you may write test cases that confirm the results, but not expose the bug. This is why it’s good to think about edge cases." }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-4", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-4", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Run Example (4)", - "text": "Model Run Example (4)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\nn\n\n\n\n\n1\n50\nm\n100\n4\n0.90\n1\n\n\n2\n50\nm\n110\n3\n0.90\n0\n\n\n3\n51\nm\n120\n5\n1.10\n2\n\n\n4\n50\nf\n100\n1\n0.80\n1\n\n\n5\n50\nf\n110\n2\n0.80\n0\n\n\n6\n52\nf\n120\n0\n1.30\n3\n\n\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\n\n\n\n\n1\n50\nm\n100\n4\n\n\n3\n51\nm\n120\n5\n\n\n3\n51\nm\n120\n5\n\n\n4\n50\nf\n100\n1\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n\n\n\nOnce all of the steps are performed, sample a random value n from a Poisson distribution with λ=f, then we select each row n times." 
+ "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-6", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-6", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n expected <- tibble(\n date = 1:10,\n n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)\n ) \n # act\n actual <- summarise_data(df, conditions)\n # assert\n \n})\n\nThat said, in cases where we can be confident (say by static analysis of our code) that it is correct, building tests in this way will give us the confidence going forwards that future changes do not break existing functionality.\nIn this case, I have created the expected data frame using the results from running the function." }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-5", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-5", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Model Run Example (5)", - "text": "Model Run Example (5)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\ng\n\n\n\n\n1\n50\nm\n100\n4\n0.75\n\n\n3\n51\nm\n120\n5\n0.50\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n0.90\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\n\n\n\n\n1\n50\nm\n100\n2\n\n\n3\n51\nm\n120\n1\n\n\n3\n51\nm\n120\n5\n\n\n4\n50\nf\n100\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n\n\n\nAfter resampling, we apply efficiency steps. E.g., similar joins are used to create column g, which is then used to sample a new LOS from a binomial distribution." 
+ "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-7", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-7", + "title": "Unit testing in R", + "section": "Let’s test summarise_data", + "text": "Let’s test summarise_data\n\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\"))\n expected <- tibble(\n date = 1:10,\n n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)\n )\n # act\n actual <- summarise_data(df, conditions)\n # assert\n expect_equal(actual, expected)\n})\n\nTest passed 😸\n\n\n\nThe test works!" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "How the model is built", - "text": "How the model is built\n\nThe model is built in Python and can be run on any machine you can install Python on\nUses various packages, such as numpy and pandas\nReads data in .parquet format for efficiency\nReturns aggregated results as a .json file\nCould also output full row level results if needed" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps", + "title": "Unit testing in R", + "section": "Next steps", + "text": "Next steps\n\nYou can add tests to any R project (to test functions),\nBut {testthat} works best with Packages\nThe R Packages book has 3 chapters on testing\nThere are two useful helper functions in {usethis}\n\nuse_testthat() will set up the folders for test scripts\nuse_test() will create a test file for the currently open script" }, { - "objectID": 
"presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "How the model is built", - "text": "How the model is built\n\nCode is built in a modular approach\nEach activity type (Inpatients/Outpatients/A&E) has its own model code\nCode is reused where possible (e.g. all three models share the code for demographic adjustment)" + "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps-1", + "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps-1", + "title": "Unit testing in R", + "section": "Next steps", + "text": "Next steps\n\nIf your test needs to temporarily create a file, or change some R-options, the {withr} package has a lot of useful functions that will automatically clean things up when the test finishes\nIf you are writing tests that involve calling out to a database, or you want to test my_big_function (from before) without calling the intermediate functions, then you should look at the {mockery} package" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-deployed", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-deployed", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "How the model is deployed", - "text": "How the model is deployed\n\nDeployed as a Docker Container\nRuns in Azure Container Instances\nEach model run creates a new container, and the container is destroyed when the model run completes" + "objectID": "presentations/index.html", + "href": "presentations/index.html", + "title": "Presentations", + "section": "", + "text": "Title\nAuthor\nDate\n\n\n\n\nWhat is AI?\nData science team, Strategy Unit\n2024-10-10\n\n\nComputer Vision: Is this AI?\nTom 
Jemmett\n2024-10-10\n\n\nIdentifying patients at risk: Is this AI?\nYiWen Hon\n2024-10-10\n\n\nUsing R and Python to model future hospital activity: EARL Conference 2024\nYiWen Hon, Matt Dray, Tom Jemmett\n2024-09-05\n\n\nAgile and scrum working\nChris Beeley\n2024-08-22\n\n\nOpen source licensing: Or: how I learned to stop worrying and love openness\nChris Beeley\n2024-05-30\n\n\nGitHub as a team sport: DfT QA Month\nMatt Dray\n2024-05-23\n\n\nStore Data Safely: Coffee & Coding\nYiWen Hon, Matt Dray\n2024-05-16\n\n\nCoffee and Coding: Making my analytical workflow more reproducible with {targets}\nJacqueline Grout\n2024-01-25\n\n\nConference Check-in App: NHS-R/NHS.pycom 2023\nTom Jemmett\n2023-10-17\n\n\nSystem Dynamics in health and care: fitting square data into round models\nSally Thompson\n2023-10-09\n\n\nRepeating Yourself with Functions: Coffee and Coding\nSally Thompson\n2023-09-07\n\n\nCoffee and Coding: Working with Geospatial Data in R\nTom Jemmett\n2023-08-24\n\n\nUnit testing in R: NHS-R Community Webinar\nTom Jemmett\n2023-08-23\n\n\nEverything you ever wanted to know about data science: but were too afraid to ask\nChris Beeley\n2023-08-02\n\n\nTravels with R and Python: the power of data science in healthcare\nChris Beeley\n2023-08-02\n\n\nAn Introduction to the New Hospital Programme Demand Model: HACA 2023\nTom Jemmett\n2023-07-11\n\n\nWhat good data science looks like\nChris Beeley\n2023-05-23\n\n\nText mining of patient experience data\nChris Beeley\n2023-05-15\n\n\nCoffee and Coding: {targets}\nTom Jemmett\n2023-03-23\n\n\nCollaborative working\nChris Beeley\n2023-03-23\n\n\nCoffee and Coding: Good Coding Practices\nTom Jemmett\n2023-03-09\n\n\nRAP: what is it and how can my team start using it effectively?\nChris Beeley\n2023-03-09\n\n\nCoffee and coding: Intro session\nChris Beeley\n2023-02-23" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#data-extraction", - "href": 
"presentations/2023-07-11_haca-nhp-demand-model/index.html#data-extraction", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Data Extraction", - "text": "Data Extraction\n\nUses principles of RAP, using R + {targets} and Sql\nAll of the data required to run the model\nData is extracted from various sources\n\nSql Datawarehouse (HES data)\nONS population projections + life expectancy tables\nCentral returns, e.g. KH03\nODS data (organisation names, successors)\n\nExtracted data is uploaded to Azure storage containers" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#a-note-on-richard-stallman", + "href": "presentations/2024-05-30_open-source-licensing/index.html#a-note-on-richard-stallman", + "title": "Open source licensing", + "section": "A note on Richard Stallman", + "text": "A note on Richard Stallman\n\nRichard Stallman has been heavily criticised for some of his views\nHe is hard to ignore when talking about open source so I am going to talk about him\nNothing in this talk should be read as endorsing any of his comments outside (or inside) the world of open source" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Inputs App", - "text": "Inputs App\nA {shiny} app that allows the user to set parameters, and submit as a job to run the model with those values." 
+ "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-origin-of-open-source", + "href": "presentations/2024-05-30_open-source-licensing/index.html#the-origin-of-open-source", + "title": "Open source licensing", + "section": "The origin of open source", + "text": "The origin of open source\n\nIn the 50s and 60s source code was routinely shared with hardware and users were often expected to modify to run on their hardware\nBy the late 1960s the production cost of software was rising relative to hardware and proprietary licences became more prevalent\nIn 1980 Richard Stallman’s department at MIT took delivery of a printer they were not able to modify the source code for\nRichard Stallman launched the GNU project in 1983 to fight for software freedoms\nMIT licence was launched in the late 1980s\nCathedral and the bazaar was released in 1997 (more on which later)" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Inputs App", - "text": "Inputs App" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-is-open-source", + "href": "presentations/2024-05-30_open-source-licensing/index.html#what-is-open-source", + "title": "Open source licensing", + "section": "What is open source?", + "text": "What is open source?\n\nThink free as in free speech, not free beer (Stallman)\n\n\nOpen source does not mean free of charge! Software freedom implies the ability to sell code\nFree of charge does not mean open source! Many free to download pieces of software are not open source (Zoom, for example)\n\n\nBy Chao-Kuei et al. 
- https://www.gnu.org/philosophy/categories.en.html, GPL, Link" }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Outputs App", - "text": "Outputs App\nA {shiny} app that allows the user to view the results of model runs." + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-four-freedoms", + "href": "presentations/2024-05-30_open-source-licensing/index.html#the-four-freedoms", + "title": "Open source licensing", + "section": "The four freedoms", + "text": "The four freedoms\n\nFreedom 0: The freedom to use the program for any purpose.\nFreedom 1: The freedom to study how the program works, and change it to make it do what you wish.\nFreedom 2: The freedom to redistribute and make copies so you can help your neighbor.\nFreedom 3: The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits." }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app-1", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app-1", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Outputs App", - "text": "Outputs App" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar", + "href": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar", + "title": "Open source licensing", + "section": "Cathedral and the bazaar", + "text": "Cathedral and the bazaar\n\nEvery good work of software starts by scratching a developer’s personal itch.\nGood programmers know what to write. 
Great ones know what to rewrite (and reuse).\nPlan to throw one [version] away; you will, anyhow (copied from Frederick Brooks’s The Mythical Man-Month).\nIf you have the right attitude, interesting problems will find you.\nWhen you lose interest in a program, your last duty to it is to hand it off to a competent successor.\nTreating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging.\nRelease early. Release often. And listen to your customers.\nGiven a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.\nSmart data structures and dumb code works a lot better than the other way around.\nIf you treat your beta-testers as if they’re your most valuable resource, they will respond by becoming your most valuable resource." }, { - "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#questions", - "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#questions", - "title": "An Introduction to the New Hospital Programme Demand Model", - "section": "Questions?", - "text": "Questions?\n\nContact The Strategy Unit\n\n\n strategy.unit@nhs.net\n The-Strategy-Unit\n\n\nContact Me\n\n\n thomas.jemmett@nhs.net\n tomjemmett\n\n\n\n\n\nview slides at https://tinyurl.com/haca23nhp" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar-cont.", + "href": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar-cont.", + "title": "Open source licensing", + "section": "Cathedral and the bazaar (cont.)", + "text": "Cathedral and the bazaar (cont.)\n\nThe next best thing to having good ideas is recognizing good ideas from your users. 
Sometimes the latter is better.\nOften, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.\nPerfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away. (Attributed to Antoine de Saint-Exupéry)\nAny tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.\nWhen writing gateway software of any kind, take pains to disturb the data stream as little as possible—and never throw away information unless the recipient forces you to!\nWhen your language is nowhere near Turing-complete, syntactic sugar can be your friend.\nA security system is only as secure as its secret. Beware of pseudo-secrets.\nTo solve an interesting problem, start by finding a problem that is interesting to you.\nProvided the development coordinator has a communications medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one." 
}, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#packages-we-are-using-today", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#packages-we-are-using-today", - "title": "Coffee and Coding", - "section": "Packages we are using today", - "text": "Packages we are using today\n\nlibrary(tidyverse)\n\nlibrary(sf)\n\nlibrary(tidygeocoder)\nlibrary(PostcodesioR)\n\nlibrary(osrm)\n\nlibrary(leaflet)" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-disciplines-of-open-source-are-the-disciplines-of-good-data-science", + "href": "presentations/2024-05-30_open-source-licensing/index.html#the-disciplines-of-open-source-are-the-disciplines-of-good-data-science", + "title": "Open source licensing", + "section": "The disciplines of open source are the disciplines of good data science", + "text": "The disciplines of open source are the disciplines of good data science\n\nMeaningful README\nMeaningful commit messages\nModularity\nSeparating data code from analytic code from interactive code\nAssigning issues and pull requests for action/ review\nDon’t forget one of the most lazy and incompetent developers you will ever work with is yourself, six months later" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#getting-boundary-data", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#getting-boundary-data", - "title": "Coffee and Coding", - "section": "Getting boundary data", - "text": "Getting boundary data\nWe can use the ONS’s Geoportal we can grab boundary data to generate maps\n\n\n\nicb_url <- paste0(\n \"https://services1.arcgis.com\",\n \"/ESMARspQHYMw9BZ9/arcgis\",\n \"/rest/services\",\n \"/Integrated_Care_Boards_April_2023_EN_BGC\",\n \"/FeatureServer/0/query\",\n \"?outFields=*&where=1%3D1&f=geojson\"\n)\nicb_boundaries <- read_sf(icb_url)\n\nicb_boundaries |>\n ggplot() +\n geom_sf() +\n theme_void()" + "objectID": 
"presentations/2024-05-30_open-source-licensing/index.html#what-licences-exist", + "href": "presentations/2024-05-30_open-source-licensing/index.html#what-licences-exist", + "title": "Open source licensing", + "section": "What licences exist?", + "text": "What licences exist?\n\nPermissive\n\nSuch as MIT but there are others. Recommended by NHSX draft guidelines on open source\nApache is a notable permissive licence- includes a patent licence\nIn our work the OGL is also relevant- civil servants publish stuff under OGL (and MIT- it isn’t particularly recommended for code)\n\nCopyleft\n\nGPL2, GPL3, AGPL (“the GPL of the web”)\nNote that the provisions of the GPL only apply when you distribute the code\nAt a certain point it all gets too complicated and you need a lawyer\nMPL is a notable copyleft licence- can combine with proprietary code as long as kept separate\n\nArguments for permissive/ copyleft- getting your code used versus preserving software freedoms for other people\nNote that most of the licences are impossible to read! 
There is a website to explain tl;dr" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-is-the-icb_boundaries-data", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-is-the-icb_boundaries-data", - "title": "Coffee and Coding", - "section": "What is the icb_boundaries data?", - "text": "What is the icb_boundaries data?\n\nicb_boundaries |>\n select(ICB23CD, ICB23NM)\n\nSimple feature collection with 42 features and 2 fields\nGeometry type: MULTIPOLYGON\nDimension: XY\nBounding box: xmin: -6.418667 ymin: 49.86479 xmax: 1.763706 ymax: 55.81112\nGeodetic CRS: WGS 84\n# A tibble: 42 × 3\n ICB23CD ICB23NM geometry\n <chr> <chr> <MULTIPOLYGON [°]>\n 1 E54000008 NHS Cheshire and Merseyside Integrated C… (((-3.083264 53.2559, -3…\n 2 E54000010 NHS Staffordshire and Stoke-on-Trent Int… (((-1.950489 53.21188, -…\n 3 E54000011 NHS Shropshire, Telford and Wrekin Integ… (((-2.380794 52.99841, -…\n 4 E54000013 NHS Lincolnshire Integrated Care Board (((0.2687853 52.81584, 0…\n 5 E54000015 NHS Leicester, Leicestershire and Rutlan… (((-0.7875237 52.97762, …\n 6 E54000018 NHS Coventry and Warwickshire Integrated… (((-1.577608 52.67858, -…\n 7 E54000019 NHS Herefordshire and Worcestershire Int… (((-2.272042 52.43972, -…\n 8 E54000022 NHS Norfolk and Waveney Integrated Care … (((1.666741 52.31366, 1.…\n 9 E54000023 NHS Suffolk and North East Essex Integra… (((0.8997023 51.7732, 0.…\n10 E54000024 NHS Bedfordshire, Luton and Milton Keyne… (((-0.4577115 52.32009, …\n# ℹ 32 more rows" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-is-copyright-and-why-does-it-matter", + "href": "presentations/2024-05-30_open-source-licensing/index.html#what-is-copyright-and-why-does-it-matter", + "title": "Open source licensing", + "section": "What is copyright and why does it matter", + "text": "What is copyright and why does it matter\n\nCopyright is assigned at the moment of creation\nIf you made it in 
your own time, it’s yours (usually!)\nIf you made it at work, it belongs to your employer\nIf someone paid you to make it (“work for hire”) it belongs to them\nCrucially, the copyright holder can relicence software\n\nIf it’s jointly authored it depends if it’s a “collective” or “joint” work\nHonestly it’s pretty complicated. Just vest copyright in an organisation or group of individuals you trust\nGoldacre review suggests using Crown copyright for copyright in the NHS because it’s a “shoal, not a big fish” (with apologies to Ben whom I am misquoting)" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-dataframes", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-dataframes", - "title": "Coffee and Coding", - "section": "Working with geospatial dataframes", - "text": "Working with geospatial dataframes\nWe can simply join sf data frames and “regular” data frames together\n\n\n\nicb_metrics <- icb_boundaries |>\n st_drop_geometry() |>\n select(ICB23CD) |>\n mutate(admissions = rpois(n(), 1000000))\n\nicb_boundaries |>\n inner_join(icb_metrics, by = \"ICB23CD\") |>\n ggplot() +\n geom_sf(aes(fill = admissions)) +\n scale_fill_viridis_c() +\n theme_void()" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#iceweasel", + "href": "presentations/2024-05-30_open-source-licensing/index.html#iceweasel", + "title": "Open source licensing", + "section": "Iceweasel", + "text": "Iceweasel\n\nIceweasel is a story of trademark rather than copyright\nDebian (a Linux flavour) had the permission to use the source code of Firefox, but not the logo\nSo they took the source code and made their own version\nThis sounds very obscure and unimportant but it could become important in future projects of ours, like…" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames", - "href": 
"presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames", - "title": "Coffee and Coding", - "section": "Working with geospatial data frames", - "text": "Working with geospatial data frames\nWe can manipulate sf objects like other data frames\n\n\n\nlondon_icbs <- icb_boundaries |>\n filter(ICB23NM |> stringr::str_detect(\"London\"))\n\nggplot() +\n geom_sf(data = london_icbs) +\n geom_sf(data = st_centroid(london_icbs)) +\n theme_void()" + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-we-have-learned-in-recent-projects", + "href": "presentations/2024-05-30_open-source-licensing/index.html#what-we-have-learned-in-recent-projects", + "title": "Open source licensing", + "section": "What we have learned in recent projects", + "text": "What we have learned in recent projects\n\nThe huge benefits of being open\n\nTransparency\nWorking with customers\nGoodwill\n\nNonfree mitigators\nDifferent licences for different repos" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames-1", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames-1", - "title": "Coffee and Coding", - "section": "Working with geospatial data frames", - "text": "Working with geospatial data frames\nSummarising the data will combine the geometries.\n\nlondon_icbs |>\n summarise(area = sum(Shape__Area)) |>\n # and use geospatial functions to create calculations using the geometry\n mutate(new_area = st_area(geometry), .before = \"geometry\")\n\nSimple feature collection with 1 feature and 2 fields\nGeometry type: MULTIPOLYGON\nDimension: XY\nBounding box: xmin: -0.5102803 ymin: 51.28676 xmax: 0.3340241 ymax: 51.69188\nGeodetic CRS: WGS 84\n# A tibble: 1 × 3\n area new_area geometry\n* <dbl> [m^2] <MULTIPOLYGON [°]>\n1 1573336388. 1567995610. 
(((-0.3314819 51.43935, -0.3306676 51.43889, -0.33118…\n\n\n Why the difference in area?\n\n We are using a simplified geometry, so calculating the area will be slightly inaccurate. The original area was calculated on the non-simplified geometries." + "objectID": "presentations/2024-05-30_open-source-licensing/index.html#software-freedom-means-allowing-people-to-do-stuff-you-dont-like", + "href": "presentations/2024-05-30_open-source-licensing/index.html#software-freedom-means-allowing-people-to-do-stuff-you-dont-like", + "title": "Open source licensing", + "section": "Software freedom means allowing people to do stuff you don’t like", + "text": "Software freedom means allowing people to do stuff you don’t like\n\nFreedom 0: The freedom to use the program for any purpose.\nFreedom 3: The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits.\nThe code isn’t the only thing with worth in the project\nThis is why there are whole businesses founded on “here’s the Linux source code”\nSo when we’re sharing code we are letting people do stupid things with it but we’re not recommending that they do stupid things with it\nPeople do stupid things with Excel and Microsoft don’t accept liability for that, and neither should we\nThis issue of sharing analytic code and merchantability for a particular purpose is poorly understood and I think everyone needs to be clearer on it (us, and our customers)\nIn my view a world where consultants are selling our code is better than a world where they’re selling their spreadsheets" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-our-own-geospatial-data", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-our-own-geospatial-data", - "title": "Coffee and Coding", - "section": "Creating our own geospatial data", - "text": "Creating our own geospatial data\n\nlocation_raw 
<- postcode_lookup(\"B2 4BJ\")\nglimpse(location_raw)\n\nRows: 1\nColumns: 40\n$ postcode <chr> \"B2 4BJ\"\n$ quality <int> 1\n$ eastings <int> 406866\n$ northings <int> 286775\n$ country <chr> \"England\"\n$ nhs_ha <chr> \"West Midlands\"\n$ longitude <dbl> -1.90033\n$ latitude <dbl> 52.47887\n$ european_electoral_region <chr> \"West Midlands\"\n$ primary_care_trust <chr> \"Heart of Birmingham Teaching\"\n$ region <chr> \"West Midlands\"\n$ lsoa <chr> \"Birmingham 138A\"\n$ msoa <chr> \"Birmingham 138\"\n$ incode <chr> \"4BJ\"\n$ outcode <chr> \"B2\"\n$ parliamentary_constituency <chr> \"Birmingham, Ladywood\"\n$ parliamentary_constituency_2024 <chr> \"Birmingham Ladywood\"\n$ admin_district <chr> \"Birmingham\"\n$ parish <chr> \"Birmingham, unparished area\"\n$ admin_county <lgl> NA\n$ date_of_introduction <chr> \"198001\"\n$ admin_ward <chr> \"Ladywood\"\n$ ced <lgl> NA\n$ ccg <chr> \"NHS Birmingham and Solihull\"\n$ nuts <chr> \"Birmingham\"\n$ pfa <chr> \"West Midlands\"\n$ admin_district_code <chr> \"E08000025\"\n$ admin_county_code <chr> \"E99999999\"\n$ admin_ward_code <chr> \"E05011151\"\n$ parish_code <chr> \"E43000250\"\n$ parliamentary_constituency_code <chr> \"E14000564\"\n$ parliamentary_constituency_2024_code <chr> \"E14001096\"\n$ ccg_code <chr> \"E38000258\"\n$ ccg_id_code <chr> \"15E\"\n$ ced_code <chr> \"E99999999\"\n$ nuts_code <chr> \"TLG31\"\n$ lsoa_code <chr> \"E01033620\"\n$ msoa_code <chr> \"E02006899\"\n$ lau2_code <chr> \"E08000025\"\n$ pfa_code <chr> \"E23000014\"\n\n\n\n\n\nlocation <- location_raw |>\n st_as_sf(coords = c(\"eastings\", \"northings\"), crs = 27700) |>\n select(postcode, ccg) |>\n st_transform(crs = 4326)\n\nlocation\n\nSimple feature collection with 1 feature and 2 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -1.900335 ymin: 52.47886 xmax: -1.900335 ymax: 52.47886\nGeodetic CRS: WGS 84\n postcode ccg geometry\n1 B2 4BJ NHS Birmingham and Solihull POINT (-1.900335 52.47886)" + "objectID": 
"presentations/2024-05-30_open-source-licensing/index.html#open-source-as-in-piano", + "href": "presentations/2024-05-30_open-source-licensing/index.html#open-source-as-in-piano", + "title": "Open source licensing", + "section": "“Open source as in piano”", + "text": "“Open source as in piano”\n\nThe patient experience QDC project\nOur current project\nOpen source code is not necessarily to be run, but understood and learned from\nBuilding a group of people who can use and contribute to your code is arguably as important as writing it\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-a-geospatial-data-frame-for-all-nhs-trusts", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-a-geospatial-data-frame-for-all-nhs-trusts", - "title": "Coffee and Coding", - "section": "Creating a geospatial data frame for all NHS Trusts", - "text": "Creating a geospatial data frame for all NHS Trusts\n\n\n\n# using the NHSRtools package\n# remotes::install_github(\"NHS-R-Community/NHSRtools\")\ntrusts <- ods_get_trusts() |>\n filter(status == \"Active\") |>\n select(name, org_id, post_code) |>\n geocode(postalcode = \"post_code\") |>\n st_as_sf(coords = c(\"long\", \"lat\"), crs = 4326)\n\n\ntrusts |>\n leaflet() |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers(popup = ~name)" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-is-rap", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-is-rap", + "title": "RAP", + "section": "What is RAP", + "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core 
focus of NHS analyst training\n\nGoldacre review" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-are-the-nearest-trusts-to-our-location", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-are-the-nearest-trusts-to-our-location", - "title": "Coffee and Coding", - "section": "What are the nearest trusts to our location?", - "text": "What are the nearest trusts to our location?\n\nnearest_trusts <- trusts |>\n mutate(\n distance = st_distance(geometry, location)[, 1]\n ) |>\n arrange(distance) |>\n head(5)\n\nnearest_trusts\n\nSimple feature collection with 5 features and 4 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -1.9384 ymin: 52.4533 xmax: -1.886282 ymax: 52.48764\nGeodetic CRS: WGS 84\n# A tibble: 5 × 5\n name org_id post_code geometry distance\n <chr> <chr> <chr> <POINT [°]> [m]\n1 BIRMINGHAM WOMEN'S AND CH… RQ3 B4 6NH (-1.894241 52.4849) 789.\n2 BIRMINGHAM AND SOLIHULL M… RXT B1 3RB (-1.917663 52.48416) 1313.\n3 BIRMINGHAM COMMUNITY HEAL… RYW B7 4BN (-1.886282 52.48754) 1356.\n4 SANDWELL AND WEST BIRMING… RXK B18 7QH (-1.930203 52.48764) 2246.\n5 UNIVERSITY HOSPITALS BIRM… RRK B15 2GW (-1.9384 52.4533) 3838." 
+ "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-we-trying-to-achieve", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-we-trying-to-achieve", + "title": "RAP", + "section": "What are we trying to achieve?", + "text": "What are we trying to achieve?\n\nLegibility\nReproducibility\nAccuracy\nLaziness" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-find-driving-routes-to-these-trusts", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-find-driving-routes-to-these-trusts", - "title": "Coffee and Coding", - "section": "Let’s find driving routes to these trusts", - "text": "Let’s find driving routes to these trusts\n\nroutes <- nearest_trusts |>\n mutate(\n route = map(geometry, ~ osrmRoute(location, st_coordinates(.x)))\n ) |>\n st_drop_geometry() |>\n rename(straight_line_distance = distance) |>\n unnest(route) |>\n st_as_sf()\n\nroutes\n\nSimple feature collection with 5 features and 8 fields\nGeometry type: LINESTRING\nDimension: XY\nBounding box: xmin: -1.93846 ymin: 52.45316 xmax: -1.88527 ymax: 52.49279\nGeodetic CRS: WGS 84\n# A tibble: 5 × 9\n name org_id post_code straight_line_distance src dst duration distance\n <chr> <chr> <chr> [m] <chr> <chr> <dbl> <dbl>\n1 BIRMING… RQ3 B4 6NH 789. 1 dst 5.77 3.09\n2 BIRMING… RXT B1 3RB 1313. 1 dst 6.84 4.14\n3 BIRMING… RYW B7 4BN 1356. 1 dst 7.59 4.29\n4 SANDWEL… RXK B18 7QH 2246. 1 dst 8.78 4.95\n5 UNIVERS… RRK B15 2GW 3838. 
1 dst 10.6 4.67\n# ℹ 1 more variable: geometry <LINESTRING [°]>" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-some-of-the-fundamental-principles", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-some-of-the-fundamental-principles", + "title": "RAP", + "section": "What are some of the fundamental principles?", + "text": "What are some of the fundamental principles?\n\nPredictability, reducing mental load, and reducing truck factor\nMaking it easy to collaborate with yourself and others on different computers, in the cloud, in six months’ time…\nDRY" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-show-the-routes", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-show-the-routes", - "title": "Coffee and Coding", - "section": "Let’s show the routes", - "text": "Let’s show the routes\n\nleaflet(routes) |>\n addTiles() |>\n addMarkers(data = location) |>\n addPolylines(color = \"black\", weight = 3, opacity = 1) |>\n addCircleMarkers(data = nearest_trusts, radius = 4, opacity = 1, fillOpacity = 1)" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#the-road-to-rap", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#the-road-to-rap", + "title": "RAP", + "section": "The road to RAP", + "text": "The road to RAP\n\nWe’re roughly using NHS Digital’s RAP stages\nThere is an incredibly large amount to learn!\nConfession time! (everything I do not know…)\nYou don’t need to do it all at once\nYou don’t need to do it all at all ever\nEach thing you learn will incrementally help you\nRemember- that’s why we learnt this stuff. Because it helped us. 
And it can help you too" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#we-can-use-osrm-to-calculate-isochrones", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#we-can-use-osrm-to-calculate-isochrones", - "title": "Coffee and Coding", - "section": "We can use {osrm} to calculate isochrones", - "text": "We can use {osrm} to calculate isochrones\n\n\n\niso <- osrmIsochrone(location, breaks = seq(0, 60, 15), res = 10)\n\nisochrone_ids <- unique(iso$id)\n\npal <- colorFactor(\n viridis::viridis(length(isochrone_ids)),\n isochrone_ids\n)\n\nleaflet(location) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = iso,\n fillColor = ~ pal(id),\n color = \"#000000\",\n weight = 1\n )" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--baseline", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--baseline", + "title": "RAP", + "section": "Levels of RAP- Baseline", + "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL).\nCode is version controlled (see Git basics and using Git collaboratively guides).\nRepository includes a README.md file (or equivalent) that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed.\nCode is published in the open and linked to & from accompanying publication (if relevant).\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones", - "title": "Coffee and Coding", - "section": "What trusts are in the isochrones?", - "text": "What trusts are in the isochrones?\nThe summarise() function will “union” the geometry\n\nsummarise(iso)\n\nSimple feature collection with 1 
feature and 0 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: -2.913575 ymin: 51.98062 xmax: -0.8502164 ymax: 53.1084\nGeodetic CRS: WGS 84\n geometry\n1 POLYGON ((-1.541014 52.9693..." + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--silver", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--silver", + "title": "RAP", + "section": "Levels of RAP- Silver", + "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. requirements.txt, PipFile, environment.yml\nData is handled and output in a Tidy data format\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-1", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-1", - "title": "Coffee and Coding", - "section": "What trusts are in the isochrones?", - "text": "What trusts are in the isochrones?\nWe can use this with a geo-filter to find the trusts in the isochrone\n\n# also works\ntrusts_in_iso <- trusts |>\n st_filter(\n summarise(iso),\n .predicate = st_within\n )\n\ntrusts_in_iso\n\nSimple feature collection with 31 features and 3 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -2.793386 ymin: 52.19205 xmax: -1.10302 ymax: 53.01015\nGeodetic CRS: WGS 84\n# A tibble: 31 × 4\n name org_id post_code geometry\n * <chr> <chr> <chr> <POINT [°]>\n 1 BIRMINGHAM AND SOLIHULL MENTAL HE… RXT B1 3RB (-1.917663 52.48416)\n 2 BIRMINGHAM COMMUNITY HEALTHCARE N… RYW B7 4BN (-1.886282 52.48754)\n 3 BIRMINGHAM WOMEN'S AND CHILDREN'S… RQ3 B4 6NH (-1.894241 52.4849)\n 4 BIRMINGHAM WOMEN'S NHS FOUNDATION… RLU B15 2TG (-1.942861 52.45325)\n 5 BURTON 
HOSPITALS NHS FOUNDATION T… RJF DE13 0RB (-1.656667 52.81774)\n 6 COVENTRY AND WARWICKSHIRE PARTNER… RYG CV6 6NY (-1.48692 52.45659)\n 7 DERBYSHIRE HEALTHCARE NHS FOUNDAT… RXM DE22 3LZ (-1.512896 52.91831)\n 8 DUDLEY INTEGRATED HEALTH AND CARE… RYK DY5 1RU (-2.11786 52.48176)\n 9 GEORGE ELIOT HOSPITAL NHS TRUST RLT CV10 7DJ (-1.47844 52.51258)\n10 HEART OF ENGLAND NHS FOUNDATION T… RR1 B9 5ST (-1.828759 52.4781)\n# ℹ 21 more rows" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--gold", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--gold", + "title": "RAP", + "section": "Levels of RAP- Gold", + "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. 
(See gov.uk info on Semantic Versioning)\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-2", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-2", - "title": "Coffee and Coding", - "section": "What trusts are in the isochrones?", - "text": "What trusts are in the isochrones?\n\n\n\nleaflet(trusts_in_iso) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = iso,\n fillColor = ~pal(id),\n color = \"#000000\",\n weight = 1\n )" + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#a-learning-journey-to-get-you-there", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#a-learning-journey-to-get-you-there", + "title": "RAP", + "section": "A learning journey to get you there", + "text": "A learning journey to get you there\n\nCode style, organising your files\nFunctions and iteration\nGit and GitHub\nPackaging your code\nTesting\nPackage management and versioning" }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#doing-the-same-but-within-a-radius", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#doing-the-same-but-within-a-radius", - "title": "Coffee and Coding", - "section": "Doing the same but within a radius", - "text": "Doing the same but within a radius\n\n\n\nr <- 25000\n\ntrusts_in_radius <- trusts |>\n st_filter(\n location,\n .predicate = st_is_within_distance,\n dist = r\n )\n\n# transforming gives us a pretty smooth circle\nradius <- location |>\n st_transform(crs = 27700) |>\n st_buffer(dist = r) |>\n st_transform(crs = 4326)\n\nleaflet(trusts_in_radius) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = radius,\n color = \"#000000\",\n weight = 1\n )" + "objectID": 
"presentations/2023-03-09_midlands-analyst-rap/index.html#how-we-can-help-each-other-get-there", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#how-we-can-help-each-other-get-there", + "title": "RAP", + "section": "How we can help each other get there", + "text": "How we can help each other get there\n\nWork as a team!\nCoffee and coding!\nAsk for help!\nDo pair coding!\nGet your code reviewed!\nJoin the NHS-R/ NHSPycom communities" + }, + { + "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#haca", + "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#haca", + "title": "RAP", + "section": "HACA", + "text": "HACA\n\nThe first national analytics conference for health and care\nInsight to action!\nJuly 11th and 12th, University of Birmingham\nAccepting abstracts for short and long talks and posters\nAbstract deadline 27th March\nHelp is available (with abstract, poster, preparing presentation…)!\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + }, + { + "objectID": "presentations/2024-05-23_github-team-sport/index.html#tldr", + "href": "presentations/2024-05-23_github-team-sport/index.html#tldr", + "title": "GitHub as a team sport", + "section": "tl;dr", + "text": "tl;dr\n\n\n\n‘Quality’ isn’t just good code\nTeamwork makes the dream work\nGitHub is a communication tool\n\n\n\n\n\n\n‘Too long; didn’t read’.\nGitHub isn’t just a dumping ground for code and Git history.\nIt’s a platform for working with teammates to get things done.\nQuality is improved by good communication, organisation and reduction of something called the ‘bus factor’ that I’ll get to in a minute." 
+ }, + { + "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-strategy-unit-su", + "href": "presentations/2024-05-23_github-team-sport/index.html#the-strategy-unit-su", + "title": "GitHub as a team sport", + "section": "The Strategy Unit (SU)", + "text": "The Strategy Unit (SU)\n\n\n\nAn ‘internal consultancy’\nHosted by NHS Midlands and Lancashire\nGrowing in size and reputation\n\n\n\n\n\n\nInitially a ‘start-up’ style operation that has expanded to 70+ staff.\n‘We produce high-quality, multi-disciplinary analytical work – and we help people apply the results.’\nA lot of our work is on the important New Hospital Programme (NHP).\n‘Our proposition is simple: better evidence, better decisions, better outcomes.’\nExpansion is tricky; how can we maintain quality?" + }, + { + "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-data-science-team", + "href": "presentations/2024-05-23_github-team-sport/index.html#the-data-science-team", + "title": "GitHub as a team sport", + "section": "The Data Science Team", + "text": "The Data Science Team\n \n\nExpanded to 6, all remote\nModelling, Quarto, Shiny\nNew Hospital Programme (NHP)\n\n\n\nA new team, expanding rapidly from 2 to 6 in about a year.\nRemote across England.\nExperience from across the NHS and consultancy. I spent a decade in five central government departments before this.\nWe’re helping to model and design apps for the NHP to help build hospitals.\nSo: growing team, different experiences, important work, but few standardised processes. What to do?" 
+ }, + { + "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-at-the-su", + "href": "presentations/2024-05-23_github-team-sport/index.html#github-at-the-su", + "title": "GitHub as a team sport", + "section": "GitHub at the SU", + "text": "GitHub at the SU\n\n\nWe should be exemplars\nAiming for open by default\nGitHub is on the homepage and there’s a Data Science site\n\n\n\nIt’s not just the DS team.\nWe have many other analysts eager to learn and contribute.\nHow can we set good standards and encourage use across the organisation?\nWe’re running Coffee & Coding sessions, teaching and encouraging talks and blogs on our site.\nWe want to drive up quality by making code open too.\nIt’s a statement of intent that the SU homepage links to our GitHub organisation." }, { - "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#further-reading", - "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#further-reading", - "title": "Coffee and Coding", - "section": "Further reading", - "text": "Further reading\n\nGeocomputation with R\nr-spatial\n{sf} documentation\nLeaflet documentation\nTidy Geospatial Networks in R\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#what-this-is", + "href": "presentations/2024-05-23_github-team-sport/index.html#what-this-is", + "title": "GitHub as a team sport", + "section": "What this is", + "text": "What this is\n\nLow-tech, no code\nTips and etiquette, not directives\nWhat’s been working for us\n\n\n\nBut this is not a technical talk about how to use Git for version control.\nMostly it’s about planning, workflows, standards and communication.\nIt’s things that our team have been doing and the ideas are evolving.\nI’ve worked mostly alone on GitHub projects in my career and never worked in a data science team of even this size. 
So at worst these slides are a way for me to write down what I’m learning." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-data-science", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-data-science", - "title": "Travels with R and Python", - "section": "What is data science?", - "text": "What is data science?\n\n“A data scientist knows more about computer science than the average statistician, and more about statistics than the average computer scientist”\n\n(Josh Wills, a former head of data engineering at Slack)" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-bus-factor", + "href": "presentations/2024-05-23_github-team-sport/index.html#the-bus-factor", + "title": "GitHub as a team sport", + "section": "The ‘bus factor’ 🚍", + "text": "The ‘bus factor’ 🚍\n\nWe should maintain quality\nWe need redundancy\nStandardised processes can help\n\n\n\nWhy do we care about discussing and ‘formalising’ these ideas?\nWe should encourage standard practices in case someone is ill or away.\nThis also makes it easier when new team members join.\nThis helps us maintain quality." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#drew-conways-famous-venn-diagram", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#drew-conways-famous-venn-diagram", - "title": "Travels with R and Python", - "section": "Drew Conway’s famous Venn diagram", - "text": "Drew Conway’s famous Venn diagram\n\nSource" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#rules", + "href": "presentations/2024-05-23_github-team-sport/index.html#rules", + "title": "GitHub as a team sport", + "section": "‘Rules’", + "text": "‘Rules’\n\nIt’s the spirit that counts\nDo as I say, not as I do\nKnow why you’re breaking the rules\n\n\n\nTo be clear though, nothing here is etched into stone.\nThere will be times where rules can be broken.\nBut we shouldn’t be complacent." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science", - "title": "Travels with R and Python", - "section": "What are the skills of data science?", - "text": "What are the skills of data science?\n\nAnalysis\n\nML\nStats\nData viz\n\nSoftware engineering\n\nProgramming\nSQL/ data\nDevOps\nRAP" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-flow", + "href": "presentations/2024-05-23_github-team-sport/index.html#github-flow", + "title": "GitHub as a team sport", + "section": "GitHub flow", + "text": "GitHub flow\n\nCreate a repository\nWrite issues\nPlan\nCreate a branch\nMake a pull request\nReview\nRelease\n\n\n\nThis is a fairly generic GitHub flow.\nI’ll talk through a few things in each of these categories." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science-1", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science-1", - "title": "Travels with R and Python", - "section": "What are the skills of data science?", - "text": "What are the skills of data science?\n\nDomain knowledge\n\nCommunication\nProblem formulation\nDashboards and reports" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#repositories", + "href": "presentations/2024-05-23_github-team-sport/index.html#repositories", + "title": "GitHub as a team sport", + "section": "Repositories", + "text": "Repositories\n\nAssign ‘owner’ and ‘deputy’ roles\nAdd README and .gitignore\nStore data elsewhere\n\n\n\nEasy starter: tell people what the purpose of the repo is and how to use it. This is what a README is for. This is an absolute must to lower the bus factor.\nWe should prevent accidental file upload immediately. Use a .gitignore to exclude likely data files (as well as other unnecessary files).
We’re thinking about common templates/cookiecutters.\nCommunicative files (README, .gitignores) are good, but so is vigilance (code review).\nOwners/deputies are in charge of ‘GitHub gardening’ (keeping issues in order, labelling, milestones, etc).\nDeputies help with bus factor.\nThe owner can be auto-selected as the reviewer. We’re experimenting with this for repos with external contributors, especially.\nData is stored elsewhere, on Azure or Posit Connect, due to sensitivity and size. This should be planned before you begin and recorded in the README." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#stats-and-data-viz", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#stats-and-data-viz", - "title": "Travels with R and Python", - "section": "Stats and data viz", - "text": "Stats and data viz\n\nML leans a bit more towards atheoretical prediction\nStats leans a bit more towards inference (but they both do both)\nData scientists may use different visualisations\n\nInteractive web based tools\nDashboard based visualisers e.g. {stminsights}" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#issues", + "href": "presentations/2024-05-23_github-team-sport/index.html#issues", + "title": "GitHub as a team sport", + "section": "Issues", + "text": "Issues\n\n\n\nAren’t just ‘problems’\nUse labels, including MoSCoW\nExplain the need, be informative\n\n\n\n\n\n\nIssues can be reminders or questions for further discussion, not just features to build.\nTickets should get two labels. We use a topic like ‘enhancement’, ‘bug’, ‘documentation’, ‘techdebt’, etc, plus MoSCoW (must, should, could, won’t) to help prioritisation.\nIssue templates can ensure certain info is provided, which is especially good for external contributors.\nRefer to other related commits by number (e.g. 
#1), which stops you repeating the same information.\nPrefer to reopen an issue if it doesn’t actually work.\nIssues can track separate sub-issues.\nYou can add checklists with markdown checkbox: - [ ] (these appear in the issue preview).\nYou can ‘hide’ comments if they’re out of date, etc." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#software-engineering", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#software-engineering", - "title": "Travels with R and Python", - "section": "Software engineering", - "text": "Software engineering\n\nProgramming\n\nNo/ low code data science?\n\nSQL/ data\n\nTend to use reproducible automated processes\n\nDevOps\n\nPlan, code, build, test, release, deploy, operate, monitor\n\nRAP\n\nI will come back to this" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#plan", + "href": "presentations/2024-05-23_github-team-sport/index.html#plan", + "title": "GitHub as a team sport", + "section": "Plan", + "text": "Plan\n\n\nTalk, review and reflect\nUse labels to prioritise\nSort into milestones\n\n\n\nWe have a repo and issues, what do we do now? Where to start?\nWe’ve begun working in sprints of about 4 weeks. We have sprint planning meetings to plan things out.\nConsider what needs to be done in the sprint period, what other issues support those goals?\nIs there time for other tasks, like clearing techdebt?\nAll issues should be assigned to a milestone.\nIssues in milestones should be sorted in priority order/order of expected completion (MoSCoW labels will help with this).\nThis helps focus the goals of the sprint and keep us on track." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#domain-knowledge", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#domain-knowledge", - "title": "Travels with R and Python", - "section": "Domain knowledge", - "text": "Domain knowledge\n\nDo stuff that matters\n\nThe best minds of my generation are thinking about how to make people click ads. That sucks. Jeffrey Hammerbacher\n\nConvince other people that it matters\nThis is the hardest part of data science" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#branches", + "href": "presentations/2024-05-23_github-team-sport/index.html#branches", + "title": "GitHub as a team sport", + "section": "Branches", + "text": "Branches\n\n\nOne issue, one branch, one assigned person\nName them sensibly\nBurn them\n\n\n\nOnly one person works on a branch at a time. This person is the one assigned to the relevant issue.\nBranch names should be numbered to match their issue, e.g. ‘123-add-filter’. This makes it obvious what issue is being fixed by that branch and should help identify if more than one person has a branch open for the same issue.\nIf commits from someone else are required, then all parties must communicate about the current state of the branch to ensure they pull changes and avoid merge conflicts.\nBranches are ephemeral and die when the PR is merged. They should be deleted (this can be done automatically).\nThe only branches to exist at all times should be main and a deployment branch, if necessary. All others should be active branches so it’s clear what’s being worked on." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#rap", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#rap", - "title": "Travels with R and Python", - "section": "RAP", - "text": "RAP\n\nData science isn’t RAP\nRAP isn’t data science\nThey are firm friends" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#commits", + "href": "presentations/2024-05-23_github-team-sport/index.html#commits", + "title": "GitHub as a team sport", + "section": "Commits", + "text": "Commits\n\n\nDon’t commit to main!\n‘Small, early and often’\nMake messages meaningful\n\n\n\nThere’s not a lot of earth-shattering advice to give here; this stuff is fairly standard.\nDo not commit directly to main. Your work must be independently checked first to limit the chance of mistakes.\nMake your commits small in terms of code and files touched, if possible. This makes the Git history easier to read and makes reviews easier too.\nCommit and push early and often into your branch. This can help others see progress and helps reduce the bus factor.\nDon’t dump your work into a commit because it’s the end of the day.\nMake your commit messages meaningful. What does the commit do? Start with a verb in present tense (‘adds’, not ‘added’). Or maybe use ‘conventional’ commits." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#reproducibility", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#reproducibility", - "title": "Travels with R and Python", - "section": "Reproducibility", - "text": "Reproducibility\n\nReproducibility in science\nThe $6B spreadsheet error\nGeorge Osbourne’s austerity was based on a spreadsheet error\nFor us, reproducibility also means we can do the same analysis 50 times in one minute\n\nWhich is why I started down the road of data science" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#pull-requests-prs", + "href": "presentations/2024-05-23_github-team-sport/index.html#pull-requests-prs", + "title": "GitHub as a team sport", + "section": "Pull requests (PRs)", + "text": "Pull requests (PRs)\n\n\n\nSmall and closes an issue\nSelect the assignee and reviewer\nThe assignee merges\n\n\n\n\n\n\nPRs should solve the issue they’re related to. Occasionally one fix may solve another.\nThey should be named to explain what they do. The issue might be ‘the red button doesn’t work’; the PR might be ‘fix the red button’.\nThey should be small in terms of lines of code and files touched. This will make it easier and faster to understand and assess the changes.\nThe submitter should mark themself as the ‘assignee’ and choose a reviewer. You may want to chat with the reviewer to let them know if they have time.\nFor context, link to the issue(s) being closed with the magic words (‘closes’, ‘fixes’, etc), which will also close those issues as completed.\nInclude a short explanation or bullet-points of what the PR does. Provide any extra information to make the reviewer’s life easier (areas of focus, maybe) or to ask a question about some aspect of what you’ve written.\nThe PR submitter is the one who clicks the merge button. This is in case the submitter realises there’s something they need to add or change before the merge." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-rap", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-rap", - "title": "Travels with R and Python", - "section": "What is RAP", - "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training\n\n\nGoldacre review" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#reviewing-prs", + "href": "presentations/2024-05-23_github-team-sport/index.html#reviewing-prs", + "title": "GitHub as a team sport", + "section": "Reviewing PRs", + "text": "Reviewing PRs\n\n\n\nBe helpful, be kind\nUse GitHub suggestions\nDiscuss if unclear\n\n\n\n\n\n\nThe reviewer should typically check that the changes result in the issue being fixed. This may require pulling the branch and then testing it, but may not be necessary for small changes.\nThe reviewer should seek clarification and add comments where something isn’t clear.\nUse ‘suggestions’ as a reviewer rather than committing to someone else’s branch.\nWhen working at pace (when aren’t we?), we should err towards approval if the issue is completed rather than an endless cycle of asking for small changes. The submitter and reviewer should decide whether smaller things like code style or change in approach should be added as a new issue with a ‘techdebt’ label." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--baseline", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--baseline", - "title": "Travels with R and Python", - "section": "Levels of RAP- Baseline", - "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL)\nCode is version controlled\nRepository includes a README.md file that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed\nCode is published in the open and linked to & from accompanying publication (if relevant)\n\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#releases", + "href": "presentations/2024-05-23_github-team-sport/index.html#releases", + "title": "GitHub as a team sport", + "section": "Releases", + "text": "Releases\n\nUse semantic versioning (1.2.3)\nAutofill notes with PR names\nDon’t release on a Friday 🙃\n\n\n\nTag the history and release on GitHub concurrently to keep them in sync (this is done automatically if the release is done from the GitHub interface).\nSemantic (x.y.z where x is breaking, y is new features and z is patches for bugs).\nWe typically just autofill the release description with the constituent PR titles. Which means it’s important to give them meaningful names.\nWe align releases with sprints, though patches may occur more frequently.\nWe link releases to deployment in many cases. Don’t release to prod on a Friday, lol." 
}, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--silver", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--silver", - "title": "Travels with R and Python", - "section": "Levels of RAP- Silver", - "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. requirements.txt, PipFile, environment.yml)\nData is handled and output in a Tidy data format\n\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-is-a-team-member", + "href": "presentations/2024-05-23_github-team-sport/index.html#github-is-a-team-member", + "title": "GitHub as a team sport", + "section": "GitHub is a team member", + "text": "GitHub is a team member\n\n\nAutomate with Actions\nProvide issue and repo templates\nAn all-in-one planner?\n\n\n\nI lied: we have 6 human team members. GitHub itself has features that can automate away some boring things and help prevent accidents or forgetfulness.\nGitHub Actions for continuous integration. R-CMD check at least for R projects. Start with r-lib examples as a basis.\nWe’re looking towards things like templates at the issue and repo levels; again to remove drudgery.\nWe use Trello to plan things and have to link to GitHub repos and issues in Trello cards. Can we use GitHub as our planner across multiple repos instead? Seems possible." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--gold", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--gold", - "title": "Travels with R and Python", - "section": "Levels of RAP- Gold", - "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. 
via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. (See gov.uk info on Semantic Versioning)\n\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#are-we-curling", + "href": "presentations/2024-05-23_github-team-sport/index.html#are-we-curling", + "title": "GitHub as a team sport", + "section": "Are we curling? 🥌", + "text": "Are we curling? 🥌\n\n\nWe:\n\nare a small team\nassume specialist roles\nwork in sync\n\n\n\n\n\n\nYou have been wondering: if this is a ‘team sport’, what sport is it?\nThis is a terrible metaphor. But think about it." }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#data-science-in-healthcare", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#data-science-in-healthcare", - "title": "Travels with R and Python", - "section": "Data science in healthcare", - "text": "Data science in healthcare\n\nForecasting\n\nStats versus ML\n\nText mining\n\nR versus Python\n\nDemand modelling\n\nDevOps as a way of life" + "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-bottom-line-actually", + "href": "presentations/2024-05-23_github-team-sport/index.html#the-bottom-line-actually", + "title": "GitHub as a team sport", + "section": "The bottom line, actually", + "text": "The bottom line, actually\n\n\n\n\n\nCommunicate\nHelp each other\nBe kind\n\n\n\n\nThe ideas in this talk are things that have helped us, and could help you, to drive up and maintain quality. 
Some were obvious, some were specific features you might not have known about.\nBut none of these are replacements for being good team members.\nGitHub just provides some affordances to help you.\nI am the guy falling over, the stones are tasks, my team mates are picking me up and dusting me off.\nDid you learn at least one thing? What has your team been doing? What works for you?\n\n\n\n\n\nLearn more about The Strategy Unit" }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#get-involved", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#get-involved", - "title": "Travels with R and Python", - "section": "Get involved!", - "text": "Get involved!\n\nNHS-R community\n\nWebinars, training, conference, Slack\n\nNHS Pycom\n\nditto…\n\nMLCSU GitHub?\nBuild links with the other CSUs" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#why", + "href": "presentations/2024-05-16_store-data-safely/index.html#why", + "title": "Store Data Safely", + "section": "Why?", + "text": "Why?\nBecause:\n\ndata may be sensitive\nGitHub was designed for source control of code\nGitHub has repository file-size limits\nit makes data independent from code\nit prevents repetition" }, { - "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#contact", - "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#contact", - "title": "Travels with R and Python", - "section": "Contact", - "text": "Contact\n\n\n\n\n strategy.unit@nhs.net\n The-Strategy-Unit\n\n\n\n\n\n chris.beeley1@nhs.net\n chrisbeeley\n\n\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#other-approaches", + "href": "presentations/2024-05-16_store-data-safely/index.html#other-approaches", + "title": "Store Data Safely", + "section": "Other approaches", + "text": "Other approaches\nTo prevent data commits:\n\nuse a .gitignore file (*.csv, etc)\nuse Git 
hooks\navoid ‘add all’ (git add .) when staging\nensure thorough reviews of (small) pull-requests" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#welcome-to-coffee-and-coding", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#welcome-to-coffee-and-coding", - "title": "Coffee and coding", - "section": "Welcome to coffee and coding", - "text": "Welcome to coffee and coding\n\nProject demos, showcasing work from a particular project\nMethod demos, showcasing how to use a particular method/tool/package\nSurgery and problem solving sessions\nDefining code standards and SOP" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-if-i-committed-data", + "href": "presentations/2024-05-16_store-data-safely/index.html#what-if-i-committed-data", + "title": "Store Data Safely", + "section": "What if I committed data?", + "text": "What if I committed data?\n‘It depends’, but if it’s sensitive:\n\n‘undo’ the commit with git reset\nuse a tool like BFG to expunge the file from Git history\ndelete the repo and restart 🔥\n\nA data security breach may have to be reported." }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-we-trying-to-achieve", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-we-trying-to-achieve", - "title": "Coffee and coding", - "section": "What are we trying to achieve?", - "text": "What are we trying to achieve?\n\nLegibility\nReproducibility\nAccuracy\nLaziness" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#data-hosting-solutions", + "href": "presentations/2024-05-16_store-data-safely/index.html#data-hosting-solutions", + "title": "Store Data Safely", + "section": "Data-hosting solutions", + "text": "Data-hosting solutions\nWe’ll talk about two main options for The Strategy Unit:\n\nPosit Connect and the {pins} package\nAzure Data Storage\n\nWhich to use? It depends." 
}, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-some-of-the-fundamental-principles", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-some-of-the-fundamental-principles", - "title": "Coffee and coding", - "section": "What are some of the fundamental principles?", - "text": "What are some of the fundamental principles?\n\nPredictability, reducing mental load, and reducing truck factor\nMaking it easy to collaborate with yourself and others on different computers, in the cloud, in six months’ time…\nDRY" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#a-platform-by-posit", + "href": "presentations/2024-05-16_store-data-safely/index.html#a-platform-by-posit", + "title": "Store Data Safely", + "section": "A platform by Posit", + "text": "A platform by Posit\n\n\nhttps://connect.strategyunitwm.nhs.uk/" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#what-is-rap", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-is-rap", - "title": "Coffee and coding", - "section": "What is RAP", - "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training\n\nGoldacre review" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#a-package-by-posit", + "href": "presentations/2024-05-16_store-data-safely/index.html#a-package-by-posit", + "title": "Store Data Safely", + "section": "A package by Posit", + "text": "A package by Posit\n\n\nhttps://pins.rstudio.com/" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#the-road-to-rap", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#the-road-to-rap", - "title": 
"Coffee and coding", - "section": "The road to RAP", - "text": "The road to RAP\n\nWe’re roughly using NHS Digital’s RAP stages\nThere is an incredibly large amount to learn!\nConfession time! (everything I do not know…)\nYou don’t need to do it all at once\nYou don’t need to do it all at all ever\nEach thing you learn will incrementally help you\nRemember- that’s why we learnt this stuff. Because it helped us. And it can help you too" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#basic-approach", + "href": "presentations/2024-05-16_store-data-safely/index.html#basic-approach", + "title": "Store Data Safely", + "section": "Basic approach", + "text": "Basic approach\ninstall.packages(\"pins\")\nlibrary(pins)\n\nboard_connect()\npin_write(board, data, \"pin_name\")\npin_read(board, \"user_name/pin_name\")" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--baseline", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--baseline", - "title": "Coffee and coding", - "section": "Levels of RAP- Baseline", - "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL).\nCode is version controlled (see Git basics and using Git collaboratively guides).\nRepository includes a README.md file (or equivalent) that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed.\nCode is published in the open and linked to & from accompanying publication (if relevant).\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#live-demo", + "href": "presentations/2024-05-16_store-data-safely/index.html#live-demo", + "title": "Store Data Safely", + "section": "Live demo", + "text": "Live demo\n\nLink RStudio to Posit Connect (authenticate)\nConnect to the board\nWrite a new pin\nCheck pin status and details\nPin versions\nUse pinned data\nUnpin your pin" }, { - 
"objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--silver", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--silver", - "title": "Coffee and coding", - "section": "Levels of RAP- Silver", - "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. requirements.txt, PipFile, environment.yml\nData is handled and output in a Tidy data format\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#should-i-use-it", + "href": "presentations/2024-05-16_store-data-safely/index.html#should-i-use-it", + "title": "Store Data Safely", + "section": "Should I use it?", + "text": "Should I use it?\n\n\n⚠️ {pins} is not great because:\n\nyou should not upload sensitive data!\nthere’s a file-size upload limit\npin organisation is a bit awkward (no subfolders)\n\n\n{pins} is helpful because:\n\nauthentication is straightforward\ndata can be versioned\nyou can control permissions\nthere are R and Python versions of the package" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--gold", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--gold", - "title": "Coffee and coding", - "section": "Levels of RAP- Gold", - "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. 
(See gov.uk info on Semantic Versioning)\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-is-azure-data-storage", + "href": "presentations/2024-05-16_store-data-safely/index.html#what-is-azure-data-storage", + "title": "Store Data Safely", + "section": "What is Azure Data Storage?", + "text": "What is Azure Data Storage?\nMicrosoft cloud storage for unstructured data or ‘blobs’ (Binary Large Objects): data objects in binary form that do not necessarily conform to any file format.\nHow is it different?\n\nNo hierarchy – although you can make pseudo-‘folders’ with the blobnames.\nAuthenticates with your Microsoft account." + }, + { + "objectID": "presentations/2024-05-16_store-data-safely/index.html#authenticating-to-azure-data-storage", + "href": "presentations/2024-05-16_store-data-safely/index.html#authenticating-to-azure-data-storage", + "title": "Store Data Safely", + "section": "Authenticating to Azure Data Storage", + "text": "Authenticating to Azure Data Storage\n\nYou are all part of the “strategy-unit-analysts” group; this gives you read/write access to specific Azure storage containers.\nYou can store sensitive information like the container ID in a local .Renviron or .env file that should be ignored by git.\nUsing {AzureAuth}, {AzureStor} and your credentials, you can connect to the Azure storage container, upload files and download them, or read the files directly from storage!" 
}, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#a-learning-journey-to-get-us-there", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#a-learning-journey-to-get-us-there", - "title": "Coffee and coding", - "section": "A learning journey to get us there", - "text": "A learning journey to get us there\n\nCode style, organising your files\nFunctions and iteration\nGit and GitHub\nPackaging your code\nTesting\nPackage management and versioning" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables", + "href": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables", + "title": "Store Data Safely", + "section": "Step 1: load your environment variables", + "text": "Step 1: load your environment variables\nStore sensitive info in an .Renviron file that’s kept out of your Git history! The info can then be loaded in your script.\n.Renviron:\nAZ_STORAGE_EP=https://STORAGEACCOUNT.blob.core.windows.net/\nScript:\nep_uri <- Sys.getenv(\"AZ_STORAGE_EP\")\nTip: reload .Renviron with readRenviron(\".Renviron\")" }, { - "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#how-we-can-help-each-other-get-there", - "href": "presentations/2023-02-23_coffee-and-coding/index.html#how-we-can-help-each-other-get-there", - "title": "Coffee and coding", - "section": "How we can help each other get there", - "text": "How we can help each other get there\n\nWork as a team!\nCoffee and coding!\nAsk for help!\nDo pair coding!\nGet your code reviewed!\nJoin the NHS-R/ NHSPycom communities\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables-1", + "href": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables-1", + "title": "Store Data Safely", + "section": "Step 1: load your 
environment variables", + "text": "Step 1: load your environment variables\nIn the demo script we are providing, you will need these environment variables:\nep_uri <- Sys.getenv(\"AZ_STORAGE_EP\")\napp_id <- Sys.getenv(\"AZ_APP_ID\")\ncontainer_name <- Sys.getenv(\"AZ_STORAGE_CONTAINER\")\ntenant <- Sys.getenv(\"AZ_TENANT_ID\")" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#why", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#why", - "title": "Repeating Yourself with Functions", - "section": "Why?", - "text": "Why?\n\nForecasting project, need to do the same thing with data for 6 centres.\nCopy-paste runs risk of not doing the same thing each time (and boring/time-consuming/frustrating).\nRepetition –> function." + "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-2-authenticate-with-azure", + "href": "presentations/2024-05-16_store-data-safely/index.html#step-2-authenticate-with-azure", + "title": "Store Data Safely", + "section": "Step 2: Authenticate with Azure", + "text": "Step 2: Authenticate with Azure\n\n\ntoken <- AzureAuth::get_azure_token(\n \"https://storage.azure.com\",\n tenant = tenant,\n app = app_id,\n auth_type = \"device_code\",\n)\nThe first time you do this, you will have a link to authenticate in your browser and a code in your terminal to enter. Use the browser that works best with your @mlcsu.nhs.uk account!"
}, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what", - "title": "Repeating Yourself with Functions", - "section": "What?", - "text": "What?\n\n\nDemo with plots, equally applicable to ‘doing stuff’ with data.\n\n\n# preview data\nhead(new_rtt)\n\n provider_code count rtt_yrmon rtt_mon\n1 RJE 83 Nov 2015 11\n2 RJE 75 Dec 2015 12\n3 RJE 82 Jan 2016 1\n4 RJE 74 Feb 2016 2\n5 RJE 62 Mar 2016 3\n6 RJE 76 Apr 2016 4\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nRemember, this is about writing functions, not creating stunning visualisations!\n\n\n\nRepeat this for each of the 6 centres" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-3-connect-to-container", + "href": "presentations/2024-05-16_store-data-safely/index.html#step-3-connect-to-container", + "title": "Store Data Safely", + "section": "Step 3: Connect to container", + "text": "Step 3: Connect to container\nendpoint <- AzureStor::blob_endpoint(ep_uri, token = token)\ncontainer <- AzureStor::storage_container(endpoint, container_name)\n\n# List files in container\nblob_list <- AzureStor::list_blobs(container)\nIf you get 403 error, delete your token and re-authenticate, try a different browser/incognito, etc.\nTo clear Azure tokens: AzureAuth::clean_token_directory()" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#how", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#how", - "title": "Repeating Yourself with Functions", - "section": "How?", - "text": "How?\nDo it ‘normally’ for one centre. 
What are the parameters to change?\n\n\np1 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = \"RJE\",\n subtitle = \"time trend of new referrals\")\n\np2 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n\nplots <- ggarrange(p1, p2, nrow = 2)\n\nplots\n\n\n\n\n\n\n\n\n\n\nThis becomes the argument for the function.\nChoose a name for the argument (!= variable_name)\nIn this example we will use prov in place of \"RJE\"\n\n\n\nPlease remember, this is about writing functions, not creating stunning visualisations!" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container", + "href": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container", + "title": "Store Data Safely", + "section": "Interact with the container", + "text": "Interact with the container\nIt’s possible to interact with the container via your browser!\nYou can upload and download files using the Graphical User Interface (GUI), login with your @mlcsu.nhs.uk account: https://portal.azure.com/#home\nAlthough it’s also cooler to interact via code… 😎" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#anatomy-of-a-function", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#anatomy-of-a-function", - "title": "Repeating Yourself with Functions", - "section": "Anatomy of a Function", - "text": "Anatomy of a Function\n\nfn_name <- function(arguments){\n \n # do stuff\n \n}\n\nRun the function with fn_name(parameter as argument)" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-1", + "href": 
"presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-1", + "title": "Store Data Safely", + "section": "Interact with the container", + "text": "Interact with the container\n# Upload contents of a local directory to container\nAzureStor::storage_multiupload(\n container,\n \"LOCAL_FOLDERNAME/*\",\n \"FOLDERNAME_ON_AZURE\"\n)\n\n# Upload specific file to container\nAzureStor::storage_upload(\n container,\n \"data/ronald.jpeg\",\n \"newdir/ronald.jpeg\"\n)" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#turning-our-code-into-a-function", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#turning-our-code-into-a-function", - "title": "Repeating Yourself with Functions", - "section": "Turning our code into a function", - "text": "Turning our code into a function\n\n\n\np1 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = \"RJE\",\n subtitle = \"time trend of new referrals\")\n\np2 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n\nplots <- ggarrange(p1, p2, nrow = 2)\n\nplots\n\n\n\nfn_plots <- function(prov){\n \n p1 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = prov,\n subtitle = \"time trend of new referrals\")\n \n p2 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n \n plots <- ggarrange(p1, p2, nrow = 2)\n \n plots\n \n}" + "objectID": 
"presentations/2024-05-16_store-data-safely/index.html#load-csv-files-directly-from-azure-container", + "href": "presentations/2024-05-16_store-data-safely/index.html#load-csv-files-directly-from-azure-container", + "title": "Store Data Safely", + "section": "Load csv files directly from Azure container", + "text": "Load csv files directly from Azure container\ndf_from_azure <- AzureStor::storage_read_csv(\n container,\n \"newdir/cats.csv\",\n show_col_types = FALSE\n)\n\n# Load file directly from Azure container (by storing it in memory)\n\nparquet_in_memory <- AzureStor::storage_download(\n container, src = \"newdir/cats.parquet\", dest = NULL\n)\n\nparq_df <- arrow::read_parquet(parquet_in_memory)" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#running-our-function", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#running-our-function", - "title": "Repeating Yourself with Functions", - "section": "Running our function", - "text": "Running our function\n\n\n\nfn_plots <- function(prov){\n \n p1 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = prov,\n subtitle = \"time trend of new referrals\")\n \n p2 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n \n plots <- ggarrange(p1, p2, nrow = 2)\n \n plots\n \n}\n\n\n\nfn_plots(\"RKB\")" + "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-2", + "href": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-2", + "title": "Store Data Safely", + "section": "Interact with the container", + "text": "Interact with the container\n# Delete from Azure container 
(!!!)\nAzureStor::delete_storage_file(container, BLOB_NAME)" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what-if-we-want-more-than-one-argument", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what-if-we-want-more-than-one-argument", - "title": "Repeating Yourself with Functions", - "section": "What if we want more than one argument?", - "text": "What if we want more than one argument?\nEasy! Just add them to the arguments when you define the function.\nIf I wanted to run this function on multiple dataframes I would change the function to:\n\nfn_plots <- function(df, prov){\n \n p1 <- df |> \n filter(provider_code == prov) \n # and the rest as before\n}\n\nand run it with fn_plots(new_rtt, \"RKB\").\nNote that the order of entering the parameters is important. If I tried to run fn_plots(\"RKB\", new_rtt) it would look for a dataframe called \"RKB\" and a provider called new_rtt." + "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-does-this-achieve", + "href": "presentations/2024-05-16_store-data-safely/index.html#what-does-this-achieve", + "title": "Store Data Safely", + "section": "What does this achieve?", + "text": "What does this achieve?\n\nData is not in the repository, it is instead stored in a secure location\nCode can be open – sensitive information like Azure container name stored as environment variables\nLarge filesizes possible, other people can also access the same container.\nNaming conventions can help to keep blobs organised (these create pseudo-folders)\n\n\n\n\nLearn more about Data Science at The Strategy Unit" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#working-through-a-list-of-parameters", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#working-through-a-list-of-parameters", - "title": "Repeating Yourself with Functions", - "section": "Working through a list of parameters", - "text": 
"Working through a list of parameters\nAvoid manually running fn_plots() for each provider.\nUse purrr::map to iterate over a list\n\n\n# create a vector of all the providers\nprov_labels <- c(\"RJE\", \"RKB\", \"RL4\", \"RRK\", \"RWE\", \"RX1\")\n\nmap(prov_labels, ~ fn_plots(.x))\n\n\n[[1]]\n\n\n\n\n\n\n\n\n\n\n[[2]]\n\n\n\n\n\n\n\n\n\n\n[[3]]\n\n\n\n\n\n\n\n\n\n\n[[4]]\n\n\n\n\n\n\n\n\n\n\n[[5]]\n\n\n\n\n\n\n\n\n\n\n[[6]]" + "objectID": "presentations/2023-03-23_collaborative-working/index.html#introduction", + "href": "presentations/2023-03-23_collaborative-working/index.html#introduction", + "title": "Collaborative working", + "section": "Introduction", + "text": "Introduction\n\nThis is definitely an art and not a science\nI do not claim to have all, or even most of, the answers\nHow you use these tools is way more important than the tools themselves\nThis is a culture and not a technique" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-the-function-work", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-the-function-work", - "title": "Repeating Yourself with Functions", - "section": "Troubleshooting - does the function work?", - "text": "Troubleshooting - does the function work?\nCrawl before you can walk - make sure fn_plot() works for one parameter.\nInsert browser() into the function while testing - steps into the function (don’t forget to remove it when it works!)\n\n\nThis is a new function that will save each time-trend plot\n\nfn_save_plot <- function(prov){\n \n p <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = paste0(prov, \" - monthly pattern of new referrals\"))\n \n ggsave(paste0(prov, \"_plot.png\"), \n plot = p)\n \n}\n\n\n \n\n\n\nCheck out Shannon Pileggi’s slides for more options" + "objectID": 
"presentations/2023-03-23_collaborative-working/index.html#costs", + "href": "presentations/2023-03-23_collaborative-working/index.html#costs", + "title": "Collaborative working", + "section": "Costs", + "text": "Costs\n\nDelay and time\nStress and disagreement\nCommittee thinking\nLearning and effort" }, { - "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-it-walk-the-walk", - "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-it-walk-the-walk", - "title": "Repeating Yourself with Functions", - "section": "Troubleshooting - does it walk the walk?", - "text": "Troubleshooting - does it walk the walk?\nWhen learning to walk, use safely() or possibly() in your walk function - it will indicate if any parameters have failed, rather than just fall down.\n\n\n\n# wrap fn_plots in safely\nsafe_pl <- safely(.f = fn_save_plot)\n\nmap(prov_labels, ~ safe_pl(.x))\n\n\n# wrap fn_plots in possibly\nposs_pl <- possibly(.f = fn_save_plot)\n\nmap(prov_labels, ~ poss_pl(.x))\n\n\nConsole output of wrapping function in possibly\n\n\n\n\n\nThis is my first attempt at a Quarto presentation!" + "objectID": "presentations/2023-03-23_collaborative-working/index.html#benefits", + "href": "presentations/2023-03-23_collaborative-working/index.html#benefits", + "title": "Collaborative working", + "section": "Benefits", + "text": "Benefits\n\n“From each according to their ability”\nLearning\nReproducibility and reduced truck factor\nFun!" 
}, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#the-new-hospital-programme-nhp", - "href": "presentations/2024-09-05_earl-nhp/index.html#the-new-hospital-programme-nhp", - "title": "Using R and Python to model future hospital activity", - "section": "The New Hospital Programme (NHP)", - "text": "The New Hospital Programme (NHP)\n\n\n\nA manifesto commitment\nFuture activity must be modelled\nNeed consistency across schemes\n\n\n\n\n\n\nBuilding new hospitals - replacing crumbling infrastructure in some cases, completely new builds in others.\nIt’s important to size the hospitals according to the type and quantity of activity there will be in the future.\nThere are many proprietary black box models in use for estimating healthcare activity in the future - no consistency, difficult to compare results\nStrategy Unit was asked to develop a model to be used across all of the builds: a model owned and operated by the NHS, for the NHS.\n\n\n::::" + "objectID": "presentations/2023-03-23_collaborative-working/index.html#github-as-an-organising-principle-behind-work", + "href": "presentations/2023-03-23_collaborative-working/index.html#github-as-an-organising-principle-behind-work", + "title": "Collaborative working", + "section": "GitHub as an organising principle behind work", + "text": "GitHub as an organising principle behind work\n\nA project is just a set of milestones\nA milestone is just a set of issues\nAn issue is just a set of commits\nA commit is just text added and removed" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#model-process", - "href": "presentations/2024-09-05_earl-nhp/index.html#model-process", - "title": "Using R and Python to model future hospital activity", - "section": "", - "text": "A probabilistic Monte Carlo simulation that:\n\nTakes hospital activity from a baseline year, using NHS England’s Hospital Episode Statistics (HES) data\nApplies variables that:\n\nare outside of our control (e.g. 
population changes, using ONS projections)\ncan reduce hospital activity (mitigators, e.g. virtual wards or teleappointments)\n\nForecasts future demand based on these variables, outputting probabilistic predictive intervals" + "objectID": "presentations/2023-03-23_collaborative-working/index.html#the-repo-owner", + "href": "presentations/2023-03-23_collaborative-working/index.html#the-repo-owner", + "title": "Collaborative working", + "section": "The repo owner", + "text": "The repo owner\n\nReview milestones\nReview issues\n\nDiscuss the issue on the issue- NOT on email!\n\nReview pull requests and get your pull requests reviewed!" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#our-challenges", - "href": "presentations/2024-09-05_earl-nhp/index.html#our-challenges", - "title": "Using R and Python to model future hospital activity", - "section": "Our challenges", - "text": "Our challenges\n\n28 hospitals currently using the model\nModel is being developed whilst in production\nModel is very complex - technically, and for end users\n\n\n\nHospitals are actively using the model while it is still in development, which can be tricky\nDataset is massive for each hospital - hundreds and thousands of rows - all activity for a hospital trust in one year\nModel can accommodate hundreds of different variables, understanding and setting these can be challenging for end users\nWe have comprehensive, openly available documentation and also a team of Model Relationship Managers to help address this" + "objectID": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-communication", + "href": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-communication", + "title": "Collaborative working", + "section": "Asynchronous communication", + "text": "Asynchronous communication\n\nInvolve others before you pull request\nInvolve others when you pull request\nRead issues!\nComment on issues!\nFile issues- suggestions/ bug 
reports/ questions\n\nNOT in emails" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#tools-and-platforms", - "href": "presentations/2024-09-05_earl-nhp/index.html#tools-and-platforms", - "title": "Using R and Python to model future hospital activity", - "section": "Tools and platforms", - "text": "Tools and platforms\n\nData pipelines: {targets} , SQL \nModel: Python , Docker \nApps: {shiny} and {golem} , Posit Connect \nInfrastructure and storage: Azure \nDocumentation: Quarto \nVersion control and collaboration: Git , GitHub \n\n\n\nSo how did we solve the problem?\nHere’s a rundown of the tools and platforms that we use.\nThe data pipeline is orchestrated by {targets} for its recipe-like format and so we re-run only what needs re-running.\nThe model is built in Python and involves a lot of pandas DataFrame manipulations.\nWe use Azure for storage of model input data and JSON files of results.\nUsers input model paramters in one Shiny app and view results in another. This uses modules and {golem} for its package focus, as well as {bs4Dash}. We have development and productino environments.\nWe have a deployed Quarto website that contains the documentation for the whole project.\nIn general, we’re following the principles of Reproducible Analytical Pipelines (RAP) in everything we do.\nAll originally written by Tom.\nAs the team has grown we have shared responsibilities: YiWen in Python, Matt with Shiny, Tom as technical lead." 
+ "objectID": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-work", + "href": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-work", + "title": "Collaborative working", + "section": "Asynchronous work", + "text": "Asynchronous work\n\nEvery piece of work has an issue associated with it\nEvery piece of work associated with an issue lives on its own branch\nEvery branch is incorporated into the main repo by a pull request\nEvery pull request is reviewed" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#structure", - "href": "presentations/2024-09-05_earl-nhp/index.html#structure", - "title": "Using R and Python to model future hospital activity", - "section": "", - "text": "This is a simplified overview of the structure and flow of information through the system.\nThe full structure is quite complex, reflecting the complexity of user needs and the scale of the task.\nData from our database is processed and stored in Azure Storage Containers via a targets pipeline. Additional data, like ONS population projections, are also stored.\nThe users interact with a Shiny app to set their input parameters. The app provides some contextual information derived from the data held in Azure. Users click a button to run the model.\nThe model is deployed as a Docker container in Azure Continer Instances, triggered by an API call.\nThe model results are stored as JSON in an Azure container, ready for collection and presentation in an outputs app.\nUsers can view charts and tables and download files for further analysis.\nSo there’s clear front- and backends and we have\nFurther complexity is added by the need to process and present information despite changes to the model over time.\nWe use development and production environments for our apps to help reduce errors."
+ "objectID": "presentations/2023-03-23_collaborative-working/index.html#iteration-and-documentation", + "href": "presentations/2023-03-23_collaborative-working/index.html#iteration-and-documentation", + "title": "Collaborative working", + "section": "Iteration and documentation", + "text": "Iteration and documentation\n\nAnalyse early, analyse often (using RAPs!)\nWrite down what you did\nWrite down what you did but then changed your mind about\nFavour Quarto/ RMarkdown\n\nClean sessions\nDocumentation and graphics" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#outputs-app", - "href": "presentations/2024-09-05_earl-nhp/index.html#outputs-app", - "title": "Using R and Python to model future hospital activity", - "section": "", - "text": "Here’s a preview of the outputs app.\nIn the navbar you can see that users can aggregate by hospital sites; view charts and tables; and download results files for further processing.\nThere are also context-specific drodown menus to focus in on certain data. For example, to see results by activity type: inpatients, outpatients or A&E.\nIn this particular tab we can see a beeswarm plot showing each simulation as an individual point. This kind of presentation is important to remind users that the model outputs a distribution; that there are range of possibilities.\nThe data provided here to users is used to drive decisions about the size of hospital that will be developed." 
+ "objectID": "presentations/2023-03-23_collaborative-working/index.html#data-and-.gitignore", + "href": "presentations/2023-03-23_collaborative-working/index.html#data-and-.gitignore", + "title": "Collaborative working", + "section": "Data and .gitignore", + "text": "Data and .gitignore\n\nYour repo needs to be reproducible but also needs to be safe\nThe main branch should be reproducible by anyone at any time\n\nDocument package dependencies (using renv)\nDocument data loads if the data isn’t in the repo\n\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2024-09-05_earl-nhp/index.html#next", - "href": "presentations/2024-09-05_earl-nhp/index.html#next", - "title": "Using R and Python to model future hospital activity", - "section": "Next", - "text": "Next\n\nForecast regionally and nationally\nMove data and pipelines into Databricks\nOpen-source model code\n\n\n\nWe’re currently working with hospitals and trusts, but we’re also expanding the geographical scale to produce results at the regional and national scale. This will require some thinking around processing, modelling and generating outputs.\nWe’re currently transferring data processing into Databricks, partly to bring all the steps into one platform but also as an opportunity to speed up the processing by using Spark.\nFinally, we already have some aspects in the open, like the project information site, but we’d also like to open-source the model code itself so that others can use and develop it." 
+ "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#which-is-easier-to-read", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#which-is-easier-to-read", + "title": "Coffee and Coding", + "section": "Which is easier to read?", + "text": "Which is easier to read?\n\nae_attendances |>\n filter(org_code %in% c(\"RNA\", \"RL4\")) |>\n mutate(performance = 1 + breaches / attendances) |>\n filter(type == 1) |>\n mutate(met_target = performance >= 0.95)\n\nor\n\nae_attendances |>\n filter(\n org_code %in% c(\"RNA\", \"RL4\"),\n type == 1\n ) |>\n mutate(\n performance = 1 + breaches / attendances,\n met_target = performance >= 0.95\n )\n\n\n spending a few seconds to neatly format your code can greatly improve the legibility to future readers, making the intent of the code far clearer, and will make finding bugs easier to spot.\n\n\n (have you spotted the mistake in the snippets above?)" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#patient-experience", - "href": "presentations/2023-05-23_data-science-for-good/index.html#patient-experience", - "title": "What good data science looks like", - "section": "Patient experience", - "text": "Patient experience\n\nThe NHS collects a lot of patient experience data\nRate the service 1-5 (Very poor… Excellent) but also give written feedback\n\n“Parking was difficult”\n“Doctor was rude”\n“You saved my life”\n\nMany organisations lack the staffing to read all of the feedback in a systematic way\nProduce an algorithm to rate theme and “criticality”" + "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#tidyverse-style-guide", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#tidyverse-style-guide", + "title": "Coffee and Coding", + "section": "Tidyverse Style Guide", + "text": "Tidyverse Style Guide\n\nGood coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread\n\n\nAll style guides are 
fundamentally opinionated. Some decisions genuinely do make code easier to use (especially matching indenting to programming structure), but many decisions are arbitrary. The most important thing about a style guide is that it provides consistency, making code easier to write because you need to make fewer decisions.\n\ntidyverse style guide" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#help-people-to-do-their-jobs", - "href": "presentations/2023-05-23_data-science-for-good/index.html#help-people-to-do-their-jobs", - "title": "What good data science looks like", - "section": "Help people to do their jobs", - "text": "Help people to do their jobs\n\nText based data is complex and built on human experience\nThe tool should enhance, not replace, human understanding\nEnhancing search and filtering\n\nIf they read 100 comments today, which should they read?\n\n“A recommendation engine for feedback data”" + "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#lintr-styler-are-your-new-best-friends", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#lintr-styler-are-your-new-best-friends", + "title": "Coffee and Coding", + "section": "{lintr} + {styler} are your new best friends", + "text": "{lintr} + {styler} are your new best friends\n\n\n{lintr}\n\n{lintr} is a static code analysis tool that inspects your code (without running it)\nit checks for certain classes of errors (e.g. mismatched { and (’s)\nit warns about potential issues (e.g. 
using variables that aren’t defined)\nit warns about places where you are not adhering to the code style\n\n\n{styler}\n\n{styler} is an RStudio add-in that automatically reformats your code, tidying it up to match the style guide\n99.9% of the time it will give you equivalent code, but there is the potential that it may change the behaviour of your code\nit will overwrite the files that you ask it to run on however, so it is vital to be using version control\na good workflow here is to save your file, “stage” the changes to your file, then run {styler}. You can then revert to the staged changes if needed." }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#reflect-what-users-want", - "href": "presentations/2023-05-23_data-science-for-good/index.html#reflect-what-users-want", - "title": "What good data science looks like", - "section": "Reflect what users want", - "text": "Reflect what users want\n\nI have worked with this data since before it existed\nI came to realise that people were struggling to read all of their data\nFits alongside other work happening within NHSE\n\nA framework for understanding patient experience" + "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#what-does-lintr-look-like", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#what-does-lintr-look-like", + "title": "Coffee and Coding", + "section": "What does {lintr} look like?", + "text": "What does {lintr} look like?\n\n\n\nsource: Good practice for writing R code and R packages\n\nrunning lintr can be done in the console, e.g.\n\nlintr::lint_dir(\".\")\n\nor via the Addins menu" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#useful", - "href": "presentations/2023-05-23_data-science-for-good/index.html#useful", - "title": "What good data science looks like", - "section": "Useful", - "text": "Useful\n\nA fundamental principle is that everyone can use\nIf you can run the code, run it\nIf you can
use the API, use it\nIf you just want the dashboard, use it\nCredit to the growth charts API" + "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#using-styler", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#using-styler", + "title": "Coffee and Coding", + "section": "Using {styler}", + "text": "Using {styler}\n\nsource: Good practice for writing R code and R packages" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#understandable", - "href": "presentations/2023-05-23_data-science-for-good/index.html#understandable", - "title": "What good data science looks like", - "section": "Understandable", - "text": "Understandable\n\nTuned to the users needs\nNot simply tuning accuracy scores\nLook at the type of mistake the model is making\nLook at the category it’s predicting\n\nWe can lose a few of common unimportant categories\nWe need to get every rare and important category" + "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#further-thoughts-on-improving-code-legibility", + "href": "presentations/2023-03-09_coffee-and-coding/index.html#further-thoughts-on-improving-code-legibility", + "title": "Coffee and Coding", + "section": "Further thoughts on improving code legibility", + "text": "Further thoughts on improving code legibility\n\ndo not let files grow too big\nbreak up logic into separate files, then you can use source(\"filename.R\") to run the code in that file\nideally, break up your logic into separate functions, each function having its own file, and then call those functions within your analysis\ndo not repeat yourself - if you are copying and pasting your code then you should be thinking about how to write a single function to handle this repeated logic\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#iterative", - "href": 
"presentations/2023-05-23_data-science-for-good/index.html#iterative", - "title": "What good data science looks like", - "section": "Iterative", - "text": "Iterative\n\nYear one\n\n10 categories\nModerate criticality performance\nNo deep learning\nWeak dashboard\nPositive evaluation" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#ai-could-help-identify-high-risk-heart-patients", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#ai-could-help-identify-high-risk-heart-patients", + "title": "Identifying patients at risk", + "section": "AI could help identify high-risk heart patients1", + "text": "AI could help identify high-risk heart patients1\nThe University of Leeds has helped train an AI system called Optimise, that looked at health records of more than two million people.\n…\nOf those two million records that were scanned, more than 400,000 people were identified as being high risk for the likes of heart failure, stroke and diabetes.\nhttps://www.bbc.co.uk/news/articles/cj620yl96kzo" + }, + { + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-it-works", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-it-works", + "title": "Identifying patients at risk", + "section": "How it works", + "text": "How it works\n\nThe input: Health records\nHealth records can be structured or unstructured\n\nStructured: can be stored in a table\nUnstructured: can’t be stored in a table, different shapes/sizes (e.g. 
text, audio, images)" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#iterative-1", - "href": "presentations/2023-05-23_data-science-for-good/index.html#iterative-1", - "title": "What good data science looks like", - "section": "Iterative", - "text": "Iterative\n\nYear two\n\n30-50 categories\nStrong criticality performance\nDeep learning\nImproved dashboard\nWIP\n\nOverall five minor versions of algorithm and seven of dashboard" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-structured-data-in-health-records", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-structured-data-in-health-records", + "title": "Identifying patients at risk", + "section": "Example of structured data in health records", + "text": "Example of structured data in health records\n\n\n\nID\nBMI\nAge\nIMD Decile\nSmoker\nBlood Pressure\n\n\n\n\n1\n17\n49\n3\n1\n110/70\n\n\n2\n25\n67\n1\n1\n129/70\n\n\n3\n20\n39\n8\n0\n140/90\n\n\n4\n28\n81\n6\n0\n130/85\n\n\n5\n29\n41\n4\n0\n120/80\n\n\n\nData is consistent within each column in the table." 
}, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#documented", - "href": "presentations/2023-05-23_data-science-for-good/index.html#documented", - "title": "What good data science looks like", - "section": "Documented", - "text": "Documented\n\nWe’ve documented in the way you usually would\nWe were asked in year 1 to provide plain English documentation\nWe made a website with all the product details" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-unstructured-data-in-health-records", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-unstructured-data-in-health-records", + "title": "Identifying patients at risk", + "section": "Example of unstructured data in health records", + "text": "Example of unstructured data in health records\n\n\n\n\n\n\n\nID\nNotes\n\n\n\n\n1\nShortness of breath\n\n\n2\nPatient attended clinic following one week of fever, vomiting, and abdominal pain.\n\n\n\nThe length of each sentence is different - data not consistent." 
}, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#develop-skills-of-the-staff-technical-and-otherwise", - "href": "presentations/2023-05-23_data-science-for-good/index.html#develop-skills-of-the-staff-technical-and-otherwise", - "title": "What good data science looks like", - "section": "Develop skills of the staff, technical and otherwise", - "text": "Develop skills of the staff, technical and otherwise\n\nYear one created a Python programmer\nYear two created an R/ Shiny programmer\nThe team has learned:\n\nStatic website generation\nText cleaning/ searching/ mining\nCollaborative coding practices\nWorking with and communicating with users\nLinux, databases, APIs…" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-knn", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-knn", + "title": "Identifying patients at risk", + "section": "A simple approach to classifying data: KNN", + "text": "A simple approach to classifying data: KNN\n\n\nClustering algorithms like K Nearest Neighbours (KNN) are on the more basic end of the scale, requiring very little computational power.\n\n1\n\nAntti Ajanki AnAj, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community", - "href": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community", - "title": "What good data science looks like", - "section": "Benefits from, and benefits, the community", - "text": "Benefits from, and benefits, the community\n\nNHSBSA R Shiny template" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-decision-tree", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-decision-tree", + 
"title": "Identifying patients at risk", + "section": "A simple approach to classifying data: Decision Tree", + "text": "A simple approach to classifying data: Decision Tree\n1\nhttps://www.researchgate.net/publication/26635430_Using_Machine_Learning_Algorithms_in_Cardiovascular_Disease_Risk_Evaluation" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community-1", - "href": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community-1", - "title": "What good data science looks like", - "section": "Benefits from, and benefits, the community", - "text": "Benefits from, and benefits, the community\n\nWe benefit and benefit from\n\nNHS-R\nNHS-Pycom\nGovernment Digital Service\nColleagues and friends" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#there-are-many-different-models-out-there", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#there-are-many-different-models-out-there", + "title": "Identifying patients at risk", + "section": "There are many different models out there! 🥴", + "text": "There are many different models out there! 
🥴\n1\nhttps://scikit-learn.org/1.3/tutorial/machine_learning_map/" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#open-and-reproducible", - "href": "presentations/2023-05-23_data-science-for-good/index.html#open-and-reproducible", - "title": "What good data science looks like", - "section": "Open and reproducible", - "text": "Open and reproducible\n\nOff the shelf, proprietary data collection systems dominate\nThey often offer bundled analytic products of low quality\nThe DS time can’t and doesn’t want to offer a complete data system\nHow can we best contribute to improving patient experience for patients in the NHS?\n\nIf the patient experience data won’t come to the mountain…" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#what-makes-a-model-simple-or-complex", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#what-makes-a-model-simple-or-complex", + "title": "Identifying patients at risk", + "section": "What makes a model simple or complex?", + "text": "What makes a model simple or complex?\n\nThere are dozens of different algorithms out there\nEach algorithm has different strengths and weaknesses\nWhat makes a model simple or complex is the amount of computational power required and how much the model needs to “learn” - how many parameters there are" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#open-source-ftw", - "href": "presentations/2023-05-23_data-science-for-good/index.html#open-source-ftw", - "title": "What good data science looks like", - "section": "Open source FTW!", - "text": "Open source FTW!\n\nOften individuals in the NHS don’t want private companies to “benefit” from open code\nBut if they make their products better with open code the patients win\nBest practice as code" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#is-the-input-or-the-computation-complex", + "href": 
"presentations/2024-10-10_what-is-ai-yiwen/index.html#is-the-input-or-the-computation-complex", + "title": "Identifying patients at risk", + "section": "Is the input or the computation complex?", + "text": "Is the input or the computation complex?\n“We used UK primary care EHR data from 2,081,139 individuals aged ≥ 30 years…\nWe trained a random forest classifier using age, sex, ethnicity and comorbidities (OPTIMISE).”1\nNadarajah, Ramesh, et al. “Machine learning to identify community-dwelling individuals at higher risk of incident cardio-renal-metabolic diseases and death.” Future Healthcare Journal 11 (2024): 100109. https://www.sciencedirect.com/science/article/pii/S2514664524002212" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#fun", - "href": "presentations/2023-05-23_data-science-for-good/index.html#fun", - "title": "What good data science looks like", - "section": "Fun!", - "text": "Fun!\n\nCombing through spreadsheets looking for one comment is not fun\nDoing things the same way you did them last year is not fun\nTrying to implement a project that is too complicated is not fun\n\n \n\nWorking with a diverse team with different skills is fun\nAccessing high quality documentation to understand a project better is fun*" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-do-we-know-if-a-model-is-good", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-do-we-know-if-a-model-is-good", + "title": "Identifying patients at risk", + "section": "How do we know if a model is good?", + "text": "How do we know if a model is good?\nBeware of metrics like “accuracy”\nIf only 1% of the population is at risk of heart disease, and the model is a broken model that never ever predicts someone is at risk, this model would still have 99% accuracy, because it would be right 99/100 times!\nCarefully consider what is important to measure, given the context.\n\nIn the scenario above, a measure like “recall” would 
be more useful.1\n\nhttps://www.youtube.com/watch?v=qWfzIYCvBqo" }, { - "objectID": "presentations/2023-05-23_data-science-for-good/index.html#team-and-code", - "href": "presentations/2023-05-23_data-science-for-good/index.html#team-and-code", - "title": "What good data science looks like", - "section": "Team and code", - "text": "Team and code\n\nAndreas Soteriades (Y1)\nYiWen Hon, Oluwasegun Apejoye (Y2)\n\n \n\npxtextmining\nexperiencesdashboard\nDocumentation\n\n\n\nchris.beeley1@nhs.net\nhttps://fosstodon.org/@chrisbeeley\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#pros-and-cons-of-simple-a.i.-approaches", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#pros-and-cons-of-simple-a.i.-approaches", + "title": "Identifying patients at risk", + "section": "Pros and cons of simple “A.I.” approaches", + "text": "Pros and cons of simple “A.I.” approaches\n\n\nPros:\n\nSimple models are more easily explained\nCan sometimes find new patterns in the data\n\n\nCons:\n\nThe quality of the data determines the quality of the model\nNot able to handle very complex tasks" + }, + { + "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#issues-to-look-out-for", + "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#issues-to-look-out-for", + "title": "Identifying patients at risk", + "section": "🚩 Issues to look out for 🚩", + "text": "🚩 Issues to look out for 🚩\n\nHow complex is the input, or the computational approach?\nHow is the model’s performance measured?\nDoes the model get updated?\nWhere did the data come from?\nHave issues of bias or ethics been considered?\n\n\n\nContinuously learning, or learning from mistakes vs. 
snapshot in time\n\n\n\n\n\nLearn more about The Strategy Unit" }, { "objectID": "presentations/2023-02-01_what-is-data-science/index.html#what-is-data-science", @@ -1604,949 +1660,1033 @@ "text": "Note\nAll copyrighted material is reused under Fair Dealing\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#ai-could-help-identify-high-risk-heart-patients", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#ai-could-help-identify-high-risk-heart-patients", - "title": "Identifying patients at risk", - "section": "AI could help identify high-risk heart patients1", - "text": "AI could help identify high-risk heart patients1\nThe University of Leeds has helped train an AI system called Optimise, that looked at health records of more than two million people.\n…\nOf those two million records that were scanned, more than 400,000 people were identified as being high risk for the likes of heart failure, stroke and diabetes.\nhttps://www.bbc.co.uk/news/articles/cj620yl96kzo" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#patient-experience", + "href": "presentations/2023-05-23_data-science-for-good/index.html#patient-experience", + "title": "What good data science looks like", + "section": "Patient experience", + "text": "Patient experience\n\nThe NHS collects a lot of patient experience data\nRate the service 1-5 (Very poor… Excellent) but also give written feedback\n\n“Parking was difficult”\n“Doctor was rude”\n“You saved my life”\n\nMany organisations lack the staffing to read all of the feedback in a systematic way\nProduce an algorithm to rate theme and “criticality”" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-it-works", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-it-works", - "title": "Identifying patients at risk", - "section": "How it works", - "text": "How it works\n\nThe 
input: Health records\nHealth records can be structured or unstructured\n\nStructured: can be stored in a table\nUnstructured: can’t be stored in a table, different shapes/sizes (e.g. text, audio, images)" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#help-people-to-do-their-jobs", + "href": "presentations/2023-05-23_data-science-for-good/index.html#help-people-to-do-their-jobs", + "title": "What good data science looks like", + "section": "Help people to do their jobs", + "text": "Help people to do their jobs\n\nText based data is complex and built on human experience\nThe tool should enhance, not replace, human understanding\nEnhancing search and filtering\n\nIf they read 100 comments today, which should they read?\n\n“A recommendation engine for feedback data”" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-structured-data-in-health-records", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-structured-data-in-health-records", - "title": "Identifying patients at risk", - "section": "Example of structured data in health records", - "text": "Example of structured data in health records\n\n\n\nID\nBMI\nAge\nIMD Decile\nSmoker\nBlood Pressure\n\n\n\n\n1\n17\n49\n3\n1\n110/70\n\n\n2\n25\n67\n1\n1\n129/70\n\n\n3\n20\n39\n8\n0\n140/90\n\n\n4\n28\n81\n6\n0\n130/85\n\n\n5\n29\n41\n4\n0\n120/80\n\n\n\nData is consistent within each column in the table." 
+ "objectID": "presentations/2023-05-23_data-science-for-good/index.html#reflect-what-users-want", + "href": "presentations/2023-05-23_data-science-for-good/index.html#reflect-what-users-want", + "title": "What good data science looks like", + "section": "Reflect what users want", + "text": "Reflect what users want\n\nI have worked with this data since before it existed\nI came to realise that people were struggling to read all of their data\nFits alongside other work happening within NHSE\n\nA framework for understanding patient experience" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-unstructured-data-in-health-records", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#example-of-unstructured-data-in-health-records", - "title": "Identifying patients at risk", - "section": "Example of unstructured data in health records", - "text": "Example of unstructured data in health records\n\n\n\n\n\n\n\nID\nNotes\n\n\n\n\n1\nShortness of breath\n\n\n2\nPatient attended clinic following one week of fever, vomiting, and abdominal pain.\n\n\n\nThe length of each sentence is different - data not consistent." 
+ "objectID": "presentations/2023-05-23_data-science-for-good/index.html#useful", + "href": "presentations/2023-05-23_data-science-for-good/index.html#useful", + "title": "What good data science looks like", + "section": "Useful", + "text": "Useful\n\nA fundamental principle is that everyone can use\nIf you can run the code, run it\nIf you can use the API, use it\nIf you just want the dashboard, use it\nCredit to the growth charts API" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-knn", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-knn", - "title": "Identifying patients at risk", - "section": "A simple approach to classifying data: KNN", - "text": "A simple approach to classifying data: KNN\n\n\nClustering algorithms like K Nearest Neighbours (KNN) are on the more basic end of the scale, requiring very little computational power.\n\n1\n\nAntti Ajanki AnAj, CC BY-SA 3.0 http://creativecommons.org/licenses/by-sa/3.0/, via Wikimedia Commons" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#understandable", + "href": "presentations/2023-05-23_data-science-for-good/index.html#understandable", + "title": "What good data science looks like", + "section": "Understandable", + "text": "Understandable\n\nTuned to the users needs\nNot simply tuning accuracy scores\nLook at the type of mistake the model is making\nLook at the category it’s predicting\n\nWe can lose a few of common unimportant categories\nWe need to get every rare and important category" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-decision-tree", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#a-simple-approach-to-classifying-data-decision-tree", - "title": "Identifying patients at risk", - "section": "A simple approach to classifying data: Decision Tree", - "text": "A simple approach to 
classifying data: Decision Tree\n1\nhttps://www.researchgate.net/publication/26635430_Using_Machine_Learning_Algorithms_in_Cardiovascular_Disease_Risk_Evaluation" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#iterative", + "href": "presentations/2023-05-23_data-science-for-good/index.html#iterative", + "title": "What good data science looks like", + "section": "Iterative", + "text": "Iterative\n\nYear one\n\n10 categories\nModerate criticality performance\nNo deep learning\nWeak dashboard\nPositive evaluation" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#there-are-many-different-models-out-there", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#there-are-many-different-models-out-there", - "title": "Identifying patients at risk", - "section": "There are many different models out there! 🥴", - "text": "There are many different models out there! 🥴\n1\nhttps://scikit-learn.org/1.3/tutorial/machine_learning_map/" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#iterative-1", + "href": "presentations/2023-05-23_data-science-for-good/index.html#iterative-1", + "title": "What good data science looks like", + "section": "Iterative", + "text": "Iterative\n\nYear two\n\n30-50 categories\nStrong criticality performance\nDeep learning\nImproved dashboard\nWIP\n\nOverall five minor versions of algorithm and seven of dashboard" }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#what-makes-a-model-simple-or-complex", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#what-makes-a-model-simple-or-complex", - "title": "Identifying patients at risk", - "section": "What makes a model simple or complex?", - "text": "What makes a model simple or complex?\n\nThere are dozens of different algorithms out there\nEach algorithm has different strengths and weaknesses\nWhat makes a model simple or complex is the amount of computational power required and how much 
the model needs to “learn” - how many parameters there are" + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#documented", + "href": "presentations/2023-05-23_data-science-for-good/index.html#documented", + "title": "What good data science looks like", + "section": "Documented", + "text": "Documented\n\nWe’ve documented in the way you usually would\nWe were asked in year 1 to provide plain English documentation\nWe made a website with all the product details" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#develop-skills-of-the-staff-technical-and-otherwise", + "href": "presentations/2023-05-23_data-science-for-good/index.html#develop-skills-of-the-staff-technical-and-otherwise", + "title": "What good data science looks like", + "section": "Develop skills of the staff, technical and otherwise", + "text": "Develop skills of the staff, technical and otherwise\n\nYear one created a Python programmer\nYear two created an R/ Shiny programmer\nThe team has learned:\n\nStatic website generation\nText cleaning/ searching/ mining\nCollaborative coding practices\nWorking with and communicating with users\nLinux, databases, APIs…" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community", + "href": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community", + "title": "What good data science looks like", + "section": "Benefits from, and benefits, the community", + "text": "Benefits from, and benefits, the community\n\nNHSBSA R Shiny template" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community-1", + "href": "presentations/2023-05-23_data-science-for-good/index.html#benefits-from-and-benefits-the-community-1", + "title": "What good data science looks like", + "section": "Benefits from, and benefits, the community", + "text": "Benefits from, and 
benefits, the community\n\nWe benefit and benefit from\n\nNHS-R\nNHS-Pycom\nGovernment Digital Service\nColleagues and friends" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#open-and-reproducible", + "href": "presentations/2023-05-23_data-science-for-good/index.html#open-and-reproducible", + "title": "What good data science looks like", + "section": "Open and reproducible", + "text": "Open and reproducible\n\nOff the shelf, proprietary data collection systems dominate\nThey often offer bundled analytic products of low quality\nThe DS time can’t and doesn’t want to offer a complete data system\nHow can we best contribute to improving patient experience for patients in the NHS?\n\nIf the patient experience data won’t come to the mountain…" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#open-source-ftw", + "href": "presentations/2023-05-23_data-science-for-good/index.html#open-source-ftw", + "title": "What good data science looks like", + "section": "Open source FTW!", + "text": "Open source FTW!\n\nOften individuals in the NHS don’t want private companies to “benefit” from open code\nBut if they make their products better with open code the patients win\nBest practice as code" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#fun", + "href": "presentations/2023-05-23_data-science-for-good/index.html#fun", + "title": "What good data science looks like", + "section": "Fun!", + "text": "Fun!\n\nCombing through spreadsheets looking for one comment is not fun\nDoing things the same way you did them last year is not fun\nTrying to implement a project that is too complicated is not fun\n\n \n\nWorking with a diverse team with different skills is fun\nAccessing high quality documentation to understand a project better is fun*" + }, + { + "objectID": "presentations/2023-05-23_data-science-for-good/index.html#team-and-code", + "href": 
"presentations/2023-05-23_data-science-for-good/index.html#team-and-code", + "title": "What good data science looks like", + "section": "Team and code", + "text": "Team and code\n\nAndreas Soteriades (Y1)\nYiWen Hon, Oluwasegun Apejoye (Y2)\n\n \n\npxtextmining\nexperiencesdashboard\nDocumentation\n\n\n\nchris.beeley1@nhs.net\nhttps://fosstodon.org/@chrisbeeley\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + }, + { + "objectID": "presentations/2024-09-05_earl-nhp/index.html#the-new-hospital-programme-nhp", + "href": "presentations/2024-09-05_earl-nhp/index.html#the-new-hospital-programme-nhp", + "title": "Using R and Python to model future hospital activity", + "section": "The New Hospital Programme (NHP)", + "text": "The New Hospital Programme (NHP)\n\n\n\nA manifesto commitment\nFuture activity must be modelled\nNeed consistency across schemes\n\n\n\n\n\n\nBuilding new hospitals - replacing crumbling infrastructure in some cases, completely new builds in others.\nIt’s important to size the hospitals according to the type and quantity of activity there will be in the future.\nThere are many proprietary black box models in use for estimating healthcare activity in the future - no consistency, difficult to compare results\nStrategy Unit was asked to develop a model to be used across all of the builds: a model owned and operated by the NHS, for the NHS.\n\n\n::::" + }, + { + "objectID": "presentations/2024-09-05_earl-nhp/index.html#model-process", + "href": "presentations/2024-09-05_earl-nhp/index.html#model-process", + "title": "Using R and Python to model future hospital activity", + "section": "", + "text": "A probabilistic Monte Carlo simulation that:\n\nTakes hospital activity from a baseline year, using NHS England’s Hospital Episode Statistics (HES) data\nApplies variables that:\n\nare outside of our control (e.g. population changes, using ONS projections)\ncan reduce hospital activity (mitigators, e.g. 
virtual wards or teleappointments)\n\nForecasts future demand based on these variables, outputting probabilistic predictive intervals" + }, + { + "objectID": "presentations/2024-09-05_earl-nhp/index.html#our-challenges", + "href": "presentations/2024-09-05_earl-nhp/index.html#our-challenges", + "title": "Using R and Python to model future hospital activity", + "section": "Our challenges", + "text": "Our challenges\n\n28 hospitals currently using the model\nModel is being developed whilst in production\nModel is very complex - technically, and for end users\n\n\n\nHospitals are actively using the model while it is still in development, which can be tricky\nDataset is massive for each hospital - hundreds and thousands of rows - all activity for a hospital trust in one year\nModel can accommodate hundreds of different variables, understanding and setting these can be challenging for end users\nWe have comprehensive, openly available documentation and also a team of Model Relationship Managers to help address this" + }, + { + "objectID": "presentations/2024-09-05_earl-nhp/index.html#tools-and-platforms", + "href": "presentations/2024-09-05_earl-nhp/index.html#tools-and-platforms", + "title": "Using R and Python to model future hospital activity", + "section": "Tools and platforms", + "text": "Tools and platforms\n\nData pipelines: {targets} , SQL \nModel: Python , Docker \nApps: {shiny} and {golem} , Posit Connect \nInfrastructure and storage: Azure \nDocumentation: Quarto \nVersion control and collaboration: Git , GitHub \n\n\n\nSo how did we solve the problem?\nHere’s a rundown of the tools and platforms that we use.\nThe data pipeline is orchestrated by {targets} for its recipe-like format and so we re-run only what needs re-running.\nThe model is built in Python and involves a lot of pandas DataFrame manipulations.\nWe use Azure for storage of model input data and JSON files of results.\nUsers input model parameters in one Shiny app and view results in another. 
This uses modules and {golem} for its package focus, as well as {bs4Dash}. We have development and production environments.\nWe have a deployed Quarto website that contains the documentation for the whole project.\nIn general, we’re following the principles of Reproducible Analytical Pipelines (RAP) in everything we do.\nAll originally written by Tom.\nAs the team has grown we have shared responsibilities: YiWen in Python, Matt with Shiny, Tom as technical lead." }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#is-the-input-or-the-computation-complex", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#is-the-input-or-the-computation-complex", - "title": "Identifying patients at risk", - "section": "Is the input or the computation complex?", - "text": "Is the input or the computation complex?\n“We used UK primary care EHR data from 2,081,139 individuals aged ≥ 30 years…\nWe trained a random forest classifier using age, sex, ethnicity and comorbidities (OPTIMISE).”1\nNadarajah, Ramesh, et al. “Machine learning to identify community-dwelling individuals at higher risk of incident cardio-renal-metabolic diseases and death.” Future Healthcare Journal 11 (2024): 100109. https://www.sciencedirect.com/science/article/pii/S2514664524002212" + "objectID": "presentations/2024-09-05_earl-nhp/index.html#structure", + "href": "presentations/2024-09-05_earl-nhp/index.html#structure", + "title": "Using R and Python to model future hospital activity", + "section": "", + "text": "This is a simplified overview of the structure and flow of information through the system.\nThe full structure is quite complex, reflecting the complexity of user needs and the scale of the task.\nData from our database is processed and stored in Azure Storage Containers via a targets pipeline. Additional data, like ONS population projections, are also stored.\nThe users interact with a Shiny app to set their input parameters. 
The app provides some contextual information derived from the data held in Azure. Users click a button to run the model.\nThe model is deployed as a Docker container in Azure Container Instances, triggered by an API call.\nThe model results are stored as JSON in an Azure container, ready for collection and presentation in an outputs app.\nUsers can view charts and tables and download files for further analysis.\nSo there’s clear front- and backends and we have\nFurther complexity is added by the need to process and present information despite changes to the model over time.\nWe use development and production environments for our apps to help reduce errors." }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-do-we-know-if-a-model-is-good", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#how-do-we-know-if-a-model-is-good", - "title": "Identifying patients at risk", - "section": "How do we know if a model is good?", - "text": "How do we know if a model is good?\nBeware of metrics like “accuracy”\nIf only 1% of the population is at risk of heart disease, and the model is a broken model that never ever predicts someone is at risk, this model would still have 99% accuracy, because it would be right 99/100 times!\nCarefully consider what is important to measure, given the context.\n\nIn the scenario above, a measure like “recall” would be more useful.1\n\nhttps://www.youtube.com/watch?v=qWfzIYCvBqo" + "objectID": "presentations/2024-09-05_earl-nhp/index.html#outputs-app", + "href": "presentations/2024-09-05_earl-nhp/index.html#outputs-app", + "title": "Using R and Python to model future hospital activity", + "section": "", + "text": "Here’s a preview of the outputs app.\nIn the navbar you can see that users can aggregate by hospital sites; view charts and tables; and download results files for further processing.\nThere are also context-specific dropdown menus to focus in on certain data. 
For example, to see results by activity type: inpatients, outpatients or A&E.\nIn this particular tab we can see a beeswarm plot showing each simulation as an individual point. This kind of presentation is important to remind users that the model outputs a distribution; that there is a range of possibilities.\nThe data provided here to users is used to drive decisions about the size of hospital that will be developed." }, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#pros-and-cons-of-simple-a.i.-approaches", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#pros-and-cons-of-simple-a.i.-approaches", - "title": "Identifying patients at risk", - "section": "Pros and cons of simple “A.I.” approaches", - "text": "Pros and cons of simple “A.I.” approaches\n\n\nPros:\n\nSimple models are more easily explained\nCan sometimes find new patterns in the data\n\n\nCons:\n\nThe quality of the data determines the quality of the model\nNot able to handle very complex tasks" + "objectID": "presentations/2024-09-05_earl-nhp/index.html#next", + "href": "presentations/2024-09-05_earl-nhp/index.html#next", + "title": "Using R and Python to model future hospital activity", + "section": "Next", + "text": "Next\n\nForecast regionally and nationally\nMove data and pipelines into Databricks\nOpen-source model code\n\n\n\nWe’re currently working with hospitals and trusts, but we’re also expanding the geographical scale to produce results at the regional and national scale. This will require some thinking around processing, modelling and generating outputs.\nWe’re currently transferring data processing into Databricks, partly to bring all the steps into one platform but also as an opportunity to speed up the processing by using Spark.\nFinally, we already have some aspects in the open, like the project information site, but we’d also like to open-source the model code itself so that others can use and develop it." 
}, { - "objectID": "presentations/2024-10-10_what-is-ai-yiwen/index.html#issues-to-look-out-for", - "href": "presentations/2024-10-10_what-is-ai-yiwen/index.html#issues-to-look-out-for", - "title": "Identifying patients at risk", - "section": "🚩 Issues to look out for 🚩", - "text": "🚩 Issues to look out for 🚩\n\nHow complex is the input, or the computational approach?\nHow is the model’s performance measured?\nDoes the model get updated?\nWhere did the data come from?\nHave issues of bias or ethics been considered?\n\n\n\nContinuously learning, or learning from mistakes vs. snapshot in time\n\n\n\n\n\nLearn more about The Strategy Unit" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#why", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#why", + "title": "Repeating Yourself with Functions", + "section": "Why?", + "text": "Why?\n\nForecasting project, need to do the same thing with data for 6 centres.\nCopy-paste runs risk of not doing the same thing each time (and boring/time-consuming/frustrating).\nRepetition –> function." 
}, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#which-is-easier-to-read", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#which-is-easier-to-read", - "title": "Coffee and Coding", - "section": "Which is easier to read?", - "text": "Which is easier to read?\n\nae_attendances |>\n filter(org_code %in% c(\"RNA\", \"RL4\")) |>\n mutate(performance = 1 + breaches / attendances) |>\n filter(type == 1) |>\n mutate(met_target = performance >= 0.95)\n\nor\n\nae_attendances |>\n filter(\n org_code %in% c(\"RNA\", \"RL4\"),\n type == 1\n ) |>\n mutate(\n performance = 1 + breaches / attendances,\n met_target = performance >= 0.95\n )\n\n\n spending a few seconds to neatly format your code can greatly improve the legibility to future readers, making the intent of the code far clearer, and will make finding bugs easier to spot.\n\n\n (have you spotted the mistake in the snippets above?)" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what", + "title": "Repeating Yourself with Functions", + "section": "What?", + "text": "What?\n\n\nDemo with plots, equally applicable to ‘doing stuff’ with data.\n\n\n# preview data\nhead(new_rtt)\n\n provider_code count rtt_yrmon rtt_mon\n1 RJE 83 Nov 2015 11\n2 RJE 75 Dec 2015 12\n3 RJE 82 Jan 2016 1\n4 RJE 74 Feb 2016 2\n5 RJE 62 Mar 2016 3\n6 RJE 76 Apr 2016 4\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nRemember, this is about writing functions, not creating stunning visualisations!\n\n\n\nRepeat this for each of the 6 centres" }, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#tidyverse-style-guide", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#tidyverse-style-guide", - "title": "Coffee and Coding", - "section": "Tidyverse Style Guide", - "text": "Tidyverse Style Guide\n\nGood coding style is like correct punctuation: you can manage without it, 
butitsuremakesthingseasiertoread\n\n\nAll style guides are fundamentally opinionated. Some decisions genuinely do make code easier to use (especially matching indenting to programming structure), but many decisions are arbitrary. The most important thing about a style guide is that it provides consistency, making code easier to write because you need to make fewer decisions.\n\ntidyverse style guide" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#how", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#how", + "title": "Repeating Yourself with Functions", + "section": "How?", + "text": "How?\nDo it ‘normally’ for one centre. What are the parameters to change?\n\n\np1 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = \"RJE\",\n subtitle = \"time trend of new referrals\")\n\np2 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n\nplots <- ggarrange(p1, p2, nrow = 2)\n\nplots\n\n\n\n\n\n\n\n\n\n\nThis becomes the argument for the function.\nChoose a name for the argument (!= variable_name)\nIn this example we will use prov in place of \"RJE\"\n\n\n\nPlease remember, this is about writing functions, not creating stunning visualisations!" 
}, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#lintr-styler-are-your-new-best-friends", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#lintr-styler-are-your-new-best-friends", - "title": "Coffee and Coding", - "section": "{lintr} + {styler} are your new best friends", - "text": "{lintr} + {styler} are your new best friends\n\n\n{lintr}\n\n{lintr} is a static code analysis tool that inspects your code (without running it)\nit checks for certain classes of errors (e.g. mismatched { and (’s)\nit warns about potential issues (e.g. using variables that aren’t defined)\nit warns about places where you are not adhering to the code style\n\n\n{styler}\n\n{styler} is an RStudio add in that automatically reformats your code, tidying it up to match the style guide\n99.9% of the time it will give you equivalent code, but there is the potential that it may change the behaviour of your code\nit will overwrite the files that you ask it to run on however, so it is vital to be using version control\na good workflow here is to save your file, “stage” the changes to your file, then run {styler}. You can then revert back to the staged changed if needed." 
+ "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#anatomy-of-a-function", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#anatomy-of-a-function", + "title": "Repeating Yourself with Functions", + "section": "Anatomy of a Function", + "text": "Anatomy of a Function\n\nfn_name <- function(arguments){\n \n # do stuff\n \n}\n\nRun the function with fn_name(parameter as argument)" }, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#what-does-lintr-look-like", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#what-does-lintr-look-like", - "title": "Coffee and Coding", - "section": "What does {lintr} look like?", - "text": "What does {lintr} look like?\n\n\n\nsource: Good practice for writing R code and R packages\n\nrunning lintr can be done in the console, e.g.\n\nlintr::lintr_dir(\".\")\n\nor via the Addins menu" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#turning-our-code-into-a-function", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#turning-our-code-into-a-function", + "title": "Repeating Yourself with Functions", + "section": "Turning our code into a function", + "text": "Turning our code into a function\n\n\n\np1 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = \"RJE\",\n subtitle = \"time trend of new referrals\")\n\np2 <- new_rtt |> \n filter(provider_code == \"RJE\") |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n\nplots <- ggarrange(p1, p2, nrow = 2)\n\nplots\n\n\n\nfn_plots <- function(prov){\n \n p1 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position 
= \"none\") +\n labs(title = prov,\n subtitle = \"time trend of new referrals\")\n \n p2 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n \n plots <- ggarrange(p1, p2, nrow = 2)\n \n plots\n \n}" }, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#using-styler", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#using-styler", - "title": "Coffee and Coding", - "section": "Using {styler}", - "text": "Using {styler}\n\nsource: Good practice for writing R code and R packages" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#running-our-function", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#running-our-function", + "title": "Repeating Yourself with Functions", + "section": "Running our function", + "text": "Running our function\n\n\n\nfn_plots <- function(prov){\n \n p1 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = rtt_yrmon, y = count)) +\n geom_line() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(title = prov,\n subtitle = \"time trend of new referrals\")\n \n p2 <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = \"monthly pattern of new referrals\")\n \n plots <- ggarrange(p1, p2, nrow = 2)\n \n plots\n \n}\n\n\n\nfn_plots(\"RKB\")" }, { - "objectID": "presentations/2023-03-09_coffee-and-coding/index.html#further-thoughts-on-improving-code-legibility", - "href": "presentations/2023-03-09_coffee-and-coding/index.html#further-thoughts-on-improving-code-legibility", - "title": "Coffee and Coding", - "section": "Further thoughts on improving code legibility", - "text": "Further thoughts on improving code legibility\n\ndo not 
let files grow too big\nbreak up logic into separate files, then you can use source(\"filename.R) to run the code in that file\nidealy, break up your logic into separate functions, each function having it’s own file, and then call those functions within your analysis\ndo not repeat yourself - if you are copying and pasting your code then you should be thinking about how to write a single function to handle this repeated logic\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what-if-we-want-more-than-one-argument", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#what-if-we-want-more-than-one-argument", + "title": "Repeating Yourself with Functions", + "section": "What if we want more than one argument?", + "text": "What if we want more than one argument?\nEasy! Just add them to the arguments when you define the function.\nIf I wanted to run this function on multiple dataframes I would change the function to:\n\nfn_plots <- function(df, prov){\n \n p1 <- df |> \n filter(provider_code == prov) \n # and the rest as before\n}\n\nand run it with fn_plots(new_rtt, \"RKB\").\nNote that the order of entering the parameters is important. If I tried to run fn_plots(\"RKB\", new_rtt) it would look for a dataframe called \"RKB\" and a provider called new_rtt." 
}, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#introduction", - "href": "presentations/2023-03-23_collaborative-working/index.html#introduction", - "title": "Collaborative working", - "section": "Introduction", - "text": "Introduction\n\nThis is definitely an art and not a science\nI do not claim to have all, or even most of, the answers\nHow you use these tools is way more important than the tools themselves\nThis is a culture and not a technique" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#working-through-a-list-of-parameters", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#working-through-a-list-of-parameters", + "title": "Repeating Yourself with Functions", + "section": "Working through a list of parameters", + "text": "Working through a list of parameters\nAvoid manually running fn_plots() for each provider.\nUse purrr::map to iterate over a list\n\n\n# create a vector of all the providers\nprov_labels <- c(\"RJE\", \"RKB\", \"RL4\", \"RRK\", \"RWE\", \"RX1\")\n\nmap(prov_labels, ~ fn_plots(.x))\n\n\n[[1]]\n\n\n\n\n\n\n\n\n\n\n[[2]]\n\n\n\n\n\n\n\n\n\n\n[[3]]\n\n\n\n\n\n\n\n\n\n\n[[4]]\n\n\n\n\n\n\n\n\n\n\n[[5]]\n\n\n\n\n\n\n\n\n\n\n[[6]]" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#costs", - "href": "presentations/2023-03-23_collaborative-working/index.html#costs", - "title": "Collaborative working", - "section": "Costs", - "text": "Costs\n\nDelay and time\nStress and disagreement\nCommittee thinking\nLearning and effort" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-the-function-work", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-the-function-work", + "title": "Repeating Yourself with Functions", + "section": "Troubleshooting - does the function work?", + "text": "Troubleshooting - does the function work?\nCrawl before you 
can walk - make sure fn_plots() works for one parameter.\nInsert browser() into the function while testing - steps into the function (don’t forget to remove it when it works!)\n\n\nThis is a new function that will save each time-trend plot\n\nfn_save_plot <- function(prov){\n \n p <- new_rtt |> \n filter(provider_code == prov) |> \n ggplot(aes(x = month(rtt_yrmon), y = count)) +\n geom_col() +\n su_theme() +\n theme(legend.position = \"none\") +\n labs(\n subtitle = paste0(prov, \" - monthly pattern of new referrals\"))\n \n ggsave(paste0(prov, \"_plot.png\"), \n plot = p)\n \n}\n\n\n \n\n\n\nCheck out Shannon Pileggi’s slides for more options" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#benefits", - "href": "presentations/2023-03-23_collaborative-working/index.html#benefits", - "title": "Collaborative working", - "section": "Benefits", - "text": "Benefits\n\n“From each according to their ability”\nLearning\nReproducibility and reduced truck factor\nFun!" + "objectID": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-it-walk-the-walk", + "href": "presentations/2023-09-07_coffee_and_coding_functions/index.html#troubleshooting---does-it-walk-the-walk", + "title": "Repeating Yourself with Functions", + "section": "Troubleshooting - does it walk the walk?", + "text": "Troubleshooting - does it walk the walk?\nWhen learning to walk, use safely() or possibly() in your walk function - it will indicate if any parameters have failed, rather than just fall down.\n\n\n\n# wrap fn_save_plot in safely\nsafe_pl <- safely(.f = fn_save_plot)\n\nmap(prov_labels, ~ safe_pl(.x))\n\n\n# wrap fn_save_plot in possibly\nposs_pl <- possibly(.f = fn_save_plot)\n\nmap(prov_labels, ~ poss_pl(.x))\n\n\nConsole output of wrapping function in possibly\n\n\n\n\n\nThis is my first attempt at a Quarto presentation!" 
}, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#github-as-an-organising-principle-behind-work", - "href": "presentations/2023-03-23_collaborative-working/index.html#github-as-an-organising-principle-behind-work", - "title": "Collaborative working", - "section": "GitHub as an organising principle behind work", - "text": "GitHub as an organising principle behind work\n\nA project is just a set of milestones\nA milestone is just a set of issues\nAn issue is just a set of commits\nA commit is just text added and removed" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#welcome-to-coffee-and-coding", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#welcome-to-coffee-and-coding", + "title": "Coffee and coding", + "section": "Welcome to coffee and coding", + "text": "Welcome to coffee and coding\n\nProject demos, showcasing work from a particular project\nMethod demos, showcasing how to use a particular method/tool/package\nSurgery and problem solving sessions\nDefining code standards and SOP" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#the-repo-owner", - "href": "presentations/2023-03-23_collaborative-working/index.html#the-repo-owner", - "title": "Collaborative working", - "section": "The repo owner", - "text": "The repo owner\n\nReview milestones\nReview issues\n\nDiscuss the issue on the issue- NOT on email!\n\nReview pull requests and get your pull requests reviewed!" 
+ "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-we-trying-to-achieve", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-we-trying-to-achieve", + "title": "Coffee and coding", + "section": "What are we trying to achieve?", + "text": "What are we trying to achieve?\n\nLegibility\nReproducibility\nAccuracy\nLaziness" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-communication", - "href": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-communication", - "title": "Collaborative working", - "section": "Asynchronous communication", - "text": "Asynchronous communication\n\nInvolve others before you pull request\nInvolve others when you pull request\nRead issues!\nComment on issues!\nFile issues- suggestions/ bug reports/ questions\n\nNOT in emails" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-some-of-the-fundamental-principles", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-are-some-of-the-fundamental-principles", + "title": "Coffee and coding", + "section": "What are some of the fundamental principles?", + "text": "What are some of the fundamental principles?\n\nPredictability, reducing mental load, and reducing truck factor\nMaking it easy to collaborate with yourself and others on different computers, in the cloud, in six months’ time…\nDRY" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-work", - "href": "presentations/2023-03-23_collaborative-working/index.html#asynchronous-work", - "title": "Collaborative working", - "section": "Asynchronous work", - "text": "Asynchronous work\n\nEvery piece of work has an issues associated with it\nEvery piece of work associated with an issue lives on its own branch\nEvery branch is incorporated to the main repo by a pull request\nEvery pull request is reviewed" + "objectID": 
"presentations/2023-02-23_coffee-and-coding/index.html#what-is-rap", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#what-is-rap", + "title": "Coffee and coding", + "section": "What is RAP", + "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training\n\nGoldacre review" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#iteration-and-documentation", - "href": "presentations/2023-03-23_collaborative-working/index.html#iteration-and-documentation", - "title": "Collaborative working", - "section": "Iteration and documentation", - "text": "Iteration and documentation\n\nAnalyse early, analyse often (using RAPs!)\nWrite down what you did\nWrite down what you did but then changed your mind about\nFavour Quarto/ RMarkdown\n\nClean sessions\nDocumentation and graphics" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#the-road-to-rap", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#the-road-to-rap", + "title": "Coffee and coding", + "section": "The road to RAP", + "text": "The road to RAP\n\nWe’re roughly using NHS Digital’s RAP stages\nThere is an incredibly large amount to learn!\nConfession time! (everything I do not know…)\nYou don’t need to do it all at once\nYou don’t need to do it all at all ever\nEach thing you learn will incrementally help you\nRemember- that’s why we learnt this stuff. Because it helped us. 
And it can help you too" }, { - "objectID": "presentations/2023-03-23_collaborative-working/index.html#data-and-.gitignore", - "href": "presentations/2023-03-23_collaborative-working/index.html#data-and-.gitignore", - "title": "Collaborative working", - "section": "Data and .gitignore", - "text": "Data and .gitignore\n\nYour repo needs to be reproducible but also needs to be safe\nThe main branch should be reproducible by anyone at any time\n\nDocument package dependencies (using renv)\nDocument data loads if the data isn’t in the repo\n\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--baseline", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--baseline", + "title": "Coffee and coding", + "section": "Levels of RAP- Baseline", + "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL).\nCode is version controlled (see Git basics and using Git collaboratively guides).\nRepository includes a README.md file (or equivalent) that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed.\nCode is published in the open and linked to & from accompanying publication (if relevant).\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#why", - "href": "presentations/2024-05-16_store-data-safely/index.html#why", - "title": "Store Data Safely", - "section": "Why?", - "text": "Why?\nBecause:\n\ndata may be sensitive\nGitHub was designed for source control of code\nGitHub has repository file-size limits\nit makes data independent from code\nit prevents repetition" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--silver", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--silver", + "title": "Coffee and coding", + 
"section": "Levels of RAP- Silver", + "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. requirements.txt, PipFile, environment.yml\nData is handled and output in a Tidy data format\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#other-approaches", - "href": "presentations/2024-05-16_store-data-safely/index.html#other-approaches", - "title": "Store Data Safely", - "section": "Other approaches", - "text": "Other approaches\nTo prevent data commits:\n\nuse a .gitignore file (*.csv, etc)\nuse Git hooks\navoid ‘add all’ (git add .) when staging\nensure thorough reviews of (small) pull-requests" + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--gold", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#levels-of-rap--gold", + "title": "Coffee and coding", + "section": "Levels of RAP- Gold", + "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. 
(See gov.uk info on Semantic Versioning)\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-if-i-committed-data", - "href": "presentations/2024-05-16_store-data-safely/index.html#what-if-i-committed-data", - "title": "Store Data Safely", - "section": "What if I committed data?", - "text": "What if I committed data?\n‘It depends’, but if it’s sensitive:\n\n‘undo’ the commit with git reset\nuse a tool like BFG to expunge the file from Git history\ndelete the repo and restart 🔥\n\nA data security breach may have to be reported." + "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#a-learning-journey-to-get-us-there", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#a-learning-journey-to-get-us-there", + "title": "Coffee and coding", + "section": "A learning journey to get us there", + "text": "A learning journey to get us there\n\nCode style, organising your files\nFunctions and iteration\nGit and GitHub\nPackaging your code\nTesting\nPackage management and versioning" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#data-hosting-solutions", - "href": "presentations/2024-05-16_store-data-safely/index.html#data-hosting-solutions", - "title": "Store Data Safely", - "section": "Data-hosting solutions", - "text": "Data-hosting solutions\nWe’ll talk about two main options for The Strategy Unit:\n\nPosit Connect and the {pins} package\nAzure Data Storage\n\nWhich to use? It depends." 
+ "objectID": "presentations/2023-02-23_coffee-and-coding/index.html#how-we-can-help-each-other-get-there", + "href": "presentations/2023-02-23_coffee-and-coding/index.html#how-we-can-help-each-other-get-there", + "title": "Coffee and coding", + "section": "How we can help each other get there", + "text": "How we can help each other get there\n\nWork as a team!\nCoffee and coding!\nAsk for help!\nDo pair coding!\nGet your code reviewed!\nJoin the NHS-R/ NHSPycom communities\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#a-platform-by-posit", - "href": "presentations/2024-05-16_store-data-safely/index.html#a-platform-by-posit", - "title": "Store Data Safely", - "section": "A platform by Posit", - "text": "A platform by Posit\n\n\nhttps://connect.strategyunitwm.nhs.uk/" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-data-science", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-data-science", + "title": "Travels with R and Python", + "section": "What is data science?", + "text": "What is data science?\n\n“A data scientist knows more about computer science than the average statistician, and more about statistics than the average computer scientist”\n\n(Josh Wills, a former head of data engineering at Slack)" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#a-package-by-posit", - "href": "presentations/2024-05-16_store-data-safely/index.html#a-package-by-posit", - "title": "Store Data Safely", - "section": "A package by Posit", - "text": "A package by Posit\n\n\nhttps://pins.rstudio.com/" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#drew-conways-famous-venn-diagram", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#drew-conways-famous-venn-diagram", + "title": "Travels with R and Python", + "section": "Drew Conway’s famous Venn diagram", + 
"text": "Drew Conway’s famous Venn diagram\n\nSource" }, - { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#basic-approach", - "href": "presentations/2024-05-16_store-data-safely/index.html#basic-approach", - "title": "Store Data Safely", - "section": "Basic approach", - "text": "Basic approach\ninstall.packages(\"pins\")\nlibrary(pins)\n\nboard_connect()\npin_write(board, data, \"pin_name\")\npin_read(board, \"user_name/pin_name\")" + { + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science", + "title": "Travels with R and Python", + "section": "What are the skills of data science?", + "text": "What are the skills of data science?\n\nAnalysis\n\nML\nStats\nData viz\n\nSoftware engineering\n\nProgramming\nSQL/ data\nDevOps\nRAP" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#live-demo", - "href": "presentations/2024-05-16_store-data-safely/index.html#live-demo", - "title": "Store Data Safely", - "section": "Live demo", - "text": "Live demo\n\nLink RStudio to Posit Connect (authenticate)\nConnect to the board\nWrite a new pin\nCheck pin status and details\nPin versions\nUse pinned data\nUnpin your pin" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science-1", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-are-the-skills-of-data-science-1", + "title": "Travels with R and Python", + "section": "What are the skills of data science?", + "text": "What are the skills of data science?\n\nDomain knowledge\n\nCommunication\nProblem formulation\nDashboards and reports" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#should-i-use-it", - "href": "presentations/2024-05-16_store-data-safely/index.html#should-i-use-it", - "title": "Store Data Safely", - "section": "Should I use it?", 
- "text": "Should I use it?\n\n\n⚠️ {pins} is not great because:\n\nyou should not upload sensitive data!\nthere’s a file-size upload limit\npin organisation is a bit awkward (no subfolders)\n\n\n{pins} is helpful because:\n\nauthentication is straightforward\ndata can be versioned\nyou can control permissions\nthere are R and Python versions of the package" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#stats-and-data-viz", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#stats-and-data-viz", + "title": "Travels with R and Python", + "section": "Stats and data viz", + "text": "Stats and data viz\n\nML leans a bit more towards atheoretical prediction\nStats leans a bit more towards inference (but they both do both)\nData scientists may use different visualisations\n\nInteractive web based tools\nDashboard based visualisers e.g. {stminsights}" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-is-azure-data-storage", - "href": "presentations/2024-05-16_store-data-safely/index.html#what-is-azure-data-storage", - "title": "Store Data Safely", - "section": "What is Azure Data Storage?", - "text": "What is Azure Data Storage?\nMicrosoft cloud storage for unstructured data or ‘blobs’ (Binary Large Objects): data objects in binary form that do not necessarily conform to any file format.\nHow is it different?\n\nNo hierarchy – although you can make pseudo-‘folders’ with the blobnames.\nAuthenticates with your Microsoft account." 
+ "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#software-engineering", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#software-engineering", + "title": "Travels with R and Python", + "section": "Software engineering", + "text": "Software engineering\n\nProgramming\n\nNo/ low code data science?\n\nSQL/ data\n\nTend to use reproducible automated processes\n\nDevOps\n\nPlan, code, build, test, release, deploy, operate, monitor\n\nRAP\n\nI will come back to this" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#authenticating-to-azure-data-storage", - "href": "presentations/2024-05-16_store-data-safely/index.html#authenticating-to-azure-data-storage", - "title": "Store Data Safely", - "section": "Authenticating to Azure Data Storage", - "text": "Authenticating to Azure Data Storage\n\nYou are all part of the “strategy-unit-analysts” group; this gives you read/write access to specific Azure storage containers.\nYou can store sensitive information like the container ID in a local .Renviron or .env file that should be ignored by git.\nUsing {AzureAuth}, {AzureStor} and your credentials, you can connect to the Azure storage container, upload files and download them, or read the files directly from storage!" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#domain-knowledge", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#domain-knowledge", + "title": "Travels with R and Python", + "section": "Domain knowledge", + "text": "Domain knowledge\n\nDo stuff that matters\n\nThe best minds of my generation are thinking about how to make people click ads. That sucks. 
Jeffrey Hammerbacher\n\nConvince other people that it matters\nThis is the hardest part of data science" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables", - "href": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables", - "title": "Store Data Safely", - "section": "Step 1: load your environment variables", - "text": "Step 1: load your environment variables\nStore sensitive info in an .Renviron file that’s kept out of your Git history! The info can then be loaded in your script.\n.Renviron:\nAZ_STORAGE_EP=https://STORAGEACCOUNT.blob.core.windows.net/\nScript:\nep_uri <- Sys.getenv(\"AZ_STORAGE_EP\")\nTip: reload .Renviron with readRenviron(\".Renviron\")" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#rap", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#rap", + "title": "Travels with R and Python", + "section": "RAP", + "text": "RAP\n\nData science isn’t RAP\nRAP isn’t data science\nThey are firm friends" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables-1", - "href": "presentations/2024-05-16_store-data-safely/index.html#step-1-load-your-environment-variables-1", - "title": "Store Data Safely", - "section": "Step 1: load your environment variables", - "text": "Step 1: load your environment variables\nIn the demo script we are providing, you will need these environment variables:\nep_uri <- Sys.getenv(\"AZ_STORAGE_EP\")\napp_id <- Sys.getenv(\"AZ_APP_ID\")\ncontainer_name <- Sys.getenv(\"AZ_STORAGE_CONTAINER\")\ntenant <- Sys.getenv(\"AZ_TENANT_ID\")" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#reproducibility", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#reproducibility", + "title": "Travels with R and Python", + "section": "Reproducibility", + "text": "Reproducibility\n\nReproducibility in science\nThe $6B 
spreadsheet error\nGeorge Osbourne’s austerity was based on a spreadsheet error\nFor us, reproducibility also means we can do the same analysis 50 times in one minute\n\nWhich is why I started down the road of data science" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-2-authenticate-with-azure", - "href": "presentations/2024-05-16_store-data-safely/index.html#step-2-authenticate-with-azure", - "title": "Store Data Safely", - "section": "Step 2: Authenticate with Azure", - "text": "Step 2: Authenticate with Azure\n\n\ntoken <- AzureAuth::get_azure_token(\n \"https://storage.azure.com\",\n tenant = tenant,\n app = app_id,\n auth_type = \"device_code\",\n)\nThe first time you do this, you will have link to authenticate in your browser and a code in your terminal to enter. Use the browser that works best with your @mlcsu.nhs.uk account!" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-rap", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#what-is-rap", + "title": "Travels with R and Python", + "section": "What is RAP", + "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training\n\n\nGoldacre review" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#step-3-connect-to-container", - "href": "presentations/2024-05-16_store-data-safely/index.html#step-3-connect-to-container", - "title": "Store Data Safely", - "section": "Step 3: Connect to container", - "text": "Step 3: Connect to container\nendpoint <- AzureStor::blob_endpoint(ep_uri, token = token)\ncontainer <- AzureStor::storage_container(endpoint, container_name)\n\n# List files in container\nblob_list <- 
AzureStor::list_blobs(container)\nIf you get 403 error, delete your token and re-authenticate, try a different browser/incognito, etc.\nTo clear Azure tokens: AzureAuth::clean_token_directory()" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--baseline", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--baseline", + "title": "Travels with R and Python", + "section": "Levels of RAP- Baseline", + "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL)\nCode is version controlled\nRepository includes a README.md file that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed\nCode is published in the open and linked to & from accompanying publication (if relevant)\n\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container", - "href": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container", - "title": "Store Data Safely", - "section": "Interact with the container", - "text": "Interact with the container\nIt’s possible to interact with the container via your browser!\nYou can upload and download files using the Graphical User Interface (GUI), login with your @mlcsu.nhs.uk account: https://portal.azure.com/#home\nAlthough it’s also cooler to interact via code… 😎" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--silver", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--silver", + "title": "Travels with R and Python", + "section": "Levels of RAP- Silver", + "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. 
requirements.txt, PipFile, environment.yml)\nData is handled and output in a Tidy data format\n\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-1", - "href": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-1", - "title": "Store Data Safely", - "section": "Interact with the container", - "text": "Interact with the container\n# Upload contents of a local directory to container\nAzureStor::storage_multiupload(\n container,\n \"LOCAL_FOLDERNAME/*\",\n \"FOLDERNAME_ON_AZURE\"\n)\n\n# Upload specific file to container\nAzureStor::storage_upload(\n container,\n \"data/ronald.jpeg\",\n \"newdir/ronald.jpeg\"\n)" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--gold", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#levels-of-rap--gold", + "title": "Travels with R and Python", + "section": "Levels of RAP- Gold", + "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. 
(See gov.uk info on Semantic Versioning)\n\n\nSource: NHS Digital RAP community of practice" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#load-csv-files-directly-from-azure-container", - "href": "presentations/2024-05-16_store-data-safely/index.html#load-csv-files-directly-from-azure-container", - "title": "Store Data Safely", - "section": "Load csv files directly from Azure container", - "text": "Load csv files directly from Azure container\ndf_from_azure <- AzureStor::storage_read_csv(\n container,\n \"newdir/cats.csv\",\n show_col_types = FALSE\n)\n\n# Load file directly from Azure container (by storing it in memory)\n\nparquet_in_memory <- AzureStor::storage_download(\n container, src = \"newdir/cats.parquet\", dest = NULL\n)\n\nparq_df <- arrow::read_parquet(parquet_in_memory)" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#data-science-in-healthcare", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#data-science-in-healthcare", + "title": "Travels with R and Python", + "section": "Data science in healthcare", + "text": "Data science in healthcare\n\nForecasting\n\nStats versus ML\n\nText mining\n\nR versus Python\n\nDemand modelling\n\nDevOps as a way of life" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-2", - "href": "presentations/2024-05-16_store-data-safely/index.html#interact-with-the-container-2", - "title": "Store Data Safely", - "section": "Interact with the container", - "text": "Interact with the container\n# Delete from Azure container (!!!)\nAzureStor::delete_storage_file(container, BLOB_NAME)" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#get-involved", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#get-involved", + "title": "Travels with R and Python", + "section": "Get involved!", + "text": "Get involved!\n\nNHS-R community\n\nWebinars, training, conference, Slack\n\nNHS 
Pycom\n\nditto…\n\nMLCSU GitHub?\nBuild links with the other CSUs" }, { - "objectID": "presentations/2024-05-16_store-data-safely/index.html#what-does-this-achieve", - "href": "presentations/2024-05-16_store-data-safely/index.html#what-does-this-achieve", - "title": "Store Data Safely", - "section": "What does this achieve?", - "text": "What does this achieve?\n\nData is not in the repository, it is instead stored in a secure location\nCode can be open – sensitive information like Azure container name stored as environment variables\nLarge filesizes possible, other people can also access the same container.\nNaming conventions can help to keep blobs organised (these create pseudo-folders)\n\n\n\n\nLearn more about Data Science at The Strategy Unit" + "objectID": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#contact", + "href": "presentations/2023-08-02_mlcsu-ksn-meeting/index.html#contact", + "title": "Travels with R and Python", + "section": "Contact", + "text": "Contact\n\n\n\n\n strategy.unit@nhs.net\n The-Strategy-Unit\n\n\n\n\n\n chris.beeley1@nhs.net\n chrisbeeley\n\n\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#tldr", - "href": "presentations/2024-05-23_github-team-sport/index.html#tldr", - "title": "GitHub as a team sport", - "section": "tl;dr", - "text": "tl;dr\n\n\n\n‘Quality’ isn’t just good code\nTeamwork makes the dream work\nGitHub is a communication tool\n\n\n\n\n\n\n‘Too long; didn’t read’.\nGitHub isn’t just a dumping ground for code and Git history.\nIt’s a platform for working with teammates to get things done.\nQuality is improved by good communication, organisation and reduction of something called the ‘bus factor’ that I’ll get to in a minute." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#packages-we-are-using-today", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#packages-we-are-using-today", + "title": "Coffee and Coding", + "section": "Packages we are using today", + "text": "Packages we are using today\n\nlibrary(tidyverse)\n\nlibrary(sf)\n\nlibrary(tidygeocoder)\nlibrary(PostcodesioR)\n\nlibrary(osrm)\n\nlibrary(leaflet)" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-strategy-unit-su", - "href": "presentations/2024-05-23_github-team-sport/index.html#the-strategy-unit-su", - "title": "GitHub as a team sport", - "section": "The Strategy Unit (SU)", - "text": "The Strategy Unit (SU)\n\n\n\nAn ‘internal consultancy’\nHosted by NHS Midlands and Lancashire\nGrowing in size and reputation\n\n\n\n\n\n\nInitially a ‘start-up’ style operation that has expanded to 70+ staff.\n‘We produce high-quality, multi-disciplinary analytical work – and we help people apply the results.’\nA lot of our work is on the important New Hospital Programme (NHP).\n‘Our proposition is simple: better evidence, better decisions, better outcomes.’\nExpansion is tricky; how can we maintain quality?" 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#getting-boundary-data", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#getting-boundary-data", + "title": "Coffee and Coding", + "section": "Getting boundary data", + "text": "Getting boundary data\nWe can use the ONS’s Geoportal we can grab boundary data to generate maps\n\n\n\nicb_url <- paste0(\n \"https://services1.arcgis.com\",\n \"/ESMARspQHYMw9BZ9/arcgis\",\n \"/rest/services\",\n \"/Integrated_Care_Boards_April_2023_EN_BGC\",\n \"/FeatureServer/0/query\",\n \"?outFields=*&where=1%3D1&f=geojson\"\n)\nicb_boundaries <- read_sf(icb_url)\n\nicb_boundaries |>\n ggplot() +\n geom_sf() +\n theme_void()" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-data-science-team", - "href": "presentations/2024-05-23_github-team-sport/index.html#the-data-science-team", - "title": "GitHub as a team sport", - "section": "The Data Science Team", - "text": "The Data Science Team\n \n\nExpanded to 6, all remote\nModelling, Quarto, Shiny\nNew Hospital Programme (NHP)\n\n\n\nA new team, expanding rapidly from 2 to 6 in about a year.\nRemote across England.\nExperience from across the NHS and consultancy. I spent a decade in five central government departments before this.\nWe’re helping to model and design apps for the NHP to help build hospitals.\nSo: growing team, different experiences, important work, but few standardised processes. What to do?" 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-is-the-icb_boundaries-data", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-is-the-icb_boundaries-data", + "title": "Coffee and Coding", + "section": "What is the icb_boundaries data?", + "text": "What is the icb_boundaries data?\n\nicb_boundaries |>\n select(ICB23CD, ICB23NM)\n\nSimple feature collection with 42 features and 2 fields\nGeometry type: MULTIPOLYGON\nDimension: XY\nBounding box: xmin: -6.418667 ymin: 49.86479 xmax: 1.763706 ymax: 55.81112\nGeodetic CRS: WGS 84\n# A tibble: 42 × 3\n ICB23CD ICB23NM geometry\n <chr> <chr> <MULTIPOLYGON [°]>\n 1 E54000008 NHS Cheshire and Merseyside Integrated C… (((-3.083264 53.2559, -3…\n 2 E54000010 NHS Staffordshire and Stoke-on-Trent Int… (((-1.950489 53.21188, -…\n 3 E54000011 NHS Shropshire, Telford and Wrekin Integ… (((-2.380794 52.99841, -…\n 4 E54000013 NHS Lincolnshire Integrated Care Board (((0.2687853 52.81584, 0…\n 5 E54000015 NHS Leicester, Leicestershire and Rutlan… (((-0.7875237 52.97762, …\n 6 E54000018 NHS Coventry and Warwickshire Integrated… (((-1.577608 52.67858, -…\n 7 E54000019 NHS Herefordshire and Worcestershire Int… (((-2.272042 52.43972, -…\n 8 E54000022 NHS Norfolk and Waveney Integrated Care … (((1.666741 52.31366, 1.…\n 9 E54000023 NHS Suffolk and North East Essex Integra… (((0.8997023 51.7732, 0.…\n10 E54000024 NHS Bedfordshire, Luton and Milton Keyne… (((-0.4577115 52.32009, …\n# ℹ 32 more rows" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-at-the-su", - "href": "presentations/2024-05-23_github-team-sport/index.html#github-at-the-su", - "title": "GitHub as a team sport", - "section": "GitHub at the SU", - "text": "GitHub at the SU\n\n\nWe should be exemplars\nAiming for open by default\nGitHub is on the homepage and there’s a Data Science site\n\n\n\nIt’s not just the DS team.\nWe have many other analysts eager to learn and 
contribute.\nHow can we set good standards and encourage use across the organisation?\nWe’re running Coffee & Coding sessions, teaching and encouraging talks and blogs on our site.\nWe want to drive up quality by making code open too.\nIt’s a statement of intent that the SU homepage links to our GitHub organisation." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-dataframes", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-dataframes", + "title": "Coffee and Coding", + "section": "Working with geospatial dataframes", + "text": "Working with geospatial dataframes\nWe can simply join sf data frames and “regular” data frames together\n\n\n\nicb_metrics <- icb_boundaries |>\n st_drop_geometry() |>\n select(ICB23CD) |>\n mutate(admissions = rpois(n(), 1000000))\n\nicb_boundaries |>\n inner_join(icb_metrics, by = \"ICB23CD\") |>\n ggplot() +\n geom_sf(aes(fill = admissions)) +\n scale_fill_viridis_c() +\n theme_void()" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#what-this-is", - "href": "presentations/2024-05-23_github-team-sport/index.html#what-this-is", - "title": "GitHub as a team sport", - "section": "What this is", - "text": "What this is\n\nLow-tech, no code\nTips and etiquette, not directives\nWhat’s been working for us\n\n\n\nBut this is not a technical talk about how to use Git for version control.\nMostly it’s about planning, workflows, standards and communication.\nIt’s things that our team have been doing and the ideas are evolving.\nI’ve worked mostly alone on GitHub projects in my career and never worked in a data science team of even this size. So at worst these slides are a way for me to write down what I’m learning." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames", + "title": "Coffee and Coding", + "section": "Working with geospatial data frames", + "text": "Working with geospatial data frames\nWe can manipulate sf objects like other data frames\n\n\n\nlondon_icbs <- icb_boundaries |>\n filter(ICB23NM |> stringr::str_detect(\"London\"))\n\nggplot() +\n geom_sf(data = london_icbs) +\n geom_sf(data = st_centroid(london_icbs)) +\n theme_void()" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-bus-factor", - "href": "presentations/2024-05-23_github-team-sport/index.html#the-bus-factor", - "title": "GitHub as a team sport", - "section": "The ‘bus factor’ 🚍", - "text": "The ‘bus factor’ 🚍\n\nWe should maintain quality\nWe need redundancy\nStandardised processes can help\n\n\n\nWhy do we care about discussing and ‘formalising’ these ideas?\nWe should encourage standard practices in case someone is ill or away.\nThis also makes it easier when new team members join.\nThis helps us maintain quality." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames-1", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#working-with-geospatial-data-frames-1", + "title": "Coffee and Coding", + "section": "Working with geospatial data frames", + "text": "Working with geospatial data frames\nSummarising the data will combine the geometries.\n\nlondon_icbs |>\n summarise(area = sum(Shape__Area)) |>\n # and use geospatial functions to create calculations using the geometry\n mutate(new_area = st_area(geometry), .before = \"geometry\")\n\nSimple feature collection with 1 feature and 2 fields\nGeometry type: MULTIPOLYGON\nDimension: XY\nBounding box: xmin: -0.5102803 ymin: 51.28676 xmax: 0.3340241 ymax: 51.69188\nGeodetic CRS: WGS 84\n# A tibble: 1 × 3\n area new_area geometry\n* <dbl> [m^2] <MULTIPOLYGON [°]>\n1 1573336388. 1567995610. (((-0.3314819 51.43935, -0.3306676 51.43889, -0.33118…\n\n\n Why the difference in area?\n\n We are using a simplified geometry, so calculating the area will be slightly inaccurate. The original area was calculated on the non-simplified geometries." }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#rules", - "href": "presentations/2024-05-23_github-team-sport/index.html#rules", - "title": "GitHub as a team sport", - "section": "‘Rules’", - "text": "‘Rules’\n\nIt’s the spirit that counts\nDo as I say, not as I do\nKnow why you’re breaking the rules\n\n\n\nTo be clear though, nothing here is etched into stone.\nThere will be times where rules can be broken.\nBut we shouldn’t be complacent." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-our-own-geospatial-data", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-our-own-geospatial-data", + "title": "Coffee and Coding", + "section": "Creating our own geospatial data", + "text": "Creating our own geospatial data\n\nlocation_raw <- postcode_lookup(\"B2 4BJ\")\nglimpse(location_raw)\n\nRows: 1\nColumns: 40\n$ postcode <chr> \"B2 4BJ\"\n$ quality <int> 1\n$ eastings <int> 406866\n$ northings <int> 286775\n$ country <chr> \"England\"\n$ nhs_ha <chr> \"West Midlands\"\n$ longitude <dbl> -1.90033\n$ latitude <dbl> 52.47887\n$ european_electoral_region <chr> \"West Midlands\"\n$ primary_care_trust <chr> \"Heart of Birmingham Teaching\"\n$ region <chr> \"West Midlands\"\n$ lsoa <chr> \"Birmingham 138A\"\n$ msoa <chr> \"Birmingham 138\"\n$ incode <chr> \"4BJ\"\n$ outcode <chr> \"B2\"\n$ parliamentary_constituency <chr> \"Birmingham, Ladywood\"\n$ parliamentary_constituency_2024 <chr> \"Birmingham Ladywood\"\n$ admin_district <chr> \"Birmingham\"\n$ parish <chr> \"Birmingham, unparished area\"\n$ admin_county <lgl> NA\n$ date_of_introduction <chr> \"198001\"\n$ admin_ward <chr> \"Ladywood\"\n$ ced <lgl> NA\n$ ccg <chr> \"NHS Birmingham and Solihull\"\n$ nuts <chr> \"Birmingham\"\n$ pfa <chr> \"West Midlands\"\n$ admin_district_code <chr> \"E08000025\"\n$ admin_county_code <chr> \"E99999999\"\n$ admin_ward_code <chr> \"E05011151\"\n$ parish_code <chr> \"E43000250\"\n$ parliamentary_constituency_code <chr> \"E14000564\"\n$ parliamentary_constituency_2024_code <chr> \"E14001096\"\n$ ccg_code <chr> \"E38000258\"\n$ ccg_id_code <chr> \"15E\"\n$ ced_code <chr> \"E99999999\"\n$ nuts_code <chr> \"TLG31\"\n$ lsoa_code <chr> \"E01033620\"\n$ msoa_code <chr> \"E02006899\"\n$ lau2_code <chr> \"E08000025\"\n$ pfa_code <chr> \"E23000014\"\n\n\n\n\n\nlocation <- location_raw |>\n st_as_sf(coords = c(\"eastings\", \"northings\"), crs = 27700) |>\n 
select(postcode, ccg) |>\n st_transform(crs = 4326)\n\nlocation\n\nSimple feature collection with 1 feature and 2 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -1.900335 ymin: 52.47886 xmax: -1.900335 ymax: 52.47886\nGeodetic CRS: WGS 84\n postcode ccg geometry\n1 B2 4BJ NHS Birmingham and Solihull POINT (-1.900335 52.47886)" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-flow", - "href": "presentations/2024-05-23_github-team-sport/index.html#github-flow", - "title": "GitHub as a team sport", - "section": "GitHub flow", - "text": "GitHub flow\n\nCreate a repository\nWrite issues\nPlan\nCreate a branch\nMake a pull request\nReview\nRelease\n\n\n\nThis is a fairly generic GitHub flow.\nI’ll talk through a few things in each of these categories." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-a-geospatial-data-frame-for-all-nhs-trusts", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#creating-a-geospatial-data-frame-for-all-nhs-trusts", + "title": "Coffee and Coding", + "section": "Creating a geospatial data frame for all NHS Trusts", + "text": "Creating a geospatial data frame for all NHS Trusts\n\n\n\n# using the NHSRtools package\n# remotes::install_github(\"NHS-R-Community/NHSRtools\")\ntrusts <- ods_get_trusts() |>\n filter(status == \"Active\") |>\n select(name, org_id, post_code) |>\n geocode(postalcode = \"post_code\") |>\n st_as_sf(coords = c(\"long\", \"lat\"), crs = 4326)\n\n\ntrusts |>\n leaflet() |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers(popup = ~name)" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#repositories", - "href": "presentations/2024-05-23_github-team-sport/index.html#repositories", - "title": "GitHub as a team sport", - "section": "Repositories", - "text": "Repositories\n\nAssign ‘owner’ and ‘deputy’ roles\nAdd README and .gitignore\nStore data elsewhere\n\n\n\nEasy starter: tell 
people what the purpose of the repo is and how to use it. This is what a README is for. This is an absolute must to lower the bus factor.\nWe should be prevent accidental file upload immediately. Use a .gitignore to exclude likely data files (as well as other unnecessary files). We’re thinking about common templates/cookiecutters.\nCommunicative files (README, .gitignores) are good, but so is vigilance (code review).\nOwners/deputies are in charge of ‘GitHub gardening’ (keeping issues in order, labelling, milestones, etc).\nDeputies help with bus factor.\nThe owner can be auto-selected as the reviewer. We’re experimenting with this for repos with external contributors, especially.\nData is stored elsewhere, on Azure or Posit Connect, due to sensitivity and size. This should be planned before you begin and recorded in the README." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-are-the-nearest-trusts-to-our-location", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-are-the-nearest-trusts-to-our-location", + "title": "Coffee and Coding", + "section": "What are the nearest trusts to our location?", + "text": "What are the nearest trusts to our location?\n\nnearest_trusts <- trusts |>\n mutate(\n distance = st_distance(geometry, location)[, 1]\n ) |>\n arrange(distance) |>\n head(5)\n\nnearest_trusts\n\nSimple feature collection with 5 features and 4 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -1.9384 ymin: 52.4533 xmax: -1.886282 ymax: 52.48764\nGeodetic CRS: WGS 84\n# A tibble: 5 × 5\n name org_id post_code geometry distance\n <chr> <chr> <chr> <POINT [°]> [m]\n1 BIRMINGHAM WOMEN'S AND CH… RQ3 B4 6NH (-1.894241 52.4849) 789.\n2 BIRMINGHAM AND SOLIHULL M… RXT B1 3RB (-1.917663 52.48416) 1313.\n3 BIRMINGHAM COMMUNITY HEAL… RYW B7 4BN (-1.886282 52.48754) 1356.\n4 SANDWELL AND WEST BIRMING… RXK B18 7QH (-1.930203 52.48764) 2246.\n5 UNIVERSITY HOSPITALS BIRM… RRK B15 2GW (-1.9384 
52.4533) 3838." }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#issues", - "href": "presentations/2024-05-23_github-team-sport/index.html#issues", - "title": "GitHub as a team sport", - "section": "Issues", - "text": "Issues\n\n\n\nAren’t just ‘problems’\nUse labels, including MoSCoW\nExplain the need, be informative\n\n\n\n\n\n\nIssues can be reminders or questions for further discussion, not just features to build.\nTickets should get two labels. We use a topic like ‘enhancement’, ‘bug’, ‘documentation’, ‘techdebt’, etc, plus MoSCoW (must, should, could, won’t) to help prioritisation.\nIssue templates can ensure certain info is provided, which is especially good for external contributors.\nRefer to other related commits by number (e.g. #1), which stops you repeating the same information.\nPrefer to reopen an issue if it doesn’t actually work.\nIssues can track separate sub-issues.\nYou can add checklists with markdown checkbox: - [ ] (these appear in the issue preview).\nYou can ‘hide’ comments if they’re out of date, etc." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-find-driving-routes-to-these-trusts", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-find-driving-routes-to-these-trusts", + "title": "Coffee and Coding", + "section": "Let’s find driving routes to these trusts", + "text": "Let’s find driving routes to these trusts\n\nroutes <- nearest_trusts |>\n mutate(\n route = map(geometry, ~ osrmRoute(location, st_coordinates(.x)))\n ) |>\n st_drop_geometry() |>\n rename(straight_line_distance = distance) |>\n unnest(route) |>\n st_as_sf()\n\nroutes\n\nSimple feature collection with 5 features and 8 fields\nGeometry type: LINESTRING\nDimension: XY\nBounding box: xmin: -1.93846 ymin: 52.45316 xmax: -1.88527 ymax: 52.49279\nGeodetic CRS: WGS 84\n# A tibble: 5 × 9\n name org_id post_code straight_line_distance src dst duration distance\n <chr> <chr> <chr> [m] <chr> <chr> <dbl> <dbl>\n1 BIRMING… RQ3 B4 6NH 789. 1 dst 5.77 3.09\n2 BIRMING… RXT B1 3RB 1313. 1 dst 6.84 4.14\n3 BIRMING… RYW B7 4BN 1356. 1 dst 7.59 4.29\n4 SANDWEL… RXK B18 7QH 2246. 1 dst 8.78 4.95\n5 UNIVERS… RRK B15 2GW 3838. 1 dst 10.6 4.67\n# ℹ 1 more variable: geometry <LINESTRING [°]>" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#plan", - "href": "presentations/2024-05-23_github-team-sport/index.html#plan", - "title": "GitHub as a team sport", - "section": "Plan", - "text": "Plan\n\n\nTalk, review and reflect\nUse labels to prioritise\nSort into milestones\n\n\n\nWe have a repo and issues, what do we do now? Where to start?\nWe’ve begun working in sprints of about 4 weeks. 
We have sprint planning meetings to plan things out.\nConsider what needs to be done in the sprint period, what other issues support those goals?\nIs there time for other tasks, like clearing techdebt?\nAll issues should be assigned to a milestone.\nIssues in milestones should be sorted in priority order/order of expected completion (MoSCoW labels will help with this).\nThis helps focus the goals of the sprint and keep us on track." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-show-the-routes", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#lets-show-the-routes", + "title": "Coffee and Coding", + "section": "Let’s show the routes", + "text": "Let’s show the routes\n\nleaflet(routes) |>\n addTiles() |>\n addMarkers(data = location) |>\n addPolylines(color = \"black\", weight = 3, opacity = 1) |>\n addCircleMarkers(data = nearest_trusts, radius = 4, opacity = 1, fillOpacity = 1)" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#branches", - "href": "presentations/2024-05-23_github-team-sport/index.html#branches", - "title": "GitHub as a team sport", - "section": "Branches", - "text": "Branches\n\n\nOne issue, one branch, one assigned person\nName them sensibly\nBurn them\n\n\n\nOnly one person works on a branch at a time. This person is the one assigned to the relevant issue.\nBranch names should be numbered to match their issue, e.g. ‘123-add-filter’. This makes it obvious what issue is being fixed by that branch and should help identify if more than one person has a branch open for the same issue.\nIf commits from someone else are required, then all parties must communicate about the current state of the branch to ensure they pull changes and avoid merge conflicts.\nBranches are ephemeral and die when the PR is merged. They should be deleted (this can be done automatically).\nThe only branches to exist at all times should be main and a deployment branch, if necessary. 
All others should be active branches so it’s clear what’s being worked on." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#we-can-use-osrm-to-calculate-isochrones", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#we-can-use-osrm-to-calculate-isochrones", + "title": "Coffee and Coding", + "section": "We can use {osrm} to calculate isochrones", + "text": "We can use {osrm} to calculate isochrones\n\n\n\niso <- osrmIsochrone(location, breaks = seq(0, 60, 15), res = 10)\n\nisochrone_ids <- unique(iso$id)\n\npal <- colorFactor(\n viridis::viridis(length(isochrone_ids)),\n isochrone_ids\n)\n\nleaflet(location) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = iso,\n fillColor = ~ pal(id),\n color = \"#000000\",\n weight = 1\n )" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#commits", - "href": "presentations/2024-05-23_github-team-sport/index.html#commits", - "title": "GitHub as a team sport", - "section": "Commits", - "text": "Commits\n\n\nDon’t commit to main!\n‘Small, early and often’\nMake messages meaningful\n\n\n\nThere’s not a lot of earth-shattering advice to give here; this stuff is fairly standard.\nDo not commit directly to main. Your work must be independently checked first to limit the chance of mistakes.\nMake your commits small in terms of code and files touched, if possible. This makes the Git history easier to read and makes reviews easier too.\nCommit and push early and often into your branch. This can help others see progress and helps reduce the bus factor.\nDon’t dump your work into a commit because it’s the end of the day.\nMake your commit messages meaningful. What does the commit do? Start with a verb in present tense (‘adds’, not ‘added’). Or maybe use ‘conventional’ commits." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones", + "title": "Coffee and Coding", + "section": "What trusts are in the isochrones?", + "text": "What trusts are in the isochrones?\nThe summarise() function will “union” the geometry\n\nsummarise(iso)\n\nSimple feature collection with 1 feature and 0 fields\nGeometry type: POLYGON\nDimension: XY\nBounding box: xmin: -2.913575 ymin: 51.98062 xmax: -0.8502164 ymax: 53.1084\nGeodetic CRS: WGS 84\n geometry\n1 POLYGON ((-1.541014 52.9693..." }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#pull-requests-prs", - "href": "presentations/2024-05-23_github-team-sport/index.html#pull-requests-prs", - "title": "GitHub as a team sport", - "section": "Pull requests (PRs)", - "text": "Pull requests (PRs)\n\n\n\nSmall and closes an issue\nSelect the assignee and reviewer\nThe assignee merges\n\n\n\n\n\n\nPRs should solve the issue they’re related to. Occasionally one fix may solve another.\nThey should be named to explain what they do. The issue might be ‘the red button doesn’t work’; the PR might be ‘fix the red button’.\nThey should be small in terms of lines of code and files touched. This will make it easier and faster to understand and assess the changes.\nThe submitter should mark themself as the ‘assignee’ and choose a reviewer. You may want to chat with the reviewer to let them know if they have time.\nFor context, link to the issue(s) being closed with the magic words (‘closes’, ‘fixes’, etc), which will also close those issues as completed.\nInclude a short explanation or bullet-points of what the PR does. Provide any extra information to make the reviewer’s life easier (areas of focus, maybe) or to ask a question about some aspect of what you’ve written.\nThe PR submitter is the one who clicks the merge button. 
This is in case the submitter realises there’s something they need to add or change before the merge." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-1", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-1", + "title": "Coffee and Coding", + "section": "What trusts are in the isochrones?", + "text": "What trusts are in the isochrones?\nWe can use this with a geo-filter to find the trusts in the isochrone\n\n# also works\ntrusts_in_iso <- trusts |>\n st_filter(\n summarise(iso),\n .predicate = st_within\n )\n\ntrusts_in_iso\n\nSimple feature collection with 31 features and 3 fields\nGeometry type: POINT\nDimension: XY\nBounding box: xmin: -2.793386 ymin: 52.19205 xmax: -1.10302 ymax: 53.01015\nGeodetic CRS: WGS 84\n# A tibble: 31 × 4\n name org_id post_code geometry\n * <chr> <chr> <chr> <POINT [°]>\n 1 BIRMINGHAM AND SOLIHULL MENTAL HE… RXT B1 3RB (-1.917663 52.48416)\n 2 BIRMINGHAM COMMUNITY HEALTHCARE N… RYW B7 4BN (-1.886282 52.48754)\n 3 BIRMINGHAM WOMEN'S AND CHILDREN'S… RQ3 B4 6NH (-1.894241 52.4849)\n 4 BIRMINGHAM WOMEN'S NHS FOUNDATION… RLU B15 2TG (-1.942861 52.45325)\n 5 BURTON HOSPITALS NHS FOUNDATION T… RJF DE13 0RB (-1.656667 52.81774)\n 6 COVENTRY AND WARWICKSHIRE PARTNER… RYG CV6 6NY (-1.48692 52.45659)\n 7 DERBYSHIRE HEALTHCARE NHS FOUNDAT… RXM DE22 3LZ (-1.512896 52.91831)\n 8 DUDLEY INTEGRATED HEALTH AND CARE… RYK DY5 1RU (-2.11786 52.48176)\n 9 GEORGE ELIOT HOSPITAL NHS TRUST RLT CV10 7DJ (-1.47844 52.51258)\n10 HEART OF ENGLAND NHS FOUNDATION T… RR1 B9 5ST (-1.828759 52.4781)\n# ℹ 21 more rows" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#reviewing-prs", - "href": "presentations/2024-05-23_github-team-sport/index.html#reviewing-prs", - "title": "GitHub as a team sport", - "section": "Reviewing PRs", - "text": "Reviewing PRs\n\n\n\nBe helpful, be kind\nUse GitHub suggestions\nDiscuss if 
unclear\n\n\n\n\n\n\nThe reviewer should typically check that the changes result in the issue being fixed. This may require pulling the branch and then testing it, but may not be necessary for small changes.\nThe reviewer should seek clarification and add comments where something isn’t clear.\nUse ‘suggestions’ as a reviewer rather than committing to someone else’s branch.\nWhen working at pace (when aren’t we?), we should err towards approval if the issue is completed rather than an endless cycle of asking for small changes. The submitter and reviewer should decide whether smaller things like code style or change in approach should be added as a new issue with a ‘techdebt’ label." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-2", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#what-trusts-are-in-the-isochrones-2", + "title": "Coffee and Coding", + "section": "What trusts are in the isochrones?", + "text": "What trusts are in the isochrones?\n\n\n\nleaflet(trusts_in_iso) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = iso,\n fillColor = ~pal(id),\n color = \"#000000\",\n weight = 1\n )" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#releases", - "href": "presentations/2024-05-23_github-team-sport/index.html#releases", - "title": "GitHub as a team sport", - "section": "Releases", - "text": "Releases\n\nUse semantic versioning (1.2.3)\nAutofill notes with PR names\nDon’t release on a Friday 🙃\n\n\n\nTag the history and release on GitHub concurrently to keep them in sync (this is done automatically if the release is done from the GitHub interface).\nSemantic (x.y.z where x is breaking, y is new features and z is patches for bugs).\nWe typically just autofill the release description with the constituent PR titles. 
Which means it’s important to give them meaningful names.\nWe align releases with sprints, though patches may occur more frequently.\nWe link releases to deployment in many cases. Don’t release to prod on a Friday, lol." + "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#doing-the-same-but-within-a-radius", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#doing-the-same-but-within-a-radius", + "title": "Coffee and Coding", + "section": "Doing the same but within a radius", + "text": "Doing the same but within a radius\n\n\n\nr <- 25000\n\ntrusts_in_radius <- trusts |>\n st_filter(\n location,\n .predicate = st_is_within_distance,\n dist = r\n )\n\n# transforming gives us a pretty smooth circle\nradius <- location |>\n st_transform(crs = 27700) |>\n st_buffer(dist = r) |>\n st_transform(crs = 4326)\n\nleaflet(trusts_in_radius) |>\n addProviderTiles(\"Stamen.TonerLite\") |>\n addMarkers() |>\n addPolygons(\n data = radius,\n color = \"#000000\",\n weight = 1\n )" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#github-is-a-team-member", - "href": "presentations/2024-05-23_github-team-sport/index.html#github-is-a-team-member", - "title": "GitHub as a team sport", - "section": "GitHub is a team member", - "text": "GitHub is a team member\n\n\nAutomate with Actions\nProvide issue and repo templates\nAn all-in-one planner?\n\n\n\nI lied: we have 6 human team members. GitHub itself has features that can automate away some boring things and help prevent accidents or forgetfulness.\nGitHub Actions for continuous integration. R-CMD check at least for R projects. Start with r-lib examples as a basis.\nWe’re looking towards things like templates at the issue and repo levels; again to remove drudgery.\nWe use Trello to plan things and have to link to GitHub repos and issues in Trello cards. Can we use GitHub as our planner across multiple repos instead? Seems possible." 
+ "objectID": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#further-reading", + "href": "presentations/2023-08-24_coffee-and-coding_geospatial/index.html#further-reading", + "title": "Coffee and Coding", + "section": "Further reading", + "text": "Further reading\n\nGeocomputation with R\nr-spatial\n{sf} documentation\nLeaflet documentation\nTidy Geospatial Networks in R\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#are-we-curling", - "href": "presentations/2024-05-23_github-team-sport/index.html#are-we-curling", - "title": "GitHub as a team sport", - "section": "Are we curling? 🥌", - "text": "Are we curling? 🥌\n\n\nWe:\n\nare a small team\nassume specialist roles\nwork in sync\n\n\n\n\n\n\nYou have been wondering: if this is a ‘team sport’, what sport is it?\nThis is a terrible metaphor. But think about it." + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#the-team", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#the-team", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "The team", + "text": "The team" }, { - "objectID": "presentations/2024-05-23_github-team-sport/index.html#the-bottom-line-actually", - "href": "presentations/2024-05-23_github-team-sport/index.html#the-bottom-line-actually", - "title": "GitHub as a team sport", - "section": "The bottom line, actually", - "text": "The bottom line, actually\n\n\n\n\n\nCommunicate\nHelp each other\nBe kind\n\n\n\n\nThe ideas in this talk are things that have helped us, and could help you, to drive up and maintain quality. 
Some were obvious, some were specific features you might not have known about.\nBut none of these are replacements for being good team members.\nGitHub just provides some affordances to help you.\nI am the guy falling over, the stones are tasks, my team mates are picking me up and dusting me off.\nDid you learn at least one thing? What has your team been doing? What works for you?\n\n\n\n\n\nLearn more about The Strategy Unit" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#a-hospital-is-a-place-where-you-can-find-people", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#a-hospital-is-a-place-where-you-can-find-people", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "A hospital is a place where you can find people…", + "text": "A hospital is a place where you can find people…\n\n\nhaving the best day of their life,\nthe worst day of their life,\nthe first day of their life,\nand the last day of their life." 
}, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-is-rap", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-is-rap", - "title": "RAP", - "section": "What is RAP", - "text": "What is RAP\n\na process in which code is used to minimise manual, undocumented steps, and a clear, properly documented process is produced in code which can reliably give the same result from the same dataset\nRAP should be:\n\n\nthe core working practice that must be supported by all platforms and teams; make this a core focus of NHS analyst training\n\nGoldacre review" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#planning-is-hard", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#planning-is-hard", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Planning is hard", + "text": "Planning is hard\n\n\n\n\n\nbuilt with enough capacity to replace the existing school\nfailed to take into account a new housing estate\nlikely needs double the number of spaces within the next decade\n\nBBC article" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-we-trying-to-achieve", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-we-trying-to-achieve", - "title": "RAP", - "section": "What are we trying to achieve?", - "text": "What are we trying to achieve?\n\nLegibility\nReproducibility\nAccuracy\nLaziness" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Review of existing models", + "text": "Review of existing models\n\nSteven Wyatt - NHS-R 2022" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-some-of-the-fundamental-principles", - "href": 
"presentations/2023-03-09_midlands-analyst-rap/index.html#what-are-some-of-the-fundamental-principles", - "title": "RAP", - "section": "What are some of the fundamental principles?", - "text": "What are some of the fundamental principles?\n\nPredictability, reducing mental load, and reducing truck factor\nMaking it easy to collaborate with yourself and others on different computers, in the cloud, in six months’ time…\nDRY" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#review-of-existing-models-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Review of existing models", + "text": "Review of existing models\n\nlots of models\nlots of external consultancies\nlots of similarities\n\n\n\nlots of repetition/duplication\nsufficiently different that comparing results is difficult\nmethodological progress slow\nno base to build from\n\n\n\nconsultancies don’t tend to offer products, but services\ndifficult to compare different models to understand if differences are methodological or due to assumptions\nsame issues seen 20/30 years ago\nlearning and expertise gathered tends to be trapped within trusts, or kept secret by consultancies" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#the-road-to-rap", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#the-road-to-rap", - "title": "RAP", - "section": "The road to RAP", - "text": "The road to RAP\n\nWe’re roughly using NHS Digital’s RAP stages\nThere is an incredibly large amount to learn!\nConfession time! (everything I do not know…)\nYou don’t need to do it all at once\nYou don’t need to do it all at all ever\nEach thing you learn will incrementally help you\nRemember- that’s why we learnt this stuff. Because it helped us. 
And it can help you too" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#common-issues", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#common-issues", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Common issues", + "text": "Common issues\n\nhandling uncertainty\nunnecessary/early aggregation\npoor coverage of some changes\nlack of ownership & auditability of assumptions\nconflating demand forecasting with affordability\n\n\n\nmost models handle changes like demographic changes and the impact of changes in occupancy rates\nbut few try to handle addressing inequities, health status adjustment" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--baseline", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--baseline", - "title": "RAP", - "section": "Levels of RAP- Baseline", - "text": "Levels of RAP- Baseline\n\nData produced by code in an open-source language (e.g., Python, R, SQL).\nCode is version controlled (see Git basics and using Git collaboratively guides).\nRepository includes a README.md file (or equivalent) that clearly details steps a user must follow to reproduce the code\nCode has been peer reviewed.\nCode is published in the open and linked to & from accompanying publication (if relevant).\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#our-model", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#our-model", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Our model", + "text": "Our model\n\nopen source (not quite yet…)\nuses standard, well-known datasets (e.g. 
HES, ONS population projections)\ncurrently handles Inpatient admissions, Outpatient attendances, and A&E arrivals\nextensible and adaptable\ncovering all of the change factors\nstochastic Monte-Carlo model to handle uncertainty" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--silver", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--silver", - "title": "RAP", - "section": "Levels of RAP- Silver", - "text": "Levels of RAP- Silver\n\nCode is well-documented…\nCode is well-organised following standard directory format\nReusable functions and/or classes are used where appropriate\nPipeline includes a testing framework\nRepository includes dependency information (e.g. requirements.txt, PipFile, environment.yml\nData is handled and output in a Tidy data format\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#project-structure", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#project-structure", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Project Structure", + "text": "Project Structure\n\n\n\nData Extraction (R + {targets} & Sql)\nInputs App (R + {shiny})\nOutputs App (R + {shiny})\nModel Engine (Python & Docker)\nAzure Infrastructure (VM/ACR/ACI/Storage Accounts)\nAll of the code is stored on GitHub (currently, private repos 😔)\n\n\n\n\n\n\n\nflowchart TB\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n\n A[Data Extraction]\n B[Inputs App]\n C[Model]\n D[Outputs App]\n\n\n SB[(input app data)]\n SC[(model data)]\n SD[(results data)]\n\n A ---> SB\n A ---> SC\n \n SB ---> B\n SC ---> C\n\n B ---> C\n\n C ---> SD\n SD ---> D\n\n B -.-> D\n\n class A,B,C,D orange\n class SB,SC,SD lightslate" }, { - "objectID": 
"presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--gold", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#levels-of-rap--gold", - "title": "RAP", - "section": "Levels of RAP- Gold", - "text": "Levels of RAP- Gold\n\nCode is fully packaged\nRepository automatically runs tests etc. via CI/CD or a different integration/deployment tool e.g. GitHub Actions\nProcess runs based on event-based triggers (e.g., new data in database) or on a schedule\nChanges to the RAP are clearly signposted. E.g. a changelog in the package, releases etc. (See gov.uk info on Semantic Versioning)\n\nSource: NHS Digital RAP community of practice" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-overview", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-overview", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Overview", + "text": "Model Overview\n\n\nthe baseline data is a year worth of a provider’s HES data\neach row in the baseline data is run through a series of steps\neach step creates a factor that says how many times (on average) to sample that row\nthe factors are multiplied together and used to create a random Poisson value\nwe resample the rows using this random values\nefficiencies are then applied, e.g. 
LoS reductions, type conversions\n\n\n\n\nIP/OP/A&E data\ncomplex, but not complicated" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#a-learning-journey-to-get-you-there", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#a-learning-journey-to-get-you-there", - "title": "RAP", - "section": "A learning journey to get you there", - "text": "A learning journey to get you there\n\nCode style, organising your files\nFunctions and iteration\nGit and GitHub\nPackaging your code\nTesting\nPackage management and versioning" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Diagram", + "text": "Model Diagram\n\n\n\n\n\nflowchart TB\n classDef blue fill:#5881c1,stroke:#2c2825,color:#2c2825;\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n classDef slate fill:#e0e2e3,stroke:#2c2825,color:#2c2825;\n\n S[Baseline Activity]\n T[Future Activity]\n\n class S,T red\n\n subgraph rr[Row Resampling]\n direction LR\n\n subgraph pop[Population Changes]\n direction TB\n pop_p[Population Growth]\n pop_a[Age/Sex Structure]\n pop_h[Population Specific Health Status]\n\n class pop_p,pop_a,pop_h orange\n\n pop_p --- pop_a --- pop_h\n end\n\n subgraph dsi[Demand Supply Imbalances]\n direction TB\n dsi_w[Waiting List Adjustment]\n dsi_r[Repatriation/Expatriation]\n dsi_p[Private Healthcare Dynamics]\n\n class dsi_w,dsi_r,dsi_p orange\n\n dsi_w --- dsi_r --- dsi_p\n end\n\n subgraph nsi[Need Supply Imbalances]\n direction TB\n nsi_g[Gaps in Care]\n nsi_i[Inequalities]\n nsi_t[Threshold Imbalances]\n\n class nsi_g,nsi_i,nsi_t orange\n\n nsi_g --- nsi_i --- nsi_t\n end\n\n subgraph nda [Non-Demographic 
Adjustment]\n direction TB\n nda_m[Medical Interventions]\n nda_c[Changes to National Standards]\n nda_p[Patient Expectations]\n\n class nda_m,nda_c,nda_p orange\n\n nda_m --- nda_c --- nda_p\n end\n\n subgraph mit[Activity Mitigators]\n direction TB\n mit_a[Activity Avoidance]\n mit_t[Type Conversion]\n mit_e[Efficiencies]\n \n class mit_a,mit_t,mit_e orange\n\n mit_a --- mit_t --- mit_e\n end\n\n pop --- dsi --- nsi --- nda --- mit\n\n class dsi,nsi,pop,nda,mit lightslate\n end\n\n class rr slate\n \n S --> rr --> T\n\n\n\n\n\n\n\n\nuses either patient-level data, or minimal aggregation\nrow resampling grouped into 5 broad groups\n\npopulation changes address the changes to the structure of the population and health status over the medium term\ndemand supply imbalances: hospitals are currently struggling to keep pace with demand, so we correct for this to not carry forwards these into the future\nneed supply imbalance: addressing gaps in care that currently exist\nnon-demographic: such as the development of new medical technologies\nactivity mitigators: strategies trusts adopt for reducing activity, or delivering activity more efficiently\n\nsome assumptions set nationally, such as population growth via ONS population projections\nother assumptions set locally, with support from a Shiny app" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#how-we-can-help-each-other-get-there", - "href": "presentations/2023-03-09_midlands-analyst-rap/index.html#how-we-can-help-each-other-get-there", - "title": "RAP", - "section": "How we can help each other get there", - "text": "How we can help each other get there\n\nWork as a team!\nCoffee and coding!\nAsk for help!\nDo pair coding!\nGet your code reviewed!\nJoin the NHS-R/ NHSPycom communities" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-diagram-1", + "title": "An Introduction 
to the New Hospital Programme Demand Model", + "section": "Model Diagram", + "text": "Model Diagram\n\n\n\n\n\nflowchart TB\n classDef blue fill:#5881c1,stroke:#2c2825,color:#2c2825;\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n classDef lightslate fill:#b2b7b9,stroke:#2c2825,color:#2c2825;\n classDef slate fill:#e0e2e3,stroke:#2c2825,color:#2c2825;\n\n S[Baseline Activity]\n T[Future Activity]\n\n ORANGE[Implemented]\n BLUE[Not yet implemented]\n\n class ORANGE orange\n class BLUE blue\n\n class S,T red\n\n subgraph rr[Row Resampling]\n direction LR\n\n subgraph pop[Population Changes]\n direction TB\n pop_p[Population Growth]\n pop_a[Age/Sex Structure]\n pop_h[Population Specific Health Status]\n\n class pop_p,pop_a,pop_h orange\n\n pop_p --- pop_a --- pop_h\n end\n\n subgraph dsi[Demand Supply Imbalances]\n direction TB\n dsi_w[Waiting List Adjustment]\n dsi_r[Repatriation/Expatriation]\n dsi_p[Private Healthcare Dynamics]\n\n class dsi_w,dsi_r orange\n class dsi_p blue\n\n dsi_w --- dsi_r --- dsi_p\n end\n\n subgraph nsi[Need Supply Imbalances]\n direction TB\n nsi_g[Gaps in Care]\n nsi_i[Inequalities]\n nsi_t[Threshold Imbalances]\n\n class nsi_g,nsi_i,nsi_t blue\n\n nsi_g --- nsi_i --- nsi_t\n end\n\n subgraph nda [Non-Demographic Adjustment]\n direction TB\n nda_m[Medical Interventions]\n nda_c[Changes to National Standards]\n nda_p[Patient Expectations]\n\n class nda_m,nda_c,nda_p blue\n\n nda_m --- nda_c --- nda_p\n end\n\n subgraph mit[Activity Mitigators]\n direction TB\n mit_a[Activity Avoidance]\n mit_t[Type Conversion]\n mit_e[Efficiencies]\n \n class mit_a,mit_t,mit_e orange\n\n mit_a --- mit_t --- mit_e\n end\n\n pop --- dsi --- nsi --- nda --- mit\n\n class dsi,nsi,pop,nda,mit lightslate\n end\n\n class rr slate\n \n S --> rr --> T" }, { - "objectID": "presentations/2023-03-09_midlands-analyst-rap/index.html#haca", - "href": 
"presentations/2023-03-09_midlands-analyst-rap/index.html#haca", - "title": "RAP", - "section": "HACA", - "text": "HACA\n\nThe first national analytics conference for health and care\nInsight to action!\nJuly 11th and 12th, University of Birmingham\nAccepting abstracts for short and long talks and posters\nAbstract deadline 27th March\nHelp is available (with abstract, poster, preparing presentation…)!\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#monte-carlo-simulation", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#monte-carlo-simulation", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Monte Carlo Simulation", + "text": "Monte Carlo Simulation\n\n\n\nWe run the model N times, varying the input parameters each time slightly to handle the uncertainty.\nThe results of the model are aggregated at the end of each model run\nThe aggregated results are combined at the end into a single file\n\n\n\n\n\n\n\nflowchart LR\n classDef orange fill:#f9bf07,stroke:#2c2825,color:#2c2825;\n classDef red fill:#ec6555,stroke:#2c2825,color:#2c2825;\n \n A[Baseline Activity]\n Ba[Model Run 0]\n Bb[Model Run 1]\n Bc[Model Run 2]\n Bd[Model Run 3]\n Bn[Model Run n]\n C[Results]\n\n A ---> Ba ---> C\n A ---> Bb ---> C\n A ---> Bc ---> C\n A ---> Bd ---> C\n A ---> Bn ---> C\n \n class A,C red\n class Ba,Bb,Bc,Bd,Bn orange\n \n\n\n\n\n\n\n\nInspired by\n\nMapReduce (Google, 2004)\nSplit, Apply, Combine (H. 
Wickham, 2011)" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#a-note-on-richard-stallman", - "href": "presentations/2024-05-30_open-source-licensing/index.html#a-note-on-richard-stallman", - "title": "Open source licensing", - "section": "A note on Richard Stallman", - "text": "A note on Richard Stallman\n\nRichard Stallman has been heavily criticised for some of this views\nHe is hard to ignore when talking about open source so I am going to talk about him\nNothing in this talk should be read as endorsing any of his comments outside (or inside) the world of open source" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Parameters", + "text": "Model Parameters\n\nWe ask users to provide parameters in the form of 90% confidence intervals\nWe can then convert these confidence intervals into distributions\nDuring the model we sample values from these distributions for each model parameter\nAll of the parameters represent the average rate to sample a row of data from the baseline" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-origin-of-open-source", - "href": "presentations/2024-05-30_open-source-licensing/index.html#the-origin-of-open-source", - "title": "Open source licensing", - "section": "The origin of open source", - "text": "The origin of open source\n\nIn the 50s and 60s source code was routinely shared with hardware and users were often expected to modify to run on their hardware\nBy the late 1960s the production cost of software was rising relative to hardware and proprietary licences became more prevalent\nIn 1980 Richard Stallman’s department at MIT took delivery of a printer they were not able to modify the source code for\nRichard Stallman launched the GNU project in 
1983 to fight for software freedoms\nMIT licence was launched in the late 1980s\nCathedral and the bazaar was released in 1997 (more on which later)" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Parameters", + "text": "Model Parameters\n\n“We expect in the future to see between a 25% reduction and a 25% increase in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-is-open-source", - "href": "presentations/2024-05-30_open-source-licensing/index.html#what-is-open-source", - "title": "Open source licensing", - "section": "What is open source?", - "text": "What is open source?\n\nThink free as in free speech, not free beer (Stallman)\n\n\nOpen source does not mean free of charge! Software freedom implies the ability to sell code\nFree of charge does not mean open source! Many free to download pieces of software are not open source (Zoom, for example)\n\n\nBy Chao-Kuei et al. 
- https://www.gnu.org/philosophy/categories.en.html, GPL, Link" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-2", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-2", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Parameters", + "text": "Model Parameters\n\n“We expect in the future to see between a 20% reduction and a 90% reduction in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-four-freedoms", - "href": "presentations/2024-05-30_open-source-licensing/index.html#the-four-freedoms", - "title": "Open source licensing", - "section": "The four freedoms", - "text": "The four freedoms\n\nFreedom 0: The freedom to use the program for any purpose.\nFreedom 1: The freedom to study how the program works, and change it to make it do what you wish.\nFreedom 2: The freedom to redistribute and make copies so you can help your neighbor.\nFreedom 3: The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits." 
+ "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-3", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-parameters-3", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Parameters", + "text": "Model Parameters\n\n“We expect in the future to see between a 2% reduction and an 18% reduction in this activity”\n\n\n\n\ngrey highlighted section: 90% confidence intervals\nblack line: confidence intervals into distributions\nyellow points: sampled parameter for a model run" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar", - "href": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar", - "title": "Open source licensing", - "section": "Cathedral and the bazaar", - "text": "Cathedral and the bazaar\n\nEvery good work of software starts by scratching a developer’s personal itch.\nGood programmers know what to write. Great ones know what to rewrite (and reuse).\nPlan to throw one [version] away; you will, anyhow (copied from Frederick Brooks’s The Mythical Man-Month).\nIf you have the right attitude, interesting problems will find you.\nWhen you lose interest in a program, your last duty to it is to hand it off to a competent successor.\nTreating your users as co-developers is your least-hassle route to rapid code improvement and effective debugging.\nRelease early. Release often. And listen to your customers.\nGiven a large enough beta-tester and co-developer base, almost every problem will be characterized quickly and the fix obvious to someone.\nSmart data structures and dumb code works a lot better than the other way around.\nIf you treat your beta-testers as if they’re your most valuable resource, they will respond by becoming your most valuable resource." 
+ "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Run Example (1)", + "text": "Model Run Example (1)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n1.00\n\n\n2\n50\nm\n110\n3\n1.00\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n1.00\n\n\n5\n50\nf\n110\n2\n1.00\n\n\n6\n52\nf\n120\n0\n1.00\n\n\n\n\n\n\n\n\n\nStart with baseline data - we are going to sample each row exactly once (column f)." }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar-cont.", - "href": "presentations/2024-05-30_open-source-licensing/index.html#cathedral-and-the-bazaar-cont.", - "title": "Open source licensing", - "section": "Cathedral and the bazaar (cont.)", - "text": "Cathedral and the bazaar (cont.)\n\nThe next best thing to having good ideas is recognizing good ideas from your users. Sometimes the latter is better.\nOften, the most striking and innovative solutions come from realizing that your concept of the problem was wrong.\nPerfection (in design) is achieved not when there is nothing more to add, but rather when there is nothing more to take away. (Attributed to Antoine de Saint-Exupéry)\nAny tool should be useful in the expected way, but a truly great tool lends itself to uses you never expected.\nWhen writing gateway software of any kind, take pains to disturb the data stream as little as possible—and never throw away information unless the recipient forces you to!\nWhen your language is nowhere near Turing-complete, syntactic sugar can be your friend.\nA security system is only as secure as its secret. 
Beware of pseudo-secrets.\nTo solve an interesting problem, start by finding a problem that is interesting to you.\nProvided the development coordinator has a communications medium at least as good as the Internet, and knows how to lead without coercion, many heads are inevitably better than one." + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-2", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-2", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Run Example (2)", + "text": "Model Run Example (2)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n1.00\n\n\n2\n50\nm\n110\n3\n1.00\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n1.00\n\n\n5\n50\nf\n110\n2\n1.00\n\n\n6\n52\nf\n120\n0\n1.00\n\n\n\n\n\n\n\nage\nsex\nf\n\n\n\n\n50\nm\n0.90\n\n\n51\nm\n1.10\n\n\n52\nm\n1.20\n\n\n50\nf\n0.80\n\n\n51\nf\n0.70\n\n\n52\nf\n1.30\n\n\n\n\n\n\n\nf\n\n\n\n\n1.00 × 0.90 = 0.90\n\n\n1.00 × 0.90 = 0.90\n\n\n1.00 × 1.10 = 1.10\n\n\n1.00 × 0.80 = 0.80\n\n\n1.00 × 0.80 = 0.80\n\n\n1.00 × 1.30 = 1.30\n\n\n\n\n\nWe perform a step where we join based on age and sex, then update the f column." 
}, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#the-disciplines-of-open-source-are-the-disciplines-of-good-data-science", - "href": "presentations/2024-05-30_open-source-licensing/index.html#the-disciplines-of-open-source-are-the-disciplines-of-good-data-science", - "title": "Open source licensing", - "section": "The disciplines of open source are the disciplines of good data science", - "text": "The disciplines of open source are the disciplines of good data science\n\nMeaningful README\nMeaningful commit messages\nModularity\nSeparating data code from analytic code from interactive code\nAssigning issues and pull requests for action/ review\nDon’t forget one of the most lazy and incompetent developers you will ever work with is yourself, six months later" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-3", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-3", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Run Example (3)", + "text": "Model Run Example (3)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\n\n\n\n\n1\n50\nm\n100\n4\n0.90\n\n\n2\n50\nm\n110\n3\n0.90\n\n\n3\n51\nm\n120\n5\n1.10\n\n\n4\n50\nf\n100\n1\n0.80\n\n\n5\n50\nf\n110\n2\n0.80\n\n\n6\n52\nf\n120\n0\n1.30\n\n\n\n\n\n\n\nspecialty\nf\n\n\n\n\n100\n0.90\n\n\n110\n1.10\n\n\n\n\n\n\n\nf\n\n\n\n\n0.90 × 0.90 = 0.81\n\n\n0.90 × 1.10 = 0.99\n\n\n1.10 × 1.00 = 1.10\n\n\n0.80 × 0.90 = 0.72\n\n\n0.80 × 1.10 = 0.88\n\n\n1.30 × 1.00 = 1.30\n\n\n\n\n\nThe next step joins on the specialty column, again updating f. Note, if there is no value to join on, then we multiply by 1." 
}, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-licences-exist", - "href": "presentations/2024-05-30_open-source-licensing/index.html#what-licences-exist", - "title": "Open source licensing", - "section": "What licences exist?", - "text": "What licences exist?\n\nPermissive\n\nSuch as MIT but there are others. Recommended by NHSX draft guidelines on open source\nApache is a notable permissive licence- includes a patent licence\nIn our work the OGL is also relevant- civil servant publish stuff under OGL (and MIT- it isn’t particularly recommended for code)\n\nCopyleft\n\nGPL2, GPL3, AGPL (“the GPL of the web”)\nNote that the provisions of the GPL only apply when you distribute the code\nAt a certain point it all gets too complicated and you need a lawyer\nMPL is a notable copyleft licence- can combine with proprietary code as long as kept separate\n\nArguments for permissive/ copyleft- getting your code used versus preserving software freedoms for other people\nNote that most of the licences are impossible to read! 
There is a website to explain tl;dr" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-4", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-4", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Run Example (4)", + "text": "Model Run Example (4)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\nf\nn\n\n\n\n\n1\n50\nm\n100\n4\n0.90\n1\n\n\n2\n50\nm\n110\n3\n0.90\n0\n\n\n3\n51\nm\n120\n5\n1.10\n2\n\n\n4\n50\nf\n100\n1\n0.80\n1\n\n\n5\n50\nf\n110\n2\n0.80\n0\n\n\n6\n52\nf\n120\n0\n1.30\n3\n\n\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\n\n\n\n\n1\n50\nm\n100\n4\n\n\n3\n51\nm\n120\n5\n\n\n3\n51\nm\n120\n5\n\n\n4\n50\nf\n100\n1\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n\n\n\nOnce all of the steps are performed, sample a random value n from a Poisson distribution with λ=f, then we select each row n times." + }, + { + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-5", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#model-run-example-5", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Model Run Example (5)", + "text": "Model Run Example (5)\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\ng\n\n\n\n\n1\n50\nm\n100\n4\n0.75\n\n\n3\n51\nm\n120\n5\n0.50\n\n\n3\n51\nm\n120\n5\n1.00\n\n\n4\n50\nf\n100\n1\n0.90\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n6\n52\nf\n120\n0\n0.80\n\n\n\n\n\n\n\nid\nage\nsex\nspecialty\nlos\n\n\n\n\n1\n50\nm\n100\n2\n\n\n3\n51\nm\n120\n1\n\n\n3\n51\nm\n120\n5\n\n\n4\n50\nf\n100\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n6\n52\nf\n120\n0\n\n\n\n\n\nAfter resampling, we apply efficiency steps. E.g., similar joins are used to create column g, which is then used to sample a new LOS from a binomial distribution." 
}, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-is-copyright-and-why-does-it-matter", - "href": "presentations/2024-05-30_open-source-licensing/index.html#what-is-copyright-and-why-does-it-matter", - "title": "Open source licensing", - "section": "What is copyright and why does it matter", - "text": "What is copyright and why does it matter\n\nCopyright is assigned at the moment of creation\nIf you made it in your own time, it’s yours (usually!)\nIf you made it at work, it belongs to your employer\nIf someone paid you to make it (“work for hire”) it belongs to them\nCrucially, the copyright holder can relicence software\n\nIf it’s jointly authored it depends if it’s a “collective” or “joint” work\nHonestly it’s pretty complicated. Just vest copyright in an organisation or group of individuals you trust\nGoldacre review suggests using Crown copyright for copyright in the NHS because it’s a “shoal, not a big fish” (with apologies to Ben whom I am misquoting)" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "How the model is built", + "text": "How the model is built\n\nThe model is built in Python and can be run on any machine you can install Python on\nUses various packages, such as numpy and pandas\nReads data in .parquet format for efficiency\nReturns aggregated results as a .json file\nCould also output full row level results if needed" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#iceweasel", - "href": "presentations/2024-05-30_open-source-licensing/index.html#iceweasel", - "title": "Open source licensing", - "section": "Iceweasel", - "text": "Iceweasel\n\nIceweasel is a story of trademark rather than copyright\nDebian (a Linux flavour) had the permission to use the 
source code of Firefox, but not the logo\nSo they took the source code and made their own version\nThis sounds very obscure and unimportant but it could become important in future projects of ours, like…" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-built-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "How the model is built", + "text": "How the model is built\n\nCode is built in a modular approach\nEach activity type (Inpatients/Outpatients/A&E) has its own model code\nCode is reused where possible (e.g. all three models share the code for demographic adjustment)" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#what-we-have-learned-in-recent-projects", - "href": "presentations/2024-05-30_open-source-licensing/index.html#what-we-have-learned-in-recent-projects", - "title": "Open source licensing", - "section": "What we have learned in recent projects", - "text": "What we have learned in recent projects\n\nThe huge benefits of being open\n\nTransparency\nWorking with customers\nGoodwill\n\nNonfree mitigators\nDifferent licences for different repos" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-deployed", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#how-the-model-is-deployed", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "How the model is deployed", + "text": "How the model is deployed\n\nDeployed as a Docker Container\nRuns in Azure Container Instances\nEach model run creates a new container, and the container is destroyed when the model run completes" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#software-freedom-means-allowing-people-to-do-stuff-you-dont-like", - "href": 
"presentations/2024-05-30_open-source-licensing/index.html#software-freedom-means-allowing-people-to-do-stuff-you-dont-like", - "title": "Open source licensing", - "section": "Software freedom means allowing people to do stuff you don’t like", - "text": "Software freedom means allowing people to do stuff you don’t like\n\nFreedom 0: The freedom to use the program for any purpose.\nFreedom 3: The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits.\nThe code isn’t the only thing with worth in the project\nThis is why there are whole businesses founded on “here’s the Linux source code”\nSo when we’re sharing code we are letting people do stupid things with it but we’re not recommending that they do stupid things with it\nPeople do stupid things with Excel and Microsoft don’t accept liability for that, and neither should we\nThis issue of sharing analytic code and merchantability for a particular purpose is poorly understood and I think everyone needs to be clearer on it (us, and our customers)\nIn my view a world where consultants are selling our code is better than a world where they’re selling their spreadsheets" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#data-extraction", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#data-extraction", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Data Extraction", + "text": "Data Extraction\n\nUses principles of RAP, using R + {targets} and Sql\nAll of the data required to run the model\nData is extracted from various sources\n\nSql Datawarehouse (HES data)\nONS population projections + life expectancy tables\nCentral returns, e.g. 
KH03\nODS data (organisation names, successors)\n\nExtracted data is uploaded to Azure storage containers" }, { - "objectID": "presentations/2024-05-30_open-source-licensing/index.html#open-source-as-in-piano", - "href": "presentations/2024-05-30_open-source-licensing/index.html#open-source-as-in-piano", - "title": "Open source licensing", - "section": "“Open source as in piano”", - "text": "“Open source as in piano”\n\nThe patient experience QDC project\nOur current project\nOpen source code is not necessarily to be run, but understood and learned from\nBuilding a group of people who can use and contribute to your code is arguably as important as writing it\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Inputs App", + "text": "Inputs App\nA {shiny} app that allows the user to set parameters, and submit as a job to run the model with those values." 
}, { - "objectID": "presentations/index.html", - "href": "presentations/index.html", - "title": "Presentations", - "section": "", - "text": "Title\nAuthor\nDate\n\n\n\n\nWhat is AI?\nData science team, Strategy Unit\n2024-10-10\n\n\nIdentifying patients at risk: Is this AI?\nYiWen Hon\n2024-10-10\n\n\nUsing R and Python to model future hospital activity: EARL Conference 2024\nYiWen Hon, Matt Dray, Tom Jemmett\n2024-09-05\n\n\nAgile and scrum working\nChris Beeley\n2024-08-22\n\n\nOpen source licensing: Or: how I learned to stop worrying and love openness\nChris Beeley\n2024-05-30\n\n\nGitHub as a team sport: DfT QA Month\nMatt Dray\n2024-05-23\n\n\nStore Data Safely: Coffee & Coding\nYiWen Hon, Matt Dray\n2024-05-16\n\n\nCoffee and Coding: Making my analytical workflow more reproducible with {targets}\nJacqueline Grout\n2024-01-25\n\n\nConference Check-in App: NHS-R/NHS.pycom 2023\nTom Jemmett\n2023-10-17\n\n\nSystem Dynamics in health and care: fitting square data into round models\nSally Thompson\n2023-10-09\n\n\nRepeating Yourself with Functions: Coffee and Coding\nSally Thompson\n2023-09-07\n\n\nCoffee and Coding: Working with Geospatial Data in R\nTom Jemmett\n2023-08-24\n\n\nUnit testing in R: NHS-R Community Webinar\nTom Jemmett\n2023-08-23\n\n\nEverything you ever wanted to know about data science: but were too afraid to ask\nChris Beeley\n2023-08-02\n\n\nTravels with R and Python: the power of data science in healthcare\nChris Beeley\n2023-08-02\n\n\nAn Introduction to the New Hospital Programme Demand Model: HACA 2023\nTom Jemmett\n2023-07-11\n\n\nWhat good data science looks like\nChris Beeley\n2023-05-23\n\n\nText mining of patient experience data\nChris Beeley\n2023-05-15\n\n\nCoffee and Coding: {targets}\nTom Jemmett\n2023-03-23\n\n\nCollaborative working\nChris Beeley\n2023-03-23\n\n\nCoffee and Coding: Good Coding Practices\nTom Jemmett\n2023-03-09\n\n\nRAP: what is it and how can my team start using it effectively?\nChris 
Beeley\n2023-03-09\n\n\nCoffee and coding: Intro session\nChris Beeley\n2023-02-23" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#inputs-app-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Inputs App", + "text": "Inputs App" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#what-is-testing", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#what-is-testing", - "title": "Unit testing in R", - "section": "What is testing?", - "text": "What is testing?\n\nSoftware testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation\nwikipedia" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Outputs App", + "text": "Outputs App\nA {shiny} app that allows the user to view the results of model runs." 
}, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code", - "title": "Unit testing in R", - "section": "How can we test our code?", - "text": "How can we test our code?\n\n\nStatically\n\n\n(without executing the code)\nhappens constantly, as we are writing code\nvia code reviews\ncompilers/interpreters/linters statically analyse the code for syntax errors\n\n\n\n\n\nDynamically" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app-1", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#outputs-app-1", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Outputs App", + "text": "Outputs App" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-can-we-test-our-code-1", - "title": "Unit testing in R", - "section": "How can we test our code?", - "text": "How can we test our code?\n\n\nStatically\n\n(without executing the code)\nhappens constantly, as we are writing code\nvia code reviews\ncompilers/interpreters/linters statically analyse the code for syntax errors\n\n\n\n\nDynamically\n\n\n(by executing the code)\nsplit into functional and non-functional testing\ntesting can be manual, or automated\n\n\n\n\n\nnon-functional testing covers things like performance, security, and usability testing" + "objectID": "presentations/2023-07-11_haca-nhp-demand-model/index.html#questions", + "href": "presentations/2023-07-11_haca-nhp-demand-model/index.html#questions", + "title": "An Introduction to the New Hospital Programme Demand Model", + "section": "Questions?", + "text": "Questions?\n\nContact The Strategy Unit\n\n\n strategy.unit@nhs.net\n The-Strategy-Unit\n\n\nContact Me\n\n\n thomas.jemmett@nhs.net\n tomjemmett\n\n\n\n\n\nview slides 
at https://tinyurl.com/haca23nhp" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests", - "title": "Unit testing in R", - "section": "Different types of functional tests", - "text": "Different types of functional tests\nUnit Testing checks each component (or unit) for accuracy independently of one another.\n\nIntegration Testing integrates units to ensure that the code works together.\n\n\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\n\n\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements." + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#how-did-we-get-here", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#how-did-we-get-here", + "title": "Agile and scrum working", + "section": "How did we get here?", + "text": "How did we get here?\n\nWaterfall approaches were used in the early days of software development\n\nRequirements; Design; Development; Integration; Testing; Deployment\n\nYou only move to the next stage when the first one is complete\n(although actually it turns out you kind of don’t…)" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-1", - "title": "Unit testing in R", - "section": "Different types of functional tests", - "text": "Different types of functional tests\nUnit Testing checks each component (or unit) for accuracy independently of one another.\nIntegration Testing integrates units to ensure that the code works together.\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\n\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements.\n\n\nUnit, Integration, 
and E2E testing are all things we can automate in code, whereas UAT testing is going to be manual" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-road-to-agile", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-road-to-agile", + "title": "Agile and scrum working", + "section": "The road to agile", + "text": "The road to agile\n\nSome of the ideas for agile floated around in the 20th century\nShewart’s Plan-Do-Study-Act cycle\nThe New New Product Development Game in 1986\nScrum (which we’ll return to) was proposed in 1993\nIn 2001 the Manifesto for Agile Software Development was published" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#different-types-of-functional-tests-2", - "title": "Unit testing in R", - "section": "Different types of functional tests", - "text": "Different types of functional tests\nUnit Testing checks each component (or unit) for accuracy independently of one another.\n\nIntegration Testing integrates units to ensure that the code works together.\nEnd-to-End Testing (e2e) makes sure that the entire system functions correctly.\nUser Acceptance Testing (UAT) ensures that the product meets the real user’s requirements.\n\n\nOnly focussing on unit testing in this talk, but the techniques/packages could be extended to integration testing. Often other tools (potentially specific tools) are needed for E2E testing." 
+ "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-manifesto", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-manifesto", + "title": "Agile and scrum working", + "section": "The agile manifesto", + "text": "The agile manifesto\n\nCopyright © 2001 Kent Beck, Mike Beedle, Arie van Bennekum, Alistair Cockburn, Ward Cunningham, Martin Fowler, James Grenning, Jim Highsmith, Andrew Hunt, Ron Jeffries, Jon Kern, Brian Marick\nRobert C. Martin, Steve Mellor, Ken Schwaber, Jeff Sutherland, Dave Thomas\nthis declaration may be freely copied in any form, but only in its entirety through this notice." }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#example", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#example", - "title": "Unit testing in R", - "section": "Example", - "text": "Example\nWe have a {shiny} app which grabs some data from a database, manipulates the data, and generates a plot.\n\n\nwe would write unit tests to check the data manipulation and plot functions work correctly (with pre-created sample/simple datasets)\nwe would write integration tests to check that the data manipulation function works with the plot function (with similar data to what we used for the unit tests)\nwe would write e2e tests to ensure that from start to finish the app grabs the data and produces a plot as required\n\n\n\nsimple (unit tests) to complex (e2e tests)" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--software-and-the-mvp", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--software-and-the-mvp", + "title": "Agile and scrum working", + "section": "Agile principles- software and the MVP", + "text": "Agile principles- software and the MVP\n\nOur highest priority is to satisfy the customer through early and continuous delivery of valuable software.\nDeliver working software frequently, from a couple of weeks to a couple 
of months, with a preference to the shorter timescale.\nWorking software is the primary measure of progress.\n\n(these principles and those on following slides copyright Ibid.)" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-pyramid", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-pyramid", - "title": "Unit testing in R", - "section": "Testing Pyramid", - "text": "Testing Pyramid\n\n\nImage source: The Testing Pyramid: Simplified for One and All headspin.io" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--working-with-customers", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--working-with-customers", + "title": "Agile and scrum working", + "section": "Agile principles- working with customers", + "text": "Agile principles- working with customers\n\nWelcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.\nBusiness people and developers must work together daily throughout the project." 
}, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function", - "title": "Unit testing in R", - "section": "Let’s create a simple function…", - "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--teamwork", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--teamwork", + "title": "Agile and scrum working", + "section": "Agile principles- teamwork", + "text": "Agile principles- teamwork\n\nBuild projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.\nThe most efficient and effective method of conveying information to and within a development team is face-to-face conversation.\nThe best architectures, requirements, and designs emerge from self-organizing teams.\nAt regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly." 
}, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-1", - "title": "Unit testing in R", - "section": "Let’s create a simple function…", - "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--project-management", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-principles--project-management", + "title": "Agile and scrum working", + "section": "Agile principles- project management", + "text": "Agile principles- project management\n\nAgile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.\nContinuous attention to technical excellence and good design enhances agility.\nSimplicity–the art of maximizing the amount of work not done–is essential." 
}, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-create-a-simple-function-2", - "title": "Unit testing in R", - "section": "Let’s create a simple function…", - "text": "Let’s create a simple function…\n\nmy_function <- function(x, y) {\n \n stopifnot(\n \"x must be numeric\" = is.numeric(x),\n \"y must be numeric\" = is.numeric(y),\n \"x must be same length as y\" = length(x) == length(y),\n \"cannot divide by zero!\" = y != 0\n )\n\n x / y\n}\n\n\nThe Ten Rules of Defensive Programming in R" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-advantage", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-agile-advantage", + "title": "Agile and scrum working", + "section": "The agile advantage", + "text": "The agile advantage\n\nBetter use of fixed resources to deliver an unknown outcome, rather than unknown resources to deliver a fixed outcome\nContinuous delivery" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#feature-creep", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#feature-creep", + "title": "Agile and scrum working", + "section": "Feature creep", + "text": "Feature creep\n\nUsers ask for: everything they need, everything they think they may need, everything they want, everything they think they may want\n\n“every program 
attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can”\n\nZawinski’s Law- Source" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-1", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#regular-stakeholder-feedback", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#regular-stakeholder-feedback", + "title": "Agile and scrum working", + "section": "Regular stakeholder feedback", + "text": "Regular stakeholder feedback\n\nAgile teams are very responsive to product feedback\nThe project we’re currently working on is very agile whether we like it or not\nOur customers never know what they want until we show them something they don’t want" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-2", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#more-agile-advantages", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#more-agile-advantages", + "title": "Agile and scrum working", + "section": "More agile
advantages", + "text": "More agile advantages\n\nEarly and cheap failure\nContinuous testing and QA\nReduction in unproductive work\nTeam can improve regularly, not just the product" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-3", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-3", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#agile-methods", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#agile-methods", + "title": "Agile and scrum working", + "section": "Agile methods", + "text": "Agile methods\n\nThere are lots of agile methodologies\nI’m not going to embarrass myself by pretending to understand them\nExamples include Lean, Crystal, and Extreme Programming" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-4", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-4", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#scrum", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum", + "title": "Agile and scrum working", + "section": "Scrum", + "text": "Scrum\n\nScrum is the agile methodology we have adopted\nDespite dire warnings to the contrary we have 
not adopted it wholesale but most of its principles\nThe fundamental organising principle of work in scrum is a sprint lasting 1-4 weeks\nEach sprint finishes with a defined and useful piece of software that can be shown to/ used by customers" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-5", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-5", - "title": "Unit testing in R", - "section": "… and create our first test", - "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})\n\nTest passed 😸" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#product-owner", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#product-owner", + "title": "Agile and scrum working", + "section": "Product owner", + "text": "Product owner\n\nThis person is responsible for the backlog- what goes in to the sprint\nThe backlog should be inclusive of all of the things that customers want or might want\nThe backlog should be prioritised\nThe product owner does this through deep and frequent conversations with customers" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions", - "title": "Unit testing in R", - "section": "other expect_*() functions…", - "text": "other expect_*() functions…\n\ntest_that(\"my_function correctly divides values\", {\n expect_lt(\n my_function(4, 2),\n 10\n )\n expect_gt(\n my_function(1, 4),\n 0.2\n )\n expect_length(\n my_function(c(4, 1), c(2, 4)),\n 2\n )\n})\n\nTest passed 🎉\n\n\n\n{testthat} documentation" + "objectID": 
"presentations/2024-08-22_agile-and-scrum/index.html#scrum-master-helps-the-scrum-team", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-master-helps-the-scrum-team", + "title": "Agile and scrum working", + "section": "Scrum master helps the scrum team", + "text": "Scrum master helps the scrum team\n\n“By coaching the team members in self-management and cross-functionality\nFocus on creating high-value Increments that meet the Definition of Done\nInfluence the removal of impediments to the Scrum Team’s progress\nEnsure that all Scrum events take place and are positive, productive, and kept within the timebox.”\n\nSource" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert", - "title": "Unit testing in R", - "section": "Arrange, Act, Assert", - "text": "Arrange, Act, Assert\n\n\n\n\n\ntest_that(\"my_function works\", {\n # arrange\n # \n #\n #\n\n # act\n #\n\n # assert\n #\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#the-backlog", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#the-backlog", + "title": "Agile and scrum working", + "section": "The backlog", + "text": "The backlog\n\nHaving an accurate and well prioritised backlog is key\nDon’t estimate the backlog in hours- use “T shirt sizes” or “points”\nPeople are terrible at estimating how long things take- particularly in software\nEverything in the backlog needs a defined “Done” state" }, - { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-1", - "title": "Unit testing in R", - "section": "Arrange, Act, Assert", - "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\n\n\nto create sample values\ncreate fake/temporary files\nset random seed\nset R 
options/environment variables\n\n\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n #\n\n # assert\n #\n})" + { + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-planning", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-planning", + "title": "Agile and scrum working", + "section": "Sprint planning", + "text": "Sprint planning\n\nThe team, the product owner, and the scrum master plan the sprint\nSprints should be a fixed length of time less than one month\nThe sprint cannot be changed or added to (we break this rule)\nThe team works autonomously in the sprint- nobody decides who does what except the team\nCan take three hours and should if it needs to" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-2", - "title": "Unit testing in R", - "section": "Arrange, Act, Assert", - "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\nwe act by calling the function\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n #\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#standup", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#standup", + "title": "Agile and scrum working", + "section": "Standup", + "text": "Standup\n\nEvery day, for no more than 15 minutes (teams often stand up to reinforce this rule) team and scrum master meet\nEach person answers three questions\n\nWhat did you do yesterday to help the team finish the sprint?\nWhat will you do today to help the team finish the sprint?\nIs there an obstacle blocking you or the team from achieving the sprint goal?" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-3", - "href":
"presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert-3", - "title": "Unit testing in R", - "section": "Arrange, Act, Assert", - "text": "Arrange, Act, Assert\n\n\nwe arrange the environment, before running the function\nwe act by calling the function\nwe assert that the actual results match our expected results\n\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected)\n})" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-retro", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#sprint-retro", + "title": "Agile and scrum working", + "section": "Sprint retro", + "text": "Sprint retro\n\nWhat went well, what could have gone better, and what to improve next time\nLooking at process, not blaming individuals\nRequires maturity and trust to bring up issues, and to respond to them in a constructive way\nShould agree at the end on one process improvement which goes in the next sprint\nWe’ve had some really, really good retros and I think it’s a really important process for a team" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#our-test-failed", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#our-test-failed", - "title": "Unit testing in R", - "section": "Our test failed!?! 😢", - "text": "Our test failed!?! 😢\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected)\n})\n\n── Failure: my_function works ──────────────────────────────────────────────────\n`actual` not equal to `expected`.\n1/1 mismatches\n[1] 0.714 - 0.714 == 7.14e-07\n\n\nError:\n! 
Test failed" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#team-perspective", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#team-perspective", + "title": "Agile and scrum working", + "section": "Team perspective", + "text": "Team perspective\n\nProduct owner- that’s me\n\nFocus, clarity and transparency, team delivery, clear and appropriate responsibilities\n\nScrum master- YiWen\nTeam member- Matt\nTeam member- Rhian" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#tolerance-to-the-rescue", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#tolerance-to-the-rescue", - "title": "Unit testing in R", - "section": "Tolerance to the rescue 🙂", - "text": "Tolerance to the rescue 🙂\n\ntest_that(\"my_function works\", {\n # arrange\n x <- 5\n y <- 7\n expected <- 0.714285\n\n # act\n actual <- my_function(x, y)\n\n # assert\n expect_equal(actual, expected, tolerance = 1e-6)\n})\n\nTest passed 🎊\n\n\n\n(this is a slightly artificial example, usually the default tolerance is good enough)" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-values", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#scrum-values", + "title": "Agile and scrum working", + "section": "Scrum values", + "text": "Scrum values\n\nCourage\nFocus\nCommitment\nRespect\nOpenness" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-edge-cases", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-edge-cases", - "title": "Unit testing in R", - "section": "Testing edge cases", - "text": "Testing edge cases\n\n\nRemember the validation steps we built into our function to handle edge cases?\n\nLet’s write tests for these edge cases:\nwe expect errors\n\n\ntest_that(\"my_function works\", {\n expect_error(my_function(5, 0))\n expect_error(my_function(\"a\", 3))\n expect_error(my_function(3, \"a\"))\n expect_error(my_function(1:2, 4))\n})\n\nTest 
passed 🎊" + "objectID": "presentations/2024-08-22_agile-and-scrum/index.html#using-agile-outside-of-software", + "href": "presentations/2024-08-22_agile-and-scrum/index.html#using-agile-outside-of-software", + "title": "Agile and scrum working", + "section": "Using agile outside of software", + "text": "Using agile outside of software\n\nData science is outside of software (IMHO)\n\nWe don’t have daily standups and some of our processes run longer than in software development\n\nYou can build cars with Agile\nMarketing and UX design\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example", - "title": "Unit testing in R", - "section": "Another (simple) example", - "text": "Another (simple) example\n\n\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\nConsider this function - there is branched logic, so we need to carefully design tests to validate the logic works as intended." 
+ "objectID": "presentations/2023-05-15_text-mining/index.html#patient-experience", + "href": "presentations/2023-05-15_text-mining/index.html#patient-experience", + "title": "Text mining of patient experience data", + "section": "Patient experience", + "text": "Patient experience\n\nThe NHS collects a lot of patient experience data\nRate the service 1-5 (Very poor… Excellent) but also give written feedback\n\n“Parking was difficult”\n“Doctor was rude”\n“You saved my life”\n\nMany organisations lack the staffing to read all of the feedback in a systematic way" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example-1", - "title": "Unit testing in R", - "section": "Another (simple) example", - "text": "Another (simple) example\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\n\ntest_that(\"it returns 'x' if x is bigger than y\", {\n expect_equal(my_new_function(4, 3), \"x\")\n})\n\nTest passed 🎉\n\ntest_that(\"it returns 'y' if y is bigger than x\", {\n expect_equal(my_new_function(3, 4), \"y\")\n expect_equal(my_new_function(3, 3), \"y\")\n})\n\nTest passed 🥳" + "objectID": "presentations/2023-05-15_text-mining/index.html#text-mining", + "href": "presentations/2023-05-15_text-mining/index.html#text-mining", + "title": "Text mining of patient experience data", + "section": "Text mining", + "text": "Text mining\n\nWe have built an algorithm to read it\n\nTheme\n“Criticality”\n\nFits alongside other work happening within NHSE\n\nA framework for understanding patient experience" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-to-design-good-tests", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-to-design-good-tests", - "title": "Unit testing in R", - "section": "How to design good tests", - "text": "How to design good tests\na non-exhaustive 
list\n\nconsider all the functions arguments,\nwhat are the expected values for these arguments?\nwhat are unexpected values, and are they handled?\nare there edge cases that need to be handled?\nhave you covered all of the different paths in your code?\nhave you managed to create tests that check the range of results you expect?" + "objectID": "presentations/2023-05-15_text-mining/index.html#patient-experience-101", + "href": "presentations/2023-05-15_text-mining/index.html#patient-experience-101", + "title": "Text mining of patient experience data", + "section": "Patient experience 101", + "text": "Patient experience 101\n\nTick box scoring is not useful (or accurate)\nText based data is complex and built on human experience\nWe’re not making word clouds!\nWe’re not classifying movie reviews or Reddit posts\nThe tool should enhance, not replace, human understanding\n“A recommendation engine for feedback data”" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#but-why-create-tests", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#but-why-create-tests", - "title": "Unit testing in R", - "section": "But, why create tests?", - "text": "But, why create tests?\nanother non-exhaustive list\n\ngood tests will help you uncover existing issues in your code\nwill defend you from future changes that break existing functionality\nwill alert you to changes in dependencies that may have changed the functionality of your code\ncan act as documentation for other developers" + "objectID": "presentations/2023-05-15_text-mining/index.html#everything-open-all-the-time", + "href": "presentations/2023-05-15_text-mining/index.html#everything-open-all-the-time", + "title": "Text mining of patient experience data", + "section": "Everything open, all the time", + "text": "Everything open, all the time\n\nThis project was coded in the open and is MIT licensed\nEngage with the organisations as we find them\n\nDo they want code or a docker image?\nDo 
they want to fetch their own themes from an API?\nDo they want to use our dashboard?" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-complex-functions", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-complex-functions", - "title": "Unit testing in R", - "section": "Testing complex functions", - "text": "Testing complex functions\n\n\n\nmy_big_function <- function(type) {\n con <- dbConnect(RSQLite::SQLite(), \"data.db\")\n df <- tbl(con, \"data_table\") |>\n collect() |>\n mutate(across(date, lubridate::ymd))\n\n conditions <- read_csv(\n \"conditions.csv\", col_types = \"cc\"\n ) |>\n filter(condition_type == type)\n\n df |>\n semi_join(conditions, by = \"condition\") |>\n count(date) |>\n ggplot(aes(date, n)) +\n geom_line() +\n geom_point()\n}\n\n\nWhere do you even begin to start writing tests for something so complex?\n\n\nNote: to get the code on the left to fit on one page, I skipped including a few library calls\n\nlibrary(tidyverse)\nlibrary(DBI)" + "objectID": "presentations/2023-05-15_text-mining/index.html#phase-1", + "href": "presentations/2023-05-15_text-mining/index.html#phase-1", + "title": "Text mining of patient experience data", + "section": "Phase 1", + "text": "Phase 1\n\n10 categories and moderate performance on criticality analysis\nscikit-learn\nShiny\nReticulate\nR package of Python code" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions", - "title": "Unit testing in R", - "section": "Split the logic into smaller functions", - "text": "Split the logic into smaller functions\nFunction to get the data from the database\n\nget_data_from_sql <- function() {\n con <- dbConnect(RSQLite::SQLite(), \"data.db\")\n tbl(con, \"data_table\") |>\n collect() |>\n mutate(across(date, lubridate::ymd))\n}" + "objectID": 
"presentations/2023-05-15_text-mining/index.html#golem-all-the-things", + "href": "presentations/2023-05-15_text-mining/index.html#golem-all-the-things", + "title": "Text mining of patient experience data", + "section": "Golem all the things!", + "text": "Golem all the things!\n\nOpinionated way of building Shiny\nAllows flexibility in deployed versions using YAML\nAgnostic to deployment\nEmphasises dependency management and testing\nSeparate “reactive” and “business” logic (see the accompanying book)" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-1", - "title": "Unit testing in R", - "section": "Split the logic into smaller functions", - "text": "Split the logic into smaller functions\nFunction to get the relevant conditions\n\nget_conditions <- function(type) {\n read_csv(\n \"conditions.csv\", col_types = \"cc\"\n ) |>\n filter(condition_type == type)\n}" + "objectID": "presentations/2023-05-15_text-mining/index.html#phase-2", + "href": "presentations/2023-05-15_text-mining/index.html#phase-2", + "title": "Text mining of patient experience data", + "section": "Phase 2", + "text": "Phase 2\n\n30-50 categories and excellent criticality performance\nscikit-learn/ BERT\nMore Shiny\nSeparate the code bases\nFastAPI\nInspired by the Royal College of Paediatrics and Child Health API\nDocumentation, documentation, documentation" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-2", - "title": "Unit testing in R", - "section": "Split the logic into smaller functions", - "text": "Split the logic into smaller functions\nFunction to combine the data and create a count by date\n\nsummarise_data <- function(df, conditions) {\n df 
|>\n semi_join(conditions, by = \"condition\") |>\n count(date)\n}" + "objectID": "presentations/2023-05-15_text-mining/index.html#making-it-useful", + "href": "presentations/2023-05-15_text-mining/index.html#making-it-useful", + "title": "Text mining of patient experience data", + "section": "Making it useful", + "text": "Making it useful\n\nAccurately rating low frequency categories\nPer category precision and recall\nSpeed versus accuracy\nRepresenting the thematic structure" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-3", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-3", - "title": "Unit testing in R", - "section": "Split the logic into smaller functions", - "text": "Split the logic into smaller functions\nFunction to generate a plot from the summarised data\n\ncreate_plot <- function(df) {\n df |>\n ggplot(aes(date, n)) +\n geom_line() +\n geom_point()\n}" + "objectID": "presentations/2023-05-15_text-mining/index.html#the-future", + "href": "presentations/2023-05-15_text-mining/index.html#the-future", + "title": "Text mining of patient experience data", + "section": "The future", + "text": "The future\n\nOff the shelf, proprietary data collection systems dominate\nThey often offer bundled analytic products of low quality\nThe DS time can’t and doesn’t want to offer a complete data system\nHow can we best contribute to improving patient experience for patients in the NHS?\n\nIf the patient experience data won’t come to the mountain…" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-4", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#split-the-logic-into-smaller-functions-4", - "title": "Unit testing in R", - "section": "Split the logic into smaller functions", - "text": "Split the logic into smaller functions\nThe original function refactored to use the 
new functions\n\nmy_big_function <- function(type) {\n conditions <- get_conditions(type)\n\n get_data_from_sql() |>\n summarise_data(conditions) |>\n create_plot()\n}\n\n\nThis is going to be significantly easier to test, because we now can verify that the individual components work correctly, rather than having to consider all of the possibilities at once." + "objectID": "presentations/2023-05-15_text-mining/index.html#open-source-ftw", + "href": "presentations/2023-05-15_text-mining/index.html#open-source-ftw", + "title": "Text mining of patient experience data", + "section": "Open source FTW!", + "text": "Open source FTW!\n\nOften individuals in the NHS don’t want private companies to “benefit” from open code\nBut if they make their products better with open code the patients win\nBest practice as code" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\nsummarise_data <- function(df, conditions) {\n df |>\n semi_join(conditions, by = \"condition\") |>\n count(date)\n}" + "objectID": "presentations/2023-05-15_text-mining/index.html#the-projects", + "href": "presentations/2023-05-15_text-mining/index.html#the-projects", + "title": "Text mining of patient experience data", + "section": "The projects", + "text": "The projects\n\nhttps://github.com/CDU-data-science-team/pxtextmining\nhttps://github.com/CDU-data-science-team/experiencesdashboard\nhttps://github.com/CDU-data-science-team/PatientExperience-QDC" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-1", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test 
summarise_data\ntest_that(\"it summarises the data\", {\n # arrange\n \n\n\n\n\n\n\n \n\n \n # act\n \n # assert\n \n})" + "objectID": "presentations/2023-05-15_text-mining/index.html#the-team", + "href": "presentations/2023-05-15_text-mining/index.html#the-team", + "title": "Text mining of patient experience data", + "section": "The team", + "text": "The team\n\nYiWen Hon (Python & Machine learning)\nOluwasegun Apejoye (Shiny)\n\nContact:\n\nchris.beeley1@nhs.net\nhttps://fosstodon.org/@chrisbeeley\n\n\n\n\nview slides at the-strategy-unit.github.io/data_science/presentations" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-2", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-2", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n \n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n \n\n\n\n\n # act\n \n # assert\n \n})\n\nGenerate some random data to build a reasonably sized data frame.\nYou could also create a table manually, but part of the trick of writing good tests for this function is to make it so the dates don’t all have the same count.\nThe reason for this is it’s harder to know for sure that the count worked if every row returns the same value.\nWe don’t need the values to be exactly like they are in the real data, just close enough. Instead of dates, we can use numbers, and instead of actual conditions, we can use letters." + "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#section", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#section", + "title": "Conference Check-in App", + "section": "", + "text": "digital.library.unt.edu/ark:/67531/metadc1039451/m1/1/\n\n\nClark, Junebug. 
[Registration Desk for the LPC Conference], photograph, 2016-03-17/2016-03-19; (https://digital.library.unt.edu/ark:/67531/metadc1039451/m1/1/: accessed October 16, 2023), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Special Collections." }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-3", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-3", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n \n\n\n\n\n # act\n \n # assert\n \n})\n\nTests need to be reproducible, and generating our table at random will give us unpredictable results.\nSo, we need to set the random seed; now every time this test runs we will generate the same data." + "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#qr-codes-are-great", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#qr-codes-are-great", + "title": "Conference Check-in App", + "section": "QR codes are great", + "text": "QR codes are great" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-4", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-4", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n \n\n\n\n # act\n \n # assert\n \n})\n\nCreate the conditions table. 
We don’t need all of the columns that are present in the real csv, just the ones that will make our code work.\nWe also need to test that the filtering join (semi_join) is working, so we want to use a subset of the conditions that were used in df." + "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#and-can-be-easily-generated-in-r", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#and-can-be-easily-generated-in-r", + "title": "Conference Check-in App", + "section": "and can be easily generated in R", + "text": "and can be easily generated in R\ninstall.packages(\"qrcode\")\nlibrary(qrcode)\n\nqr_code(\"https://www.youtube.com/watch?v=dQw4w9WgXcQ\")" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-5", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-5", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n \n \n\n \n # act\n actual <- summarise_data(df, conditions)\n # assert\n \n})\n\nBecause we are generating df randomly, to figure out what our “expected” results are, I simply ran the code inside of the test to generate the “actual” results.\nGenerally, this isn’t a good idea. You are creating the results of your test from the code; ideally, you want to be thinking about what the results of your function should be.\nImagine your function doesn’t work as intended, there is some subtle bug that you are not yet aware of. By writing tests “backwards” you may write test cases that confirm the results, but not expose the bug. This is why it’s good to think about edge cases." 
+ "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#why-not", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#why-not", + "title": "Conference Check-in App", + "section": "Why not?", + "text": "Why not?\n\n{shiny} would be doing all the processing on the server side\nwe would need to read from a camera client side\nthen stream video to the server for {shiny} to detect and decode the QR codes" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-6", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-6", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\")) \n expected <- tibble(\n date = 1:10,\n n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)\n ) \n # act\n actual <- summarise_data(df, conditions)\n # assert\n \n})\n\nThat said, in cases where we can be confident (say by static analysis of our code) that it is correct, building tests in this way will give us the confidence going forwards that future changes do not break existing functionality.\nIn this case, I have created the expected data frame using the results from running the function." 
+ "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work", + "title": "Conference Check-in App", + "section": "How does this work?", + "text": "How does this work?\n\n\nFront-end\n\n\nuses the React JavaScript framework\n@yidel/react-qr-scanner\nApp scan’s a QR code, then sends this to our backend\nA window pops up to say who has checked in, or shows an error message" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-7", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#lets-test-summarise_data-7", - "title": "Unit testing in R", - "section": "Let’s test summarise_data", - "text": "Let’s test summarise_data\n\n\n\ntest_that(\"it summarises the data\", {\n # arrange\n set.seed(123)\n df <- tibble(\n date = sample(1:10, 300, TRUE),\n condition = sample(c(\"a\", \"b\", \"c\"), 300, TRUE)\n )\n conditions <- tibble(condition = c(\"a\", \"b\"))\n expected <- tibble(\n date = 1:10,\n n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)\n )\n # act\n actual <- summarise_data(df, conditions)\n # assert\n expect_equal(actual, expected)\n})\n\nTest passed 😸\n\n\n\nThe test works!" 
+ "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-1", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-1", + "title": "Conference Check-in App", + "section": "How does this work?", + "text": "How does this work?\nBack-end\nUses the {plumber} R package to build the API, with endpoints for\n\ngetting the list of all of the attendees for that day\nuploading a list of attendees in bulk\nadding an attendee individually\ngetting an attendee\nchecking the attendee in" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps", - "title": "Unit testing in R", - "section": "Next steps", - "text": "Next steps\n\nYou can add tests to any R project (to test functions),\nBut {testthat} works best with Packages\nThe R Packages book has 3 chapters on testing\nThere are two useful helper functions in {usethis}\n\nuse_testthat() will set up the folders for test scripts\nuse_test() will create a test file for the currently open script" + "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-2", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#how-does-this-work-2", + "title": "Conference Check-in App", + "section": "How does this work?", + "text": "How does this work?\nMore Back-end Stuff\n\nuses a simple SQLite DB that will be thrown away at the end of the conference\nwe send personalised emails using {blastula} to the attendees with their QR codes\nthe QR codes are just random ids (UUIDs) that identify each attendee\nuses websockets to update all of the clients when a user checks in (to update the list of attendees)" }, { - "objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps-1", - "href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#next-steps-1", - "title": "Unit testing in R", - "section": 
"Next steps", - "text": "Next steps\n\nIf your test needs to temporarily create a file, or change some R-options, the {withr} package has a lot of useful functions that will automatically clean things up when the test finishes\nIf you are writing tests that involve calling out to a database, or you want to test my_big_function (from before) without calling the intermediate functions, then you should look at the {mockery} package" + "objectID": "presentations/2023-10-17_conference-check-in-app/index.html#learning-different-tools-can-show-you-the-light", + "href": "presentations/2023-10-17_conference-check-in-app/index.html#learning-different-tools-can-show-you-the-light", + "title": "Conference Check-in App", + "section": "Learning different tools can show you the light", + "text": "Learning different tools can show you the light\n\nunsplash.com/photos/tMGMINwFOtI" }, { "objectID": "presentations/2024-01-25_coffee-and-coding/index.html#targets-for-analysts", diff --git a/site_libs/revealjs/dist/theme/quarto.css b/site_libs/revealjs/dist/theme/quarto.css index a5047c5..31095ab 100644 --- a/site_libs/revealjs/dist/theme/quarto.css +++ b/site_libs/revealjs/dist/theme/quarto.css @@ -5,4 +5,4 @@ * we also add `bright-[color]-` synonyms for the `-[color]-intense` classes since * that seems to be what ansi_up emits * 
-*/.ansi-black-fg{color:#3e424d}.ansi-black-bg{background-color:#3e424d}.ansi-black-intense-black,.ansi-bright-black-fg{color:#282c36}.ansi-black-intense-black,.ansi-bright-black-bg{background-color:#282c36}.ansi-red-fg{color:#e75c58}.ansi-red-bg{background-color:#e75c58}.ansi-red-intense-red,.ansi-bright-red-fg{color:#b22b31}.ansi-red-intense-red,.ansi-bright-red-bg{background-color:#b22b31}.ansi-green-fg{color:#00a250}.ansi-green-bg{background-color:#00a250}.ansi-green-intense-green,.ansi-bright-green-fg{color:#007427}.ansi-green-intense-green,.ansi-bright-green-bg{background-color:#007427}.ansi-yellow-fg{color:#ddb62b}.ansi-yellow-bg{background-color:#ddb62b}.ansi-yellow-intense-yellow,.ansi-bright-yellow-fg{color:#b27d12}.ansi-yellow-intense-yellow,.ansi-bright-yellow-bg{background-color:#b27d12}.ansi-blue-fg{color:#208ffb}.ansi-blue-bg{background-color:#208ffb}.ansi-blue-intense-blue,.ansi-bright-blue-fg{color:#0065ca}.ansi-blue-intense-blue,.ansi-bright-blue-bg{background-color:#0065ca}.ansi-magenta-fg{color:#d160c4}.ansi-magenta-bg{background-color:#d160c4}.ansi-magenta-intense-magenta,.ansi-bright-magenta-fg{color:#a03196}.ansi-magenta-intense-magenta,.ansi-bright-magenta-bg{background-color:#a03196}.ansi-cyan-fg{color:#60c6c8}.ansi-cyan-bg{background-color:#60c6c8}.ansi-cyan-intense-cyan,.ansi-bright-cyan-fg{color:#258f8f}.ansi-cyan-intense-cyan,.ansi-bright-cyan-bg{background-color:#258f8f}.ansi-white-fg{color:#c5c1b4}.ansi-white-bg{background-color:#c5c1b4}.ansi-white-intense-white,.ansi-bright-white-fg{color:#a1a6b2}.ansi-white-intense-white,.ansi-bright-white-bg{background-color:#a1a6b2}.ansi-default-inverse-fg{color:#fff}.ansi-default-inverse-bg{background-color:#000}.ansi-bold{font-weight:bold}.ansi-underline{text-decoration:underline}:root{--quarto-body-bg: #f5f4f3;--quarto-body-color: #635a54;--quarto-text-muted: #b0a7a0;--quarto-border-color: #f5f4f4;--quarto-border-width: 1px;--quarto-border-radius: 
4px}table.gt_table{color:var(--quarto-body-color);font-size:1em;width:100%;background-color:rgba(0,0,0,0);border-top-width:inherit;border-bottom-width:inherit;border-color:var(--quarto-border-color)}table.gt_table th.gt_column_spanner_outer{color:var(--quarto-body-color);background-color:rgba(0,0,0,0);border-top-width:inherit;border-bottom-width:inherit;border-color:var(--quarto-border-color)}table.gt_table th.gt_col_heading{color:var(--quarto-body-color);font-weight:bold;background-color:rgba(0,0,0,0)}table.gt_table thead.gt_col_headings{border-bottom:1px solid currentColor;border-top-width:inherit;border-top-color:var(--quarto-border-color)}table.gt_table thead.gt_col_headings:not(:first-child){border-top-width:1px;border-top-color:var(--quarto-border-color)}table.gt_table td.gt_row{border-bottom-width:1px;border-bottom-color:var(--quarto-border-color);border-top-width:0px}table.gt_table tbody.gt_table_body{border-top-width:1px;border-bottom-width:1px;border-bottom-color:var(--quarto-border-color);border-top-color:currentColor}div.columns{display:initial;gap:initial}div.column{display:inline-block;overflow-x:initial;vertical-align:top;width:50%}.code-annotation-tip-content{word-wrap:break-word}.code-annotation-container-hidden{display:none !important}dl.code-annotation-container-grid{display:grid;grid-template-columns:min-content auto}dl.code-annotation-container-grid dt{grid-column:1}dl.code-annotation-container-grid dd{grid-column:2}pre.sourceCode.code-annotation-code{padding-right:0}code.sourceCode .code-annotation-anchor{z-index:100;position:relative;float:right;background-color:rgba(0,0,0,0)}input[type=checkbox]{margin-right:.5ch}:root{--mermaid-bg-color: #f5f4f3;--mermaid-edge-color: #999;--mermaid-node-fg-color: #635a54;--mermaid-fg-color: #635a54;--mermaid-fg-color--lighter: #7f746b;--mermaid-fg-color--lightest: #988d85;--mermaid-font-family: Source Sans Pro, Helvetica, sans-serif;--mermaid-label-bg-color: #f5f4f3;--mermaid-label-fg-color: 
#468;--mermaid-node-bg-color: rgba(68, 102, 136, 0.1);--mermaid-node-fg-color: #635a54}@media print{:root{font-size:11pt}#quarto-sidebar,#TOC,.nav-page{display:none}.page-columns .content{grid-column-start:page-start}.fixed-top{position:relative}.panel-caption,.figure-caption,figcaption{color:#666}}.code-copy-button{position:absolute;top:0;right:0;border:0;margin-top:5px;margin-right:5px;background-color:rgba(0,0,0,0);z-index:3}.code-copy-button:focus{outline:none}.code-copy-button-tooltip{font-size:.75em}pre.sourceCode:hover>.code-copy-button>.bi::before{display:inline-block;height:1rem;width:1rem;content:"";vertical-align:-0.125em;background-image:url('data:image/svg+xml,');background-repeat:no-repeat;background-size:1rem 1rem}pre.sourceCode:hover>.code-copy-button-checked>.bi::before{background-image:url('data:image/svg+xml,')}pre.sourceCode:hover>.code-copy-button:hover>.bi::before{background-image:url('data:image/svg+xml,')}pre.sourceCode:hover>.code-copy-button-checked:hover>.bi::before{background-image:url('data:image/svg+xml,')}.panel-tabset [role=tablist]{border-bottom:1px solid #f5f4f4;list-style:none;margin:0;padding:0;width:100%}.panel-tabset [role=tablist] *{-webkit-box-sizing:border-box;box-sizing:border-box}@media(min-width: 30em){.panel-tabset [role=tablist] li{display:inline-block}}.panel-tabset [role=tab]{border:1px solid rgba(0,0,0,0);border-top-color:#f5f4f4;display:block;padding:.5em 1em;text-decoration:none}@media(min-width: 30em){.panel-tabset [role=tab]{border-top-color:rgba(0,0,0,0);display:inline-block;margin-bottom:-1px}}.panel-tabset [role=tab][aria-selected=true]{background-color:#f5f4f4}@media(min-width: 30em){.panel-tabset [role=tab][aria-selected=true]{background-color:rgba(0,0,0,0);border:1px solid #f5f4f4;border-bottom-color:#f5f4f3}}@media(min-width: 30em){.panel-tabset [role=tab]:hover:not([aria-selected=true]){border:1px solid #f5f4f4}}.code-with-filename 
.code-with-filename-file{margin-bottom:0;padding-bottom:2px;padding-top:2px;padding-left:.7em;border:var(--quarto-border-width) solid var(--quarto-border-color);border-radius:var(--quarto-border-radius);border-bottom:0;border-bottom-left-radius:0%;border-bottom-right-radius:0%}.code-with-filename div.sourceCode,.reveal .code-with-filename div.sourceCode{margin-top:0;border-top-left-radius:0%;border-top-right-radius:0%}.code-with-filename .code-with-filename-file pre{margin-bottom:0}.code-with-filename .code-with-filename-file{background-color:rgba(219,219,219,.8)}.quarto-dark .code-with-filename .code-with-filename-file{background-color:#555}.code-with-filename .code-with-filename-file strong{font-weight:400}.reveal.center .slide aside,.reveal.center .slide div.aside{position:initial}section.has-light-background,section.has-light-background h1,section.has-light-background h2,section.has-light-background h3,section.has-light-background h4,section.has-light-background h5,section.has-light-background h6{color:#222}section.has-light-background a,section.has-light-background a:hover{color:#2a76dd}section.has-light-background code{color:#4758ab}section.has-dark-background,section.has-dark-background h1,section.has-dark-background h2,section.has-dark-background h3,section.has-dark-background h4,section.has-dark-background h5,section.has-dark-background h6{color:#fff}section.has-dark-background a,section.has-dark-background a:hover{color:#42affa}section.has-dark-background code{color:#ffa07a}#title-slide,div.reveal div.slides section.quarto-title-block{text-align:center}#title-slide .subtitle,div.reveal div.slides section.quarto-title-block .subtitle{margin-bottom:2.5rem}.reveal .slides{text-align:left}.reveal .title-slide h1{font-size:1.6em}.reveal[data-navigation-mode=linear] .title-slide h1{font-size:2.5em}.reveal div.sourceCode{border:1px solid #f5f4f4;border-radius:4px}.reveal 
pre{width:100%;box-shadow:none;background-color:#f5f4f3;border:none;margin:0;font-size:.55em}.reveal .code-with-filename .code-with-filename-file pre{background-color:unset}.reveal code{color:var(--quarto-hl-fu-color);background-color:rgba(0,0,0,0);white-space:pre-wrap}.reveal pre.sourceCode code{background-color:#f5f4f3;padding:6px 9px;max-height:500px;white-space:pre}.reveal pre code{background-color:#f5f4f3;color:#635a54}.reveal .column-output-location{display:flex;align-items:stretch}.reveal .column-output-location .column:first-of-type div.sourceCode{height:100%;background-color:#f5f4f3}.reveal blockquote{display:block;position:relative;color:#b0a7a0;width:unset;margin:var(--r-block-margin) auto;padding:.625rem 1.75rem;border-left:.25rem solid #b0a7a0;font-style:normal;background:none;box-shadow:none}.reveal blockquote p:first-child,.reveal blockquote p:last-child{display:block}.reveal .slide aside,.reveal .slide div.aside{position:absolute;bottom:20px;font-size:0.7em;color:#b0a7a0}.reveal .slide sup{font-size:0.7em}.reveal .slide.scrollable aside,.reveal .slide.scrollable div.aside{position:relative;margin-top:1em}.reveal .slide aside .aside-footnotes{margin-bottom:0}.reveal .slide aside .aside-footnotes li:first-of-type{margin-top:0}.reveal .layout-sidebar{display:flex;width:100%;margin-top:.8em}.reveal .layout-sidebar .panel-sidebar{width:270px}.reveal .layout-sidebar-left .panel-sidebar{margin-right:calc(0.5em*2)}.reveal .layout-sidebar-right .panel-sidebar{margin-left:calc(0.5em*2)}.reveal .layout-sidebar .panel-fill,.reveal .layout-sidebar .panel-center,.reveal .layout-sidebar .panel-tabset{flex:1}.reveal .panel-input,.reveal .panel-sidebar{font-size:.5em;padding:.5em;border-style:solid;border-color:#f5f4f4;border-width:1px;border-radius:4px;background-color:#f8f9fa}.reveal .panel-sidebar :first-child,.reveal .panel-fill :first-child{margin-top:0}.reveal .panel-sidebar :last-child,.reveal .panel-fill 
:last-child{margin-bottom:0}.panel-input>div,.panel-input>div>div{vertical-align:middle;padding-right:1em}.reveal p,.reveal .slides section,.reveal .slides section>section{line-height:1.3}.reveal.smaller .slides section,.reveal .slides section.smaller,.reveal .slides section .callout{font-size:0.7em}.reveal.smaller .slides section section{font-size:inherit}.reveal.smaller .slides h1,.reveal .slides section.smaller h1{font-size:calc(2.5em/0.7)}.reveal.smaller .slides h2,.reveal .slides section.smaller h2{font-size:calc(1.6em/0.7)}.reveal.smaller .slides h3,.reveal .slides section.smaller h3{font-size:calc(1.3em/0.7)}.reveal .columns>.column>:not(ul,ol){margin-left:.25em;margin-right:.25em}.reveal .columns>.column:first-child>:not(ul,ol){margin-right:.5em;margin-left:0}.reveal .columns>.column:last-child>:not(ul,ol){margin-right:0;margin-left:.5em}.reveal .slide-number{color:#7d9dcf;background-color:#f5f4f3}.reveal .footer{color:#b0a7a0}.reveal .footer a{color:#5881c1}.reveal .footer.has-dark-background{color:#fff}.reveal .footer.has-dark-background a{color:#7bc6fa}.reveal .footer.has-light-background{color:#505050}.reveal .footer.has-light-background a{color:#6a9bdd}.reveal .slide-number{color:#b0a7a0}.reveal .slide-number.has-dark-background{color:#fff}.reveal .slide-number.has-light-background{color:#505050}.reveal .slide figure>figcaption,.reveal .slide img.stretch+p.caption,.reveal .slide img.r-stretch+p.caption{font-size:0.7em}@media screen and (min-width: 500px){.reveal .controls[data-controls-layout=edges] .navigate-left{left:.2em}.reveal .controls[data-controls-layout=edges] .navigate-right{right:.2em}.reveal .controls[data-controls-layout=edges] .navigate-up{top:.4em}.reveal .controls[data-controls-layout=edges] .navigate-down{bottom:2.3em}}.tippy-box[data-theme~=light-border]{background-color:#f5f4f3;color:#635a54;border-radius:4px;border:solid 1px #b0a7a0;font-size:.6em}.tippy-box[data-theme~=light-border] 
.tippy-arrow{color:#b0a7a0}.tippy-box[data-placement^=bottom]>.tippy-content{padding:7px 10px;z-index:1}.reveal .callout.callout-style-simple .callout-body,.reveal .callout.callout-style-default .callout-body,.reveal .callout.callout-style-simple div.callout-title,.reveal .callout.callout-style-default div.callout-title{font-size:inherit}.reveal .callout.callout-style-default .callout-icon::before,.reveal .callout.callout-style-simple .callout-icon::before{height:2rem;width:2rem;background-size:2rem 2rem}.reveal .callout.callout-titled .callout-title p{margin-top:.5em}.reveal .callout.callout-titled .callout-icon::before{margin-top:1rem}.reveal .callout.callout-titled .callout-body>.callout-content>:last-child{margin-bottom:1rem}.reveal .panel-tabset [role=tab]{padding:.25em .7em}.reveal .slide-menu-button .fa-bars::before{background-image:url('data:image/svg+xml,')}.reveal .slide-chalkboard-buttons .fa-easel2::before{background-image:url('data:image/svg+xml,')}.reveal .slide-chalkboard-buttons .fa-brush::before{background-image:url('data:image/svg+xml,')}/*! 
light */.reveal ol[type=a]{list-style-type:lower-alpha}.reveal ol[type=a s]{list-style-type:lower-alpha}.reveal ol[type=A s]{list-style-type:upper-alpha}.reveal ol[type=i]{list-style-type:lower-roman}.reveal ol[type=i s]{list-style-type:lower-roman}.reveal ol[type=I s]{list-style-type:upper-roman}.reveal ol[type="1"]{list-style-type:decimal}.reveal ul.task-list{list-style:none}.reveal ul.task-list li input[type=checkbox]{width:2em;height:2em;margin:0 1em .5em -1.6em;vertical-align:middle}div.cell-output-display div.pagedtable-wrapper table.table{font-size:.6em}.reveal .code-annotation-container-hidden{display:none}.reveal code.sourceCode button.code-annotation-anchor,.reveal code.sourceCode .code-annotation-anchor{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;color:var(--quarto-hl-co-color);border:solid var(--quarto-hl-co-color) 1px;border-radius:50%;font-size:.7em;line-height:1.2em;margin-top:2px;user-select:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;-o-user-select:none}.reveal code.sourceCode button.code-annotation-anchor{cursor:pointer}.reveal code.sourceCode a.code-annotation-anchor{text-align:center;vertical-align:middle;text-decoration:none;cursor:default;height:1.2em;width:1.2em}.reveal code.sourceCode.fragment a.code-annotation-anchor{left:auto}.reveal #code-annotation-line-highlight-gutter{width:100%;border-top:solid var(--quarto-hl-co-color) 1px;border-bottom:solid var(--quarto-hl-co-color) 1px;z-index:2}.reveal #code-annotation-line-highlight{margin-left:-8em;width:calc(100% + 4em);border-top:solid var(--quarto-hl-co-color) 1px;border-bottom:solid var(--quarto-hl-co-color) 1px;z-index:2;margin-bottom:-2px}.reveal code.sourceCode .code-annotation-anchor.code-annotation-active{background-color:var(--quarto-hl-normal-color, #aaaaaa);border:solid var(--quarto-hl-normal-color, #aaaaaa) 1px;color:#f5f4f3;font-weight:bolder}.reveal 
pre.code-annotation-code{padding-top:0;padding-bottom:0}.reveal pre.code-annotation-code code{z-index:3;padding-left:0px}.reveal dl.code-annotation-container-grid{margin-left:.1em}.reveal dl.code-annotation-container-grid dt{margin-top:.65rem;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;border:solid #635a54 1px;border-radius:50%;height:1.3em;width:1.3em;line-height:1.3em;font-size:.5em;text-align:center;vertical-align:middle;text-decoration:none}.reveal dl.code-annotation-container-grid dd{margin-left:.25em}.reveal .scrollable ol li:first-child:nth-last-child(n+10),.reveal .scrollable ol li:first-child:nth-last-child(n+10)~li{margin-left:1em}html.print-pdf .reveal .slides .pdf-page:last-child{page-break-after:avoid}.reveal .quarto-title-block .quarto-title-authors{display:flex;justify-content:center}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author{padding-left:.5em;padding-right:.5em}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:hover,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:visited,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:active{color:inherit;text-decoration:none}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-name{margin-bottom:.1rem}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-email{margin-top:0px;margin-bottom:.4em;font-size:.6em}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-orcid img{margin-bottom:4px}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-affiliation{font-size:.7em;margin-top:0px;margin-bottom:8px}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-affiliation:first{margin-top:12px}.reveal .slide 
blockquote{border-left:3px solid #b0a7a0;padding-left:.6em;background:#f6f7f7}.reveal h2{padding-bottom:.3em}.reveal .footer{color:#f9bd07;background-color:#2c2825;display:block;position:fixed;bottom:0px !important;padding-bottom:12px;padding-top:12px;width:100%;text-align:center;font-size:18px;z-index:2}.reveal .slide ul li{list-style:none}.reveal .slide ul li::before{content:"◼";color:#f9bd07;display:inline-block;width:1.5em;margin-left:-1.5em;font-size:66%}.reveal .progress span{background-color:#f9bd07}.reveal .slide-logo{z-index:3;bottom:-5px !important}.slide-background:first-child{background-color:#2c2825;background-image:radial-gradient(#635a54 1%, transparent 5%);background-position:0 0,10px 10px;background-size:30px 30px;background-repeat:repeat;height:100%;width:100%}.slide-background:first-child .slide-background-content{background-image:url("https://the-strategy-unit.github.io/assets/logo_yellow.svg");top:.5em;right:.5em;height:4em;width:4em;position:absolute}#title-slide{text-align:left}#title-slide h1{font-size:2em;color:#f9bd07 !important}#title-slide .subtitle{color:#c7c1bc}#title-slide .quarto-title-authors{justify-content:left;display:block}#title-slide .quarto-title-author{padding-top:1em;color:#ec6555;padding:0}#title-slide .quarto-title-author a{color:#5881c1}#title-slide p.institute{font-size:.75em;color:#c7c1bc}#title-slide p.date{font-size:.5em;color:#988d85}.reveal .slide-logo:first-child{display:none}.inverse .slide-background-content{background-color:#2c2825}.inverse h1,.inverse h2,.inverse h3,.inverse h4{color:#988d85 !important}.reveal .imitate-title{font-size:2em}.reveal .inverse .imitate-title{color:#f9bd07 !important}.reveal .inverse{color:#f5f4f3}.text-bottom{bottom:1em;position:absolute}.no-bullets ul{list-style-type:none;padding:0;margin:0}.small-table 
table{font-size:1.65rem}.small{font-size:1.2rem}.yellow{color:#f9bd07}.light-yellow{color:#fef2ce}.light-charcoal{color:#988d85}.very-light-charcoal{color:#c7c1bc}.center{text-align:center}/*# sourceMappingURL=f95d2bded9c28492b788fe14c3e9f347.css.map */ +*/.ansi-black-fg{color:#3e424d}.ansi-black-bg{background-color:#3e424d}.ansi-black-intense-black,.ansi-bright-black-fg{color:#282c36}.ansi-black-intense-black,.ansi-bright-black-bg{background-color:#282c36}.ansi-red-fg{color:#e75c58}.ansi-red-bg{background-color:#e75c58}.ansi-red-intense-red,.ansi-bright-red-fg{color:#b22b31}.ansi-red-intense-red,.ansi-bright-red-bg{background-color:#b22b31}.ansi-green-fg{color:#00a250}.ansi-green-bg{background-color:#00a250}.ansi-green-intense-green,.ansi-bright-green-fg{color:#007427}.ansi-green-intense-green,.ansi-bright-green-bg{background-color:#007427}.ansi-yellow-fg{color:#ddb62b}.ansi-yellow-bg{background-color:#ddb62b}.ansi-yellow-intense-yellow,.ansi-bright-yellow-fg{color:#b27d12}.ansi-yellow-intense-yellow,.ansi-bright-yellow-bg{background-color:#b27d12}.ansi-blue-fg{color:#208ffb}.ansi-blue-bg{background-color:#208ffb}.ansi-blue-intense-blue,.ansi-bright-blue-fg{color:#0065ca}.ansi-blue-intense-blue,.ansi-bright-blue-bg{background-color:#0065ca}.ansi-magenta-fg{color:#d160c4}.ansi-magenta-bg{background-color:#d160c4}.ansi-magenta-intense-magenta,.ansi-bright-magenta-fg{color:#a03196}.ansi-magenta-intense-magenta,.ansi-bright-magenta-bg{background-color:#a03196}.ansi-cyan-fg{color:#60c6c8}.ansi-cyan-bg{background-color:#60c6c8}.ansi-cyan-intense-cyan,.ansi-bright-cyan-fg{color:#258f8f}.ansi-cyan-intense-cyan,.ansi-bright-cyan-bg{background-color:#258f8f}.ansi-white-fg{color:#c5c1b4}.ansi-white-bg{background-color:#c5c1b4}.ansi-white-intense-white,.ansi-bright-white-fg{color:#a1a6b2}.ansi-white-intense-white,.ansi-bright-white-bg{background-color:#a1a6b2}.ansi-default-inverse-fg{color:#fff}.ansi-default-inverse-bg{background-color:#000}.ansi-bold{font-weight:bold}.ansi-un
derline{text-decoration:underline}:root{--quarto-body-bg: #f5f4f3;--quarto-body-color: #635a54;--quarto-text-muted: #b0a7a0;--quarto-border-color: #f5f4f4;--quarto-border-width: 1px;--quarto-border-radius: 4px}table.gt_table{color:var(--quarto-body-color);font-size:1em;width:100%;background-color:rgba(0,0,0,0);border-top-width:inherit;border-bottom-width:inherit;border-color:var(--quarto-border-color)}table.gt_table th.gt_column_spanner_outer{color:var(--quarto-body-color);background-color:rgba(0,0,0,0);border-top-width:inherit;border-bottom-width:inherit;border-color:var(--quarto-border-color)}table.gt_table th.gt_col_heading{color:var(--quarto-body-color);font-weight:bold;background-color:rgba(0,0,0,0)}table.gt_table thead.gt_col_headings{border-bottom:1px solid currentColor;border-top-width:inherit;border-top-color:var(--quarto-border-color)}table.gt_table thead.gt_col_headings:not(:first-child){border-top-width:1px;border-top-color:var(--quarto-border-color)}table.gt_table td.gt_row{border-bottom-width:1px;border-bottom-color:var(--quarto-border-color);border-top-width:0px}table.gt_table tbody.gt_table_body{border-top-width:1px;border-bottom-width:1px;border-bottom-color:var(--quarto-border-color);border-top-color:currentColor}div.columns{display:initial;gap:initial}div.column{display:inline-block;overflow-x:initial;vertical-align:top;width:50%}.code-annotation-tip-content{word-wrap:break-word}.code-annotation-container-hidden{display:none !important}dl.code-annotation-container-grid{display:grid;grid-template-columns:min-content auto}dl.code-annotation-container-grid dt{grid-column:1}dl.code-annotation-container-grid dd{grid-column:2}pre.sourceCode.code-annotation-code{padding-right:0}code.sourceCode .code-annotation-anchor{z-index:100;position:relative;float:right;background-color:rgba(0,0,0,0)}input[type=checkbox]{margin-right:.5ch}:root{--mermaid-bg-color: #f5f4f3;--mermaid-edge-color: #999;--mermaid-node-fg-color: #635a54;--mermaid-fg-color: 
#635a54;--mermaid-fg-color--lighter: #7f746b;--mermaid-fg-color--lightest: #988d85;--mermaid-font-family: Source Sans Pro, Helvetica, sans-serif;--mermaid-label-bg-color: #f5f4f3;--mermaid-label-fg-color: #468;--mermaid-node-bg-color: rgba(68, 102, 136, 0.1);--mermaid-node-fg-color: #635a54}@media print{:root{font-size:11pt}#quarto-sidebar,#TOC,.nav-page{display:none}.page-columns .content{grid-column-start:page-start}.fixed-top{position:relative}.panel-caption,.figure-caption,figcaption{color:#666}}.code-copy-button{position:absolute;top:0;right:0;border:0;margin-top:5px;margin-right:5px;background-color:rgba(0,0,0,0);z-index:3}.code-copy-button:focus{outline:none}.code-copy-button-tooltip{font-size:.75em}pre.sourceCode:hover>.code-copy-button>.bi::before{display:inline-block;height:1rem;width:1rem;content:"";vertical-align:-0.125em;background-image:url('data:image/svg+xml,');background-repeat:no-repeat;background-size:1rem 1rem}pre.sourceCode:hover>.code-copy-button-checked>.bi::before{background-image:url('data:image/svg+xml,')}pre.sourceCode:hover>.code-copy-button:hover>.bi::before{background-image:url('data:image/svg+xml,')}pre.sourceCode:hover>.code-copy-button-checked:hover>.bi::before{background-image:url('data:image/svg+xml,')}.panel-tabset [role=tablist]{border-bottom:1px solid #f5f4f4;list-style:none;margin:0;padding:0;width:100%}.panel-tabset [role=tablist] *{-webkit-box-sizing:border-box;box-sizing:border-box}@media(min-width: 30em){.panel-tabset [role=tablist] li{display:inline-block}}.panel-tabset [role=tab]{border:1px solid rgba(0,0,0,0);border-top-color:#f5f4f4;display:block;padding:.5em 1em;text-decoration:none}@media(min-width: 30em){.panel-tabset [role=tab]{border-top-color:rgba(0,0,0,0);display:inline-block;margin-bottom:-1px}}.panel-tabset [role=tab][aria-selected=true]{background-color:#f5f4f4}@media(min-width: 30em){.panel-tabset [role=tab][aria-selected=true]{background-color:rgba(0,0,0,0);border:1px solid 
#f5f4f4;border-bottom-color:#f5f4f3}}@media(min-width: 30em){.panel-tabset [role=tab]:hover:not([aria-selected=true]){border:1px solid #f5f4f4}}.code-with-filename .code-with-filename-file{margin-bottom:0;padding-bottom:2px;padding-top:2px;padding-left:.7em;border:var(--quarto-border-width) solid var(--quarto-border-color);border-radius:var(--quarto-border-radius);border-bottom:0;border-bottom-left-radius:0%;border-bottom-right-radius:0%}.code-with-filename div.sourceCode,.reveal .code-with-filename div.sourceCode{margin-top:0;border-top-left-radius:0%;border-top-right-radius:0%}.code-with-filename .code-with-filename-file pre{margin-bottom:0}.code-with-filename .code-with-filename-file{background-color:rgba(219,219,219,.8)}.quarto-dark .code-with-filename .code-with-filename-file{background-color:#555}.code-with-filename .code-with-filename-file strong{font-weight:400}.reveal.center .slide aside,.reveal.center .slide div.aside{position:initial}section.has-light-background,section.has-light-background h1,section.has-light-background h2,section.has-light-background h3,section.has-light-background h4,section.has-light-background h5,section.has-light-background h6{color:#222}section.has-light-background a,section.has-light-background a:hover{color:#2a76dd}section.has-light-background code{color:#4758ab}section.has-dark-background,section.has-dark-background h1,section.has-dark-background h2,section.has-dark-background h3,section.has-dark-background h4,section.has-dark-background h5,section.has-dark-background h6{color:#fff}section.has-dark-background a,section.has-dark-background a:hover{color:#42affa}section.has-dark-background code{color:#ffa07a}#title-slide,div.reveal div.slides section.quarto-title-block{text-align:center}#title-slide .subtitle,div.reveal div.slides section.quarto-title-block .subtitle{margin-bottom:2.5rem}.reveal .slides{text-align:left}.reveal .title-slide h1{font-size:1.6em}.reveal[data-navigation-mode=linear] .title-slide 
h1{font-size:2.5em}.reveal div.sourceCode{border:1px solid #f5f4f4;border-radius:4px}.reveal pre{width:100%;box-shadow:none;background-color:#f5f4f3;border:none;margin:0;font-size:.55em}.reveal .code-with-filename .code-with-filename-file pre{background-color:unset}.reveal code{color:var(--quarto-hl-fu-color);background-color:rgba(0,0,0,0);white-space:pre-wrap}.reveal pre.sourceCode code{background-color:#f5f4f3;padding:6px 9px;max-height:500px;white-space:pre}.reveal pre code{background-color:#f5f4f3;color:#635a54}.reveal .column-output-location{display:flex;align-items:stretch}.reveal .column-output-location .column:first-of-type div.sourceCode{height:100%;background-color:#f5f4f3}.reveal blockquote{display:block;position:relative;color:#b0a7a0;width:unset;margin:var(--r-block-margin) auto;padding:.625rem 1.75rem;border-left:.25rem solid #b0a7a0;font-style:normal;background:none;box-shadow:none}.reveal blockquote p:first-child,.reveal blockquote p:last-child{display:block}.reveal .slide aside,.reveal .slide div.aside{position:absolute;bottom:20px;font-size:0.7em;color:#b0a7a0}.reveal .slide sup{font-size:0.7em}.reveal .slide.scrollable aside,.reveal .slide.scrollable div.aside{position:relative;margin-top:1em}.reveal .slide aside .aside-footnotes{margin-bottom:0}.reveal .slide aside .aside-footnotes li:first-of-type{margin-top:0}.reveal .layout-sidebar{display:flex;width:100%;margin-top:.8em}.reveal .layout-sidebar .panel-sidebar{width:270px}.reveal .layout-sidebar-left .panel-sidebar{margin-right:calc(0.5em*2)}.reveal .layout-sidebar-right .panel-sidebar{margin-left:calc(0.5em*2)}.reveal .layout-sidebar .panel-fill,.reveal .layout-sidebar .panel-center,.reveal .layout-sidebar .panel-tabset{flex:1}.reveal .panel-input,.reveal .panel-sidebar{font-size:.5em;padding:.5em;border-style:solid;border-color:#f5f4f4;border-width:1px;border-radius:4px;background-color:#f8f9fa}.reveal .panel-sidebar :first-child,.reveal .panel-fill :first-child{margin-top:0}.reveal 
.panel-sidebar :last-child,.reveal .panel-fill :last-child{margin-bottom:0}.panel-input>div,.panel-input>div>div{vertical-align:middle;padding-right:1em}.reveal p,.reveal .slides section,.reveal .slides section>section{line-height:1.3}.reveal.smaller .slides section,.reveal .slides section.smaller,.reveal .slides section .callout{font-size:0.7em}.reveal.smaller .slides section section{font-size:inherit}.reveal.smaller .slides h1,.reveal .slides section.smaller h1{font-size:calc(2.5em/0.7)}.reveal.smaller .slides h2,.reveal .slides section.smaller h2{font-size:calc(1.6em/0.7)}.reveal.smaller .slides h3,.reveal .slides section.smaller h3{font-size:calc(1.3em/0.7)}.reveal .columns>.column>:not(ul,ol){margin-left:.25em;margin-right:.25em}.reveal .columns>.column:first-child>:not(ul,ol){margin-right:.5em;margin-left:0}.reveal .columns>.column:last-child>:not(ul,ol){margin-right:0;margin-left:.5em}.reveal .slide-number{color:#7d9dcf;background-color:#f5f4f3}.reveal .footer{color:#b0a7a0}.reveal .footer a{color:#5881c1}.reveal .footer.has-dark-background{color:#fff}.reveal .footer.has-dark-background a{color:#7bc6fa}.reveal .footer.has-light-background{color:#505050}.reveal .footer.has-light-background a{color:#6a9bdd}.reveal .slide-number{color:#b0a7a0}.reveal .slide-number.has-dark-background{color:#fff}.reveal .slide-number.has-light-background{color:#505050}.reveal .slide figure>figcaption,.reveal .slide img.stretch+p.caption,.reveal .slide img.r-stretch+p.caption{font-size:0.7em}@media screen and (min-width: 500px){.reveal .controls[data-controls-layout=edges] .navigate-left{left:.2em}.reveal .controls[data-controls-layout=edges] .navigate-right{right:.2em}.reveal .controls[data-controls-layout=edges] .navigate-up{top:.4em}.reveal .controls[data-controls-layout=edges] .navigate-down{bottom:2.3em}}.tippy-box[data-theme~=light-border]{background-color:#f5f4f3;color:#635a54;border-radius:4px;border:solid 1px #b0a7a0;font-size:.6em}.tippy-box[data-theme~=light-border] 
.tippy-arrow{color:#b0a7a0}.tippy-box[data-placement^=bottom]>.tippy-content{padding:7px 10px;z-index:1}.reveal .callout.callout-style-simple .callout-body,.reveal .callout.callout-style-default .callout-body,.reveal .callout.callout-style-simple div.callout-title,.reveal .callout.callout-style-default div.callout-title{font-size:inherit}.reveal .callout.callout-style-default .callout-icon::before,.reveal .callout.callout-style-simple .callout-icon::before{height:2rem;width:2rem;background-size:2rem 2rem}.reveal .callout.callout-titled .callout-title p{margin-top:.5em}.reveal .callout.callout-titled .callout-icon::before{margin-top:1rem}.reveal .callout.callout-titled .callout-body>.callout-content>:last-child{margin-bottom:1rem}.reveal .panel-tabset [role=tab]{padding:.25em .7em}.reveal .slide-menu-button .fa-bars::before{background-image:url('data:image/svg+xml,')}.reveal .slide-chalkboard-buttons .fa-easel2::before{background-image:url('data:image/svg+xml,')}.reveal .slide-chalkboard-buttons .fa-brush::before{background-image:url('data:image/svg+xml,')}/*! 
light */.reveal ol[type=a]{list-style-type:lower-alpha}.reveal ol[type=a s]{list-style-type:lower-alpha}.reveal ol[type=A s]{list-style-type:upper-alpha}.reveal ol[type=i]{list-style-type:lower-roman}.reveal ol[type=i s]{list-style-type:lower-roman}.reveal ol[type=I s]{list-style-type:upper-roman}.reveal ol[type="1"]{list-style-type:decimal}.reveal ul.task-list{list-style:none}.reveal ul.task-list li input[type=checkbox]{width:2em;height:2em;margin:0 1em .5em -1.6em;vertical-align:middle}div.cell-output-display div.pagedtable-wrapper table.table{font-size:.6em}.reveal .code-annotation-container-hidden{display:none}.reveal code.sourceCode button.code-annotation-anchor,.reveal code.sourceCode .code-annotation-anchor{font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;color:var(--quarto-hl-co-color);border:solid var(--quarto-hl-co-color) 1px;border-radius:50%;font-size:.7em;line-height:1.2em;margin-top:2px;user-select:none;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;-o-user-select:none}.reveal code.sourceCode button.code-annotation-anchor{cursor:pointer}.reveal code.sourceCode a.code-annotation-anchor{text-align:center;vertical-align:middle;text-decoration:none;cursor:default;height:1.2em;width:1.2em}.reveal code.sourceCode.fragment a.code-annotation-anchor{left:auto}.reveal #code-annotation-line-highlight-gutter{width:100%;border-top:solid var(--quarto-hl-co-color) 1px;border-bottom:solid var(--quarto-hl-co-color) 1px;z-index:2}.reveal #code-annotation-line-highlight{margin-left:-8em;width:calc(100% + 4em);border-top:solid var(--quarto-hl-co-color) 1px;border-bottom:solid var(--quarto-hl-co-color) 1px;z-index:2;margin-bottom:-2px}.reveal code.sourceCode .code-annotation-anchor.code-annotation-active{background-color:var(--quarto-hl-normal-color, #aaaaaa);border:solid var(--quarto-hl-normal-color, #aaaaaa) 1px;color:#f5f4f3;font-weight:bolder}.reveal 
pre.code-annotation-code{padding-top:0;padding-bottom:0}.reveal pre.code-annotation-code code{z-index:3;padding-left:0px}.reveal dl.code-annotation-container-grid{margin-left:.1em}.reveal dl.code-annotation-container-grid dt{margin-top:.65rem;font-family:SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;border:solid #635a54 1px;border-radius:50%;height:1.3em;width:1.3em;line-height:1.3em;font-size:.5em;text-align:center;vertical-align:middle;text-decoration:none}.reveal dl.code-annotation-container-grid dd{margin-left:.25em}.reveal .scrollable ol li:first-child:nth-last-child(n+10),.reveal .scrollable ol li:first-child:nth-last-child(n+10)~li{margin-left:1em}html.print-pdf .reveal .slides .pdf-page:last-child{page-break-after:avoid}.reveal .quarto-title-block .quarto-title-authors{display:flex;justify-content:center}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author{padding-left:.5em;padding-right:.5em}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:hover,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:visited,.reveal .quarto-title-block .quarto-title-authors .quarto-title-author a:active{color:inherit;text-decoration:none}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-name{margin-bottom:.1rem}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-email{margin-top:0px;margin-bottom:.4em;font-size:.6em}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-author-orcid img{margin-bottom:4px}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-affiliation{font-size:.7em;margin-top:0px;margin-bottom:8px}.reveal .quarto-title-block .quarto-title-authors .quarto-title-author .quarto-title-affiliation:first{margin-top:12px}.reveal .slide 
blockquote{border-left:3px solid #b0a7a0;padding-left:.6em;background:#f6f7f7}.reveal h2{padding-bottom:.3em}.reveal .footer{color:#f9bd07;background-color:#2c2825;display:block;position:fixed;bottom:0px !important;padding-bottom:12px;padding-top:12px;width:100%;text-align:center;font-size:18px;z-index:2}.reveal .slide ul li{list-style:none}.reveal .slide ul li::before{content:"◼";color:#f9bd07;display:inline-block;width:1.5em;margin-left:-1.5em;font-size:66%}.reveal .progress span{background-color:#f9bd07}.reveal .slide-logo{z-index:3;bottom:-5px !important}.slide-background:first-child{background-color:#2c2825;background-image:radial-gradient(#635a54 1%, transparent 5%);background-position:0 0,10px 10px;background-size:30px 30px;background-repeat:repeat;height:100%;width:100%}.slide-background:first-child .slide-background-content{background-image:url("https://the-strategy-unit.github.io/assets/logo_yellow.svg");top:.5em;right:.5em;height:4em;width:4em;position:absolute}#title-slide{text-align:left}#title-slide h1{font-size:2em;color:#f9bd07 !important}#title-slide .subtitle{color:#c7c1bc}#title-slide .quarto-title-authors{justify-content:left;display:block}#title-slide .quarto-title-author{padding-top:1em;color:#ec6555;padding:0}#title-slide .quarto-title-author a{color:#5881c1}#title-slide p.institute{font-size:.75em;color:#c7c1bc}#title-slide p.date{font-size:.5em;color:#988d85}.reveal .slide-logo:first-child{display:none}.inverse .slide-background-content{background-color:#2c2825}.inverse h1,.inverse h2,.inverse h3,.inverse h4{color:#988d85 !important}.reveal .imitate-title{font-size:2em}.reveal .inverse .imitate-title{color:#f9bd07 !important}.reveal .inverse{color:#f5f4f3}.text-bottom{bottom:1em;position:absolute}.no-bullets ul{list-style-type:none;padding:0;margin:0}.small-table 
table{font-size:1.65rem}.small{font-size:1.2rem}.medium{font-size:2rem}.yellow{color:#f9bd07}.light-yellow{color:#fef2ce}.light-charcoal{color:#988d85}.very-light-charcoal{color:#c7c1bc}.center{text-align:center}/*# sourceMappingURL=f95d2bded9c28492b788fe14c3e9f347.css.map */ diff --git a/sitemap.xml b/sitemap.xml index 0d4fcd4..f9cd3ed 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -2,170 +2,174 @@ https://the-strategy-unit.github.io/data_science/blogs/posts/2023-03-21-rstudio-tips/index.html - 2024-10-09T11:09:28.656Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2023-04-26-reinstalling-r-packages.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-01-10-advent-of-code-and-test-driven-development.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2023-03-24_hotfix-with-git.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-05-22-storing-data-safely/index.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-02-28_sankey_plot.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/index.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/style/git_and_github.html - 2024-10-09T11:09:28.720Z + 2024-10-09T12:08:58.136Z https://the-strategy-unit.github.io/data_science/style/data_storage.html - 2024-10-09T11:09:28.716Z + 2024-10-09T12:08:58.136Z https://the-strategy-unit.github.io/data_science/presentations/2023-03-23_coffee-and-coding/index.html - 2024-10-09T11:09:28.668Z + 2024-10-09T12:08:58.088Z https://the-strategy-unit.github.io/data_science/presentations/2024-10-10_what-is-ai-chris/index.html - 2024-10-09T11:09:28.716Z + 
2024-10-09T12:08:58.136Z - https://the-strategy-unit.github.io/data_science/presentations/2023-10-17_conference-check-in-app/index.html - 2024-10-09T11:09:28.692Z + https://the-strategy-unit.github.io/data_science/presentations/2024-10-10_what-is-ai-tom/index.html + 2024-10-09T12:08:58.136Z - https://the-strategy-unit.github.io/data_science/presentations/2023-05-15_text-mining/index.html - 2024-10-09T11:09:28.672Z + https://the-strategy-unit.github.io/data_science/presentations/2023-08-23_nhs-r_unit-testing/index.html + 2024-10-09T12:08:58.092Z - https://the-strategy-unit.github.io/data_science/presentations/2024-08-22_agile-and-scrum/index.html - 2024-10-09T11:09:28.708Z + https://the-strategy-unit.github.io/data_science/presentations/index.html + 2024-10-09T12:08:58.136Z - https://the-strategy-unit.github.io/data_science/presentations/2023-07-11_haca-nhp-demand-model/index.html - 2024-10-09T11:09:28.672Z + https://the-strategy-unit.github.io/data_science/presentations/2024-05-30_open-source-licensing/index.html + 2024-10-09T12:08:58.124Z - https://the-strategy-unit.github.io/data_science/presentations/2023-08-24_coffee-and-coding_geospatial/index.html - 2024-10-09T11:09:28.676Z + https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_midlands-analyst-rap/index.html + 2024-10-09T12:08:58.088Z - https://the-strategy-unit.github.io/data_science/presentations/2023-08-02_mlcsu-ksn-meeting/index.html - 2024-10-09T11:09:28.676Z + https://the-strategy-unit.github.io/data_science/presentations/2024-05-23_github-team-sport/index.html + 2024-10-09T12:08:58.120Z - https://the-strategy-unit.github.io/data_science/presentations/2023-02-23_coffee-and-coding/index.html - 2024-10-09T11:09:28.660Z + https://the-strategy-unit.github.io/data_science/presentations/2024-05-16_store-data-safely/index.html + 2024-10-09T12:08:58.112Z - https://the-strategy-unit.github.io/data_science/presentations/2023-09-07_coffee_and_coding_functions/index.html - 
2024-10-09T11:09:28.676Z + https://the-strategy-unit.github.io/data_science/presentations/2023-03-23_collaborative-working/index.html + 2024-10-09T12:08:58.088Z - https://the-strategy-unit.github.io/data_science/presentations/2024-09-05_earl-nhp/index.html - 2024-10-09T11:09:28.708Z + https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_coffee-and-coding/index.html + 2024-10-09T12:08:58.080Z - https://the-strategy-unit.github.io/data_science/presentations/2023-05-23_data-science-for-good/index.html - 2024-10-09T11:09:28.672Z + https://the-strategy-unit.github.io/data_science/presentations/2024-10-10_what-is-ai-yiwen/index.html + 2024-10-09T12:08:58.136Z https://the-strategy-unit.github.io/data_science/presentations/2023-02-01_what-is-data-science/index.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.080Z - https://the-strategy-unit.github.io/data_science/presentations/2024-10-10_what-is-ai-yiwen/index.html - 2024-10-09T11:09:28.716Z + https://the-strategy-unit.github.io/data_science/presentations/2023-05-23_data-science-for-good/index.html + 2024-10-09T12:08:58.088Z - https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_coffee-and-coding/index.html - 2024-10-09T11:09:28.660Z + https://the-strategy-unit.github.io/data_science/presentations/2024-09-05_earl-nhp/index.html + 2024-10-09T12:08:58.128Z - https://the-strategy-unit.github.io/data_science/presentations/2023-03-23_collaborative-working/index.html - 2024-10-09T11:09:28.672Z + https://the-strategy-unit.github.io/data_science/presentations/2023-09-07_coffee_and_coding_functions/index.html + 2024-10-09T12:08:58.096Z - https://the-strategy-unit.github.io/data_science/presentations/2024-05-16_store-data-safely/index.html - 2024-10-09T11:09:28.696Z + https://the-strategy-unit.github.io/data_science/presentations/2023-02-23_coffee-and-coding/index.html + 2024-10-09T12:08:58.080Z - 
https://the-strategy-unit.github.io/data_science/presentations/2024-05-23_github-team-sport/index.html - 2024-10-09T11:09:28.700Z + https://the-strategy-unit.github.io/data_science/presentations/2023-08-02_mlcsu-ksn-meeting/index.html + 2024-10-09T12:08:58.092Z - https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_midlands-analyst-rap/index.html - 2024-10-09T11:09:28.668Z + https://the-strategy-unit.github.io/data_science/presentations/2023-08-24_coffee-and-coding_geospatial/index.html + 2024-10-09T12:08:58.092Z - https://the-strategy-unit.github.io/data_science/presentations/2024-05-30_open-source-licensing/index.html - 2024-10-09T11:09:28.704Z + https://the-strategy-unit.github.io/data_science/presentations/2023-07-11_haca-nhp-demand-model/index.html + 2024-10-09T12:08:58.088Z - https://the-strategy-unit.github.io/data_science/presentations/index.html - 2024-10-09T11:09:28.716Z + https://the-strategy-unit.github.io/data_science/presentations/2024-08-22_agile-and-scrum/index.html + 2024-10-09T12:08:58.124Z - https://the-strategy-unit.github.io/data_science/presentations/2023-08-23_nhs-r_unit-testing/index.html - 2024-10-09T11:09:28.676Z + https://the-strategy-unit.github.io/data_science/presentations/2023-05-15_text-mining/index.html + 2024-10-09T12:08:58.088Z + + + https://the-strategy-unit.github.io/data_science/presentations/2023-10-17_conference-check-in-app/index.html + 2024-10-09T12:08:58.112Z https://the-strategy-unit.github.io/data_science/presentations/2024-01-25_coffee-and-coding/index.html - 2024-10-09T11:09:28.692Z + 2024-10-09T12:08:58.112Z https://the-strategy-unit.github.io/data_science/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html - 2024-10-09T11:09:28.684Z + 2024-10-09T12:08:58.104Z https://the-strategy-unit.github.io/data_science/about.html - 2024-10-09T11:09:28.656Z + 2024-10-09T12:08:58.072Z https://the-strategy-unit.github.io/data_science/style/project_structure.html - 2024-10-09T11:09:28.720Z + 
2024-10-09T12:08:58.136Z https://the-strategy-unit.github.io/data_science/style/style_guide.html - 2024-10-09T11:09:28.720Z + 2024-10-09T12:08:58.136Z https://the-strategy-unit.github.io/data_science/blogs/index.html - 2024-10-09T11:09:28.656Z + 2024-10-09T12:08:58.072Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-05-22-storing-data-safely/azure_python.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-08-08-map-and-nest/index.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-01-17_nearest_neighbour.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2023-04-26_alternative_remotes.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z https://the-strategy-unit.github.io/data_science/blogs/posts/2024-05-13_one-year-coffee-code.html - 2024-10-09T11:09:28.660Z + 2024-10-09T12:08:58.076Z