Prepare to Demonstrate these Technical Skills

These are the essential skills a Machine Learning Engineer needs, as mentioned in the first video of this lesson. Each group lists topics you should be familiar with.

Study Tip: Copy and paste this list into a document and save it to your computer for easy reference.


Computer Science Fundamentals and Programming

Topics:

  • Data structures: Lists, stacks, queues, strings, hash maps, vectors, matrices, classes & objects, trees, graphs, etc.
  • Algorithms: Recursion, searching, sorting, optimization, dynamic programming, etc.
  • Computability and complexity: P vs. NP, NP-complete problems, big-O notation, approximation algorithms, etc.
  • Computer architecture: Memory, cache, bandwidth, threads & processes, deadlocks, etc.
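
To make the recursion, dynamic programming, and big-O topics above concrete, here is a minimal sketch that computes Fibonacci numbers three ways. The function names are illustrative only, and the example is one possible practice exercise rather than anything prescribed by the lesson.

```python
from functools import lru_cache


def fib_naive(n: int) -> int:
    """Plain recursion: the number of calls grows exponentially with n."""
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    """Top-down dynamic programming: each subproblem is solved once, O(n) calls."""
    if n < 2:
        return n
    return fib_memo(n - 1) + fib_memo(n - 2)


def fib_bottom_up(n: int) -> int:
    """Bottom-up dynamic programming: O(n) time, O(1) extra space."""
    prev, curr = 0, 1
    for _ in range(n):
        prev, curr = curr, prev + curr
    return prev


if __name__ == "__main__":
    print(fib_naive(10), fib_memo(50), fib_bottom_up(50))
```

Calling `fib_naive(50)` would take impractically long, while the two dynamic programming versions return almost immediately. That gap is the kind of complexity difference you should be able to explain.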

Probability and Statistics

Topics:

  • Basic probability: Conditional probability, Bayes' rule, likelihood, independence, etc.
  • Probabilistic models: Bayes Nets, Markov Decision Processes, Hidden Markov Models, etc.
  • Statistical measures: Mean, median, mode, variance, population parameters vs. sample statistics, etc.
  • Proximity and error metrics: Cosine similarity, mean-squared error, Manhattan and Euclidean distance, log-loss, etc.
  • Distributions and random sampling: Uniform, normal, binomial, Poisson, etc.
  • Analysis methods: ANOVA, hypothesis testing, factor analysis, etc.
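
The proximity and error metrics listed above are short enough to write out by hand, which is a useful way to check that you know their definitions. The sketch below uses only the standard library. The function names and the `eps` clipping value in `log_loss` are choices made for this example, not taken from the lesson.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (both must be nonzero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance (L2 distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += t * math.log(p) + (1 - t) * math.log(1 - p)
    return -total / len(y_true)


if __name__ == "__main__":
    print(euclidean_distance([0, 0], [3, 4]))
    print(manhattan_distance([0, 0], [3, 4]))
    print(log_loss([1, 0], [0.5, 0.5]))
```

In practice you would likely reach for a library implementation, but being able to reproduce these from memory helps when discussing which metric suits which problem.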

Data Modeling and Evaluation

Topics:

  • Data preprocessing: Munging/wrangling, transforming, aggregating, etc.
  • Pattern recognition: Correlations, clusters, trends, outliers & anomalies, etc.
  • Dimensionality reduction: Eigenvectors, Principal Component Analysis, etc.
  • Prediction: Classification, regression, sequence prediction, etc.; suitable error/accuracy metrics.
  • Evaluation: Training-testing split, sequential vs. randomized cross-validation, etc.
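
The difference between randomized and sequential evaluation, mentioned in the last bullet above, can be sketched with index lists alone. This is a minimal illustration under assumed conventions: the helper names, the default `test_fraction`, and the expanding-window scheme are choices for this example, and other sequential schemes exist.

```python
import random


def randomized_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle indices, then hold out a fraction for testing.

    Suitable when samples are independent of their order.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def sequential_folds(n_samples, n_folds):
    """Expanding-window folds for ordered data such as time series.

    Every training index precedes every test index in the same fold,
    so the model is never evaluated on data older than what it saw.
    """
    fold_size = n_samples // (n_folds + 1)
    folds = []
    for k in range(1, n_folds + 1):
        train = list(range(0, k * fold_size))
        test_end = (k + 1) * fold_size if k < n_folds else n_samples
        test = list(range(k * fold_size, test_end))
        folds.append((train, test))
    return folds


if __name__ == "__main__":
    print(randomized_split(10))
    for train, test in sequential_folds(10, 4):
        print(train, test)
```

Shuffling ordered data before splitting lets information from the future leak into training, which is one reason the sequential variant matters for sequence prediction tasks.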

Applying Machine Learning Algorithms and Libraries

Topics:

  • Models: Parametric vs. nonparametric, decision tree, nearest neighbor, neural net, support vector machine, ensemble of multiple models, etc.
  • Learning procedure: Linear regression, gradient descent, genetic algorithms, bagging, boosting, and other model-specific methods; regularization, hyperparameter tuning, etc.
  • Tradeoffs and gotchas: Relative advantages and disadvantages, bias and variance, overfitting and underfitting, vanishing/exploding gradients, missing data, data leakage, etc.
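
As a small worked example of the learning-procedure topics above, the sketch below fits a one-variable linear regression by batch gradient descent, with an optional L2 penalty on the slope as a simple form of regularization. The function name, learning rate, epoch count, and toy data are all assumptions chosen for illustration.

```python
def fit_line_gd(xs, ys, lr=0.05, epochs=2000, l2=0.0):
    """Fit y ≈ w * x + b by minimizing mean squared error + l2 * w**2.

    Uses full-batch gradient descent; the intercept b is not penalized.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y
            grad_w += 2 * err * x / n   # d(MSE)/dw
            grad_b += 2 * err / n       # d(MSE)/db
        grad_w += 2 * l2 * w            # gradient of the L2 penalty
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2 * x + 1 for x in xs]        # noiseless toy data: y = 2x + 1
    print(fit_line_gd(xs, ys))          # unregularized fit
    print(fit_line_gd(xs, ys, l2=1.0))  # slope shrinks toward zero
```

On this toy data the unregularized fit should approach w ≈ 2 and b ≈ 1, while a positive `l2` pulls the slope below 2. Setting `lr` too high makes the updates diverge, which is worth trying as an exercise in hyperparameter tuning.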

Software Engineering and System Design

Topics:

  • Software interface: Library calls, REST APIs, data collection endpoints, database queries, etc.
  • User interface: Capturing user inputs & application events, displaying results & visualization, etc.
  • Scalability: Map-reduce, distributed processing, etc.
  • Deployment: Cloud hosting, containers & instances, microservices, etc.
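
For the scalability topic above, the map-reduce pattern can be sketched in a single process: a map step turns each chunk of input into a partial result, and a reduce step merges partial results pairwise. The word-count task and the function names below are illustrative assumptions. A real distributed framework would run the map step on separate workers.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk: str) -> Counter:
    """Map: count the words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())


def reduce_phase(left: Counter, right: Counter) -> Counter:
    """Reduce: merge two partial counts; the merge is associative."""
    return left + right


def word_count(chunks) -> Counter:
    """Apply the map step to every chunk, then fold the partial results together."""
    return reduce(reduce_phase, map(map_phase, chunks), Counter())


if __name__ == "__main__":
    print(word_count(["the cat sat", "the dog sat down"]))
```

Because each chunk is mapped independently and the merge is associative, the work can be split across machines and combined in any grouping, which is what makes the pattern scale.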