Preparing for a statistical data science interview
In this article, I'll cover the steps I followed during my recent application for a Senior Data Scientist role. I hope this helps you organise your own preparation for data science and any other roles 😄
Start with the job requirements
To write a good application and prepare for the interview, you need to know your selling points against the job criteria. Below are criteria for three different Senior Data Scientist roles I found. They all have similarities but can vary in tooling and technology used.
In preparing an application, you want to be ready to stress where the requirements of the role and your strong areas overlap.
Write and submit an application
When it comes to applications, you've either worked as a Data Scientist or you haven't. If you have, showcase your experience and projects. If you haven't, apply data science techniques to solve data problems in your current role or in your own time, and showcase those projects. Either way, whoever is sifting applications will be looking for relevant experience. Give them what they want.
When I've been on the other side of recruitment sifting applications, the number one thing that marks candidates down is not providing relevant evidence of using the skills required for the job they're applying for.
All of the examples in this article are made up and light on detail but the structure and format of them are the same as what I use for real. You'll need to expand on them but they give a solid framework.
For applications, the style should be direct, punchy, quick to scan, easy for a hiring manager to sift and shouldn't include anything that makes you look bad. If you don't have any direct data science experience, show data science techniques you use in your current role. If you don't have any data science examples in your current role, start 'doing data science' in your current role! No one just starts out as a data scientist, but the good news is, you don't have to be a data scientist to apply analytical techniques. The CV and personal statement below cover both angles.
Prepare competency answers using STAR
Also called behavioural questions, these require prepared answers that tell a story about your behaviour. Always try to start sentences with 'I' rather than 'we'. The assessor is interested in your contribution, not what other people did. Approach these questions as if you are telling the assessor how great you are and what an asset you'll be, by proving you've handled tough situations before and delivered strong results. Look at the competencies you'll be assessed on, think about what you've done in the past, and start writing up an answer in the STAR format.
- Situation = briefly outline the context
- Task = briefly outline what you needed to do and why
- Action = go into detail about what you did and your thinking process (why you did those things)
- Result = drive home that as a result of your actions, X was the outcome; quantify results if possible (saved X%)
Here is a short example that follows the STAR format.
For the real thing I would expand on the action paragraphs, adding in:
- Why you chose that analytical method
- What alternatives and options you explored
- How you handled messy data
- How you evaluated the model
- How you updated the model to include new data
- What obstacles and issues you overcame
- How you got buy-in from others
- How you validated your methods
- How you handled conflict and disagreements
- Whether you needed to delegate any tasks
- Whether there was time pressure
- How you prioritised conflicting tasks
- How you avoided burnout while juggling extra responsibilities
- How you ensured standards were high
- How you ensured you were meeting the customer's needs
- How you measured success
- How you disseminated the analysis to non-technical people
- How you deployed and maintained the model
I would spend some time thinking through possible hypothetical questions that might test your understanding of the company, business area or sector. These might include questions like:
- Imagine we gave you data on X, what kind of analysis would you perform on it?
- How would you use data science techniques to improve products and services in our sector?
- We do X analysis here, why do you think that's important for us?
Prepare technical presentation
The main thing for any presentation is to keep it clear and concise. Address any points you're asked to, otherwise stick to the rule of three. Keep it mostly high level, but be prepared to drill into details. I was asked to present a recent analytical project. I don't like PowerPoint but created some slides as talking points:
- Introduction - quick about me then into the problem statement
- Research - how I researched the issue
- Development - how I built the solution
- User journey - to understand the product
- Challenges and solutions - how I overcame obstacles
- Analytical techniques - drill into key data science techniques used
- Launch - releasing a working product or model into production
- Outcomes - the value that was added and success metrics
Review statistical concepts
Statistics underpin almost all of data science. I think data science can even be referred to as statistical learning. So going back to basics can never hurt. I wouldn't get too bogged down during this part, but certainly don't neglect it.
- Descriptive statistics
- Inferential statistics
- Distributions (normal, binomial, Poisson, exponential)
- Sampling (random, stratified, cluster)
- Hypothesis testing
- Statistical significance
- Regression
- Confidence intervals
- Correlation
- P-values
- Probability (Bayes' theorem)
- Bias
- Statistical tests (z-test, t-test, chi-square, ANOVA)
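To make a couple of these concrete, here is a minimal sketch using only Python's standard library. It works through Bayes' theorem and a one-sample z-test. The numbers and the function names (bayes_posterior, one_sample_z_test) are made up for illustration, so treat it as a revision aid rather than a definitive implementation.

```python
from math import sqrt
from statistics import NormalDist, mean


def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) using Bayes' theorem."""
    # Total probability of a positive test (true positives + false positives).
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


def one_sample_z_test(sample, population_mean, population_sd):
    """Return (z, p_value) for a two-sided one-sample z-test."""
    standard_error = population_sd / sqrt(len(sample))
    z = (mean(sample) - population_mean) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positive rate.
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)

# Hypothetical sample tested against an assumed population mean of 50 and sd of 4.
z, p_value = one_sample_z_test([52, 48, 55, 51, 49, 53, 50, 54], 50, 4)

print(f"Posterior: {posterior:.3f}, z: {z:.2f}, p-value: {p_value:.3f}")
```

The Bayes example is the classic interview trap: even with an accurate test, a rare condition means most positives are false positives, so the posterior stays low.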
A great book for brushing up on statistical concepts is Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python. Another extremely useful book I mention later that covers lots of topics including statistics is Ace the Data Science Interview: 201 Real Interview Questions Asked By FAANG, Tech Startups, & Wall Street.
Review machine learning concepts
As with statistical concepts it's always good to refresh your knowledge of machine learning algorithms and when to use each before heading into a data science interview. These can be broadly categorised as:
- Supervised learning - classification, regression algorithms
- Unsupervised learning - clustering algorithms
- Reinforcement learning
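As a quick reminder of the difference between the first two categories, here is a toy sketch in plain Python. It is not how you'd do this for real (you'd reach for a library), and the data, labels and function names are all invented for illustration. The supervised part is a 1-nearest-neighbour classifier, and the unsupervised part is a tiny one-dimensional k-means with two clusters.

```python
from statistics import mean


def nearest_neighbour_label(point, labelled_points):
    """Supervised: predict the label of the closest labelled example."""
    nearest = min(labelled_points, key=lambda pair: abs(pair[0] - point))
    return nearest[1]


def two_means_1d(values, iterations=10):
    """Unsupervised: 1-D k-means with k=2, seeded at the min and max values."""
    centres = [min(values), max(values)]
    for _ in range(iterations):
        groups = ([], [])
        for value in values:
            # Assign each value to whichever centre is closer.
            index = 0 if abs(value - centres[0]) <= abs(value - centres[1]) else 1
            groups[index].append(value)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [mean(g) if g else c for g, c in zip(groups, centres)]
    return centres


# Supervised learning: the training data comes with labels.
labelled = [(1.0, "low"), (1.5, "low"), (8.0, "high"), (9.0, "high")]
prediction = nearest_neighbour_label(7.2, labelled)

# Unsupervised learning: no labels, the algorithm finds the structure itself.
centres = two_means_1d([1.0, 1.2, 0.8, 9.0, 9.5, 10.1])

print(prediction, centres)
```

The key thing to be able to say in an interview is that the classifier needed labels to learn from, while the clustering found the two groups without any.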
A book I refer back to again and again on machine learning is Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.
Complete a practice take-home test
Although I don't really agree with take-home tests or coding challenges, they exist, and if I find a role I really want with a test attached, I'll consider it based on the time investment. A test might mean spending significant time brushing up on concepts you've not used in a while. Nevertheless, putting aside the LeetCode-style coding challenges covered in my coding interview topics in Python article, I figure a data science test will be one of two things: an analytical project (tell us something interesting about this data) or a modelling project (predict X outcome with this data, or model this data to calculate X outcome).
For an analytical project I'd use Jupyter Notebook or JupyterLab, and for a modelling project I'd use Visual Studio Code with the cookiecutter package for good code organisation. This makes creating a new machine learning project as easy as:
pip install cookiecutter
cookiecutter https://github.com/drivendata/cookiecutter-data-science
I found a Reddit post linking to a practice case study with code applicable to any analytical role - and with some modifications to a data science role. This is the kind of thing any take-home might look like:
Build a simple model based on insights you've found and describe how its predictions add value to the company. Present the model you fitted, why you chose it, explain the model as if speaking to a non-technical audience and how the predictions could have an impact on the business processes going forward.
I have also come across statistical and numerical aptitude tests, but they are usually only used at entry level or for large recruitment campaigns. For the last numerical aptitude test I had, I used Assessment Day to prepare. These tests are usually fast-paced, around one minute per question, so the winning formula is: read the question, look for the specific data the question is talking about, perform calculations, sense check, select the answer, then move on (you don't have time to double check your answer). For statistical tests, here is a good practice test with answers at the end.
A book that guided me through the case study and coding aspects of data science interviews is Ace the Data Science Interview: 201 Real Interview Questions Asked By FAANG, Tech Startups, & Wall Street. This is an invaluable resource considering the sheer breadth of its contents. I took a risk on this one and decided to try and get it delivered ASAP before my interview. It was just what I needed and covers almost all the topics you'd need to be aware of. This includes chapters on probability, statistics, machine learning, coding / data structures and algorithms, SQL and database design, product sense and case studies.
On the day
The main things you should do on the day of the interview include:
- Stay relaxed!
- Be yourself
- Enjoy the process as much as possible
- Let your passion for data science show
- Tell them how amazing you are in your answers (not arrogant but confident)
- Like in the application say 'I' not 'we' (they are interested in your actions)
- Remember the STAR format (it will keep your answers on track)
- Remember the data science lifecycle (it will give your technical answers a clear structure)
- Ask questions (you're interviewing them too!)
- Don't be afraid to say if you don't know something (How would you learn about it?)
After the interview
Well done! You did it! The interview is over and you can breathe a sigh of relief. My last major tip might not be what you're expecting... Now that the interview is over, write down the questions you were asked (to practise in future) and then forget about the interview completely! Don't dwell on things that you could have said, mistakes you think you made, or even what went well. Just consign it to the history books.
Yes, celebrate that it's over and done with and that you gave it a solid effort, but expect the answer to be 'sorry we went with another candidate, but we thought you were great'. Expect the worst, hope for the best. By doing this, you'll force yourself to view applications and interviews as opportunities and won't over-invest yourself emotionally in them. Everyone fails interviews for many different reasons. If you did great, it's their loss. If you stumbled, see it as practice and improve for next time. The key to getting what you want in anything is to never stop trying, failing, then improving, then trying again.
I hope this article helped you prepare for your own data science interview and wish you the best of luck with it!
If you enjoyed this article, be sure to check out other articles on the site.