Big Data I – Tools of The Trade

So who’s employing data scientists in 2018 and what are they after?

Let’s analyse 10 US, 10 UK and 10 Australian data science jobs from indeed.com and tally the desired data science skills and tools (software).
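The tally itself is simple to sketch in Python. The snippet below is a minimal illustration, not the notebook used for this post: the keyword list, the function name `tally_keywords` and the sample postings are all made up, and it assumes each job posting is available as a plain-text string. Each posting is counted at most once per keyword.

```python
import re
from collections import Counter

# Hypothetical keyword list; the categories used in this post may differ.
KEYWORDS = ["python", "r", "sql", "hadoop", "spark", "tableau"]

def tally_keywords(postings, keywords=KEYWORDS):
    """Count how many postings mention each keyword at least once."""
    counts = Counter()
    for text in postings:
        lowered = text.lower()
        for kw in keywords:
            # Require non-word characters on both sides so that 'r'
            # does not match inside 'spark' or 'or'.
            pattern = r"(?<![\w+#])" + re.escape(kw) + r"(?![\w+#])"
            if re.search(pattern, lowered):
                counts[kw] += 1
    return counts

# Made-up postings for illustration only, not the data behind this post.
sample_postings = [
    "Seeking a data scientist with Python, SQL and Tableau experience.",
    "Must know R or Python; Spark and Hadoop a plus.",
    "Strong SQL skills required. Python preferred.",
]

tally = tally_keywords(sample_postings)
print(tally.most_common())
```

Counting postings rather than raw occurrences keeps one keyword-heavy advert from dominating the totals.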

1. Data Science Skills

[Charts: keySkills, communSkills, tertStuds, yearsXP, MLdetail, datMinDetail, busAnDetail, dataTypes, dataStorage, softDD]

2. Data Science Tools

[Charts: progLang, pyDetail, apache, NoSQL, clouds, visTools, MLtools]

3. Years Experience (Scatter Plots)

[Scatter plots: skillsScatter, toolsScatter]

Discussion

Some of the categorizations made were arbitrary, such as placing ‘regression’ and ‘decision trees’ in the machine learning section rather than the data mining section, or separating the more formal ‘data wrangling’ from its synonyms, such as cleaning or preparing data.

The former was done for simplicity, the latter to compare general interest in data wrangling with specific requests for the technical term ‘data wrangling’, of which there were far fewer. Indeed, this pattern was observed for many of the skills requested, such as ‘machine learning’, ‘data mining’ or ‘NoSQL’, where general interest was moderate to high but requests for specifics, such as ‘deep learning’, ‘clustering’ or ‘MongoDB’, were limited.
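The general-versus-specific comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the synonym list, the function name `general_vs_specific` and the sample postings are hypothetical, and simple substring matching stands in for whatever matching was done in the original notebook.

```python
# Hypothetical synonym group; the exact phrases searched for in this post
# are not listed, so these are assumptions for illustration.
SYNONYMS = {
    "data wrangling": [
        "data wrangling",
        "data cleaning",
        "cleaning data",
        "preparing data",
        "data preparation",
    ],
}

def general_vs_specific(postings, term, synonyms=SYNONYMS):
    """Return (general, specific) posting counts for a technical term.

    general  = postings mentioning the term or any of its synonyms
    specific = postings mentioning the exact term only
    """
    lowered = [p.lower() for p in postings]
    specific = sum(term in p for p in lowered)
    general = sum(any(s in p for s in synonyms[term]) for p in lowered)
    return general, specific

# Made-up postings for illustration only.
sample_ads = [
    "Experience with data wrangling and visualisation.",
    "Responsible for cleaning data from multiple sources.",
    "Data preparation and reporting for stakeholders.",
    "Build machine learning models in Python.",
]

general, specific = general_vs_specific(sample_ads, "data wrangling")
print(general, specific)
```

Including the exact term in its own synonym group ensures the general count is never smaller than the specific one.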

The .csv and .ipynb files used can be downloaded here.

Conclusion

From the given sample, the basic data scientist should have at least a STEM bachelor’s degree, if not a master’s or PhD, good communication and business skills, and possibly a few years’ work experience, though the majority of job postings didn’t mention experience.

Python and R were the programming languages of choice, Hadoop and Spark were the leading Apache big data tools, and Tableau was the most popular visualisation tool.

None of this should come as any surprise, but it’s good fun to verify data science with data science. For more information on data science skills and tools for employment, check out:

  1. http://www.kdnuggets.com/2016/05/10-must-have-skills-data-scientist.html
  2. http://www.datasciencecentral.com/profiles/blogs/7-key-skills-of-effective-data-scientists
  3. https://blog.udacity.com/2014/11/data-science-job-skills.html
  4. http://r4stats.com/2017/03/13/jobs-for-data-science-up-7-fold-for-statistician-down-by-half/

Or you can investigate indeed.com job posting terms directly at

http://www.indeed.com/jobtrends

In the next post we’ll look at all the tools/software discussed here and how to choose one over another (e.g. R vs SAS). Until then,

Data science needs deep thought.
