Analyzing Software Engineering Skill Trends with ML/NLP

May 23, 2019

Earlier this week, our team released a Skills Mapper that shows how technical skills trend over time. In some ways it is comparable to Google Trends, which displays trending internet searches. The main difference, however, is that we collect our data by analyzing online job descriptions. Software engineers can use this information to see trends in their domain and, personalized to each engineer, to decide which skills to pick up to stay compatible with future jobs.

Here’s how we did it, step by step:

1. Tracked over 300,000 jobs over 30-plus months: The first step we took in building our Skills Mapper was data collection. For the last two and a half years, STELLARES has scoured the top job boards via scraping and API integration to keep track of over 300,000 job postings, and monitor how they change over time.

2. Built job classifiers: We used deterministic title-parsing logic to identify a broad range of roles. This data then served as the training set for a machine learning (ML) model that classifies roles based on the content of the job description itself, which let us classify more roles, even those with nuanced or less structured titles.

3. Analyzed job descriptions using natural language processing (NLP): Similar to the process we employed in the second step, we built models to identify the different sections of each job description (JD), including job responsibilities, requirements, perks, about the company, etc. Using NLP and various language algorithms, we extracted the necessary skills for each position. These skills were extracted with context, such as the required proficiency level and whether the requirement itself was strong (must have) or soft (nice to have).

4. Contextualized skills: Our natural language processing engine then deduced to what extent the skills it identified were required for the corresponding position, and whether the employer might be lenient towards those without a high degree of proficiency. The end result of this process is the Skills Mapper classification into Required Skills (candidates must command these skills with high proficiency when they start the job) and Skills to Learn (candidates may pick these skills up on the job, if they have the right background).

5. Mapped trending data: With all JDs classified by role type and location, and their skills extracted and classified, we built an engine that conveys this information on a sleek, intuitive dashboard showing which skills are trending over time, as they appear in JDs, and in which context.

6. Determining Skill Relevancy: When a user enters their skills in our Mapper, we apply a skill-matching algorithm between the user’s entered qualifications and each of the currently open job descriptions for the selected subset of the market (role and location). The matching algorithm we developed relies on a complex skill ontology that represents how all skills are related to one another and how far apart they are. Leveraging this ontology, the matching algorithm calculates the distances between a person (the set of skills provided) and all jobs in the market subset, identifying what percentage of potential jobs (given the chosen role and location) are a “perfect fit” (the user has all skills required by the position), a “close fit” (the user’s aggregated knowledge covers 70% of the aggregated knowledge required by the jobs), or a “stretch fit” (the user’s aggregated knowledge covers 50% of the aggregated knowledge required by the jobs).

7. Learning Recommendation. As a final step, the system analyzes the jobs for which the user isn’t perfectly qualified, and makes recommendations as to which additional skills the user could acquire in order to close the skills gap, and have a stronger profile as an applicant.
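
Steps 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration, not the STELLARES implementation: the toy ontology `DISTANCES`, the helper names, and the way coverage is averaged over required skills are all assumptions. Only the perfect-fit rule and the 70% and 50% thresholds come from the description above.

```python
from collections import Counter

# Hypothetical toy ontology: symmetric distances in [0, 1] between related skills.
# Any pair not listed here is treated as unrelated (distance 1.0).
DISTANCES = {
    frozenset({"React", "Vue.js"}): 0.5,
    frozenset({"AWS", "Azure"}): 0.25,
}


def distance(a, b):
    """Distance between two skills; 0.0 means the same skill."""
    if a == b:
        return 0.0
    return DISTANCES.get(frozenset({a, b}), 1.0)


def coverage(user_skills, required):
    """Assumed metric: mean over required skills of (1 - distance to the nearest user skill)."""
    if not required:
        return 1.0
    if not user_skills:
        return 0.0
    total = sum(1.0 - min(distance(r, u) for u in user_skills) for r in required)
    return total / len(required)


def classify(user_skills, required):
    """Label one job as a perfect, close, or stretch fit (or 'none')."""
    if set(required) <= set(user_skills):
        return "perfect"
    score = coverage(user_skills, required)
    if score >= 0.7:
        return "close"
    if score >= 0.5:
        return "stretch"
    return "none"


def fit_shares(user_skills, jobs):
    """Share of jobs in the market subset that falls into each fit bucket."""
    if not jobs:
        return {"perfect": 0.0, "close": 0.0, "stretch": 0.0}
    counts = Counter(classify(user_skills, job) for job in jobs)
    return {label: counts[label] / len(jobs) for label in ("perfect", "close", "stretch")}


def recommend(user_skills, jobs, top_n=3):
    """Rank the skills the user lacks by how many jobs require them."""
    missing = Counter()
    for job in jobs:
        missing.update(set(job) - set(user_skills))
    # Sort by frequency (descending), then alphabetically for a stable order.
    ranked = sorted(missing.items(), key=lambda item: (-item[1], item[0]))
    return [skill for skill, _ in ranked[:top_n]]
```

Ranking missing skills by raw frequency is the simplest possible recommendation rule. A fuller version might instead rank each candidate skill by how many jobs it would move into a better fit bucket.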

STELLARES Skills Mapper vs. Google Trends - Sharing Some Insights

With 30-plus months of trending job skill data analyzed, we decided to compare it with Google Trends, as we expected search patterns and skill trends to track each other closely. We chose four basic roles to examine: full-stack engineer, backend engineer, data scientist, and product designer. We used the United States as the location for all four roles.

We specifically selected roles that are in high demand and, therefore, have a high search volume in Google. Had we taken roles with low demand, we could not have compared our Skills Mapper with Google Trends, as Google Trends does not aggregate search results based on individual job roles or postings. Because it works at the level of individual roles, STELLARES’ Skills Mapper is, in our view, a more accurate source than Google Trends for trending job skills.

1. Below are graphs visualizing the trends of React, AngularJS and Vue.js for full-stack developer:

We can see similar trends on both platforms for the above-mentioned skills. Since the Google Trends data covers the last five years, whereas the STELLARES Skills Mapper data covers only the last two and a half years, only the second half of the Google Trends graph is relevant for this comparison. For that period, the data on both graphs is quite similar.

There are some hiccups in the Google Trends graph, which are likely explained by a temporary buzz around those skills, perhaps due to some marketing, publications, etc.

2. Below are graphs visualizing the trends of AWS, Kubernetes and Azure for backend developer:

Again, the trends are similar on both platforms. One can observe that Azure seems to be slightly less popular in Google Trends. This can be explained by the fact that the trends shown in Skills Mapper pertain to sought-after job skills, as opposed to general public interest in topics related to those skills, which is what Google Trends shows. Our Skills Mapper analyzes job descriptions, and their authors often mention Azure when requiring Kubernetes, as knowledge of one or the other would be sufficient. Nonetheless, Azure is less popular than AWS and Kubernetes.

3. Below are graphs visualizing the trends of TensorFlow and Hadoop for data scientist:

Once again, if we compare the graphs within the context of the last two and a half years, the data is quite similar. For TensorFlow, the increase in Google Trends is more pronounced than the one in our Skills Mapper. This can likely be explained by an acute, short-term increase in public interest in topics surrounding TensorFlow.

4. Finally, graphs visualizing the trends of Sketch and Photoshop for product designers:

Although the trends in our Skills Mapper and in Google Trends are similar, we can see that Photoshop is more popular than Sketch in Google Trends but less popular than Sketch in our Skills Mapper. This might be explained by the fact that many people use Photoshop in their personal lives or for hobby projects, whereas Sketch is used mostly in professional settings.
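
The comparisons above were made by eye, on the period both sources cover. For readers who want to repeat this kind of check numerically, here is a rough sketch, assuming each source can be exported as a monthly series keyed by "YYYY-MM". The function names, the rescaling to a 0-100 range, and the use of Pearson correlation are our illustration and not part of the Skills Mapper itself.

```python
from math import sqrt


def overlap(series_a, series_b):
    """Keep only the months present in both series, in chronological order."""
    months = sorted(set(series_a) & set(series_b))
    return months, [series_a[m] for m in months], [series_b[m] for m in months]


def rescale(values):
    """Rescale so the peak equals 100, similar to how Google Trends presents interest."""
    peak = max(values)
    if peak == 0:
        return [0.0 for _ in values]
    return [100.0 * v / peak for v in values]


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def trend_similarity(series_a, series_b):
    """Correlation of two trend lines over their shared months only."""
    _, a, b = overlap(series_a, series_b)
    return pearson(rescale(a), rescale(b))
```

A value close to 1 would mean the two lines rise and fall together over the shared window; it says nothing about which skill is more popular in absolute terms.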

Find out how YOUR skills stack up in the Skills Mapper now!