The Practical Quant — Thu, 31 Jan 2019 15:00:00 GMT
[A version of this post appears on the O'Reilly Radar blog.]
The O'Reilly Data Show Podcast: Maryam Jahanshahi on building tools to help improve efficiency and fairness in how companies recruit.
In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of AI technologies. Many companies are exploring a variety of internal and external programs to train staff on new tools and processes. The other route is to hire new talent. But recent reports suggest that demand for data professionals is strong and competition for experienced talent is fierce. Jahanshahi and her team are building natural language and statistical tools that can help companies improve their ability to attract and retain talent across many key areas.
Here are some highlights from our conversation:
Optimal job titles
The conventional wisdom in our field has always been that you want to optimize for “the number of good candidates” divided by “the number of total candidates.” ... The thinking is that one of the ways in which you get a good signal-to-noise ratio is if you advertise for a more senior role. ... In fact, we found the number of qualified applicants was lower for the senior data scientist role.
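The "good candidates divided by total candidates" metric described above can be sketched in a few lines. The numbers below are hypothetical, purely to illustrate how a more senior title can raise the ratio while lowering the absolute count of qualified applicants:

```python
def candidate_signal_to_noise(num_qualified: int, num_total: int) -> float:
    """Signal-to-noise ratio for a job posting: qualified / total applicants."""
    if num_total == 0:
        return 0.0
    return num_qualified / num_total

# Hypothetical applicant counts for two versions of the same posting.
senior_title = {"qualified": 12, "total": 40}     # "Senior Data Scientist"
standard_title = {"qualified": 18, "total": 90}   # "Data Scientist"

# The senior title has the better ratio (0.30 vs. 0.20), yet fewer
# qualified applicants overall — the trade-off Jahanshahi describes.
senior_ratio = candidate_signal_to_noise(**{"num_qualified": senior_title["qualified"],
                                            "num_total": senior_title["total"]})
standard_ratio = candidate_signal_to_noise(**{"num_qualified": standard_title["qualified"],
                                              "num_total": standard_title["total"]})
```

Optimizing only for the ratio, as the conventional wisdom suggests, misses the drop in absolute qualified-applicant numbers.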
... We saw from some of our behavioral experiments that people were feeling like that was too senior a role for them to apply to. What we would call the "confidence gap" was kicking in at that point. It's a pretty well-known phenomenon that some groups of the population are less confident. This has been best characterized in terms of gender. It's the idea that most women only apply for jobs when they meet 100% of the qualifications, whereas most men will apply even with 60% of the qualifications. That was actually manifesting here.
We saw a lot of big companies that would offer 401(k), that would offer health insurance or family leave, but wouldn't mention those benefits in the job descriptions. This had an impact on how candidates perceived these companies. Even though it's implied that Coca-Cola is probably going to give you 401(k) and health insurance, not mentioning it changes the way you think of that job.
... So, don't forget the things that really should be there. Even the boring stuff really matters for most candidates. You'd think it would only matter for older candidates, but, actually, millennials and candidates in every age group are very concerned about these things, because it's not specifically about the 401(k) plan; it's about what it implies about the company—that the company is going to take care of you, is going to give you leave, is going to provide a good workplace.
We found the best way to deal with representation at the end of the process is actually to deal with representation early in the process. What I mean by that is having a robust or a healthy candidate pool at the start of the process. We found for data scientist roles, that was about having 100 candidates apply for your job.
... If we're not getting to the point where we can attract 100 applicants, we'll take a look at that job description. We'll see what's wrong with it and what could be turning off candidates; it could be that you're not syndicating the job description well, it's not getting into search results, or it could be that it's actually turning off a lot of people. You could be asking for too many qualifications, and that turns off a lot of people. ... Sometimes it involves taking a step back and taking a look at what we're doing in this process that's not helping us and that's starving us of candidates.
Related resources:
- Sharad Goel and Sam Corbett-Davies on “Why it’s hard to design fair machine learning models”
- “Comparing production-grade NLP libraries”
- “What are machine learning engineers?”
- “The next generation of AI assistants in enterprise”
- David Blei on “Topic models: Past, present, and future”
- David Ferrucci on why “Language understanding remains one of AI’s grand challenges”