Developing recruiting tools

Sep 6, 2015

Finding employees is a hard problem for most tech companies. How does one find stellar employees? Word of mouth, open source projects, Stack Overflow, campus events?

Recently I put together a system for some recruiter friends that does the following:

  • git clone several open source projects

  • Run the following Git query to list author names and emails, ordered by number of commits

  git log --format='%aN <%aE>' | awk '{arr[$0]++} END{for (i in arr){print arr[i], i;}}' | sort -rn | cut -d\  -f2-  


  • Collect those results into a collection

  • For each author in that collection, query the GitHub API for statistics about that author.

  • Future work: cross-reference emails against Stack Overflow. I still need to investigate how to do that.
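The clone, query, and collect steps above can be sketched as two small shell helpers. This is a minimal sketch rather than the system described here: the function names `rank_authors` and `collect_authors` are hypothetical, and it assumes the projects have already been cloned into local directories.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read "Name <email>" lines on stdin and print
# "<commit count><TAB><author>", highest count first.
rank_authors() {
  awk '{n[$0]++} END {for (a in n) printf "%d\t%s\n", n[a], a}' | sort -rn
}

# Hypothetical helper: merge the author lists of several local clones,
# passed as directory arguments, into one ranked collection.
collect_authors() {
  local repo
  for repo in "$@"; do
    git -C "$repo" log --format='%aN <%aE>'
  done | rank_authors
}
```

Calling `collect_authors path/to/clone1 path/to/clone2` prints one line per distinct name and email pair. Someone who commits under two different emails appears as two separate entries, so the counts are a starting point rather than a final ranking.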

If you want to try something similar, here is a small proof of concept (PoC) that lists the authors of a repo:

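#!/usr/bin/env bash

# Print "Name <email>" for every author of the repo at URL $1, most commits first.
function getAuthors(){
  local dir
  dir="$(mktemp -d)"   # fresh scratch directory, so a leftover /tmp/_git cannot break the clone
  git clone -q "$1" "$dir" || { rm -rf "$dir"; return 1; }
  # Count commits per author, sort by count (descending), then drop the count column.
  git -C "$dir" log --format='%aN <%aE>' | awk '{arr[$0]++} END{for (i in arr){print arr[i], i;}}' | sort -rn | cut -d\  -f2-
  rm -rf "$dir"
}

getAuthors "$1"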

You can use it by passing a Git URL to the script like so:


This lists the names and emails of everyone who has contributed to Boost ASIO.

Christopher Kohlhoff

Troy D. Straszheim

John Maddock

Beman Dawes

Stephen Kelly

Michael A. Jackson

Daniel James

Vladimir Prus

Bryce Adelstein-Lelbach

Vicente J. Botet Escriba

Steven Watanabe

Nicola Musatti

Hartmut Kaiser

Bear in mind that Git author metadata is self-reported and easy to forge. Anyone can do this for fun:

export GIT_AUTHOR_NAME="Linus Torvalds"
git commit -m "Enjoy!"
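
To see the effect without touching a real project, the same trick can be replayed in a throwaway repository. This is a minimal sketch, assuming `git` and `mktemp` are available; the function name `spoof_demo` and the committer identity `Real Person <real@example.com>` are made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical demo: create a scratch repo, make one commit with a forged
# author name, and print the author and committer names Git recorded.
spoof_demo() {
  local repo
  repo="$(mktemp -d)" || return 1
  git init -q "$repo" || { rm -rf "$repo"; return 1; }
  # The -c options set a placeholder committer identity for this command only;
  # GIT_AUTHOR_NAME overrides the author name for this one commit.
  GIT_AUTHOR_NAME="Linus Torvalds" \
    git -C "$repo" -c user.name="Real Person" -c user.email="real@example.com" \
    commit -q --allow-empty -m "Enjoy!"
  # %aN is the author name, %cN is the committer name.
  git -C "$repo" log -1 --format='author=%aN committer=%cN'
  rm -rf "$repo"
}

spoof_demo
```

The author field carries the forged name while the committer field keeps the configured identity, unless `GIT_COMMITTER_NAME` is overridden as well. Neither field is verified by Git itself.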


That is why it is important to verify the results against, say, the GitHub statistics API.
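
As a rough starting point, GitHub's REST API has an endpoint that lists a repository's contributors together with their commit counts. The sketch below only illustrates the idea under stated assumptions: the helper names `contributors_url` and `fetch_contributors` are hypothetical, only URLs of the form `https://github.com/OWNER/REPO` (with an optional `.git` suffix) are handled, `curl` is assumed to be installed, and unauthenticated requests are rate limited.

```shell
#!/usr/bin/env bash

# Hypothetical helper: turn a GitHub clone URL into the REST API URL
# for that repository's contributor list.
contributors_url() {
  local slug="${1#https://github.com/}"   # drop the host prefix
  slug="${slug%.git}"                     # drop a trailing .git, if present
  printf 'https://api.github.com/repos/%s/contributors\n' "$slug"
}

# Hypothetical helper: download the contributor list as JSON.
# Needs network access, so it is defined here but not called.
fetch_contributors() {
  curl -fsSL -H 'Accept: application/vnd.github+json' "$(contributors_url "$1")"
}
```

Each entry in the returned JSON array includes a `login` and a `contributions` count, which can be compared against the per-author counts that `git log` produced.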


I’m leaving that stage of the process out for now as I am still finalizing the design.