Free Code Camp is an open source community that aims to teach people to program and to develop projects for non-profit organizations. CodeNewbie is an international community that helps people who are learning to program. Together, Free Code Camp and CodeNewbie designed a survey and distributed it through Twitter and mailing lists to more than 15,000 people who have been learning to program. The objective was to understand why they want to program and how they are learning, and to cross this information with demographic data (gender, age, etc.) and their socio-economic situation.
The methodology and the main results are described in "We asked 15,000 people who they are, and how they're learning to code". The data set of collected answers has been released under the Open Database License, and it has been proposed on Kaggle as a case study: 2016 New Coder Survey. In this post, we describe the basic exploratory analysis we have carried out on this data set.
The available data set consists of 113 variables, each one corresponding to a question of the survey. These items are divided into two groups: students' demographic information and their approach to programming.
Demographic questions
- Age (Age)
- Gender (Gender)
- CountryLive (Place of residence)
- CountryCitizen (Country of citizenship)
- CityPopulation (Population of your city)
- IsEthnicMinority (Are you part of an ethnic minority in your country?)
- LanguageAtHome (What language do you speak at home?)
- SchoolDegree (Level of education)
- SchoolMajor (What did you study at the University?)
- FinanciallySupporting (Do you have any dependents?)
- HasDebt (Do you have any debt?)
- HasHomeMortgage (Do you have any mortgage?)
- HomeMortgage (How much do you still have to pay for your mortgage?)
- HasStudentDebt (Do you have any student debt remaining?)
- EmploymentStatus (Employment situation)
- EmploymentField (Employment sector)
- Income (Income in the last year)
- CommuteTime (How much time do you spend commuting?)
- IsUnderEmployed (Do you consider yourself underemployed?)
- HasServedInMilitary (Have you served in the military?)
- IsReceiveDisabilityBenefits (Do you receive any disability benefits?)
- HasHighSpeedInternet (Do you have a high-speed Internet connection at home?)
Questions about programming
- IsSoftwareDev (Do you work as a developer?)
- JobPref (Preferred type of company: startup, multinational, etc.)
- JobRoleInterest (Job role of interest)
- JobApplyWhen (When are you planning to look for a job as a developer?)
- ExpectedEarning (How much money do you expect to earn in your first year as a developer?)
- JobWherePref (Preferred working place: office, home, etc.)
- JobRelocate (Availability to relocate)
- CodeEvent (Participation in programming events)
- Resource (Preferred learning platforms)
- Podcast (Favorite podcasts on programming)
- HoursLearning (Hours dedicated to learning weekly)
- MonthsProgramming (Months programming)
- Bootcamp (Attendance to any bootcamp: name of the bootcamp, if it has been completed, if you would recommend it, if you needed credit, etc.)
- MoneyForLearning (Money invested in learning)
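Before any of the analyses below, the answers have to be read and cleaned. The following is only a minimal sketch of that step, not the code used for this post: it assumes the data set is a CSV file whose headers match the variable names listed above, and it reads a few made-up rows from a string so that it runs on its own. The helper name `load_rows` and the choice of numeric columns are our assumptions.

```python
import csv
import io

# Made-up rows for illustration only; the column names follow the list above.
SAMPLE_CSV = """Age,Gender,HoursLearning,MoneyForLearning
25,male,10,0
31,female,,200
,male,5,50
"""

# Columns treated as numbers in this sketch (an assumption, not the full list).
NUMERIC_COLUMNS = ("Age", "HoursLearning", "MoneyForLearning")


def load_rows(handle):
    """Read survey rows, turning blank answers into None and numbers into floats."""
    rows = []
    for raw in csv.DictReader(handle):
        row = {}
        for key, value in raw.items():
            if value == "":
                row[key] = None  # the respondent skipped this question
            elif key in NUMERIC_COLUMNS:
                row[key] = float(value)
            else:
                row[key] = value
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
missing_ages = sum(row["Age"] is None for row in rows)
print(len(rows), "rows read,", missing_ages, "without an age")
```

With the real file, the same function could be called on an opened file handle instead of the in-memory string.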
By applying data analysis techniques, we have obtained an interesting and revealing demographic profiling of the computer programming students who participated in the survey.
If we take a look at the histogram of the age variable, we notice that the majority of students are between 20 and 30 years old; the most common age is 25 and the average age is 27.
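The figures read off the age histogram (most common age, average age, share of respondents between 20 and 30) can be computed along the following lines. This is a sketch under assumptions: the ages are invented for illustration, and `age_summary` is a hypothetical helper, not part of the original analysis.

```python
from statistics import mean, mode


def age_summary(ages):
    """Summarise a list of ages, ignoring respondents who did not answer (None)."""
    answered = [age for age in ages if age is not None]
    return {
        "most_common": mode(answered),
        "average": mean(answered),
        "share_20_to_30": sum(20 <= age <= 30 for age in answered) / len(answered),
    }


# Invented ages, only to show the call; None marks a skipped question.
sample_ages = [25, 22, 25, 31, None, 27, 25, 40, 19, None]
print(age_summary(sample_ages))
```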
Programming students are mostly men: they account for more than 75% of respondents, a considerably higher share than women (21%).
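Percentages like these come from counting each answer and dividing by the number of people who answered the question. A minimal sketch, with invented answers and a hypothetical helper name `shares`:

```python
from collections import Counter


def shares(values):
    """Return the fraction of answered values that falls in each category."""
    answered = [value for value in values if value is not None]
    counts = Counter(answered)
    return {label: count / len(answered) for label, count in counts.items()}


# Invented answers, only to show the call; None marks a skipped question.
sample_gender = ["male", "male", "female", None, "male"]
for label, share in shares(sample_gender).items():
    print(f"{label}: {share:.0%}")
```

Note that skipped answers are left out of the denominator here; including them would lower every percentage.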
Job role of interest
The majority of students are interested in web programming, and most of them would like to become Full-Stack Developers. The second favorite area after web programming is Data Science.
Looking at the fields in which students currently work, we notice that more than half of the respondents are employed in the software development and IT industry. We expected this result, since the field is growing steadily and requires professionals to keep their skills and knowledge continuously up to date.
Work preferences by age
The preference for being self-employed (freelance) increases with age, and it is the favorite option for people over 60. Students under 30 would rather work for a startup or start their own business; this preference decreases as age increases. The preferred option for programming students aged between 20 and 50 is working for a medium-sized company.
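A comparison like this amounts to cross-tabulating JobPref against age brackets. The sketch below shows one way to do it; the ten-year brackets, the answer labels, and the helper names are our assumptions, and the rows are invented.

```python
from collections import Counter, defaultdict


def age_bracket(age):
    """Map an age to a decade label such as '20-29'; ages of 60 or more share one label."""
    if age >= 60:
        return "60+"
    low = (int(age) // 10) * 10
    return f"{low}-{low + 9}"


def top_preference_by_bracket(rows):
    """Return the most frequent JobPref answer within each age bracket."""
    counts = defaultdict(Counter)
    for row in rows:
        if row["Age"] is None or row["JobPref"] is None:
            continue  # skip incomplete answers
        counts[age_bracket(row["Age"])][row["JobPref"]] += 1
    return {
        bracket: counter.most_common(1)[0][0]
        for bracket, counter in sorted(counts.items())
    }


# Invented rows, only to show the call.
sample_rows = [
    {"Age": 25, "JobPref": "startup"},
    {"Age": 27, "JobPref": "startup"},
    {"Age": 22, "JobPref": "medium-sized company"},
    {"Age": 64, "JobPref": "freelance"},
    {"Age": None, "JobPref": "freelance"},
]
print(top_preference_by_bracket(sample_rows))
```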
Labor sector and underemployment
Respondents who work in software development feel the least underemployed, followed by those in software development and IT; by contrast, those who work in the catering industry feel the most underemployed.
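A ranking of this kind can be obtained by computing, for each employment field, the share of respondents who consider themselves underemployed. The sketch below assumes IsUnderEmployed is stored as 1 or 0; the helper name and the rows are invented for illustration.

```python
from collections import defaultdict


def underemployment_rate_by_field(rows):
    """Return (field, rate) pairs sorted from least to most underemployed."""
    totals = defaultdict(int)
    underemployed = defaultdict(int)
    for row in rows:
        field, flag = row["EmploymentField"], row["IsUnderEmployed"]
        if field is None or flag is None:
            continue  # skip incomplete answers
        totals[field] += 1
        underemployed[field] += int(flag)
    rates = {field: underemployed[field] / totals[field] for field in totals}
    return sorted(rates.items(), key=lambda item: item[1])


# Invented rows, only to show the call.
sample_rows = [
    {"EmploymentField": "software development", "IsUnderEmployed": 0},
    {"EmploymentField": "software development", "IsUnderEmployed": 0},
    {"EmploymentField": "software development", "IsUnderEmployed": 1},
    {"EmploymentField": "catering", "IsUnderEmployed": 1},
    {"EmploymentField": "catering", "IsUnderEmployed": 1},
    {"EmploymentField": None, "IsUnderEmployed": 1},
]
for field, rate in underemployment_rate_by_field(sample_rows):
    print(f"{field}: {rate:.0%}")
```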
Availability to relocate by age
The willingness to relocate decreases significantly as age increases. Almost 80% of the students under 30 years old would relocate.
Participation in bootcamps
We have taken into account the bootcamps with more than 10 participants, and we have analyzed the answers to the question "Would you recommend this bootcamp?" given by students who had already completed the bootcamp. The results reveal that the best-rated bootcamp is Dev Academy, with 100% of satisfied students. At the other extreme, Galvanize was the worst rated, with fewer than 40% of satisfied participants.
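The filtering described above can be sketched as follows. The keys `BootcampName`, `BootcampFinish`, and `BootcampRecommend` are hypothetical names for the bootcamp sub-questions, and we assume that "participants" means everyone who named the bootcamp, whether or not they finished it; the original analysis may have counted differently.

```python
from collections import defaultdict


def recommendation_rates(rows, min_participants=10):
    """Share of finishers who recommend each bootcamp with more than min_participants."""
    participants = defaultdict(int)
    finished = defaultdict(int)
    recommended = defaultdict(int)
    for row in rows:
        name = row["BootcampName"]
        if name is None:
            continue  # respondent did not attend a bootcamp
        participants[name] += 1
        # Only students who completed the bootcamp and answered the question count.
        if row["BootcampFinish"] and row["BootcampRecommend"] is not None:
            finished[name] += 1
            recommended[name] += int(row["BootcampRecommend"])
    return {
        name: recommended[name] / finished[name]
        for name in participants
        if participants[name] > min_participants and finished[name] > 0
    }


# Invented rows: bootcamp "A" has 12 finishers (9 recommend it), "B" only 3.
sample_rows = (
    [{"BootcampName": "A", "BootcampFinish": 1, "BootcampRecommend": 1}] * 9
    + [{"BootcampName": "A", "BootcampFinish": 1, "BootcampRecommend": 0}] * 3
    + [{"BootcampName": "B", "BootcampFinish": 1, "BootcampRecommend": 1}] * 3
)
print(recommendation_rates(sample_rows))
```

The size threshold matters: a bootcamp with three satisfied students would otherwise show a perfect score on very little evidence.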
Investment in training and expected salary
Students who invest more money in training generally expect a higher wage, although a large number of students expect a high salary despite spending very little money on education (in many cases $0).
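A trend like this is typically read from a regression line drawn over a scatter plot of MoneyForLearning against ExpectedEarning. The sketch below fits such a line by ordinary least squares; the function name and the numbers are invented for illustration, and it is not necessarily how the line in the original chart was fitted.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented values: money invested (x) and expected first-year salary (y).
money = [0, 100, 500, 1000, 5000]
expected = [40000, 42000, 45000, 50000, 60000]
slope, intercept = fit_line(money, expected)
print(f"expected salary grows by about {slope:.2f} per unit of money invested")
```

A positive slope indicates the upward trend; the many respondents who spend $0 and still expect a high salary would show up as a tall column of points at x = 0.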
Current income and investment in training
There is no significant relationship between the current income of working students and the money they have invested in training, i.e., students who have higher incomes do not necessarily spend more on their education.
Hours devoted to learning and expected salary
The number of hours that students dedicate to learning programming shows no correlation with the salary they expect in the future. In other words, those who expect a higher salary do not devote more hours to training in order to achieve it.
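One common way to check a statement like "no correlation" is the Pearson correlation coefficient, which is close to 0 when two variables have no linear relationship and close to 1 or -1 when they move together or in opposite directions. The sketch below is our illustration, with an invented helper name and invented numbers; the original analysis may have relied on a visual inspection of the regression line instead.

```python
from math import sqrt


def pearson(pairs):
    """Pearson correlation of (x, y) pairs, ignoring pairs with a missing value."""
    complete = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(complete)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in complete) / n
    mean_y = sum(y for _, y in complete) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    syy = sum((y - mean_y) ** 2 for _, y in complete)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the variables is constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    return sxy / sqrt(sxx * syy)


# Invented pairs of (HoursLearning, ExpectedEarning); None marks a skipped answer.
sample_pairs = [(5, 50000), (10, 40000), (20, 60000), (None, 45000), (30, 42000)]
print(f"correlation: {pearson(sample_pairs):.2f}")
```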
Months programming and hours devoted to learning
Students who have been programming for a longer time tend to spend fewer hours studying. One plausible explanation is that it is easier to learn and then master a new programming language when there is previous programming knowledge.
On average, students began to program approximately 11 months before participating in this survey.
Age and hours devoted to learning
The students' age has no correlation with the number of hours dedicated to studying.
The aim of this study is to show that Data Science techniques allow the analysis and interpretation of a data set; in this case, a survey of computer programming students. We do not intend to be exhaustive or to cover all the possible dimensions of analysis. We just want to show that even with a simple analysis of variables, using visualization techniques such as bar charts, pie charts, and, in some cases, regression lines, it is possible to extract a series of insights that are relevant to the scenario. One of them, for example, is that programming is mostly chosen by men.
Our professional team can effectively address Data Analytics projects in any complex scenario with the maximum guarantees of success. If you would like more information about this area, please do not hesitate to contact us. We will be glad to help.
[translated by Luca de Filippis]