NOTE: I will be posting to and administering the Math Blog while Antonio and his family recover from the fire at his home.
Inc. recently ran an article:
The Strange and Sudden Disappearance of a Coding Bootcamp Founder
For months, Jim O’Kelly taught students how to code via Slack and video lectures. On September 27, he suddenly vanished. And with him, students say, was $100,000 in tuition money.
This sad story of dashed hopes and possible international crime reminded me of my cautionary October 23, 2015 post, Data Science Boot Camps.
In a difficult economy that continues to limp eight years after the 2008 crash, international trade and other factors continue to eliminate stable, well-paying middle-class jobs, especially for people without college degrees. Many people are understandably drawn to the prospect of seemingly high-paying, seemingly stable software development and data science jobs, where industry leaders, various pundits, and many politicians in both political parties claim there is a shortage of qualified employees.
Boot camps seem to offer rapid (a few months) and relatively inexpensive (a few thousand to a few tens of thousands of dollars) retraining, or training for college-age students just starting out in life, followed by a lucrative job as a software developer, data scientist, or some other purportedly high-paying STEM (Science, Technology, Engineering, and Mathematics) position.
Indeed, if there were a truly desperate shortage of the basic STEM skills taught in K-12 (kindergarten through 12th grade in the United States) or of basic programming skills, boot camps would likely succeed for many students. This is not the case.
STEM Shortage Claims are Exaggerated and Misleading
Here is Netscape founder and venture capitalist Marc Andreessen writing in the Wall Street Journal:
Why Software is Eating the World by Marc Andreessen (August 20, 2011)
Secondly, many people in the U.S. and around the world lack the education and skills required to participate in the great new companies coming out of the software revolution. This is a tragedy since every company I work with is absolutely starved for talent. Qualified software engineers, managers, marketers and salespeople in Silicon Valley can rack up dozens of high-paying, high-upside job offers any time they want, while national unemployment and underemployment is sky high. This problem is even worse than it looks because many workers in existing industries will be stranded on the wrong side of software-based disruption and may never be able to work in their fields again. There’s no way through this problem other than education, and we have a long way to go.
Emphasis on starved added. Most of us expect people who are starving not to be picky — even a little bit. If I am stranded on a desert island with no food and a cargo container full of dog food washes ashore, I am going to fall down on my knees and thank God Almighty for his grace and mercy. 🙂 I am not going to send the dog food back and complain that I face a shortage of food and why didn’t a refrigerated cargo container full of choice Kobe beef wash ashore.
The phrase “qualified software engineers, managers, …” hides a lot in the adjective qualified. High technology employers are extremely picky in both hiring and retaining engineers. They often claim to be seeking candidates with a minimum of three years of paid professional experience in dozens of very specific technical skills, such as C++ and MATLAB. The requirement that this be years of paid professional experience precludes graduates of boot camps from consideration, as well as graduates of for-profit universities like ITT Technical Institute or DeVry, and even graduates of top-rated Computer Science (CS) programs at top universities such as Stanford and Carnegie Mellon University.
It is not uncommon for graduates of top rated programs to encounter lengthy job hunts and little interest in their qualifications. See for example Chand John’s account of his lengthy job search as a graduate of Stanford: The Ph.D. — Industry Gap (September 19, 2013) on The Chronicle of Higher Education web site.
No Shortage of Data Scientists
In many respects, data science is a fancy new buzzword for statistical data analysis. This is not a new field. Most STEM Ph.D.’s, most of whom are unable to gain permanent stable employment in their chosen field of study, have extensive training in statistics and data analysis. It is common for them to have advanced training in statistical methods that are rarely needed in most commercial data science applications. Data science jobs are already filled with many surplus Ph.D.s in fields such as Physics, Applied Mathematics, Economics, a number of sub-fields of Biology, and many others.
Graduates of boot camps face stiff competition from Ph.D.’s, actuaries, and other highly qualified persons. A data science boot camp might make sense for someone with an otherwise strong technical background who wants to get up to speed on some specific tools or methods that are popular in the nascent data science industry but are not common in academia, the insurance industry, or wherever the student gained experience in statistics, data analysis, and related skills. This does not apply to waiters (unless they happen to be out-of-work Ph.D.’s), truck drivers trying to retrain, or other students lacking a college-level education, the very people targeted by particularly questionable continuing education programs.
The Standard Class Format is a Poor Fit for Continuing Education
Programs that follow a standard school or college format, with regular classes (e.g. on weekends or in the evening) and regular homework and tests with fixed due dates, are a poor fit for continuing education. In real life, especially with the unstable contract and part-time jobs that have become common today, unpredictable interruptions are common. It is often impossible to complete a class on the planned schedule. I encountered this problem with several community college, university extension, and similar courses that I took when I was younger. I was unable to complete many of the courses because something urgent came up either professionally or personally.
Self-Paced Education is Hard to Complete
The alternative of self-paced education where the student works when they can is more practical but still quite difficult. The Devschool in the Inc. article seems to have offered a flexible, self-paced system:
The students were drawn into the school for a variety of reasons. Johnson, for example, needed a flexible school that she could attend during a cross-country move. Others, like Benjamin Soung, 25, of New York, liked that it was a go-at-your-own pace curriculum, allowing them to work as quickly or slowly as they needed to. And many of the students, like Frevert, liked that the school touted a job guarantee. All students who graduate are promised a job paying $55,000 a year, the website claims.
(Emphasis added)
Serious problems with self-paced education include:
It is common to forget a lot if there is a serious interruption lasting several weeks or months. Serious interruptions such as a crisis at work, problems with a child, problems with a spouse, a break-up, a divorce, the illness of aging parents or other family members, and other events (Antonio’s fire) are common in adult life.
It is easier to procrastinate and let work slide without deadlines and a regular schedule, even in the absence of a serious interruption. Even highly disciplined students with good study skills encounter difficulties.
Self-paced programs work better with small, bite-sized lessons or projects, not with anything like a term paper or lengthier exercise. Some important topics are difficult or impossible to learn this way because they require deeper, longer immersion. This is one of the reasons college or university classroom experience is not the same as paid, professional experience.
Conclusion
Caveat emptor: buyer beware, as the old Latin proverb says.
In the Inc. story on Devschool, it sounds like a number of students did not do adequate research into Devschool and its founder, discovering some alarming information only when he disappeared. It should be added that these are only allegations and suspicions at this point. Devschool and its founder have not given their side of the story as yet.
Continuing education is often unsuccessful even for students with strong relevant backgrounds and good study skills. Data science boot camps are particularly suspect for individuals lacking a strong technical background already, such as advanced college-level coursework and good grades in probability and statistics or the equivalent.
© 2016 John F. McGowan
About the Author
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at [email protected].
Sorry to hear about Anthony’s house. How many passports does he have by the way?
Thank you, Chris. 2 passports myself, one my wife.
Hi,
What is the best way to start a data scientist career?
FWIW
I don’t consider myself a data scientist. Indeed it is very vaguely defined. But here are my impressions:
Most people I know or have met who were “data scientists,” not a huge sample, have had Ph.D.’s in quantitative areas that teach statistics and data analysis at an advanced level either through practice or formal training or both. A few have not had Ph.D.’s but have had comparable training and experience.
I would say at minimum one needs to acquire a mastery of probability and statistics at the college level:
https://www.amazon.com/Introduction-Probability-Theory-Applications-Vol/dp/0471257087
for example.
In addition, most data scientists are dealing with mushy data about people — marketing data, finance, economics, etc. The formal training at the college or university level is usually inadequate to understand the issues involved with messy data like this, except maybe in some advanced economics or sociology courses. The economics courses that I had were far too idealized for real world data about real people.
I recommend studying books that discuss many of the real world problems associated with analyzing data about people, such as Darrell Huff’s How to Lie with Statistics (the classic) and Joel Best’s books: Damned Lies and Statistics (https://www.ucpress.edu/book.php?isbn=9780520274709) and More Damned Lies and Statistics (https://www.ucpress.edu/book.php?isbn=9780520238305).
Hope this helps,
John
Also, I would suggest learning Python and possibly the statistical language R. Python with certain extensions appears to be becoming the standard language for data analysis in many “data scientist” positions. Python is free, open-source, mostly readable, and relatively easy to learn (much better than C or C++ for someone who has never programmed before).
I recommend starting with simple procedural programming in Python. Leave the object oriented features for later. Procedural programming is all that is needed for data analysis.
See, for example:
https://www.coursera.org/specializations/data-science-python
This is an example, not a recommendation or endorsement for the Coursera course linked above.
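As a rough illustration of what I mean by simple procedural programming for data analysis, here is a minimal sketch that uses only Python's standard library. The sales figures and the function name summarize are invented for this example; they do not come from any real data set or from the course linked above.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical monthly sales figures, made up for illustration only
sales = [12.0, 15.5, 11.0, 18.5, 13.0]

summary = summarize(sales)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Nothing here needs classes or other object oriented features: one plain function, one list, and one loop are enough to get started.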
John