How I got into Machine Learning (in 2010)
As a development engineer at ABB Turbo Systems, I used FEA/CFD simulations to develop complex mechanical components (e.g., radial turbines, shafts, casings, threaded connectors) and to investigate the feasibility of technological ideas and new concepts.
In 2009 I played around with ANNs in JavaNNS, and eventually turned to very basic polynomial response surface models implemented in Mathcad to model the behavior of a mechanical system for the first time. I also used Taguchi matrices to design my experiments.
In 2010 I started using Weka and many of its included algorithms (such as decision trees, support vector machines, Kriging, and Principal Component Analysis) routinely in a loose application of CRISP-DM for iterative and incremental (“agile”?) development of mechanical components. I also used my own Python implementations of Latin Hypercube Sampling and Particle Swarm Optimization, as well as libraries like ecspy. From a usage perspective, all these tools were clunky, but they did the job well.
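For readers unfamiliar with Latin Hypercube Sampling, here is a minimal sketch of the basic idea in plain Python. It is not my original implementation; the function name `latin_hypercube` and its parameters are chosen here purely for illustration. Each dimension is split into as many equal-width strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently per dimension.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Return n_samples points in the unit hypercube [0, 1)^n_dims.

    Illustrative sketch: every dimension is divided into n_samples
    equal strata, and each stratum contains exactly one sample.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One random point inside each stratum [i/n, (i+1)/n) ...
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... then shuffle so strata are paired randomly across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-dimension columns into a list of sample points.
    return [list(point) for point in zip(*columns)]


# Example: 5 design points for 2 normalized design parameters.
design = latin_hypercube(5, 2, seed=0)
for point in design:
    print(point)
```

The points lie in the unit hypercube, so in practice each coordinate would still have to be scaled to the actual range of the corresponding design parameter.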
In late 2011 I wanted to use scikit-learn, but it had just dropped support for neural networks and delegated them to the then still nascent PyBrain library, which, like bpnn.py, proved to be too slow.
In 2012, libFANN eventually became my workhorse for ANNs. It was fast but didn’t include a way to avoid overfitting automatically through early stopping, so I had to implement methods for this in Python from a paper. I also had to write my own helper scripts for parametric studies on network architectures and training hyperparameters, for cross-validation, and for creating ensembles of neural networks. And, of course, I also had to do my own juggling of data on different nodes of an HPC cluster to achieve all of the above within a sensible time frame.
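To illustrate the early stopping mentioned above, here is a minimal sketch of one common variant, patience-based stopping. It is not the method from the paper I used and it does not call libFANN; `train_one_epoch` and `validation_loss` are hypothetical callables standing in for whatever the training library provides.

```python
def train_with_early_stopping(train_one_epoch, validation_loss,
                              max_epochs=1000, patience=10):
    """Train until the validation loss stops improving.

    Stops once `patience` epochs have passed without a new best
    validation loss. Returns (best_epoch, best_loss).
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        loss = validation_loss()
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            # A real implementation would snapshot the network weights here
            # so that the best network can be restored after stopping.
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch, best_loss


# Simulated validation curve: improves for four epochs, then overfits.
curve = iter([1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.8])
best_epoch, best_loss = train_with_early_stopping(
    train_one_epoch=lambda: None,          # placeholder for a training step
    validation_loss=lambda: next(curve),   # placeholder for an evaluation
    patience=3,
)
print(best_epoch, best_loss)  # prints: 4 0.5
```

In this simulated run, training stops at epoch 7 because the loss has not improved for three epochs since the best value at epoch 4.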
In 2015 I returned to Weka to help a technology development team at Hilti figure out whether they had already exhausted the optimization potential of a new product component (they hadn’t). In 2016 I continued applying the mindset of experimental design, ML/DS, and iterative approaches like CRISP-DM and my own (see diagram above) to support teams in problem-solving. And, in 2017, I built prototypes of a new two-sided data-driven business model that required me to dabble in Natural Language Processing, using NLTK and spaCy, plus VADER for sentiment analysis, all done in Jupyter notebooks.
Yes, software tools are being “repurposed” all the time, in the broadest sense of the word; after all, should we be reinventing the wheel with every new problem? Or should BD/ML/AI/DS as a field of practice also strive to become more akin to the messy world of JavaScript, with its countless hipstery whatever.js libraries and dependency hell? I doubt it.
Nowadays, much like sources of learning, software tools for BD/ML/AI/DS are increasingly interconnected, polished, and well documented. The trend of asking questions on forums such as Quora and Stack Exchange also makes it more likely that toy examples and key use cases are documented and can be found somewhere online. This makes software use and solution reuse easier and cheaper overall.
You no longer need to be an expert in statistical learning or optimization algorithms to get started in the field. Understandably, that annoys the experts, who see newbies making grave errors and oversimplifications on their journey of learning. But that’s inevitable: look, for example, at how the field of engineering simulation with FEA and CFD has become commoditized. Nothing new here.
In contrast to the past, when acquiring knowledge in BD/ML/AI/DS was more compartmentalized, large-batched, and dead-serious, the present looks more user-friendly and results-oriented. You can get started with a cursory understanding of the concepts, then play with software tools to learn more, then learn from what you achieved; rinse and repeat, akin to Lean’s LAMDA/PDSA, or Lean Startup’s Build-Measure-Learn cycle.