- We see more and more companies using deep learning algorithms. We, therefore, moved deep learning from the innovator to the early adopter category. Related to this, there are new challenges in deep learning, such as deploying algorithms on edge devices and training very large models.
- Although adoption of robotics is increasing slowly, more commercial robotics platforms are now available. We see some use outside of academia, and believe more use cases will be discovered in the future.
- GPU programming remains a promising technology that is underused right now. Besides deep learning, we believe there are more interesting applications.
- Deploying machine learning in a typical compute stack is becoming easier with technologies such as Kubernetes. We see an increase in tools to automate more and more parts, such as the data collection and retraining steps.
- AutoML is a promising technology that can help refocus data scientists on the actual problem domain, rather than optimizing hyperparameters.
Each year, the InfoQ editors discuss the current state of AI, ML and data engineering to identify the key trends that you as a software engineer, architect, or data scientist should watch. We curate our discussions into a technology adoption curve with supporting commentary to help you understand how things are evolving. We also explore what we believe you should be considering as part of your roadmap and skills development.
For the first time, we’ve recorded these discussions as a special episode of The InfoQ Podcast. Kimberly McGuire, a robotics engineer at Bitcraze, who is working with autonomous drones daily, joined the editors to share her experiences and views.
Deep learning moves to Early Adopters
Although deep learning only started catching our attention in 2016, we are now moving it from the Innovators category to Early Adopters. We see that there are two major frameworks for deep learning: TensorFlow and PyTorch. Both are widely used throughout the industry. We acknowledge that PyTorch is the dominant player in the academic research space, while TensorFlow is the leader in the commercial/enterprise space. The two frameworks tend to stay fairly even in terms of features, so which one to pick depends on your requirements in terms of production performance.
We notice that more and more developers and organizations are collecting and storing their data in such a way that it is easy for this to be processed by deep learning algorithms in order to “learn things” relevant to business goals. A lot of people set up their machine learning projects specifically for deep learning. TensorFlow and PyTorch are building abstraction layers for many types of data, and are including a lot of public data sets into their software as well.
We also see that the size of datasets used for deep learning is increasing a lot. We see that the next challenge is distributed training, in which both the data and the training computation are spread over many machines. Examples of frameworks that support this are FairScale, DeepSpeed, and Horovod. This is why we are introducing ‘large-scale distributed deep-learning’ to the topic list in the Innovators category.
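To make the idea of data-parallel training concrete, here is a minimal single-process sketch. It is not how FairScale, DeepSpeed, or Horovod are implemented; the function names (`shard_gradient`, `all_reduce_mean`, `train`) and the toy model `y = w * x` are assumptions made purely for illustration. Each simulated worker computes a gradient on its own shard of the data, and the gradients are averaged before every update.

```python
def shard_gradient(w, shard):
    """Gradient of the mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(values):
    """Stand-in for an all-reduce: every worker would receive this mean gradient."""
    return sum(values) / len(values)


def train(data, num_workers, steps=200, lr=0.01):
    # Split the data into equally sized shards, one per simulated worker.
    shards = [data[i::num_workers] for i in range(num_workers)]
    w = 0.0
    for _ in range(steps):
        # In a real system each gradient is computed on a different machine.
        grads = [shard_gradient(w, shard) for shard in shards]
        w -= lr * all_reduce_mean(grads)
    return w


# Toy dataset: eight points on the line y = 3x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = train(data, num_workers=4)
```

Because the shards are equally sized, the averaged gradient equals the gradient over the full dataset, so the result matches single-machine training while the work per step is divided among the workers.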
Another challenge we are seeing right now in the industry is related to the training data itself. Some companies don’t have large datasets, which means they are benefitting a lot from using pre-trained models for their specific domain. As creating a dataset can be an expensive undertaking, selecting the right data for your model is a new challenge that engineering teams have to learn how to conquer.
Edge deployment of deep learning applications is a challenge
Right now there are still challenges when running AI on edge devices such as mobile/cell phones, a Raspberry Pi, or even smaller microprocessors. The challenge is taking your model trained on a large cluster and deploying it on a small piece of hardware. Techniques to achieve this are the quantization of network weights (using fewer bits for network weights), network pruning (removing weights that are not contributing a lot), and network distillation (training a smaller neural network to predict the same outputs). Tools that support this include Google’s TensorFlow Lite and NVIDIA’s TensorRT. We do sometimes see a performance drop when we are shrinking models, but how much the performance drops, and whether this is a problem, is application dependent.
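As an illustration of two of these techniques, the following sketch applies symmetric 8-bit quantization and magnitude-based pruning to a small list of weights. It does not use TensorFlow Lite or TensorRT; the function names and the example weights are hypothetical, and real toolkits use more refined schemes.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, fraction):
    """Zero out the given fraction of weights with the smallest absolute value."""
    n_prune = int(len(weights) * fraction)
    # Indices ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


weights = [0.82, -0.03, 0.40, -1.27, 0.005, 0.64]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
pruned = prune_by_magnitude(weights, 0.5)
```

The rounding error per weight is bounded by half the scale, which is one way to see why the accuracy drop from quantization depends on how sensitive the application is to small changes in the weights.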
What is interesting is that we see that companies are adapting their hardware to better support neural networks. We see this in Apple devices, as well as in NVIDIA graphics cards that have tensor cores. Google’s new Pixel phone also has a Tensor chip which allows the running of neural networks locally. We see this as a positive trend which will make machine learning viable under more circumstances than is currently possible.
Commercial robot platforms for limited applications have become more popular
In the household, robot vacuum cleaners are already a common occurrence. A new robotics platform which is becoming more and more popular is Spot, the walking robot by Boston Dynamics. It is being used by police stations and by the army for surveillance purposes. Despite the success of such robot platforms, they are still deployed in small numbers and for a narrow set of use cases. However, with the increasing capabilities of AI we hope to see more use cases in the future.
One type of robot that is becoming successful is the self-driving car. Waymo and other companies are testing cars without a safety driver inside, which means these companies are confident in the capabilities of these vehicles. We believe the challenges to mass deployment are scaling up the area in which these vehicles drive, and proving that the cars are safe before they hit the road.
GPU and CUDA programming allows parallelisation of your problem
GPU programming allows programs to execute massively parallel tasks. If a programmer has a goal which can be reached by splitting a task into many small subtasks that don’t depend on each other, the program is suitable for GPU programming. Unfortunately, programming in CUDA, NVIDIA’s GPU-programming platform, is still hard for many developers. There are frameworks which help you, such as PyTorch, Numba, and PyCUDA, which should make GPU programming accessible to a broader group of developers. Right now most developers are using GPUs for deep learning applications, but we hope to see more applications in the future.
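The kind of decomposition that suits a GPU can be sketched on the CPU. The example below is not CUDA and will not run on a GPU; it only mimics the pattern of a kernel that is invoked once per index, with a thread pool standing in for the GPU launch. The names `saxpy_kernel` and `launch` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y, out):
    """Work for one 'thread': compute a single output element, independent of all others."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args, workers=4):
    """Run kernel(i, *args) for every index in range(n) (CPU stand-in for a GPU launch)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for completion and re-raises any exception from a worker.
        list(pool.map(lambda i: kernel(i, *args), range(n)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
```

Each call writes only to its own position in `out` and reads nothing that another call produces, which is the property that lets a GPU run thousands of such calls at the same time.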
Semi-supervised Natural Language Processing performs well on benchmarks
GPT-3 and other similar language models perform outstandingly as ‘generic natural language APIs’. They can handle a wide variety of inputs, and are breaking many of the existing benchmarks. We see that the more data is used in a semi-supervised fashion, the better the final results are. A single model not only scores well on an individual benchmark, but generalises to many benchmarks at once.
In regards to the architecture of these neural networks, we see people moving away from recurrent neural networks like the LSTM in favor of the transformer architecture. The models which are trained are enormous, use a lot of data, and cost a lot of money to train. This leads to some criticism on the amount of money and energy used to produce these models. Another problem with large models is the inference speed. When you are working on real-time applications for these algorithms, they might not be fast enough.
MLOps and DataOps allow easy training and retraining of algorithms
We see that all major cloud vendors are supporting general purpose container orchestration frameworks such as Kubernetes, which also increasingly integrates first-class support for ML-based use cases. This means one can easily deploy databases as containers on a cloud platform and scale them up and down. A benefit is that it comes with built-in monitoring. One tool to look out for is Kubeflow, which enables orchestrating complicated workflows on Kubernetes.
For deploying algorithms on the edge we see an improvement in tooling. There is K3s, which is a lightweight Kubernetes distribution for the edge. There is also KubeEdge, a separate project that takes a different approach from K3s. Although both products are still in their initial stages, they will hopefully improve the deployment of container-based AI on the edge.
We also see several products emerging that support the complete MLOps life cycle. One such tool is Amazon SageMaker, which can help you easily train your models. We believe that eventually ML will be integrated into the full DevOps lifecycle. This will create a feedback loop in which you deploy an application, monitor it, and, depending on what you observe, go back and make changes before redeploying it.
AutoML allows for automating part of the ML life cycle
We see a slight increase in people who are using so-called “AutoML”: a technique in which parts of the machine learning life cycle are automated. The programmers can focus on getting the right data and a rough idea of the model, while the computer can figure out what the best hyperparameters are. Right now this mostly works for finding architectures for neural networks, and for finding the best hyperparameters to train a model.
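One of the simplest forms of automated hyperparameter tuning is random search, sketched below. This is an illustration only and not the method of any particular AutoML product; `validation_loss` is a hypothetical stand-in for training a model and measuring its validation loss, and the search ranges are assumptions.

```python
import random


def validation_loss(learning_rate, num_layers):
    """Hypothetical stand-in for 'train a model and return its validation loss'."""
    return (learning_rate - 0.01) ** 2 * 1e4 + (num_layers - 3) ** 2


def random_search(objective, n_trials, seed=0):
    """Try n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    history = []  # record of every experiment that was run
    for _ in range(n_trials):
        params = {
            # Sample the learning rate log-uniformly between 1e-4 and 1e-1.
            "learning_rate": 10 ** rng.uniform(-4, -1),
            "num_layers": rng.randint(1, 6),
        }
        loss = objective(**params)
        history.append((params, loss))
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss, history


best_params, best_loss, history = random_search(validation_loss, n_trials=50)
```

The `history` list keeps every configuration together with its result, which is the kind of bookkeeping that experiment-tracking tools take over in larger projects.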
We believe that this is a good step forward, as it means that machine learning engineers and data scientists will have a larger role in translating business logic into a format machine learning can solve. We do believe this effort makes it more important to keep track of experiments one is conducting. A technology such as MLflow can help keep track of experiments.
Overall, we believe the problem space is moving from “finding the best model to capture your data” to “finding the best data to train your model”. Your data has to be of high quality, your data set balanced, and it has to contain all possible edge cases for your application. Doing this is currently mostly manual work and requires a good understanding of the problem domain.
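Part of this manual work can be supported by small checks. As a minimal sketch, the snippet below reports the share of each class in a labelled dataset and flags classes that fall under a chosen threshold. The label names, the counts, and the 5% threshold are made-up assumptions for illustration.

```python
from collections import Counter


def class_balance(labels):
    """Return the share of the dataset taken up by each class."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def underrepresented(labels, min_share):
    """List the classes whose share of the dataset is below min_share."""
    shares = class_balance(labels)
    return sorted(label for label, share in shares.items() if share < min_share)


# Made-up labels for an imagined object-detection dataset.
labels = ["car"] * 90 + ["pedestrian"] * 8 + ["cyclist"] * 2
shares = class_balance(labels)
rare = underrepresented(labels, min_share=0.05)
```

A check like this only finds imbalance in the labels that exist; deciding which edge cases are missing altogether still requires knowledge of the problem domain.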
What to learn to become a machine learning engineer
We believe that machine learning education has also changed over the last couple of years. Starting with classical literature might not be the best approach anymore, as there have been so many advances in the last couple of years. We recommend starting with a deep learning framework such as TensorFlow or PyTorch.
It’s a good idea to pick a discipline you want to specialise in. Here at InfoQ we distinguish the following categories of disciplines: data scientist, data engineer, data analyst, or data operations. Depending on the specialisation you pick you want to learn more about programming, statistics, or neural networks and other algorithms.
One tip we as InfoQ editors want to share is to take part in a Kaggle competition. You can pick a problem in a domain you want to learn more about, such as image recognition or semantic segmentation. By building a good algorithm and submitting results on Kaggle you will see how well your solution performs compared to other Kaggle users participating in the same competition. The leaderboard gives you motivation to aim for a higher ranking, and at the end of a competition the winner often writes down the steps behind the winning approach. This way you continuously learn more tricks you can directly apply to your problem domain.
InfoQ also has many resources. We frequently publish news, articles, presentations, and podcasts on the latest and greatest in machine learning. You can also look at our article ‘How to get hired as a machine learning engineer’. Last but not least, make sure you attend the QCon Plus conference in November and attend the track “ML Everywhere”.
About the Authors
Roland Meertens is a computer vision engineer working on smart computer-vision algorithms for self-driving vehicles at Autonomous Intelligent Driving. Previously he worked on deep learning approaches for natural language processing (NLP) problems, social robotics, and computer vision for drones. Interesting things he has worked on include neural machine translation, obstacle avoidance on small drones, and a social robot for elderly people. Besides publishing news about machine learning on InfoQ, he sometimes publishes posts on his blog pinchofintelligence.com and on Twitter (https://twitter.com/rolandmeertens). In his spare time he likes to run through the woods and participate in obstacle runs.
Kimberly McGuire is currently working at Bitcraze AB as a software developer. In 2019 she finished her PhD at the Faculty of Aerospace Engineering of the Delft University of Technology in the Netherlands. The topic was “Swarm Exploration with Pocket Drones”. McGuire looked at bio-inspired ways to accomplish indoor exploration on computationally limited MAVs, which can fit in the palm of your hand. Alongside this, she has a broad interest in embodied artificial intelligence and tries to keep up with the latest developments.
Srini Penchikala is a senior IT architect based out of Austin, Texas. He has over 25 years of experience in software architecture, design, and development, and has a current focus on cloud-native architectures, microservices and service mesh, cloud data pipelines, and continuous delivery. Penchikala wrote Big-Data Processing with Apache Spark and co-wrote Spring Roo in Action, from Manning. He is a frequent conference speaker, is a big-data trainer, and has published several articles on various technical websites.
Raghavan “Rags” Srinivas (@ragss) works as an Architect/Developer Evangelist tasked with helping developers build highly scalable and available systems. As an OpenStack advocate and solutions architect at Rackspace he was constantly challenged by issues ranging from low-level infrastructure to high-level applications. His general focus area is distributed systems, with a specialization in Cloud Computing and Big Data. He worked on Hadoop, HBase and NoSQL during their early stages. He is also a repeat JavaOne rock star speaker award winner.
Anthony Alford is a Development Group Manager at Genesys where he is working on several AI and ML projects related to customer experience. He has over 20 years of experience in designing and building scalable software. Anthony holds a Ph.D. degree in Electrical Engineering with specialization in Intelligent Robotics Software and has worked on various problems in the areas of human-AI interaction and predictive analytics for SaaS business optimization.