About a year ago, Google announced the launch of Vertex AI, a managed AI platform designed to help companies accelerate the deployment of AI models. To mark the service’s anniversary and the kick-off of Google’s Applied ML Summit, Google this morning announced new features coming to Vertex, including a dedicated server for AI system training and “example-based” explanations.
“A year ago, we launched Vertex AI with the goal of enabling a new generation of AI that empowers data scientists and engineers to do satisfying and creative work,” Henry Tappen, group product manager at Google Cloud, told Vidak For Congress via email. “The new Vertex AI features we are launching today will continue to accelerate the deployment of machine learning models in organizations and democratize AI so that more people can deploy models in production, continuously monitor, and increase business impact with AI.”
As Google has presented it in the past, the advantage of Vertex is that it brings together Google Cloud services for AI under a unified user interface and API. Customers such as Ford, Seagate, Wayfair, Cashapp, Cruise and Lowe’s use the service to build, train and deploy machine learning models in one environment, Google claims, moving models from experimentation to production.
Vertex competes with managed AI platforms from cloud providers such as Amazon Web Services and Azure. Technically, it fits into the category of platforms known as MLOps, a set of best practices for businesses to use AI. Deloitte predicts that the market for MLOps will be worth $4 billion by 2025, growing nearly 12x since 2019.
Gartner predicts that the rise of managed services like Vertex will drive the cloud market to grow by 18.4% in 2021, with cloud expected to account for 14.2% of total global IT spending. “As enterprises invest more in mobility, collaboration and other remote working technologies and infrastructure, growth in public cloud [will] be sustained through 2024,” Gartner wrote in a November 2020 study.
One of the new features in Vertex is the AI Training Reduction Server, a technology that Google claims optimizes the bandwidth and latency of multi-system distributed training on Nvidia GPUs. In machine learning, “distributed training” refers to spreading the work of training a system across multiple machines, GPUs, CPUs, or custom chips, reducing the time and resources required to complete the training.
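To make “distributed training” concrete, here is a minimal sketch of the data-parallel pattern it usually refers to: each worker computes a gradient on its own shard of the data, and the gradients are averaged (an “all-reduce”) before every weight update. This is a plain-Python simulation under stated assumptions, not Google’s Reduction Server or any Vertex API; the function names, the toy dataset and the two “workers” are hypothetical.

```python
from typing import List, Tuple

Shard = List[Tuple[float, float]]  # (x, y) pairs held by one hypothetical worker

def local_gradient(w: float, shard: Shard) -> float:
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values: List[float]) -> float:
    """Stand-in for an all-reduce: every worker receives the same average."""
    return sum(values) / len(values)

def train(shards: List[Shard], steps: int = 200, lr: float = 0.01) -> float:
    w = 0.0
    for _ in range(steps):
        # In a real system these gradients are computed in parallel on separate machines.
        grads = [local_gradient(w, shard) for shard in shards]
        w -= lr * all_reduce_mean(grads)
    return w

# Toy data generated from y = 3x, split evenly across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]
w = train(shards)
print(round(w, 6))
```

The averaging step is where network bandwidth and latency matter: every worker has to exchange gradients on every step, which is the cost a reduction service is meant to cut.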
“This significantly reduces the training time required for large language workloads, such as BERT, and further allows for cost sharing across approaches,” said Andrew Moore, VP and GM of cloud AI at Google, in a post today on the Google Cloud blog. “In many mission-critical business scenarios, a shortened training cycle allows data scientists to train a model with higher predictive performance within the constraints of an implementation window.”
In preview, Vertex now includes Tabular Workflows, which aim to make the modeling process more customizable. As Moore explained, Tabular Workflows allow users to choose which parts of the workflow they want Google’s “AutoML” technology to handle and which parts they want to engineer themselves. AutoML, or automated machine learning, is not unique to Google Cloud or Vertex. It encompasses any technology that automates aspects of AI development, and it can touch any stage of development, from starting with a raw dataset to building a machine learning model that is ready for deployment. AutoML can save time, but it can’t always match a human touch, especially where precision is required.
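As a rough illustration of what “automating aspects of AI development” can mean, the sketch below automates one step, model selection: it fits every candidate on training data and keeps whichever scores best on held-out validation data. It is a toy under stated assumptions, not Google’s AutoML or the Tabular Workflows API; the candidate models, the data and all names are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Model = Callable[[float], float]
Dataset = Tuple[List[float], List[float]]  # (inputs, targets)

def fit_mean(xs: List[float], ys: List[float]) -> Model:
    """Baseline candidate: always predict the average training target."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs: List[float], ys: List[float]) -> Model:
    """Ordinary least squares for a single input feature."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)

def mse(model: Model, xs: List[float], ys: List[float]) -> float:
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def auto_select(candidates: Dict[str, Callable[..., Model]],
                train_set: Dataset, valid_set: Dataset) -> Tuple[str, Model]:
    """The automated step: fit each candidate, keep the lowest validation error."""
    fitted = {name: fit(*train_set) for name, fit in candidates.items()}
    best = min(fitted, key=lambda name: mse(fitted[name], *valid_set))
    return best, fitted[best]

train_set = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
valid_set = ([5.0, 6.0], [10.1, 11.9])
best_name, best_model = auto_select({"mean": fit_mean, "linear": fit_linear}, train_set, valid_set)
print(best_name)
```

Because the candidates are passed in as a dictionary, a user could hand-write one of them and leave the rest to the automated search, which is the kind of split between automated and hand-engineered steps the article describes.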
“Elements of Tabular Workflows can also be integrated into your existing Vertex AI pipelines,” said Moore. “We’ve added new managed algorithms, including advanced research models like TabNet, new algorithms for feature selection, model distillation, and…more.”
Focusing on development pipelines, Vertex will also integrate (in preview) with Serverless Spark, the serverless version of the Apache-maintained open source analytics engine for data processing. Vertex users can now start a Serverless Spark session to develop code interactively.
Elsewhere, customers can analyze features of data in the Neo4j platform and then deploy models with Vertex thanks to a new partnership with Neo4j. And, thanks to a partnership between Google and Labelbox, it’s now easier to access Labelbox’s data labeling services for images, text, audio, and video data from the Vertex dashboard. Labels are necessary for most AI models to learn how to make predictions; during training, the models learn to identify the relationships between labels, also called annotations, and sample data (for example, the caption “frog” and a picture of a frog).
In the event that data is mislabeled, Moore offers example-based explanations as a solution. The new Vertex features, in preview, use “example-based” explanations to diagnose and treat data issues. Of course, no explainable AI technique can catch every error; computational linguist Vagrant Gautam warns against putting too much trust in tools and techniques used to explain AI.
“Google has documentation of limitations and a more detailed whitepaper on explainable AI, but none of this is mentioned anywhere [in today’s Vertex AI announcement],” they told Vidak For Congress via email. “The announcement highlights that ‘proficiency in skills should not be the primary eligibility criterion’ and that the new features they offer ‘can scale AI for non-software experts.’ My concern is that non-experts have more confidence in AI and in the explainability of AI than they should, and now several Google customers can build and deploy models faster without stopping to ask whether the problem needs a machine learning solution in the first place, and call their models explainable (and therefore reliable and good) without knowing the full extent of the limitations around them for their particular cases.”
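One common way to build an example-based explanation is a nearest-neighbor lookup: for a given input, show the most similar training examples, and treat an example whose label disagrees with most of its neighbors as a possible labeling error. The sketch below assumes that approach; it is not Vertex’s implementation or API, and the feature vectors, labels and function names are hypothetical.

```python
from collections import Counter
from math import dist
from typing import List, Tuple

Example = Tuple[Tuple[float, ...], str]  # (feature vector, label)

def nearest_examples(query: Tuple[float, ...], examples: List[Example], k: int = 3) -> List[Example]:
    """Return the k examples closest to the query in feature space."""
    return sorted(examples, key=lambda ex: dist(query, ex[0]))[:k]

def suspected_mislabels(examples: List[Example], k: int = 3) -> List[int]:
    """Flag indices whose label differs from the majority label of their k nearest neighbors."""
    flagged = []
    for i, (features, label) in enumerate(examples):
        others = examples[:i] + examples[i + 1:]
        neighbor_labels = [lab for _, lab in nearest_examples(features, others, k)]
        majority, _ = Counter(neighbor_labels).most_common(1)[0]
        if majority != label:
            flagged.append(i)
    return flagged

examples = [
    ((0.0, 0.1), "frog"), ((0.1, 0.0), "frog"), ((0.2, 0.1), "frog"),
    ((0.1, 0.2), "cat"),  # sits among the frogs, so its label looks suspicious
    ((5.0, 5.1), "cat"), ((5.1, 5.0), "cat"), ((5.2, 5.2), "cat"),
]
print(suspected_mislabels(examples))
```

A flag here is only a prompt for a human to look at the example. As the caution in the text suggests, a neighbor-based check can miss errors and can also flag correct labels that happen to be unusual.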
Still, Moore suggests that example-based explanations can be a useful tool when used in conjunction with other model audit practices.
“Data scientists shouldn’t need to be infrastructure engineers or operations engineers to keep models accurate, explainable, scalable, disaster-proof and secure in an ever-changing environment,” Moore added. “Our customers demand tools to easily manage and maintain machine learning models.”