Iguazio Data Science Platform Simplifies AI App Projects; Adds Integrated Feature Store
Iguazio is adding a production-ready feature store to its data science platform. It will allow users to more quickly and easily develop, deploy and manage AI apps, according to execs.
The offering aims to bring off-the-shelf technology to the growing area of data science while also significantly reducing the skills barrier.
Iguazio's integrated feature store is designed to lower the skills required to work on AI applications, so that even firms without professional data scientists can participate.
Specifically, Iguazio's approach lets users catalog, store and share features centrally, making it easier for teams to collaborate on the development, deployment and management of AI apps, even across hybrid, cloud and multi-cloud environments, according to Asaf Somekh, Iguazio Co-Founder and CEO.
"For companies that don't have hundreds of data scientists and data engineers, building a feature store from scratch, in-house, is not feasible," Somekh said, adding, "We wanted to bring this functionality to our customers, and provide them with an off-the-shelf solution for feature engineering across training, serving and monitoring in hybrid environments."
Architecturally, Iguazio's feature store is built on its open source MLOps framework, MLRun, enabling contributors to add data sources and contribute additional functionality.
Iguazio's feature store offers a "unified" approach, as it is integrated within its data science platform. This design means it plugs directly into the data ingestion, model training, model serving, and model monitoring components. That significantly reduces development and operations overhead while also boosting performance, Somekh added.
Iguazio also provides "next-level automation of model monitoring and drift detection," Somekh said, to support model training at scale and to run continuous integration and continuous delivery (CI/CD) of machine learning (ML).
Also notable, Iguazio's "unified" feature store serves both online (real-time inference) and offline (training) workloads.
In a recent post, which also appeared on Medium, Iguazio's VP Product Adi Hirschtein explained the need for a modern "unified" feature store:
A feature store is not just a catalog of features with a nice management interface; it's mainly a transformation service designed for solving a complex problem of feature engineering and more specifically, real-time feature engineering.
In addition, the desired approach for feature engineering, even if it's real-time, is to have one logic for generating features for training and serving. Therefore, the concept should be to build it once and then use it for both offline training and online serving.
Therefore, we need a unified feature store for the training and serving layers.
The typical architecture of machine learning pipelines comprises two layers: training and serving, with two different engines managing features. However, the machine learning lifecycle is such that training a model is done with an "offline" feature store while the inference may run in real time using an online store.
Having two different engines can lead to training-serving skew and therefore bad business outcomes. As such, a key advantage of a modern feature store is its ability to unify the logic of generating features for both training and serving, ensuring that the features are being calculated in the same way for both layers.
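To make the "build it once" idea concrete, here is a minimal sketch in plain Python. It is not Iguazio's or MLRun's API. The names Purchase, spend_in_window, build_training_rows and OnlineFeatureService, and the one-hour window, are all hypothetical and chosen only for illustration. The point is that a single feature definition is called by both the offline (training) path and the online (serving) path, so the two cannot drift apart.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    """Hypothetical event type used only for this illustration."""
    customer_id: str
    timestamp: float  # seconds since some epoch
    amount: float


def spend_in_window(events, now, window_seconds=3600.0):
    """The single feature definition: total spend in (now - window, now]."""
    return sum(
        e.amount for e in events if now - window_seconds < e.timestamp <= now
    )


def build_training_rows(history, window_seconds=3600.0):
    """Offline path: replay history in time order and compute the feature
    as it would have looked at the moment of each event."""
    rows = []
    seen = {}  # customer_id -> events observed so far
    for event in sorted(history, key=lambda e: e.timestamp):
        seen.setdefault(event.customer_id, []).append(event)
        value = spend_in_window(
            seen[event.customer_id], event.timestamp, window_seconds
        )
        rows.append((event.customer_id, event.timestamp, value))
    return rows


class OnlineFeatureService:
    """Online path: keeps only recent events per customer and serves the
    same feature. Assumes events arrive in timestamp order."""

    def __init__(self, window_seconds=3600.0):
        self.window_seconds = window_seconds
        self._recent = {}  # customer_id -> deque of recent events

    def ingest(self, event):
        buf = self._recent.setdefault(event.customer_id, deque())
        buf.append(event)
        # Drop events that have fallen out of the window.
        cutoff = event.timestamp - self.window_seconds
        while buf and buf[0].timestamp <= cutoff:
            buf.popleft()
        # Reuse the exact same logic as the offline path.
        return spend_in_window(buf, event.timestamp, self.window_seconds)
```

Because both paths call spend_in_window, replaying the same events through OnlineFeatureService.ingest yields the same values that build_training_rows produces. That is the property a unified feature store is meant to guarantee.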
An Iguazio online post detailed how a 'unified feature store' benefits ML professionals this way:
Features are properties that are used as inputs to a machine learning model. For instance, a recommendation application might use the total amount per purchase or product category as one of its many features.
Generating a new feature, called 'feature engineering,' takes a tremendous amount of work. The same features must be used both for training, based on historical data, and for the model prediction based on the online or real-time data. This creates a significant additional engineering effort, and leads to model inaccuracy when the online and offline features do not match. Furthermore, monitoring solutions must be built to track features and results and send alerts of data or model drift.
The Iguazio integrated feature store, at the heart of its data science and MLOps platform, solves those challenges. Accelerate the development and deployment of AI applications with automated feature engineering, improved accuracy, feature sharing and glueless integration with training, serving and monitoring frameworks.
In detail, Iguazio's integrated feature store provides users with these advantages:
Ability To Build Features Once and Plug Them Anywhere, Seamlessly: Because the Iguazio feature store is a centralized and versioned catalog, everyone can engineer and store features (along with metadata and statistics), as well as share and reuse them, and analyze impacts on existing models. Users can collect many independent features into vectors and use those from their jobs or real-time services. Iguazio's high-performance engines automatically join and accurately compute all features.
Real-Time Features and Drift Detection: Iguazio can detect model drift and inaccuracies automatically. Upon such discoveries, Iguazio can alert the users or initiate automated re-training workflows.
Robust Data Transformation: Users can create complex feature engineering processes with Iguazio's built-in data transformation service. This service includes feature aggregations with sliding windows, dozens of pre-built transformations, and support for custom logic in native Python code. With a simple API and SDK, data scientists can create features without requiring long data engineering cycles.
Feature Catalog: Users can share, search and collaborate on features, evaluate features with detailed statistics and analysis, and see how features correlate to both data sources and models through an easy-to-use user interface.
Integrated Data and Model Monitoring: Iguazio captures the feature statistics in real-time, enabling drift detection based on actual data drift. Thanks to the Iguazio feature store's integration, capabilities such as concept drift monitoring and feature monitoring are available out-of-the-box.
Real-Time Feature Engineering: Users develop features once. The feature transformation pipeline calculates features in real-time based on incoming events or streams and serves the results at millisecond-level latency or pushes them directly into a stream.
Data Governance: With strict governance, Iguazio users can also keep the data lineage of a feature. The tracking information captures how the feature was generated, which is critical for regulatory compliance.
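The list above mentions drift detection based on captured feature statistics, but the article does not say which statistic Iguazio computes. As a generic illustration only, the sketch below uses the Population Stability Index (PSI), a common way to compare a live feature distribution against a training-time reference. The function names, the 10-bin histogram and the 0.2 alert threshold (a widely used rule of thumb) are assumptions, not Iguazio's implementation.

```python
import math


def histogram(values, edges):
    """Share of values per bin; out-of-range values go to the first or last bin."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            if v < edges[i + 1] or i == last:
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def population_stability_index(reference, live, bins=10, eps=1e-6):
    """PSI = sum over bins of (live - ref) * ln(live / ref).
    Bin edges come from the reference (training) data; eps avoids log(0)."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back if all reference values are equal
    edges = [lo + i * width for i in range(bins + 1)]
    ref = histogram(reference, edges)
    cur = histogram(live, edges)
    return sum(
        (c - r) * math.log((c + eps) / (r + eps)) for r, c in zip(ref, cur)
    )


def drift_alert(reference, live, threshold=0.2):
    """Return True when the live distribution has shifted past the threshold.
    The 0.2 default is an assumed rule of thumb, not a vendor setting."""
    return population_stability_index(reference, live) > threshold
```

In a monitoring loop, a True result from drift_alert would be the point at which a platform might notify users or start a re-training workflow, as the list describes.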
Iguazio customers include Payoneer, Quadient, and Tulipan for various use cases such as fraud prediction and real-time recommendations.