In the digital-first economy, as data volumes continue to grow exponentially, marketers are looking for ways to harness customer data to create optimal customer experiences and generate value from these interactions. Data-savvy enterprises are adding data science as a capability within their organizations to help generate insights from their customers’ transactional, operational, and behavioral data. Artificial Intelligence (AI) is a key competence for businesses looking to create personalized, at-scale customer experiences while keeping operational costs down. Per McKinsey’s 2020 State of AI study, 79% of respondents attribute a direct revenue increase to AI adoption in sales and marketing functions.
The path to AI adoption for data scientists (DS), however, isn’t easy: wrangling large amounts of enterprise data for use in machine learning models is time-consuming and complex. In fact, evidence suggests that roughly 80% of data science projects fail to reach production due to data access, availability, and quality issues, among other reasons. This means data scientists spend a disproportionate amount of time preparing data rather than creating insights from it. Furthermore, enterprise data is siloed in most instances, and the customer profiles created from it are usually fragmented, reducing the ability to create anything meaningful for the customer. Finally, working in accordance with privacy regulations adds complexity to managing data.
To overcome these challenges and reduce time and effort, Adobe Experience Platform provides DS the support needed to accelerate the generation of actionable insights with machine learning models.
Figure 1: Adobe Experience Platform
- Data collection and unification into a 360-degree customer profile, with the profile made available in real time for activation on the Adobe Edge Network.
- Query Service to interact with the data, and Data Science Workspace for generating machine learning models.
- Activating the data across Adobe Experience Cloud and non-Adobe products to create a personalized experience.
For AI adoption, the DS team must complete data acquisition steps before building any ML models. Adobe Experience Platform speeds up these steps with tools that reduce effort during data preparation.
- Data Ingestion: the platform is built on a Lambda architecture that supports both streaming and batch ingestion of data. Batch data can be ingested in the ML-friendly Parquet format, removing a transformation step before the data can be used.
- Data Standardization: data ingested into the platform conforms to an Experience Data Model (XDM) schema, which ensures a standard interpretation of data across the platform. Industry-standard schemas are available for the most common use cases and can be extended with mixins to suit specific requirements.
- Data Exploration: Query Service allows for data exploration using SQL to better understand the data.
- Data Compliance: data ingested into the platform is classified based on governance and privacy restrictions, ensuring no personally identifiable information (PII) is used incorrectly or illegally within the platform.
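To make the data exploration step concrete, here is a minimal sketch of the kind of SQL a data scientist might run against Query Service. Since Query Service itself isn’t available in a standalone script, an in-memory SQLite database stands in for it; the `experience_events` table and its column names are illustrative inventions, not actual XDM field paths.

```python
import sqlite3

# Stand-in for Query Service: an in-memory SQLite database with a toy
# "experience_events" table. Table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE experience_events (customer_id TEXT, event_type TEXT, revenue REAL)"
)
conn.executemany(
    "INSERT INTO experience_events VALUES (?, ?, ?)",
    [
        ("c1", "purchase", 40.0),
        ("c1", "pageview", 0.0),
        ("c2", "purchase", 15.0),
        ("c2", "purchase", 25.0),
        ("c3", "pageview", 0.0),
    ],
)

# A typical exploration query: event counts and total revenue per customer.
rows = conn.execute(
    """
    SELECT customer_id,
           COUNT(*)     AS events,
           SUM(revenue) AS total_revenue
    FROM experience_events
    GROUP BY customer_id
    ORDER BY total_revenue DESC
    """
).fetchall()

for customer_id, events, total_revenue in rows:
    print(customer_id, events, total_revenue)
```

The same aggregate-and-rank pattern, expressed in standard SQL, carries over directly once the connection points at Query Service instead of SQLite.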
These features not only address data quantity and quality issues but, by creating a standard definition of data across the platform, also help ensure ML model accuracy.
Data Science Journey Steps
The ingestion and organization of data is essential to model workflows, as it removes the lengthy process of data preparation and improves the pace at which insights can be generated. Data Science Workspace gives DS the tools to create a model from scratch and deploy it to production. The built-in Sensei AI framework provides a mechanism for DS to organize and manage ML services, from algorithm onboarding, through experimentation, and finally to service deployment.
Figure 2: Workflow to create Machine Learning Model
Model Selection & Feature Engineering
With the data centralized and standardized into XDM schemas, DS have the option to use pre-built ML recipes, build a recipe from scratch, or import an existing recipe from GitHub. Data preparation to extract features is made less tedious by time-saving components built into the feature engineering framework, such as reuse of feature pipeline entities, an automatically computed score that indicates data quality, and automatic extraction of feature combinations for an ML model.
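The kind of feature extraction this framework automates can be sketched in a few lines. The example below derives simple recency/frequency/monetary (RFM) features per customer from raw transaction records; the record fields and the `extract_features` helper are hypothetical, chosen only to illustrate the step, and are not actual platform APIs or XDM paths.

```python
from datetime import date

# Toy transaction records; field names are illustrative, not XDM paths.
transactions = [
    {"customer_id": "c1", "date": date(2021, 3, 1), "amount": 40.0},
    {"customer_id": "c1", "date": date(2021, 3, 20), "amount": 25.0},
    {"customer_id": "c2", "date": date(2021, 1, 5), "amount": 15.0},
]

def extract_features(records, as_of):
    """Group transactions by customer and compute RFM-style features."""
    features = {}
    for rec in records:
        f = features.setdefault(
            rec["customer_id"],
            {"recency_days": None, "frequency": 0, "monetary": 0.0},
        )
        f["frequency"] += 1                       # how often they buy
        f["monetary"] += rec["amount"]            # how much they spend
        days = (as_of - rec["date"]).days
        if f["recency_days"] is None or days < f["recency_days"]:
            f["recency_days"] = days              # days since last purchase
    return features

features = extract_features(transactions, as_of=date(2021, 4, 1))
print(features["c1"])  # {'recency_days': 12, 'frequency': 2, 'monetary': 65.0}
```

In the workspace, a feature pipeline like this becomes a reusable entity, so the same derivation can feed multiple recipes without being rewritten.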
For models coded from scratch, Data Science Workspace facilitates model selection by evaluating metrics from recipe instances to determine the best-performing model. For citizen DS, the tool helps in understanding and visualizing data using Jupyter Notebooks for data exploration and model building.
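The metric-driven selection described above amounts to evaluating each candidate on held-out data and keeping the winner. Here is a minimal, self-contained sketch of that loop; the trivial threshold classifiers, the run names, and the accuracy metric are all made up for illustration and don’t reflect any specific platform API.

```python
# Held-out evaluation set: (feature value, true label) pairs.
holdout = [
    (0.1, 0), (0.4, 0), (0.35, 0), (0.8, 1), (0.9, 1), (0.6, 1),
]

# Candidate "recipe instances": here, simple threshold classifiers.
candidates = {
    "recipe_run_a": lambda x: int(x > 0.3),
    "recipe_run_b": lambda x: int(x > 0.5),
    "recipe_run_c": lambda x: int(x > 0.7),
}

def accuracy(model):
    """Fraction of holdout examples the model labels correctly."""
    return sum(model(x) == y for x, y in holdout) / len(holdout)

# Score every candidate and keep the best performer.
scores = {name: accuracy(model) for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # recipe_run_b 1.0
```

In practice the workspace computes richer metrics (precision, recall, AUC) per recipe instance, but the select-by-metric logic is the same.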
Figure 3: Data science workspace architecture
A recipe is a top-level container for the ML/AI algorithms, processing logic, and configuration required to build and train a model at scale and help solve a specific business use case. DS can build their own recipes or use pre-built recipes to get an enterprise started on its AI journey for some of the most common use cases:
- Product Recommendation recipe uses machine learning to analyze a customer’s interactions with products in the past, like purchase history, and provide insights into other products they may be interested in. Deploying product recommendation service reduces the friction in discovering relevant products by customers and greatly improves customer engagement with the brand.
- Product Purchase Prediction recipe utilizes machine learning on the customer profile and past purchase history to predict the probability that a customer will convert through a purchase event.
- Retail Sales recipe enables the data scientist to find the relationship between demand and pricing and suggest optimized pricing decisions to maximize sales and revenue for a given period.
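To illustrate the idea behind the Retail Sales recipe, the sketch below fits a linear demand curve, demand = a + b·price, by least squares and then reads off the revenue-maximizing price p* = −a / (2b) (revenue R(p) = p·(a + b·p) peaks there when b < 0). The price/demand numbers are fabricated for the example; the actual recipe’s modeling is more sophisticated than this.

```python
# Made-up observations: units sold at each tested price point.
prices = [10.0, 12.0, 14.0, 16.0, 18.0]
demand = [200.0, 180.0, 160.0, 140.0, 120.0]

# Ordinary least squares for demand = a + b * price.
n = len(prices)
mean_p = sum(prices) / n
mean_d = sum(demand) / n
b = sum((p - mean_p) * (d - mean_d) for p, d in zip(prices, demand)) / sum(
    (p - mean_p) ** 2 for p in prices
)
a = mean_d - b * mean_p

# Revenue R(p) = p * (a + b*p) is maximized at p* = -a / (2b) when b < 0.
p_star = -a / (2 * b)
print(a, b, p_star)  # 300.0 -10.0 15.0
```

Even this toy version captures the recipe’s core output: a pricing decision derived from the estimated demand/price relationship.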
Model Engineering & Experimentation
The platform is built with a clear separation between data lake storage (ADLS), data access, and the computation environment for processing large amounts of data. This standardizes the authoring of recipes and removes the overhead of data access, computation, model management, and model evaluation, allowing DS to focus solely on building algorithms for model development. Models developed using Python/TensorFlow, R, PySpark, and Scala have runtime environment support for training runs.
The platform also provides a model insight framework that allows one to visually understand and compare model performance across different runs. This helps data scientists experiment with changing features and identify the best-performing model.
Finally, the platform makes it easy to publish the optimized model as an intelligent service operating on production data via Adobe I/O. It automates ML pipelines to retrain models on new data, continuously refining the model, and the intelligent service then uses the updated model for predictions.
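The retrain-on-new-data loop can be sketched with a deliberately trivial model: a predictor that outputs the mean of its training data. When a new batch lands, the pipeline refits on the accumulated data and the service immediately serves the refreshed model. The `MeanPredictor` class and the numbers are invented for the sketch; they stand in for a real recipe and its production data.

```python
class MeanPredictor:
    """Trivial 'model' that predicts the mean of its training data."""

    def __init__(self):
        self.mean = 0.0

    def train(self, data):
        self.mean = sum(data) / len(data)

    def predict(self):
        return self.mean

# Initial training run on historical data.
training_data = [10.0, 12.0, 11.0]
model = MeanPredictor()
model.train(training_data)
print(model.predict())  # 11.0

# A new batch arrives: retrain on the full dataset, and subsequent
# predictions come from the updated model.
new_batch = [20.0, 22.0]
training_data.extend(new_batch)
model.train(training_data)
print(model.predict())  # 15.0
```

An automated pipeline simply runs this retrain step on a schedule or on ingestion events instead of by hand.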
Benefits of the Approach
The value of using Experience Platform for ML/AI is that DS can focus on solving business challenges by building and fine-tuning algorithms that benefit customers. The open platform is built on the same core cloud technologies used by providers such as Google Cloud and Azure, giving it the speed, reliability, and scale to operate on large data sets. Brands looking to understand customer sentiment, predict churn, improve inventory management, or address any other ML use case can leverage the platform to build models specific to each. These capabilities allow DS teams to optimize each step of their workflow and concentrate on generating business value sooner.
AI and machine learning have been around for a while, but with Experience Platform, data scientists now have the tools to transform the enterprise with rich insights and take personalized customer experiences to the next level.