AI Journey: From Concept to Deployment – A Step-by-Step Guide
“AI is probably the most important thing humanity has ever worked on.” – Sundar Pichai, CEO of Alphabet
With quotes like these coming from top executives and artificial intelligence (AI) constantly finding its way into the headlines, there is increased pressure on companies to start their AI journey. However, there is also immense opportunity! Whether the end goal is to find actionable insights, drive efficiency, or engage customers more effectively, companies can take advantage of this new capability while also ensuring they are not left behind by their peers.
Once your organization makes a commitment to AI, the most challenging hurdle is often where to start. What tools do we need? What are the best use cases? Should we keep AI internal or expose features to our customers? How do we make sure our implementation of AI is secure and ethical? This blog is meant to help you think through those early considerations as you start your journey with AI.
We will break the journey up into 4 parts:
- Organization Readiness & Identifying a Use Case
- Data Collection, Storage, & Preparation
- Applying AI Models, Refinement, & Consumption
- Deploy, Monitor, & Maintain
In addition to walking you through the journey and key questions to consider, we will call out some common tools that you may hear about from your technology experts or consultants. These can help orient you in technology-specific conversations.
Keep in mind, integrating AI is a complex journey with many discovery activities, decision points, and ongoing validation efforts. UDig is already partnering with clients to leverage AI in moving their organizations to the next level. We would welcome the opportunity to help you build a plan and deliver solutions to add significant value to your organization.
Organization Readiness & Identifying a Use Case
At the start of your AI journey, a common pitfall is digging into the specifics too fast. Before getting into the tactical considerations, you should ensure your organization is ready for AI. If your organization is still on the fence, take a look at our AI Readiness blog: Is Your Business AI Ready?
As part of your readiness confirmation, you will need to select an initial business use case where you are looking to apply AI. As you select your use case, you should be thinking about the value or ROI associated with the use case. Your team should develop a clear vision of the targeted benefits. Conducting a “Hot House” session can serve as a means to rapidly build out robust use cases with benefits and target value while also allowing for prioritization.
An additional consideration for achieving your benefits is the margin of error that is acceptable for the outcome. Over time you will be able to optimize the model and associated outputs, but what are the minimum thresholds needed to meet the business case when piloting the solution?
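One way to make the acceptable margin of error concrete is to write the minimum thresholds down as a simple pilot acceptance check. The sketch below is illustrative only: the metric names and threshold values are hypothetical and would come from your own business case.

```python
# Hypothetical minimum thresholds agreed with the business for the pilot.
PILOT_THRESHOLDS = {"accuracy": 0.85, "recall": 0.80}


def meets_business_case(measured, thresholds=PILOT_THRESHOLDS):
    """Compare measured pilot metrics to the agreed minimums.

    Returns (passed, shortfalls), where shortfalls maps each failing
    metric to a (measured value, required minimum) pair.
    """
    shortfalls = {
        name: (measured.get(name, 0.0), minimum)
        for name, minimum in thresholds.items()
        if measured.get(name, 0.0) < minimum
    }
    return (not shortfalls, shortfalls)


# Example: accuracy clears the bar, recall does not.
passed, shortfalls = meets_business_case({"accuracy": 0.90, "recall": 0.75})
```

Keeping the thresholds in one place makes it easier to revisit them as the model is optimized over time.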
Data Collection, Storage, & Preparation
Once you settle on an initial use case, you will start with a review of your own data. You will need to think about the data specific to your identified use case and desired outcomes. For pilots, you likely only need to focus on a small subset of your data. As part of the data identification, you should validate what data you have today, what additional data you need to acquire, and how to source and store any additional data sets. Taking these actions allows you to inventory and catalog your target data sources.
After your inventory is complete, an important next step is to review how these data sets are currently being collected over time. Are you collecting data from various sources such as sensors, databases, web services, and manual inputs? After collection, is the data stored locally, in the cloud, or simply in spreadsheets supporting a business process? Developing a clear understanding of the collection and storage methods will help establish a foundation before any preparation steps.
Before consuming the data with your new AI capability, you may have multiple preparation steps to execute in this phase of your AI journey. For example, there may be a requirement to migrate the required data to a common repository. If you are leveraging multiple data sources, extracting the data and loading it to a single repository will help reduce complexities. Whether or not this migration is needed, this is a good opportunity to assess your data governance maturity for existing and net new data. For more on data governance related to AI, please reference our “Capitalizing the AI Wave to Advance Data Governance” LinkedIn Live event recording.
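As a rough illustration of extracting data from multiple sources and loading it into a single repository, the sketch below combines a spreadsheet export and records from a sensor service into one table. Everything here is an assumption made for the example: the source contents, the `readings` table, and the use of an in-memory SQLite database as the "common repository."

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a spreadsheet export (CSV text stands in for a file).
SPREADSHEET_EXPORT = """machine_id,temperature_c
M-1,71.5
M-2,68.0
"""

# Hypothetical source 2: records already pulled from a sensor web service.
SENSOR_API_RECORDS = [
    {"machine_id": "M-1", "temperature_c": 72.1},
    {"machine_id": "M-3", "temperature_c": 65.4},
]


def load_to_common_repository(conn):
    """Extract rows from both sources and load them into one table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "machine_id TEXT, temperature_c REAL, source TEXT)"
    )
    # Tag each row with its origin so data lineage is preserved.
    rows = [
        (r["machine_id"], float(r["temperature_c"]), "spreadsheet")
        for r in csv.DictReader(io.StringIO(SPREADSHEET_EXPORT))
    ]
    rows += [
        (r["machine_id"], r["temperature_c"], "sensor_api")
        for r in SENSOR_API_RECORDS
    ]
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
loaded = load_to_common_repository(conn)
```

Recording the source alongside each row is a small design choice that supports the lineage and governance questions raised in this section.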
Other preparation steps for your data include “cleaning” or standardizing it to conform with the future model consumption requirements. There may be known quality concerns with your data that need to be addressed. There may also be a need to transform specific data sets prior to use. This ensures the data is in a usable format for AI application.
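To show what cleaning and standardizing can look like in practice, here is a minimal sketch using only the Python standard library. The records, field names, and date formats are hypothetical. It trims and normalizes names, converts dates to a single format, parses numbers, marks missing values, and drops duplicates.

```python
from datetime import datetime

# Hypothetical raw records with inconsistent formats and a missing value.
RAW = [
    {"customer": "  Acme Corp ", "signup": "2024-01-15", "revenue": "1,200.50"},
    {"customer": "acme corp", "signup": "01/15/2024", "revenue": "1200.50"},
    {"customer": "Globex", "signup": "2024-02-01", "revenue": ""},
]

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")  # assumed formats present in the data


def parse_date(text):
    """Return the date in ISO format, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(records):
    """Standardize each record and drop duplicates of (customer, signup)."""
    cleaned, seen = [], set()
    for rec in records:
        revenue_text = rec["revenue"].strip()
        row = {
            # Collapse whitespace and normalize capitalization.
            "customer": " ".join(rec["customer"].split()).title(),
            "signup": parse_date(rec["signup"].strip()),
            # Remove thousands separators; keep missing values explicit.
            "revenue": float(revenue_text.replace(",", "")) if revenue_text else None,
        }
        key = (row["customer"], row["signup"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


CLEANED = clean(RAW)
```

In this example the two "Acme Corp" rows collapse into one, and the missing revenue is kept as an explicit empty value rather than silently becoming zero.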
All these considerations are important to review and understand before building your AI solution. Understanding your data lineage and governance practices is critical to the success of your pilot and to addressing future security and ethical concerns associated with the AI outputs.
Common Tools for Reference
- Databases: SQL Server, Azure SQL, Oracle, MySQL, PostgreSQL, MongoDB, Cassandra
- Data Storage & Lakes: Amazon S3, Google Cloud Storage, Azure Data Lake
- Cloud Data Platforms: Snowflake, Databricks, Amazon Redshift, Azure Synapse Analytics, Microsoft Fabric
- Data Integration: Azure Data Factory, Azure Synapse, Fivetran, Python, R, Apache Kafka, Apache NiFi
- ETL: Microsoft SSIS, Informatica, Talend
- Data Cleaning & Transformation: dbt, Apache Spark
Applying AI Models, Refinement, & Consumption
Once you have your foundation set with the data, you will be ready to assess the models you will leverage to achieve your desired outcomes. If you have clear target outcomes and know the business case for value and ROI, this will help direct your model decisions in your AI journey. AI models sit at the core of artificial intelligence (e.g., advanced analytics and machine learning) and are the result of applying algorithms to data sets. The use case helps dictate the type of model selected, such as regression, classification, or clustering.
There are pre-existing models that may suit your needs, while in other situations you may need to build your own model or supplement an existing one. You may hear the term “feature” during these conversations. A feature is an individual input variable, derived from your data, that the model uses to produce its output. For example, if you plan to execute on a predictive maintenance use case, your features may come from machine operations data, such as sensor readings measuring vibration, temperature, or energy usage.
If you have existing repositories of information that will be critical to generating your output, consider Retrieval-Augmented Generation (RAG), a technique that combines the strengths of both retrieval-based and generative models: relevant information is retrieved from your repositories and supplied to a generative model as context. For example, a customer support chatbot can leverage existing customer information to generate a proactive response.
Similarly, Large Language Models (LLMs) are all the hype right now, and rightly so given their natural language processing (NLP) capabilities. You may have heard about GPT-4, BERT, or T5, which can all be integrated into your AI solution to enhance human interaction with the new solution.
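To make the RAG pattern described above more concrete, here is a toy sketch of the retrieval half. It is not a production approach: real systems typically use embeddings and a vector store, whereas this example ranks documents by simple word-count similarity. The documents, function names, and prompt wording are all hypothetical, and the final step of sending the prompt to a generative model is intentionally left out.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base for a customer support chatbot.
DOCUMENTS = [
    "Orders can be returned within 30 days with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Premium support is available by phone on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Combine retrieved context with the question for a generative model."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


PROMPT = build_prompt("How long does shipping take?", DOCUMENTS)
```

The key idea is that the generative model only sees the retrieved context plus the question, which is how your existing repositories end up shaping the response.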
Once your model is selected or developed, the model is refined via training, testing, tuning, and validation. Training involves running early datasets through the model to make early adjustments for targeted outcomes. Testing can then take a different data set to ensure outcomes are consistent and free from bias. Tuning focuses on optimizing the model parameters to further improve performance. Validation is completed throughout the training, testing, and tuning to ensure outputs are meeting expectations for the solution. This step should include human review of the outcomes to ensure bias and ethical considerations are taken into account.
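The refinement loop above can be sketched with a deliberately simple model. In this illustration, the "model" is a single vibration threshold for a predictive maintenance use case: tuning picks the threshold that performs best on a training split, and testing checks it against a held-out split. The data is synthetic and every name and number is hypothetical. A real project would use a machine learning library and would add the human review step, which code alone does not cover.

```python
import random


def make_dataset(n=200, seed=7):
    """Synthetic data: higher vibration makes a failure more likely."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        vibration = rng.uniform(0.0, 10.0)
        # Noise means the relationship is not perfectly clean.
        failed = vibration + rng.gauss(0.0, 1.0) > 6.0
        rows.append((vibration, failed))
    return rows


def accuracy(threshold, rows):
    """Share of rows where 'vibration above threshold' matches the outcome."""
    correct = sum((vibration > threshold) == failed for vibration, failed in rows)
    return correct / len(rows)


def tune_threshold(train_rows, candidates):
    """Pick the candidate threshold with the best training accuracy."""
    return max(candidates, key=lambda t: accuracy(t, train_rows))


rows = make_dataset()
train_rows, test_rows = rows[:150], rows[150:]  # held-out data for testing
candidates = [c / 2 for c in range(0, 21)]  # 0.0, 0.5, ..., 10.0
best_threshold = tune_threshold(train_rows, candidates)
test_accuracy = accuracy(best_threshold, test_rows)
```

Evaluating on data the model has not seen is what gives you an honest read on whether the outputs will hold up after deployment.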
Lastly, you will need to think about how the outputs from the models will be consumed by end users. Your users will need a user interface (UI) to consume the outputs, such as a dashboard or a customer support chatbot. Basic data visualization tools can be utilized, or results can be presented directly through application UIs.
Common Tools for Reference
- Feature Extraction & Selection: Python (scikit-learn, pandas), R, Featuretools
- Programming Languages: Python, R, Java, Julia
- Machine Learning Libraries: scikit-learn, TensorFlow, Keras, PyTorch, XGBoost, LightGBM, CatBoost
- Integrated Development Environments (IDEs): Jupyter Notebook, PyCharm, RStudio
- Evaluation Metrics: Built-in functions in scikit-learn, TensorFlow, Keras, PyTorch
- Data Visualization: Tableau, Power BI, Plotly, D3.js, Matplotlib, Seaborn
- Dashboards: Grafana, Kibana, Redash
- Web/Mobile Development: React, Angular, Vue.js, Flutter
Cloud Platforms (for integrated solutions)
- Amazon Web Services (AWS): SageMaker, Lambda, EC2, S3
- Google Cloud Platform (GCP): Vertex AI (formerly AI Platform), Dataflow, BigQuery
- Microsoft Azure: Azure Machine Learning, Azure Databricks, Azure Synapse Analytics
Deploy, Monitor, & Maintain
Now that your AI solution is ready for primetime, it needs to be deployed to a production environment where end users can leverage the capability. This step focuses on deploying the solution so it can consume real-world data, often referred to as “going live.” In many cases, this includes connecting the solution to its data sources and consumers via application programming interfaces (APIs). The deployment step is the official point where you “turn on” your AI capability for use by your target audience. Prior to this point, the building and refinement efforts are managed by your project team and a subset of end users in test environments to mitigate risk.
From an operational perspective, you should look at your approach to monitoring and maintaining the AI solution. These activities help ensure maximum performance of the model, update the model with new learnings to maintain or even improve the outputs, and, of course, manage error handling scenarios. Drift detection is a common term associated with this maintenance: it refers to checking whether the live data or the model’s outputs have shifted away from what the model was trained on. A final key component is human review and feedback. This input is critical to assessing and improving model performance, especially when leveraging LLMs.
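As a simple illustration of drift detection, the sketch below compares recent readings of one input against a baseline captured at training time and flags a large shift in the average. This is only one basic approach among many, and the function name, the sample values, and the limit of 3.0 are assumptions for the example. It also assumes the baseline values are not all identical.

```python
from statistics import mean, stdev


def detect_drift(baseline, recent, z_limit=3.0):
    """Flag drift when the recent mean is far from the baseline mean.

    Returns (drifted, z), where z measures how many standard errors the
    recent mean sits away from the baseline mean.
    """
    base_mean = mean(baseline)
    standard_error = stdev(baseline) / (len(recent) ** 0.5)
    z = (mean(recent) - base_mean) / standard_error
    return abs(z) > z_limit, z


# Hypothetical temperature readings captured when the model was trained.
baseline = [70.1, 69.8, 70.4, 70.0, 69.7, 70.2, 69.9, 70.3]

stable_batch = [70.0, 70.2, 69.9, 70.1]   # looks like the baseline
shifted_batch = [72.0, 71.8, 72.3, 72.1]  # noticeably warmer

stable_drifted, _ = detect_drift(baseline, stable_batch)
shifted_drifted, _ = detect_drift(baseline, shifted_batch)
```

A check like this would typically run on a schedule, with a flagged result prompting human review and, if needed, retraining.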
Common Tools for Reference
- Model Serving: TensorFlow Serving, TorchServe, OpenVINO, ONNX Runtime
- APIs: FastAPI, Flask, Django, RESTful APIs
- Containerization & Orchestration: Docker, Kubernetes
- Monitoring Tools: Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), MLflow
- Retraining Pipelines: Apache Airflow, Kubeflow, MLflow
Additional Tools for Collaboration, Version Control, Security, & Compliance
To successfully deliver your AI solutions, project teams will need tools in place to support collaboration, manage version control of code and models, and enable the appropriate security and compliance for input and output data. For example, your organization should maintain data privacy and compliance with regulations (e.g., GDPR, CCPA) as well as implement measures to protect data and models from unauthorized access.
Below are some common tools to support these activities:
- Version Control: Git, GitHub, GitLab, Bitbucket
- Project Management & Collaboration: Jira, Confluence, Slack, Trello
- Model Versioning & Experiment Tracking: MLflow, DVC (Data Version Control), Weights & Biases
- Data Privacy & Compliance: Tools for anonymization and encryption like IBM Guardium, Dataguise, Protegrity
- Security Tools: AWS Key Management Service (KMS), Azure Key Vault
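As one small example of the data protection measures mentioned above, the sketch below replaces direct identifiers with keyed hashes before data is used for model development. The field names and the key are hypothetical, and in practice the key would be held in a key management service rather than in code. Note that this is pseudonymization rather than full anonymization, so the resulting data may still be subject to privacy regulations.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key management service.
PSEUDONYM_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value, key=PSEUDONYM_KEY):
    """Replace an identifier with a keyed hash so records can still be joined."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def strip_direct_identifiers(record, sensitive_fields=("email", "name")):
    """Return a copy of the record with sensitive fields pseudonymized."""
    safe = dict(record)  # copy so the original record is left untouched
    for field in sensitive_fields:
        if field in safe:
            safe[field] = pseudonymize(safe[field])
    return safe


record = {"email": "Jane@Example.com", "name": "Jane Doe", "plan": "gold"}
safe_record = strip_direct_identifiers(record)
```

Because the same input always produces the same hash under a given key, data sets can still be linked on the pseudonymized field without exposing the original value.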
If you are ready to get your AI journey started, UDig would welcome the opportunity to partner with you. With a deep understanding of the AI model application and broader digital solution implementation, we are ready to help you unlock new potential for your organization and your customers. Contact us here to dig in further.
About John Mayberry
John Mayberry leads a portfolio of vertical markets for UDig. He brings over 15 years of experience working with clients in this vertical to implement innovative custom solutions, modernize technology portfolios, and oversee programs that address clients’ top strategic initiatives.