Seven keys to success with machine learning
An operational data journey to AI
With the recent boom in industrial digitalization, many companies are applying predictive techniques and algorithms to optimize and improve production processes.
Because time-series data generated in discrete manufacturing or industrial process operations provides valuable information on the past, present, and future health of equipment and production lines, it is integral to any company’s digitalization efforts.
Machine learning uses time-series data to find patterns undetected by the human eye. It has quickly become a leading approach to maximizing the value of operations data. Companies can apply machine learning to their time-series data when:
- There is a pattern.
- Historical data is available.
- The problem cannot be solved directly with mathematics; in other words, there is no known equation relating the process variables, such as temperature and reactor yield.
When it comes to machine learning projects, there are seven keys to success.
1. Understand your company maturity level
In order to identify the merits of operational intelligence initiatives, companies need to first assess the maturity of their efforts.
The digitalization team must consider both the technology and the culture of their company.
The team should first ensure that data is contextualized and properly cleaned, and that users understand standard operational processes. Next, the team should identify basic operational processes and their trends. The team can then standardize best practices and establish a real-time data governance strategy.
2. Ensure data quality
Engineers have an expression for when poor quality data input leads to unreliable and even unusable data output: "Garbage in, garbage out."
Operations data from sensors, control systems, assets, and mobile devices can generate poor quality data for a variety of reasons:
- Communication failures in control systems or OPC servers that return error values instead of measurements, such as "Comm Fail" or "I/O Timeout."
- Stale data caused by network problems that delay updates, repeat the same values, or deliver nonsensical values.
- Sensor accuracy failures that may affect one or more pieces of equipment.
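Screening for these failure modes can be automated before data ever reaches a model. The sketch below is a minimal, illustrative example: it assumes readings arrive as (timestamp, value) pairs in which error strings such as "Comm Fail" may replace numeric values, and it treats a run of identical values as stale. The specific error strings and the repeat threshold are assumptions, not a standard; real control systems expose vendor-specific quality codes.

```python
# Illustrative quality screen for a stream of (timestamp, value) readings.
# BAD_VALUES and max_repeats are assumed for this sketch.
BAD_VALUES = {"Comm Fail", "I/O Timeout", "Bad", "Not Connected"}

def flag_quality(readings, max_repeats=5):
    """Return (timestamp, value, status) tuples.

    status is 'bad' for known error strings, 'stale' when the same value
    has repeated at least max_repeats times, and 'ok' otherwise.
    """
    flagged = []
    last_value, repeats = None, 0
    for ts, value in readings:
        if isinstance(value, str) and value in BAD_VALUES:
            status = "bad"  # communication failure: no usable measurement
        else:
            repeats = repeats + 1 if value == last_value else 0
            last_value = value
            status = "stale" if repeats >= max_repeats else "ok"
        flagged.append((ts, value, status))
    return flagged
```

Tags flagged as "bad" or "stale" can then be excluded from training sets or routed to an operator dashboard rather than silently averaged in.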
As a company’s connected assets continue to grow, monitoring for poor quality data becomes an increasingly important challenge. Companies need to integrate equipment and process data with information technologies to ensure data quality.
Data standardization, range checking to remove unrealistic data values, and gap filling are all crucial for creating a usable, accurate data set.
3. Employ real-time data governance
While the people, processes, and tools used during different phases of the data collection process may differ, it’s important to establish a real-time data governance strategy to ensure data accuracy and quality. Establishing standard policies and processes helps companies develop and accurately manage their real-time data.
4. Use a machine learning platform that fits your model, not the model that fits the machine learning platform
The cloud is crucial to integrating information technology (IT) systems with operational technology (OT) systems.
Each major cloud provider offers similar building blocks for high-volume, diverse data analytics solutions.
Use an IoT platform that enables and streamlines the implementation of common design patterns within and between these environments.
5. Visualize and find the pattern
Visualization lets users interact with the operating environment, exploring trends or flagging errors against acceptance thresholds. Users can add data from various systems to the visualizations to manage and detect the most significant events. Turning data into knowledge is key for rapid decision-making.
Visualization plays a fundamental role in all stages of data analysis.
With real-time visualizations, users detect anomalies or filter operations by event type as they happen. This enables richer data for subsequent machine learning and means that critical business decisions can be made in real time.
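One simple way to surface anomalies as data streams in, suitable for highlighting points on a live chart, is a rolling z-score against an acceptance threshold. The window size and threshold below are illustrative assumptions; production systems often use per-tag limits agreed with process engineers.

```python
from collections import deque
from statistics import mean, stdev

# Sketch: streaming anomaly flagging with a rolling z-score.
# window and threshold are illustrative tuning parameters.
def detect_anomalies(stream, window=20, threshold=3.0):
    """Yield (value, is_anomaly) for each reading in the stream."""
    recent = deque(maxlen=window)
    for value in stream:
        if len(recent) >= 2:
            mu, sigma = mean(recent), stdev(recent)
            is_anomaly = sigma > 0 and abs(value - mu) > threshold * sigma
        else:
            is_anomaly = False  # not enough history yet to judge
        yield value, is_anomaly
        recent.append(value)
```

Flagged points can be rendered in a different color on the trend display or used to filter the event list, so operators see the significant deviations first.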
6. Share your knowledge
In a data-sharing context, data integrity and security are of utmost importance. Uploading operations data on a shared platform should come with a guarantee that it is only accessible to authorized people for a well-defined purpose.
7. Automate the solution
Finally, model training allows companies to automate machine learning solutions. This means companies can minimize the risk of human error and reduce time to market. In order to properly automate a solution, your company should first establish requirements, which could be anything from having a continuous data supply to data visualization goals.
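A retraining step in such an automated pipeline can be sketched as a gate: verify the requirements (here, a continuous data supply and a minimum amount of history, both illustrative assumptions) and only then retrain. The naive mean-forecast "model" below is a stand-in for whatever model the pipeline actually trains.

```python
import time

# Sketch of one automated retraining cycle. The freshness and history
# thresholds, and the mean-forecast model, are illustrative assumptions.
def retrain_if_ready(history, last_timestamp, max_staleness=3600, min_points=100, now=None):
    """Return a retrained model (a zero-argument forecast), or None if requirements fail."""
    now = time.time() if now is None else now
    if now - last_timestamp > max_staleness:
        return None  # data supply has gone stale; skip this cycle
    if len(history) < min_points:
        return None  # not enough history to train on
    forecast = sum(history) / len(history)  # stand-in for real model training
    return lambda: forecast
```

Running a gate like this on a schedule, instead of retraining by hand, is what removes the human-error risk the section describes: a failed requirement skips the cycle and can raise an alert rather than quietly deploying a model trained on bad data.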
A machine learning algorithm is only as good as its input data. Machine learning project success lies not in the technology but in the solution itself.