IIoT and Smart Manufacturing

What is IIoT and Smart Manufacturing?

IIoT refers to industrial IoT, or the Industrial Internet of Things. Standard IoT describes a network of interconnected devices that send and receive data to and from each other through the internet.

IIoT and Smart Manufacturing is the use of connected devices for industrial applications, such as manufacturing and other industrial processes. It applies technologies such as machine learning and real-time data to optimize industrial processes through a connected network of sensors, actuators, and software. The implementation of IIoT is referred to as Industry 4.0, or the Fourth Industrial Revolution.

Currently, most conventional industrial processes are still using Industry 3.0 practices. However, with the ongoing development and implementation of IIoT across industries, we are trending towards Industry 4.0 – with manufacturing plants being one of the major recipients of this change.

Manufacturing Plant Operational Structure

In order to understand the impact that Industry 4.0 and IIoT and Smart Manufacturing have on manufacturing plants, it is necessary to understand the existing structure that allows a manufacturing plant to operate.

A manufacturing plant has an operational structure of several levels; each of these levels has a certain function and is made up of equipment, software, or a mixture of both. This is known as the automation pyramid.

Level 0 is the field level, containing field devices and instruments such as sensors and actuators.

Level 1 is the direct control level, containing PLCs (programmable logic controllers) and HMIs (human-machine interfaces). HMIs display parameter values and allow remote control of devices through stop and start instructions, as well as set point adjustment. HMIs are connected to the PLCs, which are then connected to the field devices.

Level 2 is supervisory control, and contains the SCADA system (supervisory control and data acquisition). SCADA is a combination of software and hardware used for real-time data collection and processing, as well as automatic process control. SCADA collects its data from PLCs and HMIs over communication protocols such as OPC UA and Modbus.

Level 3 is the planning level, containing the MES (manufacturing execution system). The MES is responsible for monitoring and recording the entire production process from raw materials to finished products.  

Level 4 is the management level, containing the ERP system (enterprise resource planning). ERP is responsible for centralizing all of the information within the organization. It’s used to manage accounting, procurement, and the supply chain, among others –  and is more focused on the business aspect rather than the manufacturing aspect.

With an IIoT and Smart Manufacturing system in place, there is an additional layer: the cloud, which sits above all the other layers and provides analytics such as machine learning. The field devices are referred to as edge devices. An edge device has no physical connection to the PLC; it is instead connected through Wi-Fi. These devices communicate with the PLC over the native protocol, where all the process control is done.

Scenario 1: Optimizing Production and Quality

Conventional Manufacturing – No IIoT (Industry 3.0)

During production, human operators observe the MES system to monitor parameters such as availability, performance, and quality – which are multiplied to give the OEE (overall equipment effectiveness). An OEE of 100% shows perfect production – the goods are manufactured as fast as possible and at the highest quality possible.
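As a quick illustration, OEE is simply the product of those three factors. Below is a minimal Python sketch with hypothetical values:

def oee(availability: float, performance: float, quality: float) -> float:
    # Overall Equipment Effectiveness = availability x performance x quality
    return availability * performance * quality

# hypothetical readings: 90% availability, 95% performance, 98% quality
print(f"OEE: {oee(0.90, 0.95, 0.98):.1%}")  # -> OEE: 83.8%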

If one of the parameters is low, such as the performance (production speed), the operator can instruct the SCADA system to increase the machine speed; this will result in goods being manufactured faster – and a higher performance value.

However, while goods are being produced faster, there also tends to be more waste – so the quality will drop. The operator has to decide exactly what to set the machine speed to in order to find a good compromise between quality and output. Finding the exact balance that maximizes profitability is a difficult task – one that is almost impossible for a human to accomplish.

Smart Manufacturing – Using IIoT (Industry 4.0)

IIoT and Smart Manufacturing enables all of the devices and systems to send and receive information to and from the same place, in real time, without human intervention. This allows machine learning to make optimal decisions regarding equipment and parameter set points, making the manufacturing process as efficient as possible.

With this system in place, no humans are required to make complex decisions. This allows optimized decisions to be made as quickly as possible – and creates the conditions for the greatest profitability for the manufacturing plant.

Scenario 2: Equipment Maintenance

Conventional Manufacturing – No IIoT (Industry 3.0)

The primary method of maintenance is condition monitoring, also known as condition-based maintenance (CbM).

Condition-based maintenance relies on real-time parameters measured by a piece of equipment's sensors, such as temperature, pressure, speed, and vibration. Each of these parameters is given a range of values that is acceptable for a given piece of equipment. These parameters are actively monitored, and once a value is measured outside of the acceptable range, maintenance is scheduled.
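In practice, the logic amounts to simple range checks. The following Python sketch illustrates the idea; the parameter ranges and readings are hypothetical:

# hypothetical acceptable ranges for one piece of equipment
ACCEPTABLE_RANGES = {
    "temperature_C": (20.0, 85.0),
    "pressure_bar": (1.0, 6.5),
    "vibration_mm_s": (0.0, 4.5),
}

def check_condition(readings):
    # return the parameters whose latest reading falls outside its acceptable range
    out_of_range = []
    for name, value in readings.items():
        low, high = ACCEPTABLE_RANGES[name]
        if not (low <= value <= high):
            out_of_range.append(name)
    return out_of_range

alerts = check_condition({"temperature_C": 92.0, "pressure_bar": 3.2, "vibration_mm_s": 2.1})
if alerts:
    print("Schedule maintenance, out-of-range parameters:", alerts)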

The issue with condition-based maintenance is that the equipment's fault is detected only after a certain amount of degradation has already taken place. Depending on the rate of degradation, this may not leave enough time for maintenance to be carried out, and the degradation may have already caused damage that is more costly to repair than if it had been addressed earlier. The reverse can also be true: a parameter exceeds a boundary and maintenance is performed immediately, even though there may have been a more convenient time, or the machine could have kept running for a considerable time before maintenance became necessary – leading to excessive, unnecessary costs.

Smart Manufacturing – Using IIoT (Industry 4.0)

With IIoT, the method of maintenance can evolve to predictive maintenance (PdM).

Like condition-based maintenance, predictive maintenance also uses sensors to continuously monitor parameters. However, predictive maintenance also continuously collects and analyzes both historical and real-time data using statistical methods and machine learning. Because data trends are analyzed instead of absolute values, problems can be detected much earlier and an accurate failure time can be estimated – allowing maintenance to be scheduled at the most convenient, effective time.
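The core idea can be sketched in a few lines of Python: fit a degradation trend to historical sensor data and extrapolate it to a failure threshold. The data and threshold below are simulated, and real predictive maintenance models are considerably more sophisticated:

import numpy as np

# simulated vibration readings drifting upwards over 30 days (hypothetical data)
days = np.arange(30)
vibration = 2.0 + 0.05 * days + np.random.normal(0, 0.05, days.size)
FAILURE_THRESHOLD = 4.5  # mm/s, assumed limit for this equipment

# fit a linear degradation trend and extrapolate to the threshold
slope, intercept = np.polyfit(days, vibration, 1)
days_to_failure = (FAILURE_THRESHOLD - intercept) / slope
print(f"Estimated time until threshold is reached: {days_to_failure:.0f} days")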

Scenario 3: Adding a New Device

Conventional Manufacturing – No IIoT (Industry 3.0)

Without IIoT, every time a new field device is installed in the plant – such as a pressure transmitter, flowmeter, or control valve – it needs to be manually wired into a PLC. Then, its tag needs to be added to the PLC, HMI, OPC server, SCADA, and MES. This is a costly and time-consuming process.

Smart Manufacturing – Using IIoT (Industry 4.0)

When a new device is installed, no complex engineering is required to connect it to the cloud and the existing devices.

The edge devices, PLCs, HMIs, SCADA, MES, ERP, and machine learning all publish their tags and data into the unified namespace – a centralized data repository.

The machine learning layer continuously collects real-time data from all of the devices. It can then use this data to run its algorithms and publish additional tags back into the namespace.
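As a rough sketch of the publishing side, the snippet below assumes the unified namespace is exposed through an MQTT broker (a common, though not the only, choice); the broker address and topic hierarchy are hypothetical:

import json
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.connect("broker.example.local", 1883)

# an edge device publishes into the agreed topic structure; subscribers
# (SCADA, MES, ML services) pick the tag up without any re-engineering
payload = {"value": 42.7, "unit": "degC", "timestamp": "2024-01-01T12:00:00Z"}
client.publish("plant1/line3/reactor1/temperature", json.dumps(payload))
client.disconnect()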

Summary

In essence, IIoT and Industry 4.0 allow manufacturing plants to address many of the inefficiencies and solve a lot of the challenges that they face. The use of interconnected sensors and machines, along with free-flowing data enables smarter decisions to be made regarding all aspects of production and operations – leading to reduced downtime, faster production, higher-quality production, and increased profitability.

TQS Integration

TQS Integration is a global technology consulting and digital systems integrator. We provide you with expertise for the digitization of your systems and the digital transformation of your enterprise.

With clients across the pharmaceutical, process manufacturing, oil and gas, and food and beverage industries, we make your data work for you – so you can maximize its potential to make smarter business decisions.

Please contact us for more information.

Checking and reviewing the suite of validation documentation can be very time consuming. TQS Integration can provide the resources needed to ensure GMP and all regulatory compliance requirements are met. Why not let us do the heavy lifting? We provide faster end-to-end quality review processes to ensure speed to value and implementation of your processes – and, more importantly, to free up your capacity and resources. TQS Integration can help alleviate these pressures by placing people with the right skills, at the right time, in the right place.

Data Integrity

The TQS PI documentation set will provide evidence that the PI System has been validated.

The expected system lifecycle steps include, but are not limited to:

"I want to thank you all for all the great work and all your efforts to expedite this important validation activity and everything you are doing in general. Data integrity is important to us ensuring accuracy and compliance. I really appreciate it. Not only can we put our validation work in your trusted hands, but your work has also freed up so much time for us to focus on other areas of the operation."-Top 10 Pharma Company

How we can help you

At TQS, the validation strategy has been developed to systematically test the PI System at different levels for data integrity. The Installation Qualification (IQ) covers the minimum set of verifications to assure proper installation of the software components of the PI System. The Operational Qualification (OQ) verifies the correct operation of the PI System against the user requirements and design specification, including those around data acquisition and storage, start-up and shutdown, and high availability.

Dedicated Quality Assurance Engineers will monitor, review, and approve every phase of the process to ensure the implementation and design of the PI System adhere to company standards and regulations. QA will conduct and participate in every phase of the SDLC, including requirements review, design review, and test case reviews with test evidence. Having empirical evidence that the PI System works as expected ensures a successful outcome during inspections by regulatory organizations and that data integrity remains intact.

For information, please contact us.

High Frequency Data

Processes in industrial operations often occur at different time scales: some are fast (sub-seconds to hours), others are slow (hours, days, weeks, or months). In a biotechnology facility, for example, there are slow-moving batch processes, fast purification steps, and very fast filling lines. Capturing events at these different time scales and analyzing them requires a data strategy for acquisition, storage, and analysis.

To optimize storage space and network bandwidth, the OSIsoft PI system differentiates between high frequency data, also known as snapshot values, and compressed or archived data. Data are archived from the snapshot table by applying a swinging-door compression algorithm. This strategy has proven to be a good balance between displaying real-time data in high resolution and storing sufficient data for historical analysis.
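For illustration, here is a highly simplified Python sketch of the swinging-door idea: a point is archived only when a straight line from the last archived point can no longer represent the intervening points within the compression deviation. The real PI implementation adds exception reporting, CompMax, and other refinements:

def swinging_door(points, comp_dev):
    # points: list of (time, value) tuples; returns the archived subset
    archived = [points[0]]
    t0, v0 = points[0]
    lo, hi = float("-inf"), float("inf")  # feasible slope band from the archived point
    held = points[0]                      # most recent (snapshot) value
    for t, v in points[1:]:
        lo = max(lo, (v - comp_dev - v0) / (t - t0))
        hi = min(hi, (v + comp_dev - v0) / (t - t0))
        if lo > hi:                       # the doors have closed: archive the held value
            archived.append(held)
            t0, v0 = held
            lo = (v - comp_dev - v0) / (t - t0)
            hi = (v + comp_dev - v0) / (t - t0)
        held = (t, v)
    archived.append(held)                 # the most recent value is always kept
    return archived

# a slow ramp followed by a jump: only three points survive compression
data = [(i, 1.0 + 0.001 * i) for i in range(100)] + [(100, 5.0)]
print(len(swinging_door(data, comp_dev=0.05)))  # -> 3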

The drawback of this approach is that the snapshot queue contains only a single value for each process variable, so analysis based on snapshot or event-driven data is limited to single points. There are still some valuable use cases, such as statistical process control, alarm management, or event triggers. However, machine learning (ML) and multivariate (MVA) models are usually based on time series vectors.

To accommodate advanced modeling of high frequency data, the OSIsoft PI system requires an expansion of the snapshot table to a low-latency time series store:

High Frequency Data

The requirements for the snapshot DB are driven primarily by read speed as well as write speed. Open-source time series databases such as QuestDB, which allows a million writes per second, are now available. The read speeds are even more impressive: we measured ~800K reads/sec for a standard OSIsoft PI system, whereas a low-latency TSDB is faster by a factor of 800 - 1,000 (see demo: QuestDB · Console).
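As a sketch of what such an expansion can look like, the snippet below forwards a snapshot value to a QuestDB instance using the InfluxDB Line Protocol on its default TCP port 9009; the table and tag names are hypothetical:

import socket
import time

def send_snapshot(tag, value, host="localhost", port=9009):
    # write one snapshot value as an InfluxDB Line Protocol record (nanosecond timestamp)
    line = f"snapshots,tag={tag} value={value} {time.time_ns()}\n"
    with socket.create_connection((host, port)) as sock:
        sock.sendall(line.encode("utf-8"))

# in a real deployment the connection would be kept open and values streamed in batches
send_snapshot("Bio_Reactor_1.Temperature", 37.2)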

An additional benefit of using an open-source TSDB is that it allows us to add open-source ML and MVA libraries, as well as to take advantage of the very rich open-source visualization ecosystem. For example, the following shows a Grafana dashboard of the snapshot DB:

Summary

The OSIsoft PI system has been designed to capture real-time events in a snapshot table and store compressed data in the PI Data Archive. This data architecture is optimized for short-term event data and long-term data storage. Missing in this scenario are the capabilities to store and analyze high frequency data, which modern low-latency time series databases can provide. By adding a dedicated high frequency data store, fast processes can be monitored and analyzed in parallel with an already existing data infrastructure. This opens up a large range of new use cases that are difficult or impossible to realize with existing systems.

For information, please contact us.

The TQS Pandas PiFrames for OSIsoft® PI System® library has been designed to accelerate multivariate analytics (MVA) and machine learning (ML) for the OSIsoft PI system. The difference from the existing PI Analysis calculation engine is that TQS Pandas PiFrames is designed for vector or matrix operations instead of single-value operations.

The TQS Pandas PiFrames for OSIsoft® PI System® library makes it very easy to work with structured and contextualized data in Python. Time segments can be defined as Event Frames (OSIsoft EF) and retrieved together with sensor data as structured Pandas data frames. This allows both simple and very complex analytics on one-dimensional or multi-dimensional data.

One use case in biotechnology is transition analysis (TA) on chromatography columns. Chromatography is used to purify the product, and the performance of the chromatography column is key to achieving good product quality. There are several metrics that can be calculated to monitor the column's performance; the following lists a few:

The calculations are based on the transition peak, which mathematically is a probability density function (pdf). The peak is calculated from the raw sensor data – the transition or cumulative distribution function (cdf) – by numerical differentiation. Often the curves are normalized by the flow rate to account for differences in total volume. The following shows an example of the transition (cdf) and its derivative (pdf):

Transition Analysis

The transition peak of the pdf is used to calculate, for example, the peak asymmetry using the following formula:

Asymmetry = b/a

where b and a are the distances from the peak maximum to the curve at 10% of the peak height (blue and black lines). Though the calculation is simple, the major problem is the numerical differentiation of noisy sensor data. This step introduces so much additional noise that the peak shape is hard to analyze. Therefore, the analysis includes data smoothing steps, such as a LOWESS filter to reduce the noise level in the raw data, and upsampling to increase the resolution.
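The following Python sketch walks through these steps on simulated data: LOWESS smoothing, numerical differentiation, and the asymmetry at 10% of the peak height. SciPy's exponnorm distribution is used to generate a hypothetical transition curve, and the common convention of b as the tailing half-width and a as the leading half-width is assumed:

import numpy as np
from scipy.stats import exponnorm
from statsmodels.nonparametric.smoothers_lowess import lowess

# simulated noisy transition (cdf) - hypothetical column data
t = np.linspace(0, 10, 500)
cdf = exponnorm.cdf(t, 1.5, loc=4, scale=0.5) + np.random.normal(0, 0.005, t.size)

smoothed = lowess(cdf, t, frac=0.05, return_sorted=False)  # LOWESS-filtered cdf
pdf = np.gradient(smoothed, t)                             # numerical derivative -> transition peak

peak = np.argmax(pdf)
height10 = 0.1 * pdf[peak]
left = np.argmax(pdf[:peak] >= height10)             # first crossing on the leading side
right = peak + np.argmax(pdf[peak:] <= height10)     # first crossing on the tailing side
a, b = t[peak] - t[left], t[right] - t[peak]
print(f"Peak asymmetry b/a = {b / a:.2f}")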

The analysis was performed using simulated data with different noise levels, from 0 to 2.5%, to evaluate how accurate and precise it is.

The results show that this calculation has significant variation even at low noise levels. There are also differences in accuracy, which are introduced by the filtering step. Depending on the sensor data quality, this approach might not be sensitive enough to pick up small changes in the column's performance.

To improve the results, the same test was performed by fitting an exponentially modified Gaussian directly to the transition curve.

The fitting routine led to much better accuracy and precision. This is mainly because the transition curve does not have to be modified, so no additional noise or peak distortion is introduced.
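A sketch of this alternative approach is shown below, using scipy.optimize.curve_fit to fit the cdf of SciPy's exponnorm (exponentially modified Gaussian) distribution directly to a simulated, noisy transition curve; the starting values and bounds are hypothetical:

import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import exponnorm

def emg_cdf(t, K, loc, scale):
    # cumulative distribution function of an exponentially modified Gaussian
    return exponnorm.cdf(t, K, loc=loc, scale=scale)

t = np.linspace(0, 10, 500)
noisy = exponnorm.cdf(t, 1.5, loc=4, scale=0.5) + np.random.normal(0, 0.02, t.size)

params, _ = curve_fit(emg_cdf, t, noisy, p0=[1.0, 3.0, 1.0],
                      bounds=([0.01, 0.0, 0.01], [10.0, 10.0, 5.0]))
print("Fitted K, loc, scale:", np.round(params, 3))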

Summary

Transition analysis in biotech production is a great approach to monitoring column performance during chromatography steps. There are many simple metrics available as key performance indicators (KPIs), but they mostly operate on a derived signal, which introduces noise and distortion into the calculation.

Using the raw transition signal and fitting a distribution function is a much better approach. Although this makes the analysis more complex and increases latency, much higher precision and accuracy can be achieved in the results.

For information, please contact us.

TQS Pharma Batch

Businesses within the pharmaceutical and life sciences sector must continuously ensure batch quality is maintained to the highest standards. After all, quality is the most critical metric in pharmaceutical manufacturing; nothing is more important than protecting patient health. But the impact also reaches bottom lines and profitability. The numbers speak for themselves: the cost of a single batch deviation can range from $20,000 to $1M, depending on the product.

Tight control of processes, inputs, and other variables is a necessity for successful pharmaceutical manufacturing. Traditionally, there have not been effective ways of looking at historical and time-series data to investigate deviations and variability besides spending painfully tedious hours of subject matter expert (SME) time in spreadsheets. Engineers look to create process parameter profiles to serve as guides for reducing process variability and increasing yield for all future batch development—also known as the “golden profile”.

But there are two problems with this. First, creating golden batch profiles repeatedly requires many hours spent manually sifting through years of data or delayed lab results that make it difficult to optimize process inputs to control the batch yield. And second, out-of-tolerance events will still occur, regardless of applying diligence in controlling the Critical Process Parameters (CPPs) of a recipe, as measured by a group of Critical Quality Attributes (CQAs). Often, it becomes clear the number of variables and the cause-and-effect relationships connecting these two aspects are more complex than originally assumed.

Find Your “Golden Batch”—Efficiently

The data is there. But it’s time to efficiently analyze it. The method of manually extracting production data from historians and various repositories within an industrial control system and creating graphs in Excel is outdated and doesn’t solve the whole puzzle of accurately finding the relationships mentioned above. There are many limitations on how a spreadsheet can actually be applied to understand complex process variability and provide actionable insights. Leading pharmaceutical companies have made the transition to advanced analytics to find their perfect batch parameters.

Applying Advanced Analytics to Make Data-Backed Decisions

The most efficient and intuitive way to lead your team to golden batch discovery and application is through advanced analytics. Applying the technology eliminates all manual work in spreadsheets and automatically cleanses, contextualizes, aggregates, and analyzes your process data in near real-time. It makes the manual connections that your engineers won’t have to—freeing up their time to apply the analysis to your process parameters and production methods to see improvements in quality and performance.

Seeq, the leading provider of advanced analytics, can be scaled and applied across your entire organization, running on standard office computers and communicating directly with historians to quickly extract data and present results.

A Behind-the-Scenes Look

To visualize the application in action and for this specific business issue, assume you’re examining a production process with six CPPs connected to a single unit procedure. Using historical data from ideal batches with acceptable specifications on all CQAs, advanced analytics enables you to simply and easily graph these six variables from all the previous unit procedures. Curves representing performance from historical CPPs can then be superimposed on top of each other using identical scales to reveal new insights within the application.

It is immediately apparent whether the curves form a tight group or are spread out, showing different values at various times. Seeq can easily aggregate these curves, without the need for complex formulas or macros, to establish an ideal profile for each CPP. Engineers can replicate this procedure, resulting in an updated reference profile and boundary for every variable. In the end, this process reveals new opportunities for process optimization.

In the screenshot below, Seeq’s advanced analytics is analyzing the cell culture process in an upstream biopharmaceutical manufacturer that is producing Penicillin. The technology is used to create a model for Penicillin concentration based on historical batches to find the CPPs that will produce the ideal batch. This model can then be deployed on future batches with golden profiles for CPPs to effectively track deviations and prevent them from occurring.

Batch Quality

In another example, a leading pharmaceutical manufacturer saved millions of dollars by gaining the ability to rapidly identify and analyze the root causes of abnormal batches via similar modeling techniques in Seeq. The team reduced the number of out-of-specification batches by adjusting process parameters during the batch, and saved further by reducing wasted energy and materials.

Additionally, Bristol-Myers Squibb utilizes modern technologies, including advanced analytics, to capture the specialized knowledge needed to test the uniformity of their column packing processes. Seeq is deployed to rapidly identify the data of interest for conductivity testing to calculate asymmetry, summarize data, and plot the curves for verification by their SMEs. The entire team is empowered to operationalize their analytics by calculating a CPP and distributing it across the entire enterprise, providing reliable and fast insight as to when a column was packed correctly. In turn, this prevents product losses, product quality issues, and even complete losses of a batch.

Developing and deploying an online predictive model of pharmaceutical product quality and yield can additionally aid in fault detection and enable rapid root cause analysis, helping to ensure quality standards are maintained with every batch.

Across multiple use cases, one thing is clear—advanced analytics is the future of trusting batch quality to the highest extent for pharmaceutical and life sciences manufacturing. Combining the latest initiatives in digital transformation, machine learning, and Industry 4.0, it’s the technology that empowers your engineers to their fullest potential in making data-driven decisions to tremendously improve operations.

Applying Advanced Analytics to Your Operation

Are you ready to increase your batch quality and yield by incorporating seamless golden batch development cycles and application with advanced analytics? Make sure to watch this webinar from Seeq for insight on additional ways that advanced analytics can be used to capture knowledge from all parts of the product evolution cycle—from laboratory process design and development through scale-up and commercial manufacturing.

If you’re looking to see the technology live and in action, schedule a demo of the technology here.

Machine Learning with OSIsoft PI

Python-based machine learning (ML) libraries have evolved at an unbelievable pace. It is most impressive that time-consuming steps such as data encoding, feature selection, model comparison, and even model optimization have been fully automated. For example, the relatively new Python library PyCaret calculates the metrics of over 21 different regression models and selects the best one with just a few lines of code. Machine learning with the OSIsoft PI System has come a long way.
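As a small illustration, the sketch below uses PyCaret's regression module to prepare the data and compare candidate models automatically; the data file and target column are hypothetical stand-ins for historical process data:

import pandas as pd
from pycaret.regression import setup, compare_models

df = pd.read_csv("process_data.csv")  # hypothetical historical data set

# setup() handles encoding, imputation and the train/test split;
# compare_models() cross-validates the available regressors and returns the best one
setup(data=df, target="yield", session_id=42)
best_model = compare_models()
print(best_model)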

There are plenty of industrial applications where these algorithms could be successfully applied. But there are two major bottlenecks for successful projects:

  1. Historical data collection for model development
  2. Real-time data collection for model integration

Model development data can be downloaded as Excel or text/CSV files and analyzed offline. The drawback is that this approach cannot be productized and is limited to offline applications.

To accelerate model development and model integration (MD/MI pipelines) for the OSIsoft PI System, TQS has developed a Python library called TQS Pandas PiFrames for OSIsoft® PI System® that connects to the PI System and provides PI data as Pandas data frames. The Pandas data frame is the preferred data structure for data scientists in Python and is supported by many ML libraries. Therefore, the TQS Pandas PiFrames for OSIsoft® PI System® library can be easily integrated into ML projects for both model development and model integration.

The following shows some code examples in Python.

  1. Connecting to the PI Data Archive (historian) and the AF system:


af = ConnectToDefaultAF()   # connect to the default AF server
pi = ConnectToDefaultPI()   # connect to the default PI Data Archive


# retrieve three attributes of "Bio Reactor 1" over the last two hours as a Pandas data frame
df = GetMultipleAttributeValuesByVariable("Bio Reactor 1",["Temperature","Concentration","Level"],'t-2h','t',60,0,None)

The resulting data frame is a time series:


The data frame can also be arranged by variable columns:

df = GetMultipleAttributeValuesByFrame("Batch_0_*","Bio Reactor 1",["Temperature","Concentration","Level"],'t-7d','t',60,0,None)

During the last couple of months, we have developed use cases around the OSIsoft PI system that are based on the TQS Pandas PiFrames for OSIsoft® PI System® library:

The library has been shown to significantly reduce model development and model integration time.

SUMMARY

Machine Learning and AI projects are often slow to develop and difficult to integrate. The main reason is that most Python libraries expect Pandas data frames (or NumPy arrays), and these data structures are not readily available in industrial automation. TQS Integration has developed the TQS Pandas PiFrames for OSIsoft® PI System® library to accelerate both model development and model integration. The library is user-friendly, fast, and scales well for all common machine learning (ML) applications.

For information, please contact us.

Advanced data analytics is empowering process manufacturing teams across all verticals.

Enhanced access to operational and equipment data has spurred a transformation in the process manufacturing industry. Engineers can now see both historical and time-series data from their operation as it happens, even at remote locations, so entire teams can be kept up to speed continuously and reliably. The only problem? Many teams find they are “DRIP”—data rich, information poor.

With tremendous amounts of data, a lack of proper organization, cleansing, and contextualization brings process engineers to a standstill. Some chemical environments have 20,000 to 70,000 signals (or sensors), oil refineries can have 100,000, and enterprise sensor data signals can reach into the millions.

These amounts of data can be overwhelming, but refining them tactfully can lead to highly advantageous insights. Much of SMEs' and process engineers' valuable time is spent sorting through spreadsheets trying to wrangle the data, rather than visualizing and analyzing the patterns and models that lead to effective insight. With advanced analytics, process manufacturers can easily see all up-to-date data from disparate sources and make decisions based on the analysis to immediately improve operations.

Moving Up from “Data Janitors”

Moving data from “raw” to ready-for-analysis should not take up the majority of your subject matter experts' time. Some organizations still report that over 70 percent of the time they spend on operational analytics is dedicated solely to cleansing their data.

But your team is not “data janitors.” Today’s technology can take care of the monotonous and very time-consuming tasks of accessing, cleansing, and contextualizing data so your team can move straight to benefitting from the insights.

The Difference Between Spreadsheets and Advanced Analytics

For an entire generation, spreadsheets have been the method of choice for analyzing data in the process manufacturing industry. At the moment of analysis, the tool in use needs to enable user input to define critical time periods of interest and relevant context. Spreadsheets have been the way of putting the user in control of data investigation while offering a familiar, albeit cumbersome, path of analysis.

But the downfalls of spreadsheets have become increasingly apparent:

All of these pain points combine to make it difficult to reconcile and analyze data in the broader business context necessary for the profitability and efficiency use cases that improve operational performance.

With advanced analytics putting process manufacturing experts on the front lines of configuring data analytics, improvements to production yield, quality, availability, and the bottom line are readily available.

How It’s Done

Advanced analytics leverages innovations in big data, machine learning, and web technologies to integrate and connect to all process manufacturing data sources and drive business improvement. Some of the capabilities include:

The Impact of Advanced Analytics

Simply put, advanced analytics gives you the whole picture. It draws relationships and correlations between specific data that need to be made in order to improve performance based on accurate and reliable insight. Seeq’s advanced analytics solution is specifically designed for process manufacturing data and has been empowering and saving leading manufacturers time and money upon immediate implementation. Learn more about the application and how it eliminates the need for spreadsheet exhaustion here.

Data Latency

The topic of system latency has come up a couple of times in recent projects. If you really think about it, this is not surprising. As more manufacturing gets integrated, data must be synchronized and/or orchestrated between different applications. Here are just some examples:

  1. MES: Manufacturing execution systems typically connect to a variety of data sources, so the workflow developer needs to know the timeout settings of the different applications. Connections to the automation system will have a very low latency, but what is the expected data latency of the historian?
  2. Analysis: More and more companies are moving towards real-time analytics. But just how fast can you really expect calculations to be updated? This is especially true for enterprise-level systems, which are typically clones of source OSIsoft PI servers by way of PI-to-PI. So you are looking at a data flow such as:

     Source -> PI Data Archive (local) -> PI-to-PI -> PI Data Archive (region) -> PI-to-PI -> PI Data Archive (enterprise), with latency added at each step.
  3. Reports: One example is product release reports. How long do you need to wait to make sure that all data have been collected?

The OSIsoft PI time series object provides a time stamp, which is typically provided by the source system. This time stamp bubbles up through interfaces and data archives unchanged. This makes sense when you compare historical data, but it will mask the latency in your data.

To detect when a data point is queued and recorded at the data server, PI offers two event queues that can be monitored:

AFDataPipeType.Snapshot ... to monitor the snapshot queue

AFDataPipeType.Archive ... to monitor the archive queue

You can use PowerShell scripts, which have the advantage of being lightweight and can be combined with the existing OSIsoft PowerShell library. PowerShell is also available on most servers, so you don't need a separate development environment for code changes.

The first step is to connect to the OSIsoft PI Server using the AFSDK:

function Connect-PIServer{
[OutputType('OSIsoft.AF.PI.PIServer')]
param ([string] [Parameter(Mandatory=$true, Position=0, ValueFromPipeline=$true,
ValueFromPipelineByPropertyName=$true)] $PIServerName)
$Library=$env:PIHOME+"\AF\PublicAssemblies\OSIsoft.AFSDK.dll"
Add-Type -Path $Library
$PIServer=[OSIsoft.AF.PI.PIServer]::FindPIServer($PIServerName)
$PIServer.Connect()
Write-Output($PIServer)
}

The function opens a connection to the server and returns the .NET object.

Monitoring the queues and writing out the values looks like the following:

function Get-PointReference{
param ([PSTypeName('OSIsoft.AF.PI.PIServer')] [Parameter(Mandatory=$true,
Position=0, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $PIServer,
[string] [Parameter(Mandatory=$true, Position=1, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)]
$PIPointName)
$PIPoint=[OSIsoft.AF.PI.PIPoint]::FindPIPoint($PIServer,$PIPointName)
Write-Output($PIPoint)
}

function Get-QueueValues{
param ( [PSTypeName('OSIsoft.AF.PI.PIPoint')] [Parameter(Mandatory=$true,
Position=0, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $PIPoint,
[double] [Parameter(Mandatory=$true, Position=1, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $DurationInSeconds )
# put the PI point into a .NET generic list
$PIPointList = New-Object System.Collections.Generic.List[OSIsoft.AF.PI.PIPoint]
$PIPointList.Add($PIPoint)
# create the pipeline
$ArchivePipeline=[OSIsoft.AF.PI.PIDataPipe]::new( [OSIsoft.AF.Data.AFDataPipeType]::Archive)
$SnapShotPipeline=[OSIsoft.AF.PI.PIDataPipe]::new( [OSIsoft.AF.Data.AFDataPipeType]::Snapshot)
# add signups
$ArchivePipeline.AddSignups($PIPointList)
$SnapShotPipeline.AddSignups($PIPointList)
# now the polling
$EndTime=(Get-Date).AddSeconds($DurationInSeconds)
While((Get-Date) -lt $EndTime){
$ArchiveEvents = $ArchivePipeline.GetUpdateEvents(1000);
$SnapShotEvents = $SnapShotPipeline.GetUpdateEvents(1000);
$RecordedTime=(Get-Date)
# format output:
foreach($ArchiveEvent in $ArchiveEvents){
$AFEvent = New-Object PSObject -Property @{
Name = $ArchiveEvent.Value.PIPoint.Name
Type = "ArchiveEvent"
Action = $ArchiveEvent.Action
TimeStamp = $ArchiveEvent.Value.Timestamp.LocalTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
QueueTime = $RecordedTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
Value = $ArchiveEvent.Value.Value.ToString()
}
$AFEvent.pstypenames.Add('My.DataQueueItem')
Write-Output($AFEvent)
}
foreach($SnapShotEvent in $SnapShotEvents){
$AFEvent = New-Object PSObject -Property @{
Name = $SnapShotEvent.Value.PIPoint.Name
Type = "SnapShotEvent"
Action = $SnapShotEvent.Action
TimeStamp = $SnapShotEvent.Value.Timestamp.LocalTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
QueueTime = $RecordedTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
Value = $SnapShotEvent.Value.Value.ToString()
}
$AFEvent.pstypenames.Add('My.DataQueueItem')
Write-Output($AFEvent)
}
# 150 ms delay
Start-Sleep -m 150
}
$ArchivePipeline.Dispose()
$SnapShotPipeline.Dispose()
}

These two functions are all you need to monitor events coming into a single server. The data latency is simply the difference between the value's time stamp and the recorded time.

Measuring the data latency between two servers – for example, a local and an enterprise server – can be done the same way. You just need two server objects and then monitor the snapshot (or archive) events.

function Get-Server2ServerLatency{
param ( [PSTypeName('OSIsoft.AF.PI.PIPoint')] [Parameter(Mandatory=$true, Position=0,
ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $SourcePoint,
[PSTypeName('OSIsoft.AF.PI.PIPoint')] [Parameter(Mandatory=$true, Position=1,
ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $TargetPoint,
[double] [Parameter(Mandatory=$true, Position=2, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true)] $DurationInSeconds )
$SourceList = New-Object System.Collections.Generic.List[OSIsoft.AF.PI.PIPoint]
$SourceList.Add($SourcePoint)
$TargetList = New-Object System.Collections.Generic.List[OSIsoft.AF.PI.PIPoint]
$TargetList.Add($TargetPoint)
# create the pipeline
$SourcePipeline=[OSIsoft.AF.PI.PIDataPipe]::new( [OSIsoft.AF.Data.AFDataPipeType]::Snapshot)
$TargetPipeline=[OSIsoft.AF.PI.PIDataPipe]::new( [OSIsoft.AF.Data.AFDataPipeType]::Snapshot)
# add signups
$SourcePipeline.AddSignups($SourceList)
$TargetPipeline.AddSignups($TargetList)
# now the polling
$EndTime=(Get-Date).AddSeconds($DurationInSeconds)
While((Get-Date) -lt $EndTime){
$SourceEvents = $SourcePipeline.GetUpdateEvents(1000);
$TargetEvents = $TargetPipeline.GetUpdateEvents(1000);
$RecordedTime=(Get-Date)
# format output:
foreach($SourceEvent in $SourceEvents){
$AFEvent = New-Object PSObject -Property @{
Name = $SourceEvent.Value.PIPoint.Name
Type = "SourceEvent"
Action = $SourceEvent.Action
TimeStamp = $SourceEvent.Value.Timestamp.LocalTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
QueueTime = $RecordedTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
Value = $SourceEvent.Value.Value.ToString()
}
$AFEvent.pstypenames.Add('My.DataQueueItem')
Write-Output($AFEvent)
}
foreach($TargetEvent in $TargetEvents){
$AFEvent = New-Object PSObject -Property @{
Name = $TargetEvent.Value.PIPoint.Name
Type = "TargetEvent"
Action = $TargetEvent.Action
TimeStamp = $TargetEvent.Value.Timestamp.LocalTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
QueueTime = $RecordedTime.ToString("yyyy-MM-dd HH:mm:ss.fff")
Value = $TargetEvent.Value.Value.ToString()
}
$AFEvent.pstypenames.Add('My.DataQueueItem')
Write-Output($AFEvent)
}
# 150 ms delay
Start-Sleep -m 150
}
$SourcePipeline.Dispose()
$TargetPipeline.Dispose()
}

Here is a quick test of a PI-to-PI interface reading from and writing to the same server, using the Get-PointReference helper defined above to resolve the two PI points:

$SourcePoint = Get-PointReference $srv "sinusoid"
$TargetPoint = Get-PointReference $srv "sinusclone"
Get-Server2ServerLatency $SourcePoint $TargetPoint 30

As you can see, the difference between target and source is a bit over 1 second, which is to be expected since the scan rate is 1 second.

SUMMARY

Data latency is a key metric for every system that captures, stores, analyzes, or processes data. Every sequential operation adds to the overall system latency and must be accounted for. Data transport over networks is not the only major contributor; the data queues that package data into messages also add significant delays. This topic is especially important for cloud-based systems that rely on on-premises sensor data.

As shown in this blog, data latency can and should be measured and be part of the architectural planning process. As a rule of thumb, sub-second data latencies are challenging, especially as the number of data sources increases.

Please contact us for more information.

Machine Learning (ML) has seen exponential growth during the last five years, and many analytical platforms have adopted ML technologies to provide packaged solutions to their users. So, why has Machine Learning become mainstream?

Let's take a look at Multivariate Analysis (MVA). While its algorithms have been widely available for a long time, MVA is technically still considered a subset of ML. MVA typically refers to two algorithms:

As such, MVA has become a de facto standard in manufacturing batch processing and other areas. Some typical use cases are:

In principle, industrial datasets are not different from other supervised or unsupervised learning problems, and they can be evaluated using a wide range of algorithms. Multivariate Analysis has been preferred because it offers global and local explainability. MVA models are multivariate extensions of the well-understood linear regression that provide weights (slopes) for each variable. This enables critical understanding and optimization of the underlying process dynamics, which is a very important aspect of manufacturing.

NEW CHANGES IN INDUSTRIAL MACHINE LEARNING

In the past, many ML algorithms were considered black box models, because the inner mechanics of the model were not transparent to the user. These model types had limited utility in manufacturing since they could not answer the WHY and therefore lacked credibility.

This has very much changed. Today, model explainers in ML are a very active field of research and excellent libraries have become available to analyze the underlying model mechanics of highly complex architectures.

The following shows an example of applying ML technologies to a typical MVA project type. In the original publication (https://journals.sagepub.com/doi/10.1366/0003702021955358), several preprocessing steps were studied together with PLS to build a predictive model. All steps were performed using commercial off-the-shelf software, with the analysis worked through manually.

Using ML pipelines, the same study can be structured as follows:

# Imports added for completeness; SNV, MSC and SavitzkyGolay are custom spectral
# preprocessing transformers, and `score` and `kf_10` (a 10-fold CV splitter)
# are assumed to be defined elsewhere in the project.
import numpy as np
import xgboost as xgb
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LinearRegression

pipeline = Pipeline(steps=[('preprocess', None), ('regression', None)])
preprocessing_options = [{'preprocess': (SNV(),)},
                         {'preprocess': (MSC(),)},
                         {'preprocess': (SavitzkyGolay(9,2,1),)},
                         {'preprocess': (make_pipeline(SNV(), SavitzkyGolay(9,2,1)),)}]

regression_options = [{'regression': (PLSRegression(),), 'regression__n_components': np.arange(1,10)},
                      {'regression': (LinearRegression(),)},
                      {'regression': (xgb.XGBRegressor(objective="reg:squarederror", random_state=42),)}]

# build the full grid: every preprocessing option combined with every regressor
param_grid = []
for preprocess in preprocessing_options:
    for regression in regression_options:
        param_grid.append({**preprocess, **regression})

search = GridSearchCV(pipeline, param_grid=param_grid, scoring=score, n_jobs=2, cv=kf_10, refit=False)

This small code example tests every combination of preprocessing and regression steps, then automatically selects the best model. [A combination of SNV (Standard Normal Variate), first derivative, and XGBoost showed the highest cross-validated explained variance of 0.958.]

The transformed spectra and the model weights can be overlaid to provide insights into the model mechanics:

Conclusion

Multivariate Analysis (MVA) has been successfully applied in manufacturing and is here to stay. But there is no doubt that Machine Learning (ML) data engineering concepts will be widely applied to this domain as well. Pipelines and autotuning libraries will ultimately replace the manual work of data transformation selection, model selection, and hyperparameter tuning. New ML algorithms and deep learners, in combination with local and global explainers, will expand Manufacturing Intelligence and provide key insights into process dynamics.

Special Thanks

Thanks to Dr. Salvador Garcia-Munoz for providing code examples and data sets.

For more information, please contact us.

Detailed equipment and batch data models set up by pharmaceutical and biotech companies have enabled the creation of equipment-centric machine learning (ML) models, for example for batch evolution monitoring. The next step is to extend the existing equipment-centric models and create process or end-to-end models.

The challenge is that the current data models do not fully support the extension:

- Equipment models are based on the ISA-95 structure and reflect only the physical layout of the manufacturing facilities.
- Batch Execution Systems (BES) are integrated using ISA-88 and cover only equipment that is controlled by the batch execution system. Often, BES systems are set up to execute single unit procedures, and subsequent processing steps are executed separately.
- Manufacturing Execution Systems (MES) typically map the entire process and material flow, but as a level 3+ system an MES is difficult to integrate into a data modelling pipeline.
- There are also facilities that use paper-based process tracking instead of MES/BES, which makes traceability even more challenging.

Batch-to-batch traceability can quickly become very complex, especially when many different assets are involved. The following shows an example of a reactor train in a biotech facility:

It shows all the different product pathways from reactor ‘01’ to the final processing step; the example in red is 01, 11, 22, 33, 44. At any moment in time, the other reactors are either being cleaned or used for a parallel process.

Such a process is difficult to model in a BES or MES system, and real-time visibility or historical analysis is very challenging. This is especially true if subsequent processing steps are to be included (chromatography, fill and finish, and so on).

The missing link for modelling the different pathways is to integrate each transfer between reactors or equipment. OSIsoft AF offers the AF Transfer model, which is fully integrated into the AF system. An AF Transfer event can be defined with the following out-of-the-box properties:

- Source Equipment
- Destination Equipment
- Start Time
- End Time

The AF Transfer model has many of the same features that AF event frames offer. Transfers can be templated and, through the in- and outflow ports, defined at different granularities.

Once the transfer between equipment has been defined, batches can be traced back in real time, with or without using the batch ID. This is possible through the equipment and time context of the transfer model:

In this case, starting from the end reactor ‘44’, all previous steps can be retraced by going backwards in time and using the source-destination equipment relationships.

The implementation requires a data reference to configure each transfer. The configuration user interface requires the following attributes:

- Destination Element: attribute of the destination element
- Name: name of the transfer
- Optional: description, batch ID, and total

The result is that transfer logs can be matched to the corresponding unit procedures by time and equipment context, as shown below:

As shown in this example, the end time of transfer log 'Transfer Id S7MZUDGK' matches the start time of unit procedure 'Batch Id WNJ6H99R'. The entire pathway can now be reconstructed in one query.
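As an illustration of this backward trace, the Python sketch below assumes the transfer logs have already been retrieved (for example via the AF SDK or PI Web API) as simple records with source, destination, start, and end times; the equipment names and times are hypothetical:

from dataclasses import dataclass
from datetime import datetime

@dataclass
class Transfer:
    source: str
    destination: str
    start: datetime
    end: datetime

# hypothetical transfer logs for the reactor train 01 -> 11 -> 22 -> 33 -> 44
transfers = [
    Transfer("01", "11", datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 9)),
    Transfer("11", "22", datetime(2024, 1, 2, 8), datetime(2024, 1, 2, 9)),
    Transfer("22", "33", datetime(2024, 1, 3, 8), datetime(2024, 1, 3, 9)),
    Transfer("33", "44", datetime(2024, 1, 4, 8), datetime(2024, 1, 4, 9)),
]

def trace_back(equipment, before, logs):
    # walk backwards in time using the source-destination relationships
    path = [equipment]
    while True:
        candidates = [t for t in logs if t.destination == equipment and t.end <= before]
        if not candidates:
            return path
        last = max(candidates, key=lambda t: t.end)  # most recent transfer into this unit
        equipment, before = last.source, last.start
        path.append(equipment)

print(trace_back("44", datetime(2024, 1, 5), transfers))  # ['44', '33', '22', '11', '01']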

Conclusion

The sequence of discrete processing events, such as unit procedures, can be modelled using the OSIsoft AF Transfer class. The resulting transfer logs allow the process to be retraced backwards in time using the source-destination relationship of the transfer model. Modelling the process flow is key to expanding equipment-centric ML models.

Please contact us for more information.
