August 5, 2024

Data Quality - The Elephant in the Room

Adrian Turner

Chief Technology Officer

At the risk of stating the obvious, the power of embedding data science automation into tower networks relies entirely on the quality and quantity of the available data. One might be forgiven, then, for assuming that without high-quality, ubiquitous data from individual sites, there’s no way to deliver visibility and optimisations across a network.

This issue sits front and centre when we open discussions with new customers, and the concern we hear most often is this:

Since systems are only as powerful as their weakest element, and PowerX doesn’t manufacture on-site data collection systems, how can you provide the highest quality automation?

It’s a fair question. Remote and reliable data availability has been one of the biggest hurdles preventing the industry from moving faster towards data-driven decisions, and there are indeed some specific data challenges facing the tower industry right now. Connectivity gaps can prevent site data from being available. Elements of the powertrain may not be connected for central monitoring. Sensor data might be inconsistent with power equipment telemetry – sometimes erroneous, occasionally missing entirely. These are familiar concerns that we address daily.

None of these problems are insurmountable

and, more importantly, the quality and volume of existing data streams are already unlocking improvements that would otherwise sit untapped in this sometimes less-than-perfect data. And these improvements – delivering increased efficiencies, extending asset life, reducing fuel costs and emissions – provide sound incentives for the wider data gaps to be addressed one by one.

The key message we deliver is this: data fidelity is not an ‘all-or-nothing’ job. Tower operators shouldn’t sit waiting for perfect data on every site before they act, because the returns on data science-led optimisation are already being realised. As data quality and quantity improve, these benefits will only grow and become more essential to network growth and profitability.

Some background: RMS (remote monitoring system) units were originally installed to meet a contractual obligation to monitor sites. It’s nobody’s fault, but at the time not enough focus was placed on connecting every moving part of the powertrain or on collecting granular data. The evolution of IoT (the Internet of Things) now allows much more granular and accurate data to be captured, raising awareness of the need for data-driven decisions. But as we move toward a data-centric model of tower management, this legacy has made the collection of robust data a continuing challenge.

But not all data is the same, and the situation is not as dire as many imagine.

Data is our raw material, and PowerX has developed a robust set of analytical tools used to evaluate its quality against several benchmarks. The following graphs represent real analyses of data taken from a representative sample of tower sites that our platform monitors.

In all the graphs, the vertical axis shows the quality of a data input, calculated from several metrics that are synthesised into an overall quality percentage rating. The horizontal axis represents individual tower sites.
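To make that rating concrete, here is a minimal Python sketch of how such a percentage might be synthesised. The sub-metrics (completeness, freshness, plausibility) and their weights are illustrative stand-ins, not PowerX’s actual scoring method:

```python
# Illustrative sketch only -- not PowerX's production scoring method.
# Synthesises hypothetical sub-metrics into one quality percentage per site.
from dataclasses import dataclass

@dataclass
class QualityMetrics:
    completeness: float  # fraction of expected readings actually received (0-1)
    freshness: float     # fraction of readings arriving within their SLA window (0-1)
    plausibility: float  # fraction of readings passing range/consistency checks (0-1)

# Assumed weights; a real system would calibrate these per data source.
WEIGHTS = {"completeness": 0.4, "freshness": 0.3, "plausibility": 0.3}

def quality_score(m: QualityMetrics) -> float:
    """Weighted blend of sub-metrics, expressed as a 0-100% rating."""
    raw = (WEIGHTS["completeness"] * m.completeness
           + WEIGHTS["freshness"] * m.freshness
           + WEIGHTS["plausibility"] * m.plausibility)
    return round(100 * raw, 1)

# Example: a site with gaps in its feed but otherwise sane values.
print(quality_score(QualityMetrics(completeness=0.7, freshness=0.9, plausibility=0.95)))  # 83.5
```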

This first graph shows an aggregation of all the data sources from a representative selection of tower sites in our key markets. Collectively, the unified data sources sit in the 30-80% quality range, but that’s a broad spread and doesn’t provide much granularity. We must dig into the details to see why there is such a range.

Breaking it down by type, we can look first at the quality of data received on a tower’s overall system load. Almost all of this reporting sits in the 80-100% quality range: good-quality data, instantly available for data science analysis and optimisation. This metric is crucial for understanding the changing nature of energy demand on individual sites, so that power operations and planning teams can re-dimension sites when telecom demand changes.
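As a simple illustration of what that dimensioning check might look like (thresholds and data shapes here are assumptions, not PowerX parameters):

```python
# Illustrative sketch: flag sites whose recent peak load approaches rated
# capacity, so planning teams can re-dimension before demand outgrows supply.

def sites_to_redimension(peak_load_kw: dict[str, float],
                         rated_capacity_kw: dict[str, float],
                         headroom: float = 0.85) -> list[str]:
    """Return site IDs whose peak load exceeds `headroom` of rated capacity."""
    return [site for site, peak in peak_load_kw.items()
            if peak > headroom * rated_capacity_kw[site]]

peaks = {"TWR-001": 7.9, "TWR-002": 4.1, "TWR-003": 9.6}     # hypothetical sites
capacity = {"TWR-001": 10.0, "TWR-002": 10.0, "TWR-003": 10.0}
print(sites_to_redimension(peaks, capacity))  # ['TWR-003']
```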

Next, the quality of reporting on grid power – one of the metered points installed when the utility supply is set up. Installing meters is a key enabler for accurately reconciling actual energy consumption against the costs predicted by the utility companies. It’s also critical for measuring grid quality and understanding, at a granular level, how much energy each tenant consumes. This telemetry identifies any deterioration in grid quality or classification that could lead to poor voltage or outages, which in turn require alternative power to maintain quality of service and uptime. Again, this is great data overall, and a multitude of analyses and optimisations can be applied using this telemetry.
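The reconciliation use case is easy to picture in code. This is a minimal sketch, assuming a simple tolerance band and hypothetical figures, not our billing logic:

```python
# Illustrative sketch: reconcile metered grid consumption against the utility
# invoice for a billing period. Tolerance and field names are assumptions.

def reconcile_grid(metered_kwh: float, invoiced_kwh: float,
                   tolerance: float = 0.02) -> str:
    """Compare site-metered energy with the utility's invoiced energy."""
    deviation = (invoiced_kwh - metered_kwh) / metered_kwh
    if abs(deviation) <= tolerance:
        return f"OK: invoice within {tolerance:.0%} of metered usage"
    return f"DISPUTE: invoice deviates {deviation:+.1%} from metered usage"

print(reconcile_grid(metered_kwh=12_480, invoiced_kwh=13_350))
# DISPUTE: invoice deviates +7.0% from metered usage
```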

The following graph – solar – is where we witness a slight dip in quality. The best data sources peak at around 80%, with most of the data sitting in the 60-80% range. Many of these gaps are caused by equipment failure, which for solar controllers is often precipitated by overheating when controllers are not suitably matched to the voltage delivered by the arrays. Sometimes data mappings may hide solar telemetry under other data groupings, making solar data appear as if it is missing. Many of these issues can be solved with data science – unpicking solar from other data streams, monitoring arrays and controllers for predictive equipment failure, and using asset registers to ensure that controllers are correctly sized for array voltages.
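The controller-sizing check in particular lends itself to automation. Here is a sketch of how an asset register might be screened, assuming hypothetical field names and a safety margin of our own choosing:

```python
# Illustrative sketch: use an asset register to flag solar controllers whose
# maximum input voltage sits too close to the array's open-circuit voltage --
# the overheating/failure pattern described above. Margin is an assumption.

def undersized_controllers(register: list[dict], margin: float = 1.25) -> list[str]:
    """Flag sites where controller max voltage < margin * array open-circuit voltage."""
    return [asset["site_id"] for asset in register
            if asset["controller_max_v"] < margin * asset["array_voc_v"]]

register = [  # hypothetical asset-register rows
    {"site_id": "TWR-010", "array_voc_v": 150.0, "controller_max_v": 250.0},
    {"site_id": "TWR-011", "array_voc_v": 190.0, "controller_max_v": 200.0},
]
print(undersized_controllers(register))  # ['TWR-011']
```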

It’s when we look at data for two other typical elements of the powertrain – fuel and generators – that we start to see a decline in data quality. For gensets, the data quality is polarised (or bimodal, if you’re a statistician). At the bottom sit the older generators, many pre-dating IoT; at the top, newer models that capture a much wider and more detailed array of variables.

Generators have a lifespan of around 20,000 hours, so the older models at the bottom of the graph are being continuously and organically replaced. Each time that happens, one more data point moves from the bottom of the graph to the top. Theoretically.

Why theoretically? Modern generators can deliver an incredibly diverse range of telemetry. But when planning or procurement looks at trimming installation costs, one of the first casualties is the generator controller – the unit that gathers all these data sources and passes them upstream to the operations team. It seems a sensible decision at the time – surely the NOC doesn’t need to see all those variables? – but it’s a false economy. Failing to invest in broad-spectrum controllers blocks key use cases such as expected vs. actual CPH (fuel consumption per hour), generator efficiency and fuel integrity, along with a host of other warning signs and inefficiencies that can have a profound impact on fuel usage and generator life expectancy.
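To show why that telemetry matters, here is a sketch of the expected-vs-actual CPH check. The linear fuel curve, consumption figures and alert threshold are illustrative assumptions, not calibrated values:

```python
# Illustrative sketch: compare a genset's actual fuel consumption per hour
# (CPH) against an expectation derived from its load. A sustained gap can
# point to inefficiency, leaks or fuel theft.

def expected_cph(load_kw: float, rated_kw: float,
                 idle_lph: float = 2.0, full_load_lph: float = 14.0) -> float:
    """Rough linear fuel curve between idle and full-load consumption."""
    return idle_lph + (full_load_lph - idle_lph) * (load_kw / rated_kw)

def cph_alert(actual_lph: float, load_kw: float, rated_kw: float,
              threshold: float = 0.15) -> bool:
    """True when actual CPH exceeds expectation by more than `threshold`."""
    return actual_lph > (1 + threshold) * expected_cph(load_kw, rated_kw)

# A 40 kW genset running at half load should burn ~8 L/h; 11 L/h is suspect.
print(cph_alert(actual_lph=11.0, load_kw=20.0, rated_kw=40.0))  # True
```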

The same holds true for the battery data. Although it exhibits a varied profile now, we’re seeing similar improvements coming down the line as the newest lithium-ion battery technology is adopted during upgrade projects, bringing more detailed telemetry on battery performance. As with the genset data, PowerX is seeing steady and dependable increases in the quality and granularity of this data.

Unfortunately, fuel data continues to be suboptimal.

The industry needs to invest much more in the quality and robustness of fuel sensors. The current shortfall compounds and propagates an over-reliance on manual dip-stick measurements – a tried-and-tested ‘old school’ method of gathering information on fuel levels that is simply unscalable (and unacceptable) in today’s IoT age.

Unlike generators and batteries – which benefit from the organic process of ageing out and being replaced – there’s no similar process happening with fuel sensors. That’s ironic, given that fuel theft, in its many and increasingly sophisticated incarnations, has a significant impact on a tower network’s OpEx. PowerX is actively challenging vendors in this space to step up and tackle the issue, because it hits the bottom line all the way through the value chain.
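One common data-science approach to the theft problem (sketched here with hypothetical readings and burn rates, not our production rules) is to flag level drops that the generator’s run time cannot explain:

```python
# Illustrative sketch: flag abrupt fuel-level drops that can't be explained by
# the genset's run time since the previous reading -- a common theft signature.

def suspicious_drops(readings: list[tuple[float, float]],  # (litres, hours run since last reading)
                     burn_rate_lph: float = 10.0,
                     slack_litres: float = 20.0) -> list[int]:
    """Return reading indices where the level fell faster than the genset could burn."""
    flags = []
    for i in range(1, len(readings)):
        drop = readings[i - 1][0] - readings[i][0]
        explained = readings[i][1] * burn_rate_lph
        if drop > explained + slack_litres:
            flags.append(i)
    return flags

readings = [(900.0, 0.0), (850.0, 5.0), (600.0, 2.0)]  # litres, hours run
print(suspicious_drops(readings))  # [2]: a 250 L drop vs ~20 L of burn
```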

We’re also confronting this issue ourselves head-on.

For sites without an existing RTU or RMS for data capture, PowerX Connect proprietary firmware bridges the gap between the physical assets on a site and PowerX Sentry, the cloud-based module of our platform. The firmware can be updated over the air to ensure the latest software is always running, data is stored locally in case of communication failure, and PowerX maintains all drivers and pushes the latest firmware to the on-site device when required.
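The local-storage behaviour described above is the classic store-and-forward pattern. Here is a generic Python sketch of the idea; it is not the PowerX Connect implementation (which runs as on-device firmware), and the buffer size and payload shape are assumptions:

```python
# Illustrative sketch of store-and-forward: buffer readings locally and flush
# upstream when connectivity returns, so a link outage loses no data.
import json, time
from collections import deque

class StoreAndForward:
    def __init__(self, send, max_buffer: int = 10_000):
        self.send = send                        # callable that uploads one record
        self.buffer = deque(maxlen=max_buffer)  # oldest records drop if full

    def record(self, reading: dict) -> None:
        """Queue a reading locally, then attempt to flush everything queued."""
        self.buffer.append({"ts": time.time(), **reading})
        self.flush()

    def flush(self) -> None:
        """Upload queued readings; stop (and retry later) on the first failure."""
        while self.buffer:
            try:
                self.send(json.dumps(self.buffer[0]))
            except ConnectionError:
                return  # link is down; keep the data for the next attempt
            self.buffer.popleft()

# Example with a stand-in uploader that just prints the payload.
saf = StoreAndForward(send=print)
saf.record({"site": "TWR-042", "fuel_l": 612.5})
```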

But for all these metrics – system load, grid, solar, battery, genset or fuel – the data (although not always perfect) is granular, robust, accurate and reliable enough for optimisations to be effective and far-reaching. Data science delivers significant benefits even with some of the problematic data identified in our network analyses. And as TowerCos and Mobile Operators continue what they’re already doing – upgrading sensors, replacing assets, increasing points of telemetry – these benefits are only going to increase and become more essential to the growth and profitability of tower networks.

There’s still work to be done improving data quality, but we’ve seen a step change in awareness, demand, commitment, and the availability of solutions. It bears repeating: this is not an ‘all-or-nothing’ approach. Waiting for ‘perfect data’ before implementing data-based optimisation could end up costing tower companies millions in lost efficiencies.
