A Technical Yet Accessible Guide to Spain’s Blackout Investigation: Week 1
I’d like to think that, despite all the non-technical issues surrounding the Iberian blackout, there are still people genuinely interested in using the inertia (pun intended) of this historic event to deepen their knowledge of the Spanish power grid (or power grids more generally), and to be better prepared to understand, or even help prevent, potential cyber-related scenarios in the future (‘cyber-informed engineering’ is a concept I like). If that’s the case, you might find this guide useful.
In this first week, I’ll dive into the cyber-physical side of photovoltaic (solar) power plants, since official statements in the days following the blackout pointed to widespread generation losses in this type of plant, in the southwest of Spain.
Let's begin.
The Cyber Dimension of the Blackout
Over the weekend, the Spanish government announced the formation of two working groups to investigate the root causes of the blackout, one focused on the 'technical error' angle and the other on 'cyber'. While the two areas are clearly interconnected, splitting the effort allows for more targeted expertise and analysis. The cyber investigation involves Spain’s National Cryptologic Centre (CCN) and, by extension, the National Intelligence Centre (CNI), along with INCIBE. Together, they are conducting a cybersecurity inquiry to determine whether a cyberattack played a role in the blackout.
One of the initial steps was to verify that REE’s control centers were secure, something that now appears to have been confirmed. However, the Spanish power grid includes a wide range of private companies, particularly in the renewable energy sector. As reported by ElEconomista, the CNI and CCN are currently assessing both the physical (including personnel) and cyber security posture of the 36 REE-authorized Generation Control Centers (CCGs). While such evaluations are standard in sectors like nuclear, where the focus is on preventing sabotage, terrorism, and proliferation, this level of scrutiny is surely new for other types of generation facilities.
These CCGs are run by private companies and authorized by REE to ensure generation plants comply with Spanish grid codes, serving as the telemetry and telecontrol interface between specific groups of plants and the appropriate REE control center. In particular, photovoltaic and other renewable power plants must choose one of the 34 authorized CCGs, out of the 36 available.
Here’s a diagram I made to make things a bit clearer. It reflects the massive challenge, from the cyber perspective, of modern, decentralized power grids, especially in countries where renewable energies (decentralized by nature) play a key role (56% of the Spanish power mix in 2024). Let's elaborate on it.
*Those terms in Spanish will be bolded for clarity.
The CECRE
The CECRE (Centro de Control de Energías Renovables) is a specialized control center operated by the Spanish electricity transmission system operator (TSO, Red Eléctrica de España -REE-), and dedicated exclusively to the supervision and control of renewable energies. Some readers might be wondering why a separate control center, distinct from REE's main center (CECOEL), is necessary.
Renewable energies pose a very specific technological challenge for the interconnected power grid. This is acknowledged by every tech person involved in this field, modeled in every paper published on the topic, and carefully described in official incident reports across countries, and there are many different technological solutions in place to address some of these challenges: synthetic inertia, batteries... This doesn't mean these sources are unreliable; it just means power grids need to adapt to them. The good news is that REE has been working for years to ensure the Spanish grid can effectively integrate this essential form of clean energy (hopefully with nuclear power in the mix as well, which would position Spain as a leading zero-emission, science-friendly country). For some historical context, in an article from a few years ago describing the CECRE architecture, we find the following introduction:
"However, the push by European and national administrations to promote the development of wind-based generation should not inhibit critical analysis that highlights the aspects of these technologies that need improvement. This is not a limitation, but rather an opportunity to carry out actions which, if taken in the right direction, will maximize the integration of wind generation into the broader mix of technologies necessary to meet electricity demand. Among these actions is the CECRE"
All renewable energy generation plants with a capacity over 1 MW are required to report telemetry data (active power, P) every 12 seconds to the CECRE via the CCGs. For plants over 5 MW, the CECRE also requires additional telemetry, such as reactive power (Q) and voltage, and the ability to receive and execute remote control commands, including active power set-points. All this information flows through the CCGs (CCRs in the diagram below); there is no direct connection between the CECRE and the renewable energy generation plants.
The communication between the CECRE and the CCGs uses the ICCP protocol over the internet, via secured and redundant communication links.
It's worth mentioning that some of the CCGs are operated by energy giants such as Iberdrola, Naturgy, Repsol or Endesa, which are also involved in non-renewable, strategic generation plants designated as critical infrastructure, so a mature security posture should be assumed for at least some of these CCGs.
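To make those thresholds concrete, here is a minimal Python sketch of the capacity-based rules described above. The class and function names are hypothetical, and treating "over 1 MW" and "over 5 MW" as strict inequalities is an assumption; the actual grid code wording should be checked before relying on the boundaries.

```python
from dataclasses import dataclass

TELEMETRY_PERIOD_S = 12  # reporting interval mentioned in the text


@dataclass(frozen=True)
class Requirements:
    """What the CECRE expects from a plant (illustrative names only)."""
    reports_active_power: bool                 # P, every TELEMETRY_PERIOD_S seconds
    reports_reactive_power_and_voltage: bool   # Q and voltage
    accepts_remote_setpoints: bool             # e.g. active power set-points


def cecre_requirements(capacity_mw: float) -> Requirements:
    """Map installed capacity to the obligations described in the text.

    Assumption: both thresholds are strict ("over" 1 MW, "over" 5 MW).
    """
    over_1_mw = capacity_mw > 1.0
    over_5_mw = capacity_mw > 5.0
    return Requirements(
        reports_active_power=over_1_mw,
        reports_reactive_power_and_voltage=over_5_mw,
        accepts_remote_setpoints=over_5_mw,
    )
```

For example, under these assumptions a 3 MW plant would only report active power, while a 50 MW plant would also report Q and voltage and accept set-points, always through its CCG.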
2nd Order Control Centers
Now things start to get more complex. There is a growing number of utilities operating renewable energy plants, either through their own control centers or by outsourcing to managed service providers. These second-order control centers, let’s call them SOCCs, aren’t directly connected to REE; instead, they interface with the CCGs. It’s the responsibility of the CCGs to coordinate with the SOCCs and agree on the technical requirements needed to enforce REE’s grid requirements and ensure proper data transmission. On top of that, SOCCs also handle the day-to-day operation and maintenance (O&M) of the generation assets they manage.
To illustrate the issue, let’s focus on a photovoltaic power plant, since REE initially linked the observed generation losses to this type of facility. I’ve put together the following diagram to help keep everyone focused.
The Power Plant Controller (PPC) is the 'brain' of the plant. It monitors and controls the operation in real time, coordinating all inverters, and managing the plant’s response to TSO's signals (in this case, CECRE's orders sent via the CCGs). The PPC also receives a series of PQ measurements from the Point of Interconnection (POI), where the generated electricity is injected into the grid, which allows it to precisely regulate power output based on what’s actually reaching the grid.
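As a rough illustration of that closed loop, here is a toy Python sketch: the controller compares the set-point with the power measured at the POI and nudges every inverter command accordingly. This is not how any real PPC is implemented; the equal split among inverters, the gain, the 2% internal losses and all names are assumptions made for the example.

```python
def ppc_step(setpoint_mw, poi_measured_mw, inverter_cmds_mw,
             inverter_rating_mw, gain=0.5):
    """One control cycle: move each inverter command to reduce the POI error."""
    error_mw = setpoint_mw - poi_measured_mw
    delta_mw = gain * error_mw / len(inverter_cmds_mw)  # equal share per inverter
    # Clamp each command to what an inverter can physically deliver.
    return [min(max(cmd + delta_mw, 0.0), inverter_rating_mw)
            for cmd in inverter_cmds_mw]


def simulate(setpoint_mw, n_inverters=10, rating_mw=5.0,
             losses=0.02, cycles=50):
    """Run the loop against a toy plant where the POI sees the sum of
    inverter outputs minus a fixed fraction of internal losses."""
    cmds = [rating_mw] * n_inverters  # start at full output
    for _ in range(cycles):
        poi_mw = sum(cmds) * (1.0 - losses)  # what the POI meter reports
        cmds = ppc_step(setpoint_mw, poi_mw, cmds, rating_mw)
    return sum(cmds) * (1.0 - losses)  # power finally reaching the grid
```

With a 30 MW set-point, the simulated plant settles at 30 MW at the POI even though the inverters have to produce slightly more to cover the losses. That is the point of regulating on the POI measurement rather than on the inverter commands themselves.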
The utility (via SOCCs) will also connect remotely to carry out O&M tasks, so ideally, there should be a solid security posture in place to prevent any unauthorized access, on top of proper network segmentation within the plant. That said, in this case, I have serious doubts about how well these deployments were actually implemented. It seems some companies have done their job properly securing these assets, but others are still far behind. For example, as I reported to INCIBE last week, many of these systems are publicly accessible due to misconfigured VPNs, which ended up exposing all sorts of local network components to the internet, including PLCs and SCADA systems. Not a good sign, if you ask me.
Spain's Sistema de Reducción Automática de Potencia (SRAP) - Automatic Power Reduction System -
In this context, one of the cyberattack scenarios I spent some time analyzing last week involved the possibility of a large-scale, coordinated, and simultaneous attack that could have triggered a sudden and massive disconnection of photovoltaic (PV) power plants. Keep in mind that within less than 5 seconds, 15 GW of generation capacity was lost, most of it reportedly from PV sources.
Before going any further, I want to make it clear that this is just a scenario I analyzed because I think it offers useful insight into how the Spanish power grid operates. I'm not suggesting that this is what actually happened.
In 2022, REE introduced a system known as SRAP (Sistema de Reducción Automática de Potencia, or Automatic Power Reduction System). The goal of this grid code update was to enhance grid stability in response to the growing integration of renewable energy. Under this system, the TSO (REE) can instruct plants that have voluntarily enrolled in the SRAP program to reduce their power output at any time. If you're wondering why a company would choose to enroll in an unpaid, voluntary program: there are indirect economic benefits related to how the energy market works.
Although there are no official figures on how many PV plants are currently enrolled, based on unofficial data from 2022 and 2023 and accounting for the continued growth of PV in the energy mix, I estimate that around 35% of PV plants may now be part of this program, representing roughly 11 GW of capacity.
SRAP defines three response modes for power reduction: fast, medium, and slow. In fast mode, the PV plant has to drop its power generation to zero in under 5 seconds, which, I admit, instantly left me thinking...
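To get a feel for what the fast mode implies at plant level, here is a small Python sketch. It only encodes the "zero in under 5 seconds" figure mentioned above; the function names, the shape of the power trace and the way compliance is checked are assumptions for illustration, not the SRAP specification.

```python
FAST_MODE_DEADLINE_S = 5.0  # "under 5 seconds", as stated in the text


def min_ramp_rate_mw_per_s(output_mw, deadline_s=FAST_MODE_DEADLINE_S):
    """Average ramp-down rate needed to reach zero within the deadline."""
    return output_mw / deadline_s


def meets_fast_mode(trace, command_time_s,
                    deadline_s=FAST_MODE_DEADLINE_S, zero_tol_mw=0.0):
    """Check a recorded power trace against the fast-mode deadline.

    trace: list of (timestamp_s, power_mw) samples sorted by time.
    Returns True if output first reaches zero (within zero_tol_mw)
    less than deadline_s after the command was received.
    """
    for timestamp_s, power_mw in trace:
        if timestamp_s >= command_time_s and power_mw <= zero_tol_mw:
            return (timestamp_s - command_time_s) < deadline_s
    return False  # never reached zero in the recorded window
```

A plant producing 50 MW would need to shed at least 10 MW per second on average to comply, which is fast by power plant standards but well within what inverter-based generation can do.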
Three Incidents of Generation Loss
Yesterday, the government announced that a preliminary data analysis has identified a third generation loss event, which occurred 19 seconds before the blackout.
I checked GridRadar’s Malaga PMU graph, and you can clearly spot the generation loss, as the frequency shows a drop.
Just to be sure, I reached out to Rafael Segundo (via a LinkedIn comment) from the Zürich University of Applied Sciences, where a team seems to be actively researching the oscillations that preceded the blackout, and they confirmed it.
Another thing I mentioned in that comment, and maybe some of the more experienced readers can weigh in, is the possibility of estimating how many MW were 'lost' during those frequency drops. That could help us better understand the type of plant (or group of plants) involved in the events.
Apparently there are standard ways to calculate this, but it would require knowing the frequency bias coefficient, which ENTSO-E establishes every year for each European grid zone and which does not seem to be public for Spain.
So I tried to estimate it using the total TSO K-factor for the Continental Europe grid and Spain’s approximate share of it, around 7% for the peninsula, which gives a bias of roughly 2000 MW/Hz. That means a 50 mHz frequency drop, like the one observed during the second generation loss event at 12:33:16, would correspond to a generation loss of about 100 MW, which, coincidentally, matches the statement made by the Spanish photovoltaic industry representative:
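The relation behind that kind of estimate is a simple steady-state one: lost generation is roughly the bias coefficient multiplied by the frequency deviation. Here is a minimal Python sketch of it, using the rough figures from this article (a bias of about 2000 MW/Hz and a share of about 7%) purely as example inputs. They are not official ENTSO-E values, the function names are made up, and the relation ignores inertia and the fast dynamics of a real event.

```python
def estimated_loss_mw(bias_mw_per_hz, freq_drop_hz):
    """Steady-state approximation: delta_P = bias * delta_f."""
    return bias_mw_per_hz * freq_drop_hz


def implied_total_k_factor(zone_bias_mw_per_hz, zone_share):
    """Back out the system-wide K-factor implied by a zone's bias and share."""
    return zone_bias_mw_per_hz / zone_share


# Example inputs taken from the rough estimate in this article (assumptions).
SPAIN_BIAS_MW_PER_HZ = 2000.0
SPAIN_SHARE = 0.07
FREQ_DROP_HZ = 0.050  # a 50 mHz drop

loss_mw = estimated_loss_mw(SPAIN_BIAS_MW_PER_HZ, FREQ_DROP_HZ)
total_k = implied_total_k_factor(SPAIN_BIAS_MW_PER_HZ, SPAIN_SHARE)
print(f"Estimated loss: about {loss_mw:.0f} MW")
print(f"Implied Continental Europe K-factor: about {total_k:.0f} MW/Hz")
```

Since the estimate scales linearly with the bias, any error in the assumed coefficient carries straight through to the MW figure, so it should be read as an order of magnitude rather than a measurement.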
"We don’t know what kind of incident occurred at those two plants in the southwest of Spain, presumably located in Huelva or Extremadura, but it really doesn’t add up that one or two plants of at most 100 megawatts (MW) could cause the collapse of an entire power system"
Conclusions
‘Black swan’ is a term from outside the cyber world that became popular in the cyber sphere years ago: basically, it's an unpredictable or extremely rare event with major consequences. Should we consider the Iberian blackout a ‘black swan’? Well, I guess yes and no. Objectively, it’s an extremely rare event that had only been hypothesized but, for now, let’s be cautious about its unpredictability, because the initiating event is still unknown.
Remember that the complex cyber-physical systems that sustain critical infrastructures, such as nuclear power plants, the power grid or the safety systems onboard aircraft, are designed to withstand a single point of failure. That’s why, when something really bad happens, it is assumed that a chain of errors built up to create the fatal scenario.
In the absence of any obvious physical failure, like fires or damaged lines, what triggered the oscillations in the first place? Are these oscillations related to the subsequent generation losses? Could there be a common cause failure linking power electronic devices (such as inverters) to these specific oscillations? And so on.
There are still many questions and a wide range of potential causes that the experts from the working groups will undoubtedly uncover. Keep in mind that this is a European-level investigation. In that context, it’s worth reading the statement from France's RTE, which helps set the record straight on some of the misinformation that’s been circulating.
Please, let’s try to keep everyone as well-informed as possible to prevent magical thinking, conspiracy theories, and ignorance from taking over. We all know what comes next.
Management Executive at Idfon Power Engineering Consultants (iPEC) Limited
3mo
They have come up with the likely reason and vindicated my position. Thanks. Miscalculation by Spanish power grid operator REE contributed to massive blackout, report finds | Reuters https://www.reuters.com/business/energy/investigation-into-spains-april-28-blackout-shows-no-evidence-cyberattack-2025-06-17/ They didn't have enough conventional generation to renewable electricity generation ratio defined by the Idowu Oyebanjo factor K as highlighted earlier. If it took a minute to make this conclusion based on the results of my PhD thesis, where I stated clearly that electricity systems with significant penetration of renewable electricity will witness embarrassing grid collapse events until they undertake some of the solutions I propounded, it means there is something in the way of further research and development into my PhD thesis in order to resolve the scenarios of grid collapses the world will witness in the aggressive rush towards a low carbon economy. The good news is that we have the intellectual capacity to determine the solutions to the problem! Up University of Manchester The University of Manchester
Energy Economist, Entrepreneur, Commodity Trader, Author, Investor & Strategic Advisor (also trade finance, sustainability, and tech) - ex BCG / INSEAD, Member Int'l Assoc. for Energy Econ. IAEE =>opinions are my own
4mo
Some more details here on the blackout: https://unpopular-truth.com/2025/05/16/blackouts-what-causes-them/
CEO / Head of ICS/OT Security bei GAI NetConsult GmbH
4mo
Interesting and very well written, thanks for sharing Ruben Santamarta. According to the first preliminary report of ENTSO-E (https://www.entsoe.eu/news/2025/05/09/entso-e-expert-panel-initiates-the-investigation-into-the-causes-of-iberian-blackout/) the two (or three?) generation power losses accounted for about 2.2 GW which is quite a high number:
-----
* Starting at 12:32:57 CET and within 20 seconds afterwards, presumably a series of different generation trips were registered in the south of Spain, accounting to an initially estimated total of 2200 MW.
* No generation trips were observed in Portugal and France. As a result of these events the frequency decreased and a voltage increase is observed in Spain and Portugal. Between 12:33:18 and 12:33:21, the frequency of the Iberian Peninsula power system continued decreasing and reached 48,0 Hz.
* The automatic load shedding defence plans of Spain and Portugal were activated. At 12:33:21 CET, the AC overhead lines between France and Spain were disconnected by protection devices against loss of synchronism.
* At 12:33:24 CET, the Iberian electricity system collapsed completely and the HVDC lines between France and Spain stopped transmitting power.
Very interesting Ruben Santamarta. Thanks for sharing. From a risk perspective I think that we should analyze two events separately. One is the still unknown cause of the initial perturbation, that is probably not a black swan and might even have a high probability of occurrence, but an estimated lower impact. The second one is the fact that such perturbation could tear down the entire system. And this is the real black swan. Something that never happened in the last 100 years and that should have never happened, as the system is designed to withstand any major disruption. It is important to find out the triggering cause, for prevention and liability purposes. But there might be other types of physical or cyber events with similar effects in the future. The most important thing in my opinion is to find out why the system was not capable of withstanding the oscillation and solve it asap.
CEO en Alias Robotics
4mo
Thanks Ruben for another excellent piece of work.