A NEW MODEL FOR RESILIENCE

FOR ORGANIZATIONS OF ALL SIZES

This page was previewed in the Fall edition of ISE Magazine published Sept./Oct. 2025.

A Guide to Organizational Resilience in Changing Times

A typical definition of Resiliency is: Ensuring that the operation and purpose of the organization are fulfilled under any circumstances. This means anticipating, withstanding, adapting to, and swiftly recovering from accidents, failures, and deliberate attacks as seamlessly as possible. It requires a plan that responds dynamically to changing conditions, incidents and financial considerations spanning the whole organization and beyond.

The definition itself has changed little over the years but its scope today is drastically different.

Cybersecurity and Resilience

Cybersecurity is a monster topic with hundreds of actions. It is stressful since the famous “you are only as strong as your weakest link” runs the days and nights of cybersecurity professionals. 

Resilience has always been a topic with the same stress levels, driven by the mantra “what could possibly go wrong?” For decades the risks to organizations have been largely of their own making or caused by faults that might or might not have been predicted. They were seldom caused by deliberate attacks from outside or within.

How the world has changed. Resiliency has now taken on the role of a critical topic – effectively a superset of cybersecurity. Next, this page shows the big picture, summarizing how the pieces fit together with Resilience becoming the foundation for the governance of the organization.

[Figure: resilience and cybersecurity framework]

This framework shows how executive decisions on cybersecurity translate directly into organizational resilience. Cybersecurity must be treated as an executive imperative — policies, zero trust, asset stewardship, and supply chain controls. When these imperatives are acted on, they deliver resilience outcomes the board can measure continually against risk tolerance, verified response, and compliance.

In summary, Resiliency has evolved to become the foundation that ensures the organization can deliver its purpose.

Scope and Accountability

The integration of Zero Trust, cybersecurity and resilience, together with supply chain management and business impact, naturally leads to a new accountability in organizations. It calls for the creation of a new executive position on every board: Chief Resiliency Executive/Officer (CRE/CRO). The title is less important than the role and accountability. This develops a more strategic balance between business and technical requirements than previous approaches.

The Resiliency Plan

The plan is the implementation of a methodology that anticipates, withstands, and recovers from adverse events, constantly adapts to new conditions, and is continually assessed. Any actions planned must prioritize and align with the operational and financial considerations of the organization.

The scope of the plan is considerable, reflecting the many causes of disruption. They can stem from poor architectural decisions, human errors, insufficient cyberattack detection or response, software or data compromises, critical infrastructure failures, poorly planned mergers, supply chain failures, untested failover procedures, etc.

It’s a documented, all-encompassing framework that must prioritize actions according to the purpose of Resiliency. It has Reactive Aspects (e.g., incident discovery and recovery) and Proactive Aspects (e.g., asset stewardship and protection, of which Disaster Recovery is a subset; IT, network, system and application software; security policy and Zero Trust (ZT) implementations; supply chain integrity; and the ZT aspects of HR and policies).

The cybersecurity mantra “you are only as strong as your weakest link” applies to Resiliency. Things were very different when I wrote Disaster Recovery software for a client in simpler times, 30-plus years ago. Times are no longer simple, as the chart below shows. Implementing Resiliency is like jumping into a moving car where repairs or upgrades are done while it’s in motion and under increasing attack, without the passengers noticing!

Yes, business continuity is still the organizational purpose that lives inside the data centers, the cloud-based containers where the software applications operate and micro-segmented data are stored. 

[Figure: resilience chart]

Network, IT and Security Best Practices

If you have ever wandered into your data center and wondered why a device made by a provider that no longer exists is still connected and the lights are still on, then you know the extent of the problem. Perhaps your providers’ network has failover devices that have been untested since before their last merger. Perhaps you have an inventory of IoT devices or systems that are obsolete, have direct Internet connectivity or simply cannot be changed.

Clearly, if the infrastructure is compromised, access to the systems and software applications could be lost or degraded. Tier 1 service providers must provide alternate paths even if the performance is downgraded. So, best practices involve ensuring that there are physically separated routes to the network; should one path fail, then there must be a way of prioritizing key applications or data access.

[Figure 2: flowchart]

Asset Stewardship

Assets targeted by threats, such as vulnerable systems and application software, intellectual property, databases, and customer-sensitive data, must be checked, encrypted, and securely backed up. The above chart requires that backups be made without connection to the internet to enhance their integrity. It may be inconvenient, but having a physical or air-gapped backup is preferred. The chart also shows microsegmentation: the separation of software apps and data so that a compromise is limited to a small element of the system.

Prevention

It’s important with today’s sophisticated and evolving Ransomware as a Service attacks that threats be detected before they strike. See the feature on breach detection at cybyr.com/assumebreach.

An Incident Recovery Plan

An Incident Recovery Plan lays out what actions to take, including failover actions, when compromise, system failure, or performance degradation occurs. The priority is always the integrity of mission-critical systems and data.

Financing: Resilience as a Profit Center

Resilience and Cybersecurity can be costly if not carefully implemented. However, they have become a competitive necessity.

The recommendation is to calculate the number of customers who will only do business with organizations that have comprehensive and well-documented cybersecurity procedures, then multiply by the revenue generated from them. The second part of the equation is to divide the cost of the actions by the number of customers to see the net positive impact on profitability per customer/client. This can also be leveraged as a competitive sales and marketing advantage to create additional business revenue. As resilience evolves, the impact becomes measurable and feeds back into the decisions on threat appetite when deciding what new actions are affordable (or not).
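
The arithmetic above can be sketched in a few lines of Python. This is only an illustration under one reading of the calculation: the revenue at stake is the number of security-sensitive customers multiplied by the average revenue per customer, and the net impact per customer is that average revenue minus the cost of the actions spread across those customers. The function name, field names and all figures are hypothetical.

```python
def resilience_impact(customers: int, revenue_per_customer: float, cost_of_actions: float) -> dict:
    """Estimate revenue at stake and net impact per customer.

    Assumption: all inputs cover the same period (for example, one year).
    """
    if customers <= 0:
        raise ValueError("customers must be a positive number")
    revenue_at_stake = customers * revenue_per_customer
    cost_per_customer = cost_of_actions / customers
    return {
        "revenue_at_stake": revenue_at_stake,
        "cost_per_customer": cost_per_customer,
        "net_per_customer": revenue_per_customer - cost_per_customer,
    }


# Hypothetical figures, for illustration only.
example = resilience_impact(customers=40, revenue_per_customer=25_000.0, cost_of_actions=200_000.0)
```

A positive `net_per_customer` suggests the actions pay for themselves through retained business; a negative value is a prompt to revisit which actions are affordable.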

Verification

Best practices demand that data and systems are restored and then verified, for example by checking that they are malware-free. In larger organizations, mere duplication of key systems may not be sufficient, especially when the infrastructure spans several countries connected via local Telcos and MSPs.

If it isn’t (continually) tested, then it isn’t resilient.

So, “What could possibly go wrong?” My decade of working for a network test company showed that the unexpected almost always happens. New architectures throw up unexpected challenges or unacceptable performance occurs in failover mode. Everything must be examined.

Important: While it is not the responsibility of those running the Resiliency program or project to do the testing, it is their responsibility to collaborate with those doing the testing to verify that it was done and documented. This is obviously dependent on the size of the organization.

Examples of Testing to Ensure Resilience

  • Network infrastructure and services change or are “improved” frequently and must be assessed. This applies to the integrity of the networks, Cloud or branch-based services, connected devices and protective security software. This is pure Zero Trust thinking applied to networks and software.
  • Many features that enable failovers should be tested once and then again whenever software is automatically updated. The automatic discovery of what is operating in the network is key.
  • This necessarily involves collaboration and verification with service providers. We all know that poor regression testing from software suppliers, unverified by their customer, can be disastrous. Interrelated software may have similar resilience issues, but that is not in the scope of this article.
  • Testing of Resiliency is completely aligned to the Zero Trust Principle “Never Trust, Continually Verify.”
  • Given that Resilience requires primarily proactive preventative actions and an incident plan to react to events that occur, testing is governed entirely by the methodology and actions that an organization implements. Those responsible for testing therefore also play a key role in contributing to the development of Resiliency and Incident Recovery plans to help ensure that vulnerabilities are well understood. Developing and documenting the Resiliency Test Plan becomes as essential as the Resiliency Plan itself.

These are some examples of the scope of testing preventative measures.

  • Software from supply chains is a particular target for security attacks, so its operation needs to be verified frequently. This is where a properly configured and tested Zero Trust approach greatly reduces risk. Where this includes enforcement of identity management, authentication, access policies, security functions, policy enforcement and monitoring, the testing plan must verify that is the case.
  • One example is testing that attribute-based access control is correctly enforced, i.e., ensuring that a transaction from a privileged user comes from an approved device at an approved location and time. With properly tested Zero Trust Policy Management software, this would have prevented the infamous collapse of a hotel chain’s complete infrastructure, even if other factors such as social engineering also occurred.
  • A second example is where Zero Trust’s “Assume Breach” plays an important role in the prevention of Ransomware as a Service attacks. Tests can check for effective detection of Lateral Movement and other system Discovery software, to prevent access to key data and software as the transaction flows called out in Step 2 of the Zero Trust Implementation model traverse the infrastructure.
  • Bridging preventative and incident response resiliency is the backing up and restoration of mission-critical information. No testing can be regarded as complete until all backup systems are tested. This means that all systems are restored error-free, malware-free and functionally operational.
  • Next, as specified in the Incident Recovery Plan, restoration of key business-related components must be tested until the recovery time to full operations is minimized and seamless. New architectures throw up unexpected challenges, or unacceptable performance occurs in failover mode. Everything must be examined and verified.
  • Supply chain and collaboration. Many issues impacting the resilience of systems can arise with the installation of new systems, both in terms of functionality and performance. This is where preventative measures and collaboration between developer and client/customer are needed to make sure that proper regression testing and staged implementation avert business-terminating incidents. From a security perspective, it is important to use software written in memory-safe languages, develop using DevSecOps methods, provide a software bill of materials, etc.
  • Critical infrastructure organizations typically involve operational technology (OT) networks, so the fragile state of many vulnerable IoT/IIoT devices must not be overlooked. This includes the security of elevators and even HVAC systems.
  • Physical security must also be included. Almost every organization uses cameras and card key systems, not just vulnerable Wi-Fi systems.

Background

For the last several months, I have been working in a Cloud Security Alliance group specifically focused on the Zero Trust aspects of Resilience, including the application of the European Digital Operational Resilience Act (DORA), the ISO standard for Business Impact Analysis, etc.

Conclusion

Due to the increasing use of Cloud systems, distributed networks and, of course, cybersecurity, Resilience has become another business imperative. For organizations that do not want to be victims of disruption, the question is “How soon can we make our plan robust, tested and integrated into everything we do?”

We have only scratched the surface of this monster topic to help those in the world of IT and secure networking get a sense of the difference they can make to the success of their organization. As we said at the beginning, the elevation of Cybersecurity to an executive level has created a similar rise in the role of Resilience in delivering the organization’s business outcomes.

What Could Possibly Go Wrong?

Stewardship and Curation

Stewardship is the development and governance of a Resilience Plan. This plan covers what should be done to ensure rapid discovery of, and recovery from, system failure, whether caused by a cybersecurity incident or other reasons. Asset Curation is the implementation of the plan over time. Both are measurable and reportable as part of an overall management plan and overall security plan.

Curating Your Assets

Asset curation is about knowing what assets you are responsible for, what the business impact would be should they be compromised, and therefore the priority of what should be protected. Each asset may be vulnerable to attack or loss and, to follow Zero Trust strategy, its protection must be continually validated so that business risk is minimized.

The good news is that for small businesses the majority of these processes do not require unbudgeted cost, technical expertise or even great technical experience.

However, asset curation is based on fundamentals, i.e., executive awareness and commitment that responsibility for minimizing business risk is now an essential part of both executive and departmental thinking across the organization and beyond. It does not stop when contracting with suppliers of all kinds. The strength or vulnerability of their assets becomes your responsibility to verify too. The awareness begins with the realization that cybersecurity has effectively become a war that will affect all organizations. 75% of attacks begin with attacks on data backups.

These are the principles that should be the basis of every organization’s security policy and create the context for asset curation.

The Elements of Asset Curation

Discover every asset you have inside and outside the organization and the transactions between them.

Discovery

The process of discovery produces, at a minimum, living documentation. It can be an automated process, and is likely to migrate to adaptive AI-based tools as they become commonplace. This information must itself be properly protected since it becomes a blueprint for attacks.

Examples of assets, together with their status and value to the organization:

  • Internally held customer/client information, including any personal or access information, the loss of which could damage the organization’s credibility or ability to conduct business.
  • All intellectual property, corporate, financial and customer transactional data and records.
  • Compute hardware, operating system, system and application software and network inventory, including current revision, update and maintenance status. Inventory of suppliers’ verification that they have similar policies in place. Documentation of known shortfalls and the resources and time needed to remediate them.
  • Inventory of HR information regarding all staff and contractors in terms of potential insider threats, approved privilege levels, approved physical locations, approved devices used to access corporate assets, and training on the use of defensive tools.
  • Third parties with data, compute or network hosting services such as MSPs, service providers, integrators and most importantly Cloud providers of compute and storage capabilities. Where clientless operation is selected, care is needed to validate the security of these operations. Responsibility does not stop when you contract with suppliers.
  • Third parties with access to the organization’s assets include CRM systems and any verified plug-ins, externally hosted website plugins including those that grant access to customers, hosted firewalls and the use of automated updates, external organizations that have access to sensitive corporate information such as CPA firms, legal counsels, PR Firms and recruitment companies, physical security companies and their supplied IoT devices.
  • Email systems require special attention including the use of any basic protection in place to limit phishing or other ransomware dangers inadvertently stored.
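
An inventory like the one above can start as a very small data structure. The following is a minimal sketch, not a prescribed tool: the `Asset` fields, the 1 to 5 impact scale, the 90-day re-verification window and the sample entries are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    owner: str               # department or third party holding the asset
    business_impact: int     # assumed scale: 1 (minor) to 5 (business-terminating)
    last_verified_days: int  # days since its protection was last validated


def protection_priority(assets: list[Asset]) -> list[Asset]:
    """Highest business impact first; ties go to the asset verified longest ago."""
    return sorted(assets, key=lambda a: (-a.business_impact, -a.last_verified_days))


def overdue(assets: list[Asset], max_age_days: int = 90) -> list[Asset]:
    """Assets whose protection has not been re-verified within the window."""
    return [a for a in assets if a.last_verified_days > max_age_days]


# Hypothetical inventory entries.
inventory = [
    Asset("Customer records", "Sales", 5, 30),
    Asset("Website plug-ins", "Hosting provider", 3, 200),
    Asset("Payroll data", "HR", 5, 120),
]
priority_names = [a.name for a in protection_priority(inventory)]
overdue_names = [a.name for a in overdue(inventory)]
```

Because the inventory itself is a blueprint for attacks, any file or database holding it needs the same protection as the assets it describes.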

Management of Assets

  • A strategy and process for the ongoing management of the above assets should include micro-segmentation (the separate storage, access tracking, encryption and configuration of assets) to ensure that an attack on any one asset does not impact all assets.
  • Data encryption should be implemented on all data stored.
  • Create a set of rules that govern permitted access to the data: the users, software and devices, the time of day, the length of transactions and the locations that are permitted. These are to be used to validate transactions by any Zero Trust-enabled monitoring software used.
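
The access rules described above can be pictured as data plus a deny-by-default check. This is a hedged sketch of attribute-based validation in general, not the interface of any particular Zero Trust product; the `AccessRule` and `Transaction` types, the `is_permitted` function and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessRule:
    user: str
    devices: frozenset
    locations: frozenset
    start: time       # earliest permitted time of day
    end: time         # latest permitted time of day
    max_minutes: int  # longest permitted transaction


@dataclass(frozen=True)
class Transaction:
    user: str
    device: str
    location: str
    at: time
    minutes: int


def is_permitted(tx: Transaction, rules: list) -> bool:
    """Deny by default: allow only if one rule matches every attribute."""
    return any(
        tx.user == r.user
        and tx.device in r.devices
        and tx.location in r.locations
        and r.start <= tx.at <= r.end
        and tx.minutes <= r.max_minutes
        for r in rules
    )


# Hypothetical rule: one privileged user, one approved laptop, office hours at HQ.
rules = [AccessRule("finance-admin", frozenset({"laptop-014"}), frozenset({"HQ"}),
                    time(8, 0), time(18, 0), 30)]
approved = is_permitted(Transaction("finance-admin", "laptop-014", "HQ", time(9, 30), 10), rules)
unknown_device = is_permitted(Transaction("finance-admin", "phone-x", "HQ", time(9, 30), 10), rules)
after_hours = is_permitted(Transaction("finance-admin", "laptop-014", "HQ", time(23, 0), 10), rules)
```

Testing such rules means feeding in transactions that should fail (wrong device, wrong place, wrong time) as well as ones that should pass, and confirming that each outcome is as expected.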

Back-up and Resilience

  • Plan and cost the backup, security, offline storage and testing of stored backups.
  • Prioritize the backup of data, separating fast-changing data from software assets, and set the backup frequency accordingly.
  • Instigate regular full and incremental backups and store them in air-gapped offline facilities.
  • Test backups by restoring them as part of the process. Ensure that encrypted data can be decrypted according to the rules mentioned above. Untested backups have no value.
  • The Zero Trust principle of assume breach applies here. This is where software known as Content Disarm and Reconstruction can be used to ensure that software and data have not become infected with malware, or that any malware can be removed.
  • Finally, document and test a resilience plan so that if/when an attack is successful, normal service can be resumed.
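
One small part of testing a restore can be automated by comparing restored files against checksums recorded when the backup was taken. The sketch below assumes a manifest that maps relative file paths to SHA-256 digests; the manifest format and the `verify_restore` function are hypothetical. It checks integrity only: it does not prove that files are malware-free or that encrypted data can be decrypted.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(manifest: dict, restored_dir: Path) -> list:
    """Return the relative paths that are missing or differ from the manifest."""
    failures = []
    for relative_path, expected in manifest.items():
        candidate = restored_dir / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            failures.append(relative_path)
    return failures
```

An empty list from `verify_restore` means every file in the manifest was restored with identical content; anything else is a list of files to investigate before the backup is trusted.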

Risk and Threat Tolerance

From the above steps, the scope, value and cost of protection can be fed back into the organization’s security policy. A decision can then be made on what should be protected, and when, based on cost, expected risk reduction, impact on the organization and its tolerance of threats over time.

Ongoing Implementation

This in turn will allow the development of an ongoing Security Plan so that implementation and risk reduction (collectively known as the organization’s Security Posture) can be measured against the plan over time and updated as circumstances dictate.

Finally, asset curation decisions will also be important factors in IT strategy, the use of hybrid clouds, and the choice of suppliers, service providers and outsourcing companies.

Summary

The critical task of asset curation does require executive buy-in and a holistic approach, but the cost, expertise and resources needed are very limited. Some automation and backup resources are no more than those required for normal IT functions. However, taking and documenting these steps can help reduce insurance costs and show due diligence as a competitive advantage when providing products and services for large enterprises.

I wanted to create coverage of what is a critical aspect of cybersecurity that can be the basis of organizational protection, while acknowledging the fundamentals that have prevented SMBs from implementing cybersecurity, let alone Zero Trust-enabled cybersecurity:

  1. No awareness of the existing, escalating risks.
  2. Little understanding that it impacts the entire organization and beyond.
  3. They have little budget, expertise or resources available.

My contribution was intended to spell out an ordered list of actions to be taken, instead of questions to address or simple information, which I felt was not very SMB focused.

The Board/Executive Team Catch-22

Without a holistic approach to cybersecurity covering the whole organization there is little chance of protecting it from cyber-attacks.

Without understanding the impact of cybersecurity on business, HR, marketing, sales and governance, the board will not be able to integrate cybersecurity as a competitive and commercial advantage.

With CISO cybersecurity expertise and advice to the board limited to IT, there is no way for the board to understand that a holistic security policy or strategy is required. Most cybersecurity experts are not business experts.

Board/Executive Team Service

  • Reports on your whole organization’s cybersecurity status based on department-wide interviews.
  • Analyzes/rates your weak links/risks, recommends around 12 actions and predictable risk reduction.
  • Brings understanding of cybersecurity compliance to reduce liability and increase competitiveness.
  • Brings State-of-the-art Zero Trust methodology to delegate, verify and vet third party supply chains.
  • Reviews/creates your security policy, based on risk, budget and the security expertise present.
  • Reviews/creates your security strategy: your measurable quarterly plan of action.
  • Regular quarterly report on risk reduction and next actions.
  • Monthly awareness report.