IT OUTAGE: Prevention and Measures


In recent times, the use of digital services has increased tremendously throughout the world. Nepal is no exception, with adoption extending from banking, insurance, and finance to getting government-related work done. The Government of Nepal released the Nagarik App to provide citizens with digital identification and certification. This frees citizens from carrying valuable documents to government offices and helps with frequent transactions such as paying water and electricity bills online. The app is a good step by the Government. But are these systems flawless? Of course not. The more users an app has, the more failures are being reported.

 

Network failures (hardware and software), security attacks, problems in deploying upgrades, natural disasters, power outages, and internet outages all give these apps a good chance of not performing as planned, forcing an IT outage and causing panic among customers and businesses.

 

For example, Australia’s Commonwealth Bank had a massive technology-related outage that lasted 12 hours. During this period, customers were unable to withdraw money, transfer money, or log in to their accounts.

 

In August 2020, US-based telecoms giant CenturyLink had to interrupt internet service in global markets because of a bug. This affected many popular streaming services, gaming platforms, and webcasts of European soccer. The company labeled it a massive IP outage but did not give further details about the issue. CenturyLink had already experienced a massive outage in 2018 that affected ATM withdrawals, 911 calls, and other services.

Background

Cases of data loss have also come to light. Amazon Web Services (AWS) faced an outage because of a power failure in one of its datacenters, which was followed by a failure of the backup generators. During the outage, 7.5% of Amazon EC2 instances and EBS volumes were unavailable. Amazon reported that some data stored on these instances and volumes was not recoverable due to hardware damage.

 

In another instance, Cloudflare visitors received 502 errors in July 2019 due to a massive spike in CPU utilization on its network. The company confirmed that the cause was a bad software deployment, which it rolled back after about half an hour of outage.

 

Big Tech companies like Apple, Microsoft, Google, Facebook, and Twitter have all faced outages due to physical network failures, data storage failures, cloud failures, security failures, and so on. This causes panic among customers, who are quick to adopt new trends in technology.

 

Beyond these issues, internet outages themselves are increasing globally day by day. According to a report by S&P Global, outages have increased by 142% since early January 2021. The effects are far-reaching: consumers may be unable to withdraw money, communicate, or use ATM cards, while data loss, network failures, and security breaches bring risk and disappointment to consumers’ lives.

Reasons

An IT outage is a disruption in the use of information technology for a shorter or longer interval of time. IT outages can have various causes, for example: malicious attacks on service providers to steal information, power failures and backup power failures, a sharp increase in service requests within a short interval that causes the system to crash, network failures, data breaches, and internet outages. Outages affect the lives of millions, and failing to plan and prepare for the next one is a mistake that enterprises may be making.

 

IT outages are a disaster, and being aware of them in order to maintain good response practices is essential. Organizations are constantly pushing to meet consumers’ high demand for services, which can leave loopholes in security and standards. Although not every issue can be addressed, there are certain steps that can reduce the chances of technology downtime.

How to behave as a consumer

In case of an IT outage, there is not much a consumer can do but wait and hope that the service comes back online. The responsible IT teams and service providers want to get their systems running again as soon as possible, so they are the ones making every effort to bring these systems back. While there is very little a consumer can do before and during an outage, it is important to check your data in the aftermath. Data recovery can sometimes take time, and your transactions and details may not be displayed properly for a while. If your data is missing, contact your service provider to confirm.

How to prepare as a Business

An IT outage has a direct and large impact on enterprises. It leads to a high volume of customer queries on the front end, while the enterprise also has to work on the back end to resolve the issue. Taking measures to prevent and prepare for these situations is a huge step forward for an enterprise.

 

The Big Tech companies spend billions every quarter and still have outages. This shows how important it is for small and medium-sized organizations to be able to deal with outages and to avoid causing them. Auditing and maintaining international standards in technology, data storage, information systems, and cybersecurity is the approach enterprises should take to mitigate the risks that can result in an IT outage.

 

As an enterprise, there are certain measures and procedures to follow before and after an IT outage. Below are some steps an enterprise can take.

Maintaining ISO Standards

This is an important step moving forward. Whether the concern is workplace safety, maintaining an organization’s assets and resources, training staff on the guidelines to follow, keeping them updated on changes to systems, or providing the security steps to take for newly discovered issues, following a well-maintained ISO standard reduces the chance of an issue escalating into an outage.

Information Technology and Security Audit

A timely audit of information technology and security can help organizations keep their systems up and running. In Nepal, the Government has required government departments to follow good IT/IS audit practices, which obliges them to check their systems at regular intervals, but it has not been able to properly regulate private organizations to do the same.

Network Audit

Malicious attacks on systems and hardware failures have always been an issue for organizations. Attackers look to disrupt an organization’s network to keep it down for a substantial amount of time. This makes it necessary to audit network systems in a timely manner to reduce the chances of software, hardware, and network failure.

Vulnerability Assessment and Penetration Testing

The use of technology to get services fascinates everyone. Today, most of these services are requested via web or mobile apps. The easier it becomes to access these services, the harder it becomes to maintain them. Vulnerability Assessment and Penetration Testing (VAPT) helps find flaws within systems and is a high priority for organizations.

HR Awareness and Audit

The continuous improvement of these services requires organizations to provide continuous training and information sharing for their employees. HR awareness is an important task in an organization. Making HR policies is not enough if employees are not made effectively aware of the scenarios that can arise when the policies are not met. An enterprise can also reduce risk at the employee level by raising awareness of workplace privacy, workplace safety, and security standards, and by encouraging employees to report potential hazards to team leaders or superiors.

 

HR audit has also gained tremendous value and importance in recent times. It helps an enterprise track whether its employees are keeping up the required standards and are well informed about organizational policies. The Government should also check, by means of HR audits, whether HR values and practices are maintained in these organizations, to eliminate the chance of unknowingly exposing an organization to potential risks.

Data Recovery

An organization treats keeping customers’ data safe and secure as its top priority. Regular backups of this data are a must to prevent data loss, whether from malicious attacks or from physical damage due to natural or human-caused disasters. During any kind of IT outage, it is easy to lose valuable data if proper data recovery methods are not in place.
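As a rough illustration of what a regular, verified backup could look like, here is a minimal Python sketch. It copies a file into a backup folder and compares SHA-256 checksums so that a corrupted copy is noticed immediately. The function names `sha256_of` and `backup_and_verify` and the file layout are assumptions made for this example, not part of any particular backup product, and a real setup would also keep copies off-site.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(64 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy `source` into `backup_dir` (hypothetical layout) and verify the copy."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies file contents and metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} does not match the original")
    return target
```

A scheduled job could call `backup_and_verify` for each important file and raise an alert when it fails, so that a broken backup is discovered before an outage rather than during one.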

 

Steps to make Organization Data Safe
There are certain steps that an organization can follow in case of an outage.
  • Planning: An organization can plan beforehand the process to follow in case of an outage. Planning system checks in a set order helps ensure that the location of the issue is not missed and helps resolve issues quickly.
  • Handling Alerts: An organization should always prepare an Immediate Response Team. This team helps the organization find problems during downtime. A separate team should also be assigned to respond to customer queries while the system is down.
  • Checklists: Response teams need to work through checklists in order, so that they do not miss where the issue lies. After that, analyze the completed checklist in full.
  • Exercise: Run planned drills monthly, quarterly, and yearly.
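The checklist idea above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Check` class, the `run_checklist` and `first_failure` functions, and the example check names are hypothetical, and real checks would test actual systems (power, network, database) instead of returning fixed values.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Check:
    """One checklist item: a name and a function that returns True when healthy."""
    name: str
    run: Callable[[], bool]


def run_checklist(checks: List[Check]) -> List[Tuple[str, bool]]:
    """Run every check in the given order and record whether each one passed."""
    results = []
    for check in checks:
        try:
            passed = bool(check.run())
        except Exception:
            # A check that crashes is treated as a failed check.
            passed = False
        results.append((check.name, passed))
    return results


def first_failure(results: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the name of the first failed check, or None if all passed."""
    for name, passed in results:
        if not passed:
            return name
    return None
```

Running all checks, not just stopping at the first failure, keeps the full result list available for the later analysis step, while `first_failure` tells the response team where to start looking.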

What can we do?

In a country like Nepal, the impact of losses from IT outages can be very high. The Government should therefore set policies and require technology-related organizations to maintain these standards. Auditing information technology, systems, and security at regular intervals is an important step.
