The Top 10 Reasons for that Mobile Enterprise App Outage

Oct 22, 2019 (6 min read)

There are myriad ways that an application can fail. A product defect can be introduced with no process in place to catch it, a threat can exploit a known or unknown vulnerability, or an “infallible” system administrator can do something as simple as trip over a power cord. Failures can be exacerbated by poor design, inadequate or nonexistent architecture, strange implementation decisions, and a lack of technical depth compounded by a complete lack of training.

We need to recognize what generally causes mobile enterprise app outages:

1. One leading cause of large service outages is software defects introduced into production. Much of the time, when you hear something like “this company’s systems were down for 4 hours today,” the problem was some sort of software issue. Software issues cause far more outages than hardware failures, networking failures, DoS attacks, or anything else. Sometimes the outage is the result of a software upgrade gone bad, inadequate testing, or a fundamental misunderstanding of the requirements. Software problems have many causes, and good software design and testing are the front line of a truly resilient service.
2. Other times the outage is the result of a database problem (including misconfiguration, a product defect, lack of experience with the technology, or choosing the wrong type of database). Databases are the bane of many technologists’ existence. Frequently these database-related outages happen because a database was corrupted or bad data was being written to it. Companies literally take down their systems to prevent further damage to the data in their databases.
3. Human error…enough said.
4. Connectivity issues cause a lot of the outages we experience, especially with wireless and mobile computing. DNS failures and DoS attacks on a DNS provider are other forms of connectivity problems. It seems like every year there’s at least one big DNS outage.
5. The other major cause of network outages is fiber cuts that leave large geographic areas with either no connectivity or greatly degraded capability. You need to design your solution to withstand these types of problems.
6. Hardware failures are next on the list of things that cause outages. Design failures that leave a production solution with a single point of failure occur much more often than any of us would like.
7. Localized catastrophes (fires, earthquakes, etc.) may destroy a site or critical infrastructure near a site. These are rarer, but when they cause an outage they expose a lack of planning to build solutions that can withstand such an event.
8. Developers tend to leverage a plethora of third-party tools for modern software systems. Not only can this introduce software licensing issues, but some developers uncritically include code snippets or libraries that introduce both functional and security weaknesses, which can manifest as outages or confidentiality defects. Some introduced code can even create dependencies on external systems or services, which also brings both availability and confidentiality risks.
9. Mobile brings its own set of additional issues. For example, a mobile operator may decide to block the ports used by a specific app, and even connecting to public wireless can be problematic when the provider of the “free” access attempts to limit your “outrageous usage” by blocking ports that it doesn’t like (e.g. POP and IMAP are often blocked). With mobile, just the act of moving from one network to another can cause a temporary service outage on the device, even when both networks are available (or appear available).
10. Finally, any number of pedestrian mobile issues could be at play, such as phone storage limitations causing performance problems; see https://www.androidpolice.com/2019/10/03/clear-app-cache-data-android/
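
The temporary outages described above for connectivity problems and network handoffs are often short enough that a client can simply try again. As a minimal sketch only (the function name `call_with_retry`, its parameters, and the default values are hypothetical and not from any particular mobile SDK), retrying with capped exponential backoff and random jitter might look like this:

```python
import random
import time


def call_with_retry(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                    sleep=time.sleep, rng=random.random):
    """Run operation(), retrying transient network errors with backoff.

    All names and defaults here are illustrative assumptions.
    `sleep` and `rng` are injectable so the behavior can be tested
    without real waiting or real randomness.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Double the wait ceiling on each failure, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps many devices from retrying in lockstep.
            sleep(ceiling * rng())
```

A caller would wrap the network request in a zero-argument function and pass it in, for example `call_with_retry(fetch_messages)` where `fetch_messages` is whatever request the app makes. Retrying only makes sense for operations that are safe to repeat, and it does not help when a port is blocked outright.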

Some straightforward solutions to the above issues include: implementing a good process for introducing new solutions (leverage leading IT frameworks to make sure you are doing the right thing); world-class change management (yes, everyone should have “world class” change management); planning (in mobile computing it is truly important to plan and measure, then plan and measure again, and to implement only once you know you are doing the right thing); and managing your vulnerabilities and defects, both your own and, through a strong patch management program, those of third parties. As with any technology solution, it is critical to build in a reasonable amount of resilience: based upon the solution, build in the appropriate level of infrastructure redundancy, including servers, connectivity/networking, and storage.
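
To make the "appropriate level of redundancy" point concrete, here is a rough back-of-the-envelope sketch. It assumes that component failures are independent (real failures are often correlated), and the 99% availability figures are purely illustrative, not measurements from any real system. The function names are likewise hypothetical.

```python
from math import prod

HOURS_PER_YEAR = 24 * 365


def series_availability(parts):
    """Availability of a chain where every part must be up."""
    return prod(parts)


def parallel_availability(replicas):
    """Availability of a redundant group where one working replica suffices.

    Assumes replicas fail independently of each other.
    """
    return 1 - prod(1 - a for a in replicas)


def expected_downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


# Hypothetical stack: one server, one network path, one storage system,
# each assumed to be 99% available and each a single point of failure.
single = series_availability([0.99, 0.99, 0.99])

# Same stack with every tier duplicated.
tier = parallel_availability([0.99, 0.99])
redundant = series_availability([tier, tier, tier])

print(f"single:    {single:.4%}  ~{expected_downtime_hours(single):.1f} h/year down")
print(f"redundant: {redundant:.4%}  ~{expected_downtime_hours(redundant):.1f} h/year down")
```

Under these assumptions the unduplicated stack is expected to be down for roughly 260 hours a year, while duplicating each tier brings that to under 3 hours. The exact numbers matter less than the shape of the result: chained single points of failure multiply risk, and redundancy at each tier reduces it sharply.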

Eric Svetcov is CSO and CTO for Medigram. Eric is an information security and IT authority with international experience and deep cloud computing knowledge. He is a recognized leader in the healthcare technology space for building highly resilient and performant solutions with security, privacy, and compliance requirements built in at all levels of the solution. As a startup veteran, he has built a security program and IT operations from scratch four times. He lives to deliver best-in-class security programs and products, and he is adept at building effective security solutions that allow clients to sleep well at night.

Whether the game is football, basketball, or business, Svetcov knows that offense gets you in the door, but it’s great security defense that keeps you in the game; Eric knows that it is the team that wins championships.

Eric Svetcov has repeatedly built security programs that achieve certification on the first attempt. He co-wrote the book on how CISOs should perform. He also led the team that won the race for the first ISO 27001 certification for cloud companies.

Having clients use our solution to save or improve lives electrifies the Medigram team. Eric is driven to improve patient care, and he is widely known for building cutting-edge, high-performance cloud computing solutions, with market-winning compliance embedded, that solve customer business problems. Eric shares the Medigram team’s passion for helping all stakeholders win together by bringing deep cross-functional leadership to the market. The goal is to make stakeholders better at their jobs and to help them succeed in securing and delivering information quickly on mobile, so that teams of doctors can save lives.

Svetcov redesigned the architecture and infrastructure for a real-time data analytics product that delivered $18MM ARR in its first year, and drove the IT, operations, and security work that enabled 5x headcount growth over a 2-year period. Eric builds scalable governance and security programs, including ISO 27001, HITRUST, HIPAA, COBIT, and ITIL, in rapid-growth environments.

He was an Advisory Council member for the CISO Executive Network, led the first global cloud computing company (Salesforce) through ISO 27001 certification, and did it again with Mede/Analytics, where he also led HITRUST certification with more than 700 assessed controls. Eric’s deep experience includes acting as Caldicott Guardian, Chief Privacy Officer, and Data Protection Officer.

Eric is excited about applying his experience in architecture, solution design, security, privacy, IT, support, and technical operations to build one of the most important companies of our time. At Medigram we are building a new kind of company to solve some of the hardest challenges in medicine, with an initial goal of saving hundreds of thousands of lives and delaying disability for millions more. He is skilled at engendering confidence, trust, and performance with both internal and external teams to deliver within specified timeframes. He believes that focusing on and leading in privacy and security is not only the ethical choice, but also good for customers and business.

Eric’s confidence in building systems stems in part from his experience teaching cloud security to security leaders in both the private and government sectors. He built the original HIPAA audit program that KPMG used for U.S. Department of Health and Human Services audits. He is a member of the CISO Executive Network and a former member of its advisory board, where he learns from, co-mentors, and collaborates with other experts, including by speaking at conferences.

At Medigram, Eric Svetcov is working to build a standard that every organization can aspire to and to develop and nurture security leaders. He is always working with the team on how we can make a difference.

In his off hours, Eric enjoys sports with his kids and pursues his passion for raising cybersecurity standards nationally. He continues to spend part of his free time creating training programs and writing about cybersecurity.
