Cloud Concentration Brings Operational Risks

July 18, 2018

The cloud sales job included a theme that the Internet is so decentralized that anything named cloud simply inherited all the good ‘ilities’ that massive decentralization might deliver. One of the core tenets of that Internet story-telling is the resiliency that decentralization guarantees, as in: no single outage can cause problems for your cloud-delivered services.

The reality is that this story is too often just nonsense. Google, Amazon, and Microsoft control enormous amounts of the infrastructure and operations upon which “cloud” services depend. When any of these three companies has an outage, broad swaths of “cloud-delivered” business services experience outages as well. If you are going to depend on cloud-enabled operations for your success, use care in how you define that success, and how you communicate about service levels with your customers. This is especially problematic for industries where customers have been trained to expect extremely high service levels; global financial services enterprises fall into this category. Factor this risk into your business & technical plans.
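To make the service-level point concrete, here is a minimal sketch of how availability compounds when one of your services needs several provider components to be up at the same time. The component names and percentages are hypothetical figures for illustration, not any provider's published SLA, and the calculation assumes independent failures, which is a simplification.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def composite_availability(dependencies):
    """Availability of a service that needs every dependency up at once.

    Assumes failures are independent. Treat the result as a rough
    planning estimate, not a measurement.
    """
    return prod(dependencies.values())


def expected_downtime_minutes(availability, minutes=MINUTES_PER_YEAR):
    """Minutes per period during which the service is expected to be down."""
    return (1.0 - availability) * minutes


# Hypothetical figures for illustration only, not any provider's published SLA.
dependencies = {
    "global_load_balancer": 0.9999,
    "compute": 0.9995,
    "storage": 0.999,
}

overall = composite_availability(dependencies)
print(f"composite availability: {overall:.4%}")
print(f"expected downtime: {expected_downtime_minutes(overall):.0f} minutes/year")
```

With these made-up inputs the composite works out to roughly 99.84%, or about 14 hours of expected downtime a year, which is noticeably worse than any single component suggests. That is the kind of gap to account for before promising a service level to customers.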

Yesterday, Tuesday, July 17th, Google’s services that provide computing, storage, and data management tools for companies failed. The customers who depended upon those foundational services also failed. Snapchat and Spotify were two high-profile examples, but the outages were far more widespread. Google’s own services that depend upon the storage and data management tools failed as well. It appears that Google Cloud Networking was knocked down, as the company reported: “We are investigating a problem with Google Cloud Global Load balancers returning 502s for many services including AppEngine, Stackdriver, Dialogflow, as well as customer Global Load Balancers.” This has broad impact because all customers attempting to deliver high quality services depend on load balancers. It appears that this networking issue caused downstream outages like those experienced by Breitbart and Drudge Report.

This was the 5th non-trivial outage this year for Google’s AppEngine. Google’s ComputeEngine has had 7 outages this year, 12 for their Cloud Networking, 8 for their Stackdriver, 3 for Google Cloud Console, 3 for Cloud Pub/Sub, 4 for Google Kubernetes Engine, 2 for Cloud Storage, 4 for BigQuery, 2 for their DataStore, 2 for their Cloud Developer Tools, and 1 reported for their Identity & Security services.
To add insult, it appears that the Google Enterprise Support page may also not have been working during the outage on the 17th…
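For a quick sense of scale, the per-service counts above can be tallied as follows. This is only a sketch over the numbers listed in this post; the dictionary name is my own, and because a single incident can affect several services, the total counts service-level reports rather than distinct incidents.

```python
# Incident counts as listed in this post for 2018 to date (through July 17).
outages_2018 = {
    "App Engine": 5,
    "Compute Engine": 7,
    "Cloud Networking": 12,
    "Stackdriver": 8,
    "Cloud Console": 3,
    "Cloud Pub/Sub": 3,
    "Kubernetes Engine": 4,
    "Cloud Storage": 2,
    "BigQuery": 4,
    "Datastore": 2,
    "Cloud Developer Tools": 2,
    "Identity & Security": 1,
}

# One incident may appear under several services, so this is a count of
# service-level reports, not of distinct incidents.
total = sum(outages_2018.values())
ranked = sorted(outages_2018.items(), key=lambda item: item[1], reverse=True)

print(f"service-level incident reports: {total}")
for service, count in ranked[:3]:
    print(f"{service}: {count} ({count / total:.0%} of reports)")
```

The total is 53 service-level reports in a little over six months, with Cloud Networking accounting for the largest share.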

Amazon also experienced service failures that were widely reported the day before (although they are not documented on the company’s service health dashboard).
Microsoft had “internal routing” failures that resulted in widespread service outages for over an hour on the 16th as well.

Plan this reality into your cloud-enabled strategies, architectures, designs, implementations, testing, monitoring, reporting, and contracts.

REFERENCES:
Google Cloud Status Dashboard
https://status.cloud.google.com/summary

Amazon Service Health Dashboard
https://status.aws.amazon.com/

Azure Status History
https://azure.microsoft.com/en-us/status/history/
Azure Status
https://azure.microsoft.com/en-us/status/
Office365 Health
https://portal.office.com/servicestatus

Google Cloud Has Disruption, Bringing Snapchat, Spotify Down
https://www.bloomberg.com/news/articles/2018-07-17/google-cloud-has-disruption-bringing-snapchat-spotify-down
By Mark Bergen, July 17, 2018

Spotify, Snapchat, and more are down following Google Cloud incident (update: fixed)
https://venturebeat.com/2018/07/17/discord-snapchat-and-more-are-down-following-google-cloud-incident/
Jeff Grubb, JULY 17, 2018

Google Cloud Platform fixes issues that took down Spotify, Snapchat and other popular sites
https://www.cnbc.com/2018/07/13/google-cloud-platform-reports-issues-snap-and-other-popular-apps-affe.html
Chloe Aiello, 07-17-2018

[Update: Resolved] Google Cloud has been experiencing an outage, resulting in widespread problems with several services
https://www.androidpolice.com/2018/07/17/google-cloud-experiencing-outage-resulting-widespread-problems-several-services/
By Ryne Hager, 07-17-2018

Google Enterprise Support page Outage Reference


Bias & Error In Security AI/ML

July 14, 2018

It is difficult to get through a few minutes today without the arrival of some sort of vendor spam including the use of artificial intelligence and machine learning (AI/ML) to analyze event/threat/vulnerability data and then provide actionable guidance, or to perform/trigger actions themselves.

Global financial services enterprises have extreme risk analysis needs in the face of enormous streams of threat, vulnerability, and event data. While it might seem attractive to hook up with one or more of these AI/ML hype-sters, think hard before incorporating these types of systems into your risk analysis pipelines. At some point they will be exposed to discovery, and at that point, is there risk to your brand?

In a manner analogous to facial recognition technologies, AI/ML-driven security analysis technology is coded, configured, and trained by humans, and must incorporate the potential for material bias and unknown error.

Microsoft recently called for regulation of facial recognition technology and its application.  I don’t know if regulation is the appropriate path for AI/ML-driven security analysis technologies.  I think that we do, though, need to remain aware of the bias and error in our implementations — and protect our employers from unjustifiable liability risks on this front.  Demand transparency and strong evidence of due diligence from your vendors, and test, test, test.

References:

“Facial recognition technology: The need for public regulation and corporate responsibility.” Jul 13, 2018, by Brad Smith – President and Chief Legal Officer, Microsoft
https://blogs.microsoft.com/on-the-issues/2018/07/13/facial-recognition-technology-the-need-for-public-regulation-and-corporate-responsibility/

“The Future Computed – Artificial Intelligence and its role in society.”
By Microsoft.
https://blogs.microsoft.com/uploads/2018/02/The-Future-Computed_2.8.18.pdf


Panera Bread Didn’t Take Security Seriously

April 3, 2018

I finally just read Brian Krebs and Dylan Houlihan on the 2017-2018 Panera Bread data breach of millions of customer records via unsafe APIs and applications.  This breach involved a collection of seriously flawed and insecure software wrapped in seriously flawed management.  Everyone in our business should read this and ensure that our leaders do too.  Could this happen to your organization?

Dylan Houlihan had a couple of excellent recommendations. He wrote:

  • “We need to collectively examine what the incentives are that enabled this to happen. I do not believe it was a singular failure with any particular employee. It’s easy to point to certain individuals, but they do not end up in those positions unless that behavior is fundamentally compatible with the broader corporate culture and priorities.”
  • “If you are a security professional, please, I implore you, set up a basic page describing a non-threatening process for submitting security vulnerability disclosures. Make this process obviously distinct from the, “Hi I think my account is hacked” customer support process. Make sure this is immediately read by someone qualified and engaged to investigate those reports, both technically and practically speaking. You do not need to offer a bug bounty or a reward. Just offering a way to allow people to easily contact you with confidence would go a long way.”
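One widely used convention for publishing such a disclosure contact today is a `security.txt` file served under `/.well-known/`, standardized in RFC 9116. Neither Houlihan nor Krebs prescribes it, so treat the following as one possible sketch. The function name, the example.com contact address, and the policy URL are placeholders I made up for illustration; only the `Contact`, `Expires`, and `Policy` field names come from the RFC.

```python
from datetime import datetime, timedelta, timezone


def render_security_txt(contacts, policy_url, valid_days=365, now=None):
    """Render a minimal security.txt body (field names per RFC 9116).

    `now`, if given, must be a timezone-aware datetime.
    """
    if not contacts:
        raise ValueError("at least one Contact is required")
    now = now or datetime.now(timezone.utc)
    # Expires is written as a UTC timestamp with second precision.
    expires = (now + timedelta(days=valid_days)).astimezone(timezone.utc)
    lines = [f"Contact: {contact}" for contact in contacts]
    lines.append(f"Expires: {expires.strftime('%Y-%m-%dT%H:%M:%SZ')}")
    lines.append(f"Policy: {policy_url}")
    return "\n".join(lines) + "\n"


# Placeholder values: replace with a monitored mailbox and a real policy page.
print(render_security_txt(
    ["mailto:security@example.com"],
    "https://example.com/security-policy",
))
```

The file only helps if the listed contact is actually read by someone qualified and engaged to investigate reports, which is the harder part of the recommendation.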

REFERENCES:

“No, Panera Bread Doesn’t Take Security Seriously.” By Dylan Houlihan, 04-02-2018
https://medium.com/@djhoulihan/no-panera-bread-doesnt-take-security-seriously-bf078027f815

“Panerabread.com Leaks Millions of Customer Records.” By Brian Krebs, 04-02-2018
https://krebsonsecurity.com/2018/04/panerabread-com-leaks-millions-of-customer-records/


Risk-Taking and Secure-Enough Software

January 24, 2018

Each of us involved in global Financial Services enterprises has a risk management strategy and policy. In action, those are supported and operationalized through a vast, dynamic, organic, many-dimensional web of risk management activities.

One facet of this activity involves the creation, acquisition, implementation, and use of risk-appropriate software. Sometimes this is abbreviated to simply “secure software.” In some forums I use the term “secure-enough software” to help highlight that there is some risk-related goal setting and goal achieving that needs to be architected, designed, and coded into software we create — or the same needs to be achieved by our vendors or partners when we acquire software.

As I have mentioned repeatedly in this blog, global financial services enterprises succeed through taking goal-aligned risks. Our attempts to live out that challenge are at best uneven.

Some software developers, architects, or other agile team members will zealously and enthusiastically take risks in their attempts to improve a given software quality characteristic, or a set of them. Others only reluctantly take risks.[1] In either case, the risks are only vaguely described (if at all) and the analysis of their appropriateness is opaque and un- or under-documented.

My experience is that too many personnel have little to no involvement, training and context on which to ground their risk analysis and risk acceptance decision making.  As a result of that gap, risk acceptance throughout any software development lifecycle is too often based on project momentum, emotion, short term self-promotion, fiction, or some version of risk management theater.

The magnitude of risk associated with this type of risk taking has only been enlarged by those attempting to extract value from one or another cloud thing or cloud service.

Those of us involved in secure software work need to clearly express the extent of our localized organization’s willingness to take risk in order to meet specific objectives, AND how the resulting behaviors align with published and carefully vetted enterprise strategic (risk management) objectives.

All of this leads me to the topic of risk appetite.

  • What needs to be included in a description of risk appetite that is intended for those involved in software development and acquisition?
  • Are there certain dimensions of software-centric risk management concern that need to be accounted for in that description of risk appetite?
  • Are there certain aspects of vocabulary that need to be more prescribed than others in order to efficiently train technical personnel about risk-taking in a global financial services enterprise?
  • Are there rules of thumb that seem to help when attempting to assess the appropriateness of given software-centric risks?

What do you think?

REFERENCES:

Risk Appetite and Tolerance Executive Summary.
A guidance paper from the Institute of Risk Management September 2011
https://www.theirm.org/media/464806/IRMRiskAppetiteExecSummaryweb.pdf

[1] A similar phrase is found in the abstract of “Risk Appetite in Architectural Decision-Making.” by Andrzej Zalewski. http://ieeexplore.ieee.org/document/7958473/


‘Best Practices’ IT Should Avoid

June 20, 2017

12 ‘Best Practices’ IT Should Avoid At All Costs.

A colleague mentioned this title and I could not resist scanning the list.

They offer support to some of the funnier Dilbert cartoons, AND they should spark some reflection (maybe more) for some of us working in Global Financial Services.

1. Tell everyone they’re your customer
2. Establish SLAs and treat them like contracts
3. Tell dumb-user stories
4. Institute charge-backs
5. Insist on ROI
6. Charter IT projects
7. Assign project sponsors
8. Establish a cloud computing strategy
9. Go Agile. Go offshore. Do both at the same time
10. Interrupt interruptions with interruptions
11. Juggle lots of projects
12. Say no or yes no matter the request

If any of these ring local (or ring true), then I strongly recommend Bob Lewis’ review of these ‘best practices.’

If any of them make you wince, you might want to read an excellent response to Mr. Lewis by Didier Pironet.

In any case, this seems like an important set of issues. Both these authors do a good job reminding us that we should avoid simply repeating any of them without careful analysis & consideration.

REFERENCES:
12 ‘Best Practices’ IT Should Avoid At All Costs.
http://www.cio.com/article/3200445/it-strategy/12-best-practices-it-should-avoid-at-all-costs.html
By Bob Lewis, 06-13-2017

12 ‘best practices’ IT should avoid at all costs – My stance.
https://www.linkedin.com/pulse/12-best-practices-should-avoid-all-costs-my-stance-didier-pironet
By Didier Pironet, 06-19-2017


New Technology and Service Options Do Not Trump Law and Regulations

May 16, 2017

A couple of weeks ago I received a letter from Wells Fargo. After mentioning some brokerage account details, there were a couple of paragraphs of disclosure about $2.5 M in penalties for failing to effectively protect business-related electronic records. Wells Fargo has been having a rough time lately. But this situation is just so self-inflicted, and so likely to happen elsewhere as Financial Services organizations’ technology personnel attempt to demonstrate that they can “deliver more for less…”, that I thought it might be worth sharing as a cautionary tale.

The disclosures outlined that the bank’s brokerage and independent wealth management businesses paid $1 million and another $1.5 million in fines & penalties because they failed to keep hundreds of millions of electronic documents in a “write once, read many” format — as required by the regulations under which they do business.

Federal securities laws and Financial Industry Regulatory Authority (FINRA) rules require that electronic storage media hosting certain business-related electronic records “preserve the records exclusively in a non-rewriteable and non-erasable format.” This type of storage media has a legacy of being referred to as WORM or “write once, read many” technology, which prevents the alteration or destruction of the data it stores. The SEC has stated that these requirements are an essential part of the investor protection function because a firm’s books and records are the “primary means of monitoring compliance with applicable securities laws, including anti-fraud provisions and financial responsibility standards.” Requiring WORM technology is associated with maintaining the integrity of certain financial records.
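To illustrate what “non-rewriteable and non-erasable” means in behavioral terms, here is a small in-memory sketch. The class and record names are hypothetical, and this is not a compliant implementation: real WORM guarantees have to be enforced by the storage media or service and its retention controls, not by application code that an administrator could bypass.

```python
import hashlib


class WriteOnceStore:
    """In-memory illustration of write-once, read-many semantics."""

    def __init__(self):
        # record_id -> (payload bytes, sha256 hex digest taken at write time)
        self._records = {}

    def write(self, record_id, payload: bytes) -> str:
        """Store a record exactly once and return its digest."""
        if record_id in self._records:
            raise PermissionError(f"record {record_id!r} already exists; rewrite refused")
        digest = hashlib.sha256(payload).hexdigest()
        self._records[record_id] = (bytes(payload), digest)
        return digest

    def read(self, record_id) -> bytes:
        """Return the record after re-checking it against its stored digest."""
        payload, digest = self._records[record_id]
        if hashlib.sha256(payload).hexdigest() != digest:
            raise ValueError(f"record {record_id!r} failed its integrity check")
        return payload

    def delete(self, record_id):
        """Deletion is never permitted in this sketch."""
        raise PermissionError("records are non-erasable")


store = WriteOnceStore()
store.write("trade-0001", b"BUY 100 XYZ")
try:
    store.write("trade-0001", b"BUY 900 XYZ")  # attempted alteration
except PermissionError as err:
    print(err)
```

The useful question for any vendor or internal team is where this refusal is enforced. If it lives only in an application layer like the one above, the records are not protected in the sense the rules require.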

Over the past decade, the volume of sensitive financial data stored electronically has risen exponentially and there have been increasingly aggressive attempts to hack into electronic data repositories, posing a threat to inadequately protected records, further emphasizing the need to maintain records in WORM format. At the same time, in some financial services organizations “productivity” measures have resulted in large scale, internally-initiated customer fraud, again posing a threat to inadequately protected records.

My letter resulted from a set of FINRA actions announced late last December that imposed fines against 12 firms for a total of $14.4 million “for significant deficiencies relating to the preservation of broker-dealer and customer records in a format that prevents alteration.” In their December 21st press release FINRA said that they “found that at various times, and in most cases for prolonged periods, the firms failed to maintain electronic records in ‘write once, read many,’ or WORM, format.”

FINRA reported that each of these 12 firms had technology, procedural and supervisory deficiencies that affected millions, and in some cases, hundreds of millions, of records core to the firms’ brokerage businesses, spanning multiple systems and categories of records. FINRA also announced that three of the firms failed to retain certain broker-dealer records the firms were required to keep under applicable record retention rules.

Brad Bennett, FINRA’s Executive Vice President and Chief of Enforcement, said, “These disciplinary actions are a result of FINRA’s focus on ensuring that firms maintain accurate, complete and adequately protected electronic records. Ensuring the integrity of these records is critical to the investor protection function because they are a primary means by which regulators examine for misconduct in the securities industry.”

FINRA reported 99 related “books and records” cases in 2016, which resulted in $22.5 million in fines. That seems like real money…

Failure to effectively protect these types of regulated electronic records may result in reputational (impacting brand & sales) and financial (fines & penalties) harm. Keep that in mind as vendors and hype-sters attempt to sell us services that persist regulated data. New technology and service options do not supersede or replace established law and regulations under which our Financial Services companies operate.

REFERENCES:
“FINRA Fines 12 Firms a Total of $14.4 Million for Failing to Protect Records From Alteration.”
December 21, 2016
http://www.finra.org/newsroom/2016/finra-fines-12-firms-total-144-million-failing-protect-records-alteration

“Annual Eversheds Sutherland Analysis of FINRA Cases Shows Record-Breaking 2016.”
February 28, 2017
https://us.eversheds-sutherland.com/NewsCommentary/Press-Releases/197511/Annual-Eversheds-Sutherland-Analysis-of-FINRA-Cases-Shows-Record-Breaking-2016

“Is Compliance in FINRA’s Crosshairs?”
http://www.napa-net.org/news/technical-competence/regulatory-agencies/is-compliance-in-finras-crosshairs/

SEC Rule 17a-4 & 17a-3 of the Securities Exchange Act of 1934:
“SEC Rule 17a-4 & 17a-3 – Records to be made by and preserved by certain exchange members, brokers and dealers.” (vendor summary)
http://www.17a-4.com/regulations-summary/

“SEC Interpretation: Electronic Storage of Broker-Dealer Records.”
https://www.sec.gov/rules/interp/34-47806.htm

“(17a-3) Records to be Made by Certain Exchange Members, Brokers and Dealers.”
http://www.finra.org/industry/interpretationsfor/sea-rule-17a-3

“(17a-4) Records to be Preserved by Certain Exchange Members, Brokers and Dealers.”
http://www.finra.org/industry/interpretationsfor/sea-rule-17a-4


The Treacherous 12 – Cloud Computing Top Threats in 2016

April 25, 2017

The Cloud Security Alliance published “The Treacherous 12 – Cloud Computing Top Threats in 2016” last year.  I just saw it cited in a security conference presentation and realized that I had not shared this reference.  For those involved in decision-making about risk management of their applications, data, and operations, this resource has some value.  If you have not yet experienced a challenge to host your business in “the cloud”** it is likely you will in the future.

In my opinion, the Cloud Security Alliance is wildly optimistic about the business and compliance costs and the real risks associated with using shared, fluid, “cloud” services to host many types of global financial services business applications & non-public data.  That said, financial services is a diverse collection of business activities, some of which may be well served by credible “cloud” service providers (for example, but not limited to, some types of sales, marketing, and human resource activities).  In that context, the Cloud Security Alliance still publishes some content that can help decision-makers understand more about what they are getting into.

“The Treacherous 12 – Cloud Computing Top Threats in 2016” outlines what “experts identified as the 12 critical issues to cloud security (ranked in order of severity per survey results)”:

  1. Data Breaches
  2. Weak Identity, Credential and Access Management
  3. Insecure APIs
  4. System and Application Vulnerabilities
  5. Account Hijacking
  6. Malicious Insider
  7. Advanced Persistent Threats (APTs)
  8. Data Loss
  9. Insufficient Due Diligence
  10. Abuse and Nefarious Use of Cloud Services
  11. Denial of Service
  12. Shared Technology Issues

For each of these categories, the paper includes some sample business impacts, supporting anecdotes and examples, candidate controls that may help address given risks, and links to related resources.

If your role requires evaluating risks and opportunities associated with “cloud” anything, consider using this resource to help flesh out some key risk issues.

**Remember, as abstraction is peeled away “the cloud” is an ecosystem constructed of other people’s “computers” supported by other people’s employees…

REFERENCES:

Cloud Security Alliance:
https://cloudsecurityalliance.org

“The Treacherous 12 – Cloud Computing Top Threats in 2016”
https://downloads.cloudsecurityalliance.org/assets/research/top-threats/Treacherous-12_Cloud-Computing_Top-Threats.pdf

