It’s been described as the most severe threat to online security since the start of widescale internet usage. It’s so serious it has its own website, and so effective that the NSA reportedly used it to gather critical intelligence. A part of roughly two-thirds of all websites, it allows attackers to eavesdrop on communications, steal data and impersonate services and users.
This isn’t a Hollywood script. This is CVE-2014-0160, better known as Heartbleed, and it’s still a very real threat to server security.
Heartbleed is a vulnerability in OpenSSL that allows the theft of information normally protected by SSL/TLS encryption. The vulnerability is the result of improper input validation in the implementation of the TLS heartbeat extension, adopted into widespread use with the release of OpenSSL version 1.0.1 on 14th March, 2012.
Canadian Social Insurance Numbers and Mumsnet accounts are just two of the high-profile thefts attributed to CVE-2014-0160.
The bug was patched on the day of disclosure (April 7th 2014) with major online players such as Facebook, Google and Yahoo quick to reassure users of the fix. Google is said to have started work on a patch in late March, suggesting it knew about the issue before the disclosure.
If it ain’t (heart) broke…
Whether you’re Google or a small website, an insecure system is a scary proposition for any system administrator. Data, user accounts and private communications are all at risk, but implementing a quick fix isn’t always advisable, as former Opera Software developer Yngve Pettersen discovered.
His research, highlighted on The Register, revealed that at least 2,500 website administrators had made their previously secure sites vulnerable to Heartbleed by upgrading an unaffected server to a newer, but not yet officially patched, version of OpenSSL. He dubbed these servers ‘Heartbroken’.
For an administrator, the pressure to do something is strong, but consider the potentially severe financial and security costs of rushing a fix. Pettersen believes the total cost of patching servers that have been upgraded to insecure OpenSSL versions could exceed £7m.
A more considered approach
If you’re taking on a bug like CVE-2014-0160, you’ll need more than a fly swatter – you’ll need a plan. Patching a large environment or reacting quickly to a zero-day exploit can be very difficult, with requirements and conditions changing quickly. In our experience, there are a number of things that can be done to help get the job done quickly and accurately.
The 5-step process
1. Get a complete and thorough understanding of the issue
This understanding needs to be thorough enough to be able to identify all of the conditions that cause a particular server to be impacted by the issue in question.
2. Create a complete and up-to-date list of supported servers
The more automated and regular the data collections are, the better your list will be. This list MUST contain all of the information required to decide whether a server is impacted by the issue or not.
If we take Heartbleed as an example, the description of the vulnerability was contained within CVE-2014-0160, which basically said “all versions of OpenSSL between and including 1.0.1 and 1.0.1f were vulnerable to a remote crafted attack which could access sensitive information on the server”. This means that, in this case, our list of supported servers needs to record whether OpenSSL is installed and, if so, which version.
3. Compile an up-to-date list of all of the impacted servers
The complete list of supported servers that you’ve just created now needs to be reduced to include ONLY the impacted hosts – this is the list of servers that will be resolved by the SAs. If we again take Heartbleed as an example, any server running a version of OpenSSL outside the range 1.0.1 to 1.0.1f was not impacted and did not need to appear on the patching list.
If you’re using our product I-Insight, the Issue Tracker application module will allow you to define the issue-causing criteria (in the form of a script) and then scan all of the collected machine data to identify servers where the criteria are met. Issue Tracker will also track remediation progress.
4. Have a detailed execution plan
The execution plan needs to detail exactly what is required to solve the issue. If needed, it should be OS-specific and contain information about where the fix is kept and how it should be applied to each server.
5. Understand how to test for the correct resolution of the issue
This bit is the key: you must fully understand how to check that the issue has been correctly and fully resolved. This test can be done manually or, preferably, in an automated way. It is important to make sure that the issue stays fully resolved on all servers, so ideally you should re-run the tests regularly to confirm compliance.
It is often time-consuming and resource-intensive to complete this process fully, but ultimately resolving the issue or issues in one pass will save time in the long run. In our experience it is also very important to make sure that all servers are checked for all issue-causing conditions on a regular basis. This will make sure that the environment hasn’t regressed due to file restores, bare-metal restores or even new servers being introduced.
If you’re using I-Insight, the Issue Tracker module will make sure that the issue is not reintroduced into the environment by comparing the issue-causing criteria against all collected machine data every time a collection is run.
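To make steps 2, 3 and 5 a little more concrete, here is a minimal Python sketch of the Heartbleed case. It is not how Issue Tracker works internally; the inventory format (a hostname mapped to an OpenSSL version string, or `None` when OpenSSL is not installed), the function names and the sample hosts are all assumptions made for illustration. It also assumes plain upstream version strings: some distributions backport fixes without changing the version letter, so a real check would need the package release as well.

```python
import re

# Vulnerable range as quoted in the article: OpenSSL 1.0.1 through 1.0.1f.
# Assumption: plain upstream version strings such as "1.0.1e".
_VULNERABLE = re.compile(r"^1\.0\.1[a-f]?$")


def is_impacted(openssl_version):
    """Return True if the recorded version falls in the vulnerable range.

    None means OpenSSL is not installed on that server.
    """
    if openssl_version is None:
        return False
    return bool(_VULNERABLE.match(openssl_version.strip()))


def impacted_servers(inventory):
    """Step 3: reduce the full inventory (hostname -> version) to impacted hosts."""
    return sorted(host for host, version in inventory.items() if is_impacted(version))


def regressions(previous, current):
    """Step 5: hosts impacted in the current collection but not in the previous one."""
    before = set(impacted_servers(previous))
    return sorted(host for host in impacted_servers(current) if host not in before)


# Hypothetical data from two collection runs.
monday = {"web01": "1.0.1e", "web02": "0.9.8y", "db01": None}
friday = {"web01": "1.0.1g", "web02": "1.0.1c", "db01": None, "web03": "1.0.1"}

print(impacted_servers(monday))     # ['web01']
print(regressions(monday, friday))  # ['web02', 'web03']
```

In this made-up example web01 has been patched by Friday, but web02 has been ‘Heartbroken’ by an upgrade into the vulnerable range and web03 is a new server introduced with a vulnerable version, which is exactly the kind of regression a regular re-check is meant to catch.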
Good, now repeat 300,000 times
Not you personally, but all of the administrators for Heartbleed-affected servers – according to research by Errata Security, over 300,000 servers are still believed to be vulnerable to the bug. With that in mind, you might want to check if your favourite websites have been affected - Mashable has a list.
Please leave your comments below; it would be interesting to see your feedback.
Even though necessary, this poses the risk of unintended System Updates to a company’s server infrastructure and therefore a risk to its business. Although in most cases only System Enquiry access is intended, we frequently come across situations like the one in this real-life anecdote:
“A few years back I worked in a consulting capacity with a customer on a new system design. After a few iterations we needed to look at an existing production system design running a mobile call processing application for a Telecoms client. The customer phoned their in-house Technical Support, with the question of how often these systems actually get restarted.
The friendly System Administrator on the phone said ‘Let me log on to the server quickly. I’ll have a look for you.’
Then silence …
What had happened?
As we found out later, the System Administrator had logged directly into the machine as a privileged user to enquire about the machine’s reboot history. He used the command ‘last | reboot’, but he meant to use ‘last | grep reboot’. What should have been a simple ‘System Enquiry’ turned into mistakenly shutting down a production server, which in this case was followed by a lengthy attempt to invoke DR – the business impact was significant!”
Direct System Access is a risk, whether it is intended for System Enquiry or System Update! According to the Emerson Network Power 2013 Study on Datacenter Outages, 584 respondents in U.S. organisations reported unplanned outages; 48 percent cited Human Error as the root cause.
Unplanned outages can cause significant cost to companies strongly relying on IT Infrastructure. A more detailed cost analysis can be found here.
How much Direct System Access is really needed?
The following table shows typical Infrastructure Operations tasks and how their access types compare.
| Access type | System Enquiry | System Update |
|---|---|---|
| Casual access to view configuration | 100% | 0% |
| Automated data collection scripts | 100% | 0% |
The majority of all system access results in (intended) System Enquiry operations. As System Enquiry operations are typically not time-critical, they don’t necessarily require Direct System Access.
Can the associated risk be removed or minimised?
threeisquared’s technology helps avoid all or a large percentage of System Enquiry access – with I-Insight we replace ‘Direct System Access’ with ‘Cached System Access’.
‘Cached System Access’ is a technology allowing most day-to-day System Administration tasks including configuration view, troubleshooting, change preparation and data collection scripts to be carried out in a ‘zero-touch’ fashion. No single production machine will be accessed during those tasks. Our solution will eliminate unnecessary risks and significantly speed up work streams that would otherwise require Direct System Access to large numbers of servers.
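As a rough illustration of the idea (not a description of how I-Insight is implemented), the reboot-history question from the anecdote could be answered from previously collected `last` output without logging on to the server at all. The function name, the `CACHED_LAST` variable and its sample contents below are hypothetical.

```python
def reboot_entries(cached_last_output):
    """Return the lines of previously collected `last` output that record a reboot.

    Works on plain text only, so nothing is executed on the production server.
    """
    return [
        line
        for line in cached_last_output.splitlines()
        if line.split()[:1] == ["reboot"]  # first field is the pseudo-user "reboot"
    ]


# Hypothetical cached output from an earlier collection run.
CACHED_LAST = """\
reboot   system boot  3.10.0-123.el7   Mon Apr  7 09:12
admin    pts/0        10.0.0.5         Mon Apr  7 09:15
reboot   system boot  3.10.0-123.el7   Sun Mar  2 22:40
"""

print(len(reboot_entries(CACHED_LAST)))  # 2
```

Because the enquiry only reads cached text, a typo like the one in the anecdote can at worst produce a wrong answer, not a shutdown.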
For detailed information on how ‘Cached System Access’ can work for your IT Operations, please get in touch with us here.
London/Manchester, December 2013 – Many industries see a shift away from a ‘box and wires’ view of their IT Infrastructure towards more intelligent and efficient software solutions. Increased pressure from regulatory bodies and from internal and external Audit or Compliance departments leaves fewer resources for demanding day-to-day operational tasks. As IT Infrastructure becomes increasingly complex, so do the myriad work streams that are needed to keep large, global Datacenters running and server downtime to an absolute minimum. Traditional tools and methods such as threshold-breach-driven alerting, discrete measurements and disjointed or non-existent application views, paired with reactive Support models and unreliable inventory data, mean that ‘keeping the lights on’ can be a daunting undertaking that often results in operational ‘fire fighting’.
As the impact of those factors and their implications for sustainable, IT-dependent businesses become more apparent, there is a significant change in the Infrastructure and Datacenter Management Software landscape. Structured and targeted analytics replace operational guesswork, predictive diagnostics replace uncertainty, Datacenter configuration drift is understood and addressed, random resource access is replaced by channelled, risk-avoiding methods, and Datacenter Inventories and Golden Source data are no longer maintained manually but automatically updated through discovery methods.
threeisquared are committed to driving the change towards automated and efficient Datacenter Management. On December 5 2013 we partner up with TechHub Manchester and industry leaders Barclays and IBM to enter a dialog about changes in our industry and how entrepreneurship will shape the Future Of Enterprise Software.
London, October 2013 – threeisquared introduces Microsoft Windows Support to I-Insight throughout its application stack, including Machine Data collection, import, and data population and correlation. For the best possible performance and efficiency, the data collector is multithreaded and fully configurable, and supports all major Windows versions.
The collector’s architecture turns data collection for large numbers of servers into a matter of minutes – with minimal space and organisational requirements. The data collector is agent-less – completely removing the need to roll out endpoint device components. No need to deal with daemons that are not running, updates that need to be distributed, end point resource consumption or in fact any other of the usual problems associated with distributed application components.
Furthermore, no data is stored on any endpoint device at any time during collection, and all data collection methods fully support Microsoft’s Windows Management Instrumentation (WMI). Additional WMI classes can be added dynamically, depending on individual requirements and the need for new Machine Data.
“Downtime is always unwelcome and costly but for some organisations it can prove disastrous. Gartner estimates that the average cost to the business of downtime for small or medium sized enterprises is approximately £27,000 per hour but for larger organisations, with IT as a crucial part of the core business, this figure can be much larger.”
Download Whitepaper: Be prepared when disaster strikes
“The consequences of overlooking even a single security vulnerability can be severe. If a computer on the enterprise network gets infected with malware or is exposed to a root exploit, it can be expensive to fix, expose client details and/or intellectual property and may also prevent users from being productive while the problem is being addressed.”
Download Whitepaper: Protect Your IT Assets