Swatting one hell of a bug (with our tried and tested 5-Point Patching Process)

It’s been described as the most severe threat to online security since the start of widescale internet usage. It’s so serious it has its own website, and so effective that the NSA reportedly used it to gather critical intelligence. Present on roughly two-thirds of all websites, it allows attackers to eavesdrop on communications, steal data and impersonate services and users.

This isn’t a Hollywood script; this is CVE-2014-0160, better known as Heartbleed, and it’s still a very real threat to server security.


High-profile thefts

Heartbleed is a vulnerability in OpenSSL that allows the theft of information normally protected by SSL/TLS encryption. The vulnerability is the result of improper input validation in the implementation of the TLS heartbeat extension, adopted into widespread use with the release of OpenSSL version 1.0.1 on 14th March, 2012.

The theft of Canadian Social Insurance Numbers and the compromise of Mumsnet accounts are just two of the high-profile breaches attributed to CVE-2014-0160.

The bug was patched on the day of disclosure (April 7th 2014) with major online players such as Facebook, Google and Yahoo quick to reassure users of the fix. Google is said to have started work on a patch in late March, suggesting it knew about the issue before the disclosure.


If it ain’t (heart) broke…

Whether you’re Google or a small website, an insecure system is a scary proposition for any system administrator. Data, user accounts and private communications are all at risk, but implementing a quick fix isn’t always advisable, as former Opera software developer Yngve Pettersen discovered.

His research, highlighted on The Register, revealed that at least 2,500 website administrators had made their previously secure sites vulnerable to Heartbleed by upgrading an unaffected server to a newer, but not yet officially patched, version of OpenSSL. He dubbed these servers ‘Heartbroken’.

For an administrator the pressure to do something is strong, but consider the potentially severe financial and security costs of rushing a fix. Pettersen believes the total cost of patching servers that were upgraded to insecure OpenSSL versions could exceed £7m.


A more considered approach

If you’re taking on a bug like CVE-2014-0160 you’ll need more than a fly swatter – you’ll need a plan. Patching a large environment or reacting quickly to a zero-day exploit can be very difficult, with requirements and conditions changing quickly. In our experience there are a number of things that can be done to help get the job done quickly and accurately.


The 5-Point Process

1. Get a complete and thorough understanding of the issue

This understanding needs to be thorough enough to be able to identify all of the conditions that cause a particular server to be impacted by the issue in question.


2. Create a complete and up-to-date list of supported servers

The more automated and regular the data collections are, the better your list will be. This list MUST contain all of the information required to decide whether or not each server is impacted by the issue.

If we take Heartbleed as an example, the description of the vulnerability was contained within CVE-2014-0160, which in essence said that all versions of OpenSSL from 1.0.1 to 1.0.1f inclusive were vulnerable to a remote attack, using a crafted request, that could access sensitive information on the server. This means that, for this case, our list of supported servers needs to include information about whether OpenSSL is installed and, if so, which version.
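The version test above can be sketched in code. This is a minimal, hypothetical helper – not part of any real inventory tool – assuming plain OpenSSL version strings such as ‘1.0.1e’; versions 1.0.1 through 1.0.1f are in the vulnerable range, while 1.0.1g and later (and everything before 1.0.1) are not.

```python
# Sketch: decide whether an OpenSSL version string falls inside the
# Heartbleed-vulnerable range (1.0.1 through 1.0.1f inclusive).
# Hypothetical helper for illustration only.

VULNERABLE_SUFFIXES = ("", "a", "b", "c", "d", "e", "f")

def is_heartbleed_vulnerable(version: str) -> bool:
    """Return True if `version` is 1.0.1 or 1.0.1a-1.0.1f."""
    if not version.startswith("1.0.1"):
        return False
    # Plain "1.0.1" (no letter) is vulnerable; 1.0.1g and later are patched.
    return version[len("1.0.1"):] in VULNERABLE_SUFFIXES

print(is_heartbleed_vulnerable("1.0.1e"))  # True  - vulnerable
print(is_heartbleed_vulnerable("1.0.1g"))  # False - patched
print(is_heartbleed_vulnerable("0.9.8y"))  # False - never affected
```

In a real environment the version string would come from your collected machine data rather than being passed in by hand.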


3. Compile an up-to-date list of all of the impacted servers

The complete list of supported servers that you’ve just created now needs to be reduced to include ONLY the impacted hosts – this is the list of servers that will be remediated by the system administrators. If we again take Heartbleed as an example, any server running a version of OpenSSL outside the range 1.0.1 to 1.0.1f was not impacted and did not need to appear on the patching list.
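The reduction step can be sketched as a simple filter over inventory records. The record layout below (hostname plus an `openssl_version` field, None when OpenSSL is not installed) is an assumption for illustration, not the format of any particular tool.

```python
# Sketch: reduce a full server inventory to only the Heartbleed-impacted
# hosts. The inventory records here are hypothetical example data.

inventory = [
    {"host": "web01", "openssl_version": "1.0.1c"},
    {"host": "web02", "openssl_version": "0.9.8y"},
    {"host": "db01",  "openssl_version": None},      # OpenSSL not installed
    {"host": "app01", "openssl_version": "1.0.1g"},  # already patched
]

def impacted(record) -> bool:
    """True if the host runs OpenSSL 1.0.1 to 1.0.1f inclusive."""
    v = record["openssl_version"]
    if v is None or not v.startswith("1.0.1"):
        return False
    return v[len("1.0.1"):] in ("", "a", "b", "c", "d", "e", "f")

patch_list = [r["host"] for r in inventory if impacted(r)]
print(patch_list)  # only web01 needs to appear on the patching list
```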

If you’re using our product I-Insight, the Issue Tracker application module will allow you to define the issue-causing criteria (in the form of a script) and then scan all of the collected machine data to identify servers where the criteria are met. Issue Tracker will also track remediation progress.


4. Have a detailed execution plan

The execution plan needs to detail exactly what is required to solve the issue. If needed, it should be OS-specific and contain information about where the fix is kept and how it should be applied to each server.


5. Understand how to test for the correct resolution of the issue

This step is key: you must fully understand how to check that the issue has been correctly and fully resolved. This test can be done manually or, preferably, in an automated way. It is important to make sure that the issue stays fully resolved on all servers, so ideally you should re-run the tests regularly to confirm compliance.
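A repeatable compliance check might look like the sketch below. `collect_version` is a stand-in for however your tooling gathers the installed OpenSSL version from each host (for example, by parsing ‘openssl version’ output over SSH); the fake data inside it is purely illustrative.

```python
# Sketch: a re-runnable compliance check across a set of hosts.
# collect_version() is a stub standing in for a real data collector.

def collect_version(host):
    # Hypothetical collected data; a real collector would query the host.
    fake_data = {"web01": "1.0.1g", "web02": "1.0.1e"}
    return fake_data.get(host)

def still_vulnerable(version) -> bool:
    if version is None or not version.startswith("1.0.1"):
        return False
    return version[len("1.0.1"):] in ("", "a", "b", "c", "d", "e", "f")

def compliance_report(hosts):
    """Map each host to True if it is (still) vulnerable."""
    return {h: still_vulnerable(collect_version(h)) for h in hosts}

print(compliance_report(["web01", "web02"]))
# web02 has regressed (or was never patched) and re-enters the work queue
```

Running a check like this on a schedule, rather than once, is what catches the regressions from restores and newly built servers discussed below.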

It is often time-consuming and resource-intensive to complete this process fully, but ultimately resolving the issue or issues in one pass will save time in the long run. In our experience it is also very important to make sure that all servers are checked for all issue-causing conditions on a regular basis. This will make sure that the environment hasn’t regressed due to file restores, bare metal restores or even new servers being introduced.

If you’re using I-Insight, the Issue Tracker module will make sure that the issue is not reintroduced into the environment by comparing the issue-causing criteria against all collected machine data every time a collection is run.


Good, now repeat 300,000 times

Not you personally, but all of the administrators for Heartbleed-affected servers – according to research by Errata Security, over 300,000 servers are still believed to be vulnerable to the bug. With that in mind, you might want to check whether your favourite websites have been affected – Mashable has a list.


Please leave your comments below; we’d be interested to see your feedback.


Direct System Access – Necessity or Risk?

London, December 2013. What is Direct System Access? A System Administrator can log into a server to carry out tasks as part of their duties, often working directly on a production system.

Even though it is necessary, this poses the risk of unintended System Updates to a company’s server infrastructure and therefore a risk to its business. Although in most cases only System Enquiry access is intended, we frequently come across situations like the one in this real-life anecdote:


“A few years back I worked in a consulting capacity with a customer on a new system design. After a few iterations we needed to look at an existing production system design running a mobile call processing application for a Telecoms client. The customer phoned their in-house Technical Support, with the question of how often these systems actually get restarted.

The friendly System Administrator on the phone said ‘Let me log on to the server quickly. I’ll have a look for you.’

Then silence…

What had happened?
As we found out later, the System Administrator had logged directly into the machine as a privileged user to enquire about the machine’s reboot history. He used the command ‘last | reboot’, but he meant to use ‘last | grep reboot’. Instead of filtering the login history for reboot entries, the mistyped command piped the output of ‘last’ straight into the ‘reboot’ command itself. What should have been a simple ‘System Enquiry’ turned into mistakenly shutting down a production server, which in this case was followed by a lengthy attempt to invoke DR – the business impact was significant!”


Direct System Access is a risk, regardless of whether it is for System Enquiry or System Update! According to the Emerson Network Power 2013 Study on Data Center Outages, 584 respondents in U.S. organisations reported unplanned outages, with 48 percent citing Human Error as the root cause.

Unplanned outages can cause significant cost to companies strongly relying on IT Infrastructure. A more detailed cost analysis can be found here.


How much Direct System Access is really needed?

The following table shows typical Infrastructure Operations tasks and how their access types compare.


Access reason                           System Enquiry   System Update
Casual access to view configuration     100%             0%
Change preparation                      100%             0%
Change execution                        60%              40%
Troubleshooting                         95%              5%
Automated data collection scripts       100%             0%

The majority of all system access results in (intended) System Enquiry operations. As System Enquiry operations are typically not time-critical, they don’t necessarily require Direct System Access.



Can the associated risk be removed or minimised?

threeisquared’s technology helps avoid all, or a large percentage of, System Enquiry access – with I-Insight we replace ‘Direct System Access’ with ‘Cached System Access’.

‘Cached System Access’ is a technology that allows most day-to-day System Administration tasks – including configuration viewing, troubleshooting, change preparation and data collection scripts – to be carried out in a ‘zero-touch’ fashion. No production machine is accessed during those tasks. Our solution eliminates unnecessary risks and significantly speeds up work streams that would otherwise require Direct System Access to large numbers of servers.

For detailed information on how ‘Cached System Access’ can work for your IT Operations, please get in touch with us here.



The Future of Enterprise Software

London/Manchester, December 2013 – Many industries see a shift away from a ‘box and wires’ view of their IT Infrastructure towards more intelligent and efficient software solutions. Increased pressure from regulatory bodies and from internal and external Audit or Compliance departments leaves fewer resources for demanding day-to-day operational tasks. As IT Infrastructure becomes increasingly complex, so do the myriad work streams needed to keep large, global Datacenters running and server downtime to an absolute minimum. Traditional tools and methods – threshold-breach-driven alerting, discrete measurements, disjointed or non-existent application views, paired with reactive Support models and unreliable inventory data – often mean that ‘keeping the lights on’ is a daunting undertaking that results in operational ‘fire fighting’.

As the impact of those factors and their implications for sustainable, IT-dependent businesses become more apparent, there is significant change in the Infrastructure and Datacenter Management Software landscape. Structured and targeted analytics replace operational guesswork, predictive diagnostics replace uncertainty, Datacenter configuration drift is understood and addressed, random resource access is replaced by channelled, risk-avoiding methods, and Datacenter Inventories and Golden Source data are no longer maintained manually but are automatically updated through discovery methods.

threeisquared are committed to driving the change towards automated and efficient Datacenter Management. On 5 December 2013 we partner with TechHub Manchester and industry leaders Barclays and IBM to enter a dialogue about changes in our industry and how entrepreneurship will shape the Future of Enterprise Software.



I-Insight: Support for Microsoft Windows

London, October 2013 – threeisquared introduces Microsoft Windows support to I-Insight throughout its application stack, including Machine Data collection, import, data population and correlation. For the best possible performance and efficiency the data collector is multithreaded and fully configurable, supporting all major Windows versions.

The collector’s architecture turns data collection for large numbers of servers into a matter of minutes – with minimal space and organisational requirements. The data collector is agentless, completely removing the need to roll out endpoint device components. There is no need to deal with daemons that are not running, updates that need to be distributed, endpoint resource consumption or, in fact, any of the other usual problems associated with distributed application components.

Furthermore, no data is stored on any endpoint device at any time during collection, and all data collection methods fully support Microsoft’s Windows Management Instrumentation (WMI). Additional WMI classes can be added dynamically, depending on individual requirements and the need for new Machine Data.


Whitepaper: Be prepared when disaster strikes!

“Downtime is always unwelcome and costly but for some organisations it can prove disastrous. Gartner estimates that the average cost to the business of downtime for small or medium sized enterprises is approximately £27,000 per hour but for larger organisations, with IT as a crucial part of the core business, this figure can be much larger.”

Download Whitepaper:   Be prepared when disaster strikes


Whitepaper – Protect Your IT Assets!

“The consequences of overlooking even a single security vulnerability can be severe. If a computer on the enterprise network gets infected with malware or is exposed to a root exploit, it can be expensive to fix, expose client details and/or intellectual property and may also prevent users from being productive while the problem is being addressed.”

Download Whitepaper:   Protect Your IT Assets

© Copyright 2013, 2014, 2015, 2016 - threeisquared Ltd, Registered at Companies House for England and Wales. Incorporation Number 8640292.
