Cloud fundamentals/Understanding the cloud

This section aims to help the reader understand "the cloud", or at least what Microsoft think the cloud is. The cloud idea is quite an old one: it is a simple way of saying that there are parts of a system you do not need to know about, so you just trust that they work and hide them in a cloud. However, to find out about the parts in the cloud, we need to lift the lid and look inside.

Cloud principles and delivery mechanisms

Businesses have a choice between providing their IT 'back-end' (servers etc.) on their own premises or using similar facilities located in the cloud.  There are advantages and disadvantages to each, often depending on the size and type of business.

There are a number of sites and documents available on the Internet that discuss the issues in detail, such as On premises vs. Cloud based solutions from a company called GFI.

Think about how your organization (company, school, college etc.) operates. Does it store and process all its data on site, use the cloud, or maybe both?

What do you expect from your IT system?

There are four fundamental characteristics you need to consider in relation to cloud services. They are:

Elasticity - Can you expand the storage capacity when you need it, and reduce it when you don't? Cloud systems have huge pools of storage, so extra storage can be created for a customer faster than the data requirement can grow.

Scalability - Can you add more power, usually in the form of the ability to deal with more connections (customer requests) at the same time? As the extra machines will be virtual machines, they can be created in seconds.

Redundancy - If part of the system breaks, is there another part that can take over? This feature is often built into the servers, in the form of spare power supplies and hard drives, along with software that can spot potential problems and alert administrators before a failure actually occurs.

Availability - If the whole system breaks, so that the failure affects many organisations and not just those using the same hardware as you, how quickly can things be back up and running? One common measure for this is referred to as Five Nines; in other words, the system will be available 99.999% of the time. This equates to a downtime of just over five minutes a year on average. This is achieved by having the extra storage (elasticity), rapidly available new servers (scalability) and parts that can take over in the event of a failure (redundancy).
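To see where a figure like Five Nines comes from, the downtime implied by an availability percentage can be worked out directly. The following is a minimal sketch, assuming a 365-day year; the function name is made up for this illustration and does not come from any Microsoft tool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def downtime_minutes_per_year(availability_percent):
    """Return the yearly downtime (in minutes) implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for label, percent in [("three nines", 99.9),
                           ("four nines", 99.99),
                           ("five nines", 99.999)]:
        minutes = downtime_minutes_per_year(percent)
        print(f"{label} ({percent}%): {minutes:.2f} minutes of downtime per year")
```

Each extra nine cuts the permitted downtime by a factor of ten, which is why the step from 99.9% to 99.999% is much harder to achieve than it first appears.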

Capital expenditure funding model

Often, if a company hosts its own data processing servers and storage, it will have had to purchase the equipment and software from what is referred to as capital expenditure. In other words, it will have had to pay the full cost of the equipment before using it. Capital expenditure is sometimes abbreviated to CapEx.

Wikipedia has an explanation of the term capital expenditure which is worth reading if you are not sure what it means. It will also lead you to the term Total Cost of Ownership, which is another important business concept. Total cost of ownership (TCO) is important as it will be taken into account when an organization looks at either updating its equipment or moving into the cloud.

Operational expense funding model

An alternative is essentially to rent time and space on someone else's system. This may belong to a locally based Internet Service Provider or to a company offering a cloud service. In this case the company would either pay a subscription for the service (based on agreed limits), or 'pay-as-you-go', paying for processor time and storage as they are used. Sometimes there is a combination of both. Here the expenditure is considered to be part of the operational costs (abbreviated to OpEx).

There is a range of arguments relating to both methods. For example, it can be argued that capital expenditure is better as there may be tax incentives that make it financially attractive. On the other hand, if money is in short supply, it makes more sense to pay only for the IT facilities you use, when you use them.
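The trade-off between the two funding models can be illustrated with some simple arithmetic. The sketch below compares the cumulative cost of buying equipment up front with the cumulative cost of paying a monthly fee. All of the figures and function names are invented for illustration, and the model deliberately ignores tax incentives, depreciation, financing and the other elements of total cost of ownership.

```python
def capex_total(purchase_cost, monthly_running_cost, months):
    """Cumulative cost of owning: full purchase price up front, plus running costs."""
    return purchase_cost + monthly_running_cost * months


def opex_total(monthly_fee, months):
    """Cumulative cost of renting: nothing up front, a fee every month."""
    return monthly_fee * months


def break_even_month(purchase_cost, monthly_running_cost, monthly_fee,
                     horizon_months=120):
    """Return the first month in which owning costs no more than renting.

    Returns None if owning never catches up within the horizon.
    """
    for month in range(1, horizon_months + 1):
        owning = capex_total(purchase_cost, monthly_running_cost, month)
        renting = opex_total(monthly_fee, month)
        if owning <= renting:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical figures: 24,000 to buy, 200 a month to run, 1,000 a month to rent.
    print(break_even_month(24000, 200, 1000))
```

With these made-up numbers, renting is cheaper for the first two and a half years, which is the kind of result that makes the operational model attractive when money is in short supply.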

Customisable vs. Configurable

Equipment and software that is on-premises can be customised as you have full access to it. In the cloud, on the other hand, you may be able to configure the resources but you cannot easily customise them. This does not mean that you cannot use customised software in the cloud; it just means that you may have to use a specific software tool (e.g. a particular version of a programming language) or brand of hardware.

The hybrid approach

Organizations can use a combination of the two approaches; this is called a hybrid model. Some processing and storage can be done locally (particularly if the data is very sensitive), while some processing is done in the cloud. The processing in the cloud may only be used occasionally when required, which reduces the risk of overloading the on-premises equipment. The ability to grow and contract in the cloud is referred to as the elasticity of the cloud.

Cloud security requirements and policies

Microsoft understandably have a lot to say about security, so it is worth taking some time to look over their Security in the cloud webpage. This only scratches the surface of security, so you will have to look further afield to get a more complete picture. The Wikipedia Cloud computing security article provides more information and links to other topics that are not covered by this MTA but are important in understanding the cloud. Don't forget, details about security are often hidden for security reasons!

Don't get sidetracked by trying to read everything! If your aim is to pass the MTA exam, concentrate on those topics Microsoft say they will test, but also read around the subject, as Microsoft make it clear the exam will not be limited to just the topics they list.

Managing privacy

Microsoft commissioned a number of reports on privacy; however, they can be a heavy read. What emerges from these reports tends to be that commercial cloud users expect cloud providers to meet certain accepted standards such as SSAE-16, PCI DSS, or ISO 27001. These standards may also cover the topics below, but the exam is unlikely to expect you to know the detailed content of these standards.

The security sections below contain links to videos that not only cover security in general but also talk about managing privacy.

The software used to create the cloud is very complex and is designed to prevent one customer from seeing another customer's data. Great stress is put upon physical security in order to keep your data private, and on top of that the data will be encrypted to give a layer of logical security too.

Managing compliance

Compliance is an area of business which is to do with making sure that an organization is following all the controls or regulations that apply to it. This could range from ensuring they follow health and safety guidelines, through to controls on moving money or data from country to country. These are often complex when it comes to data, as you will have appreciated if you dared to read the full details of any of the standards mentioned above.

As far as we are concerned here, compliance goals are met by Microsoft in Office 365 through the use of templates, as discussed in this video (about 1 minute in). Applying a template ensures that the way your data is stored meets the requirements of the chosen regulations; for example, you may be required to store your data in a particular part of the world (e.g. within the European Economic Area). Microsoft continually monitor regulations and controls, and update the compliance software to take changes into account. See the next section for more details.

Securing data at rest

Would you trust Microsoft or Google or any other cloud provider with your data? There is no need to answer that, but think about it.

Microsoft go to great lengths to secure Office 365 data as shown in this video. If you are not convinced see what their security team say in this video. These videos concentrate on protecting your data at rest and checking no-one looks at it.

On the other hand, this Google video looks at how their datacenters operate and highlights some of the logical methods (did you spot encryption?) and physical methods (did you spot retina scanners and alligator pools?) they use. (If you missed the alligators, watch the video again.)

By now you should be reasonably convinced that these (and similar) companies do go to a lot of trouble to protect your data and manage privacy. The techniques they use demonstrate some of the requirements of the security compliance standards mentioned in the privacy section above.

Securing data in transit

Vijay Kumar from Microsoft explains how Office 365 data is protected in transit in this video.

Data and operations transparency

In this context the term transparency means that you can find out what is going on (more or less) with your data, such as where it is, and that there is someone available on the end of a phone 24x7 whom you can consult if you need to. You are also assured that if there is a problem it will be investigated and you can access the outcomes. This is explained by Microsoft in the context of Office 365 on this webpage.

How a cloud service stays up-to-date and available

This Microsoft blog post gives an outline of how best practice is applied in order to keep Office 365 up and running.

The service and feature improvement process

Like many big software companies, Microsoft work continually on improving their products. Improvements can arise from a number of sources, including user feedback and problems that have been raised by users. With conventional software distribution these improvements would manifest themselves as updates, hot fixes, or service packs, but with the software as a service approach the improvements are normally added automatically on a rolling basis as and when they are ready for release. To a certain extent this takes the burden of ensuring that updates are tested and applied in a timely manner off system administrators. Information on updates is provided in this roadmap.

Monitoring the service health

How do you maintain things?

It is inevitable that at some stage something will either break or need to be replaced, and normally this would require services to be stopped and equipment switched off for a while. However, this disruption can mostly be avoided in the cloud by providing redundancy, resilience, distributed services and monitoring. In addition, you would probably like to know what is going on, so most (if not all) cloud services have a Service Health Dashboard that tells you the current state of your cloud, including information on any problems that may have affected you but were dealt with without you realising.

Having redundant services, such as spare virtual machines that can be rapidly brought on line, means that if your virtual server (or the actual physical hardware) fails, your processing can be transferred to a 'new' server in seconds. This also applies to storage. The major cloud providers like Microsoft, Google and Amazon not only have spare capacity at your 'local' datacenter, but can also switch your cloud to an alternative datacenter if required.

Resilience is closely linked to redundancy, in that your cloud will be operating on two or more virtual machines in different places, not just one. These machines are constantly monitored and your load is split across them, so that if one fails the other can take over rapidly.
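The idea of splitting load across machines and taking over on failure can be sketched in a few lines. This is a toy model only, assuming a simple round-robin policy and machine names invented for the example; it is not how any particular cloud provider implements resilience.

```python
from itertools import cycle


class LoadSplitter:
    """Spread requests across machines in turn, skipping any marked as failed."""

    def __init__(self, machine_names):
        self.healthy = {name: True for name in machine_names}
        self._order = cycle(list(machine_names))

    def mark_failed(self, name):
        """Record that monitoring has detected a failure on this machine."""
        self.healthy[name] = False

    def mark_recovered(self, name):
        """Record that the machine is back in service."""
        self.healthy[name] = True

    def route(self):
        """Return the name of the next healthy machine to handle a request."""
        for _ in range(len(self.healthy)):
            candidate = next(self._order)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy machine available")


if __name__ == "__main__":
    splitter = LoadSplitter(["vm-a", "vm-b"])  # hypothetical machine names
    print([splitter.route() for _ in range(4)])  # load alternates between both
    splitter.mark_failed("vm-a")
    print([splitter.route() for _ in range(3)])  # the survivor takes everything
```

While both machines are healthy the requests alternate between them; once one is marked as failed, every request goes to the survivor without the caller having to do anything differently.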

When you have a complex cloud like Office 365 which includes SharePoint, Exchange and Lync, the individual services are distributed across a number of different physical servers. This means that if one part of the system has a major failure, the remaining services are unaffected.

Underpinning all of this is a monitoring system that informs not only you as a user, via the Service Health Dashboard, of the state of your cloud, but also the datacenter staff. This enables the support team to pro-actively deal with problems before they become major incidents, and provide you with a level of service that meets or exceeds your expectations.

Where next - what is the roadmap

Microsoft provide a roadmap of where they are taking Office 365. This comprehensive map indicates what they have updated, what they are in the process of or intend to update, and what they are no longer working on. At the time of writing it indicates there are 22 new features launched, 15 on the way, 40 under development and one cancelled (Simplifying Admin service settings!). The roadmap also gives access to previous updates.

Service Level Agreements

For an overview of what a service level agreement is, see the Wikipedia article. The article goes into more depth (and technical language!) than you would be expected to know for any of the MTAs, but it does give a flavour. In essence an SLA details what you the customer can expect from a supplier, and in this case it will primarily revolve around reliability, which Microsoft give as an uptime of 99.9%. This equates to a maximum loss of service of almost 9 hours per year, while in practice the actual level is more likely to be better than 99.99%, or 53 minutes per year. Microsoft list their support mechanisms for Office 365 on this webpage.
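Checking whether a service has met an uptime target is the reverse calculation: start from the recorded outage time and work out the percentage achieved. The sketch below assumes a 30-day measurement period and invented outage figures; a real SLA defines its own measurement period and its own rules for what counts as an outage.

```python
MINUTES_IN_30_DAYS = 30 * 24 * 60  # 43,200 minutes; the period length is an assumption


def achieved_uptime_percent(total_minutes, outage_minutes):
    """Return the percentage of the period during which the service was up."""
    return 100 * (total_minutes - outage_minutes) / total_minutes


def meets_sla(total_minutes, outage_minutes, target_percent=99.9):
    """Return True if the achieved uptime is at least the agreed target."""
    return achieved_uptime_percent(total_minutes, outage_minutes) >= target_percent


if __name__ == "__main__":
    for outage in (30, 60):  # hypothetical outage totals, in minutes
        uptime = achieved_uptime_percent(MINUTES_IN_30_DAYS, outage)
        verdict = "met" if meets_sla(MINUTES_IN_30_DAYS, outage) else "missed"
        print(f"{outage} minutes down: {uptime:.3f}% uptime, target {verdict}")
```

Over a 30-day period a 99.9% target leaves room for only about 43 minutes of outage, so a single hour-long incident would be enough to miss it.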

Service provider liability

This term describes what the service provider will do if they do not meet their service level agreement. Given their global resources, Microsoft are very unlikely to fail to meet a 99.9% uptime target. You also have to take into account that failures elsewhere in the Internet are not covered by Microsoft's liability. Your Internet Service Provider may also cover themselves with a similar limited liability.

If Microsoft did fail to meet their agreed service level, there would probably be some sort of financial penalty they would pay their customers. However, the agreement is probably worded in such a way that an actual payout would be very unlikely, because of the way they can rapidly move services around the globe to mitigate the problem, and the fact that the customer data will remain untouched.

Different types of cloud service

One approach to defining the types of cloud computing is to look at what kind of services they provide. This is not the same as the level of service, which may be judged on, for example, how quickly an organisation responds to your request for a service.

There are a number of terms and acronyms commonly used when discussing the different types of cloud computing:  

IaaS (Infrastructure as a service)

In this instance the provider offers the hardware, such as servers and storage, along with a network connection to it, and little else. You as the client are responsible for installing and maintaining the operating system and other software. This service is the foundation on which all the other services are built (see diagram above). The Amazon Elastic Compute Cloud (Amazon EC2) is an example of this type of service.

PaaS (Platform as a service)

Here the provider makes available not just the hardware but also some of the software. This software would typically include the operating system, database, web server, and programming tools. Systems such as Microsoft Azure fall into this category. The user has little control of the underlying hardware, but is able to manage the applications they install.

See the Wikipedia page for more information: Wikipedia - Platform as a service

SaaS (Software as a service)

This extends the PaaS idea to include key business applications such as email, an online shop, or Customer Relationship Management (CRM) facilities. One example of SaaS is a product called Sales Cloud from the company Salesforce.com. See the Wikipedia page for more information: Wikipedia - Software as a service
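The three models described so far differ mainly in how much of the stack the provider looks after. The following sketch records that split as a simple lookup table. The layer names are a simplification chosen for this example, based on the descriptions above, and real services often draw the boundaries slightly differently.

```python
# Layers of a typical system, from the bottom of the stack to the top.
LAYERS = [
    "hardware and network",
    "operating system",
    "platform software",      # e.g. database, web server, programming tools
    "business applications",
]

# Which layers the provider takes care of under each service model.
PROVIDER_MANAGES = {
    "IaaS": {"hardware and network"},
    "PaaS": {"hardware and network", "operating system", "platform software"},
    "SaaS": set(LAYERS),
}


def customer_manages(model):
    """Return the layers left for the customer to install and maintain."""
    return [layer for layer in LAYERS if layer not in PROVIDER_MANAGES[model]]


if __name__ == "__main__":
    for model in ("IaaS", "PaaS", "SaaS"):
        remaining = customer_manages(model) or ["nothing in the stack itself"]
        print(f"{model}: customer manages {', '.join(remaining)}")
```

Reading down the table shows each model building on the one before it: the further you move from IaaS towards SaaS, the less there is for the customer to install and maintain.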

XaaS (Anything as a service)

This is a rather vague term as it covers anything that can be provided. It could be considered to be a system where you can draw on expertise or equipment for any aspect of IT and computing. It implies that if something (computing related) exists, then someone can provide it from a central point. For example, companies exist that can produce printed circuit boards for prototype electronic devices from the diagram files you send them.

UCaaS (Unified Communications as a service)

Unified communications is a term that describes a system that provides for all types of communication on one infrastructure. This would include the processing and storage of data, email, video, and telephony services. This is similar to the converged network idea that Cisco talk about. This video from Cisco gives some examples: How connected we are

Hybrid solutions

There is no requirement for an organisation to use just a cloud service or just an on-premises solution; the two can be combined in a hybrid model. Services like Microsoft Azure can do this particularly well, as key parts of the network such as Active Directory are common to both systems.

A hybrid solution would suit an organisation that has a large investment in its own equipment and is either expecting to grow rapidly or has seasonal computing demands. A hybrid cloud solution can not only grow, it can also shrink back when the extra capacity is not required.
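The seasonal case can be pictured as a fixed amount of on-premises capacity with any overflow sent to the cloud. The sketch below uses arbitrary "units" of demand and invented monthly figures purely to show the growing and shrinking behaviour; the function name is hypothetical.

```python
def split_demand(demand_units, on_premises_capacity):
    """Return (on_premises_units, cloud_units) for a given level of demand.

    The on-premises equipment is filled first and only the overflow
    is sent to the cloud.
    """
    on_premises_units = min(demand_units, on_premises_capacity)
    cloud_units = demand_units - on_premises_units
    return on_premises_units, cloud_units


if __name__ == "__main__":
    capacity = 100  # hypothetical capacity of the organisation's own equipment
    monthly_demand = {"October": 80, "November": 140, "December": 250, "January": 90}
    for month, demand in monthly_demand.items():
        local, cloud = split_demand(demand, capacity)
        print(f"{month}: {local} units on premises, {cloud} units in the cloud")
```

In the quiet months the cloud share is zero, so under a pay-as-you-go arrangement the organisation would pay for cloud capacity only during the seasonal peak.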

Another scenario would be for an organisation that holds confidential data that it would not trust to a third party.  The sensitive data could be held and processed on premises while cloud computing could be used for less sensitive processing.