
The day the systems administrator was eliminated from the Earth… fact or fiction?

As software grows more complex and demands the scalability of the cloud, IT’s mechanic of today, the systems administrator, will disappear. Tomorrow’s systems administrator will be entirely unlike anything we have today.

For as long as there have been computer systems, there has been a group of individuals managing and monitoring them: system administrators. These individuals have been the glue of the data center, responsible for provisioning and managing systems, from the monolithic platforms of old to today’s mixed bag of hardware, storage, operating systems, middleware, and software.

The typical system administrator possessed superhuman diagnostic and repair skills to keep a complex mix of disparate systems humming along happily. The best system administrators have always been the “Full Stack” individuals armed with every skill needed to keep systems up and running, but these individuals were few and far between.

Data centers have become more complex over the past decade as systems have been broken down into functional components and segregated into groupings. Storage has been migrated to centralized blocks like SAN and NAS, inevitably forcing personnel to specialize in specific tasks and skills.

Over the years, this same trend has happened with Systems Infrastructure Engineers/Administrators, Network Engineers/Administrators and Application Engineers/Administrators.

Everywhere you look, intelligence is being built directly into products. I was browsing the aisles at Lowe’s this past weekend and noted that clothes washers, dryers, and refrigerators now ship equipped with Wi-Fi and NFC to assist with troubleshooting problems, collecting error logs, and opening service tickets. No longer do we need to pore over thousand-page manuals looking for error code EC2F to tell us that the water filter has failed; the software can do it for us! It has become immediately apparent that if tech like this has made its way into basic consumer items, things must be changing even more rapidly at the top.

I obviously work in the tech industry and would like to think of myself as a technologist and someone who is very intrigued by emerging technologies. Electric cars, drones, remotely operated vehicles, smartphones, laptops that can last 12+ hours daily while fitting in your jeans pocket and the amazing ability to order items from around the globe and have them shipped to your door. These things astound me.

The modern car was invented in 1886, and in 1903 we invented the airplane. The first commercial air flight was not until 1914, but to see how far we have come in such a short time is astounding. It almost makes you think we were asleep for the century prior.

As technology has evolved, there has been a need for software to evolve at a similarly rapid pace. In many ways, hardware engineering outpaced software over the last score of years, and now software is catching up and beginning to surpass hardware engineering.

Calm down, I know I am rambling again. I will stop digressing and get to the point.

The fact is, the systems administrator as we know it is a dying breed, like the dinosaur, the caveman, and the woolly mammoth. All of these were great at some things but not adaptable enough to stay alive, and thus they were wiped out.

So what happens next? Do we all lose our jobs? Does the stock market go into free fall while we all start drinking Brawndo the Thirst Mutilator? (If you haven’t seen Idiocracy, I feel for you.) The fact is, it’s going to be a long, slow, and painful death.

Companies are going to embrace cloud at a rapid rate and as this happens people will either adapt or cling to their current ways. Not every company is going to be “cloudy”.

Stop. Let me state something. I absolutely HATE the word Cloud. It sounds so stupid. Cloud. Cloud. Cloud. Just say it. How about we all instead embrace the term “shared-nothing scalable distributed computing”? That sounds better.

So, is this the end of the world? No, but it does mean “The Times They Are a-Changin’,” to quote Mr. Dylan.

The fact is, change is inevitable. If things didn’t change we would still be living in huts, hunting with our bare hands, and using horses as our primary method of transportation. We wouldn’t have indoor toilets, governments, rules, regulations, or protection from others, as there would be no legal system.

Sometimes change is good and sometimes it’s bad. In this case, I see many good things coming down the road, but I think we all need to see the signs posted along the highway.

Burying one’s head in the sand like an ostrich is not going to protect you.


How to build a large-scale multi-tenant cloud solution

It’s not terribly difficult to design and build a turnkey, integrated, pre-configured, ready-to-use SDDC solution. Building one that completely abstracts the physical compute, storage, and network resources, provides multiple tenants with a pool of logical resources along with all the necessary management, operational, and application-level services, and scales those resources through the seamless addition of new rack units, however, is another matter.

The architecture should be vendor agnostic, with limited software tie-in to any single hardware vendor’s specifics, yet expandable to support various vendors’ hardware through a plug-and-play design.

Decisions should be made early about whether the solution will come in various form factors, from appliances to quarter, half, and full racks, providing different levels of capacity, performance, redundancy/HA, and SLAs. The architecture should be designed from the ground up to expand to mega rack scale in the future, with distributed infrastructure resources, without impacting the customer experience or usage.

The design should contain more than one physical rack, with each rack unit composed of:

· Compute servers with direct-attached (software-defined) storage
· Top-of-rack and management switches
· Data plane, control plane, and management plane software
· Platform-level operations, management, and monitoring software
· Application-centric workload services

Most companies have a solution based on a number of existing technologies, architectures, products, and processes that have been part of their legacy application hosting and IT operations environments. These environments can usually be repurposed for some of the scalable cloud components, which saves time and cost, and the result is a stable environment that operations can still manage and operate with existing processes and solutions.

In order to evolve the platform to provide not only stability and supportability but also additional features such as elasticity and improved time to market, companies should immediately initiate a project to investigate and redesign the underlying platform.

In scope for this effort are assessments of the network physical and logical architecture, the server physical and logical architecture, the storage physical and logical architecture, the server virtualization technology, and the platform-as-a-service technology.

The approach to this effort will include building a mini proof of concept based on a hypothesized preferred architecture and benchmarking it against alternative designs. This proof of concept should then be able to scale to a production-sized system.

Implement a scalable, elastic IaaS/PaaS leveraging self-service automation and orchestration that gives end users the ability to self-provision applications within the cloud itself.
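
To make that concrete, here is a minimal sketch of what a self-service provisioning call might look like against a hypothetical cloud control system REST API; the endpoint, payload fields, and response shape are all assumptions for illustration, not any specific product’s interface:

    import requests  # third-party HTTP client (pip install requests)

    # Hypothetical control-plane endpoint; a real product's API will differ.
    API = "https://cloud-control.example.com/api/v1"

    def provision_application(token, blueprint, tiers, zone):
        """Ask the control plane to instantiate a blueprint in an availability zone."""
        resp = requests.post(
            f"{API}/deployments",
            json={"blueprint": blueprint, "tiers": tiers, "availability_zone": zone},
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["deployment_id"]  # assumed response field

    # An end user provisions a whole multi-tier application in one call,
    # with no ticket to the infrastructure team:
    # provision_application(token, "three-tier-web", {"web": 4, "app": 2, "db": 2}, "az1")

The point is the shape of the interaction: the user expresses what they want, and automation and orchestration do the rest.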

Suggested phases of the project would be as follows:

  • Phase I: Implementation of POC platforms
  • Phase II: Implementation of logical resources
  • Phase III: Validation of physical and logical resources
  • Phase IV: Implementation of platform-as-a-service components
  • Phase V: Validation of platform-as-a-service components
  • Phase VI: Platform-as-a-service testing begins
  • Phase VII: Review, document, and complete knowledge transfer
  • Phase VIII: Presentation of findings to executive management

Typically there are four fundamental components to cloud design: infrastructure, platform, applications, and business process.

The infrastructure and platform as a service components are typically the ideal starting place to drive new revenue opportunities, whether by reselling or enabling greater agility within the business.

With industries embracing cloud design at a record pace and technology corporations focusing on automation, the benefits of moving toward a cloud data infrastructure design are within reach.

A cloud data infrastructure provides services, servers, storage, and networking on demand, at any time, with minimal limits, helping to create new opportunities and drive new revenue.

The “elastic” pay-as-you-go data center infrastructure should provide a managed services platform that lets application owner groups operate individually while sharing a stable common platform.

Having a common platform and infrastructure model will allow applications to mature while minimizing code changes and revisions due to hardware, drivers, software dependencies and infrastructure lifecycle changes.

This will provide a stable scalable solution that can be deployed at any location regardless of geography.

Today’s data centers are migrating away from the client-server distributed model of the past towards the more virtualized model of the future.

Several pressures are driving that migration:

· Storage: As business applications grow in complexity, the need for larger, more reliable storage becomes a data center imperative.
· Disaster recovery / business continuity: Data centers must maintain business processes for the overall business to remain competitive.
· Cooling: Dense server racks make it very difficult to keep data centers cool and keep costs down.
· Cabling: Many of today’s data centers have evolved into a complex mass of interconnected cables that further increases rack density and further reduces data center ventilation.

These virtualization strategies introduce their own unique set of problems, such as security vulnerabilities, limited management capabilities, and many of the same proprietary limitations encountered with the previous generation of data center components.

When taken together, these limitations serve as barriers against the promise of application agility that the virtualized data center was intended to provide.

The fundamental building block of an elastic infrastructure is the workload. A workload should be thought of as the amount of work that a single server or ‘application gear/container/instance’ can provide given the amount of resources allocated to it.

Those resources encompass compute (CPU and RAM), data (disk latency and throughput), and networking (latency and throughput). A workload is an application, part of an application, or a group of applications that work together. There are two general types of workload that most customers need to address: those running within a Platform-as-a-Service construct and those running on a hypervisor construct. In rare circumstances, bare metal should also be considered where applicable.
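
As a rough illustration (my own sketch, not a formal model), a workload and its resource allocation could be described like this:

    from dataclasses import dataclass

    @dataclass
    class Resources:
        vcpus: int               # compute: CPU
        ram_gb: int              # compute: RAM
        disk_iops: int           # data: throughput
        disk_latency_ms: float   # data: latency
        net_gbps: float          # networking: throughput

    @dataclass
    class Workload:
        """An application, part of an application, or a group of applications
        that work together, sized by the resources allocated to it."""
        name: str
        kind: str                # "paas", "hypervisor", or (rarely) "bare-metal"
        resources: Resources

    web_tier = Workload("web-frontend", "paas", Resources(2, 4, 500, 5.0, 1.0))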

Much like database sharding, the design should impose fundamental sizing limits that allow a subset of resources to be configured at a maximum size, hosting multiple copies of virtual machines and application groups, distributed and load balanced across a cluster of hypervisors that share a common persistent storage back end.

This is similar to load balancing but not identical, as a given customer or application will only ever be placed in particular ‘Cradles’. A distribution system will be developed to determine where tenants are placed upon login and direct them to the Cradle they were assigned.
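
A minimal sketch of such a distribution system, assuming a simple deterministic hash of the tenant ID (a production version would also weigh Cradle capacity and handle rebalancing):

    import hashlib

    CRADLES = ["cradle-01", "cradle-02", "cradle-03", "cradle-04"]

    def assign_cradle(tenant_id: str) -> str:
        """Deterministically map a tenant to a Cradle, much like a shard key:
        the same tenant lands on the same Cradle on every login."""
        digest = hashlib.sha256(tenant_id.encode()).hexdigest()
        return CRADLES[int(digest, 16) % len(CRADLES)]

    print(assign_cradle("acme-corp"))  # always the same Cradle for this tenant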

In order to aggregate as many workloads as possible in each availability zone or region, a specific reference architecture design should be made to determine the ratio of virtual servers per physical server.

The size will be driven by a variety of factors including oversubscription models, technology partners, and network limitations. The initial offering will result in a prototype that helps determine scalability and capacity, and the design should scale in a linear, predictable fashion.
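
As a back-of-the-envelope example of how an oversubscription model drives that ratio (the host shape and ratios below are invented placeholders, not recommendations):

    # Assumed host and VM shapes; real numbers come from the reference architecture.
    host_cores, host_ram_gb = 48, 512
    vm_vcpus, vm_ram_gb = 2, 8
    cpu_oversub = 4.0   # 4 vCPUs per physical core
    ram_oversub = 1.2   # modest RAM oversubscription

    vms_by_cpu = int(host_cores * cpu_oversub / vm_vcpus)    # 96
    vms_by_ram = int(host_ram_gb * ram_oversub / vm_ram_gb)  # 76
    vms_per_host = min(vms_by_cpu, vms_by_ram)               # RAM is the binding constraint

    print(vms_per_host)  # 76; N hosts then yield roughly N * 76 VMs, linearly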

The cloud control system and its associated implementations will be composed of regions and availability zones, similar in many ways to what Amazon AWS does today.

The availability zone model isolates one fault domain from another. Each availability zone has isolation and redundancy in management, hardware, network, power, and facilities. If power is lost in a given availability zone, tenants in another availability zone are not impacted. Each availability zone resides in a single data center facility and is relatively independent. Availability zones are then aggregated into regions, and regions into the global resource pool.
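
A sketch of that hierarchy (the names are illustrative only):

    # Global pool -> regions -> availability zones; each AZ is its own fault domain.
    GLOBAL_POOL = {
        "us-east": {                                            # region
            "us-east-az1": {"datacenter": "dc-1", "healthy": True},
            "us-east-az2": {"datacenter": "dc-2", "healthy": True},
        },
        "eu-west": {
            "eu-west-az1": {"datacenter": "dc-3", "healthy": True},
        },
    }

    def healthy_zones(region: str):
        """Placement only considers healthy fault domains; a power loss in one AZ
        removes it from this list without touching its siblings."""
        return [az for az, info in GLOBAL_POOL[region].items() if info["healthy"]]

    print(healthy_zones("us-east"))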

The basic components would be as follows:

· Hypervisor and container management control plane
· Cloud orchestration
· Cloud blueprints/templates
· Automation
· Operating system and application provisioning
· Continuous application delivery
· Utilization monitoring, capacity planning, and reporting

Hardware considerations should be as follows:

· Compute scalability
· Compute performance
· Storage scalability
· Storage performance
· Network scalability
· Network performance
· Network architecture limitations
· Oversubscription rates & capacity planning
· Solid-state flash leveraged to increase performance and decrease deployment times

Business concerns would be:

· Cost-basis requirements
· Margins
· Calculating cost vs. profit to show ROI (chargeback/showback; see the sketch after this list)
· Licensing costs
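
To illustrate the chargeback/showback math referenced above, a minimal cost-versus-profit calculation (every figure is invented for illustration):

    # Invented monthly figures for a single tenant.
    infra_cost = 12_000.00      # hardware amortization + power + facilities
    license_cost = 3_000.00     # per-socket and per-VM licensing
    ops_cost = 5_000.00         # staff time allocated to this tenant

    tenant_revenue = 28_000.00  # what the tenant is billed (chargeback)
    total_cost = infra_cost + license_cost + ops_cost

    margin = tenant_revenue - total_cost
    roi = margin / total_cost
    print(f"margin ${margin:,.2f}, ROI {roi:.1%}")  # margin $8,000.00, ROI 40.0%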

The extensibility of the solution dictates the ability to use third-party tools for authentication, monitoring, and legacy applications. The best cloud control system should allow legacy systems and software to be integrated with relative ease. It’s my personal preference to lead with open source software, but that decision is left to the user.

Monitoring, capacity planning, and resource optimization should consider the following:

· Reactive – Break-fix monitoring where systems and nodes are monitored for availability and service is manually restored
· Proactive – Collect metrics data to maintain availability, performance, and meet SLA requirements
· Forecasting – Use proactive metric data to perform capacity planning and optimize capital usage (see the sketch after this list)
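
As a sketch of the forecasting tier, a simple linear projection over collected utilization metrics (real capacity planning would use richer models and confidence intervals):

    # Monthly average cluster CPU utilization (%), from the proactive monitoring tier.
    history = [52, 55, 59, 61, 66, 70]

    # Average month-over-month growth as a crude linear trend.
    growth = (history[-1] - history[0]) / (len(history) - 1)  # 3.6 points/month

    threshold = 85.0  # utilization level at which new capacity must be online
    months_left = (threshold - history[-1]) / growth
    print(f"~{months_left:.1f} months until {threshold}% utilization")  # lead time to buy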

Because cloud computing is a fundamental paradigm shift in how information technology services are delivered, it will cause significant disruption inside most organizations. Helping each of these organizations embrace the change will be key.

While the final impacts are impossible to measure today, it’s clear that a self-service model is the future and integral to delivering customer satisfaction, from both an internal and an external user perspective.

Some proof of concept initiatives would be as follows:

· Determine a go-forward architecture for the IaaS and PaaS offering, inclusive of a software-defined network
· Benchmark competing architecture options against one another from a price, performance, and manageability perspective
· Establish a “mini-cradle” that can be maintained and used for future infrastructure design initiatives and tests
· Determine how application deployment can be fully or partially automated
· Determine a cloud control system to facilitate provisioning of Operating Systems and multi-tiered applications
· Complete the delivery of FAC to generate metrics and provide statistics
· Show the value of self-service to internal organizations
· Measure the ROI based on cost of the cloud service delivery combined with the business value
· Don’t build complexity into the initial offering
· Avoid large capital expenditures on the initial design

After implementing a proof of concept, testing encompassing the following (and more) should be done:

Proof of Functionality

  • The solution system runs in our data center, on our hardware
  • The solution system can be implemented with multi-network configuration
  • The solution system can be implemented with as few manual steps as possible (automated installation)
  • The solution system has the ability to drive implementation via API
  • The solution system provides a single point of management for all components
  • The solution system enables dynamic application mobility by decoupling the definition of an application from the underlying hardware and software
  • The solution system can support FAC production operating systems
  • The solution system Hypervisor and guest OS are installed and fully functional
  • The solution system supports internal and external authentication against existing authentication infrastructure
  • The solution system functions as designed and tested

Proof of Resiliency

  • The solution system components are designed for high availability
  • The solution system provides multi-zone (inter-DC, inter-region, etc.) management
  • The solution system provides multi-data-center management

Integration Testing

  • The solution system is compatible with legacy, current, and future systems integration

Complexity Testing

  • The solution system has the ability to manage both simple and complex configurations

Metric Creation

  • The solution system exposes metrics that can be monitored

Upgrading HP Virtual Connect (VC) 4.10 and Onboard Administrator (OA) 4.01

HP has released new versions of their chassis firmware which includes release 4.10 of the Virtual Connect (VC) firmware and release 4.01 of the Onboard Administrator (OA).

The OA upgrade is fairly trivial and can be done by navigating to your HP OA web interface and telling it to update itself (see the screenshot).

The VC upgrade is a little more complex, and needs to have the Virtual Connect Support Utility (VCSU) installed before you can upgrade. This utility comes in Windows and Linux versions.

Windows VCSU utility:
http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?swItem=MTX-6cfcb4e34df146ee80f0a45b29&ac.admitted=1381352414135.876444892.199480143

Linux VCSU utility:
http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?swItem=MTX-45c7919ed0514bc69b37d7d22a&ac.admitted=1381334044440.876444892.199480143

VC firmware:
http://h20565.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdHome/?sp4ts.oid=3794431&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3DswEnvOID%253D4141%257CswLang%253D%257Caction%253DlistDriver&javax.portlet.begCacheTok=com.vignette.cachetoken

OA firmware:
http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=3664326&spf_p.tpst=swdMain&spf_p.prp_swdMain=wsrp-navigationalState%3Didx%253D%257CswItem%253DMTX_5907a9e1eb4a4353a79e053cb7%257CswEnvOID%253D54%257CitemLocale%253D%257CswLang%253D%257Cmode%253D%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken

VC release notes, in case you want to read the documentation:
http://h20566.www2.hp.com/portal/site/hpsc/template.BINARYPORTLET/public/kb/docDisplay/resource.process/?spf_p.tpst=kbDocDisplay_ws_BI&spf_p.rid_kbDocDisplay=docDisplayResURL&javax.portlet.begCacheTok=com.vignette.cachetoken&spf_p.rst_kbDocDisplay=wsrp-resourceState%3DdocId%253Demr_na-c03923055-2%257CdocLocale%253Den_US

Note: You can get yourself into a bind if you attempt to upgrade the VC modules from a guest virtual machine inside the blade chassis.

Since you are updating the networking components, it might bring network connectivity to the guest VM down long enough to hang the upgrade process.

It’s recommended to install the VCSU utility on a machine outside the chassis you are upgrading, but with fast network connectivity to that chassis.
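
Once VCSU is installed, the update is driven from the command line. If memory serves, the invocation looks something like the following; double-check the exact syntax against the release notes above for your VCSU version, and treat the IP address, credentials, and firmware file name here as placeholders:

    vcsu -a healthcheck -i <OA-IP-address> -u Administrator -p <password>
    vcsu -a update -i <OA-IP-address> -u Administrator -p <password> -l vcfw-410.bin

The healthcheck pass confirms the VC domain is in a good state before the update action pushes the firmware package to the modules.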