Category: EMC

Vendor and Cloud lock-in: Good? Bad? Indifferent?

Vendor lock-in, also known as proprietary lock-in or customer lock-in, is when a customer becomes dependent on a vendor for products and services. Thus, the customer is unable to use another vendor without substantial switching costs.

The evolving complexity of data center architectures makes migrating from one product to another difficult and painful regardless of the level of “lock-in.” As with applications, the more integrated an infrastructure solution, architecture and business processes, the less likely it is to be replaced.

The expression “If it isn’t broke, don’t fix it” is commonplace in IT.

I have always touted the anti-vendor lock-in motto. Everything should be Open Source, and the End User should have the ability to participate, contribute, consume, and modify solutions to fit their specific needs. However, is this always the right solution?

Some companies are more limited when it comes to resources. Others are incredibly large and complex, making the adoption of Open Source (without support) complicated. Perhaps a customer requires a stable and validated platform to satisfy legal or compliance requirements. If the Vendor they select has a roadmap that matches the company's, there might be synergy between the two, and thus Vendor lock-in might be avoided. However, what happens when a Company or Vendor suddenly changes their roadmap?

Most organizations cannot move rapidly between architectures and platform investments (CAPEX) which typically only occur every 3-5 years. If the roadmap deviates there could be problems.

For instance, again let's assume the customer needs a stable and validated platform to satisfy legal, government, or compliance requirements. Would Open Source be a good fit for them, or are they better off using a Closed Source solution? Do they have the necessary staff to support a truly Open Source solution internally without relying on a Vendor? Would it make sense for them to do this when CAPEX vs. OPEX is compared?

The recent trend is for Vendors to develop Open Source solutions, using this as a means to market their Company as "Open," which has become a buzzword. Terms like Distributed, Cloud, Scale Out, and Pets vs. Cattle have also become commonplace in the IT industry.

If a Company or individual makes something Open Source but there is no community adoption or involvement, is it really an Open Source project? In my opinion, just posting source code to GitHub doesn't truly translate into a community project. There must be adoption and contribution to add features, fix bugs, and evolve the solution.

In my experience, the Open Source model works for some and not for others. It all depends on what you are building, who the End User is, regulatory compliance requirements and setting expectations in what you are hoping to achieve. Without setting expectations, milestones and goals it is difficult to guarantee success.

Then comes the other major discussion surrounding Public Cloud and how some also consider it to be the next evolution of Vendor lock-in.

For example, if I deploy my infrastructure in Amazon and then choose to move to Google, Microsoft or Rackspace, is the incompatibility between different Public Clouds then considered lock-in? What about Hybrid Cloud? Where does that fit into this mix?

While there have been some standards put in place, such as the OVF format, the fact is that getting locked into a Public Cloud provider can be just as bad or even worse than being locked into an on-premises or Hybrid Cloud architecture; it all depends on how the implementation is designed. Moving forward, as Public Cloud grows in adoption, I think we will see more companies distribute their applications across multiple Public Cloud endpoints and use common software to manage the various environments, thus providing a "single pane of glass" view into their infrastructure. Solutions like CloudForms are trying to help solve these current and frustrating limitations.

Recently, I spoke with someone who mentioned their Company selected OpenStack to prevent Vendor lock-in as it’s truly an Open Source solution. While this is somewhat true, the reality is moving from one OpenStack distribution to another is far from simple. While the API level components and architecture are mostly the same across different distributions the underlying infrastructure can be substantially different. Is that not a type of Vendor lock-in? I believe the term could qualify as “Open Source solution lock-in.”

The next time someone mentions lock-in, ask them what they truly mean and what they are honestly afraid of. Is it that they want to participate in the evolution of a solution or product, or that they are terrified to admit they have been locked in to a single Vendor for the foreseeable future?


The future is definitely headed towards Open Source solutions, and I think companies such as Red Hat and others will guide the way, providing support and validating these Open Source solutions to make them effortless to implement, maintain, and scale.

All one needs to do is look at the largest Software Company in the world, Microsoft, and see how aggressively it is adopting Open Source and Linux. This is a far cry from the Microsoft v1.0 that solely invested in its own Operating System and neglected others such as Linux and Unix.

So, what do you think? Is Vendor lock-in, whether software related, hardware related, Private or Public Cloud, truly a bad thing for companies and End Users, or is it a case-by-case decision?

The day the systems administrator was eliminated from the Earth… fact or fiction?

As software becomes more complex and demands the scalability of the cloud, IT's mechanic of today, the systems administrator, will disappear. Tomorrow's systems administrator will be entirely unlike anything we have today.

For as long as there have been computer systems, there has been a group of individuals managing and monitoring them: system administrators. These individuals were the glue of data centers, responsible for provisioning and managing systems, from the monolithic platforms of old to today's mixed-bag approach of hardware, storage, operating systems, middleware, and software.

The typical System Administrator usually possessed superhuman diagnostic and repair skills to keep a complex mix of disparate systems humming along happily. The best system administrators have always been the "Full Stack" individuals armed with all the skills needed to keep systems up and running, but these individuals were few and far between.

Data centers have become more complex over the past decade as systems have been broken down, deconstructed into functional components, and segregated into groupings. Storage has been migrated to centralized blocks like SAN and NAS, inevitably forcing personnel to become specialized in specific tasks and skills.

Over the years, this same trend has happened with Systems Infrastructure Engineers/Administrators, Network Engineers/Administrators and Application Engineers/Administrators.

Everywhere you look, intelligence is being built directly into products. I was browsing the aisles at Lowe's this past weekend and noted that clothes washers, dryers, and refrigerators are now shipping equipped with WiFi and NFC to assist with troubleshooting problems, collecting error logs, and opening service tickets. No longer do we need to pore over those thousand-page manuals looking for error code EC2F to tell us that the water filter has failed; the software can do it for us! Thus it has become immediately apparent that if tech such as this has made its way into basic consumer items, things must be changing even more rapidly at the top.

I obviously work in the tech industry and would like to think of myself as a technologist and someone who is very intrigued by emerging technologies. Electric cars, drones, remotely operated vehicles, smartphones, laptops that can last 12+ hours daily while fitting in your jeans pocket and the amazing ability to order items from around the globe and have them shipped to your door. These things astound me.

The modern car was invented in 1886, and in 1903 we invented the airplane. The first commercial air flight was not until 1914, but to see how far we have come in such a short time is astounding. It almost makes you think we were asleep for the century prior.

As technology has evolved, there has been a need for software to also evolve at a similarly rapid pace. In many ways, hardware engineering outpaced software over the last score of years, and now software is catching up to and surpassing hardware engineering.

Calm down, I know I am rambling again. I will digress and get to the point.

The fact is, the Systems Administrator as we know it is a dying breed. Like the dinosaur, the caveman and the wooly mammoth. All of these were great at some things but never enough to stay alive and thus were wiped out.

So what happens next? Do we all lose our jobs? Does the stock market go into free fall while we all start drinking Brawndo, the Thirst Mutilator? (If you haven't seen Idiocracy, I feel for you.) The fact is, it's going to be a long, slow, and painful death.

Companies are going to embrace cloud at a rapid rate and as this happens people will either adapt or cling to their current ways. Not every company is going to be “cloudy”.

Stop. Let me state something. I absolutely HATE the word Cloud. It sounds so stupid. Cloud. Cloud. Cloud. Just say it. How about we all instead embrace the term shared-nothing scalable distributed computing? That sounds better.

So, is this the end of the world? No, but it does mean “The Times They Are a Changin” to quote Mr. Dylan.

The fact is, change is inevitable. If things didn't change we would still be living in huts, hunting with our bare hands, and using horses as our primary method of transportation. We wouldn't have indoor toilets, governments, rules, regulations, or protection from others, as there would be no legal system.

Sometimes change is good and sometimes it's bad. In this case, I see many good things coming down the road, but I think we all need to see the signs posted along the highway.

Burying one's head in the sand like an ostrich is not going to protect you.

How to build a large scale multi-tenant cloud solution

It's not terribly difficult to design and build a turnkey, integrated, pre-configured, ready-to-use SDDC solution. However, building one that completely abstracts the physical compute, storage, and network resources, provides multiple tenants a pool of logical resources along with all the necessary management, operational, and application-level services, and allows resources to scale through the seamless addition of new rack units is far more challenging.

The architecture should be a vendor-agnostic solution with limited software tie-in to hardware vendor specifics, but expandable to support various vendors' hardware with a plug-n-play architecture.

Decisions should be made early on whether the solution will come in various form factors, from appliances to quarter, half, and full racks, providing different levels of capacity, performance, redundancy/HA, and SLAs. The architecture should be built from the ground up to expand to mega rack-scale architecture in the future with distributed infrastructure resources, without impacting the customer experience and usage.

The design should contain more than one physical rack, with each rack unit comprising:

· Compute servers with direct-attached storage (software defined)
· Top-of-rack and management switches (hardware)
· Data plane, control plane, and management plane software
· Platform-level operations, management, and monitoring software
· Application-centric workload services

Most companies have a solution based on a number of existing technologies, architectures, products, and processes that have been part of the legacy application hosting and IT operations environments. These environments can usually be repurposed for some of the scalable cloud components, which saves time and cost, and the result is a stable environment that operations can still manage and operate with existing processes and solutions.

In order to evolve the platform to provide not only stability and supportability but also additional features such as elasticity and improved time to market, companies should immediately initiate a project to investigate and redesign the underlying platform.

In scope for this effort are assessments of the network physical and logical architecture, the server physical and logical architecture, the storage physical and logical architecture, the server virtualization technology, and the platform-as-a-service technology.

The approach to this effort will include building a mini proof of concept based on a hypothesized preferred architecture and benchmarking it against alternative designs. This proof of concept then should be able to scale to a production sized system.

Implement a scalable, elastic IaaS/PaaS leveraging self-service automation and orchestration that enables end users to self-provision applications within the cloud itself.

Suggested phases of the project would be as follows:

  • Phase I – Implementation of POC platforms
  • Phase II – Implementation of logical resources
  • Phase III – Validation of physical and logical resources
  • Phase IV – Implementation of platform-as-a-service components
  • Phase V – Validation of platform-as-a-service components
  • Phase VI – Platform-as-a-service testing begins
  • Phase VII – Review, document, and complete knowledge transfer
  • Phase VIII – Present findings to executive management

Typically there are four fundamental components to cloud design: infrastructure, platform, applications, and business process.

The infrastructure and platform as a service components are typically the ideal starting place to drive new revenue opportunities, whether by reselling or enabling greater agility within the business.

With industries embracing cloud design at a record pace and technology corporations focusing on automation, moving towards a cloud data infrastructure design becomes both practical and beneficial.

Cloud data infrastructure provides services, servers, storage, and networking on demand at any time with minimal limits, helping to create new opportunities and drive new revenue.

The “Elastic” pay-as-you-go data center infrastructure should provide a managed services platform allowing application owner groups the ability to operate individually while sharing a stable common platform.

Having a common platform and infrastructure model will allow applications to mature while minimizing code changes and revisions due to hardware, drivers, software dependencies and infrastructure lifecycle changes.

This will provide a stable scalable solution that can be deployed at any location regardless of geography.

Today’s data centers are migrating away from the client-server distributed model of the past towards the more virtualized model of the future.

Several pressures are driving this shift:

· Storage: As business applications grow in complexity, the need for larger, more reliable storage becomes a data center imperative.
· Disaster Recovery / Business Continuity: Data centers must maintain business processes for the overall business to remain competitive.
· Cooling: Dense server racks make it very difficult to keep data centers cool and keep costs down.
· Cabling: Many of today's data centers have evolved into a complex mass of interconnected cables that further increase rack density and further reduce data center ventilation.

These virtualization strategies introduce their own unique set of problems, such as security vulnerabilities, limited management capabilities, and many of the same proprietary limitations encountered with the previous generation of data center components.

When taken together, these limitations serve as barriers against the promise of application agility that the virtualized data center was intended to provide.

The fundamental building block of an elastic infrastructure is the workload. Workloads should be thought of as the amount of work that a single server or 'application gear/container/instance' can provide given the amount of resources allocated to it.

Those resources encompass compute (CPU & RAM), data (disk latency & throughput), and networking (latency & throughput). A workload is an application, part of an application, or a group of applications that work together. There are two general types of workload that most customers need to address: those running within a Platform-as-a-Service construct and those running on a hypervisor construct. Bare metal should also be considered where applicable, but only in rare circumstances.

Much like database sharding, the design should be bounded by fundamental sizing limits, allowing a subset of resources to be configured at a maximum size and to host multiple copies of virtual machines and application groups, distributed and load balanced across a cluster of hypervisors that share a common persistent storage back end.

This is similar to load balancing but not exactly the same, as a customer or specific application will only be placed in particular 'Cradles'. A distribution system will be developed to determine where tenants are placed upon login and direct them to the Cradle they were assigned.

In order to aggregate as many workloads as possible in each availability zone or region, a specific reference architecture design should be created to determine the ratio of virtual servers per physical server.

The size will be driven by a variety of factors including oversubscription models, technology partners, and network limitations. The initial offering will result in a prototype and help determine scalability and capacity, and this design should scale in a linear, predictable fashion.
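As a purely hypothetical sizing sketch (none of these numbers come from a real reference architecture): a host with 2 sockets x 16 cores = 32 physical cores at a 4:1 vCPU oversubscription ratio exposes 128 schedulable vCPUs; at an average of 4 vCPUs per virtual server, that works out to roughly 32 virtual servers per physical server, before memory, storage, and network limits shrink the number further.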

The cloud control system and its associated implementations will be composed of Regions and Availability Zones, similar in many ways to what Amazon AWS does currently.

The availability zone model isolates one fault domain from another. Each availability zone has isolation and redundancy in management, hardware, network, power, and facilities. If power is lost in a given availability zone, tenants in another availability zone are not impacted. Each availability zone resides in a single datacenter facility and is relatively independent. Availability zones are then aggregated into regions, and regions into the global resource pool.

The basic components would be as follows:

· Hypervisor and container management control plane
· Cloud orchestration
· Cloud blueprints/templates
· Automation
· Operating system and application provisioning
· Continuous application delivery
· Utilization monitoring, capacity planning, and reporting

Hardware considerations should be as follows:

· Compute scalability
· Compute performance
· Storage scalability
· Storage performance
· Network scalability
· Network performance
· Network architecture limitations
· Oversubscription rates & capacity planning
· Solid-state flash leveraged to increase performance and decrease deployment times

Business concerns would be:

· Cost-basis requirements
· Margins
· Calculating cost vs. profit to show ROI (chargeback/showback)
· Licensing costs

The extensibility of the solution dictates the ability to use third-party tools for authentication, monitoring, and legacy applications. The best cloud control system should allow legacy systems and software to be integrated with relative ease. It's my personal preference to lead with Open Source software, but that decision is left to the user.

Monitoring, capacity planning, and resource optimization should consider the following:

· Reactive – Break-Fix monitoring where systems and nodes are monitored for availability and service is manually restored
· Proactive – Collect metrics data to maintain availability, performance, and meet SLA requirements
· Forecasting – Use proactive metric data to perform capacity planning and optimize capital usage

Because cloud computing is a fundamental paradigm shift in how Information Technology services are delivered, it will cause significant disruption inside most organizations. Helping each of these organizations embrace the change will be key.

While the final impacts are currently impossible to measure, it's clear that a self-service model is the future and is integral to delivering customer satisfaction, from both an internal and an external user perspective.

Some proof of concept initiatives would be as follows:

· Determine a go-forward architecture for the IaaS and PaaS offering inclusive of a software defined network
· Benchmark competing architecture options against one another from a price, performance, and manageability perspective
· Establish a “mini-cradle” that can be maintained and used for future infrastructure design initiatives and tests
· Determine how application deployment can be fully or partially automated
· Determine a cloud control system to facilitate provisioning of Operating Systems and multi-tiered applications
· Complete the delivery of FAC to generate metrics and provide statistics
· Show the value of self-service to internal organizations
· Measure the ROI based on cost of the cloud service delivery combined with the business value
· Don't build complexity into the initial offering
· Avoid spending large amounts of capital expenses on the initial design

After implementing a proof of concept, testing encompassing the following (and more) should be done:

Proof of Functionality

  • The solution system runs in our datacenter; on our hardware
  • The solution system can be implemented with multi-network configuration
  • The solution system can be implemented with as few manual steps as possible (automated installation)
  • The solution system has the ability to drive implementation via API
  • The solution system provides a single point of management for all components
  • The solution system enables dynamic application mobility by decoupling the definition of an application from the underlying hardware and software
  • The solution system can support FAC production operating systems
  • The solution system Hypervisor and guest OS are installed and fully functional
  • The solution system supports internal and external authentication against existing authentication infrastructure.
  • The solution system functions as designed and tested

Proof of Resiliency

  • The solution system components are designed for high availability
  • The solution system provides multi-zone (inter-DC,inter-region, etc.) management
  • The solution system provides multi Data Center management

Integration Testing

  • The solution system is compatible with legacy, current, and future systems integration

Complexity Testing

  • The solution system has the ability to manage both simple and complex configurations

Metric Creation

  • The solution system has metrics that can be monitored

Troubleshooting vSphere Auto Deploy with the vCenter Appliance

I was pulled into an issue the other day where some ESXi Hosts were failing to boot via Auto Deploy. Now, this shouldn't come as a shock to anyone, but I absolutely HATE the Auto Deploy product. I feel it's flaky, doesn't work properly, and can be cumbersome to manage if one is not comfortable using the command line. I personally am not a GUI person and feel right at home on the command line, but I have seen enough random issues with Auto Deploy over the years that I rarely recommend it for large-scale deployments unless the environment has a requirement to be stateless and boot from SAN is not an option.

So, let's dig in and discuss Auto Deploy, how it works, find the issue, and solve the problem! Below we have a diagram that shows the typical chain of events when a server boots and is directed to use Auto Deploy via PXE.

So, obviously the first thing we need to check is whether there is an error on the console output. This environment is using Cisco UCS, which is a stateless computing platform. UCS allows hardware to migrate between chassis or domains while the logical design of the server follows. The logical design contains everything that would make up a normal whitebox server, but the hardware details are abstracted. This allows administrators to provision a nearly infinite number of logical server designs. If you want to know more about UCS you can find details here:

https://www.youtube.com/user/bradhedlund/videos

So, when we open the KVM console of the UCS blade we see the following:

[Screenshot: KVM console output showing the boot error]

Right off the bat we know that DHCP seems to be working and we can see what device is handing out the IP addresses. Looking into the DHCP device I was able to determine that it was a Windows Server OS responsible for the DHCP configurations:

[Screenshot: the Windows Server DHCP scope configuration]

A quick look at the DHCP configuration shows it is correct, and we can see that the DHCP server is pointing the TFTP request to 10.40.80.7, which is our vCenter Appliance. So let's go and check what's going on there! Opening a web browser to the vCenter Appliance admin page, we see that Auto Deploy is running, though looks can be deceiving.

[Screenshot: vCenter Appliance admin page showing the Auto Deploy service as running]

 

The best way to determine whether the Auto Deploy TFTP service is working is to log in and check, so I logged in via SSH and took a look. Immediately after logging in I noticed the TFTP daemon was not running. Further, when I checked chkconfig, it was also not enabled to run at boot. So, I decided to check the TFTPD config file, and that's when I saw the issue.
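For reference, those checks are simple one-liners; a minimal sketch, assuming the SLES-based vCenter Appliance and the atftpd service name referenced below (verify the exact service name on your build):

service atftpd status      # is the TFTP daemon actually running?
chkconfig --list atftpd    # is it set to start at boot?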

[Screenshot: PuTTY session showing the ATFTPD config file]

Someone had modified the config file to run the ATFTPD service as root. This was preventing the service from starting, as it didn't have the proper runtime user. I made a quick change using vi and saved the file.

[Screenshot: PuTTY session showing the corrected ATFTPD config]

I then adjusted the startup level for the service.

[Screenshot: chkconfig output for the atftpd service]
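The equivalent commands, as a sketch (again assuming the atftpd service name):

chkconfig atftpd on     # enable the service at boot
service atftpd start    # start it now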

A quick reboot of the server was then done to make sure things were working.

[Screenshot: UCS KVM console showing the host booting successfully via Auto Deploy]

Then it was time to celebrate!

XtremIO Information and basic setup

XtremIO:  The Out of the Box Experience

EDIT 1: All XtremIO bricks ship with both 8Gb FC and 10Gb iSCSI.  You do not have to specify that when ordering.

EDIT2: EMC has asked me to remove the listing of default passwords.

EDIT3: The Xbricks ship with XtremIO Software on them but it’s still highly recommended to do an install to the latest GA code on installation.

Often when customers buy a new storage array they don’t get to see the entire installation process. Usually, it’s not a big deal as it’s a bit tricky and not that exciting.  It’s mostly tasks that are only used once and then never needed again.  But when installing our XtremIO brick in the Varrow lab I thought it might be fun to walk through the installation of that system.  If you’re like me you probably think installing new gear is fun and I thought it might be interesting to see.

Physical Installation

The physical install is very easy.  Below is the layout for a single XBrick.

As you can see, from the bottom it goes Controller 1, disk enclosure, Controller 2, and then two battery backup units. When you install this you need to confirm your cabling. It's well documented and simple…just confirm it. I won't go through that in detail since the documentation lays it out as simply as wiring up a Blu-ray player. The Infiniband ports go to the other controller in a single-brick install or to the IB switch in a multi-brick install. The SAS connections go to the DAE. Each brick has both 8Gb FC and 10Gb iSCSI ports. Power from each controller goes to the battery backup units.

Ports on the Controllers

It’s also highly recommended you lay out the physical gear as in the diagram above.  You do name things based on where they are, such as Controller 1 being on the bottom.  Obviously Controller 1 doesn’t know if it’s on the bottom but it will help you remember which one is which later.

Also note that there is an option for a physical XMS, XtremIO Management Server.  This would be an additional 1U server that runs the management UI for all of the XBricks in your cluster.  But most people will just use the virtual XMS.  To me it’s better to use the virtual since you can just run it as a VM and protect it like you would any other VM with tools such as VMware HA.  If the XMS is down for any reason the cluster will continue to service hosts, you just can’t make any changes until it is back up.  If you completely lose the XMS and need to reinstall it’s not a big deal as you just point it to your existing cluster and it will read in all configuration information.

Here is a two-XBrick configuration.

Notice that the main difference is the addition of the two Infiniband switches in the middle.  When you have a single XBrick the two controllers are cabled to each other for the backplane, but when you go to multiple XBricks you connect them to the Infiniband switches.

 Initial Configuration

Before you begin you'll need to gather some information, namely IP addresses. You'll need 5 total IPs for one brick:

  • One for each controller’s management
  • One for each controller’s lights-out management port
  • One for the XMS

You’ll also want to get the default logins for the different steps.  While the install documentation for XtremIO tells you which user account to log in as in each step they do not tell you the password.  EMC has this documented in KB 000172817 on support.emc.com.  The title of the article is “XtremIO: Default System / Cluster Access Credentials for XtremIO”.

Let’s get going!  Confirm that your cabling is correct and everything is powered on.

Management and Tech Interfaces

This can be a little odd so follow along.  On the back of each controller is a Tech interface.  Think of it like a console interface but over SSH.  The Tech interface has a hardcoded IP address of 169.254.254.1/20 (255.255.240.0).

Every controller has the same IP on the Tech interface!  Don’t connect them to a switch or you’ll get duplicate IP problems!

The idea is that you will connect your notebook directly to the Tech interface on each controller, one at a time.  These interfaces do not need to be connected once the initial configuration is done.  If you’re doing this in a lab, like me, you can connect both to a switch but only activate one port at a time.

Connect your notebook to the Tech interface on Controller 1, the one on the bottom of the stack. Assign the IP of 169.254.254.2/20 to your NIC. SSH in to 169.254.254.1 and login as xinstall.
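As a rough sketch of what that looks like from a Linux or Mac notebook (the interface name eth0 is just an example; on Windows you would set the address in the adapter settings instead):

sudo ip addr add 169.254.254.2/20 dev eth0    # temporary address on the notebook NIC
ssh xinstall@169.254.254.1                    # the controller's Tech interface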

 

Connecting to the Tech Interface

 

You’ll be presented with the menu shown in the screenshot.  We’ll perform the base configuration here for each controller.  Select Option 2 – Configuration.

Going Through Option 2

 

You’ll be prompted for the following information:

  • Cluster Name – Logical name you are assigning to the cluster.  Pick something that means something and remember it.
  • Brick Number – This is the number for the brick that this controller belongs to and starts at the bottom with number 1.  If you only have one brick just say 1.  This is where physical placement becomes important to helping you remember your configuration items.
  • Controller ID – The ID of the controller you are configuring.  This is either 1 or 2 (two controllers per brick).  Controller 1 is on the bottom of the brick, 2 is on top.  See note above about physical placement.
  • Management IP Information – IP address, subnet mask, and default gateway for the controller’s Management Ethernet interface.
  • KVM IP Address Information – IP address, subnet mask, and default gateway for the controller’s Lights-Out KVM interface.

Confirming Configuration

 

Once the script configures the controller, I suggest you run Option 3 and then Option 4 to confirm the configuration is correct. You'll also see the configuration script assign some IPs to the interfaces ib0 and ib1. These are the back-end Infiniband interfaces and the IPs are automatically assigned for those. Go ahead and Exit.

Displaying the Configuration

 

Now go and do the same thing on the second controller in the brick, being sure to use the same cluster name but a different controller ID and IP information.

At this point your two controllers (or more if you have more bricks) have their basic configuration and should begin talking to each other. The next step is to go ahead and install the XMS.

 Deploying the XMS

As I said earlier, the XtremIO Management Server (XMS) is the management front-end for your XtremIO cluster. You have one of these per cluster, not per brick. You have the option to order a physical XMS but most will prefer the virtual one. The configuration is basically the same. The only difference is that with the virtual XMS we will just do the initial configuration using the VM's console in vCenter. If you have the physical XMS you'll connect to the appliance's Tech port just like we did in the controller config above. Otherwise they are the same…in fact, if you look at the XMS VM you'll see there are two NICs but only one is connected. The disconnected NIC is actually the Tech interface that isn't used by the VM. So you can see that it's the same software image for both.

The Physical XMS Tech interface has the same IP scheme as the controllers:  169.254.254.1/20

The XMS is not in the data path at all. It's just there for visibility and management changes. If it goes down the cluster continues to operate just fine. You can even delete and deploy a new XMS if you want and connect it to your existing cluster. For this reason I like protecting the virtual XMS with resiliency features like VMware HA. By default it uses 2 vCPUs, otherwise I'd recommend VMware FT…but not yet…

To deploy the XMS you first go to support.emc.com and download the latest version.  This will be in OVA format.  You then deploy that OVA just as you would any other.  I’m not going to go through that here as that’s a common task.  The XMS has the following configuration:

  • 4GB of RAM
  • 2 vCPUs
  • 1 NIC (and 1 disconnected)
  • 80GB of HD space (thick provisioning recommended but not required)
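For those who prefer deploying the OVA from the CLI, a minimal ovftool sketch looks something like the line below; the OVA filename, VM name, datastore, network, and vCenter path are all hypothetical placeholders:

ovftool --name=XMS01 --datastore=DS01 --network="Management" xms.ova vi://administrator@vcenter.example.com/Datacenter/host/Cluster01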

Once the XMS is deployed and booted open the VM console and login as xinstall and choose Option 2 for Configuration.  You’ll be prompted for the following information:

  • Cluster Name – This is the same cluster name you used during the controller configuration
  • Management Network Information – IP address, subnet mask, and default gateway for the XMS’ management interface

When finished display the configuration and confirm that it is correct and then Exit.

Installing the XtremIO Software

The XtremIO controllers don't actually ship with the XtremIO software on them. They only ship with a basic Linux OS installed, so you have to deploy the actual XtremIO OS to them. Before you can do that you need to go to support.emc.com and download the latest version. Once you've downloaded the file you need to copy it to the XMS. You can do this using whatever SCP tool you prefer. You just need to copy it over and put it in the /var/lib/xms/images/ directory on the XMS. I normally use the root account for authentication.
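A quick sketch of that copy from a Linux or Mac workstation (the filename matches the example further down, and the XMS IP is a placeholder):

scp upgrade-to-2.2.3-17.tgz root@<xms-mgmt-ip>:/var/lib/xms/images/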

After you copy the XtremIO Software file to the XMS you’ll need to SSH to the XMS and login as xinstall.  From here you want to choose Option 6, Fresh Install.

Performing Fresh Install

You’ll be prompted for some information:

  • Management Node – Give it the management IP address of a controller in the cluster.  Doesn’t matter which one.
  • Expected Number of Bricks – How many bricks should there be?  The install wants to confirm that all controllers and bricks are talking.
  • Installation Image Filename – Give it the name of the file for the XtremIO software.  Just the filename and not the path!  Example, upgrade-to-2.2.3-17.tgz

It may seem odd that filename for the XtremIO Software is called upgrade-to-<version>.tgz but you use the same file for both new installs and upgrades.  You aren’t doing anything wrong.

Once you give it the correct information the script does a number of things. You can see a lot of it in the screenshot above. It will go out and confirm communication to all expected bricks and controllers. If you tell it to expect 2 bricks and it only finds 2 controllers, instead of 4, it will error out. If it does, confirm everything is cabled correctly. A little bit of advice: I've found that I can get through this process so fast that I get to this point before my controllers have a chance to establish communication. Therefore I'll often wait 10 minutes between finishing the controller configuration and this step to give them time.

Assuming everything works fine it’ll install the software to all controllers and reboot them.  When it is done choose Exit.

Creating the Cluster

Almost done!  The last major step is to create your XtremIO cluster…meaning, bringing all of your bricks and controllers together.  To do this SSH in to the XMS and login as xmsadmin.  Oddly enough you’ll next be presented with a Username: prompt.  Login to that as tech.  This is the XMS management shell where you can execute CLI commands.

XMS CLI Interface

 

Once at the CLI enter the command:

create-cluster expected-number-of-bricks=<i> sc-mgr-host="<j>" cluster-name="<k>"

Where:

  • i is the number of bricks that the creation tool should expect to see
  • j is the management IP address of any controller.   It doesn’t matter which one.
  • k is the name of the cluster that you want to use.
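A filled-in example, with a hypothetical controller IP and cluster name:

create-cluster expected-number-of-bricks=1 sc-mgr-host="10.0.0.21" cluster-name="lab-xtremio"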

Cluster Creation Process

 

This process may take some time to complete…10 or 15 minutes.  Just watch it and confirm there are no errors.

Cluster Information

 

Once the process is done I like to run a show-clusters-info command to confirm the configuration.  At this point you are basically done.  You can point a web browser to the IP address of the XMS and login as admin to make sure the XMS is monitoring and managing your cluster.

XMS UI

 

There are a few more minor steps such as configuring call home support but at this point you have a working, functional, super-fast all flash array.

Conclusion

While it might seem like there are a lot of steps here there really aren’t.  You’re just standing up your controllers, management station, and tying it all together.  Now that I’ve done a few of these I’ve found I can rebuild our XtremIO in the lab in under 45 minutes, start to finish.  Now that doesn’t include time for racking the gear, cabling it, configuring the network, or connecting hosts but it shows that the array itself is very easy to stand up.

Soon I’ll be posting more articles and videos on using and managing XtremIO.  It’s amazingly simple.

Convert VMware NAA value to Frame and Device ID for vMax and Symmetrix

The following will convert a VMAX/Symmetrix NAA value into a frame and device ID.

  • Save the below into a .psm1 file
Function Convert-NaaIdToVmaxFrameAndDeviceId ($naa) {
    # Original -> http://vmwise.com/2012/06/01/naa-ids-and-breaking-them-apart/
    if ($naa.Length -ne 36) { "NAA value must be 36 characters"; return }
    $deviceString = $naa.ToCharArray()
    # Characters 21-24 of the full naa.* string are the last four digits of the array (frame) serial number.
    $device = "Frame $($deviceString[20])$($deviceString[21])$($deviceString[22])$($deviceString[23]) Device "
    # The trailing five hex pairs are the ASCII-encoded device ID.
    $device += [char][Convert]::ToInt32("$($deviceString[26])$($deviceString[27])", 16)
    $device += [char][Convert]::ToInt32("$($deviceString[28])$($deviceString[29])", 16)
    $device += [char][Convert]::ToInt32("$($deviceString[30])$($deviceString[31])", 16)
    $device += [char][Convert]::ToInt32("$($deviceString[32])$($deviceString[33])", 16)
    $device += [char][Convert]::ToInt32("$($deviceString[34])$($deviceString[35])", 16)
    return $device
}

  • Import-Module nameoffile.psm1
  • Using PowerCLI run the command below:
  • Convert-NaaIdToVMaxFrameAndDeviceID naa.#############################

Example below:
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314639
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031333033
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323935
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314639
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324136
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323142
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324237
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324237
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324439
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314336
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323344
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324338
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323733
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323632
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314437
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323834
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031324541
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314134
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314235
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314336
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314134
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314235
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314437
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323142
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031323344
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031314639
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192605358533031323036
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030363239
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030373931
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030373931
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030374344
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030383039
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030383435
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030363635
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031303236
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030364131
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031303632
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031303945
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192605358533031323432
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031304441
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192604594533031313136
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030364444
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192605358533031323745
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000192605358533031324241
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030344644
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030353339
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030353735
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030354231
Convert-NaaIdToVMaxFrameAndDeviceID naa.60000970000195701171533030354544
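As a sanity check, the first command above should return something along the lines of "Frame 4594 Device 011F9": characters 21-24 of the NAA carry the last four digits of the array serial number, and the trailing five hex pairs are the ASCII-encoded device ID.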

 

Installing Cisco Fabric Manager on Windows

This should be a pretty straightforward install, but for some odd reason Cisco has made it much more complicated than it needs to be. Here is the straightforward way of getting it installed.
1. Download the latest version from the Cisco site.

Cisco MDS 9000 Family Management Software and Documentation

http://www.cisco.com/cisco/software/type.html?mdfid=282731430&catid=null

2. Download Java JRE 1.6 – the FM (4.2.9) software does not support JRE 7.

http://www.filehippo.com/download_jre_32/13491/

3. Right click on Command Prompt > "Run as administrator"

>  cd "c:\Program Files (x86)\java\jre6\bin"
>  java.exe -Xmx512m -jar "C:\m9000-cd-4.2.9\software\m9000-fm-4.2.9.jar"

Follow the GUI prompts.