High availability and virtualization in Telecom: What are your thoughts?

Posted by Holger Herbert, Director of Product Management, GoAhead Software

Uninterrupted service is a core requirement for carrier-grade applications. The boom in virtualization in the converged space has contributed to a lot of buzz around virtualization in the telecom vertical. Standards are being moved forward by the SCOPE Alliance and the SA Forum, which supports a virtualization-friendly environment through the PLM standard for execution environments. Companies like VirtualLogix have embraced the move toward virtualization and have integrated through these SA Forum middleware standards.

Through all of the buzz, some designs incorporating high availability and virtualization have emerged within the telecom universe. Along with them have come many questions around performance, ROI and real-time usage characteristics, which telecom software architects, software vendors and hardware vendors have had to field. But how far are TEMs going in implementing these designs?

While there are many decisions to be made in implementing virtualized environments, here are some fundamental areas that need to be thought through:

  • Heterogeneous environments: Virtualization certainly touts many benefits in regard to its support of legacy environments and heterogeneous operating environments – all on a single hardware platform. At the same time deployment can affect performance and thus ROI and may affect “real-time requirements” that must be maintained. How does one weigh the benefits against migrating away from that old hardware legacy?
  • Complexity: Does the complexity of implementing virtualization increase risk in your telecom deployment? How do you mitigate these risks?
  • Performance: Do you believe the performance in regard to both transaction and failover would be adequate for your applications? How are you making the determination that your requirements are met?

We’d love to get your thoughts. Like many, we are curious whether virtualization will bring massive innovation for TEMs. Is it another tool in the deployment toolset to solve niche problems, as in converged markets, or are the barriers to entry too high for now? Let us know what you think, and how you use virtualization in telecom.

I and many others are looking forward to your thoughts!

September 5, 2008

The Information Model Management (IMM) Service: Part II

Why is IMM important?

Posted by David Fick, Chief Architect, GoAhead Software

There are a number of reasons why you as a system designer or architect would choose to use IMM, from standardizing your platform to ensuring reliability and extensibility. In this entry, we’ll explore some of the top reasons our customers are moving towards solutions that implement IMM.

  • Configuration of AIS services – If you are considering or already using an HA middleware package based on AIS, the AIS standards require that the resources to be managed by the AIS services be configured through IMM. The Notification (NTF) and Log services, along with the aforementioned AMF, are examples of such AIS services.
  • Standards-based – IMM is an industry standard which means your applications that utilize the IMM APIs won’t be tied to a particular HA middleware implementation.
  • Extensibility – The information model can be extended to include your own custom object classes and instances meaning you don’t need to create or buy a separate mechanism for configuring your applications or representing their operational state. You can also define custom administrative operations that can be initiated on the IMM objects owned by your applications.
  • Single point of access for system configuration and operational status – Given the extensibility of the information model and the fact that the AIS services utilize the information model for their configuration and run-time data, IMM can act as a single point of access for system management applications to read and modify configuration data, access run-time data, and perform administrative operations.
  • Consistent configuration changes – While not described in detail above, IMM supports the concept of configuration change bundles where a set of configuration changes for one or more objects in the information model can be “bundled” together and applied to the information model with all-or-nothing semantics. That is, if all of the changes cannot be applied then none of the changes are applied. This is a very powerful and useful concept that simplifies the implementation of complex configuration tasks within system management applications.
  • Reliability – An important characteristic for any critical service in an HA middleware package is reliability, and IMM is no different. Given sufficient redundant resources on which to run, the most useful IMM implementation will ensure that access to the information model is reliable and available in the face of node and network interface failures. This in turn means that your applications can rely on access to the information model for their configuration data, for storing their operational status, and for processing administrative operations.
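
To make the all-or-nothing semantics of configuration change bundles a little more concrete, here is a minimal Python sketch. It is a toy in-memory model and not the IMM API: the names `ToyInfoModel`, `apply_bundle` and `positive_int` are hypothetical, and a real system would go through the IMM interfaces and the registered object implementers instead.

```python
import copy


class BundleRejected(Exception):
    """Raised when any change in a bundle fails validation."""


class ToyInfoModel:
    """Toy in-memory information model: {object_name: {attribute: value}}."""

    def __init__(self, objects, validator):
        self.objects = objects
        self.validator = validator  # callable(name, attr, value) -> bool

    def apply_bundle(self, changes):
        """Apply (name, attr, value) changes with all-or-nothing semantics."""
        staged = copy.deepcopy(self.objects)  # work on a copy, never the live model
        for name, attr, value in changes:
            if name not in staged or not self.validator(name, attr, value):
                raise BundleRejected(f"rejected change to {name}.{attr}")
            staged[name][attr] = value
        self.objects = staged  # commit only after every change was accepted


def positive_int(name, attr, value):
    # Hypothetical validation rule used only for this example.
    return isinstance(value, int) and value > 0


model = ToyInfoModel(
    {"sg1": {"preferred_sus": 2}, "sg2": {"preferred_sus": 2}},
    positive_int,
)

# Both changes are valid, so both are applied.
model.apply_bundle([("sg1", "preferred_sus", 3), ("sg2", "preferred_sus", 4)])

# The second change is invalid, so the first one is not applied either.
try:
    model.apply_bundle([("sg1", "preferred_sus", 5), ("sg2", "preferred_sus", -1)])
except BundleRejected:
    pass
```

The design choice worth noting is that changes are staged on a copy and committed in a single step, so a management application never observes a half-applied bundle.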

While AMF will always be the linchpin within any AIS-based HA middleware solution, IMM comes in a close second given its importance in the AIS architecture and its other desirable characteristics such as extensibility and reliability.

August 8, 2008

The Information Model Management (IMM) Service: Part I

What is IMM?

Posted by David Fick, Chief Architect, GoAhead Software

Along with the Availability Management Framework (AMF) described in a prior blog entry, one of the key services within the Application Interface Specification (AIS) interfaces defined by the SA Forum is the Information Model Management (IMM) Service. The purpose of IMM is to manage and provide cluster-wide access to a data repository, termed the information model, containing a set of objects (or if you prefer X.731 terminology, managed objects) that represent the configuration and operational data and administrative operations associated with the physical and logical resources in the system. These resources may include the following:

  • Hardware elements such as single board computers in a bladed chassis
  • Software elements such as an application process running on an O/S
  • Logical elements such as an AMF service group which is a logical grouping of associated like resources that can provide the same service

There are two types of objects that are stored in the information model:

  • Configuration – Configuration objects contain configuration attributes which define configuration settings used within the system. This type of object can also include run-time attributes which provide operational status information. The value of the configuration attributes for this type of object is persistent.
  • Run-time – Run-time objects contain only run-time attributes which provide operational status information for the resource associated with the object. Run-time attributes can optionally be marked as persistent but are typically not persistent.
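
The distinction between the two object types can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and attribute names (`Attribute`, `ModelObject`, `rank`, `oper_state`, `ha_state`) are made up for the example and are not taken from the IMM specification.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    value: object
    runtime: bool = False      # False means a configuration attribute
    persistent: bool = False   # only consulted for run-time attributes

    def is_persistent(self):
        # Configuration attributes are always persistent;
        # run-time attributes are persistent only if marked so.
        return True if not self.runtime else self.persistent


@dataclass
class ModelObject:
    name: str
    is_config: bool
    attributes: list = field(default_factory=list)

    def add(self, attr):
        # Run-time objects may hold only run-time attributes.
        if not self.is_config and not attr.runtime:
            raise ValueError("run-time objects cannot hold configuration attributes")
        self.attributes.append(attr)

    def persistent_state(self):
        """Return the attributes that would survive a restart."""
        return {a.name: a.value for a in self.attributes if a.is_persistent()}


# A configuration object can mix configuration and run-time attributes.
su = ModelObject("su1", is_config=True)
su.add(Attribute("rank", 1))
su.add(Attribute("oper_state", "enabled", runtime=True))

# A run-time object holds run-time attributes only.
assignment = ModelObject("su1-assignment", is_config=False)
assignment.add(Attribute("ha_state", "active", runtime=True))
```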

The objects in the information model are accessible through two different IMM-provided interfaces as shown in the following figure.

[Figure: Information Model Management Service]

The Object Management (OM) API is intended for use by system management applications, potentially including a CLI or SNMP agent, which need to do the following:

  • Perform create, read, update, and delete operations, so-called “CRUD” operations, on the objects representing the configuration of the system
  • Monitor the operational state of the system
  • Initiate administrative operations on system resources

An application that performs these types of operations is called an object manager in the IMM vernacular.

The Object Implementer (OI) API is intended for applications or services that are responsible for managing the operation of a defined set of system resources represented as objects in the information model. An object implementer is also responsible for validating and processing configuration change and administrative operation requests on those objects for which it registers as the implementer. As shown in the figure, there is typically a set of configuration objects in the information model that represents the AMF system model; these objects might include Service Groups (SGs), Service Units (SUs) and Service Instances (SIs). AMF uses these objects and their properties to determine the resources within the system it needs to manage along with the availability management policies to be applied to those resources. AMF will also create run-time objects within the information model that represent the operational state of the AMF entities under management, such as workload assignments.
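
The division of labor between an object manager and an object implementer can be sketched roughly as follows. This is a simplified Python model, not the OM or OI API: `ToyImplementer`, `ModelRegistry`, the `rank` rule and the `lock` operation are all hypothetical names chosen for illustration.

```python
class ToyImplementer:
    """Hypothetical object implementer that owns a set of objects."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def validate(self, obj, attr, value):
        # Example rule (an assumption): a rank must be a non-negative integer.
        return attr != "rank" or (isinstance(value, int) and value >= 0)

    def admin_operation(self, obj, operation):
        # Record the request; a real implementer would act on the resource.
        self.log.append((obj, operation))
        return "ok"


class ModelRegistry:
    """Toy information model that routes requests to the registered implementer."""

    def __init__(self):
        self.objects = {}        # object name -> {attribute: value}
        self.implementers = {}   # object name -> ToyImplementer

    def register(self, obj, implementer):
        self.objects.setdefault(obj, {})
        self.implementers[obj] = implementer

    # The two methods below play the role of the object manager side.
    def modify(self, obj, attr, value):
        if not self.implementers[obj].validate(obj, attr, value):
            return False
        self.objects[obj][attr] = value
        return True

    def admin(self, obj, operation):
        return self.implementers[obj].admin_operation(obj, operation)


registry = ModelRegistry()
amf = ToyImplementer("amf")
registry.register("su1", amf)

accepted = registry.modify("su1", "rank", 1)    # passes validation
rejected = registry.modify("su1", "rank", -5)   # refused by the implementer
result = registry.admin("su1", "lock")
```

The point of the sketch is that the object manager never changes an object directly; every request passes through whichever implementer registered for that object.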

Now that you have a basic understanding of IMM and its anatomy, check back next week and we’ll explore the value IMM offers as you architect your system.

August 1, 2008

Availability Management Framework

Posted by Steve Mills, Systems Engineer, GoAhead Software

Today I want to talk about the power and beauty of the Availability Management Framework (AMF) Service, which is one of the many SA Forum AIS services. First let’s set some context.

It was mentioned in an earlier blog entry that the Service Availability Forum (SA Forum) is the primary source of standards for high availability middleware. Roughly, the SA Forum specifications are broken down into two areas:

  • The Hardware Platform Interface (HPI) specification defines a set of APIs that abstract hardware platform resources (fans, blades, power supplies) so that any particular platform technology, such as ATCA, CompactPCI, BladeCenter or even a proprietary one, expresses itself in the same way to its clients. GoAhead’s SAFfire HA middleware is just such an HPI client. This is a big topic unto itself which we will leave for another day.
  • Then there are the Application Interface Specification (AIS) services, each of which has the general goal of supporting the needs of a system and its embedded distributed applications in a highly available computing environment called a cluster. There are twelve AIS services today with more on the way. Example AIS services include Logging, Checkpointing and Messaging.

The Availability Management Framework (AMF) is by far the most important of these AIS services. AMF provides the core features that allow a system and its applications to achieve five nines or greater service availability. Roughly, AMF accomplishes this by coordinating a set of active and standby objects that provide service protection in the event of a fault.

But how does AMF help achieve these system goals and how is it different from the other AIS services?

In some ways AMF is much like any other AIS service. Like other AIS services, AMF specifies the following set of things:

  • APIs (Java coming soon!) available to participating applications
  • Alarms and Notification events that others can subscribe to
  • Objects that expose AMF’s configuration and runtime values
  • Administrative operations that allow humans (or scripts) to control AMF’s activities

One thing that makes AMF so different from the other AIS services is that AMF’s APIs expose only the tip of a very large conceptual ‘iceberg’; the rest remains hidden below the surface from the AMF components that actually use the AMF APIs.

With other AIS services you can learn almost all there is to know about that service by carefully studying its APIs. For example, the APIs of the Checkpoint service tell us about checkpoints, their replicas, how they are structured and what operations one can perform on them, and that is about all there is to know about the Checkpoint service.

Not so with AMF. The AMF APIs are only used by one rather basic AMF modeling object called a ‘component’ (which can be thought of as a Linux process). Through the AMF APIs an ‘SA-aware’ component is told when to go active and when to go standby, and it can report its health status or report a failure. That’s about it. In fact, the AMF APIs are not even required for a component to participate, since AMF also supports legacy ‘non-SA-aware’ components, whose code cannot be changed. In short, we don’t learn much about AMF by only studying its APIs.
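
To show how little an SA-aware component has to do, here is a rough Python sketch of its side of the conversation. It is not the AMF API: the class names, the callback names (`on_assign`, `on_healthcheck`, `report_failure`) and the `ToyFramework` stand-in are all hypothetical.

```python
class ToyFramework:
    """Stand-in for AMF, reduced to what this sketch needs."""

    def __init__(self):
        self.errors = []  # names of components that reported a failure


class ToyComponent:
    """Hypothetical SA-aware component: it only reacts to what it is told."""

    def __init__(self, name):
        self.name = name
        self.ha_state = None
        self.healthy = True

    def on_assign(self, ha_state):
        # The framework tells the component to go active or standby.
        if ha_state not in ("active", "standby"):
            raise ValueError(f"unknown HA state: {ha_state}")
        self.ha_state = ha_state

    def on_healthcheck(self):
        # The component reports its health; True means healthy.
        return self.healthy

    def report_failure(self, framework):
        # The component can also report its own failure.
        framework.errors.append(self.name)


framework = ToyFramework()
comp = ToyComponent("call-server")

comp.on_assign("standby")   # start out as the standby
comp.on_assign("active")    # later promoted to active

comp.healthy = False        # simulate an internal problem
if not comp.on_healthcheck():
    comp.report_failure(framework)
```

Notice that nothing in the component says why it was promoted, what it is paired with, or what happens after it reports a failure. Those decisions live elsewhere.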

Now for that portion of the iceberg hidden below the AMF API surface: The AMF APIs give little clue that there is a rich and elaborate world behind them. The lowly component is but a cog in a very big and elaborate machine whose configuration is expressed in something called the system model.

The component itself has little insight that it is participating in something called a Service Unit (SU), perhaps with other components, and that this SU may be one of many SUs that make up a Service Group (SG), some number of which in turn make up an application. This entire conceptual framework is captured in the system model.
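
The containment hierarchy just described can be pictured as nested data. The Python sketch below is only an illustration: the class layout and the example names (`media-gateway`, `sg-signalling`, `sip-a` and so on) are assumptions, and a real system model carries many more attributes and object types.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str


@dataclass
class ServiceUnit:
    name: str
    components: list = field(default_factory=list)


@dataclass
class ServiceGroup:
    name: str
    service_units: list = field(default_factory=list)


@dataclass
class Application:
    name: str
    service_groups: list = field(default_factory=list)

    def component_names(self):
        """Walk the hierarchy top-down and list every component."""
        return [
            c.name
            for sg in self.service_groups
            for su in sg.service_units
            for c in su.components
        ]


# One application, one service group, two service units of two components each.
app = Application("media-gateway", [
    ServiceGroup("sg-signalling", [
        ServiceUnit("su1", [Component("sip-a"), Component("db-a")]),
        ServiceUnit("su2", [Component("sip-b"), Component("db-b")]),
    ]),
])
```

The `Component` class holds no reference back to its service unit or service group, which mirrors the point above: the structure is known to the model, not to the component.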

AMF controls the component’s life-cycle. AMF knows when, where and why the constituent components of an SU are instantiated and terminated, as guided by information in the system model. The component has no clue.

AMF knows when, where and why to assign the constituent components of an SU the active or standby roles as well as how and when to healthcheck each. AMF knows if the component is participating in a 2N, N+M, N-way or one of the other redundancy models. Again, the component itself has little idea. All of this detail is in the system model.

In the event of a fault, AMF knows how and when to initiate recovery and repair policies such as component-restart, SU or node failover, as explained in the system model. The component simply does whatever AMF tells it to.
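
As a rough illustration of policy-driven recovery, the Python sketch below models a 2N pair that restarts a faulty service unit in place and fails over once a restart budget is used up. The escalation rule, the `max_restarts` parameter and the class name are simplifications chosen for this example; they are not the recovery algorithm defined by the AMF specification.

```python
class ToyServiceGroup2N:
    """Toy 2N service group: one active and one standby service unit."""

    def __init__(self, active, standby, max_restarts=1):
        self.active = active
        self.standby = standby
        self.max_restarts = max_restarts  # policy value; would come from the system model
        self.restarts = {active: 0, standby: 0}
        self.actions = []

    def on_fault(self, su):
        """Pick a recovery action for a fault reported against `su`."""
        if self.restarts[su] < self.max_restarts:
            # First line of defence: restart in place.
            self.restarts[su] += 1
            self.actions.append(("restart", su))
        elif su == self.active:
            # Restart budget exhausted: swap roles so service continues.
            self.active, self.standby = self.standby, self.active
            self.actions.append(("failover", su))
        else:
            # A failing standby does not interrupt service; just record it.
            self.actions.append(("standby-failed", su))


sg = ToyServiceGroup2N(active="su1", standby="su2", max_restarts=1)
sg.on_fault("su1")   # restarted in place
sg.on_fault("su1")   # restart budget used up, so su2 takes over
```

Changing `max_restarts` changes the behavior without touching any component code, which is the sense in which recovery is a matter of configuration.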

In sum, AMF is by design minimally intrusive on the component writer and the coding effort in general. Implementing a component is relatively simple and straightforward, and the result is re-usable. No grand system insights are required. Once built, these components are like re-usable ‘atoms’ available for configuration to form new and interesting ‘molecules’ designed to fit a particular computing device.

The system model is the real beauty and power of AMF. The system model is like one big data-driven policy engine that allows the ‘big picture’ system designer to properly assemble the components so as to capture the HA structure and policies suitable for the target device at hand. This is a configuration exercise, not a coding exercise, which allows for easy corrections, adjustments and tweaks as needed. This agility is particularly elegant, powerful and cost-effective as the project (and its realities) transitions from paper design, to development, to system test and finally deployment.

July 25, 2008

The recession: Has it hit the telecom industry?

Posted by Jim Ewel, CEO and Chair, GoAhead Software

The two largest markets for high availability solutions are telecom and defense. Both markets have been very healthy for the last couple of years. The question we ask ourselves is how healthy will these markets remain moving forward? This is relevant not only for our sales, but for the job prospects of many of our customers and partners. During the last telecom downturn in 2001-2002, the large telecom equipment manufacturers shed tens of thousands of jobs.

First quarter numbers for the major telecom equipment manufacturers for 2008 were mixed, although the large players seem healthy. Ericsson was up five percent year over year (YoY), Nortel was up 11 percent, Motorola was up two percent, and Alcatel-Lucent was flat in revenues, but up dramatically in profitability.

How about the revenues of the service providers? Of the top five service providers, three show healthy growth (Verizon, AT&T and Telefonica), NTT shows a slight decline and Deutsche Telekom seems to be declining a little more dramatically, probably due to competition from Vodafone.

So a quick look in the rear-view mirror shows a fairly healthy market, and certainly no signs of a recession comparable to 2001-2002. What about the prospects for the industry moving forward?

AT&T CEO Randall Stephenson has warned of a slowdown but mainly in the landline business. He cited wireless as a growth area, as did Verizon’s Chairman Ivan Seidenberg, who sees no sign of a slowdown. But the big areas of growth are in infrastructure outside of North America and Western Europe, and in new technologies. At year end 2007, China signed its 500 millionth subscriber, and India passed the U.S. in total number of subscribers. China and India add 8-9M net additional subscribers per month. New phones, like the iPhone, have also increased data services and corresponding ARPUs. All of this requires new equipment, and new equipment means a healthy market.

The growth potential for new technology is also huge. For example, in the details of Verizon’s most recent quarterly filing, they talked about the potential of their broadband fiber-to-the-premises offering (FiOS).

“FiOS Internet was available for sale to 7.9 million premises by the end of the quarter. Penetration for the service averaged 22.9 percent across all markets. FiOS TV was available for sale to 6.5 million premises by the end of the quarter. Penetration for the service averaged 18.7 percent across all markets.”

Imagine if they can expand this service nationwide and take away 20 percent of the cable companies’ broadband and television business. I know that, given an alternative to Comcast’s high prices, I’d switch in a heartbeat to a comparable solution.

In short, we remain very bullish on the telecom business in general — especially in those areas where high availability isn’t an option, but a requirement.

July 18, 2008

A conversation on HA, standards, integration and more

Posted by Jim Ewel, CEO and Chair, GoAhead Software

Welcome to Zero Downtime, the GoAhead blog! We started this blog to have a conversation with our customers and other people interested in highly available, application-ready platforms. Just to be clear, by highly available, we don’t mean a system that seems generally reliable but might be out of service for an hour every night for maintenance. We mean absolutely critical systems — sometimes life critical — where we and our customers have to pay penalties if the systems fail or have to be taken down for any reason. Generally, the availability of systems is measured in terms of nines (i.e. something that’s available 99.999% of the time is known as a “five nines system”) and the amount of allowable downtime per year is measured in minutes or sometimes seconds. If you’re reading this blog, however, you probably know this, so I won’t go on about it.
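
For readers who want the arithmetic behind the nines, here is a short Python sketch that converts a number of nines into a yearly downtime budget. The function name is made up for the example, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for `nines` nines of availability (5 -> 99.999%)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5, 6):
    minutes = allowed_downtime_minutes(n)
    print(f"{n} nines: {minutes:.2f} minutes/year ({minutes * 60:.0f} seconds)")
```

Under that assumption, five nines allows roughly 5.26 minutes of downtime per year and six nines allows roughly 32 seconds.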

We specialize in helping people build systems that deliver at least five nines, and sometimes six nines, of availability. We’ve been doing this for almost twelve years, and we’ve helped build all kinds of systems: base station controllers, media gateways, call servers, WiMAX controllers, switched digital video controllers, home location registers (HLRs), border and session control products, deep packet inspection, and command and control weapons systems. If you’ve made a wireless call on Verizon, Sprint, KDDI, T-Mobile, or Orange, chances are you were routed through a piece of equipment using our software. We’re not limited to wireless applications; soon, Comcast and other cable providers will be installing equipment bringing video on demand to their subscribers. This equipment will be protected by GoAhead’s software.

We’re also actively involved in the industry. Asif Naseem, our president and COO, is also the current president of the Service Availability Forum (SA Forum). The SA Forum authors standard interface specifications for service availability middleware services. Many of our engineers and architects are also actively involved in the sub-committees of the SA Forum. We participate in or monitor many other standards bodies or industry forums including the SCOPE Alliance, blade.org, PICMG, the Linux Foundation, IETF, and CP-TA.

I say all this not to brag, but to establish our qualifications to host such a discussion forum. That being said, we hope to learn from this exercise, both through our own writing and research and through your comments.

Over the next few weeks, we have a number of topics we plan to write about. First, we plan to do a series of articles on the SA Forum’s Application Interface Specification (AIS), including some specific articles on the Availability Management Framework (AMF), the Platform Management APIs (PLM) and the various services. We’ll also blog on the topics of integrating network management and high availability, on MicroTCA, and on creating an application ready platform.

We’re looking forward to writing about these topics and posing some compelling questions we’d like you to weigh in on. We hope you’re looking forward to participating. If you have other topics you’d like to suggest, please let us know through your comments. You disagree with us? Great! Let us know in the comments. We’ll read every one.

Thanks for listening.

July 18, 2008

