Our Research on High-performance and Real-time CORBA
Many application domains (such as avionics, telecommunications, and
multimedia) require real-time guarantees from the underlying networks,
operating systems, and middleware components to achieve their quality
of service (QoS) requirements. In addition to providing end-to-end
QoS guarantees, applications in these domains must be flexible and
reusable. Requirements for flexibility and reusability motivate the
use of object-oriented middleware like the Common Object Request
Broker Architecture (CORBA). However, the performance levels and QoS
enforcement features of current CORBA
implementations are not yet suited for hard real-time systems (e.g.,
avionics) and constrained latency systems (e.g.,
teleconferencing).
An increasing number of operating systems, networks, and protocols now
support real-time scheduling. However, no integrated solutions yet
exist that provide end-to-end QoS guarantees to distributed object
applications. To rectify this situation, we've been conducting
research over the past several years to identify the key architectural
patterns and performance optimizations necessary to build
high-performance, real-time ORBs. We have developed a prototype
real-time ORB endsystem called TAO that can
deliver end-to-end QoS guarantees to applications.
This document describes our existing research on CORBA, outlines our
upcoming plans, and provides an overview of our vision of the future
of CORBA, particularly for real-time systems.
Measuring the Performance of CORBA Over High-speed ATM Networks
We have been conducting research on measuring and optimizing the
performance of CORBA implementations for high-speed networks and
real-time systems, as well as designing reliable CORBA systems. Our
experience using CORBA over the past several years
indicates that it is well-suited for request/response applications
over lower-speed networks (such as Ethernet and Token Ring). However,
through our benchmarking efforts, we've determined that conventional
implementations of CORBA incur considerable overhead when used for
performance-sensitive applications over high-speed networks.
As users and organizations migrate to networks with Gigabit data rates
and deploy applications with demanding quality of service
requirements, the inefficiencies of current implementations of CORBA
will force developers to choose lower-level mechanisms (like ACE C++ wrappers
for sockets) to achieve the necessary
transfer rates. The use of low-level mechanisms, particularly C
language programming interfaces, increases development effort and
reduces system reliability, flexibility, and reuse. This is a serious
problem for mission/life-critical applications (such as medical imaging and real-time systems).
Therefore, we believe it is imperative that the performance of high-level,
but inefficient, communication middleware be improved to match that of
low-level, but efficient, tools. Moreover, we believe that advances
in high-performance, real-time distributed object computing can be
achieved only by simultaneously integrating techniques and tools that
simplify application development; optimize application, I/O subsystem,
and network performance; and systematically measure performance to
pinpoint and alleviate bottlenecks. To validate our research
hypothesis, we are developing a high-performance, real-time ORB called
TAO (The ACE ORB).
Limitations with Existing ORBs
Our experience using CORBA on telecommunication, avionics, and medical
projects indicates that it is well-suited for request/response
applications. However, the QoS specification and enforcement features
of current ORBs, as well as their performance levels, are not yet
suitable for applications with hard real-time requirements (e.g.,
avionics mission computers) and stringent statistical real-time
requirements (e.g., teleconferencing). In particular, conventional
ORB specifications and implementations are characterized by the
following deficiencies:
- Lack of QoS specification and enforcement --
Conventional ORBs do not define APIs that allow applications to
specify their end-to-end QoS requirements. Likewise, existing ORB
implementations do not provide support for end-to-end QoS enforcement
between applications across a network. For instance, CORBA provides
no standard way for clients to indicate the relative priorities of
their requests to an ORB. Likewise, there are no means for DCOM or
RMI clients to inform an ORB how frequently to execute operations that
must run periodically.
- Lack of real-time features -- Conventional ORBs
do not provide key features that are necessary to support real-time
programming. For instance, although the CORBA interoperability
protocol (GIOP) supports asynchronous messaging, there is no standard
programming language mapping for exchanging ORB requests
asynchronously. Likewise, the DCOM and RMI specifications do not
require an ORB to notify clients when transport layer flow control
occurs. Therefore, it is hard to write portable and efficient
real-time applications that are guaranteed not to block indefinitely
when ORB endsystem and network resources are temporarily
unavailable.
- Lack of performance optimizations -- Existing ORBs
incur significant throughput and latency overhead. These overheads stem from
excessive data copying, non-optimized presentation layer conversions,
internal message buffering strategies that produce non-uniform
behavior for different message sizes, inefficient demultiplexing
algorithms, long chains of intra-ORB virtual method calls, and lack of
integration with underlying real-time OS and network QoS
mechanisms.
Meeting the QoS needs of next-generation distributed applications
requires much more than defining IDL interfaces or building real-time
scheduling into ORBs. It requires a vertically integrated
architecture that can deliver end-to-end QoS guarantees at multiple
levels of an entire distributed system.
Optimizations and Features for High-performance, Real-time CORBA
Although some operating systems, networks, and protocols now support
real-time scheduling, they do not provide integrated end-to-end
solutions. In particular, QoS research at the IPC and OS layers has
not necessarily addressed key requirements and usage characteristics
of ORB middleware (such as CORBA, DCOM, or RMI). For instance,
research on QoS for communication systems has focused largely on
policies for allocating network bandwidth on a per-connection basis.
Likewise, research on real-time operating systems has focused largely
on avoiding priority inversions and non-determinism in synchronization
and scheduling mechanisms for multi-threaded applications. In
contrast, the programming model for developers of CORBA applications
focuses largely on invoking remote operations on distributed objects.
Determining how to map the results from QoS work at the IPC and OS
layers to ORB middleware is an important open research topic.
In addition, high-performance, real-time ORB endsystems require more
than CORBA middleware -- they must also integrate with network
adapters, operating system I/O subsystems, communication protocols,
and common object services. Below, we outline the requirements of
high-performance, real-time ORB endsystems that form the basis for our
work on TAO.
- Policies and mechanisms for specifying end-to-end application
QoS requirements --
Real-time ORB endsystems must allow applications to specify the QoS
requirements of their IDL operations using a small number of
parameters (such as computation time, execution period, bandwidth and
delay requirements). For instance, video-conferencing groupware may
require high throughput with statistical real-time latency deadlines.
In contrast, an operational flight control platform for avionics may
require periodic processing with strict real-time deadlines.
- QoS enforcement from real-time operating
systems and networks --
Regardless of the ability to specify QoS requirements, ORBs cannot
deliver end-to-end guarantees to applications without network and OS
support for QoS enforcement. Therefore, ORB endsystems must be
capable of scheduling resources such as CPUs, memory, storage
throughput, network adapter throughput, and network connection
bandwidth and latency. For instance, OS scheduling mechanisms must
allow high priority client requests to run to completion and prevent
them from being blocked indefinitely by lower priority requests.
- Optimized real-time communication
protocols --
The throughput, latency, and reliability requirements of multimedia
applications like teleconferencing are more stringent and varied than
those found in traditional applications like remote login or file
transfer. Likewise, the channel speed, bit-error rates, and services
(such as isochronous and bounded-latency delivery guarantees) of
networks like ATM exceed those offered by traditional networks like
Ethernet. Therefore, ORB endsystems must provide a range of
communication protocols that can be customized and optimized for
specific application requirements and network/host environments.
- Optimized real-time request
demultiplexing and dispatching --
ORB endsystems must demultiplex and dispatch incoming client requests
to the appropriate operation of the target object. In conventional
ORBs, demultiplexing occurs at multiple layers (e.g., the network
interface, the protocol stack, the user/kernel boundary, and the ORB's
Object Adapter). However, layered demultiplexing is inappropriate for
high-performance and real-time applications because it reduces
performance by increasing the number of times that internal tables
must be searched while incoming client requests traverse various
protocol processing layers. Likewise, layered demultiplexing can
cause priority inversions because important target object-level QoS
information is inaccessible to the lowest level device drivers and
protocol stacks in the I/O subsystem of an ORB endsystem.
- Optimized memory management --
On modern RISC hardware, data copying consumes a significant amount of
CPU, memory, and I/O bus resources. Therefore, multiple layers in an
ORB endsystem (e.g., the network adapters, I/O subsystem protocol
stacks, Object Adapter, and presentation layer) must collaborate to
minimize data copying.
- Optimized presentation layer --
Presentation layer conversions transform application-level data into a
portable format that masks byte order, alignment, and word length
differences. There are many optimizations that reduce the cost of
presentation layer conversions. For instance, there are tradeoffs
between using compiled versus interpreted code for presentation layer
conversions. Compiled marshaling code is efficient, but requires
excessive amounts of memory, which is problematic in many embedded
real-time environments. In contrast, interpreted marshaling code is
slower, but more compact.
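
The presentation layer conversions in the last item can be illustrated
with a minimal C++ sketch. It assumes a GIOP-style "receiver makes
right" scheme: the sender writes values in its native byte order and
tags the message with a byte-order flag, so the receiver swaps bytes
only when the two orders differ. The names Encoder, Decoder, and swap32
are hypothetical, the one-byte header is a simplification rather than
the real GIOP header layout, and none of this is TAO's marshaling code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical sketch of byte-order and alignment handling; not TAO's API.

// True when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Reverses the four bytes of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

// Byte 0 holds the sender's byte-order flag (1 = little endian).
// Values follow on 4-byte boundaries measured from the buffer start.
class Encoder {
public:
    Encoder() { buf_.push_back(host_is_little_endian() ? 1 : 0); }

    void put_u32(std::uint32_t v) {
        while (buf_.size() % 4 != 0) buf_.push_back(0);  // alignment padding
        unsigned char raw[4];
        std::memcpy(raw, &v, 4);  // native order: no conversion on send
        buf_.insert(buf_.end(), raw, raw + 4);
    }

    const std::vector<unsigned char>& bytes() const { return buf_; }

private:
    std::vector<unsigned char> buf_;
};

class Decoder {
public:
    explicit Decoder(std::vector<unsigned char> bytes)
        : buf_(std::move(bytes)), pos_(1), swap_(false) {
        assert(!buf_.empty());
        // Swap only if the sender's order differs from the host's order.
        swap_ = (buf_[0] == 1) != host_is_little_endian();
    }

    std::uint32_t get_u32() {
        while (pos_ % 4 != 0) ++pos_;  // skip alignment padding
        assert(pos_ + 4 <= buf_.size());
        std::uint32_t v = 0;
        std::memcpy(&v, buf_.data() + pos_, 4);
        pos_ += 4;
        return swap_ ? swap32(v) : v;
    }

private:
    std::vector<unsigned char> buf_;
    std::size_t pos_;
    bool swap_;
};
```

When sender and receiver share a byte order, get_u32 performs no
conversion at all, which is one reason this style of encoding suits
homogeneous endsystems.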
It is important to recognize that requirements for high performance
may conflict with requirements for real-time determinism. For
instance, real-time scheduling policies often rely on the
predictability of endsystem operations like thread scheduling,
demultiplexing, and message buffering. However, certain optimizations
(such as using self-organizing search structures to demultiplex client
requests) can increase the average-case performance of operations,
while decreasing the predictability of any given operation.
Therefore, our ORB endsystem is designed with an open architecture
that allows applications to select the appropriate tradeoffs between
average-case and worst-case performance. Moreover, where possible, we
use algorithms and data structures that can optimize for both
performance and predictability. For instance, de-layered
demultiplexing can increase ORB performance and predictability by
eliminating excessive searching and avoiding priority inversions.
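
The tradeoff between average-case and worst-case cost can be made
concrete with a small C++ sketch. It compares a move-to-front list, one
common kind of self-organizing search structure, with a direct-indexed
table of the sort a de-layered demultiplexer might use. Both classes,
the Handler alias, and the comparison counter are hypothetical
illustrations and are not taken from TAO.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Stands in for a pointer to the operation that serves a request.
using Handler = int;

// Self-organizing search: a hit moves the entry to the front, so
// frequently used operations become cheap, but the cost of any single
// lookup depends on the history of earlier requests.
class MoveToFrontDemux {
public:
    void bind(const std::string& op, Handler h) { entries_.emplace_back(op, h); }

    // Reports the number of key comparisons so callers can observe the cost.
    bool lookup(const std::string& op, Handler& out, std::size_t& comparisons) {
        comparisons = 0;
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            ++comparisons;
            if (it->first == op) {
                out = it->second;
                entries_.splice(entries_.begin(), entries_, it);  // move to front
                return true;
            }
        }
        return false;
    }

private:
    std::list<std::pair<std::string, Handler>> entries_;
};

// Direct indexing: the server hands out an index at bind time and the
// client sends it back, so every lookup is a single bounds-checked
// array access regardless of request history.
class DirectIndexDemux {
public:
    std::size_t bind(Handler h) {
        table_.push_back(h);
        return table_.size() - 1;
    }

    bool lookup(std::size_t index, Handler& out) const {
        if (index >= table_.size()) return false;
        out = table_[index];
        return true;
    }

private:
    std::vector<Handler> table_;
};
```

With four operations bound, the first lookup of the last entry in the
move-to-front list costs four comparisons and an immediate repeat costs
one, while the direct-indexed table does the same work on every call.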
Overview of TAO: The ACE ORB
Currently, there is significant interest in developing
high-performance implementations of real-time ORB endsystems.
However, meeting the requirements outlined above involves much more
than defining ORB QoS interfaces using OMG IDL -- it requires an
integrated architecture that delivers end-to-end QoS guarantees at
multiple levels of the entire system.
Therefore, we are developing a high-performance, real-time ORB called
TAO (The ACE ORB) that is designed explicitly to meet the requirements
of high-performance, real-time ORBs described above. TAO's ORB
endsystem architecture is summarized in Figure 2:
Figure 2. TAO: An ORB Endsystem Architecture for High-Performance, Real-Time CORBA
TAO contains the following policies and mechanisms, which span the
network adapter, operating system, communication protocol, and CORBA
middleware layers of an ORB endsystem:
- Real-time scheduling of OS and network resources;
- A high-performance ATM Port Interface Controller (APIC);
- Efficient zero-copy buffer management that shares client request
buffers across OS protection domains;
- Customized real-time implementations of GIOP-compliant
transport protocols;
- A set of Real-time IDL (RIDL) schemas that allows applications
to specify QoS attributes for their objects' operations using a small
number of parameters (such as computation time, execution period,
bandwidth and delay requirements);
- A real-time Object Adapter that supports various real-time
dispatching mechanisms and de-layered demultiplexing;
- An off-line Scheduling Service that determines the priority and
scheduling characteristics of client requests with hard real-time
deadlines;
- An optimized presentation layer that uses innovative compiler
techniques and efficient buffer management schemes to reduce data
copying overhead in ORB endsystems;
- Real-time Event Channels that use the RT ORB to support
customized scheduling, concurrency, and filtering policies for
user-defined events.
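
How an off-line scheduling step might turn the QoS parameters named
above (computation time and execution period) into priorities can be
sketched in C++. The sketch assumes rate monotonic scheduling, in which
shorter periods receive higher priorities, together with the Liu and
Layland utilization bound as a sufficient schedulability test. The text
above does not say which policy the Scheduling Service uses, and
OperationQoS, utilization, rms_bound, and assign_priorities are
hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-operation QoS record; not a TAO or RIDL type.
struct OperationQoS {
    std::string name;
    double computation_ms;  // worst-case execution time per invocation
    double period_ms;       // time between invocations
    int priority;           // set by assign_priorities; 0 is the highest
};

// Total CPU utilization: the sum of computation time over period.
double utilization(const std::vector<OperationQoS>& ops) {
    double u = 0.0;
    for (const auto& op : ops) {
        assert(op.period_ms > 0.0);
        u += op.computation_ms / op.period_ms;
    }
    return u;
}

// Liu and Layland bound for n periodic tasks: n * (2^(1/n) - 1).
double rms_bound(std::size_t n) {
    assert(n > 0);
    const double count = static_cast<double>(n);
    return count * (std::pow(2.0, 1.0 / count) - 1.0);
}

// Orders operations by period (shortest first), assigns priorities,
// and returns true when utilization is within the bound. A false result
// means "not proven schedulable by this test", not "unschedulable".
bool assign_priorities(std::vector<OperationQoS>& ops) {
    if (ops.empty()) return true;
    std::stable_sort(ops.begin(), ops.end(),
                     [](const OperationQoS& a, const OperationQoS& b) {
                         return a.period_ms < b.period_ms;
                     });
    for (std::size_t i = 0; i < ops.size(); ++i) {
        ops[i].priority = static_cast<int>(i);
    }
    return utilization(ops) <= rms_bound(ops.size());
}
```

Because the analysis runs before the system starts, its cost does not
affect run-time predictability; only the resulting priorities are
consulted while requests are dispatched.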
Our architectural overview of TAO describes these components in detail.
The Future of CORBA and CORBA Research
We believe the future of CORBA is very promising, particularly for
real-time systems. Real-time system development strategies will
migrate towards those used for ``mainstream'' systems to achieve lower
development cost and faster time to market. We have seen real-time
software development projects that have lagged in terms of design and
development methodologies (and languages) by decades. These
projects are extremely costly to evolve and maintain. They are so
specialized that they cannot be adapted to meet new market
opportunities.
The flexibility and adaptability offered by CORBA make it very
attractive for use in RT systems. If the real-time challenges can be
overcome, and our progress
so far shows that they can, then the use of Real-time CORBA is
compelling. Moreover, the solutions to these challenges will be
sufficiently complex, yet general, that it will be well worth
re-applying them to other projects.
In addition, CORBA can be adapted to ``niche'' markets, e.g.,
the RTOS market, that aren't well covered by more traditional major
players, e.g., Sun, Microsoft, IBM. In this sense, CORBA has
an advantage over other DOC technologies (such as DCOM and Java RMI)
since it can be integrated into a wider range of platforms,
i.e., it's open!