We deliver 99.9% rather than the 97% delivered by others
Server connection reliability is not the same thing as end-to-end
connection reliability. Many hosters will proudly tout their BGP, BGP2,
or BGP4 routing for enhanced Internet connectivity reliability. In fact, a properly
managed BGP service will give a server site connection reliability that exceeds
99.99%. But the problem with BGP is that there is only ever one pathway to
a web site at any given time. If anything along the multi-thousand-mile
route "breaks", there is nothing that either the server or the client
can do to get around the problem.
The Internet breaks, regularly!
When you make a connection to a web site, chances are that you
will pass through 20 to 30 routers. You will frequently also pass through thousands
of miles of optical fiber, with all of its signal amplifiers and segments that
can be assaulted by backhoes. Finally, there are times when pieces of the Internet
get congested, and some or all of your data just never gets through. The Internet,
overall, does a pretty good job of routing around these problems. The BGP routing
system generally finds breakage and posts alternate routes in 15 to 20 minutes.
But during the interval, those users affected still cannot connect. While there
will be some routes that are never broken, in our experience about 3 percent
of the time someone tries to make a connection to a particular web site,
they will be unable to connect or stay connected for the entire session due
to some form of route failure or congestion.
Three percent non-connectivity does not sound like a lot, but it
represents about 11 calendar days out of a year. More important is that this
is the average. Because of regional problems, some people will have much longer
periods. For instance, we have seen some people lose their connectivity to a
specific point for 40% of the business hours for four weeks, until a third party
fixed an intermediate router problem.
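The arithmetic behind these figures can be checked with a short sketch. The 3 percent and 40 percent rates come from the text above; the 8-hour business day and 5-day business week are assumptions made only for illustration, and the function names are hypothetical.

```python
DAYS_PER_YEAR = 365

def unreachable_days(failure_rate: float, days: int = DAYS_PER_YEAR) -> float:
    """Calendar days per year lost if connections fail at `failure_rate` on average."""
    return failure_rate * days

def lost_business_hours(failure_rate: float, weeks: int,
                        hours_per_day: int = 8, days_per_week: int = 5) -> float:
    """Business hours lost over `weeks`; the 8 x 5 schedule is an assumption."""
    return failure_rate * hours_per_day * days_per_week * weeks

# 3% of a year is roughly 11 calendar days.
print(f"{unreachable_days(0.03):.2f} days per year")

# 40% of business hours for four weeks, under the assumed 8 x 5 schedule.
print(f"{lost_business_hours(0.40, weeks=4):.0f} business hours")
```

Under those assumptions the regional case works out to 64 lost business hours in a single month, which is why the average understates what an unlucky user sees.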
BGP fixes last mile breakage, but it does not fix intermediate breakage
BGP as used by hosting service providers delivers
two things to the provider. The first is rapid cutover to alternate carriers
if their last mile to an individual Internet carrier is cut. The second of these
is load-balancing to make more efficient use of bandwidth. But while BGP almost
instantly restores local service for the hosting provider, it does nothing for a break
further along the route: neither the provider nor the client at the other end can
get around the problem, because neither knows where the problem is. Both must wait
for the Internet to self-heal through the advertising of a new route.
MultiPathing fixes all but first mile breakage
All modern browsers have the ability to go to a
named web site via multiple paths if the web site advertises multiple paths,
typically as several IP addresses for the same name. Browsers arbitrarily choose
one, try it, and then try each remaining item in the list sequentially if there
is no response from the prior route in about 30 seconds.
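The try-one-then-the-next behavior described above can be sketched in a few lines. This is an illustration of the idea, not how any particular browser is implemented: the function names are hypothetical, the 30 second timeout is taken from the text, and the addresses are tried in the order given rather than in an arbitrary order.

```python
import socket
from typing import Callable, Iterable, Tuple

Address = Tuple[str, int]  # (IP address, port)

def open_tcp(address: Address, timeout: float) -> socket.socket:
    """Open a TCP connection to one address, giving up after `timeout` seconds."""
    return socket.create_connection(address, timeout=timeout)

def connect_first_reachable(addresses: Iterable[Address],
                            timeout: float = 30.0,
                            connect: Callable = open_tcp):
    """Try each advertised address in turn and return the first successful connection."""
    last_error = None
    for address in addresses:
        try:
            return connect(address, timeout)
        except OSError as error:  # covers timeouts and refused or unreachable hosts
            last_error = error    # remember the failure and move on to the next path
    raise ConnectionError("none of the advertised addresses responded") from last_error

# Hypothetical usage: one site advertised on two carriers.
# (192.0.2.10 and 198.51.100.10 are documentation-only addresses.)
# sock = connect_first_reachable([("192.0.2.10", 80), ("198.51.100.10", 80)])
```

The `connect` parameter exists only so the fallback logic can be exercised without a network; a caller would normally leave it at its default.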
When these alternate destinations have different
Internet carriers (UUNET, AT&T, and Sprint are carriers) the routes normally
diverge from each other within a few router hops from the user's browser. Thus,
the only common point of failure is the first couple of hops, which is normally
the responsibility of the user's ISP. After divergence, there must be simultaneous
failure on all routes for connection to be impossible.
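A rough calculation shows why simultaneous failure is rare. The sketch below assumes that each route fails independently at the same 3 percent rate quoted earlier, and it ignores the shared first hops at the user's ISP. Both are simplifying assumptions, not measurements, and the function names are hypothetical.

```python
def all_routes_fail(per_route_failure: float, routes: int) -> float:
    """Probability that every route is down at once, assuming independent failures."""
    return per_route_failure ** routes

def availability(per_route_failure: float, routes: int) -> float:
    """Probability that at least one route gets through."""
    return 1.0 - all_routes_fail(per_route_failure, routes)

# Compare one, two, and three diverse routes at an assumed 3% failure rate each.
for routes in (1, 2, 3):
    print(f"{routes} route(s): {availability(0.03, routes):.4%} reachable")
```

With one route the figure is 97 percent; with two independent routes it rises to about 99.91 percent, because both would have to fail at the same moment (0.03 x 0.03 = 0.0009). Real routes are never perfectly independent, so this is best read as an upper bound.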
The Interstate highway analogy
The best way to understand the difference between
BGP and MultiPathing is to think of traveling on the Interstate from Washington
DC to Los Angeles.
In BGP, at any given time, the driver might only
be permitted to drive via I-70/I-15. If there is an accident in Columbus, they
will never get to the destination, because there is only one path, and divergence
is not permitted until the Internet advertises a new satisfactory route.
Conversely, in MultiPath, the driver has two routes,
perhaps I-70/I-15 and I-95/I-10. If they cannot get through on the first, they
get to try the second route a few seconds later. All they need to be
able to do to get through is to get to the Beltway where both start.
The mirrored variant of MultiPathing
If someone in Washington DC has a choice between going to Miami
or going to Boston for their data, they intrinsically have MultiPath capabilities,
because most of the route is not in common, and the endpoints are not in common.
Mirrored solutions are the most reliable for most web sites because they have
nothing in common with each other, and thus no practical common points of failure.
This is the best solution for static content because the data being delivered
changes only occasionally. We use the mirrored variant of MultiPathing in our
mirrored basic web service product.
True MultiPathing for dynamic systems
When web sites are actually collecting data from customers and
prospects, rather than just giving the customers data, the data ultimately needs
to go back to one logical system. If you try to go to two or more simultaneously,
you end up with two data sets, and possibly the problem of someone starting
on one server and being unable to complete on another (due to Internet path
failure). Logically, therefore, most dynamic databases need to be on a single machine.
For such situations, we offer true MultiPathing, multiple routes,
on multiple carriers, all pointing at the same machine to deliver connectivity
that is as reliable as the machine itself ... something in excess of 99.9%.
We offer two variants of this MultiPath connectivity. The first, and less expensive,
solution is to proxy through two different servers in the same facility. This
can be used to provide standard services such as web and SSH. The more expensive
variant is direct connect, giving your virtual or physical machine two (or more)
IP numbers on two discrete backbones. This service is necessary if you want
MultiPath SSL with your own cert, or other specialized services.
Why 3 percent failure matters to you
Three percent matters to you because use of the
Internet is so low cost relative to the value it produces. EasyCo, for instance,
will deliver 20,000 typical web pages for just one dollar. Because costs are
so low, a typical e-tailer will have standard web hosting bills that amount
to about 1/10 of a percent of sales.
The costs of doing business are not in the hosting
service, but in the costs of people to prepare the materials, prepare the programs
that deliver it, process the orders, and keep the computer equipment that manages
it all alive. Losing 3 percent of sales in the face of these high, and generally
fixed, costs has a major impact on the bottom line. Reliability matters not
because of the intrinsic waste of bits that never get delivered, but because
of the massive economic loss that is a consequence of non-delivery.
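To put numbers on this, the sketch below compares the hosting bill with the sales lost to failed connections. The 1/10 of a percent hosting share and the 3 percent loss come from the text above; the $1,000,000 annual sales figure is purely hypothetical, and the calculation assumes every failed connection is a lost sale.

```python
def hosting_vs_lost_sales(annual_sales: float,
                          hosting_share: float = 0.001,  # 1/10 of a percent of sales
                          failure_rate: float = 0.03):   # 3% of connections fail
    """Return (hosting bill, lost sales, lost sales as a multiple of the hosting bill)."""
    hosting_bill = annual_sales * hosting_share
    lost_sales = annual_sales * failure_rate
    return hosting_bill, lost_sales, lost_sales / hosting_bill

# Hypothetical e-tailer with $1,000,000 in annual sales.
bill, lost, multiple = hosting_vs_lost_sales(1_000_000)
print(f"hosting bill: ${bill:,.0f}")
print(f"lost sales:   ${lost:,.0f}")
print(f"lost sales are about {multiple:.0f} times the hosting bill")
```

Whatever the sales figure, the ratio stays the same under these assumptions: the lost sales are about thirty times the entire hosting bill.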
Why don't more providers offer MultiPathing?
To be honest, we only offer MultiPathing because
our business started in database hosting, where reliable connectivity is absolutely
needed for reliable service. In database hosting, people must remain connected
all day long, or even (in some cases) 24x7. We could not afford to have a company
with 100 employees totally out of business for 20 to 30 minutes at a time due
to a BGP route failure. MultiPathing gives us the ability to transparently reconnect
on a line failure, with "real" down time a virtually unnoticeable
This said, MultiPathing is not a simple engineering
project because there are some complex interrelationship issues. Most hosters
therefore buy the off-the-shelf stuff, and presume you will be content with
"pretty good" rather than the best possible.