InterProphet No Drop Packet Router "never drop a packet. Never" - Product Demonstration

Can Silicon TCP™ do no-drop routing?
InterProphet's Silicon TCP™ technology can do more than send data at high bandwidth -- it can also be used to route Internet traffic so as to eliminate packet drops and reduce congestion across a network. But just how effective is ballistic protocol processing in real use for routing? Let's see another example.

Demonstration Environment
In this demonstration, three Athlon PCs are successively connected through separate point-to-point network links, with each link being a separate subnet. One system acts as an aggregation router, which simulates a high-capacity core network communications stream with millions of concurrent sessions, while the other two implement an edge router DS3 environment to supply 1000 simulated DSL subscribers. Both of these routers have InterProphet boards with Silicon TCP. Six other systems of varying kinds simulate a group of DSL-connected consumers that fetch email, web pages, ftp, and video streams.

Standard high-performance NICs are also present in the systems to allow an "apples to apples" comparison with traditional router performance in this configuration.

Many fail to understand that the source of packet drops is a fundamental design decision of the Internet: switches and routers deal with overcapacity simply by dropping packets as soon as possible. More clever equipment attempts to choose which packets to drop by an arbitrary convention; otherwise the choice is random. Many "end-to-end" experts strongly advocate this as the best medicine - if the customer complains enough, more capacity will be factored in. This leads to an overcapacity core network and an underappreciated edge network - everyone's core is the same, but everyone's edge is different. Often capacity drops are transient - in as little as 600 microseconds, the capacity to transmit recovers. With Silicon TCP, instead of being dropped, packets are selectively delayed with flow control across the edge.

Edge Router supplying DSL services

ISP Distribution Router System

Silicon TCP Product Results
Both systems can sustain a synchronous message exchange indefinitely at full data rate. The distribution router on the upstream side successively routes packets across using TCP flow/congestion control and waits for the edge router to pass the packets to the DSL client before routing more packets. Similarly, the edge router on the downstream side successively routes packets across using TCP flow/congestion control and waits for the distribution router to pass the packets to the aggregation router before routing more packets. In either case, if a drop occurs between the routers, the packet loss is detected within the round trip time between the routers and the packet is recovered from the sourcing router. The technique used is called transparent congestion recovery.
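
The idea behind this kind of hop-by-hop recovery can be illustrated with a small simulation. This is only a sketch under stated assumptions, not InterProphet's implementation: the function name, the random loss model, and the `max_tries` limit are all hypothetical, and a real router would pipeline many packets rather than send one at a time.

```python
import random
from collections import deque


def forward_hop_by_hop(packets, loss_rate, seed=0, max_tries=50):
    """Deliver packets, in order, across one lossy router-to-router link.

    The sending router holds a copy of each packet until the next hop
    acknowledges it and resends locally on loss, so the endpoints never
    see the drop. Returns (delivered packets, local retransmissions).
    The loss model and max_tries are illustrative assumptions.
    """
    rng = random.Random(seed)
    pending = deque(packets)      # copies held by the sending router
    delivered = []
    retransmissions = 0
    while pending:
        packet = pending[0]
        for _ in range(max_tries):
            if rng.random() >= loss_rate:   # packet survived the link
                delivered.append(packet)
                pending.popleft()           # acknowledged: release the copy
                break
            retransmissions += 1            # lost: resend from this router
        else:
            raise RuntimeError("link appears to be down")
    return delivered, retransmissions


if __name__ == "__main__":
    sent = list(range(1000))
    got, resent = forward_hop_by_hop(sent, loss_rate=0.05)
    print("all delivered in order:", got == sent)
    print("local retransmissions:", resent)
```

Every loss costs only one extra trip across the single link where it happened, which is why the recovery time is bounded by the router-to-router round trip rather than the end-to-end one.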

Silicon TCP can allow two routers to reliably exchange packets with a 24 microsecond round trip time. From the time a sending router sends a packet to the time the receiving router receives a packet, the entire transaction is completed in a fraction of the time of an end-to-end packet retransmission. In the common case, this corresponds to an over 100,000-fold improvement in the recovery time from the most common Internet performance bottleneck!

Even better -- recovery occurs without the customer or his computer noticing. Quality is continuously maintained over the network segment covered. He never knows there was a problem.

Traditional Router Stack

Silicon TCP Router Stack

Comparison with Existing Edge Routers
In comparison, the exact same configuration can be measured with high-performance 32-bit PCI NICs and a standard IP routing stack. In this case, packet drops are typically detected in round trip times of 120 milliseconds. Because of the congestion backoff protocols used in recovery, multiple packet exchanges must occur when a non-transparent congestion event occurs. This limits the data rate of the DSL connection to a packet every second or so until the integrity of the connection is reestablished. In addition, the recovery time is far less predictable, and may range from a fraction of a second to almost a minute.
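
To put the two sets of figures side by side, the arithmetic can be worked through in a few lines. This is a rough sketch: the 24 microsecond and 120 millisecond values are the ones quoted in this demonstration, while the sample end-to-end recovery times (0.5, 1 and 60 seconds) are hypothetical points picked from the "fraction of a second to almost a minute" range, and local recovery is assumed to take a single router-to-router round trip.

```python
SILICON_TCP_RTT_S = 24e-6     # router-to-router round trip quoted in this demo
TRADITIONAL_RTT_S = 120e-3    # typical drop-detection round trip quoted above


def speedup(slow_seconds, fast_seconds):
    """How many times faster the fast path is than the slow path."""
    return slow_seconds / fast_seconds


def implied_slow_time(factor, fast_seconds):
    """Slow-path time that a given improvement factor would correspond to."""
    return factor * fast_seconds


if __name__ == "__main__":
    detection = speedup(TRADITIONAL_RTT_S, SILICON_TCP_RTT_S)
    print(f"detection: about {detection:,.0f}x faster")

    # Hypothetical end-to-end recovery times within the quoted range.
    for recovery_s in (0.5, 1.0, 60.0):
        factor = speedup(recovery_s, SILICON_TCP_RTT_S)
        print(f"recovery of {recovery_s} s: about {factor:,.0f}x faster")

    # End-to-end recovery time that a 100,000-fold improvement implies.
    implied = implied_slow_time(100_000, SILICON_TCP_RTT_S)
    print(f"100,000x corresponds to about {implied:.1f} s end to end")
```

Under these assumptions, a 100,000-fold improvement corresponds to an end-to-end recovery of a few seconds, which falls inside the range of recovery times described above.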

But the worst consequence is that the customer can't miss seeing the effects of the packet drop and recovery. His email times out, his browser jams, and his streaming video freezes. A painful loss of quality. This is all too common. It is the bane of widespread broadband deployment. Sometimes the customer sees little difference with lower rate connectivity that costs less!

Application of Silicon TCP
These results demonstrate that Silicon TCP can be used to eliminate the most common packet drops between routers. 95% of all packet drops occur at the edge of the network. Silicon TCP provides a thousandfold reduction in error discovery time, and a ten-thousandfold improvement in error recovery time.

Since Silicon TCP scales with the bit rate of the link, as network signalling speeds increase, its ability to handle them improves in lockstep - Silicon TCP always processes protocols at the peak rate of the link. Even with existing expensive specialized NICs, fatter (8192 byte) packets, and elaborate software modifications, induced latencies aren't getting shorter -- in some cases they are getting longer. Silicon TCP reverses this trend; it removes timing jitter, and thus near-perfect quality Internet communications become possible.

Equipment and software used
In this demonstration, identical Athlon XP processor motherboards were used for the routers. Both systems ran identical BSD operating system configurations. All had 512MB of RAM. For the traditional router comparisons, the standard BSD IP routing stack and NetGear 10/100 32-bit PCI cards were used.

Why do IP routers drop packets?
Internet routing occurs below the reliable transport level. Packet delivery between routers isn't guaranteed, but instead occurs on a best-effort basis. If a router has a transient overload that exceeds its memory capacity, packets must be dropped. There is no flow control or global traffic management present in real time. Silicon TCP adds this in transparently and remedies the breach.

Why does this happen at the edge of the Internet?
At the core of the Internet, transport capacity far exceeds the bandwidth in use at any time. Since millions of connections are mediated by each router, costly memory, processor, and exotic architectures can be employed since the cost will be shared by an enormous base. Even better, the transport capacity inbound and outbound is balanced, so the disparity of feeds isn't present. But none of this is true at the edge.

At the edge, bandwidth and equipment costs are dear. Thousands of subscribers per DSL POP limit the amount of bandwidth, processors, and memory that can be afforded. The smaller the costs, the more profitable the ISP. The highest costs are customer quality and support. To even be affordable, an average DSL POP uses T1 or T3 connections that run at a fraction of the core Internet's OC-48 or OC-192 speed. There is no buffer big enough to hold 1,000 subscribers' data travelling at 10,000x when it hits the edge all at once. And if there were, the cost would be prohibitive, given that you'd only need it 0.001% of the time.
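
The rate mismatch described above can be made concrete with the standard line rates for these link types. The sketch below is illustrative only: the line rates are the usual nominal figures, but the 10 millisecond burst duration is a hypothetical value chosen for the example, and the backlog formula assumes a burst arriving at the full core rate while the edge link drains at its full rate.

```python
# Nominal line rates in bits per second.
LINK_RATES_BPS = {
    "T1": 1.544e6,
    "T3": 44.736e6,
    "OC-48": 2488.32e6,
    "OC-192": 9953.28e6,
}


def rate_ratio(core, edge):
    """How many times faster the core link is than the edge link."""
    return LINK_RATES_BPS[core] / LINK_RATES_BPS[edge]


def backlog_bytes(in_bps, out_bps, burst_seconds):
    """Bytes a router must buffer to avoid drops during a burst.

    Data arrives at in_bps and drains at out_bps for burst_seconds;
    whatever cannot be forwarded in that time has to be queued.
    """
    excess_bps = max(0.0, in_bps - out_bps)
    return excess_bps * burst_seconds / 8


if __name__ == "__main__":
    for core, edge in (("OC-48", "T3"), ("OC-192", "T1")):
        print(f"{core} is about {rate_ratio(core, edge):,.0f}x faster than {edge}")

    burst_s = 0.010  # hypothetical 10 ms burst at full core rate
    needed = backlog_bytes(LINK_RATES_BPS["OC-48"], LINK_RATES_BPS["T3"], burst_s)
    drain_s = needed * 8 / LINK_RATES_BPS["T3"]
    print(f"buffer needed: {needed / 1e6:.2f} MB, drained in {drain_s:.2f} s")
```

Even a short burst at core speed leaves a backlog that takes the edge link far longer than the burst itself to drain, which is the imbalance this section describes.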

Other Approaches Used
Digital switched networks degrade in transport efficiency and quality of service due to transient overcommitment of link and/or memory capacity in network communications equipment between communications endpoints. While the majority of such events last only a fraction of a second, the effect is devastating for protocols such as TCP/IP, as it triggers a worst case effect whereby the transport is degraded for several seconds. Even worse, this effect can cascade to multiple unrelated communications, which all attempt similar recovery and potentially cascade more congestion events (a so-called "congestive collapse").

Our mechanism, built into network equipment, generates a duplicate packet to replace packets destroyed by other network equipment, recovering the fault before damage results. It functions transparently to all other network devices and software, so no modification of existing devices and software is needed to obtain the benefit. Other approaches have been used to minimize this problem:

Packet Shedding - routers, switches, NICs, and client/server operating systems drop packets to shed load if either the memory to queue the packet or the resource to run the software required to process the packet is unavailable. It is an attempt to gradually degrade operation rather than have a hard failure for all packets. This is the traditional technique that is widely employed.

ATM/MPLS spare capacity - By reserving both a primary and a secondary bandwidth path for specific pluralities of packet communications, a congestion event is dealt with as if it were a synthetic hardware fault that is remedied by using the secondary bandwidth path to absorb the overage. For this to work, the secondary path must be ensured not to be congested when the primary path is congested. Both ATM and MPLS approaches dedicate separate link paths and time on those paths to support this real time demand. To use this approach you must always considerably over-allocate bandwidth to support the recovery need. This drastically drives up cost and complexity.

Selective Packet Drop - (aka priority, RED, class of service based diffserv) Various techniques triage the packet drop problem by selectively dropping packets or by altering the drop profile randomly so as to minimize the cascade failure of multiple congestion recovery events. Selection criteria may be policy-based, class of service based, or based on a capacity threshold. Some even anticipate the congestion avoidance algorithms of the protocols so as to apportion damage to traffic without realtime significance (text, images, e-mail) in favor of realtime video/audio streams where quality loss is immediately apparent. These schemes seek to minimize the collateral damage of the drops, but the problem with not eliminating drops entirely is that the effect of these techniques is localized and subject to the kind of traffic present at any given time - as the use of the network changes, the mechanism requires external management and configuration to adapt the technique and obtain the benefit. Our mechanism is independent of these quality enhancements, as it does not judge the content or traffic being transmitted, nor does it impose additional queuing beyond that required to generate duplicate packets downstream of the recovery path entry endpoint.

OSI/IP gateways (Sitara), X.25 networks - (aka transport layer stacking, no caching or duplicate generation) Reliable transport networks that assert delivery on incremental propagation from each piece of network equipment to the next are another technology used to reduce the effects of network congestion, by stacking an end-to-end protocol on top of a reliable transport network. Examples of this include OSI/IP gateways (Sitara Networks) and older X.25 gateways. This approach eliminates long round trip induced congestion costs, but at the cost of protocol conversion and a decrease in capacity/efficiency because of the delay imposed by the conversion process. The practical impact of these costs is that new congestion and flow control effects are created that limit the gains that can be obtained. In addition, because this mechanism does not generate the duplicate packets desired, packet drop is still present for the portions of the transport path to/from the gateways. To minimize the total impact of these effects, transport gateways and retransmission nodes must be spread across more than the high congestion bottlenecks of a network, requiring larger deployment of nonstandard equipment of considerably higher cost; thus the approach is not scalable. Some of these techniques violate the end-to-end semantics of the TCP protocols, inducing peculiar problems in various client/server programs in common use.
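
For readers unfamiliar with the selective drop schemes listed above, RED (Random Early Detection) is the best known. The sketch below is a simplified version for illustration: it keeps a moving average of the queue length and drops arriving packets with a probability that rises as the average climbs between two thresholds. The class name and all parameter values are hypothetical, and the refinement in the original algorithm that spaces drops out evenly is omitted.

```python
import random
from collections import deque


class RedQueue:
    """Simplified RED queue; all default parameters are illustrative."""

    def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.2,
                 capacity=20, seed=0):
        self.min_th = min_th        # below this average, never drop early
        self.max_th = max_th        # at or above this average, always drop
        self.max_p = max_p          # drop probability just below max_th
        self.weight = weight        # smoothing factor for the average
        self.capacity = capacity    # hard limit on queued packets
        self.avg = 0.0              # moving average of the queue length
        self.queue = deque()
        self.rng = random.Random(seed)

    def drop_probability(self):
        """Early-drop probability for the current average queue length."""
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        span = self.max_th - self.min_th
        return self.max_p * (self.avg - self.min_th) / span

    def enqueue(self, packet):
        """Queue the packet and return True, or drop it and return False."""
        self.avg = (1 - self.weight) * self.avg + self.weight * len(self.queue)
        if len(self.queue) >= self.capacity:
            return False            # queue full: forced drop
        if self.rng.random() < self.drop_probability():
            return False            # early random drop
        self.queue.append(packet)
        return True


if __name__ == "__main__":
    q = RedQueue()
    accepted = sum(q.enqueue(n) for n in range(100))  # nothing drains
    print("accepted", accepted, "of 100 arrivals")
```

Note that the thresholds, the weight and the maximum probability all have to be tuned to the traffic mix, which is the external management burden described above.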


Copyright © InterProphet Corporation, 1997-1999. All rights reserved.
Silicon TCP™, RAIDBlaster™, EtherSAN™, and WebCellerator™ are trademarks of InterProphet Corporation.