|
Questions
What
is Silicon TCP "really"?
Why NT and not UNIX?
Why is your website so DOC
centric?
Can you use your card with
any PC?
Is a linux driver
available?
Will you supply a
specification?
What about ISCSI?
I see you talk about
EtherSAN products which use Silicon TCP, but that looks
like it's for corporate use. Where can I get a board
for my own use?
Why didn't you do a
gigabit version instead of a 100 mbit?
I see you prefer small
packets. What about fat packets? Aren't fat packets
better?
But won't it then
overload the infrastructure (switches and routers)?
Why did you use an FPGA
instead of an ASIC?
Then if you use an ASIC,
how do you know it works perfectly?
What if you don't have
a TCP packet?
But could you do UDP as well?
What about Netware driver
support?
But Sun could also use this
too. What about Sun?
Well, so what if
you've got great numbers. What about
"latency"? I bet you can't reduce latency.
Well, yah, NT is slow, but
how much would it help, say, linux?
Well, that's neat but can't
I do close to that same rate with a standard card?
But isn't it
preservation of packet order that causes the delays?
But what about IPV6?
How about a PC/104 format
card and MS-DOS drivers?
But can it do webserving?
I'd like to do
distributed clustering. Any interest in that area?
How long have you been
working on this technology?
Everybody needs this. Why
didn't anyone think of this sooner?
Answers
What is Silicon TCP
"really"?
Silicon TCP is a patented
(see "TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols")
combination of hardware and software
techniques which accomplishes TCP protocol processing as
the packets arrive and delivers the correct results
to the application. It can also transmit information in
the same fashion.
Unlike software
implementations, Silicon TCP processes the protocol in a
completely novel yet entirely standard manner. While it
accomplishes the function of software stacks, it is not
implemented like a software stack.
Silicon TCP is not a
protocol engine. It has no processor on the card,
but instead entirely relies on functional logic which is
"programmed" by the packet itself. As such,
implementations can be reduced to as little as 5000 gates
for specific use, and advanced cores of 5 million gates can do the work of
thousands of network processors.
Silicon TCP works
orthogonally and complementarily with the main processor,
by doing the "heavy lifting" of protocol
processing and delivery. It does all stack processing, yet is
itself programmable using "ballistic" means to integrate with OS IPSEC and other
stack extensions. Hence, the main processor is now
freed up for interesting transaction processing and
applications. Update 11/01 - Silicon TCP hardware passes RFC 1121 functions w/o host processor involvement - it's optional!
Why NT and not UNIX?
When we set out to
prove this technology, one of the commonly stated
putdowns was that we would be "mucking" with
the kernel to get any results, and that it would never
fly in a "real world", e.g. Microsoft NT,
situation.
Well, we decided to do
this project entirely on NT systems, using NT-based
development packages ranging from the chip design
(Veribest, FPGA Express, Xilinx Alliance, etc) to
software driver development (Driverworks) to general
development using the Microsoft Developer Network suites
for our applications and low-level driver interfaces.
Our results are based
entirely on our technology. Since we don't have NT
source, we have never "mucked" with the
kernel.
We expect the UNIX
community will soon rectify this lack themselves, since
it is really much easier to do a UNIX driver than work
within the NT framework. We took the harder path
deliberately to not obscure the achievement of the design
team.
Why is your website so DOC
centric?
Sorry. As you can
see above, we've been stuck on these systems for
quite a while, and we get myopic. I had a great
webmaster, but made him use the same tools we do. We just
didn't really think about optimizing our site, since
we never expected so much interest in the first place.
We'll get them to PDF soon, so if you don't have Word,
you can still read them.
Can you use your card with
any PC?
The current cards
you see displayed are standard format 33MHz / 32Bit / 5V
PCI 100 mbit Ethernet plain vanilla. Our demo systems you
see on the web page, for example, are IBM Aptiva PCs I
bought from Circuit City. We figured if they could run in
an IBM PC PCI slot, they'd work pretty much
anywhere.
Is a linux driver available?
We expect once
boards are available linux support will very quickly
follow. Update 11/01 - BSD driver too.
Will you supply a
specification?
Sure will. Two
documents exist: the "Software Programmer's Guide to
Silicon TCP Engines", and the "EtherSAN Board
Programmers Reference Manual". These should be available from partners.
Assuming one has access to
the kernel, one could make additional optimizations which
improve performance still further. We don't have
access to the NT kernel, so we have done no kernel
optimizations.
What about ISCSI?
We waited until the dust settled.
Many current implementations don't suit enterprise storage, and the
disk manufacturers had trouble understanding anything more than UDP.
Update 11/01 - TCP CDB transfers at 100mB/s with 0.35% CPU.
I see you talk about
EtherSAN products which use Silicon TCP, but that looks
like it's for corporate use. Where can I get a board
for my own use?
Since we are a
small start-up which has been totally dedicated to
designing and developing a technology which everyone said
could not be done, we could not expend any resources
towards volume channel development. Since all Internet
content ultimately comes off of a server disk drive or
drive array, we are tightly focussed on getting the
content off of disk drives faster while eliminating the
bottleneck of the server CPU and doing it entirely on the
Internet. This is the EtherSAN concept.
However, we do want to get
this cool technology out to everyone we can!
We are currently talking
with hardware partners who can supply boards through the
correct channels to the end customer as rapidly as
possible, and support them properly.
Why didn't you do a
gigabit version instead of a 100 mbit?
Like the concerns
expressed about possible kernel jiggering, we tried to
keep the rest of the board as straightforward as
possible. This meant standard everything, including
100mbit Ethernet. We wanted to see the pure technology
end-game here and we believe we've done it.
By the way, it was damn
hard for a little design team to even get test equipment
for real 100 mbit, much less gigabit. When we
started, the standards hadn't been finished for
1000BaseT.
The Silicon TCP design
engineer, however, made sure our development system was
easily modifiable to gigabit with a board spin. After
all, he originally came from Cisco and felt it was
important. Update 11/01 - OC192/DWDM core with new adaptive retransmission engine arrives.
I see you prefer small
packets. What about fat packets? Aren't fat packets
better?
Packets are
agglomerated to a "fat" form to reduce stack/OS
overhead, as you surmise. However, fat packets increase
congestion over the network (300 octets or above have
increasing likelihood of not making it through the
network -- cf. "the nature of the beast:
recent traffic measurements
from an Internet backbone"). That is the tradeoff -- reduce
stack overhead with fat packets and increase the
likelihood that it will not make it through the network.
They also have horrible effects on congestion control
mechanisms (increase round trip time variance), so you
can't as easily optimize server output or traffic shape.
But think about this -- what if you could process TCP
entirely in hardware in *real-time* as the packets
arrive, eliminating stack overhead completely? Then,
small packets are better since they are more likely to
make it through the network, and congestion is less
likely to occur. To us, it no longer matters what the
size of the packet is, since we process it in real-time
anyway! We no longer need make that tradeoff between
efficiency in the stack/OS and transmission through the
network! Update 11/01 - Newer trend analysis paper (see "Trends in wide area IP traffic patterns - A view from Ames Internet Exchange") leads us to an enhancement of our design.
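
To put rough numbers on the tradeoff above, here is a minimal sketch of how many packets per second a saturated 100 mbit link carries at different payload sizes. That rate is the per-packet load a software stack would have to absorb, and it is why stacks push toward fat packets. The framing constants are standard Ethernet values; the 40 octets of IPv4 plus TCP headers (no options) and the sample payload sizes are assumptions chosen for illustration, not measured figures.

```python
# Per-frame overhead on Ethernet, in octets (standard values).
PREAMBLE_SFD = 8       # preamble plus start-of-frame delimiter
ETH_HEADER = 14        # destination, source, type
FCS = 4                # frame check sequence
INTERPACKET_GAP = 12   # 96 bit times of idle between frames
TCP_IP_HEADERS = 40    # IPv4 + TCP headers, no options (assumption)
MIN_FRAME = 64         # minimum frame size, header through FCS

LINK_BITS_PER_SEC = 100_000_000  # 100 mbit Ethernet


def wire_octets(payload: int) -> int:
    """Octets of link time consumed by one TCP segment with this payload."""
    frame = max(ETH_HEADER + TCP_IP_HEADERS + payload + FCS, MIN_FRAME)
    return PREAMBLE_SFD + frame + INTERPACKET_GAP


def packets_per_second(payload: int) -> float:
    """Packets per second when the link is saturated with such segments."""
    return LINK_BITS_PER_SEC / (wire_octets(payload) * 8)


if __name__ == "__main__":
    for payload in (64, 256, 512, 1460):
        rate = packets_per_second(payload)
        print(f"{payload:5d} octet payload: {rate:9.0f} packets/s")
```

The smaller the payload, the more packets per second arrive, so any fixed per-packet software cost is paid that many more times. Processing each packet as it arrives takes that cost out of the equation.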
But won't it then
overload the infrastructure (switches and routers)?
In the
long run, one would wish to have the same mechanism we
have developed embedded in the infrastructure (e.g.
routers, switches) as use increases, but since there is a
lot of half-used capacity right now, this is not an
immediate concern.
Why did you use an FPGA
instead of an ASIC?
FPGA was faster
for proving the technology. We used Xilinx because we
also got the PCI core from them, so we could use one
vendor (being a frugal startup, we try to be practical).
Actually, the technology
would work even better in ASIC form. The design engineer
has struggled, not with lack of gates, but with lack of
routing resource in the Xilinx FPGA chips. Even if you
get a bigger chip, you don't get more routing
resource. ASIC eliminates this issue.
Then if you use an ASIC,
how do you know it works perfectly?
In the
case of the chip itself, that's what ASIC
verification and validation are for. We've also been
running test verification between a fully compliant
software stack and our technology, for protocol
correctness. We believe this is very important, and have
been working on this issue for over a year. Update 11/01 - Simulating in ASIC design tools.
What if you don't have
a TCP packet?
We safely
fail over to software stack processing via various
techniques without loss of integrity or duplication of
resource.
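
As a rough software illustration of the decision involved (not the mechanism used on the board, which the answer above does not detail), the sketch below checks whether an Ethernet frame carries IPv4/TCP and otherwise hands it to the software stack. The function names and the "hardware"/"software" labels are hypothetical, and the sketch assumes untagged Ethernet II frames.

```python
import struct

ETHERTYPE_IPV4 = 0x0800   # EtherType value for IPv4
IPPROTO_TCP = 6           # IPv4 protocol number for TCP
ETH_HEADER_LEN = 14
MIN_IPV4_HEADER_LEN = 20


def is_ipv4_tcp(frame: bytes) -> bool:
    """Return True if an untagged Ethernet II frame carries IPv4/TCP."""
    if len(frame) < ETH_HEADER_LEN + MIN_IPV4_HEADER_LEN:
        return False
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    version = frame[ETH_HEADER_LEN] >> 4
    protocol = frame[ETH_HEADER_LEN + 9]   # IPv4 protocol field
    return version == 4 and protocol == IPPROTO_TCP


def dispatch(frame: bytes) -> str:
    """Pick a path for the frame (hypothetical labels)."""
    return "hardware" if is_ipv4_tcp(frame) else "software"
```

Anything that is not recognizably TCP, including truncated frames, takes the software path, so nothing is dropped by the classifier itself.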
But could you do UDP as well?
Since TCP
packets are over 90% of the traffic on the Internet, we
chose to focus on TCP. However, we have had requests for
additional functionality such as UDP and we can supply it
if a hardware vendor can give us a substantial order for
it.
What about Netware driver
support?
Well, I've
talked to Novell a few times, but they have appeared
distracted (as have we), and there's no open source
development group I can rely on here like you can on the
UNIX side. Anyone from the Novell side serious here?
But Sun could also use this
too. What about Sun?
I've
chatted with Sun on occasion, but they seem pretty happy
with their current offerings. If hotbox NT servers become
a reality, though, they might become more interested.
I just think they
don't believe NT can be made competitive, and just
haven't bothered to take any such efforts seriously.
I can't really blame them, since I would have
thought the same a few years ago myself.
Well, so what if
you've got great numbers. What about
"latency"? I bet you can't reduce latency.
Oh, joy,
oh bliss! Someone actually is asking about latency! This
is great! Silicon TCP is actually all about latency
reduction. People have gotten so used to talking about
bandwidth that they've completely forgotten about
the latency issues. So let's talk raw latency in the
software stack.
By the numbers
(measurements courtesy of a big database firm which has an
eclectic CEO) a 200MHz Pentium Pro II running NT
typically incurs a 400 microsecond latency in processing
of the protocol from the time the request (a socket send)
is made to the time the data hits the wire -- almost all
of this consumed in the OS. In addition, the ring
crossings incurred can induce additional delays (context
switches, interrupts) which can make this
measurement even more unpredictable.
Silicon TCP does this
entire stack transaction in under one microsecond. We can
actually process at the interpacket gap speed. Now, a
real transaction has to take into consideration the
Ethernet signalling details, like runt packet inflation
(e.g. 64-octet frame, 960 nanosecond interpacket gap),
and so a typical three phase transaction is limited to 25
microseconds per transaction on 100BaseT full duplex.
Note that because no software resource is tied up with
TCP, this time is a deterministic one instead of a
statistical minimum.
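
The Ethernet figures above can be checked with a little arithmetic. The sketch below computes the wire time of a minimum-size frame on 100BaseT and of a short exchange of such frames. Treating the "three phase transaction" as three minimum-size frames is an assumption made here for illustration; the answer above does not break the 25 microseconds down.

```python
BIT_TIME_NS = 10            # one bit time at 100 mbit
PREAMBLE_SFD_OCTETS = 8     # preamble plus start-of-frame delimiter
MIN_FRAME_OCTETS = 64       # runt-padded minimum frame
INTERPACKET_GAP_BITS = 96   # mandatory idle time between frames


def interpacket_gap_ns() -> int:
    """Idle time that must follow every frame."""
    return INTERPACKET_GAP_BITS * BIT_TIME_NS


def min_frame_slot_ns() -> int:
    """Wire time of one minimum frame including preamble and gap."""
    frame_bits = (PREAMBLE_SFD_OCTETS + MIN_FRAME_OCTETS) * 8
    return frame_bits * BIT_TIME_NS + interpacket_gap_ns()


def exchange_wire_time_us(frames: int) -> float:
    """Wire time in microseconds for a number of minimum-size frames."""
    return frames * min_frame_slot_ns() / 1000


if __name__ == "__main__":
    print("interpacket gap:", interpacket_gap_ns(), "ns")
    print("one minimum frame slot:", min_frame_slot_ns(), "ns")
    print("three frames:", exchange_wire_time_us(3), "us")
```

Under that assumption the frames alone occupy about 20 microseconds of wire time, which sits inside the 25 microsecond figure and shows that the wire, not the stack, sets the floor.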
Well, yah, NT is slow, but
how much would it help, say, linux?
While the NT
software stack is a renowned sloth due to its intense
reliance on layers, the same software benchmarks run on
linux apparently were typically only 5% faster, so yes,
linux could probably benefit.
Well, that's neat but can't
I do close to that same rate with a standard card?
Using standard Netgear 100mbit PCI cards (DEC 32bit NIC
chip) we got around 6 Mbytes per second (it varied a lot,
so I'm being generous here), but at the cost of burning
100 % of the CPU.
In sum, we achieved a
reliable 9 Mbyte per second rate for small packets while
burning about 2 % of the CPU. Wiggling the mouse will use
more CPU.
Now, you tell me. Do you
want to spend your entire time doing nothing else *but*
this, or do you also want to, for example, use a database
as well?
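
For a side-by-side view of the two results, the sketch below works out throughput per percent of CPU from the figures quoted above (6 Mbytes per second at 100% CPU versus 9 Mbytes per second at 2% CPU). The 12.5 Mbyte per second raw line rate for 100 mbit Ethernet is simple arithmetic added here, and the variable names are illustrative only.

```python
RAW_LINE_RATE_MBYTES = 100 / 8   # 100 mbit expressed in Mbytes per second


def mbytes_per_cpu_percent(mbytes_per_sec: float, cpu_percent: float) -> float:
    """Throughput delivered for each percent of CPU consumed."""
    return mbytes_per_sec / cpu_percent


# Figures quoted in the answer above.
standard_card = mbytes_per_cpu_percent(6.0, 100.0)
silicon_tcp = mbytes_per_cpu_percent(9.0, 2.0)

efficiency_ratio = silicon_tcp / standard_card
line_rate_fraction = 9.0 / RAW_LINE_RATE_MBYTES
cpu_left_for_applications = 100.0 - 2.0

if __name__ == "__main__":
    print(f"standard card: {standard_card:.2f} Mbytes/s per CPU percent")
    print(f"Silicon TCP:   {silicon_tcp:.2f} Mbytes/s per CPU percent")
    print(f"ratio:         {efficiency_ratio:.0f}x")
    print(f"share of raw line rate: {line_rate_fraction:.0%}")
    print(f"CPU left over: {cpu_left_for_applications:.0f}%")
```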
But isn't it
preservation of packet order that causes the delays?
Nope. Out of order
does increase the latency in processing the receive stream,
but remember that packet traffic is "bursty"
(cf. Van Jacobson's work on traffic analysis of the
Internet) and that a single out of order processing is
considered sufficient for software stack implementations.
We can do out of order packet processing as well in
hardware, without incurring additional software latency.
In fact, with our mechanism we can do arbitrary out of
order packet processing (N.B. the latency will be the
worst case of aggregate arrival times). However, there is
no market (or even technical) demand for such a product
that we have been told. If there is, we'd love to
hear it.
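
To make the N.B. above concrete, here is a minimal software sketch of arbitrary out-of-order reassembly. It is an illustration only and says nothing about how the hardware does it; the class and method names are hypothetical, and sequence-number wraparound is ignored for brevity. Bytes behind a hole are held until the missing segment shows up, so delivery time is set by the latest arrival.

```python
class ReorderBuffer:
    """Deliver byte-sequenced segments to the reader in order."""

    def __init__(self, initial_seq: int = 0) -> None:
        self.next_seq = initial_seq   # next byte the reader expects
        self.pending = {}             # start sequence -> held segment bytes

    def receive(self, seq: int, data: bytes) -> bytes:
        """Accept one segment; return whatever bytes are now deliverable."""
        if seq + len(data) <= self.next_seq:
            return b""                # nothing new: already delivered
        if seq < self.next_seq:       # partial overlap: trim the old part
            data = data[self.next_seq - seq:]
            seq = self.next_seq
        # If the same start arrives twice, keep the longer segment.
        if len(data) > len(self.pending.get(seq, b"")):
            self.pending[seq] = data
        out = bytearray()
        while self.pending:
            start = min(self.pending)
            if start > self.next_seq:
                break                 # still a hole before this segment
            chunk = self.pending.pop(start)
            skip = self.next_seq - start   # part already delivered
            if skip < len(chunk):
                out += chunk[skip:]
                self.next_seq = start + len(chunk)
        return bytes(out)
```

For example, feeding the buffer the segment at sequence 5 first returns nothing; once the segment at sequence 0 arrives, both are released together.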
But what about IPV6?
Same as above (can
do it) for IP V6, but in all of the meetings I've
had with people who have asked about this, I have yet to
see a serious commitment (sigh).
When you're a little
company, you put the resources only where you can get
interest. I assume if there is real interest in this
area, we will be able to put the resources in place.
How about a PC/104 format
card and MS-DOS drivers?
Nope, no embedded
modules at this time. We stuck with a standard 33 MHz /
32 bit wide 5V PCI plain vanilla 100 mbit Ethernet card.
I assume the appropriate hardware vendor, if demand
warranted, could easily design such a board, but they
would have to use an ASIC version of Silicon TCP hardware
(plus software) due to the very low power demands and
board area considerations of such a format. Definitely a
volume player here.
As to DOS, it's
conceivable, but all our software integration work to
date has been in C++ in VC6 with the usual bloat. I
suppose a TSR could be done, and wheedled into the
address space limitations.
But can it do webserving?
We took the
standard sample MS webserver from the MSDN and made it
work with the board. In a nutshell, sending an
InterProphet logo GIF (25KB) gave us 200
per second continuously with < 5% of the processor. We
are waiting on real CGI numbers for Apache with Web
Bench, but it looks like above 500 per second. We're
resource limited on benchmarks - not enough clients. Update 11/01 - wire speed 894 requests/sec with Apache 1.3.20, 0.5% CPU load, 10k index page.
I'd like to do
distributed clustering. Any interest in that area?
Well, we actually
have a lot of background and interest in clustering. You
definitely are on the cutting edge. If we can just get
this stuff out to people, clustering could actually be
made an economic reality.
Clusters have been limited
to special purpose niches for way too long. The Internet
and electronic services are the
"killer applications" for clustering. The key
here is to apply the careful lessons learned by the SAN
pioneers, and apply them to networking directly -- it's
not a question of if SANs are good or bad, SANs are
good. However, SANs must be able to deliver performance
in the environment of the Internet in order to satisfy
market requirements, without compromising SAN integrity.
How long have you been
working on this technology?
We have been
working on this technology for over two years, and it has
not been a trivial pursuit, as there were many issues to
resolve, and networking is a complicated systems problem.
The young engineering team that I managed made this a
reality, and I am very proud of them. They are the ones
that deserve the accolades, as they accomplished what
everyone said would be impossible -- make TCP efficient
and economic.
Everybody needs this. Why
didn't anyone think of this sooner?
Well, I suppose
that necessity is really the mother of invention. No one
really thought about it -- at least, that's what
I've found in talking to people. That's what
makes for the fun in a start-up -- doing something
that no one has done before and finding that it really
works.
Have a question? Ask Lynne Jolitz.
|