Tuesday, February 16, 2010

Troubleshooting FWSM performance

FWSM (Firewall Services Module) is the firewall blade for Catalyst 6500 switches and 7600 routers.
The blade has been out for a while and has very often proven to be quite a performance bottleneck (AKA pain in the ***), specifically because of its placement (data centers) and the increasing bandwidth demand there (10 Gigabit links, Nexus). A newer device is slowly looming on the horizon, but it may take several years.

So first of all, what is FWSM? It's a PIX in a chassis with a very neat physical architecture.
You should also know that it has a separate logical circuit for management connections from the chassis (the session ... command does a telnet over the backplane). There is also a 6 x gigabit EtherChannel, labeled PoXYZ.

There are two dedicated proprietary chips (if you inspect the blade itself you will notice they are made by IBM) handling active connections.

There is a dedicated network processor that performs access-list, route and xlate lookups.
Coincidentally, it also handles inspected ICMP traffic and ALL IPv6 traffic.

Check out:
http://www.scribd.com/doc/28698783/FWSM-Architecture (page 23)

Finally, there is the actual CPU, where layer 7 inspection is performed and access-list compilation/optimization is done.
High CPU on FWSM does not degrade performance for non-inspected traffic, in most cases anyway.

If you're checking for actual performance problems introduced by FWSM, please do this one test before making any assumptions.

Connect two gig-capable laptops to two different line cards (gig-capable, obviously) and put the laptops into two different VRFs and VLANs, preferably ones that are not advertised outside the chassis.
Add the VLANs to the FWSM, apply an access-group to each VLAN interface on the blade (it can be "permit ip any any") and create statics if nat-control is enabled.
An iperf test with TCP, single transfer, that I did some time ago gave me just above 800 Mbit/s for outbound connections and around 650 Mbit/s for inbound connections, if you want some baseline.

This method is of course not perfect. You could most likely tweak the physical and logical infrastructure to get better numbers, but my idea was to perform a test similar to real life.

Things to remember:
- Make sure that the connection is not inspected if you want decent performance.
- If this is TCP and you're getting poor performance, consider enabling sysopt np completion-unit. This magic option invokes special processing created to address scenarios in which FWSM was known to introduce out-of-order packets in TCP streams.
Please note that the completion unit will not take care of out-of-order packets introduced elsewhere, and it only works for traffic handled by the first two chips.
- Get a packet capture. Do not trust the FWSM capture, do a local SPAN. The FWSM capture is buggy; it's a bit more decent nowadays, but it is still not to be relied on. Cisco TAC usually relies on the FWSM capture functionality, but will ask you to do SPAN in some cases.
Please note that the captures on both sides of the FWSM have to be taken simultaneously.
What are we looking for in the captures? Out-of-order or lost segments within a capture (to some extent you can trust the dissector built into Wireshark for this task), packets dropped by the FWSM (seen in one capture, but not the other), and delay introduced by the FWSM for packets passing through.
- FWSM does not play well with the TCP SACK option, check the headers!
In newer FWSM versions you can clear the SACK option in TCP via sysopt!
- Things that can improve performance and/or analysis: disabling sequence randomization, state bypass (untested).
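
The capture comparison described above can be sketched in a few lines of Python. This is only an illustration, not a TAC tool: the Pkt type, the compare_captures function and the idea of identifying a packet by a key such as (IP ID, TCP sequence number) are my assumptions. It also assumes both SPAN captures were taken on the same capture host (so the timestamps are comparable) and that keys are unique within the flow, which means retransmissions must be filtered out first.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pkt:
    key: tuple   # hypothetical packet identity, e.g. (ip_id, tcp_seq)
    ts: float    # capture timestamp in seconds

def compare_captures(ingress, egress):
    """Compare one direction of one flow captured before and after the firewall.

    ingress/egress are lists of Pkt in capture order. Returns packets that
    never came out, packets that overtook an earlier packet, and the largest
    delay seen between the two capture points.
    """
    # Remember where and when each packet showed up on the egress side.
    seen_out = {}
    for pos, p in enumerate(egress):
        seen_out.setdefault(p.key, (pos, p.ts))

    dropped, reordered, delays = [], [], []
    highest_pos = -1
    for p in ingress:
        if p.key not in seen_out:
            dropped.append(p.key)          # seen in one capture, not the other
            continue
        pos, ts_out = seen_out[p.key]
        delays.append(ts_out - p.ts)
        if pos < highest_pos:
            reordered.append(p.key)        # left before a packet that entered earlier
        else:
            highest_pos = pos
    return {
        "dropped": dropped,
        "reordered": reordered,
        "max_delay": max(delays) if delays else None,
    }

# Made-up example: packet 3 overtakes packet 2, packet 4 is lost.
before = [Pkt((1,), 0.000), Pkt((2,), 0.001), Pkt((3,), 0.002), Pkt((4,), 0.003)]
after = [Pkt((1,), 0.0005), Pkt((3,), 0.0022), Pkt((2,), 0.0030)]
result = compare_captures(before, after)
```

In practice you would export the key fields and timestamps from both capture files and feed them into something like this; Wireshark's own TCP analysis flags remain a useful cross-check.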

So you have a blade and you're getting packet drops and poor performance. Where should you look?
- show np block (hardware buffer counters) - if they are non-zero and increasing, it's bad. You're most likely running into a hardware limitation of the FWSM. Consider running active/active failover to split the traffic across two physical devices, shaping traffic towards the FWSM on the chassis, or bypassing the FWSM completely for some traffic.
- show np all stats | i RTL and show np all stats | i RL will show you whether packets are being dropped by the software rate-limiting mechanisms built into the network processors.
- show np 3 stats can show whether some redundant or excessive traffic is crossing the FWSM.
Good counters to start with:
a) Check that you're not being overwhelmed by one type of traffic:
  TCP Fixups
  UDP Fixups
  ICMP Fixups
These show you how many connection "fixups" were done.
b) Everything starting from "Discard Statistics" is usually interesting if you're looking for dropped packets.
c) Pay attention to "Flow Control: Rate Limit Statistics" - packets rate limited in software by the chip.
d) ARP miss indications - are some hosts on your network sending traffic to hosts which do not exist?
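
"Non-zero and increasing" means you need at least two samples of the same counters taken some time apart. Below is a minimal Python sketch of that comparison, plus a helper for point a). It assumes you have already copied the counters into dictionaries by hand or with your own script; the function names and the example counter names and values other than the three fixup counters are hypothetical, and no attempt is made to parse the real CLI output.

```python
def increasing_counters(before, after):
    """Return {name: delta} for counters that are non-zero and grew between samples."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value > 0 and value > before.get(name, 0)
    }

def fixup_share(fixups):
    """Return each fixup counter as a fraction of the total, to spot a dominant type."""
    total = sum(fixups.values())
    if total == 0:
        return {name: 0.0 for name in fixups}
    return {name: value / total for name, value in fixups.items()}

# Hypothetical samples taken a minute apart (counter names are made up).
sample_1 = {"example drop counter A": 120, "example drop counter B": 0, "example drop counter C": 7}
sample_2 = {"example drop counter A": 450, "example drop counter B": 0, "example drop counter C": 7}
growing = increasing_counters(sample_1, sample_2)

# Made-up fixup values; a single type taking most of the share is worth a closer look.
shares = fixup_share({"TCP Fixups": 900, "UDP Fixups": 50, "ICMP Fixups": 50})
```

A counter that is non-zero but static (like counter C here) points to an event in the past, while one that keeps growing during the problem window is the one to chase.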

TAC and Cisco Systems Engineers can tell you what's going on and what some of the counters mean. Believe me, it's NOT intuitive.

While troubleshooting FWSM performance, ALWAYS start by checking the switching path.
CAM entries should point either to PoXYZ or to the trunk between the two chassis, and ARP entries for the active unit should bear the MAC address of the primary unit.

edited on 21/Jun/2010 - added bug for SACK.
edited on 13/Aug/2010 - clarifications, more about NP3.


Cameron said...

excellent blog. Is there any downside to the 'sysoption np completion-unit' being enabled?

Isam said...

There is no known downside to enabling np completion unit that I saw.
But remember that it's new code and additional processing. Don't expect "one command solution".

Elasa said...

Dear Sir, I found your useful weblog while I was searching for "slow connection with FWSM" and I would appreciate it if you could help me with this issue.
I've recently set up an FWSM with multiple contexts running in routed mode. All the ping tests between servers were successful, but when I set up a database connection I saw that the database application is very slow. I should mention that the only security rule I have is one ACL with permit any any, which is applied to all interfaces. The FWSM software version is 3.2(2).
Would you please help me?

Isam said...


Is it only SQL net application experiencing this problem?
If so .. can you please check if SQLnet inspection is turned on.

If it's turned on and SQL is using redirect:

If inspection is turned on but you're not using redirect - you can remove it (inspection) safely.

If no inspection is turned on - troubleshoot like any TCP flow.
