Monday, June 4, 2007

Oracle Protocol Performance Papers

One of the first things I worked on after joining NetApp was a joint performance test and paper with IBM NFS engineering.

Since NetApp storage systems support NFS, iSCSI, and FCP for data access, we compared the performance of these protocols in a given configuration, and this led to work that improved NFS performance for Oracle database use.

http://www.netapp.com/library/tr/3408.pdf

The paper may be useful to those who are considering deploying Oracle on AIX with NetApp storage; it also provides various host and storage tuning guidelines.

I repeated the exercise with a similar paper on RHEL4 Linux:
http://www.netapp.com/library/tr/3495.pdf
Here we ended up improving the iSCSI driver in Linux 2.6, as well as NFS in Linux 2.6, for Oracle databases.

3 comments:

Tubby said...

Hi Sanjay.

I don't see much activity here - I hope you are still around.

Please clear up something for me - we are getting conflicting messages from different NetApp people. We have severe performance problems and there are two theories:

1) We allowed the aggregate to exceed 80% capacity.
2) We allowed the main volume to exceed 80% capacity.

Both were, in fact, true. We have rectified both conditions, but the performance problem persists. We will do a reallocate when feasible, but I really need to know which of the above conditions is the one to avoid at all costs.

I hope the 80% on the aggregate is the "bad" one, because keeping each volume below 80% is much more of a management nightmare.

We are using a model FAS3020 with 8TB and with software version 7.2.2. There are 51 volumes on the one aggregate with the largest volume being for a 2.4TB Oracle/SAP database.

Your expert opinion will be much appreciated.

Regards,
Alan.

Sanjay Gulabani said...

I am very much around and committed. The conflicting messages arise because there isn't a simple answer: the free space you need is somewhat workload dependent.
One of the factors to look at first: you probably have a 20% snap reserve, which, if not completely used, should be counted towards the 20% free space per aggregate.

I picked Steve Daniel's brain on this; he is a lot more familiar with space reservation than I am, and he will post a comment on your question with more details.

Stephen said...

Alan --

For some workloads and under some circumstances NetApp advises people to not completely fill an aggregate. The problem is that some workloads leave the free space in a very chaotic state, and some workloads care about the state of the free space. Workloads that perform lots of random overwrites tend to fall into the first category. Workloads where write performance is crucial or where the amount of space in active use is large compared to system memory may fall into the second.

So it is possible that your system will run better with a little extra space in the aggregate. Be sure to count space that has been reserved (for volume or aggregate snap reserve or for space reservations) towards the total amount you should keep free.
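As a rough illustration of that accounting, here is a minimal Python sketch. It assumes, following the comments above, that reserved space which is not actually in use counts as part of the free space; the function names, the 20% default, and the figures are hypothetical and are not NetApp tooling or data from Alan's system.

```python
def effective_free_fraction(total_gb, unreserved_free_gb, unused_reserve_gb):
    """Fraction of an aggregate that is effectively free.

    Assumption: unused reserved space (snap reserve, space reservations)
    is counted as free, as suggested in the discussion above.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return (unreserved_free_gb + unused_reserve_gb) / total_gb


def meets_free_target(total_gb, unreserved_free_gb, unused_reserve_gb,
                      target_fraction=0.20):
    """True if the effective free space reaches the target fraction.

    The 20% default is only the rule of thumb mentioned in this thread.
    """
    frac = effective_free_fraction(total_gb, unreserved_free_gb, unused_reserve_gb)
    return frac >= target_fraction


if __name__ == "__main__":
    # Hypothetical figures: 1000 GB aggregate, 120 GB plainly free,
    # 100 GB of reserve that is not currently used.
    print(effective_free_fraction(1000, 120, 100))
    print(meets_free_target(1000, 120, 100))
```

The point of the sketch is only that plainly free space alone (12% here) understates what is available once unused reserve is included.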

There is no need to keep your volumes at 80%. There are circumstances where filling a volume to 100% has caused problems (which is a bug, IMHO), but there is a simple workaround. If this is an issue, then you need to keep the volume to no more than 80% of the biggest size the volume has ever been.

If you wish you can try simply growing the volume by 20% and then shrinking it back to its original size.

For example, suppose you have a 500 GB volume named "vol1":

  vol size vol1 +100g
  vol size vol1 -100g

Now you will never fill the volume too full.
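The 100g in the example is 20% of the 500 GB volume. For other sizes, a small Python sketch can work out the amount and print the same pair of commands. The helper name is hypothetical, the sizes are assumed to be whole gigabytes, and the script only builds the command text; it does not talk to a storage system.

```python
def grow_shrink_commands(volume, size_gb, percent=20):
    """Build the 'vol size' grow and shrink commands for a volume.

    The delta is `percent` of the current size, rounded down to a
    whole number of gigabytes. Only the command strings are produced.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")
    delta_gb = size_gb * percent // 100
    return [
        f"vol size {volume} +{delta_gb}g",  # grow by the delta
        f"vol size {volume} -{delta_gb}g",  # shrink back to the original size
    ]


if __name__ == "__main__":
    for command in grow_shrink_commands("vol1", 500):
        print(command)
```

For the 500 GB "vol1" case this reproduces the two commands shown above.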

If you want me to take a look at your current situation and render an opinion, feel free to mail me directly with some data and a NetApp case number. You should be able to email me using my name, "Steve.Daniel" in the domain "netapp.com"

-- Steve Daniel
Director, Database Platform and Performance Technology
NetApp