Veritas File System (VxFS) tuning
Preamble
In addition to Veritas Volume Manager (VxVM), Symantec also offers Veritas File System (VxFS), which is most of the time used in combination with VxVM. Symantec claims that the highest benefit is obtained when using both together. This document has been written using Red Hat Enterprise Linux Server release 5.5 (Tikanga) and the VxFS/VxVM releases below:
To avoid putting a real server name in your document, think of something like:
VxFS file system physical parameters
When creating a file system there are two important characteristics to choose:
- Block size (cannot be changed once the file system has been created)
- Intent log size (can be changed after file system creation with fsadm, VxFS usually performs better with larger log sizes)
Remark:
When using VxVM with VxFS, Symantec recommends using vxresize (instead of vxassist and fsadm) to combine the volume and file system shrink or grow operations in a single step.
Block size
From man mkfs_vxfs command (-o bsize=bsize):
File system size | Default block size |
---|---|
0 TB to 1 TB | 1k |
>1 TB | 8k |

Similarly, the block size determines the maximum possible file system size, as given on the following table:

Block size | Maximum file system size |
---|---|
1k | 32 TB |
2k | 64 TB |
4k | 128 TB |
8k | 256 TB |
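The maximum sizes quoted above follow a simple pattern: each one equals the block size multiplied by 2^35 blocks. This is only an observation drawn from the figures in the table, not something the man page states, and the helper name below is hypothetical. A minimal sketch:

```python
# Inferred from the mkfs_vxfs table above: every row equals
# block_size * 2**35 blocks. This constant is an observation,
# not a value taken from the Veritas documentation.
INFERRED_MAX_BLOCKS = 2 ** 35

TB = 1024 ** 4


def max_fs_size_tb(block_size_bytes):
    """Return the maximum file system size (in TB) for a VxFS block size."""
    if block_size_bytes not in (1024, 2048, 4096, 8192):
        raise ValueError("the table only lists 1k, 2k, 4k and 8k block sizes")
    return block_size_bytes * INFERRED_MAX_BLOCKS // TB


for bsize in (1024, 2048, 4096, 8192):
    print(f"{bsize // 1024}k -> {max_fs_size_tb(bsize)} TB")
```

Since the block size cannot be changed after creation, this kind of check is mostly useful before running mkfs, to make sure the chosen block size leaves room for future growth.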
Recommended block sizes for Oracle file systems (assuming your Oracle database has a block size equal to or bigger than 8KB):
File System | Block Size |
---|---|
Oracle software and dump/diagnostic directories | 1KB |
Redo log directory | 512 bytes for Solaris, AIX, Windows, Linux and 1KB for HP-UX |
Archived log directory | 1KB |
Control files directory | 8KB (control files block size is 16KB starting with Oracle 10g) |
Data, index, undo, system/sysaux and temporary directories | 8KB |
You can check control file block size with (Linux RedHat 5.5 and Oracle 11.2.0.3):
Remark:
For Oracle releases lower than 10g, the control file block size was equal to the Oracle initialization parameter db_block_size; starting with 10g it is 16KB whatever the value of db_block_size.
You can check redo log block size with (Linux RedHat 5.5 and Oracle 11.2.0.3):
Remark:
As disks with a 4KB sector size are slowly coming onto the market, Oracle 11gR2 now offers the capability to create redo log files with the block size of your choice…
Once the file system has been created, the command to see which block size was chosen is fstyp. On HP-UX 11iv3 (11.31) the parameter to look at is f_frsize:
On Linux (RedHat 6.3) the parameter to look at is bsize:
Remark:
While a bigger file system block size is an obvious win from a fragmentation point of view, it is not so straightforward from a performance perspective. See the article Analyzing the impact of the Vxfs filesystem block size on Oracle in the References section, which nearly overturns what I had believed for years…
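As a cross-check that does not depend on fstyp, the fragment size reported by the POSIX statvfs call can also be read from Python. This is a generic sketch, not a VxFS tool: the mount point is a placeholder, and it assumes that f_frsize reflects the file system block size in the same way as the f_frsize field mentioned for HP-UX.

```python
import os


def fragment_size(mount_point):
    """Return f_frsize (fundamental block size, in bytes) for a mount point."""
    return os.statvfs(mount_point).f_frsize


# "/" is only a placeholder; replace it with the VxFS mount point to inspect.
print(fragment_size("/"))
```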
Intent log size
From man mkfs_vxfs command (-o logsize=n):
Block size | Minimum log size | Maximum log size |
---|---|---|
1k | 256 blocks | 262,144 blocks |
2k | 128 blocks | 131,072 blocks |
4k | 64 blocks | 65,536 blocks |
8k | 32 blocks | 32,768 blocks |

The default log size increases with the file system size, as shown on the following table:

File system size | Default log size |
---|---|
0 MB to 8 MB | 256k |
8 MB to 512 MB | 1 MB |
512 MB to 16 GB | 16 MB |
16 GB to 512 GB | 64 MB |
512+ GB | 256 MB |
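Converted to bytes, the minimum and maximum in the first table are the same for every block size (256 KB and 256 MB), so both tables can be captured in a few lines. The sketch below is only a restatement of the figures quoted above; the function names are hypothetical, and the handling of exact boundaries (for instance which row a file system of exactly 8 MB falls in) is an assumption.

```python
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3

# Bounds from the first table, expressed in bytes.
MIN_LOG_BYTES = 256 * KB
MAX_LOG_BYTES = 256 * MB

# (upper bound of file system size, default log size) from the second table.
# Boundary handling is an assumption: the table does not say which row
# applies to a size that falls exactly on a limit.
DEFAULT_LOG_SIZES = [
    (8 * MB, 256 * KB),
    (512 * MB, 1 * MB),
    (16 * GB, 16 * MB),
    (512 * GB, 64 * MB),
]


def default_log_blocks(fs_size_bytes, block_size_bytes=1024):
    """Default intent log size, in file system blocks."""
    for upper, log_bytes in DEFAULT_LOG_SIZES:
        if fs_size_bytes <= upper:
            return log_bytes // block_size_bytes
    return 256 * MB // block_size_bytes


def log_block_range(block_size_bytes):
    """(minimum, maximum) intent log size, in file system blocks."""
    return (MIN_LOG_BYTES // block_size_bytes,
            MAX_LOG_BYTES // block_size_bytes)


print(default_log_blocks(6 * GB))  # a 6 GB file system with 1k blocks
print(log_block_range(8192))
```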
Intent log size:
Remark:
If the fsadm command complains with something like:
then look for the VxFS binaries in the /opt/VRTS/bin directory. Be careful when changing the PATH, because simple tools like df will not behave the same, as Symantec has rewritten them.
So in my 6GB file system example:
The default block size (1KB) has been chosen, and so has the default intent log size of 16384 blocks, i.e. 16MB.
So which intent log size should you choose? Symantec says that recovery time increases with a larger intent log, while VxFS performs better with a larger intent log. As you obviously want to tune for the 99.99% of the time when your system is up and running, you should consider creating a large intent log, keeping in mind that the behavior must be checked while the application is running (there is no clear Oracle recommendation)…
File extents
As with Oracle table extents, you can change the default extent allocation policy and/or preallocate space for a file:
Remark:
An extent size of 0 uses the default extent allocation policy. See vxtunefs for the policy description (the parameters are initial_extent_size and max_seqio_extent_size).
Small example with an empty file:
Now changing its extent size and initial allocation:
Please note that it takes a bit of time to recover the free space when deleting this test file.
Fixed extent sizes and Oracle? I would say they are beneficial for Oracle datafiles because they avoid fragmentation, but if, like me, you work with the autoextend feature, then do not set too small a next extent and you will achieve the same behavior.
VxFS file system tuning
Tunable filesystem parameters
If file systems are used with VxVM, it is suggested to keep the default values, so test carefully when changing them…
Remark:
When using VxFS with VxVM, VxVM by default breaks up I/O requests larger than 256K.
File system fragmentation
To display it issue:
Remark:
Symantec recommends performing regular file system defragmentation (!!):
In general, VxFS works best if the percentage of free space in the file system does not get below 10 percent. This is because file systems with 10 percent or more free space have less fragmentation and better extent allocation. Regular use of Veritas df command (not the default OS df) to monitor free space is desirable (man df_vxfs).
An unfragmented file system has the following characteristics:
- Less than 1 percent of free space in extents of less than 8 blocks in length
- Less than 5 percent of free space in extents of less than 64 blocks in length
- More than 5 percent of the total file system size available as free extents in lengths of 64 or more blocks
A badly fragmented file system has one or more of the following characteristics:
- Greater than 5 percent of free space in extents of less than 8 blocks in length
- More than 50 percent of free space in extents of less than 64 blocks in length
- Less than 5 percent of the total file system size available as free extents in lengths of 64 or more blocks
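The two sets of thresholds above can be turned into a small classifier. This is a sketch under assumptions: the three percentages have to be read from the fragmentation report yourself, the function and argument names are hypothetical, and the "in between" label for file systems matching neither list is my own wording rather than Symantec's.

```python
def classify_fragmentation(pct_free_lt8, pct_free_lt64, pct_fs_free_ge64):
    """Classify a file system from three percentages.

    pct_free_lt8     -- % of free space in extents smaller than 8 blocks
    pct_free_lt64    -- % of free space in extents smaller than 64 blocks
    pct_fs_free_ge64 -- % of total file system size available as free
                        extents of 64 blocks or more
    """
    # Badly fragmented: one or more of the conditions is enough.
    if pct_free_lt8 > 5 or pct_free_lt64 > 50 or pct_fs_free_ge64 < 5:
        return "badly fragmented"
    # Unfragmented: all three conditions must hold.
    if pct_free_lt8 < 1 and pct_free_lt64 < 5 and pct_fs_free_ge64 > 5:
        return "unfragmented"
    # Neither list applies; this label is not from the documentation.
    return "in between"


print(classify_fragmentation(0.5, 3, 20))
print(classify_fragmentation(6, 10, 20))
```

The asymmetry is deliberate: a single bad indicator is enough to call the file system badly fragmented, while all three good indicators are required to call it unfragmented.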
Mount options
Suggested mount options for Oracle databases:
File System | Normal Mount Options (VxFS) | Advanced Mount Options (VxFS) |
---|---|---|
Oracle software and dump/diagnostic directories | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,nolargefiles |
Redo log directory | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
Archived log directory | delaylog,datainlog,nolargefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,nolargefiles |
Control files directory | delaylog,datainlog,nolargefiles | delaylog,datainlog,nolargefiles |
Data, index, undo, system/sysaux and temporary directories | delaylog,datainlog,largefiles | delaylog,nodatainlog,convosync=direct,mincache=direct,largefiles |
Remark:
The licensed product Concurrent I/O (CIO) should also be considered when looking for I/O performance while running an Oracle database.
Database accelerators
Veritas Extension for Oracle Disk Manager
I found this in the Veritas File System Administrator’s Guide, and in more depth in Veritas Storage Foundation: Storage and Availability Management for Oracle Databases, and it’s like rediscovering hot water. What is this Oracle Disk Manager (ODM)?
From Symantec documentation:
The benefits of using Oracle Disk Manager are as follows:
- True kernel asynchronous I/O for files and raw devices
- Reduced system call overhead
- Improved file system layout by preallocating contiguous files on a VxFS file system
- Performance on file system files that is equivalent to raw devices
- Transparent to users
Oracle Disk Manager improves database I/O performance to VxFS file systems by:
- Supporting kernel asynchronous I/O
- Supporting direct I/O and avoiding double buffering
- Avoiding kernel write locks on database files
- Supporting many concurrent I/Os in one system call
- Avoiding duplicate opening of files per Oracle instance
- Allocating contiguous datafiles
From Oracle documentation:
Oracle has developed a new disk and file management API called odmlib, which is marketed under the feature Oracle Disk Manager (ODM). ODM is fundamentally a file management and I/O interface that allows DBAs to manage larger and more complex databases, whilst maintaining the total cost of ownership.
Oracle Disk Manager (ODM) is packaged as part of Oracle9i and above; however, you’ll need a third party vendor’s ODM driver to fully implement Oracle’s interface. For example, Veritas’ VRTSodm package (in Database Edition V3.5) provides an ODM library. Other vendors such as HP and Network Appliance (DAFS) have also announced support and integration of ODM.
A bit of history can be found in this Veritas slide:
Remark:
ODM is an integrated solution and is considered the replacement for Quick I/O.
Let’s confirm the option is available and usable:
Looking at the documentation on how to configure it, I was surprised to see that it is already there:
So what is the difference between the library from the Veritas package and this one? I have the feeling that the Oracle one is a dummy library kept for link consistency, and that in any case you must use the one coming from the Veritas package.
Once the database has been restarted, you should see this in the alert log file located in the ADR (Automatic Diagnostic Repository):
Once activated (the /dev/odm/fid file is not empty; fid stands for File Identification Descriptor), you can find usage statistics in:
And using odmstat:
Once ODM is activated, you no longer have to bother with mount options and file system properties, as ODM performs direct I/O (raw) and works in kernelized asynchronous I/O (KAIO) mode.
Remark:
It is strongly suggested to back up your database files before deactivating ODM.
Veritas Cached Oracle Disk Manager
As we have seen, ODM bypasses the file system cache and so performs direct I/O with no read-ahead (and, as we know, a read-intensive database can suffer from this). Cached ODM (CODM) implements selective cached I/O. What does that mean? Better than a long explanation, from the Symantec documentation:
ODM I/O bypasses the file system cache and directly reads from and writes to disk. Cached ODM enables selected I/O to use caching (file system buffering) and read ahead, which can improve overall Oracle DB I/O performance. Cached ODM performs a conditional form of caching that is based on per-I/O hints from Oracle. The hints indicate what Oracle will do with the data. ODM uses these hints to perform caching and read ahead for some reads, but ODM avoids caching other reads, possibly even for the same file.
CODM is an extension of ODM (which is therefore a prerequisite). Check that the CODM package is installed with:
Activate it on a file system using (use /etc/vx/tunefstab to make it persistent across reboots):
Then use setcachefile and getcachefile odmadm parameters to change individual files:
The cachemap maps file type and I/O type combinations to caching advisories. You can tune it using the setcachemap and getcachemap odmadm parameters. List of available parameters:
On top of the complexity of understanding which files can benefit from caching, the cachemap has so many values to tune that it becomes impossible to tune CODM manually without guidance. Please note that cachemap settings are not persistent across reboots; use the /etc/vx/odmadm file to make them so. So how do you approach this?
It is advised not to change the default cachemap, to avoid drawbacks such as double caching between the file system cache and the Oracle SGA. To understand which files can benefit from CODM you have two options:
- Use a Veritas tool called Cached ODM Manager (dbed_codm_adm) that can be used by DBAs.
- Generate AWR reports (Oracle 10g and above) and order tablespaces/files by reads; the datafiles with the highest physical reads would benefit from CODM.
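For the AWR approach, the ranking step itself is trivial once the per-file read counts have been copied out of the report. The sketch below uses made-up file names and figures purely as placeholders; how the numbers are extracted from AWR is left out, and the function name is hypothetical.

```python
# Made-up sample: datafile name -> physical reads over the AWR interval.
# Names and figures are placeholders, not measurements.
sample_reads = {
    "users01.dbf": 120_000,
    "data01.dbf": 4_500_000,
    "index01.dbf": 2_300_000,
    "system01.dbf": 80_000,
}


def rank_by_reads(reads_by_file, top_n=2):
    """Return the top_n file names ordered by physical reads, highest first."""
    ranked = sorted(reads_by_file.items(),
                    key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_n]]


print(rank_by_reads(sample_reads))
```

The files at the top of such a list are the first candidates to test with caching enabled.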
Putting it all together, it starts to be a bit complex, as three caches compete for memory:
- Oracle SGA
- File system cache
- CODM (ODM)
So where should the available memory go? The added value of CODM is dynamic allocation, whereas the SGA is not dynamic (SGA_MAX_SIZE / MEMORY_MAX_TARGET). And CODM versus the file system cache? CODM has much finer granularity as a per-file cache, so you can activate it only for the files where it is really needed (using AWR and/or dbed_codm_adm).
got from http://blog.yannickjaquier.com/
References
- Storage Foundation DocCentral
- Pros and Cons of Using Direct I/O for Databases [ID 1005087.1]
- Async IO does not appear to be in use by Oracle on VxFS [ID 756275.1]
- Master Note for Oracle Disk Manager [ID 1226653.1]
- Cached Oracle Disk Manager: Usage Guidelines and Best Practices
- Controlfile Structure
- Oracle Internals – 2 (Oracle Controlfile Structures )
- Log Block Size
- Oracle redo logs use a different blocksize
- Understanding 4KB Sector Support for Oracle Files
- Analyzing the impact of the Vxfs filesystem block size on Oracle