Saturday, January 29, 2011

Cannot reduce filesystem size

I have a little issue reducing the fs on AIX 5.3. Here is what I get:

root@dccccc-svc/etc>df -g /data/edw/init_stg
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/init_stg_lv01    793.00    425.25   47%     1081     1% /data/edw/init_stg
root@dccccc-svc/etc>oslevel -s
5300-08-03-0831
root@dccccc-svc/etc>

root@dccccc-svc/etc>chfs -a size=-10G /data/edw/init_stg
chfs: There is not enough free space to shrink the file system.

System Model: IBM,9131-52A
Machine Serial Number: 0xxxxxxxxxx
Processor Type: PowerPC_POWER5
Processor Implementation Mode: POWER 5
Processor Version: PV_5_3
Number Of Processors: 4
Processor Clock Speed: 1648 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 3 dccccc-normal
Memory Size: 32000 MB
Good Memory Size: 32000 MB
Platform Firmware level: SF240_332
Firmware Version: IBM,SF240_332

That happens when you try to remove a big chunk of space (in this case
10G) and the free space in the filesystem is not contiguous because you
have files scattered everywhere.

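1. Try to defrag the FS. The -s flag only reports fragmentation; run
defragfs without flags to actually defragment:

#defragfs -s /data/edw/init_stg
#defragfs /data/edw/init_stg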

2. If you still can't reduce it after this, try reducing the FS in
smaller chunks.
Instead of 10G at a time, try reducing by 1 or 2 gigs, then repeat the
operation.

3. Try looking for large files using the find command and move them out
temporarily, just to see if we can shrink the fs without them
(-size +2048 matches files larger than 2048 512-byte blocks, i.e. 1 MB):

#find /<filesystem> -xdev -size +2048 -ls|sort -rn +6|pg

4. Sometimes processes open big files and use lots of temporary space in
that filesystem.
You could check processes/applications running against the filesystem
and stop them temporarily, if you can.
#fuser -cu[x] <filesystem>
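
Step 2 above can be scripted. The following is only a minimal sketch:
shrink_in_steps is a hypothetical helper name, not an AIX command, and it
assumes whole-gigabyte steps using the same chfs -a size=-NG form shown in
the question. By default it only prints the commands it would run; set
DRY_RUN=0 to actually call chfs.

```shell
#!/bin/sh
# Sketch: shrink a filesystem in small steps instead of one large step.
# shrink_in_steps is a hypothetical helper, not part of AIX.
# DRY_RUN defaults to 1, so the chfs commands are only printed.

shrink_in_steps() {
    fs=$1         # mount point, e.g. /data/edw/init_stg
    total_gb=$2   # total amount to remove, in whole GB
    step_gb=$3    # amount to remove per attempt, in whole GB

    # Guard against a zero or negative step, which would loop forever.
    if [ "$step_gb" -le 0 ]; then
        echo "step must be greater than 0" >&2
        return 2
    fi

    remaining=$total_gb
    while [ "$remaining" -gt 0 ]; do
        # The last step may be smaller than step_gb.
        if [ "$remaining" -lt "$step_gb" ]; then
            step=$remaining
        else
            step=$step_gb
        fi
        if [ "${DRY_RUN:-1}" -eq 1 ]; then
            echo "chfs -a size=-${step}G $fs"
        else
            # Stop at the first failure; earlier steps stay applied.
            chfs -a size=-"${step}"G "$fs" || return 1
        fi
        remaining=$((remaining - step))
    done
}

# Example: print the commands for removing 10G in 2G steps.
shrink_in_steps /data/edw/init_stg 10 2
```

If one step fails, the steps that already succeeded are kept, so you at
least learn how far the filesystem can currently be shrunk.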

Please, let me know if this works.

------------------------------------------
Explanation of the behavior of shrinkfs:

At the beginning of the JFS2 filesystem, there is the superblock, the
superblock backup, and then the data and metadata of the filesystem.  At
the end is the inline log (if there is one), and the fsck working area.

The way the filesystem shrink works is this:  When chfs is run and a
size is given (either -NUM or an absolute NUM size) AIX calculates where
that exists within the filesystem.  This marker is known as "the fence".
The system then calculates how much data is left outside the fence, that
must be moved inside it (since we don't want to lose data).  It
calculates the free space available, and subtracts a minimal amount for
the fsck working area and inline log (if any) that must go at the tail
end of the filesystem.
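
The arithmetic described above can be sketched with plain shell integer
math. This is only an illustration: fence_budget is a hypothetical helper,
all numbers are invented whole gigabytes, and the real chfs works on
filesystem blocks with its own internal accounting. It also checks only
the total amount of space, not whether that space is contiguous.

```shell
#!/bin/sh
# Sketch of the "fence" space budget. fence_budget is a hypothetical
# helper; every number passed to it is an assumed value in whole GB.

fence_budget() {
    fs_size=$1       # current filesystem size
    shrink_by=$2     # amount requested to remove
    used_outside=$3  # data currently stored beyond the fence
    free_inside=$4   # free space in front of the fence
    reserve=$5       # room kept for fsck working area and inline log

    fence=$((fs_size - shrink_by))     # new end of the filesystem
    usable=$((free_inside - reserve))  # free space left for moved data

    echo "fence at ${fence}G, must move ${used_outside}G, usable free space ${usable}G"

    # Succeeds only if the data outside the fence fits in the usable space.
    [ "$usable" -ge "$used_outside" ]
}

# Example with made-up numbers for a 793G filesystem shrunk by 10G.
fence_budget 793 10 4 420 1 && echo "space budget is sufficient"
```

Passing this check is necessary but not sufficient, because it says
nothing about how the free space is laid out.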

What chfs has to do is some complex calculating: in the area outside the
fence, is there any data to be saved and moved inside? In the area
inside the fence, how much data is there?  Is it contiguous?  How much
free space is there we have to play with?  Is there enough space to move
the data from outside the fence inside it to save it?  And lastly, is
there enough space to move the fsck working area and inline logs inside
also along with these?

It does not try to reorganize the data in any way.  If a large file
outside the fence is made up of contiguous extents, then AIX looks for
an equivalent contiguous free space area inside the fence to move the
file to.  If it can't find one, either due to a lack of space or free
space fragmentation, it fails this operation and won't shrink the
filesystem.  The chfs shrink will also not purposely fragment a file to
force it to fit within fragmented free space.
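
A toy model makes the contiguity rule easier to see. The sketch below is
not how chfs is implemented: can_shrink is a hypothetical helper, sizes
are abstract units, and the first-fit placement order is an assumption
made for illustration. It takes the sizes of the free regions inside the
fence and the sizes of the extents outside it, and never splits an extent
across two free regions.

```shell
#!/bin/sh
# Toy model: can every extent outside the fence be moved, unsplit, into
# a contiguous free region inside the fence?
# can_shrink is a hypothetical helper; first fit is an assumed strategy.
#
# Usage: can_shrink "FREE_RUNS" "EXTENTS"
#   FREE_RUNS  sizes of contiguous free regions inside the fence
#   EXTENTS    sizes of contiguous extents outside the fence

can_shrink() {
    free_runs=$1
    for extent in $2; do
        placed=0
        new_runs=""
        for run in $free_runs; do
            if [ "$placed" -eq 0 ] && [ "$run" -ge "$extent" ]; then
                run=$((run - extent))   # take the space from this region
                placed=1
            fi
            new_runs="$new_runs $run"
        done
        if [ "$placed" -eq 0 ]; then
            echo "no contiguous free region of $extent units inside the fence"
            return 1
        fi
        free_runs=$new_runs
    done
    echo "shrink possible"
}

# 12 free units in total, but no single region can hold a 5-unit extent.
can_shrink "3 3 3 3" "5" || true

# Same total free space, laid out so that one region is large enough.
can_shrink "3 6 3" "5"
```

The first call fails even though total free space is more than twice the
extent size, which is the situation the error message in the question
reflects.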

In some cases running defragfs on the filesystem to defragment the files
will help, but many times it doesn't.  The reason is that the purpose
of defragfs is to coalesce files into more contiguous extents, not
to coalesce the free space in between them.

If non-contiguous free space is the issue, the only way to get it to
coalesce into large enough regions is to back up the data, remove it,
and restore it.  Then, when chfs is run, the filesystem shrink may find
enough contiguous free space to move the data outside the fence into.

There's a limit to how much chfs can shrink a filesystem. This is
because chfs has to take into account not only the data it is
moving around, but also the need to keep the contiguous blocks of data
in files contiguous. So if you have a filesystem with a lot of free
space that is broken up into small areas, and chfs has to move
large files, it may fail even though it looks like you have a lot of
space left to shrink.

The free space reported by the df command is not necessarily the space
that can be truncated by a shrinkFS request, because of filesystem
fragmentation. A fragmented filesystem may not be shrunk if it does
not have enough free space for an object to be moved out of the region
to be truncated, and shrinkFS does not perform filesystem
defragmentation. In this case, the chfs command should fail with the
returned message: chfs: There is not enough free space to shrink the
file system - return code 28 (ENOSPC).

One of the common things we see that limits customers is the
presence of large, unfragmented files in a filesystem, such as binary
database files.  If a filesystem consists of a few extremely large
files, then depending on how they are laid out, chfs may fail to find
enough contiguous space inside the fence to hold the data that currently
sits outside it.
