Journaling for ext2fs, TEST release 0.0.5e

Released 5 January, 2001
Stephen Tweedie <sct@redhat.com>

Changes in this release
-----------------------

0.0.5e:

Added barrier support: it is now possible to suspend all journal
activity on a filesystem temporarily, forcing the journal into a clean
state.

Added support for forcing data-journaling on a per-inode basis.  Enable
this on the quota files.

Fix O_SYNC for the writeback data journaling mode.

In ext3_orphan_del, lock the superblock _before_ testing the orphan list
to make the entire operation atomic.

When rename overwrites and deletes a file, make sure that file gets put
on the orphan list.

Fix the initialisation of a new journal: make sure that the journal
superblock is pre-zeroed along with the rest of the journal.

Don't mount or remount a journal with unrecognised features.

{ For older changes, see the file CHANGES. }


Introduction
------------

What is journaling?

    * It means you don't have to fsck after a crash.  Basically.

What works?

    * Journaling to a journal file on the journaled filesystem

    * Automatic recovery when the filesystem is remounted

    * All VFS operations (including quota) should be journaled

    * All data updates are also journaled


What is left to be done?

    * Quota support for non-data-journaled filesystems.

    * Journaling to an off-filesystem device, e.g. NVRAM

    * Decent documentation!

    * A few internal cleanups: migrating the extra buffer_head fields to
      a separate jfs_buffer_info field in particular.

How to apply
------------

This README should have come with two diffs for kernel version
linux-2.2.19pre6:

  -rw-rw-r-- 1 sct sct 439133 Jan  5 17:05 linux-2.2.19pre6.ext3.diff
  -rw-rw-r-- 1 sct sct 218132 Jan  5 16:55 linux-2.2.19pre6.kdb.diff

The kdb diff is a copy of SGI's kdb kernel debugger patches.  Apply it
first if you want kdb.  The ext3 diff is the ext3 filesystem itself.  If
you apply it without the kdb diff, you will get a couple of rejects (the
ext3 diff includes a kdb module for interrogating jfs data structures);
ignore those.
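
As a rough sketch of that ordering, the following prints the patch
commands instead of running them.  The "-p1" strip level and the idea
that the diffs sit one directory above the kernel tree are assumptions,
not something this README states; check the paths inside the diffs
before applying anything.

```shell
#!/bin/sh
# Print (do not run) the patch commands in the order described above.
# Assumptions: run from the top of the kernel source tree, with the
# diffs in the parent directory, and a strip level of -p1.
print_patch_cmds() {
    want_kdb=$1    # "yes" to include the optional kdb patch
    if [ "$want_kdb" = "yes" ]; then
        echo "patch -p1 < ../linux-2.2.19pre6.kdb.diff"
    fi
    echo "patch -p1 < ../linux-2.2.19pre6.ext3.diff"
}

print_patch_cmds yes
```

Calling print_patch_cmds with anything other than "yes" leaves out the
kdb line, which is the case where the rejects mentioned above appear.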

2.2.19pre6 includes a significant VM fix, so it is recommended to use it
instead of 2.2.18.  However, you can use 2.2.18 too: the file

  -rw-rw-r-- 1 sct sct 21068 Jan  5 15:37 ext3-0.0.5d_to_0.0.5e.diff

in the archive should bring an ext3-0.0.5d system up to 0.0.5e, and the
ext3-0.0.5d patches were made against a plain 2.2.18 kernel.

If you can't apply kernel patches, stop reading this now.  Right now!

Now, configure the kernel, saying YES to "Enable Second extended fs
development code" (I *assume* you want it!), and build it.

The release also includes packaged versions of e2fsprogs which include
mke2fs and e2fsck support for ext3 filesystems.


What next?
----------

Now, you want to make a journaled filesystem (recommended) or journal an
existing one.  Great.  Making a new ext3 filesystem is easy: you just
use the mke2fs from the e2fsprogs in ext3-0.0.5d, and use the
"-jsize=<n>" option when running mke2fs.  This tells mke2fs to create an
ext3 filesystem with a hidden journal of <n> MB.  10MB is a good size to
choose.

To upgrade an existing ext2 filesystem to ext3, first of all mount the
filesystem you want to journal.  (See below for special instructions
for the root filesystem.)

Be aware that the jfs patch does _not_ change the ext2 code.  Rather, it
makes a copy of ext2 called ext3, and all the fancy footwork takes place
in that.  You don't have to run ext3 on all your valuable filesystems:
just use it on the throwaway ones.

Now, create a journal file.  I don't know how big it should be yet: the
rules of thumb have yet to be established!  However, try (say) 2MB for a
small filesystem on a 486; maybe up to 30MB on a big 18G 10krpm Cheetah.
Or whatever you want.  You need at least 1024 blocks for the journal, so
on a filesystem with a blocksize of 4k the minimum journal is 4MB.
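
The 1024-block minimum can be worked out for any block size with a
little shell arithmetic.  This is only a sketch of the rule stated
above; the function name min_journal_kb is made up for illustration.

```shell
#!/bin/sh
# Minimum journal size in KB: 1024 blocks of the filesystem's block size.
min_journal_kb() {
    block_size_bytes=$1
    echo $(( 1024 * block_size_bytes / 1024 ))
}

min_journal_kb 4096    # 4k blocks: 4096 KB, i.e. 4MB
min_journal_kb 1024    # 1k blocks: 1024 KB, i.e. 1MB
```

Because the result is in KB, it is also the smallest usable "count" for
a dd run with bs=1k.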

You'll need to make sure that the file is preallocated, so use something
like:

	dd if=/dev/zero of=/mnt/sparefs/journal.dat bs=1k count=10000

assuming you want a 10MB journal on a 1k ext2 filesystem mounted on
/mnt/sparefs.  You need to find the journal inode's inode number, too:

	ls -i /mnt/sparefs/journal.dat

For a newly created filesystem, this will probably show

        12 journal.dat

OK, 12 is the expected number for a clean fs.  You might want to do a
"chmod 400 journal.dat" right now to make sure that nobody will be
able to poke around in the journal once it is running (don't worry,
ext3 will be able to write to the journal even if you specify a
read-only access mode for the file).
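
If you would rather not read the inode number off the screen, a small
helper can pull it out of the "ls -i" output.  This is a sketch only:
the function name journal_inode is hypothetical, and the demonstration
runs against a scratch file rather than a real journal.

```shell
#!/bin/sh
# Print the inode number of a file: the first field of "ls -i".
journal_inode() {
    ls -i "$1" | awk '{ print $1 }'
}

# Demonstrate on a temporary file; substitute the real journal path,
# e.g. /mnt/sparefs/journal.dat, when doing this for real.
tmp=$(mktemp)
echo "inode number: $(journal_inode "$tmp")"
rm -f "$tmp"
```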

Now, umount as ext2.  Take a deep breath.  Now mount as ext3, giving it
the inode number of the file to be used as the journal:

	mount -t ext3 /dev/sdb2 /mnt/sparefs -o journal=12

Bingo.  That's it.  Enjoy!

Note: The "-o journal=<nnn>" bit is only necessary when creating a new
journal the first time you mount a filesystem as ext3.  Do _not_ add
it to /etc/fstab: it will do no good at all there.

Warning: the journaling will get _seriously_ confused if you try to
delete the journal file.  Future versions of ext3 will protect the
journal automatically, but for now you probably want to make it into an
immutable file to guard it:

	chattr +i /mnt/sparefs/journal.dat

Setting the immutable bit will not prevent the filesystem from writing
to the journal internally, but it will stop any other processes from
modifying or removing the journal.


Creating a journal on your root filesystem
------------------------------------------

How do you add the "-o journal=<nnn>" to the mount options for the
root filesystem?  Obviously, / gets mounted for you by the kernel, so
you can't add it on the mount command.  However, ext3 comes with a
new kernel boot option, "rootflags=", which lets you specify any
options you want to be used when / is mounted.

To create the journal on your root filesystem, then, you want to boot
once with the rootflags option.  When creating the journal, it is also
important to mount the root in read-write mode.  So, the kernel
command line options you want to add will look like this:

	rw rootflags=journal=12

if your journal.dat is inode number 12.  If you are using LILO as your
boot loader, you can either specify these options at the boot prompt,
or you can force LILO to add new temporary kernel options just for the
next boot only: if the LILO kernel image is called "ext3", then you
can run

	/sbin/lilo -R ext3 rw rootflags=journal=12

and reboot to get the kernel to build your journal on the root
filesystem.
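
To avoid typos in the boot options, the strings can be built from the
inode number.  The sketch below only prints the options and the LILO
command; it runs nothing.  The function names are invented for this
example, and the numeric check is an extra safeguard, not something
ext3 requires.

```shell
#!/bin/sh
# Build the kernel command line options for a one-off journal creation
# on the root filesystem.  $1 is the journal file's inode number.
root_journal_flags() {
    case "$1" in
        ''|*[!0-9]*)
            echo "inode number must be numeric" >&2
            return 1
            ;;
    esac
    echo "rw rootflags=journal=$1"
}

# Print the LILO command that applies those options to the next boot
# only.  $1 is the LILO image label, $2 is the inode number.
lilo_once_cmd() {
    flags=$(root_journal_flags "$2") || return 1
    echo "/sbin/lilo -R $1 $flags"
}

lilo_once_cmd ext3 12
```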


How to fsck
-----------

As long as you are using the e2fsprogs from the ext3-0.0.5c release,
e2fsck should work just as expected on both ext2 and ext3 filesystems.
The new e2fsck fully understands the ext3 journal.

If you use an older version of e2fsck from e2fsprogs-1.17 or later, then
you can now run e2fsck quite happily on the filesystem, but *only as
long as the filesystem was unmounted cleanly*.  If it wasn't, then
you'll need to get the kernel code to recover the journal from the disk
by mounting the filesystem (even a readonly mount will cause a journal
recovery to happen) and umounting it again (or, for the root filesystem,
remounting it readonly with "mount -o remount,ro /").
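
For a non-root filesystem, the recover-then-check sequence for an older
e2fsck looks roughly like this.  The sketch prints the commands rather
than running them; the device and mount point are the example names
used earlier in this README, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the commands that let the kernel replay the journal before an
# older e2fsck looks at the filesystem.  $1 is the device, $2 the
# mount point.
recover_then_fsck_cmds() {
    dev=$1
    mnt=$2
    echo "mount -t ext3 -o ro $dev $mnt"   # mounting replays the journal
    echo "umount $mnt"                     # leave the filesystem clean
    echo "e2fsck $dev"                     # now safe for the older e2fsck
}

recover_then_fsck_cmds /dev/sdb2 /mnt/sparefs
```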



How to move back from ext3 to ext2
----------------------------------

It's quite easy.  If you unmount an ext3 filesystem cleanly, then you
can remount it as ext2 without any other commands.  If you crash and are
left with an unclean ext3 filesystem, on the other hand, the filesystem
will prevent you from mounting it as ext2: it is not safe to mount it
until you have recovered the journal, and the only way to do that for
now is to mount it as ext3.

However, if for any reason you do have an ext3 filesystem which you want
to convert permanently back to ext2, whether it was cleanly unmounted or
not, you can use "debugfs" from e2fsprogs-1.17 or later to do it.
First, run debugfs and open the filesystem (the -w flag means open for
write, and the -f flag forces it to open the filesystem even if there
are unknown journal flags set):

    [root@sarek /root]# debugfs
    debugfs 1.18, 11-Nov-1999 for EXT2 FS 0.5b, 95/08/09
    debugfs:  open -f -w /dev/sdb1 

Now, use "features" to see which feature bits are set on the filesystem:

    debugfs:  features
    Filesystem features: has_journal filetype sparse_super

We want to clear the journal bits, then we can quit:

    debugfs:  features -has_journal -needs_recovery
    Filesystem features: filetype sparse_super
    debugfs:  quit
    [root@sarek /root]#

That's it!
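
The same debugfs requests can be written out as a short list, for
example to keep alongside your notes.  The sketch below only generates
the three request lines shown in the session above for a given device.
Whether your debugfs will accept them non-interactively (on standard
input or from a command file) is an assumption this README does not
cover, so check your debugfs documentation before relying on it.

```shell
#!/bin/sh
# Print the debugfs requests that clear the journal feature bits.
# $1 is the device holding the filesystem.
debugfs_ext2_requests() {
    dev=$1
    printf 'open -f -w %s\n' "$dev"
    printf 'features -has_journal -needs_recovery\n'
    printf 'quit\n'
}

debugfs_ext2_requests /dev/sdb1
```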




Enjoy.
--Stephen.
