Go to file
Graham Christensen ece5c0f304
stage-1: modprobe ext{2,3,4} before resizing
I noticed booting a system with an ext4 root which expanded to 5T took
quite a long time (12 minutes in some cases, 43(!) in others.)

I changed stage-1 to run `resize2fs -d 62` for extra debug output and
timing information. It revealed the adjust_superblock step taking
almost all of the time:

    [Fri Oct 30 11:10:15 UTC 2020] zero_high_bits_in_metadata: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
    [Fri Oct 30 11:21:09 UTC 2020] adjust_superblock: Memory used: 396k/4556k (295k/102k), time: 654.21/ 0.59/ 5.13

but when I ran resize2fs on a disk with the identical content growing
to the identical target size, it would only take about 30 seconds. I
looked at what happened between those two steps in the fast case with
strace and found:

```
   235	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1795}, ru_stime={tv_sec=0, tv_usec=3590}, ...}) = 0
   236	write(1, "zero_high_bits_in_metadata: Memo"..., 84zero_high_bits_in_metadata: Memory used: 132k/0k (72k/61k), time:  0.00/ 0.00/ 0.00
   237	) = 84
   238	gettimeofday({tv_sec=1604061278, tv_usec=480147}, NULL) = 0
   239	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1802}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
   240	gettimeofday({tv_sec=1604061278, tv_usec=480192}, NULL) = 0
   241	mmap(NULL, 2564096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa3c7355000
   242	access("/sys/fs/ext4/features/lazy_itable_init", F_OK) = 0
   243	brk(0xf85000)                           = 0xf85000
   244	brk(0xfa6000)                           = 0xfa6000
   245	gettimeofday({tv_sec=1604061278, tv_usec=538828}, NULL) = 0
   246	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58720}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
   247	write(1, "adjust_superblock: Memory used: "..., 79adjust_superblock: Memory used: 396k/2504k (305k/92k), time:  0.06/ 0.06/ 0.00
   248	) = 79
   249	gettimeofday({tv_sec=1604061278, tv_usec=539119}, NULL) = 0
   250	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58812}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
   251	gettimeofday({tv_sec=1604061279, tv_usec=939}, NULL) = 0
   252	getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=520411}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0
   253	write(1, "fix_uninit_block_bitmaps 2: Memo"..., 88fix_uninit_block_bitmaps 2: Memory used: 396k/2504k (305k/92k), time:  0.46/ 0.46/ 0.00
   254	) = 88
```

In particular the access to /sys/fs seemed interesting. Looking
at the source of resize2fs:

```
[root@ip-172-31-22-182:~/e2fsprogs-1.45.5]# rg -B2 -A1 /sys/fs/ext4/features/lazy_itable_init .
./resize/resize2fs.c
923-	if (getenv("RESIZE2FS_FORCE_LAZY_ITABLE_INIT") ||
924-	    (!getenv("RESIZE2FS_FORCE_ITABLE_INIT") &&
925:	     access("/sys/fs/ext4/features/lazy_itable_init", F_OK) == 0))
926-		lazy_itable_init = 1;
```

I confirmed /sys is mounted, and then found a bug suggesting the
ext4 module is maybe not loaded:
https://bugzilla.redhat.com/show_bug.cgi?id=1071909

My home server doesn't have ext4 loaded and had 3T to play with, so
I tried (and succeeded with) replicating the issue locally:

```
[root@kif:/scratch]# lsmod | grep -i ext

[root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4

[root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 786432 4k blocks and 196608 inodes
Filesystem UUID: 560a4a8f-93dc-40cc-97a5-f10049bf801f
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

real	0m2.261s
user	0m0.000s
sys	0m0.025s

[root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4

[root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4
resize2fs 1.45.5 (07-Jan-2020)
fs has 11 inodes, 1 groups required.
fs requires 16390 data blocks.
With 1 group(s), we have 22234 blocks available.
Last group's overhead is 10534
Need 16390 data blocks in last group
Final size of last group is 26924
Estimated blocks needed: 26924
Extents safety margin: 49
Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks.
read_bitmaps: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3802.28MB/s
fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
resize_group_descriptors: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
move_bg_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
```

here it got stuck for quite some time ... straceing this 20 minutes in revealed this in a tight loop:

```
getuid()                                = 0
geteuid()                               = 0
getgid()                                = 0
getegid()                               = 0
prctl(PR_GET_DUMPABLE)                  = 1 (SUID_DUMP_USER)
fallocate(3, FALLOC_FL_ZERO_RANGE, 2222649901056, 2097152) = 0
fsync(3)                                = 0
```

it finally ended 43(!) minutes later:

```
adjust_superblock: Memory used: 264k/3592k (210k/55k), time: 2554.03/ 0.16/15.07
fix_uninit_block_bitmaps 2: Memory used: 264k/3592k (210k/55k), time:  0.16/ 0.16/ 0.00
blocks_to_move: Memory used: 264k/3592k (211k/54k), time:  0.00/ 0.00/ 0.00
Number of free blocks: 755396/780023556, Needed: 0
block_mover: Memory used: 264k/3592k (216k/49k), time:  0.05/ 0.01/ 0.00
block_mover: I/O read: 1MB, write: 0MB, rate: 18.68MB/s
inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
inode_ref_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
move_itables: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.35/16.35/ 0.00
fix_resize_inode: Memory used: 264k/3592k (222k/43k), time:  0.04/ 0.00/ 0.00
fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 22.80MB/s
fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 2570.90/16.68/15.07
overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.00MB/s
The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long.

real	43m1.943s
user	0m16.761s
sys	0m15.069s
```

I then cleaned up and recreated the zvol, loaded the ext4 module, created the ext4 fs,
resized the volume, and resize2fs'd and it went quite quickly:

```
[root@kif:/scratch]# zfs destroy rpool/scratch/ext4

[root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4

[root@kif:/scratch]# modprobe ext4

[root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4

[root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4
mke2fs 1.45.5 (07-Jan-2020)
Discarding device blocks: done
Creating filesystem with 786432 4k blocks and 196608 inodes
Filesystem UUID: 5b415f2f-a8c4-4ba0-ac1d-78860de77610
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done

real	0m1.013s
user	0m0.001s
sys	0m0.023s

[root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4

[root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4
resize2fs 1.45.5 (07-Jan-2020)
fs has 11 inodes, 1 groups required.
fs requires 16390 data blocks.
With 1 group(s), we have 22234 blocks available.
Last group's overhead is 10534
Need 16390 data blocks in last group
Final size of last group is 26924
Estimated blocks needed: 26924
Extents safety margin: 49
Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks.
read_bitmaps: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3389.83MB/s
fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time:  0.00/ 0.00/ 0.00
resize_group_descriptors: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
move_bg_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time:  0.00/ 0.00/ 0.00
adjust_superblock: Memory used: 264k/1540k (210k/55k), time:  0.02/ 0.02/ 0.00
fix_uninit_block_bitmaps 2: Memory used: 264k/1540k (210k/55k), time:  0.15/ 0.15/ 0.00
blocks_to_move: Memory used: 264k/1540k (211k/54k), time:  0.00/ 0.00/ 0.00
Number of free blocks: 755396/780023556, Needed: 0
block_mover: Memory used: 264k/3592k (216k/49k), time:  0.01/ 0.01/ 0.00
block_mover: I/O read: 1MB, write: 0MB, rate: 157.11MB/s
inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
inode_ref_fix: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00
move_itables: Memory used: 264k/3592k (216k/49k), time:  0.00/ 0.00/ 0.00

calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.20/16.20/ 0.00
fix_resize_inode: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 5319.15MB/s
fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time:  0.00/ 0.00/ 0.00
overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 16.45/16.38/ 0.00
overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.06MB/s
The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long.

real	0m17.908s
user	0m16.386s
sys	0m0.079s
```

Success!
2020-10-30 12:18:23 -04:00
.github README.md: update stable release links 2020-10-26 20:10:29 -04:00
doc Revert "Merge #101508: libraw: 0.20.0 -> 0.20.2" 2020-10-25 09:41:51 +01:00
lib lib/types.nix: fix missing inherit 2020-10-26 00:50:06 -07:00
maintainers Merge pull request #100688 from henrikolsson/master 2020-10-26 20:05:27 +01:00
nixos stage-1: modprobe ext{2,3,4} before resizing 2020-10-30 12:18:23 -04:00
pkgs Merge pull request #99444 from etu/aldo-upgrade 2020-10-28 13:23:42 +00:00
.editorconfig .editorconfig: cleanup, drop indent_size excludes 2020-10-17 11:18:32 +10:00
.gitattributes gitattributes: disable merge=union in all-packages 2018-03-27 11:03:03 -05:00
.gitignore git ignore __pycache__ folders 2020-08-16 11:30:11 +02:00
.version 21.03 is Okapi 2020-09-07 14:20:35 -07:00
COPYING COPYING: include 2020 2020-01-11 15:17:22 -08:00
default.nix docs: add -L to remaining curl install commands 2020-09-11 12:14:07 -07:00
flake.nix flake.nix: allow inclusion of nixpkgs as path:/.../ 2020-10-13 12:05:19 +02:00
README.md README.md: update stable release links 2020-10-26 20:10:29 -04:00

NixOS logo

Code Triagers badge Open Collective supporters

Nixpkgs is a collection of over 40,000 software packages that can be installed with the Nix package manager. It also implements NixOS, a purely-functional Linux distribution.

Manuals

  • NixOS Manual - how to install, configure, and maintain a purely-functional Linux distribution
  • Nixpkgs Manual - contributing to Nixpkgs and using programming-language-specific Nix expressions
  • Nix Package Manager Manual - how to write Nix expressions (programs), and how to use Nix command line tools

Community

Other Project Repositories

The sources of all official Nix-related projects are in the NixOS organization on GitHub. Here are some of the main ones:

  • Nix - the purely functional package manager
  • NixOps - the tool to remotely deploy NixOS machines
  • nixos-hardware - NixOS profiles to optimize settings for different hardware
  • Nix RFCs - the formal process for making substantial changes to the community
  • NixOS homepage - the NixOS.org website
  • hydra - our continuous integration system
  • NixOS Artwork - NixOS artwork

Continuous Integration and Distribution

Nixpkgs and NixOS are built and tested by our continuous integration system, Hydra.

Artifacts successfully built with Hydra are published to cache at https://cache.nixos.org/. When successful build and test criteria are met, the Nixpkgs expressions are distributed via Nix channels.

Contributing

Nixpkgs is among the most active projects on GitHub. While thousands of open issues and pull requests might seem a lot at first, it helps consider it in the context of the scope of the project. Nixpkgs describes how to build over 40,000 pieces of software and implements a Linux distribution. The GitHub Insights page gives a sense of the project activity.

Community contributions are always welcome through GitHub Issues and Pull Requests. When pull requests are made, our tooling automation bot, OfBorg will perform various checks to help ensure expression quality.

The Nixpkgs maintainers are people who have assigned themselves to maintain specific individual packages. We encourage people who care about a package to assign themselves as a maintainer. When a pull request is made against a package, OfBorg will notify the appropriate maintainer(s). The Nixpkgs committers are people who have been given permission to merge.

Most contributions are based on and merged into these branches:

  • master is the main branch where all small contributions go
  • staging is branched from master, changes that have a big impact on Hydra builds go to this branch
  • staging-next is branched from staging and only fixes to stabilize and security fixes with a big impact on Hydra builds should be contributed to this branch. This branch is merged into master when deemed of sufficiently high quality

For more information about contributing to the project, please visit the contributing page.

Donations

The infrastructure for NixOS and related projects is maintained by a nonprofit organization, the NixOS Foundation. To ensure the continuity and expansion of the NixOS infrastructure, we are looking for donations to our organization.

You can donate to the NixOS foundation by using Open Collective:

License

Nixpkgs is licensed under the MIT License.

Note: MIT license does not apply to the packages built by Nixpkgs, merely to the files in this repository (the Nix expressions, build scripts, NixOS modules, etc.). It also might not apply to patches included in Nixpkgs, which may be derivative works of the packages to which they apply. The aforementioned artifacts are all covered by the licenses of the respective packages.