How to handle a huge number of small files efficiently

When you have a huge number of small files, how do you deal with them?

I am using Linux and developing for Android.
The Android code base keeps growing, and after a compile you end up with a lot of object files.

For one project, you will get more than 4 million files.

So what if you have several projects to develop? Then you have tens of millions of them.


1: Use reiserfs?
I tried reiserfs on a hardware RAID 0.
I could not compile my Android code on it; the build failed with errors. When I moved the same code to ext4,
it compiled fine. For this workload, reiserfs should be dropped.

2: If you want to back up the code
Use squashfs. It compresses all these small files into one big read-only filesystem image that you can
handle easily. Copying or deleting the whole project (4 million files) is very fast, because it is a single file. If you delete 4 million small files on a normal filesystem, it may
take 8 hours.
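
A minimal sketch of the squashfs backup, assuming squashfs-tools (mksquashfs and unsquashfs) is installed. The helper names pack_project and list_project are made up for illustration, and the paths you pass in are your own.

```shell
#!/bin/sh
# Pack a project tree into one compressed, read-only squashfs image.
# pack_project and list_project are hypothetical helper names.
pack_project() {
    src=$1   # directory with the millions of small files
    img=$2   # squashfs image to create
    # -noappend: always build a fresh image instead of appending to an existing one
    mksquashfs "$src" "$img" -noappend
}

# List the contents of an image without mounting it.
list_project() {
    unsquashfs -l "$1"
}
```

For example, pack_project myproject myproject.squashfs would produce a single file that can be copied or deleted in one operation.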

3: If you want to develop and need both read and write access?
You can use a sparse file as a filesystem image.

truncate -s 700G austin.ext4

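This creates a sparse file with an apparent size of 700 GiB. The filesystem inside is limited to that size, but disk space is only allocated as data is written.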

mkfs.ext4 austin.ext4

Then mount it (as root) and give the mount point to your own user:

mkdir mnt
mount austin.ext4 mnt/
chown ailantian:ailantian mnt

Then do your development under the mnt directory.

You will find that your compile time and grep time are faster.

In the end, you have one big file for the project, which contains more than 4 million files. Copying it to another place or deleting it is much faster than doing the same with a huge number of small files.
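
To see why a large image file costs nothing up front, here is a small check, assuming GNU coreutils (truncate and du). It uses a 1G placeholder file called demo.img rather than the real 700G image, and compares the apparent size with the space actually allocated on disk.

```shell
#!/bin/sh
# Create a small sparse file (demo.img is a placeholder name; 1G instead of 700G).
truncate -s 1G demo.img

# Apparent size: the length of the file, which is what mkfs will see.
apparent=$(du --apparent-size -k demo.img | cut -f1)
# Allocated size: the blocks really used on the host disk.
allocated=$(du -k demo.img | cut -f1)

echo "apparent: ${apparent} KiB, allocated: ${allocated} KiB"
```

On a filesystem that supports sparse files, the allocated figure stays far below the apparent one until data is written into the image.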

For an efficient rsync, mount the squashfs image and copy from it. The -S option makes rsync handle sparse files efficiently:
mount -t squashfs ../myproject_4.1base.squashfs /tmp/squashfs/

rsync -avrS /tmp/squashfs .
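
The same -S option also matters when you copy the image file itself. A rough sketch, assuming rsync and GNU du are available; sparse_copy and allocated_kib are hypothetical helper names, not part of any tool.

```shell
#!/bin/sh
# Copy a sparse image with rsync while keeping its holes.
# sparse_copy and allocated_kib are hypothetical helper names.
sparse_copy() {
    # -a: archive mode (keeps permissions, times, symlinks)
    # -S: write long runs of zeros as holes at the destination
    rsync -aS "$1" "$2"
}

# Print the space actually allocated on disk for a file, in KiB.
allocated_kib() {
    du -k "$1" | cut -f1
}
```

After a sparse_copy, comparing allocated_kib of the source and of the copy shows whether the copy stayed sparse.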

Even with a huge number of small files, you can still get about 80 MB/s read speed from a USB 3.0 external HDD this way.
If you don't use this, you will get only about 6 MB/s.

My write destination is the mnt directory on my ZFS software RAID 0:
write speed 187 MB/s

Reading the squashfs image from the ext4 USB 3.0 external HDD: about 80 MB/s

               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
basepool     227G  3.40T      0  1.46K      0   187M
  sda        114G  1.70T      0    772      0  96.1M
  sdb        114G  1.70T      0    726      0  90.7M
cache           -      -      -      -      -      -
  sdc       75.6G   157G      0      8      0  1.12M
----------  -----  -----  -----  -----  -----  -----


4: overlayfs?

For the base, you can compress it into a squashfs image, then put overlayfs on top
for write access. You end up with only a small number of small files (the ones you changed), but it is more complex than method 3.
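
A sketch of what the squashfs plus overlayfs setup could look like, assuming a kernel with overlayfs support and root access. The directory layout (lower, upper, work, merged) and the helper names are my own placeholders, not a fixed convention.

```shell
#!/bin/sh
# Build the option string for an overlay mount.
overlay_opts() {
    printf 'lowerdir=%s,upperdir=%s,workdir=%s' "$1" "$2" "$3"
}

# Mount a read-only squashfs base with a writable layer on top. Needs root.
# mount_overlay is a hypothetical helper; all paths are placeholders.
mount_overlay() {
    img=$1    # squashfs image holding the base tree
    top=$2    # directory that will hold lower/upper/work/merged

    mkdir -p "$top/lower" "$top/upper" "$top/work" "$top/merged"
    mount -t squashfs -o loop,ro "$img" "$top/lower"
    # upperdir and workdir must be on the same writable filesystem
    mount -t overlay overlay \
        -o "$(overlay_opts "$top/lower" "$top/upper" "$top/work")" \
        "$top/merged"
}
```

You would then work in the merged directory. Only the files you change are written to upper, which is where the small number of small files ends up.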


Another way is to use tar:

Write the tar archive to stdout, then send the stream to the server.
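
A minimal local sketch of the tar pipe, with a second tar as the receiving side instead of a real server. stream_tree is a hypothetical helper name.

```shell
#!/bin/sh
# Stream a directory tree as one tar archive over a pipe and unpack it elsewhere.
# stream_tree is a hypothetical helper name.
stream_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst"
    # -C: change into the directory first
    # "-f -": write the archive to stdout (left side) or read it from stdin (right side)
    tar -C "$src" -cf - . | tar -C "$dst" -xf -
}
```

To send to a server, the right side of the pipe would become something like ssh user@server 'tar -C /backup -xf -', where user@server and /backup are placeholders. The many small files then travel as one continuous stream.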