28

I want to create a large file ~10G filled with zeros and random values. I have tried using:

dd if=/dev/urandom of=10Gfile bs=5G count=10

It creates a file of about 2 GB and exits with exit status 0. I fail to understand why.

I also tried creating file using:

head -c 10G </dev/urandom >myfile

It takes about 28-30 minutes to create. But I want it created faster. Does anyone have a solution?

Also, I wish to create multiple files with the same (pseudo)random pattern for comparison. Does anyone know a way to do that?

No Time
egeek
  • Welcome to AskUbuntu! You are probably getting an error with `dd` due to the block size. You might want to look at this post: http://stackoverflow.com/questions/6161823/dd-how-to-calculate-optimal-blocksize. It has some good answers on how to calculate the best block size, as well as some user scripts/programs and other suggestions for using `dd`. – No Time Aug 04 '14 at 21:43
  • 2
    Also have a look at http://stackoverflow.com/questions/257844/quickly-create-a-large-file-on-a-linux-system – muru Aug 04 '14 at 21:54

5 Answers

24

How about using fallocate? This tool allows us to preallocate space for a file (if the filesystem supports this feature). For example, to allocate 5 GB of space to a file called 'example', one can do:

fallocate -l 5G example

This is much faster than dd, since the space is merely reserved rather than written out.
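
To convince yourself that the space is really reserved (and not just a sparse file), you can compare the reported file size with the blocks actually allocated; a quick check, assuming GNU coreutils:

ls -l example    # apparent size: 5 GB
du -h example    # allocated size: also about 5 GB, because fallocate reserves real blocks
df -h .          # free space on this filesystem drops by about 5 GB

The data reads back as zeros, but the blocks are genuinely allocated, as the comments below also discuss.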

Colin Ian King
  • Does this file contain random data or does it contain whatever happened to be on the allocated disk space? – cprn Jul 26 '16 at 16:37
  • It will contain all zeros. Basically, space is preallocated, and if you don't modify the data it will be presumed to be zero. – Colin Ian King Jul 26 '16 at 16:38
  • How can this be quicker than dumping `/dev/zero` then? – cprn Jul 26 '16 at 16:39
  • 1
    It's very fast because it's one system call which does block preallocation (e.g. it reserves the space but does minimal I/O), whereas dd'ing from /dev/zero to a file involves a load of reads and writes. – Colin Ian King Jul 26 '16 at 16:42
  • I'm upping this one. One last question though... I was using `truncate` in the past and found out it doesn't physically allocate the file on the device and just creates an arbitrarily large file until accessed, regardless of the available space. Are you sure this isn't the case with `fallocate`? I would check it but I'm on a mobile... – cprn Jul 26 '16 at 17:13
  • fallocate does what it says, it really does allocate the space. Check using df before and after and you will see the free blocks have reduced by the fallocate action. – Colin Ian King Jul 26 '16 at 18:12
  • Great! It is definitely legit! – Colin Ian King Jul 27 '16 at 07:36
  • 1
    see man fallocate. It says: "fallocate is used to preallocate blocks to a file. For filesystems which support the fallocate system call, this is done quickly by allocating blocks and marking them as uninitialized, requiring no IO to the data blocks. This is much faster than creating a file by filling it with zeros." For me "no IO to the data blocks" means that data is NOT initialized at all. Fallocate just reserves the space, nothing more. – user4955663 Feb 21 '17 at 10:43
  • 2
    The blocks are marked as uninitialized at the filesystem level, but when you read them the blocks returned to userspace will be zero-filled. If you write data, it will go to the space that is allocated. This is how sparse files work, and I expect you are using non-allocated zeroed file blocks without knowing it all the time. The fiemap() IOCTL will show you that a lot of files are sparse and have holes in them that you never knew about. The latest versions of cp even use this mechanism to speed up copies of sparse data blocks. – Colin Ian King Feb 21 '17 at 12:57
  • I just did that to create a 10 GB file. When I tried to copy it to a 32 GB USB drive or a 500 GB HDD to test transfers, I got a message that I still needed 18 EB of free space. – Gacek Jan 17 '21 at 11:47
16

You can use dd to create a file consisting solely of zeros. Example:

dd if=/dev/zero of=zeros.img count=1 bs=1 seek=$((10 * 1024 * 1024 * 1024 - 1))

This is very fast because only one byte is really written to the physical disc; the rest of the file is left as a hole (a sparse file). However, some file systems do not support sparse files.
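
To check that the file really is sparse, compare its apparent size with its actual disk usage; a quick check, assuming GNU coreutils:

ls -lh zeros.img    # apparent size: 10G
du -h zeros.img     # actual disk usage: a few KB, since only one byte was written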

If you want to create a file containing pseudo-random contents, run:

dd if=/dev/urandom of=random.img count=1024 bs=10M

I suggest 10M as the buffer size (bs): it is large enough to keep the number of read/write calls low, but not so large that it wastes memory. It should be pretty fast, but it always depends on your disk speed and processing power.
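
As for the last part of the question (several files with the same pseudo-random pattern): one common trick, not covered in this answer, is to generate the data from a deterministic keystream instead of /dev/urandom, for example by encrypting /dev/zero with a fixed password. A sketch, assuming OpenSSL and GNU head are available; the password 'myseed' and the file names are just placeholders:

openssl enc -aes-256-ctr -pass pass:myseed -nosalt < /dev/zero 2>/dev/null | head -c 10G > random1.img
openssl enc -aes-256-ctr -pass pass:myseed -nosalt < /dev/zero 2>/dev/null | head -c 10G > random2.img

With the same password (and the same OpenSSL version), both runs produce an identical pseudo-random byte stream, so random1.img and random2.img will compare equal, and this is usually faster than reading from /dev/urandom.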

Kaz Wolfe
xiaodongjie
7

Using dd, this should create a 10 GB file filled with random data:

dd if=/dev/urandom of=test1 bs=1M count=10240

count is the number of blocks of size bs (1 MiB here), so 10240 × 1 MiB = 10 GiB.

Source: stackoverflow - How to create a file with a given size in Linux?

Alaa Ali
2

This question was asked 5 years ago. I just stumbled across it and wanted to add my findings.

If you simply use

dd if=/dev/urandom of=random.img count=1024 bs=10M

it will already be significantly faster, as explained by xiaodongjie. But you can make it even faster by using eatmydata:

eatmydata dd if=/dev/urandom of=random.img count=1024 bs=10M

What eatmydata does is disable fsync, making disc writes faster.

You can read more about it at https://flamingspork.com/projects/libeatmydata/.
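
If the eatmydata wrapper isn't present on your system, on Ubuntu it is normally provided by the eatmydata package (assuming the package name is unchanged in your release):

sudo apt-get install eatmydata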

  • 1
    The way I look at it, `dd` is fast enough to begin with, and it's called libEAT-MY-DATA for a reason. – karel Aug 09 '19 at 11:18
1

Answering the first part of your question:

Trying to read and write a buffer of 5 GB at a time is not a good idea: the kernel caps a single read() at just under 2 GiB, so dd can never actually fill a 5G block in one call, and a buffer that large won't give you any performance benefit anyway. Writing 1M at a time is a good maximum.
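
If you do want dd itself to produce exactly the amount you asked for from /dev/urandom, GNU dd's iflag=fullblock makes it keep reading until each block is actually full instead of counting short reads; a sketch reusing the file name from the question, with a saner block size:

dd if=/dev/urandom of=10Gfile bs=1M count=10240 iflag=fullblock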

cprn