I am using the bash shell and would like to pipe the output of the command openssl rand -base64 1000 to the dd command, along the lines of dd if={output of openssl} of=sample.txt bs=1G count=1.
I think I can use variables, but I am unsure how best to do so. The reason I want to create the file is that I would like a 1GB file with random text.
-
What do you want to do with that file? To check e.g. compression algorithms, use the type of data they are designed for (you can get boatloads of natural-language text at Project Gutenberg; for source code, grab e.g. the GNU, BSD, or SourceForge packages, or sample GitHub). "Real world" data is *not* random. – vonbrand May 24 '21 at 01:24
6 Answers
if= is not required; you can pipe something into dd instead:
something... | dd of=sample.txt bs=1G count=1
something... | head -c 1G > sample.txt
It wouldn't be useful here since openssl rand requires specifying the number of bytes anyway. So you don't actually need dd – this would work:
openssl rand -out sample.txt -base64 $(( 2**30 * 3/4 ))
1 gigabyte is usually 2^30 bytes (though you can use 10**9 for 10^9 bytes instead). The * 3/4 part accounts for Base64 overhead, making the encoded output 1 GB.
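As a quick sanity check of that arithmetic (a sketch using shell arithmetic; it ignores the newlines OpenSSL inserts every 64 characters, which add a few percent on top):

echo $(( 2**30 * 3/4 ))           # 805306368 raw random bytes requested
echo $(( 2**30 * 3/4 * 4 / 3 ))   # 1073741824 characters of Base64, i.e. 1 GiB before line breaks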
Alternatively, you could use /dev/urandom, but it would be a little slower than OpenSSL:
dd if=/dev/urandom of=sample.txt bs=1G count=1
I would use bs=64M count=16 or similar, so that 'dd' won't try to use the entire 1 GB of RAM at once:
dd if=/dev/urandom of=sample.txt bs=64M count=16
or even the simpler head tool – you don't really need dd here:
head -c 1G /dev/urandom > sample.txt
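Whichever variant you use, you can check the result afterwards (a quick sanity check, assuming GNU coreutils):

ls -lh sample.txt       # human-readable size
stat -c %s sample.txt   # exact size in bytes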
-
Thanks. A few questions: does using the command `openssl rand -base64 $(( 2**30 * 3/4 )) > sample.txt` give you a true text file? Secondly, I don't quite follow the use of `bs=64M count=16`. Can you elaborate further? – PeanutsMonkey Sep 06 '12 at 19:06
-
I posted a question regarding compressing large files at http://superuser.com/questions/467697/why-does-a-zip-file-appear-larger-than-the-source-file-especially-when-it-is-tex and was advised that using `/dev/urandom` generates a binary file and not a true text file. – PeanutsMonkey Sep 06 '12 at 19:10
-
@PeanutsMonkey: What do you mean by a "true text file"? A file that only contains printable characters, I'm guessing? Then yes, the `-base64` option tells OpenSSL to output a "text" file. – u1686_grawity Sep 06 '12 at 19:23
-
@PeanutsMonkey: But beware that **random data does not compress well**, regardless of whether it is "binary" or "true text". – u1686_grawity Sep 06 '12 at 19:23
-
@PeanutsMonkey: Right; you would need something like `dd if=/dev/urandom bs=750M count=1 | uuencode my_sample > sample.txt`. – Scott - Слава Україні Sep 06 '12 at 19:33
-
@Scott - Can you elaborate what that does exactly as well as why you are using a byte size of 750M and a count of 1? – PeanutsMonkey Sep 06 '12 at 19:52
-
@grawity - Well, people keep bouncing around the term "true text file", and based on my previous post it was suggested that `/dev/urandom` generates binary files. My understanding is that a text file is one with printable characters, although I am unsure whether all ASCII characters would count. I thought `-base64` is used to convert binary data to text? – PeanutsMonkey Sep 06 '12 at 19:56
-
@grawity - If random data does not compress well, how can I create a file that mimics real world scenarios? – PeanutsMonkey Sep 06 '12 at 19:56
-
@PeanutsMonkey: There's no single "real world scenario"; some scenarios might be dealing with gigabytes of text, others – with gigabytes of JPEGs, or gigabytes of compiled software... If you want a lot of text, download a [Wikipedia dump](http://dumps.wikimedia.org/backup-index.html) for example. – u1686_grawity Sep 06 '12 at 20:06
-
@PeanutsMonkey: The `dd` reads 750,000,000 bytes from `/dev/urandom` and pipes them into `uuencode`. `uuencode` encodes its input into a form of base64 encoding (not necessarily consistent with other programs). In other words, this converts binary data to text. I used 750M because I trusted grawity's statement that base64 encoding expands data by 33⅓%, so you need to ask for ¾ as much binary data as you want in your text file. (A runnable version of this pipeline is sketched after this comment thread.) – Scott - Слава Україні Sep 06 '12 at 20:07
-
@Scott: Pure Base64 always encodes 3 bytes to 4 (33.(3)%). OpenSSL's encoder splits output into 64-character lines (so about 35.4% overhead; I forgot to account for this – would be `*48/65`). UUencode uses even shorter lines and adds length prefixes, header & footer, resulting in ~40% overhead. – u1686_grawity Sep 06 '12 at 20:15
-
@Scott - That makes sense, although I am curious to understand why you limit the count to 1? – PeanutsMonkey Sep 06 '12 at 20:41
-
@grawity - I am astounded by the depth of knowledge. Where are you learning all of this stuff? – PeanutsMonkey Sep 06 '12 at 20:42
-
@leighmcc: FYI: using `>` redirection does *not* make the writes pass through bash – it is equivalent to having the program open the file directly. – u1686_grawity May 10 '13 at 14:03
-
Note: if it says `dd: warning: partial read (33554431 bytes); suggest iflag=fullblock`, it will create a *truncated file*, so add the `iflag=fullblock` flag; then it works. – rogerdpack Sep 27 '18 at 20:21
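A runnable version of the uuencode pipeline from the comments above, with the `iflag=fullblock` fix folded in (a sketch; it assumes `uuencode` from GNU sharutils is installed, and `my_sample` is just the name recorded in the encoded header):

dd if=/dev/urandom bs=75M count=10 iflag=fullblock | uuencode my_sample > sample.txt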
Create a 1GB.bin file with random content:
dd if=/dev/urandom of=1GB.bin bs=64M count=16 iflag=fullblock
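The flag matters because dd otherwise counts every read() as one block, even when the read returns fewer than 64 MiB, so the output can silently come up short. A small comparison sketch (file names are illustrative, GNU stat assumed):

dd if=/dev/urandom of=short.bin bs=64M count=16                  # may end up smaller on partial reads
dd if=/dev/urandom of=full.bin bs=64M count=16 iflag=fullblock   # re-reads until each block is full
stat -c %s short.bin full.bin                                    # compare the byte counts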
-
For me, `iflag=fullblock` was the necessary addition compared to other answers. – dojuba Sep 18 '18 at 14:55
Since your goal is to create a 1GB file with random content, you could also use the yes command instead of dd:
yes [text or string] | head -c [size of file] > [name of file]
Sample usage:
yes this is test file | head -c 100KB > test.file
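For the 1 GB file from the question, the same pattern would be (assuming GNU head, where the 1G suffix means 1 GiB):

yes "this is a test line of text" | head -c 1G > sample.txt

Note that the result is one short line repeated over and over, so unlike the OpenSSL output it compresses extremely well.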
If you want EXACTLY 1GB, then you can use the following:
openssl rand -out $testfile -base64 792917038; truncate -s-1 $testfile
The openssl command makes a file exactly 1 byte too big. The truncate command trims that byte off.
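Spelled out with the variable set (here `testfile` is just a placeholder filename; GNU stat assumed for the size check):

testfile=sample.txt
openssl rand -out "$testfile" -base64 792917038
truncate -s-1 "$testfile"
stat -c %s "$testfile"   # 1073741824 bytes, i.e. exactly 1 GiB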
-
That extra byte is probably because of the `-base64`. Removing it will result in a file with the correct size. – Daniel Oct 10 '19 at 11:34
If you just need a somewhat random file that is not used for security-related things, like benchmarking, then the following will be significantly faster:
truncate --size 1G foo
shred --iterations 1 foo
It's also more convenient because you can simply specify the size directly.
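The same two steps work for any size; for example, a 100 MiB file (the file name is illustrative):

truncate --size 100M sample.bin
shred --iterations 1 sample.bin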
Try this script.
#!/bin/bash
openssl rand -base64 1000 | dd of=sample.txt bs=1G count=1
This script might work as long as you don't mind using /dev/random.
#!/bin/bash
dd if=/dev/random of=sample.txt bs=1G count=1
-
I wouldn't recommend wasting `/dev/random` on this unless there's a very good reason to do so. `/dev/urandom` is much cheaper. – Ansgar Wiechers Sep 06 '12 at 18:22
-
@grawity, @PeanutsMonkey: He made a typo; he meant `random=$(openssl rand -base64 1000)`. Although I would question whether `bash` would let you assign a gigabyte-long value to a variable. And even if you do say `random=$(openssl rand -base64 1000)`, the subsequent `if=$random` doesn't make sense. – Scott - Слава Україні Sep 06 '12 at 19:28
-
Right. `random=<(openssl ...)` would *almost* work (if not for bash's poorly-thought-out implementation of the feature). And `dd if=<(openssl ...)` would definitely work, but then it's just the exact same thing as `openssl ... | dd` (see the sketch below). – u1686_grawity Sep 06 '12 at 19:37
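For completeness, the process-substitution form mentioned in that last comment would look something like this (a sketch; the byte count is the $(( 2**30 * 3/4 )) figure from the first answer, and with no count= limit dd simply copies everything until end of input):

dd if=<(openssl rand -base64 805306368) of=sample.txt bs=64k

As the comment says, this buys nothing over the plain pipe openssl rand -base64 805306368 | dd of=sample.txt bs=64k.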