Here are a few variants, depending on what you want to use as a data source.
The first version assumes you use a dictionary file that contains one word per line:
awk -v lineno=10 -v linelen=80 'BEGIN { i = 0 } { i++; words[i] = $0 } END { line1 = ""; line2 = ""; srand(); n = 0; w = ""; nw = 0; for (j = 0; j < lineno; j++) { while (n <= linelen) { w = words[int(rand() * i) + 1]; nw = length(w); line1 = line2; if (n > 0) { line2 = line2 " " w; n += 1 + nw } else { line2 = w; n += nw } }; print line1; line2 = w; n = nw } }' /usr/share/dict/words
The second version takes pseudo-random bytes from /dev/urandom:
cat /dev/urandom | tr -dc 'a-zA-Z\n' | awk -v lineno=10 -v linelen=80 -v wordmin=5 -v wordmax=10 'BEGIN { j = 0; n = 0; line1 = ""; line2 = ""; wordlen = wordmax - wordmin; srand(); nwlen = wordmin + int(rand() * wordlen); w = ""; nw = 1 } { len = length($0); for (i = 1; i <= len; i++) { w = w substr($0, i, 1); if (nw == nwlen) { line1 = line2; if (n > 0) { line2 = line2 " " w; n += 1 + nw } else { line2 = w; n += nw }; if (n > linelen) { print line1; line2 = w; n = nw; j++; if (j >= lineno) { exit } }; w = ""; nw = 0; nwlen = wordmin + int(rand() * wordlen) }; nw++ } }'
The third version is pure Awk and generates pseudo-random letters from a selected set of characters:
awk -v lineno=10 -v linelen=80 -v wordmin=5 -v wordmax=10 -v chars=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 'BEGIN { charslen = length(chars); n = 0; j = 0; line1 = ""; line2 = ""; wordlen = wordmax - wordmin; srand(); nwlen = wordmin + int(rand() * wordlen); nw = 1; while (j < lineno) { w = w substr(chars, int(rand() * charslen) + 1, 1); if (nw == nwlen) { line1 = line2; if (n > 0) { line2 = line2 " " w; n += 1 + nw } else { line2 = w; n += nw }; if (n > linelen) { print line1; line2 = w; n = nw; j++ }; w = ""; nw = 0; nwlen = wordmin + int(rand() * wordlen) }; nw++ } }'
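The variables are quite self-explanatory, but I'll describe them anyway.
lineno: the number of lines in the output
linelen: the maximum length of a line in the output
wordmin: the minimum length of a word in the output (second and third versions)
wordmax: the upper bound on word length in the output (second and third versions); as written, words are at most wordmax - 1 characters long, because rand() never returns 1
chars: the characters to use in the output (third version only)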
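To make the role of each variable easier to see, here is a rough multi-line sketch of the third variant wrapped in a shell function. The function name randtext and the internal names (span, target, word, line, printed) are made up for illustration, and the line-filling logic is restructured rather than copied from the one-liner, so treat it as an approximation of the same idea and not as the original script.

```shell
# randtext LINES WIDTH: hypothetical wrapper, not part of the original answer
randtext() {
    awk -v lineno="$1" -v linelen="$2" -v wordmin=5 -v wordmax=10 \
        -v chars=abcdefghijklmnopqrstuvwxyz '
    BEGIN {
        srand()
        charslen = length(chars)
        span = wordmax - wordmin               # lengths fall in wordmin .. wordmax-1
        target = wordmin + int(rand() * span)  # length of the word being built
        while (printed < lineno) {
            # append one random character from chars to the current word
            word = word substr(chars, int(rand() * charslen) + 1, 1)
            if (length(word) < target) continue
            # word is complete: add it to the line, or flush the line if it is full
            if (line == "") line = word
            else if (length(line) + 1 + length(word) <= linelen) line = line " " word
            else { print line; printed++; line = word }
            word = ""
            target = wordmin + int(rand() * span)
        }
    }'
}

randtext 3 40   # three lines, each at most 40 characters wide
```

This assumes linelen is at least as large as the longest possible word; otherwise a single word would already overflow the line.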
Of course, there's also this really simple command line, which is almost as good (feature-wise) as the second variant and doesn't involve Awk at all. It is most probably a lot faster too, but I've not timed it:
cat /dev/urandom | tr '0-3' ' ' | tr -dc 'a-zA-Z ' | fold -w 80 | head -n 10
The point of the tr '0-3' ' ' part is to decrease the average word length in the generated output: it turns the digits 0 to 3 into spaces before everything other than letters and spaces is deleted.
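The size of that effect can be estimated with a little arithmetic, assuming /dev/urandom yields uniformly distributed bytes. The final tr keeps 52 letter values plus the space, so without the first tr only 1 byte value ends up as a separator; with it, 5 do (the space and the digits 0 to 3). The mean run of letters between two consecutive separators is then the number of letter values divided by the number of separator values. The helper name avg_run below is hypothetical.

```shell
# avg_run LETTERS SEPARATORS: mean number of letters between two separators,
# assuming every kept byte value is equally likely (hypothetical helper)
avg_run() {
    awk -v letters="$1" -v seps="$2" 'BEGIN { printf "%.1f\n", letters / seps }'
}

avg_run 52 1   # only the literal space survives: prints 52.0
avg_run 52 5   # space plus the digits 0-3:       prints 10.4
```

Mapping a wider range of bytes to a space (for example '0-9') would shorten the runs further by the same reasoning.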
Comments
Third Version awk script
What if I wanted the script not to repeat any character, or, say, to repeat some characters and use others only once?
Nice scripts btw.
Best regards.
Dante.