2

How do I maintain a readable output from my program while also putting that same output into a compressed file, gzip or otherwise, so I can parse it later?

I run a command that generates a lot of samey lines of clear text that should be easy to compress, but I also want to monitor the output of that command. I know I can use tee to print its stdin to the screen while also writing it to a file, but the resulting files are massive (250MB+). I also know that I can use gzip to compress stdin into a file, but then I won't see any output on the screen, and gzip's --stdout flag just dumps the compressed data to stdout, which is not what I want.

An example would be

python -c 'for i in range(0, 100): print("A" * 120)' | gzip --best > A.log.gz

but I also want to see the As printed to the console.

FalcoGer
  • 2
    Can you explain why the combination of tee and gzip isn't what you need? Ideally, give us a specific, simple and reproducible example so we can use that to test our approaches. – terdon Mar 20 '23 at 14:46
  • 3
    Maybe something like seq 1 100 | tee >(gzip >foo.gz) – Bodo Mar 20 '23 at 15:06
  • 1
    It sounds like you need either tee with a process substitution for the gzip command, or pee. See also tee for commands – steeldriver Mar 20 '23 at 15:06
  • Is this a command/script that creates a lot of output in a single run and terminates or is it a long-running service? – Bodo Mar 20 '23 at 15:12
  • @Bodo that worked. If you provide an answer I'll accept it. It'd be nice if you included how I could've figured that out myself. This >(program) syntax is new to me. – FalcoGer Mar 20 '23 at 15:43
  • @terdon tee writes to a file but doesn't compress, gzip compresses but doesn't print. I don't know how to "combine" the two as you suggest – FalcoGer Mar 20 '23 at 15:45
  • @steeldriver so I can use >(process) in place of any file name and the data that would be written to that file would appear as stdin to that process instead? Thank you, I didn't know that. – FalcoGer Mar 20 '23 at 15:48
  • Yes, I was thinking of process substitution, I just wanted an example command to be able to test and be sure it does what you want. – terdon Mar 20 '23 at 15:56

1 Answer

4

You can do this by combining tee and process substitution:

python -c 'for i in range(0, 100): print("A" * 120)' |
  tee >(gzip > compressed.gz)

The >() construct, one form of process substitution, lets you treat a command as though it were a file. Here the "file" is opened for writing; the other form, <(), provides one opened for reading. In the command above, tee prints its input to standard output, as it does by default, and also writes a copy to the special "file" >(gzip > compressed.gz). That copy becomes gzip's standard input, so all of the output passes through gzip and ends up in the compressed file compressed.gz.
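
To make the mechanics more concrete, here is a rough sketch of the same idea written with an explicit named pipe. It also works in shells that lack process substitution (which is a feature of bash, zsh and ksh, not of plain POSIX sh). This only approximates what the shell sets up for >(); the temporary directory, the file name demo.log.gz and the five sample lines are made up for illustration.

```shell
# Sketch: roughly what `tee >(gzip > file)` does, spelled out by hand.
# tmpdir, fifo and demo.log.gz are illustrative names, not from the question.
tmpdir=$(mktemp -d)
fifo="$tmpdir/fifo"
mkfifo "$fifo"

# Reader: gzip compresses whatever arrives on the pipe, in the background.
gzip < "$fifo" > "$tmpdir/demo.log.gz" &
gzip_pid=$!

# Writer: tee prints each line to the screen and writes a copy into the pipe.
printf '%s\n' 1 2 3 4 5 | tee "$fifo"

# Wait until gzip has seen end-of-file and finished writing the archive.
wait "$gzip_pid"
rm -f "$fifo"

# Later: decompress to stdout to read or parse the log again.
# (Remove "$tmpdir" yourself once you no longer need the demo file.)
gzip -dc "$tmpdir/demo.log.gz"
```

One practical difference: with tee >(gzip > compressed.gz), bash at least does not wait for the gzip process to finish before it returns to the prompt, so the file can be incomplete for a brief moment after the command ends. The explicit wait in the sketch avoids that.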

terdon