I have a Perl script which creates a binary file while scanning a very large text file. It writes to STDOUT, which I redirect to a file on the command line.

To optimize it I'm making changes and then seeing how long it takes to run. On Linux I use the "time" command for this. On Windows the best way to time a program seemed to be PowerShell's "Measure-Command". This seemed to work fine, but I noticed the generated files were larger. On examination I found that the files generated from within PowerShell begin with a BOM and contain CRLF pairs!

My Perl script has a "binmode STDOUT" directive and does work correctly in a normal dosbox.

Is this a bug or misfeature in PowerShell or Measure-Command? Has it affected others creating binary files by means other than Perl?

Googling hasn't turned anything up so far. I'm using Perl 5.12, PowerShell v1.0 and Windows XP.

hippietrail
  • Not a real answer... you may want to ask this on stackoverflow. – Joe Internet Jan 12 '11 at 04:05
  • Yeah, it was tough deciding which site to ask on. I went with this one because it was more about the features of the tools than algorithms or data structures, but I'll move it if nobody answers here (-: – hippietrail Jan 12 '11 at 06:23

1 Answer

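This is because PowerShell treats the output of an external program as text by default: it reads the stream and turns it into .NET strings, and strings in .NET are Unicode. When that text is then written to a file with >, PowerShell encodes it as Unicode (UTF-16 with a BOM) and ends each line with CRLF, which is why the file grows and gains a BOM.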

I assume that you are using PowerShell to write the output to a file? If so, then using "Set-Content -Encoding Byte" will fix your issue.

Measure-Command {& "c:\myscript.pl" | Set-Content "C:\myoutput.bin" -Encoding Byte}
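
If the pipeline still hands Set-Content text rather than raw bytes, one alternative is to let cmd.exe do the redirection so that PowerShell never touches the output stream. This is a minimal sketch, not tested against your setup: it assumes perl is on the PATH, and the paths are the same placeholder names used above.

```powershell
# Sketch only: assumes perl.exe is on PATH; both paths are placeholders.
# cmd.exe runs the script and performs the > redirection itself,
# so the bytes go straight to the file without being decoded into strings.
Measure-Command { cmd /c "perl c:\myscript.pl > C:\myoutput.bin" }
```

Measure-Command still times the whole cmd /c invocation, so the figure includes the small cost of starting cmd.exe.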
JasonMArcher
  • So | Set-Content "C:\myoutput.bin" is like > "C:\myoutput.bin" under DOS or Unix with the pipe character used also for redirection? – hippietrail Jan 13 '11 at 04:29
  • It's the pipe character, and it is used the same way as in DOS and Unix shells. In this case you pipe to Set-Content so that you can specify the Byte encoding. – JasonMArcher Jan 28 '11 at 21:38