10

I need to convert all sheets of a single .xls/.xlsx file to a .csv. This will be done on all .xls files in all directories and sub-directories (recursively).

Step 1: Get the sheetnames of all .xls into a .csv using:

for file in $(find . -name '*.xls' -o -name '*.xlsx');do in2csv -n "$file" > ${file%.xls}-sheetnames-list.csv; done

filename-sheetnames-list.csv can act as a list:

sheetname1
sheetname2
sheetname3

Step 2 : The code for converting a specific sheet into a .csv using in2csv is:

in2csv --sheet "SHEETNAME" filename.xls > filename-SHEETNAME.csv

How can I get every sheetname in a .xls/x and write every sheet separately for all directories containing a .xls/x ?

in2csv --write-sheets "-" filename.xls > filename-sheet1.csv filename-sheet2.csv .... gives output only on sheet1.csv, not sure how to get all sheets from this.

csheth
  • 421
  • 5
  • 16

4 Answers4

10

You can just put a loop inside another loop.

To avoid errors, don't use for with find results.

while IFS= read -r file; do
    while IFS= read -r sheet; do
        in2csv --sheet "$sheet" "$file" > "${file%.*}-${sheet}.csv"
    done < <(in2csv -n "$file")
done < <(find . -name '*.xls' -o -name '*.xlsx')
Eliah Kagan
  • 116,445
  • 54
  • 318
  • 493
pLumo
  • 26,204
  • 2
  • 57
  • 87
  • @muru ah crap. You're absolutely right. I'd tested in an environment where the IFS had already been changed so of course it propagated downwards. *Idiot*. Thanks, edit reverted. – terdon Nov 07 '17 at 09:01
  • @RoVo the first option works fine. The second one however gives me no output or error. I am unsure why; for a single `.xls` `in2csv --write-sheets "-" filename.xls > sheetname.csv` gives only the first sheet. I don't know what additional info to add to write all sheets. That shall give us clues to correct your code. – csheth Nov 08 '17 at 07:37
  • 1
    did you update to that version 1.0.2 ? `pip install csvkit -U`. I think the way it works is not what you like, with the simple skript from 1st option you have more ways to control the output and the filenames etc. – pLumo Nov 08 '17 at 08:02
  • still doesn't work with the update, and yes I'd prefer using a list than `--write-sheets` Maybe you can set this alternative option as another answer... I will accept the first option as the answer then. Thanks @RoVo – csheth Nov 08 '17 at 08:24
  • 1
    Maybe generally a good idea to have alternative options in another answer. Thanks, glad that I could help. – pLumo Nov 08 '17 at 08:55
7

Skipping find and using bash:

shopt -s globstar  # enable recursive globbing
for f in **/*.xls{,x}  # for files ending in .xls or .xlsx
do
    in2csv -n "$f" |   # get the sheetnames
      xargs -I {} bash -c 'in2csv --sheet "$2" "$1" > "${1%.*}"-"$2".csv' _ "$f" {} # {} will be replaced with the sheetname
done
muru
  • 193,181
  • 53
  • 473
  • 722
  • this script looks elegant but its output is `filename-{}.csv` containing no data. I'm a novice and can't seem to find the error by editing the script and reading up. Some help? – csheth Nov 08 '17 at 07:23
  • @ChintanSheth my bad, I'd forgotten the redirection would be outside `xargs`. Corrected, not as elegant now. – muru Nov 08 '17 at 08:01
  • `xargs` and `>` is evil :-P. That's why I prefer another loop, it's less error prone. – pLumo Nov 08 '17 at 08:01
  • @RoVo I'd have usually gone for another loop too, just wanted to show another method here. – muru Nov 08 '17 at 08:03
  • This works now, however slightly slower than @RoVo answer. – csheth Nov 08 '17 at 08:25
3

csvkit version > 1.0.2 has a builtin function to write all sheets:

--write-sheets: WRITE_SHEETS
                      The names of the Excel sheets to write to files, or
                      "-" to write all sheets.

So you could try the following:

find . -name '*.xls' -o -name '*.xlsx' -exec in2csv --write-sheets "-" {} \;

Note:

This seems not to work 100% as expected. But worth a try and as this is the first version with that option maybe in future versions the implementation is better/easier.

pLumo
  • 26,204
  • 2
  • 57
  • 87
0

Use Gnumeric:

ssconvert -S filename.xlsx filename.csv

to get one csv file per sheet.