Upload + unpack of compressed file may corrupt data
|Reported by:||Nicklas Nordborg||Owned by:||Nicklas Nordborg|
Description
When a compressed file is uploaded with the option to decompress it selected, things may go wrong. In the published release it seems like an exception is always thrown:
com.ice.tar.InvalidHeaderException: bad header in block 915 record 8, header magic is not 'ustar' or unix-style zeros, it is '5649464895554', or (dec) 56, 49, 46, 48, 9, 55, 54
    at com.ice.tar.TarInputStream.getNextEntry(Unknown Source)
    at net.sf.basedb.plugins.TarFileUnpacker.unpack(TarFileUnpacker.java:163)
    at org.apache.jsp.filemanager.upload.upload_jsp._jspService(upload_jsp.java:205)
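The decimal values in the exception message can be decoded to see what the tar reader found where it expected the header magic. The sketch below is only an illustration: `HeaderMagicDecoder` is a hypothetical helper and not part of BASE or com.ice.tar. The values are printable ASCII that looks like numeric file data rather than a tar header, which would be consistent with the reader being out of sync with the stream. That is an interpretation, not a confirmed diagnosis.

```java
// Hypothetical helper for illustration only; not part of BASE or com.ice.tar.
class HeaderMagicDecoder {

    /** Renders byte values as ASCII, escaping tab and other non-printable bytes. */
    static String decode(int... values) {
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            if (v == '\t') {
                sb.append("\\t");
            } else if (v >= 32 && v < 127) {
                sb.append((char) v);
            } else {
                sb.append(String.format("\\x%02x", v));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Decimal values copied from the exception message.
        System.out.println(decode(56, 49, 46, 48, 9, 55, 54)); // prints 81.0\t76
    }
}
```

The decoded text, `81.0`, a tab, then `76`, resembles the numeric columns in the data files and not the `ustar` magic.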
It seems like the exception is happening after processing a file. I have changed the code to ignore the exception just to see what happens. Everything seems fine but there are two issues:
- The decompressed file has been corrupted. This can be seen from its md5 value, which differs from that of the original. The raw upload has the same md5 as I get if I run 'md5sum' from the local command prompt. Comparing the two files I find that they start to differ at line 393511. See below for examples.
- The tar.bz2 file is not complete. Its size on the BASE server (1.0 MB) is smaller than the original file (2.1 MB).
The most surprising thing is that if I upload the tar.bz2 without unpacking it the tar.bz2 file is uploaded correctly and it is then possible to decompress the file without corrupting the contents.
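The two checks described above, comparing md5 values and locating the first differing line, can be reproduced with a small standalone program. This is a minimal sketch using only the Java standard library; `FileCompare` and its method names are hypothetical and not part of BASE. Files are read as ISO-8859-1 so that corrupted bytes do not cause a decoding error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of BASE.
class FileCompare {

    /** Returns the MD5 digest of a file as a lowercase hex string. */
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Returns the 1-based number of the first differing line, or -1 if the files match. */
    static long firstDifferingLine(Path a, Path b) throws IOException {
        try (BufferedReader ra = Files.newBufferedReader(a, StandardCharsets.ISO_8859_1);
             BufferedReader rb = Files.newBufferedReader(b, StandardCharsets.ISO_8859_1)) {
            long lineNo = 0;
            while (true) {
                String la = ra.readLine();
                String lb = rb.readLine();
                lineNo++;
                if (la == null && lb == null) {
                    return -1; // both files ended without a difference
                }
                if (la == null || !la.equals(lb)) {
                    return lineNo; // one file ended early or the lines differ
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: FileCompare <original> <unpacked>");
            return;
        }
        Path original = Paths.get(args[0]);
        Path unpacked = Paths.get(args[1]);
        System.out.println("md5 original: " + md5Hex(original));
        System.out.println("md5 unpacked: " + md5Hex(unpacked));
        System.out.println("first differing line: " + firstDifferingLine(original, unpacked));
    }
}
```

Run it with the locally unpacked file and the file downloaded from the BASE server as the two arguments. A result of -1 means no differing line was found.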
Examples of corruption
Original file around line 393511:
525 614 114.0 23.6 20
526 614 250.0 29.6 25
527 614 75.0 15.4 25
528 614 455.5 52.1 20
529 614 72.0 13.9 25
Decompressed (corrupted) file at the same position:
525 614 114.0 23.6 20 526 614 250.0 29.6 25.0 25 139 22 618.0 46.3 19 617 77.7 114.8 118.0 48.3.0 25 34 62 633.0 16 627 77 64.0 32 6 20 23 19 20
In the cases I have found, an exception is always thrown if the file is corrupted, but since the exception seems to happen when looking for the next file, I am not sure that this is true in all possible cases. In all cases it also seems to work if the compressed file is first uploaded without decompressing. Decompressing can then be done using the "Run plugin" button from the single-item view page of the compressed file.
I have added the test data files I have used to the testdata repository referenced from the DeveloperInformation page. They are located in the ticket.869 subdirectory, which contains three uncompressed files and three compressed files. The only one that can be uploaded and decompressed at the same time is the MG_U74Av2.cdf.bz2 file.