Opened 16 years ago

Last modified 16 years ago

#869 closed defect

Upload + unpack of compressed file may corrupt data — at Version 1

Reported by: Nicklas Nordborg Owned by: everyone
Priority: blocker Milestone: BASE 2.5.1
Component: web Version:
Keywords: Cc:

Description (last modified by Nicklas Nordborg)

When a compressed file is uploaded with the option to decompress it, things may go wrong. In the published release it seems that an exception is always thrown:

com.ice.tar.InvalidHeaderException: 
bad header in block 915 record 8, header magic is not 'ustar' or
unix-style zeros, it is '5649464895554', or 
(dec) 56, 49, 46, 48, 9, 55, 54
  at com.ice.tar.TarInputStream.getNextEntry(Unknown Source)
  at net.sf.basedb.plugins.TarFileUnpacker.unpack(TarFileUnpacker.java:163)
  at org.apache.jsp.filemanager.upload.upload_jsp._jspService(upload_jsp.java:205)
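The decimal values in the message are all printable ASCII. The following is a minimal sketch that decodes them; the class and method names are illustrative and not part of BASE or com.ice.tar. The result looks like a number, a tab and the start of another number, much like the tab-separated rows in the data files, which may indicate that the reader was positioned inside file data rather than at a header block.

```java
import java.nio.charset.StandardCharsets;

class HeaderMagicDecode {
    // Decimal byte values exactly as printed in the exception message.
    static final int[] REPORTED = {56, 49, 46, 48, 9, 55, 54};

    // Decode the values as ASCII and escape the tab so that it is visible.
    static String decode(int[] values) {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (byte) values[i];
        }
        return new String(bytes, StandardCharsets.US_ASCII).replace("\t", "\\t");
    }

    public static void main(String[] args) {
        System.out.println(decode(REPORTED)); // prints 81.0\t76
    }
}
```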

The exception seems to happen after a file has been processed. I have changed the code to ignore the exception just to see what happens. Everything seems fine, but there are two issues:

  • The decompressed file has been corrupted. This can be seen from the differing md5 values. The raw upload has the same md5 as I get if I run 'md5sum' from the local command prompt. Comparing the two files, I find that they start to differ at line 393511. See below for examples.
  • The tar.bz2 file is not complete. Its size on the BASE server (1.0 MB) is smaller than that of the original file (2.1 MB).
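The comparison described above can be reproduced with a small stand-alone program. This is only a sketch: the class name CompareUnpacked and its methods are made up for illustration and are not part of BASE. It prints the md5 of two files and the number of the first line where they differ. The files are read as ISO-8859-1 so that corrupted bytes cannot cause a decoding error.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class CompareUnpacked {
    // Hex-encoded MD5 of everything that can be read from the stream.
    static String md5Hex(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // 1-based number of the first line that differs, or -1 if all lines match.
    static long firstDifferingLine(BufferedReader a, BufferedReader b) {
        try {
            long line = 0;
            while (true) {
                String la = a.readLine();
                String lb = b.readLine();
                line++;
                if (la == null && lb == null) {
                    return -1;
                }
                if (la == null || !la.equals(lb)) {
                    return line;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Usage: java CompareUnpacked.java <original file> <unpacked file>
    public static void main(String[] args) throws IOException {
        Path original = Paths.get(args[0]);
        Path unpacked = Paths.get(args[1]);
        try (InputStream a = Files.newInputStream(original);
             InputStream b = Files.newInputStream(unpacked)) {
            System.out.println(md5Hex(a) + "  " + original);
            System.out.println(md5Hex(b) + "  " + unpacked);
        }
        try (BufferedReader a = Files.newBufferedReader(original, StandardCharsets.ISO_8859_1);
             BufferedReader b = Files.newBufferedReader(unpacked, StandardCharsets.ISO_8859_1)) {
            System.out.println("first differing line: " + firstDifferingLine(a, b));
        }
    }
}
```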

The most surprising thing is that if I upload the tar.bz2 without unpacking it, the file is uploaded correctly, and it is then possible to decompress it without corrupting the contents.

Examples of corruption

Original file around line 393511:

525	614	114.0	23.6	 20
526	614	250.0	29.6	 25
527	614	75.0	15.4	 25
528	614	455.5	52.1	 20
529	614	72.0	13.9	 25

Corrupted file:

525	614	114.0	23.6	 20
526	614	250.0	29.6	 25.0	 25
139	22	618.0	46.3	19	617	77.7	114.8	118.0	48.3.0	 25
34	62	633.0	16	627	77	64.0	32	6	 20
23	19	 20

Workaround

In the cases I have found, an exception is always thrown if the file is corrupted. However, since the exception seems to happen when looking for the next file, I am not sure that this is true in all possible cases. In all cases it also seems to work if the compressed file is first uploaded without decompressing it. Decompression can then be done using the "Run plugin" button on the single-item view page of the compressed file.
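The idea behind the workaround, storing the complete upload first and unpacking only afterwards, can be sketched as follows. This is not the BASE implementation. The class StoreThenUnpack, the expectedBytes parameter (the size reported by the client, if it is known) and the unpacker callback are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Consumer;

class StoreThenUnpack {
    // Copy the whole upload to a temporary file, check that nothing is
    // missing, and only then pass the stored file to the (hypothetical) unpacker.
    static Path storeThenUnpack(InputStream upload, long expectedBytes, Consumer<Path> unpacker) {
        Path stored;
        try {
            stored = Files.createTempFile("upload-", ".tmp");
            long written = Files.copy(upload, stored, StandardCopyOption.REPLACE_EXISTING);
            if (written != expectedBytes) {
                // Do not unpack a truncated file; remove it and report the sizes.
                Files.deleteIfExists(stored);
                throw new IllegalStateException(
                    "stored " + written + " bytes, expected " + expectedBytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        unpacker.accept(stored);
        return stored;
    }
}
```

With this arrangement the unpacker never reads from the network stream directly, so a short or misaligned read during the upload shows up as a size mismatch before any file is unpacked.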

Test data

I have added the test data files I have used to the testdata repository referenced from the DeveloperInformation page. They are located in the ticket.869 subdirectory, which contains three uncompressed files and three compressed files. The only one that can be uploaded and decompressed at the same time is the MG_U74Av2.cdf.bz2 file.

Change History (1)

comment:1 by Nicklas Nordborg, 16 years ago

Description: modified (diff)