Opened 11 years ago

Closed 11 years ago

#1485 closed enhancement (fixed)

File items should be able to reference external files

Reported by: Nicklas Nordborg
Owned by: Nicklas Nordborg
Priority: major
Milestone: BASE 2.16
Component: core
Version:
Keywords:
Cc:

Description (last modified by Nicklas Nordborg)

If we add a URL field to a file item, it should be possible to link a file item in BASE with a file on the internet. It should be "invisible" to users in the sense that File.getDownloadStream() should act as a proxy for the file. To begin with we should support at least http and https URLs.

We need to investigate how some of the other file properties should be interpreted. For example:

  • Location: can be PRIMARY, SECONDARY or OFFLINE. Many places only work when location=PRIMARY, since that is the only setting where getDownloadStream() returns any data, but PRIMARY also means that the file should be located on the BASE server. Hmm... maybe we should add a fourth option (EXTERNAL?) and add a notice about a possible incompatible change.
  • Size: The file size is usually stored automatically when the file is uploaded and is retained if the file is taken offline. Some code may require file size > 0. Since we don't know the size of external files (or can we issue a HEAD request to find out?), we may have to update some code that makes decisions based on the size.
  • External files should not contribute to quota as far as BASE is concerned.

For instructions on how to set up a test environment with an https server that only accepts clients with a trusted certificate, see HttpsFiles.

Change History (33)

comment:1 Changed 11 years ago by Nicklas Nordborg

Owner: changed from everyone to Nicklas Nordborg
Status: changed from new to assigned

comment:2 Changed 11 years ago by Nicklas Nordborg

(In [5325]) References #1485: File items should be able to reference external files

Added URL field to File item and Location.EXTERNAL. Updated test code with some more test cases and file download. This change can be a major issue for code that relies on File.getSize() or File.getLocation().

Plug-ins that use the file size to report progress most likely need to be updated to cover the case where the file size is not known.

comment:3 Changed 11 years ago by Nicklas Nordborg

(In [5326]) References #1485: File items should be able to reference external files

It is now possible to add external files using the GUI. Fixed the list and view pages so that they behave correctly for external files.

comment:4 Changed 11 years ago by Nicklas Nordborg

(In [5327]) References #1485: File items should be able to reference external files

Fixed code that uses Location.PRIMARY to determine if a file is usable or not:

  • Code that throws error if location != PRIMARY
  • Code that adds a filter location==PRIMARY to list pages when selecting files
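
The kind of check being replaced can be sketched as follows. This is only an illustration: the enum FileLocation and its isDownloadable() method are hypothetical names and not the BASE API. It assumes, as described in this ticket, that PRIMARY files are stored on the server and that EXTERNAL files are fetched from their URL, so both can deliver data.

```java
// Hypothetical stand-in for the four location values discussed in this ticket.
enum FileLocation {
    PRIMARY, SECONDARY, OFFLINE, EXTERNAL;

    // Instead of testing location == PRIMARY, ask whether data can be delivered.
    // Assumption: only PRIMARY (local) and EXTERNAL (remote URL) files return data.
    boolean isDownloadable() {
        return this == PRIMARY || this == EXTERNAL;
    }
}

class FileLocationSketch {
    // Throws if the file cannot be used, mirroring the "location != PRIMARY" checks.
    static void requireDownloadable(FileLocation location) {
        if (!location.isDownloadable()) {
            throw new IllegalStateException("File is not available for download: " + location);
        }
    }

    public static void main(String[] args) {
        for (FileLocation location : FileLocation.values()) {
            System.out.println(location + " downloadable: " + location.isDownloadable());
        }
    }
}
```

Putting the rule in one place means that code which filters list pages and code which throws errors cannot drift apart.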

comment:5 Changed 11 years ago by Nicklas Nordborg

(In [5328]) References #1485: File items should be able to reference external files

Redirect file links in the GUI for EXTERNAL files to the external URL.

comment:6 Changed 11 years ago by Nicklas Nordborg

(In [5329]) References #1485: File items should be able to reference external files

Fixed code that uses File.getSize() for things like progress reporting.

Fixed progress reporter implementations so that they can handle -1 to indicate unknown progress.
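
A plug-in that derives its progress from the file size could handle the unknown case roughly like this. The class and method names are hypothetical and not part of BASE. The sketch assumes that an unknown size shows up as a value that is zero or negative and, as described above, uses -1 to signal unknown progress.

```java
class ProgressSketch {
    // Value passed to the progress reporter when the total is not known.
    static final int UNKNOWN = -1;

    // Percentage in the range 0..100, or UNKNOWN if the total size is not usable.
    static int percentDone(long bytesRead, long totalSize) {
        if (totalSize <= 0) {
            return UNKNOWN; // external file without a known size
        }
        long percent = bytesRead * 100 / totalSize;
        return (int) Math.max(0, Math.min(100, percent));
    }

    // Message that stays meaningful even without a percentage.
    static String describe(long bytesRead, long totalSize) {
        int percent = percentDone(bytesRead, totalSize);
        return percent == UNKNOWN
            ? bytesRead + " bytes read (total size unknown)"
            : percent + "% done";
    }

    public static void main(String[] args) {
        System.out.println(describe(50, 200));
        System.out.println(describe(50, -1));
    }
}
```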

comment:7 Changed 11 years ago by Nicklas Nordborg

(In [5330]) References #1485: File items should be able to reference external files

Added data for external files in quota listings.

comment:8 Changed 11 years ago by Nicklas Nordborg

(In [5331]) References #1485: File items should be able to reference external files

Fixes some issues with links to the external file in the view and list pages.

comment:9 Changed 11 years ago by Nicklas Nordborg

(In [5332]) References #1485: File items should be able to reference external files

Implemented option to issue a HEAD request to a given URL to get metadata about the target. Most importantly we are interested in the size, but we can also get last modification time, MIME type and character set.
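
For reference, a HEAD request of this kind can be sketched with the standard java.net classes. This is not the code from the commit; the class, the Metadata holder and the helper methods are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

class HeadRequestSketch {
    // Values reported by the server; size is -1 and lastModified is 0 when unknown.
    static final class Metadata {
        final long size;
        final long lastModified;
        final String mimeType;
        final String charset;

        Metadata(long size, long lastModified, String mimeType, String charset) {
            this.size = size;
            this.lastModified = lastModified;
            this.mimeType = mimeType;
            this.charset = charset;
        }
    }

    // Issues a HEAD request, so only headers are transferred and no file data.
    static Metadata head(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("HEAD");
        try {
            int status = con.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("HEAD " + url + " returned status " + status);
            }
            String contentType = con.getContentType();
            return new Metadata(con.getContentLengthLong(), con.getLastModified(),
                mimeTypeOf(contentType), charsetOf(contentType));
        } finally {
            con.disconnect();
        }
    }

    // "text/plain; charset=UTF-8" -> "text/plain"
    static String mimeTypeOf(String contentType) {
        if (contentType == null) return null;
        int semicolon = contentType.indexOf(';');
        return (semicolon < 0 ? contentType : contentType.substring(0, semicolon)).trim();
    }

    // "text/plain; charset=UTF-8" -> "UTF-8", or null if no charset parameter is given
    static String charsetOf(String contentType) {
        if (contentType == null) return null;
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                return p.substring("charset=".length()).replace("\"", "").trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Metadata m = head(new URL(args[0])); // URL is supplied by the caller
        System.out.println(m.size + " bytes, " + m.mimeType + ", charset=" + m.charset);
    }
}
```

Servers are not required to send Content-Length or Last-Modified, so callers still have to cope with an unknown size.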

comment:10 Changed 11 years ago by Nicklas Nordborg

(In [5333]) References #1485: File items should be able to reference external files

The change in TestFile should have been part of the last commit...

comment:11 Changed 11 years ago by Nicklas Nordborg

(In [5334]) References #1485: File items should be able to reference external files

File download test failed on Windows because Subversion modified line breaks in the reference file. Do not change the svn:eol-style of this file!

comment:12 Changed 11 years ago by Nicklas Nordborg

(In [5335]) References #1485: File items should be able to reference external files

Improved error handling for non-existing URLs.

comment:13 in reply to:  11 Changed 11 years ago by Jari Häkkinen

Replying to nicklas:

(In [5334]) References #1485: File items should be able to reference external files

File download test failed on Windows because Subversion modified line breaks in the reference file. Do not change the svn:eol-style of this file!

For future reference: another option could be to set the file as binary, since Subversion doesn't tamper with binary files.

comment:14 Changed 11 years ago by Nicklas Nordborg

So far it seems to work fine with http and also with https if the server has a valid certificate and doesn't require any authentication. It doesn't work with self-signed, expired or other invalid certificates or if the server requires authentication.

The Proteios project has solved some parts of this by creating a specific SSLSocketFactory that reads information about trusted servers and authentication information from certificate files stored on the server.

See Application.java, lines 923-1044

comment:15 Changed 11 years ago by Nicklas Nordborg

Resolution: fixed
Status: changed from assigned to closed

(In [5336]) Fixes #1485: File items should be able to reference external files

Added support for using key-store and trust-store where authentication certificates and trusted self-signed certificates can be added.
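
How a key-store and a trust-store are typically combined into an SSL context with the standard JSSE classes can be sketched as follows. This is an illustration only and not the BASE implementation; the class name, method names and command-line arguments are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

class TlsContextSketch {
    // Loads a store of the JDK's default type from a file.
    static KeyStore load(Path file, char[] password) throws IOException, GeneralSecurityException {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(file)) {
            store.load(in, password);
        }
        return store;
    }

    // keyStore: client certificates used to authenticate against the remote server.
    // trustStore: server certificates we accept, e.g. self-signed ones.
    static SSLContext contextFor(KeyStore keyStore, char[] keyPassword, KeyStore trustStore)
            throws GeneralSecurityException {
        KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, keyPassword);
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
        return context;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical arguments: key-store file, trust-store file, shared password.
        char[] password = args[2].toCharArray();
        SSLContext context = contextFor(
            load(Paths.get(args[0]), password), password, load(Paths.get(args[1]), password));
        // The socket factory can then be set on an HttpsURLConnection.
        System.out.println("Created context for protocol " + context.getProtocol());
    }
}
```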

comment:16 Changed 11 years ago by Nicklas Nordborg

Description: modified (diff)

comment:17 Changed 11 years ago by Nicklas Nordborg

(In [5337]) References #1485: File items should be able to reference external files

I managed to set up a test environment with an Apache server using https that only accepts clients with a trusted certificate. This has been documented on the wiki.

comment:18 Changed 11 years ago by Nicklas Nordborg

(In [5342]) References #1485: File items should be able to reference external files

The test program now uses a custom trust-store so that the test case with an https URL works.

comment:19 Changed 11 years ago by Nicklas Nordborg

Resolution: fixed
Status: changed from closed to reopened

There may be a performance issue with the current implementation when the same file is accessed multiple times within a short time span. A typical example is when the "auto detect file format" function is used. Each test will open a new connection and download the file again. OK, in most cases only the first few header lines are needed so it is not too bad, but...

I think it would be good if the file could be cached locally per transaction. For example, the first request to download the file simultaneously copies the downloaded data to a temporary local file. If the close() method is called before the complete file has been downloaded, the underlying http connection should be kept open. Then, if there are more requests to download the file (e.g. when checking the next file format), we start by reading the local data. If we need more data, we continue with the download (and append this to the local file). This procedure can be repeated as many times as we need. When the transaction ends, we close the underlying http connection and remove the local file data.
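
The idea can be sketched as follows. This is only an illustration of the procedure described above, not an implementation for BASE. The class name is hypothetical, an in-memory buffer stands in for the temporary local file, and any InputStream stands in for the open http connection.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class CachedDownloadSketch implements Closeable {
    private final InputStream remote;      // stays open between readers
    private byte[] cache = new byte[8192]; // stand-in for the temporary local file
    private int cached = 0;                // number of bytes downloaded so far
    private boolean exhausted = false;     // true once the remote stream has ended

    CachedDownloadSketch(InputStream remote) {
        this.remote = remote;
    }

    // Number of bytes that have been fetched from the remote source.
    int bytesFetched() {
        return cached;
    }

    // Returns the byte at the given position, downloading more data only if needed.
    private int byteAt(int pos) throws IOException {
        while (pos >= cached && !exhausted) {
            int b = remote.read();
            if (b < 0) {
                exhausted = true;
            } else {
                if (cached == cache.length) cache = Arrays.copyOf(cache, cached * 2);
                cache[cached++] = (byte) b;
            }
        }
        return pos < cached ? (cache[pos] & 0xff) : -1;
    }

    // Each reader starts at the beginning: cached data first, then the remote stream.
    InputStream open() {
        return new InputStream() {
            private int pos = 0;

            @Override
            public int read() throws IOException {
                int b = byteAt(pos);
                if (b >= 0) pos++;
                return b;
            }
            // close() is inherited and does nothing, so the remote stream stays open.
        };
    }

    // Called when the transaction ends: close the connection and drop the cached data.
    @Override
    public void close() throws IOException {
        remote.close();
        cache = new byte[0];
        cached = 0;
        exhausted = true;
    }
}
```

With this approach a second file-format check that reads the same header lines never touches the network, and each remote byte is downloaded at most once per transaction. Reading one byte at a time keeps the sketch short; a real implementation would transfer blocks.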

comment:20 Changed 11 years ago by Nicklas Nordborg

Status: changed from reopened to new

comment:21 Changed 11 years ago by Nicklas Nordborg

Status: changed from new to assigned

comment:22 Changed 11 years ago by Nicklas Nordborg

(In [5353]) References #1485: File items should be able to reference external files

Adds caching capabilities for external (and compressed) files. Use File.getCachedDownloadStream() instead of File.getDownloadStream().

comment:23 Changed 11 years ago by Nicklas Nordborg

Resolution: fixed
Status: changed from assigned to closed

comment:24 Changed 11 years ago by Nicklas Nordborg

Resolution: fixed
Status: changed from closed to reopened

There are some more things that I want to do...

Add support for password-protected files, e.g. those where the browser pops up a dialog asking for a username and password. Setting the username and password in the URL (http://username:password@host/...) will not work, and it is a really bad idea anyway since they become visible to everyone. The built-in http functionality in Java supports Basic authentication, which is not very secure since the credentials are only base64-encoded. Digest authentication is more secure since it sends an MD5 hash instead of the password itself. I think everything we need is supported by the Apache HttpComponents library (http://hc.apache.org/).

My idea is to add a new FileServer item to BASE. The main purpose is that it can hold a username/password (and possibly more in the future). An external file item that requires authentication can then be linked with the file server.
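
The difference between the two schemes can be illustrated with plain JDK classes. This is a sketch only: the class and method names are hypothetical, and the Digest calculation is the simplest variant (without the qop extension), which is not necessarily what a given server negotiates.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

class HttpAuthSketch {
    // Basic: the Authorization header is just base64("user:password").
    static String basicHeader(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    // Anyone who sees a Basic header can recover the credentials like this.
    static String decodeBasic(String header) {
        byte[] raw = Base64.getDecoder().decode(header.substring("Basic ".length()));
        return new String(raw, StandardCharsets.UTF_8);
    }

    static String md5Hex(String text) {
        try {
            byte[] hash = MessageDigest.getInstance("MD5").digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required on every Java platform", e);
        }
    }

    // Digest (without qop): only a hash that depends on the server's nonce is sent.
    static String digestResponse(String user, String realm, String password,
                                 String method, String uri, String nonce) {
        String ha1 = md5Hex(user + ":" + realm + ":" + password);
        String ha2 = md5Hex(method + ":" + uri);
        return md5Hex(ha1 + ":" + nonce + ":" + ha2);
    }

    public static void main(String[] args) {
        String header = basicHeader("user", "secret");
        System.out.println(header + " decodes to " + decodeBasic(header));
        System.out.println(digestResponse("user", "files", "secret", "GET", "/data.txt", "abc123"));
    }
}
```

Because the Digest response depends on the nonce, a captured response cannot simply be replayed against a new challenge, whereas a captured Basic header reveals the password directly.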

It would also be nice to add fields for SSL certificates and personal keys to the file server item. This would be a better solution than having to register things that are personal to each user in the global keystores for the server.

comment:25 Changed 11 years ago by Nicklas Nordborg

Status: changed from reopened to new

comment:26 Changed 11 years ago by Nicklas Nordborg

Status: changed from new to assigned

comment:27 Changed 11 years ago by Nicklas Nordborg

(In [5359]) References #1485: File items should be able to reference external files

Added Apache HttpComponents.

comment:28 Changed 11 years ago by Nicklas Nordborg

(In [5360]) References #1485: File items should be able to reference external files

Added FileServer item. Changed download code to use Apache HttpComponents.

Need to add a GUI also, and maybe think a second time about whether the view/download links in BASE should redirect (as they do now) or if BASE should be used as a proxy. The drawback with redirecting is that authentication information is "lost" and users need to enter the password again in the browser. This may not be so good if they share files with other users but don't want to give away their username/password to the external site.

comment:29 Changed 11 years ago by Nicklas Nordborg

(In [5361]) References #1485: File items should be able to reference external files

Added GUI for managing file servers. Changed the download servlet to act as a proxy instead of redirecting for external files.

comment:30 Changed 11 years ago by Nicklas Nordborg

(In [5362]) References #1485: File items should be able to reference external files

Added support for registering server and client certificates to a FileServer item. This means that there is almost no reason to do this globally on the server.

comment:31 Changed 11 years ago by Nicklas Nordborg

(In [5366]) References #1485: File items should be able to reference external files

The old commons-logging jar file (updated in [5359]) was used by the docbook build script. Fixed so that the actual version number we use doesn't matter.

comment:32 Changed 11 years ago by Nicklas Nordborg

(In [5408]) References #1485: File items should be able to reference external files

Updated data-layer documentation with FileServerData and related information.

comment:33 Changed 11 years ago by Nicklas Nordborg

Resolution: fixed
Status: changed from assigned to closed