Chapter 11. Experimental platforms and data file types

Table of Contents

11.1. Platforms
11.2. Platform variants
11.3. Data file types
11.4. Selecting files for an item

11.1. Platforms

An experimental platform in BASE can be seen as an item representing the set of data file types that are produced or needed by a given experimental setup. For example, the Affymetrix platform (as defined in BASE) uses CEL files for raw data and CDF files for array design information. The concept of a platform is also tightly coupled to the ability to keep data in files instead of importing it into the database. Once you have selected a platform for a raw bioassay or an array design, you also know which files you should provide.

BASE comes pre-installed with three platforms.

  • A generic platform that can be used with almost any type of data that can be imported into the database from simple column-based text files.

  • The Affymetrix platform, which keeps data in CEL and CDF files instead of importing it into the database.

  • A sequencing platform which uses GTF files to define array designs and FPKM counts as raw data.

Other platforms, such as Illumina, are available as non-core plug-in packages; see Section 3.2, “BASE plug-ins site”. An administrator may define additional platforms and file types.

You can manage platforms by going to Administrate → Platforms → Experimental platforms.

Figure 11.1. Platform properties


Name

The name of the platform

External ID

An ID that is used to identify the platform. The ID must be unique and can't be changed once the platform has been created.

File-only

Whether the platform is a file-only platform. File-only platforms can't have their data imported into the database. This option can't be changed once the platform has been created.

Raw data type

If you have selected file-only=no, you may select a raw data type. This will lock this platform to the selected raw data type. If you select - any -, raw data of any raw data type can be used. This option can't be changed once the platform has been created.

Channels

If you have selected file-only=yes, you must enter the number of channels the platform uses. This information is needed in the analysis module of BASE to create the proper database tables. This option can't be changed once the platform has been created.

Description

A description of the platform.
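The interplay between the File-only, Raw data type and Channels properties can be sketched in a few lines of Python. This is only an illustration of the rules described above, not BASE's own data model: the class name `PlatformSketch`, its field names and the example IDs are hypothetical, and the check that rejects a raw data type on a file-only platform is an assumption drawn from the wording of the Raw data type property.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlatformSketch:
    """Hypothetical stand-in for a platform's properties (not a BASE class)."""
    name: str
    external_id: str
    file_only: bool
    raw_data_type: Optional[str] = None  # None plays the role of "- any -"
    channels: Optional[int] = None
    description: str = ""

    def __post_init__(self):
        if self.file_only:
            # file-only=yes: the number of channels must be given.
            if self.channels is None or self.channels < 1:
                raise ValueError("a file-only platform needs a channel count")
            # Assumption: a raw data type is only meaningful for file-only=no.
            if self.raw_data_type is not None:
                raise ValueError("raw data type applies only when file-only=no")


# A file-only platform with one channel (values are illustrative).
file_only = PlatformSketch("Example file-only", "example.file-only",
                           file_only=True, channels=1)

# A database platform that accepts raw data of any raw data type.
generic = PlatformSketch("Example generic", "example.generic", file_only=False)
```

The sketch does not model the fact that the external ID, File-only, Raw data type and Channels settings are locked once the platform has been created.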

Figure 11.2. Select data file types


Data file types

This list contains the file types already associated with this platform. An [x] at the end of the name indicates a required file.

Required

Mark this checkbox to indicate that the file is required for the platform.

[Note] Note

The required flag is not enforced when creating items. It is used for generating warnings when validating an experiment and can be checked by plug-ins that may need the file in order to run.

Allow multiple files

Mark this checkbox if it is possible to have more than one file of the given type. In most cases, a single file is used, but some platforms (for example Illumina) may split data into multiple files.

Add data file types

Opens a popup window that allows you to add more file types to the platform.

Remove

Removes the selected file types from the platform.
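As a rough illustration of how the Required and Allow multiple files flags might be interpreted by a validation routine or a plug-in, consider the following Python sketch. The names `PlatformFileType` and `check_files` and the file type IDs are hypothetical and not part of BASE. Reporting more than one file for a single-file type as an error is an assumption made for the sake of the example; a missing required file only produces a warning, in line with the note above.

```python
from dataclasses import dataclass


@dataclass
class PlatformFileType:
    """Hypothetical link between a platform and one data file type."""
    file_type_id: str
    required: bool = False
    allow_multiple: bool = False


def check_files(platform_file_types, attached):
    """Return (warnings, errors) for the files attached to an item.

    `attached` maps a file type ID to the list of attached file paths.
    """
    warnings, errors = [], []
    for pft in platform_file_types:
        files = attached.get(pft.file_type_id, [])
        if pft.required and not files:
            # The required flag is not enforced; it only yields a warning.
            warnings.append(f"missing required file: {pft.file_type_id}")
        if len(files) > 1 and not pft.allow_multiple:
            # Assumption: extra files for a single-file type are an error.
            errors.append(f"only one file allowed: {pft.file_type_id}")
    return warnings, errors


file_types = [
    PlatformFileType("example.raw", required=True),
    PlatformFileType("example.split-data", allow_multiple=True),
]
warnings, errors = check_files(
    file_types, {"example.split-data": ["part1.txt", "part2.txt"]}
)
print(warnings)  # ['missing required file: example.raw']
print(errors)    # []
```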

11.2. Platform variants

An administrator can define variants of a platform. The main purpose of a variant is to make additional file types available that are only used in some cases. The file types defined by the parent platform are always inherited by its variants.
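The inheritance rule can be pictured with a small Python sketch in which a variant reports its parent's file types followed by its own additions. The class `PlatformNode`, its attributes and the variant's extra file type are hypothetical names chosen for illustration; BASE's internal representation may look quite different.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlatformNode:
    """Hypothetical platform or variant; a variant has a parent platform."""
    name: str
    own_file_types: List[str] = field(default_factory=list)
    parent: Optional["PlatformNode"] = None

    def file_types(self):
        """Inherited file types first, then the variant's own additions."""
        inherited = self.parent.file_types() if self.parent else []
        extra = [ft for ft in self.own_file_types if ft not in inherited]
        return inherited + extra


affymetrix = PlatformNode("Affymetrix", ["CEL", "CDF"])
variant = PlatformNode("Example variant", ["extra-file"], parent=affymetrix)

print(variant.file_types())     # ['CEL', 'CDF', 'extra-file']
print(affymetrix.file_types())  # ['CEL', 'CDF']
```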

You can create new variants from the single-item view of a platform. This view also has a Variants tab, which lists all variants that have been defined for the platform.

11.3. Data file types

Each file type used by a platform must be registered as a data file type. For example, CEL and CDF files are file types used by the Affymetrix platform. A data file type serves several purposes:

  • Describe the file type and make it identifiable. Each file type must have a unique ID, which makes it possible to find out if a specific file has been added to an item, for example to find the CEL file of a raw bioassay.

  • Connect a specific file type with a generic file type. For example, the CEL file is used to store raw data for an experiment. Another platform may use a different file type. Both file types are of the generic type raw data. This makes it possible for client applications or plug-ins to find the raw data for an experiment without actually knowing which file types are used on the various platforms.

  • Make it possible to validate and extract metadata from attached files. This is done by extensions. Currently, BASE ships with extensions for CEL, CDF and GTF files, but the administrator may have installed extensions for other file types. See Section 27.8.9, “Fileset validators” for more information about creating extensions.
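The first two purposes in the list above can be illustrated with a short Python sketch: one lookup by the unique ID of a file type and one lookup by generic type. The names `DataFileTypeSketch`, `find_by_id` and `find_by_generic_type`, as well as the IDs and file names, are hypothetical and only serve to show the idea; they do not correspond to the BASE API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFileTypeSketch:
    """Hypothetical data file type with a unique ID and a generic type."""
    external_id: str
    generic_type: str


def find_by_id(attached, external_id):
    """Return the paths of attached files of one specific file type."""
    return [path for ft, path in attached if ft.external_id == external_id]


def find_by_generic_type(attached, generic_type):
    """Return attached file paths without knowing the specific file type."""
    return [path for ft, path in attached if ft.generic_type == generic_type]


cel = DataFileTypeSketch("example.cel", "raw data")
cdf = DataFileTypeSketch("example.cdf", "reporter map")

# Files attached to some item, as (file type, path) pairs.
attached = [(cel, "sample1.cel"), (cdf, "design.cdf")]

print(find_by_id(attached, "example.cel"))         # ['sample1.cel']
print(find_by_generic_type(attached, "raw data"))  # ['sample1.cel']
```

A plug-in written against the generic type would keep working if another platform attached its raw data under a different file type.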

You can manage data file types by going to Administrate → Platforms → Data file types.

Figure 11.3. Data file type properties


Name

The name of the file type.

External ID

An ID that is used to identify the file type. The ID must be unique and can't be changed once the file type has been created.

Item type

The type of item that files of this type can be attached to. This option can't be changed once the file type has been created.

File extension

The commonly used file extension for files of this type. Optional.

Generic type

The generic type of data that files of this type contain. For example, CEL files contain raw data and CDF files contain a reporter map (in BASE terms).

Description

A description of the file type.

11.4. Selecting files for an item

Selecting files for an item follows the same pattern for all items that support it. They all have a Data files tab in their edit view. On this tab you can select files for all file types that are defined by the platform or subtype of the item.

Figure 11.4. Selecting files for an array design


The list contains all file types that are defined by the platform or subtype that is selected on the Properties tab. Use the Browse or Add icons to select files from the file manager. Note that some file types accept only a single file, while others allow multiple files. The dropdown list contains recently used files as well as an option to clear the selected file.

Validate files

Mark this checkbox if you want to validate and extract metadata from the selected files. The checkbox is automatically checked if changes are made.

[Note] Note

Validation and metadata extraction are performed by extensions. The checkbox is only visible if there is at least one installed extension that supports validation of the current file types.