Synergy File Reader for Python

This module allows you to read text files produced by Synergy plate readers in Python using appropriate Python data structures.

Example

Our data is located in the file example_data.txt in the same folder. We can load it like this:

from synergy_file_reader import SynergyFile
my_file = SynergyFile("example_data.txt")

Now my_file is (for all practical purposes) a list containing the individual plates in the file. In our case there is only one such plate, so it’s best to assign it to a separate variable:

my_plate = my_file[0]

In such a case, it often makes sense to load the file and extract the plate in one command. The following is equivalent to the above:

from synergy_file_reader import SynergyFile
my_plate = SynergyFile("example_data.txt")[0]

If you are familiar with Python data structures, everything else is straightforward now: Have a look at the properties and keys of my_plate and it should be clear how to access the information you need. If not, we proceed with a small tour:

my_plate has a series of properties containing information we may care about. Let’s start with channels:

print(my_plate.channels) # ['OD:600']

This tells us that we measured only one channel for our plate, which we called OD:600. We can now extract and print the times and temperatures for these measurements:

channel = my_plate.channels[0]
print(channel) # 'OD:600'
print(my_plate.times[channel])
print(my_plate.temperatures[channel])

The my_plate.metadata is a dictionary, which contains all sorts of tangential data for our plate, as long as it is contained in the file. For example, if we want to know the version of the software that produced our plate, we can do this as follows:

print(my_plate.metadata["Software Version"])

However, what we usually care about is the raw data. For each well and channel, this is stored in a NumPy array. We can access it by directly indexing my_plate. The following commands are all equivalent, with the last only working because we only have one channel:

print(my_plate["C3",channel])
print(my_plate["C",3,channel])
print(my_plate["C3"])

Our file includes no aggregated results such as growth rates computed by the Synergy software. If it did, they would be in the dictionary my_plate.results, each value of which is a SynergyResult, which can be indexed like my_plate.

If we want to access the raw data for all wells, we can do this with the values method, which mirrors the method of the same name for dictionaries. keys analogously gives us all well–channel combinations. In the following example, we take the first value of the time series for each well and compute the 10th percentile of these values as a baseline for our measurements:

import numpy as np
baseline = np.percentile( [ts[0] for ts in my_plate.values()], 10 )

The attributes rows and cols facilitate easy iterations over all rows or columns, respectively. We here compute the element-wise median of all time series in the first row (“A”), to use as a reference for comparisons:

reference = np.median( [my_plate["A",col] for col in my_plate.cols], axis=0 )
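
To see what the two computations above do without a Synergy file at hand, here is a minimal sketch on stand-in data. The dictionary toy_plate and its (row, column) keys are assumptions made for illustration only; they imitate how a plate with a single channel is indexed, but they are not part of the module.

```python
import numpy as np

# Stand-in for a plate: a plain dict mapping (row, column) to a time series.
# This only imitates indexing a SynergyPlate; it is not the module's class.
toy_plate = {
    ("A", 1): np.array([0.10, 0.20, 0.40]),
    ("A", 2): np.array([0.12, 0.25, 0.50]),
    ("A", 3): np.array([0.14, 0.21, 0.45]),
    ("B", 1): np.array([0.30, 0.60, 1.20]),
}

# Baseline: 10th percentile of the first value of every time series.
baseline = np.percentile([ts[0] for ts in toy_plate.values()], 10)

# Reference: element-wise median over the time series of row "A".
# axis=0 takes the median across wells, separately for each time point.
reference = np.median([toy_plate["A", col] for col in (1, 2, 3)], axis=0)

print(baseline)
print(reference)
```

The result of np.median with axis=0 has the same length as each time series, which is what allows it to serve as a reference curve.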

Finally, we plot our data. We here specify that the baseline we computed before shall be subtracted from all time series. Moreover, our reference should appear in each plot:

fig,axess = my_plate.plot(
		colours = ["red"],
		ylim = (1e-3,2), xlim = (0,30000),
		baseline = baseline,
		plot_args = {"linewidth":3},
		reference = reference,
		reference_plot_args = { "color":"black", "linewidth":1, "label":"no drug" },
	)

The remaining arguments are hopefully self-explanatory and also explained in the documentation of plot. fig is a Matplotlib figure object with all the functionality that comes with that. For example, we can use savefig to export our plot to a file:

fig.savefig( "example.pdf", bbox_inches="tight" )

And this is what our plot looks like:

_images/example.png

Preparing Files

The Synergy software allows you to export files in various different layouts and control all sorts of details as to what information should be included in the file. As long as the included information is not ambiguous, this module should be able to read it – if not, tell me.

If you already exported your files and did not choose a minimal format, just try and see whether they can be loaded. If you want to make sure that your file is readable and contains all information, do the following:

  • Choose Automatic content.

  • Whenever you have the option to Include something, do so.

  • Use tables and not matrices for everything. (The main problem of the latter is that temperature information is lost.)

  • Use Tab as a separator. (The other separators will probably work fine, but I am not sure that they do not lead to ambiguities.)

Sample IDs

Sample IDs are supported for reading files and indexing. However, the module discards some information automatically computed from blank data by the Synergy software. This is because such information is nasty to parse and integrate into the data structure, and I expect it to see little use: If you are using this module, I expect that you will want to calculate information like this yourself.

Under the Hood

If anybody cares, this module uses trial-and-error parsing. For every type of data block, there is a parser, which throws an error if fed with data that does not match the format of the type of data block. To parse a file, all of these parsers are applied one by one until one doesn’t throw an error, which is then taken to reflect the true structure of the data. This process is then repeated with the remainder of the file until there are no more lines left.

This makes the parser rather flexible to expand, but also makes it impossible to pinpoint where exactly the parsing fails. A file simply becomes unparsable once it contains a data block for which all of the implemented parsers fail.
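
The following is a minimal sketch of this trial-and-error approach, not the module's actual code. The two block formats, the names ParseError, parse_metadata, parse_numbers, and parse_all, and the convention of returning the number of consumed lines are all assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised by a block parser when the lines do not match its format."""

def parse_metadata(lines):
    # Hypothetical block format: a single "key: value" line.
    key, sep, value = lines[0].partition(": ")
    if not sep:
        raise ParseError("not a metadata line")
    return ("metadata", key, value), 1

def parse_numbers(lines):
    # Hypothetical block format: a single line of tab-separated numbers.
    try:
        values = [float(field) for field in lines[0].split("\t")]
    except ValueError as error:
        raise ParseError("not a row of numbers") from error
    return ("numbers", values), 1

PARSERS = [parse_metadata, parse_numbers]

def parse_all(lines):
    blocks = []
    while lines:
        for parser in PARSERS:
            try:
                block, consumed = parser(lines)
            except ParseError:
                continue  # wrong format, try the next parser
            blocks.append(block)
            lines = lines[consumed:]  # repeat with the remainder
            break
        else:
            # Every parser failed; all we know is where the remainder starts.
            raise ParseError(f"no parser matches the block at {lines[0]!r}")
    return blocks

blocks = parse_all(["Software Version: 3.02", "0.1\t0.2\t0.3"])
print(blocks)
```

Supporting a new kind of block in this sketch only means appending another function to PARSERS. The final error illustrates the drawback: it can say which block matched no parser, but not which parser should have matched or why it failed.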

Data Structure

SynergyFile is a collection of plates, each of which is a SynergyPlate. SynergyPlate inherits from SynergyResult, which is used for the raw data. The results of a SynergyPlate are also of the type SynergyResult.

Special values are parsed as follows:

  • ????? and empty fields are parsed as nan (not a number)

  • OVRFLW is parsed as inf (+∞)

  • < followed by a number is parsed as zero.
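
As a rough illustration of these rules, a single field could be converted as in the sketch below. The function name parse_value is hypothetical, and the module's own implementation may differ.

```python
import math

def parse_value(field):
    """Convert one field of a data table to a float (illustrative sketch)."""
    field = field.strip()
    if field in ("", "?????"):
        return math.nan   # missing or invalid measurement
    if field == "OVRFLW":
        return math.inf   # measurement beyond the detector's range
    if field.startswith("<"):
        float(field[1:])  # raises ValueError unless a number follows
        return 0.0        # below the detection limit
    return float(field)

print([parse_value(f) for f in ["0.25", "?????", "", "OVRFLW", "<0.001"]])
```

Keep in mind that nan and inf propagate through most NumPy operations, so you may want functions such as np.nanmedian or a mask from np.isfinite when aggregating such data.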

Command Reference

class SynergyResult(sample_ids=None)

A single well- and channel-wise result.

You can index this in different ways, using the well F7 for the channel "OD" as an example:

  • Row letter, column number, and channel separately: result["F",7,"OD"]

  • Well identifier in one string: result["F7","OD"]

  • If there is only one channel, you do not need to specify it: result["F7"]

  • If your plate contains information on sample IDs, you can also use those for indexing: result["SPL55","OD"]. If the sample ID is not unique, a list of the respective content of all matching wells will be returned.
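
To illustrate how the first three index forms relate to each other, here is a sketch that reduces them to one canonical (row, column, channel) triple. The function normalise_key is hypothetical and not part of the module; lookup by sample ID is left out.

```python
import re

def normalise_key(key, channels):
    """Turn a supported index form into (row, column, channel)."""
    if not isinstance(key, tuple):
        key = (key,)
    parts = list(key)
    # Split a combined well identifier like "F7" into "F" and 7.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", str(parts[0]))
    if match:
        parts[0:1] = [match.group(1), int(match.group(2))]
    # The channel may only be omitted if there is exactly one.
    if len(parts) == 2:
        if len(channels) != 1:
            raise KeyError("channel must be specified if there are several")
        parts.append(channels[0])
    if len(parts) != 3:
        raise KeyError(key)
    return tuple(parts)

print(normalise_key(("F", 7, "OD"), ["OD"]))
print(normalise_key(("F7", "OD"), ["OD"]))
print(normalise_key("F7", ["OD"]))
```

All three calls yield the same triple, which mirrors why the three index forms are interchangeable when there is only one channel.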

It comes with the following attributes and methods:

  • rows and cols are lists of the row letters and column numbers, respectively.

  • channels is a list of all channels for which recordings exist.

  • keys and values are methods similar to those for dictionaries, returning iterables of all keys and values respectively.

  • If your plate contains information on sample IDs, sample_ids contains the mapping of sample IDs to wells.

class SynergyPlate

Data for a single plate.

Raw data can be accessed by indexing this object directly like a SynergyResult.

This usually comes with the following methods and attributes:

  • rows and cols are lists of the row letters and column numbers, respectively.

  • channels is a list of all channels for which recordings exist.

  • keys and values are methods similar to those for dictionaries, returning iterables of all keys and values respectively.

  • times is a dictionary specifying the times of measurements (in seconds) for each channel.

  • temperature_range contains the minimal and maximal temperature specified in the file. This almost always contains some meaningful information.

  • temperatures is a dictionary specifying the temperatures at the times of measurements for each channel. This is empty unless the file specifies this information.

  • metadata is a dictionary of metadata like the time of the measurement, procedure details, file paths, and information about the device.

  • results is a dictionary of SynergyResult. These are usually aggregated estimates by the plate-reader software such as of the growth rate, lag time, etc.

  • gains is a dictionary containing the automatic gains determined for each channel, if such exist. Otherwise it’s empty.

  • plot allows you to quickly plot the data and has extensive documentation.

plot(*, channels=None, colours=None, xlim=None, ylim=None, baseline=0, log_y=True, plot_args={}, timescale=None, label_pad=20, label_size='xx-large', reference=None, reference_plot_args={})

Plots time series of the raw data using Matplotlib.

This is a service function to quickly obtain a decent plot. It does not cover all potential use cases and is not suited as a basis for extensive customisations. If you want a starting point for the latter, have a look at this example.

Parameters:
channels: iterable of keys or None

The channels to be plotted. If None, all channels will be plotted.

colours: iterable of colour names or None

The colours to use for the respective channels. If None, Matplotlib’s default colour cycle will be used.

xlim, ylim: pair of numbers or None

The ranges of the plot. If None this will be chosen automatically by Matplotlib.

baseline: number or array

This will be subtracted from the values of all time series before plotting (including the reference).

log_y: boolean

Whether the ordinate should be logarithmic.

plot_args: dictionary

Further keyword arguments to be passed to every plot (except the reference).

timescale: one of “seconds”, “minutes”, “hours”, “days”, or None

Which timescale to use for labelling. If None, this will be guessed.

label_pad: number

Padding of row and column labels.

label_size: Matplotlib font size

Size of row and column labels.

reference: array or None

A reference time series to be plotted below all time series. The times used will be the ones for the first channel.

reference_plot_args: dictionary

Keyword arguments to be passed to the plot command of the reference time series. This includes color and label.

Returns:
figure: Matplotlib figure object
axess: array of Matplotlib axes objects

class SynergyFile(filename, separator='\t', encoding='iso-8859-1', verbose=False)

Represents the contents of a Synergy file. For most practical purposes, you can treat this like a list of SynergyPlate. Often this list contains only one plate.

Parameters:
filename: string

The location of the file from which to read the data.

separator: string

The separator character used in the file.

encoding: string specifying a supported encoding

The encoding of the file. This cannot be automatically detected.

verbose: boolean

Whether to report details about the parsing process. Mostly useful for debugging.