Synergy File Reader for Python¶
This module allows you to read text files produced by Synergy plate readers in Python using appropriate Python data structures.
Example¶
Our data is located in the file example_data.txt in the same folder. We can load it like this:
from synergy_file_reader import SynergyFile
my_file = SynergyFile("example_data.txt")
Now my_file is (for all practical purposes) a list containing the individual plates in the file. In our case there is only one such plate, so it’s best to assign it to a separate variable:
my_plate = my_file[0]
In such a case, it often makes sense to load the file and extract the plate in one command. The following is equivalent to the above:
from synergy_file_reader import SynergyFile
my_plate = SynergyFile("example_data.txt")[0]
If you are familiar with Python data structures, everything else is straightforward now: have a look at the properties and keys of my_plate and it should be clear how to access the information you need. If not, we proceed with a small tour:
my_plate has a series of properties containing information we may care about. Let’s start with channels:
print(my_plate.channels) # ['OD:600']
This tells us that we measured only one channel for our plate, which we called OD:600. We can now extract and print the times and temperatures for these measurements:
channel = my_plate.channels[0]
print(channel) # 'OD:600'
print(my_plate.times[channel])
print(my_plate.temperatures[channel])
my_plate.metadata is a dictionary that contains all sorts of tangential data for our plate, as long as it is contained in the file. For example, if we want to know the version of the software that produced our plate, we can look it up as follows:
print(my_plate.metadata["Software Version"])
However, what we usually care about is the raw data. For each well and channel, this is stored in a NumPy array. We can access it by directly indexing my_plate. The following commands are all equivalent, with the last only working because we only have one channel:
print(my_plate["C3",channel])
print(my_plate["C",3,channel])
print(my_plate["C3"])
Our file includes no aggregated results such as growth rates computed by the Synergy software. If it did, they would be in the dictionary my_plate.results, each value of which is a SynergyResult, which can be indexed like my_plate.
If we want to access the raw data for all wells, we can do this with the values method, which mirrors the dictionary method of the same name. keys analogously gives us all well–channel combinations. In the following example, we take the first value of the time series of each well and compute the 10th percentile of these values as a baseline for our measurements:
import numpy as np
baseline = np.percentile( [ts[0] for ts in my_plate.values()], 10 )
The attributes rows and cols facilitate easy iterations over all rows or columns, respectively. We here compute the element-wise median of all time series in the first row (“A”), to use as a reference for comparisons:
reference = np.median( [my_plate["A",col] for col in my_plate.cols], axis=0 )
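To see what “element-wise” means here, the following sketch performs the same kind of computation with plain lists instead of a real plate. The numbers are made up and row_a is only a hypothetical stand-in for the time series in row A; np.median with axis=0 takes the median per time point in the same way.

```python
from statistics import median

# Made-up time series for three wells of row "A" (a hypothetical stand-in
# for my_plate["A",col] with col in my_plate.cols).
row_a = {
    1: [0.10, 0.20, 0.40],
    2: [0.12, 0.18, 0.50],
    3: [0.11, 0.25, 0.45],
}

# zip(*...) groups the values by time point, so the median is taken
# across wells, separately for each time point.
reference = [median(values) for values in zip(*row_a.values())]
print(reference)
```

The result has one entry per time point, not one entry per well, which is what makes it usable as a reference time series.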
Finally, we plot our data. We here specify that the baseline we computed before shall be subtracted from all time series. Moreover, our reference should appear in each plot:
fig,axess = my_plate.plot(
    colours = ["red"],
    ylim = (1e-3,2), xlim = (0,30000),
    baseline = baseline,
    plot_args = {"linewidth":3},
    reference = reference,
    reference_plot_args = { "color":"black", "linewidth":1, "label":"no drug" },
)
The remaining arguments are hopefully self-explanatory and also explained in the documentation of plot. fig is a Matplotlib figure object with all the functionality that comes with that. For example, we can use savefig to export our plot to a file:
fig.savefig( "example.pdf", bbox_inches="tight" )
And this is what our plot looks like:
Preparing Files¶
The Synergy software allows you to export files in various different layouts and control all sorts of details as to what information should be included in the file. As long as the included information is not ambiguous, this module should be able to read it – if not, tell me.
If you already exported your files and did not choose a minimal format, just try and see whether they can be loaded. If you want to make sure that your file is readable and contains all information, do the following:
Choose Automatic content.
Whenever you have the option to Include something, do so.
Use tables and not matrices for everything. (The main problem of the latter is that temperature information is lost.)
Use Tab as a separator. (The other separators will probably work fine, but I am not sure that they do not lead to ambiguities.)
Sample IDs¶
Sample IDs are supported for reading files and indexing. However, the module discards some information automatically computed from blank data by the Synergy software. This is because this information is nasty to parse and integrate into the data structure, and I expect it to see little use: If you are using this module, I expect that you will want to calculate information like this yourself.
Under the Hood¶
If anybody cares, this module uses trial-and-error parsing. For every type of data block, there is a parser, which throws an error if fed with data that does not match the format of the type of data block. To parse a file, all of these parsers are applied one by one until one doesn’t throw an error, which is then taken to reflect the true structure of the data. This process is then repeated with the remainder of the file until there are no more lines left.
This makes the parser rather easy to extend, but also makes it impossible to pinpoint where exactly the parsing fails. A file simply becomes unparsable once it contains a data block for which all of the implemented parsers fail.
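The following is a minimal sketch of this trial-and-error scheme, not the module’s actual code. The two block parsers, their names, and the sample lines are hypothetical and far simpler than real Synergy data blocks; the only point is the control flow: try every parser, keep the first one that does not raise, and continue with the remaining lines.

```python
def parse_key_value(lines):
    """Hypothetical parser for a block made of one 'key: value' line."""
    key, sep, value = lines[0].partition(":")
    if not sep or not key.strip():
        raise ValueError("not a key-value line")
    return ("key_value", key.strip(), value.strip()), lines[1:]

def parse_number_row(lines):
    """Hypothetical parser for a block made of one tab-separated row of numbers."""
    # float() raises ValueError for any field that is not a number.
    numbers = [float(field) for field in lines[0].split("\t")]
    return ("numbers", numbers), lines[1:]

PARSERS = [parse_number_row, parse_key_value]

def parse_all(lines):
    """Apply the parsers one by one until no lines are left."""
    blocks = []
    while lines:
        for parser in PARSERS:
            try:
                block, lines = parser(lines)
            except ValueError:
                continue  # wrong block type, try the next parser
            blocks.append(block)
            break
        else:
            # All parsers failed. We know where the unparsable block
            # starts, but not which parser should have matched or why.
            raise ValueError(f"cannot parse block starting with {lines[0]!r}")
    return blocks

# Made-up input: one metadata-like line and one row of numbers.
blocks = parse_all(["Software Version: 3.02", "0.1\t0.2\t0.3"])
print(blocks)
```

Adding support for a new kind of block only requires appending another parser to the list, which is the flexibility mentioned above. The price is visible in the else branch: every parser has rejected the block, so there is no single error message that explains what is wrong with it.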
Data Structure¶
SynergyFile is a collection of plates, each of which is a SynergyPlate. SynergyPlate inherits from SynergyResult, which is used for the raw data. The results of a SynergyPlate are also of the type SynergyResult.
Special values are parsed as follows:
????? and empty fields are parsed as nan (not a number).
OVRFLW is parsed as inf (+∞).
< followed by a number is parsed as zero.
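As an illustration of these rules, here is a small sketch of a conversion function. The name parse_field and its exact behaviour are assumptions made for this example; the module’s own implementation may differ in detail.

```python
import math

def parse_field(field):
    """Convert one table field to a float following the rules above (illustrative only)."""
    field = field.strip()
    if field in ("", "?????"):
        return math.nan
    if field == "OVRFLW":
        return math.inf
    if field.startswith("<"):
        float(field[1:])  # raises ValueError unless a number follows the "<"
        return 0.0
    return float(field)

# Made-up fields covering every special case.
print([parse_field(f) for f in ["0.123", "?????", "", "OVRFLW", "<0.001"]])
```

Keep in mind that nan and inf propagate through most NumPy computations, so you may want functions such as np.nanmedian or an explicit mask when aggregating data that contains these values.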
Command Reference¶
- class SynergyResult(sample_ids=None)¶
A single well- and channel-wise result.
You can index this in different ways, using the well F7 for the channel "OD" as an example:
Row letter, column number, and channel separately: result["F",7,"OD"]
Well identifier in one string: result["F7","OD"]
If there is only one channel, you do not need to specify it: result["F7"]
If your plate contains information on sample IDs, you can also use those for indexing: result["SPL55","OD"]. If the sample ID is not unique, a list of the respective content of all matching wells will be returned.
It comes with the following attributes and methods:
rows and cols are lists of the row letters and column numbers, respectively.
channels is a list of all channels for which recordings exist.
keys and values are methods similar to those for dictionaries, returning iterables of all keys and values, respectively.
If your plate contains information on sample IDs, sample_ids contains the mapping of sample IDs to wells.
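To make the equivalence of the first three indexing forms concrete, the following sketch reduces each of them to one canonical (row, column, channel) triple. The function normalise_key is hypothetical and not part of the module; it also ignores sample IDs entirely, so it only illustrates the idea and not how SynergyResult resolves keys internally.

```python
import re

def normalise_key(key, channels):
    """Turn result["F",7,"OD"], result["F7","OD"], or result["F7"] style keys
    into a (row, column, channel) tuple (illustrative only, no sample IDs)."""
    parts = list(key) if isinstance(key, tuple) else [key]
    # Split a well identifier such as "F7" into row letter and column number.
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", str(parts[0]))
    if match:
        parts[0:1] = [match.group(1), int(match.group(2))]
    if len(parts) == 2:
        # Channel omitted: this is only unambiguous with exactly one channel.
        if len(channels) != 1:
            raise KeyError("channel must be specified")
        parts.append(channels[0])
    if len(parts) != 3 or not isinstance(parts[1], int):
        raise KeyError(key)
    return tuple(parts)

print(normalise_key(("F", 7, "OD"), ["OD"]))
print(normalise_key(("F7", "OD"), ["OD"]))
print(normalise_key("F7", ["OD"]))
```

All three calls yield the same triple. With more than one channel, the last form raises a KeyError in this sketch, reflecting that the channel can only be omitted when there is exactly one.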
- class SynergyPlate¶
Data for a single plate.
Raw data can be accessed by indexing this object directly like a SynergyResult.
This usually comes with the following methods and attributes:
rows and cols are lists of the row letters and column numbers, respectively.
channels is a list of all channels for which recordings exist.
keys and values are methods similar to those for dictionaries, returning iterables of all keys and values, respectively.
times is a dictionary specifying the times of measurements (in seconds) for each channel.
temperature_range contains the minimal and maximal temperature specified in the file. This almost always contains some meaningful information.
temperatures is a dictionary specifying the temperatures at the times of measurements for each channel. It is empty unless the file specifies this information.
metadata is a dictionary of metadata like the time of the measurement, procedure details, file paths, and information about the device.
results is a dictionary of SynergyResult objects. These are usually aggregated estimates by the plate-reader software, such as the growth rate, lag time, etc.
gains is a dictionary containing the automatic gains determined for each channel, if such exist. Otherwise it’s empty.
plot allows you to quickly plot the data and has extensive documentation.
- plot(*, channels=None, colours=None, xlim=None, ylim=None, baseline=0, log_y=True, plot_args={}, timescale=None, label_pad=20, label_size='xx-large', reference=None, reference_plot_args={})¶
Plots time series of the raw data using Matplotlib.
This is a service function to quickly obtain a decent plot. It does not cover all potential use cases and is not suited as a basis for extensive customisations. If you want a starting point for the latter, have a look at this example.
- Parameters:
- channels: iterable of keys or None
The channels to be plotted. If None, all channels will be plotted.
- colours: iterable of colour names or None
The colours to use for the respective channels. If None, Matplotlib’s default colour cycle will be used.
- xlim, ylim: pair of numbers or None
The ranges of the plot. If None, this will be chosen automatically by Matplotlib.
- baseline: number or array
This will be subtracted from the values of all time series before plotting (including the reference).
- log_y: boolean
Whether the ordinate should be logarithmic.
- plot_args: dictionary
Further keyword arguments to be passed to every plot (except the reference).
- timescale: one of “seconds”, “minutes”, “hours”, “days”, or None
Which timescale to use for labelling. If None, this will be guessed.
- label_pad: number
Padding of row and column labels.
- label_size: Matplotlib font size
Size of row and column labels.
- reference: array or None
A reference time series to be plotted below all time series. The times used will be the ones for the first channel.
- reference_plot_args: dictionary
Keyword arguments to be passed to the plot command of the reference time series. This includes color and label.
- Returns:
- figure: Matplotlib figure object
- axess: array of Matplotlib axes objects
- class SynergyFile(filename, separator='\t', encoding='iso-8859-1', verbose=False)¶
Represents the contents of a Synergy file. For most practical purposes, you can treat this like a list of SynergyPlate objects. Often this list contains only one plate.
- Parameters:
- filename: string
The location of the file from which to read the data.
- separator: string
The separator character used in the file.
- encoding: string specifying a supported encoding
The encoding of the file. This cannot be automatically detected.
- verbose: boolean
Whether to detail information about the parsing process. Mostly useful for debugging.
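To illustrate what the separator and encoding parameters refer to, the following sketch writes a tiny made-up file in ISO 8859-1 and reads it back with Python’s csv module. It only demonstrates the two concepts with the default values from the signature above; the file content is invented, and this is not how SynergyFile reads files internally.

```python
import csv
import os
import tempfile

# Made-up content with a non-ASCII character (the degree sign) and tabs.
content = "Well\tTemperature (°C)\nA1\t37.0\n"

# Write the file in ISO 8859-1; newline="" keeps the line endings as written.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", encoding="iso-8859-1", newline="", delete=False
) as f:
    f.write(content)
    path = f.name

# Read it back with the matching encoding and Tab as the separator.
with open(path, encoding="iso-8859-1", newline="") as f:
    rows = list(csv.reader(f, delimiter="\t"))
os.remove(path)

print(rows)
```

If your file was exported with a different separator or encoding, pass the respective values to SynergyFile; a mismatched encoding typically shows up as garbled special characters such as the degree sign.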