Discovering scientific data with Python
During my studies, biochemistry students always used Excel or OriginPro to analyze experimental data. Starting my thesis in a theoretical laboratory finally introduced me to the everyday use of Python.
Working with a bunch of data files can be quite tedious, especially when one has to keep track of their paths in order to analyze their contents. Fortunately, there is a Python package that tries to solve this problem: datreant.
Note: This post is not meant to be a full review of the package, only a quick way to get started.
Let’s cut to the chase: grabbing paths to theoretical or experimental data files is never a fun task to solve again and again. Using Python’s os module, one can walk through a directory tree and collect all file names. But why bother writing your own solution when someone has already done so?
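For comparison, here is roughly what such a hand-rolled solution could look like. This is only a minimal sketch using the standard library; the function name find_files and its suffix parameter are made up for illustration and are not part of datreant.

```python
import os


def find_files(root, suffix=".txt"):
    """Return sorted absolute paths of all files below `root` ending in `suffix`."""
    matches = []
    # os.walk visits `root` and every subdirectory below it
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(matches)
```

Calling find_files on the folder with your data returns a plain list of path strings. It works, but every new filtering need means more code of this kind to write and maintain.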
Let us have a look at a simple datreant workflow:
So far we have not done anything special. We imported a package and set the path to the folder containing some experimental data.
We have now created a new Treant object inside the given path. Now what?
datreant actually comes with a handy way of visualizing everything inside a given Treant using the Treant.draw() method.
Note: Treant.94847b1d-7ef8-490a-a381-c509fd0b1ac0.json is a state file created by datreant.core that stores information about the current Treant object. More information here.
This is neat. We immediately get a list of all files inside this path, but we are only interested in the .txt files inside this folder.
To filter out everything else, we just glob for files ending with .txt. Afterwards we use the abspaths attribute to retrieve the absolute file paths instead of the Leaf objects that we would get otherwise.
Easy, right? There is much more to datreant, but we will come back to that at a later point.