Hi Robert, Carlo, Sal, Pierre,
I think it would be good to follow up on the software project. Pierre has made some progress here, and it would be good to try to define tasks a little more clearly so we can keep making progress…
There is always the potential issue of having “too many cooks in the kitchen” (with different recipes for the same dish) getting in the way of moving forward efficiently, something I have noticed can get quite confusing/frustrating when writing software together with people. So it would be good to clearly assign tasks. I talked to Pierre today and he would be happy to integrate things into the framework we have, to try to tie things together. What is foremost needed are ways of treating data, meaning code that takes a raw spectral image and meta-data and converts it into a “standard” format (spectral representation) that can then be fitted. Then also “plugins” that serve a specific purpose in the analysis/rendering and that can be included in the framework.
The way I see it (and please comment if you see differently), there are ~4 steps here:
1. Take raw data (in .tif, .dat, .txt, etc. format) and meta-data (in .csv, .xlsx, .dat, .txt, etc.) and render a standard spectral representation. Also take the provided instrument response in one of these formats and extract key parameters from it.
2. Fit the data with a drop-down menu list of functions, which will include different functional dependences and functions corrected for instrument response.
3. Generate/display a visual representation of the results (frequency shift(s) and linewidth(s)) that is ideally interactive to some extent (and maybe has some funky features like looking at spectra at different points). These can be spatial maps and/or evolution with some other parameter (time, temperature, angle, etc.). Also be able to display maps of relative peak intensities in the case of multiple peak fits, and whatever else useful you can think of.
4. Extract “mechanical” parameters given assigned refractive indices and densities.
I think the idea of fitting modified functions (e.g. corrected based on instrument response) vs. deconvolving spectra makes more sense (as it can account for more complex corrections due to non-optical anomalies in the future – ultimately even functional variations in the vicinity of e.g. phase transitions). It is also less error prone, as systematically doing deconvolution with non-ideal registration data can really throw you off the cliff, so to speak.
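Purely as an illustration of this approach (not part of any agreed pipeline), here is a minimal sketch of fitting a peak with a lineshape that is already convolved with the instrument response, rather than deconvolving the spectrum. It assumes, for illustration only, a Gaussian instrument response and a Lorentzian Brillouin peak; all names and numbers are placeholders.

```python
# Minimal sketch: fit a Brillouin peak with a lineshape already convolved with the
# instrument response (Gaussian IRF assumed), instead of deconvolving the spectrum.
# Peak shape (Lorentzian) and all parameter values are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import voigt_profile  # Lorentzian convolved with Gaussian = Voigt

SIGMA_IRF_GHZ = 0.10  # Gaussian width of the instrument response, extracted beforehand


def corrected_peak(f, area, shift, linewidth, offset):
    """Lorentzian (HWHM = linewidth/2) convolved with the Gaussian IRF, plus offset."""
    return area * voigt_profile(f - shift, SIGMA_IRF_GHZ, linewidth / 2) + offset


# Synthetic example spectrum (frequency axis in GHz)
f = np.linspace(3.0, 8.0, 400)
rng = np.random.default_rng(0)
data = corrected_peak(f, 1.0, 5.1, 0.6, 0.05) + 0.01 * rng.normal(size=f.size)

popt, pcov = curve_fit(corrected_peak, f, data, p0=[1.0, 5.0, 0.5, 0.0])
print(f"shift = {popt[1]:.3f} GHz, linewidth = {popt[2]:.3f} GHz")
```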
My understanding is that we kind of agreed on an initial meta-data reporting format. Getting from 1 to 2 will no doubt be the most challenging step, as it is very instrument-specific. So instructions will need to be written for the different BLS implementations.
This is a huge project if we want it to be all-inclusive… so I would suggest focusing on making it work for just a couple of modalities first (e.g. crossed VIPA, time-resolved, anisotropy, and maybe some time or temperature course of one of these). Extensions should then be easier to navigate. At some point I think it would also be good to bring in SBS-specific considerations.
I think it would be good to discuss over email for a while to gather thoughts and opinions (and already start to share code), and then plan a meeting at the beginning of March -- how does the first week of March look for everyone?
I created this mailing list (software(a)biobrillouin.org) that we can use for discussion. You should all be able to post to it (and it makes things easier if we bring anyone else in along the way).
At the moment this mailing list includes Robert, Carlo, Sal, Pierre, and myself. Let me know if I should add anyone.
All the best,
Kareem
Yes, and do please keep discussing on this mailing list (or privately) until we next all discuss on Zoom. It would be really great to have a roughly agreed content/structure for the h5 file that everyone can work with before we next talk.
From: Kareem Elsayad <kareem.elsayad(a)meduniwien.ac.at>
Date: Thursday, 20. February 2025 at 02:49
To: <software(a)biobrillouin.org>
Subject: Re: [Software] Re: [EXTERN] Re: Software manuscript / BLS microscopy
Dear All, (and I guess especially Carlo & Pierre 😊)
I understand both your (main) points concerning what this all should be, and I think this is actually perfect for dividing tasks. The format of the h5 (or bh5) file is where things meet and what needs to be agreed on.
The thing with raw data is of course that it is variable between instruments and labs, and the conversion to “standard spectra” that can then be fitted etc. is going to be unique (maybe even between experiments in the same project). That said, asking people to create some complex file from their data that works with a developed software is also unlikely to get a following.
So (the way I see it) there are basically two parts. Getting from raw data to h5 (or bh5) format which contains the spectra in standard format. And then the whole analysis part and visualization (GUI, pretty features, etc.).
While the latter may get the spotlight, it obviously relies heavily on the former being done right. Given that there are numerous “standard” BLS setup implementations, I think developing software for getting from raw data to h5 makes sense, since the h5 will no doubt be cryptic and creating a working h5 is not something everyone will want to program by themselves (it is not an insignificant step, given we are also trying to ultimately cater to biologists). As such, software that generates the h5 files for different setups, with drag-and-drop features and entry of system parameters, makes sense and will save many labs a headache if they don’t have a good programmer on hand.
So I would suggest that Pierre & Sal lead the work on developing this, while Carlo & Sebastian lead the work on developing the analysis/interface-presentation part. This way things are nicely divided and we just need to agree on the h5 file that is transferred between the two. From the side of Pierre & Sal there could maybe also be a second h5 generated that contains all the raw data and details on how it was converted to the transferred h5 file (for complete transparency this could then also be reported in papers if people wish). These could exist as two separate programs with respective GUIs, but could eventually also be combined into a single one.
How does this sound to everyone?
To clear up details and try to assign tasks going forward, how about a Zoom in the first week of March (I would be free Monday the 3rd and Tuesday the 4th after 1 pm)?
All the best,
Kareem
From: Carlo Bevilacqua via Software <software(a)biobrillouin.org>
Reply to: Carlo Bevilacqua <carlo.bevilacqua(a)embl.de>
Date: Wednesday, 19. February 2025 at 20:43
To: Pierre Bouvet <pierre.bouvet(a)meduniwien.ac.at>
Cc: <software(a)biobrillouin.org>
Subject: [Software] Re: [EXTERN] Re: Software manuscript / BLS microscopy
Hi Pierre,
I realized that we might have slightly different aims.
For me the most important part of this project is to have a unified file format that can be read and visualized by standard software. The file format you are proposing is not sufficient for that, in my opinion. For example, one of my main motivations was to have an interface where the user can see the reconstructed image, click on a pixel, see the corresponding spectrum to check its quality, and maybe try different fitting functions; that would already not be possible with your structure, because the software would have no notion of where the spectral data is stored or how to associate it with a specific pixel.
I don't see the structure of the file being too complex as an issue, as long as it is functional: we can always provide an API and/or a GUI that allows people to take their spectra and save them in whatever format we decide without bothering about understanding the actual structure, the same way you can work with an HDF5 file without any understanding of how the data is actually stored on disk.
My idea of making a webapp as a GUI follows directly from there. Typical case: I send some data to a collaborator and they can just run the app in their browser and check whether the outliers they see in the image are real or just bad spectra. Similarly, if we make a common database of Brillouin spectra, people can explore it easily using the webapp.
From what I understood, you are instead more interested in going from raw data to an actual spectrum, which I don't see so much as a priority, because each lab has their own code already and the actual procedure would be different for each lab (apart from standard instruments like the FP). I am not saying this should not be part of the software, but not as a priority, and rather as a layer where people can easily implement their own code (of course we could already implement code for standard instruments).
I think it would be good to clearly define what our common aim is now, so we are sure we are working in the same direction.
Let me know if you agree or if I have misunderstood your idea.
Best,
Carlo
On Wed, Feb 19, 2025 at 16:11, Pierre Bouvet <pierre.bouvet(a)meduniwien.ac.at> wrote:
Hi,
I think you're trying to go too far too fast. The approach I present here is intentionally simple, so that people can very quickly get to storing their data in an HDF5 file together with the parameters they used to acquire them, with minimal effort. The closest mapping to what you did is therefore as follows:
- data are datasets with no other restriction and they are stored in groups
- each group can have a set of attributes specific to the data it is storing
- attributes have a nomenclature imposed by a spreadsheet and are in the form of text
- the default dataset name for raw data is “Raw_data”; other arrays can have whatever name they want (Shift_5GHz, Frequency, ...)
- arrays and attributes are hierarchical, so they apply to their group and to all groups under it.
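To make the description above concrete, here is a minimal sketch of what such a file could look like when written with h5py. The specific attribute keys, array shapes, and values below are placeholders for illustration, not the agreed spreadsheet nomenclature.

```python
# Minimal sketch of the proposed layout: datasets live in groups, raw data is named
# "Raw_data", attributes are stored as text and apply to everything below their group.
# Attribute keys and shapes are placeholders, not the agreed nomenclature.
import h5py
import numpy as np

with h5py.File("measurement.h5", "w") as f:
    grp = f.create_group("Data_0")
    grp.attrs["Name"] = "sample A, map 1"        # human-readable name, not an identifier
    grp.attrs["Wavelength_nm"] = "660"           # attributes kept as text
    grp.create_dataset("Raw_data", data=np.zeros((10, 10, 512)))      # raw spectral image
    grp.create_dataset("Frequency", data=np.linspace(-15, 15, 512))   # an extra abscissa array

    sub = grp.create_group("Data_0")             # treated data stored in a sub-group
    sub.create_dataset("Shift_5GHz", data=np.zeros((10, 10)))
    # No "Wavelength_nm" here: the parent's attribute applies to this sub-group as well.
```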
This is the bare minimum to meet our needs, so we need to stop here in the definition of the format: it is already enough to have the GUI working correctly, and therefore to have the first version of the unified software we advertise. Of course we will have to refine things later on, but we don’t want to scare people off by presenting them with a file description that, for one, might not match their measurements and that is extremely hard to understand conceptually. To make my point clearer, take your definition of the format as an example: there are 3 different amplitude arrays in your description, two different shift arrays, and width arrays of different dimensions; then we have a calibration group on one side but a spectrometer characterization array in another group called “experiment_info”. That’s just too complicated to use correctly. On the other hand, placing your raw data “as is” in a group dedicated to it is conceptually easy and straightforward.
Where you are right is that, should we have hundreds of people using it, we might in the near future want to store abscissas, impulse responses, … in a standardized manner. In that case, the question comes down to either adding sub-groups or storing the data together with the measurement. Both are fine and don’t really pose a problem for now, essentially because we are not trying to visualize the results but just trying to unify the way to get them.
Now regarding Dash, if I understand correctly, it’s just a way to change the frontend you propose? If so, why not, but then I don’t really see the benefit: if you really want to use Dash, you can always embed it inside a Qt app. Also, Dash being browser-based, you will only have one window for the whole interface, which will be a pain to deal with if you want the interface to do everything the GUI is expected to do (inspect an H5 file, convert PSDs, treat data, edit parameters, inspect a failed treatment, simulate results using setups with slight changes of parameters…).
We agree that at some point, when people receive an h5 file, it will be useful to have a web interface that can do what yours or Sal’s GUI does (seeing the mapping results together with the spectrum, eventually the time-domain signal), but here again it’s going too far too soon: let’s first have a simple and reliable GUI that can convert data from any spectrometer to PSDs and treat them with a unified code. This is the real challenge; visualization and nomenclature of results are both important but secondary. Also keep in mind that the frontend can easily be generated by anyone, really, with Copilot or ChatGPT or whatever AI, but you can’t ask them to develop the function that will correctly read your data and convert it to a PSD, nor treat the PSD. This is done on the backend and is the priority, since the unification essentially happens there.
Best,
Pierre
Pierre Bouvet, PhD
Post-doctoral Fellow
Medical University Vienna
Department of Anatomy and Cell Biology
Währinger Straße 13, 1090 Wien, Austria
On 19/2/25, at 13:21, Carlo Bevilacqua <carlo.bevilacqua(a)embl.de> wrote:
Hi Pierre,
thanks for your reply.
Regarding the file structure, what I am still not very clear about is what you define as Raw_data and data, how the actual spectra are stored, and how the association to spatial coordinates and parameters is made. That's why I would really appreciate it if you could make a document similar to what I did, where it is clear what each group should (or can) contain, what the shape of each dataset is, which attributes they have, etc...
I am not saying that this should be the final structure of the file, but I strongly believe that having it written in a structured way helps define it and reveal potential issues.
Regarding the webapp, it works by separating the frontend, which runs in the browser and is responsible for the GUI, from the backend, which does the actual computation and which you can structure as you like, with multithreading or any optimization you wish. The communication between frontend and backend is handled transparently by the framework, so you don't have to take care of it.
When you run Dash locally, a local server is created, so the data stays on your computer and there is no issue with data transfer/privacy.
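As a rough illustration of that split (not an agreed design; the component IDs and dummy data are placeholders), a bare-bones Dash app could look like this: the layout is what runs in the browser, the callback is the backend computation, and app.run starts the purely local server.

```python
# Bare-bones Dash sketch: the layout is rendered in the browser (frontend), the
# callback runs in the local Python process (backend), and app.run() starts a local
# server so the data never leaves the machine. IDs and data are placeholders.
import numpy as np
from dash import Dash, dcc, html, Input, Output
import plotly.graph_objects as go

app = Dash(__name__)
app.layout = html.Div([
    dcc.Slider(id="pixel-index", min=0, max=99, step=1, value=0),
    dcc.Graph(id="spectrum-plot"),
])


@app.callback(Output("spectrum-plot", "figure"), Input("pixel-index", "value"))
def show_spectrum(index):
    # Backend side: here one would read the spectrum for this pixel from the h5 file.
    freq = np.linspace(-15, 15, 512)
    spectrum = np.exp(-((freq - 5) ** 2))  # dummy data for illustration
    return go.Figure(go.Scatter(x=freq, y=spectrum, name=f"pixel {index}"))


if __name__ == "__main__":
    app.run(debug=True)  # local server at http://127.0.0.1:8050 (app.run_server on older Dash)
```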
If we want to move it to a server at a later stage, then you are right about the fact that data needs to be transferred to a server (although there might be solutions that allow you to still run the computation in the local browser, like WebAssembly, but I think it is too complex to look into this at this stage). Regarding safety and privacy, the data will be stored on the server only for the time that the user is using the app and then deleted (of course we will need to make a disclaimer on the website about this). Regarding space, I don't think people will at the beginning load their 1 TB dataset onto the webapp, but rather look at some small dataset of a few tens of GB; in that case 1 TB of server space is more than enough (remember that the file will be stored only for the time the user is using the app). If people really start to use the webapp and space becomes a limitation, I would be super happy to look into WebAssembly so that the data can be handled locally in the browser without transferring it to the server.
To summarize, I would agree that, at this stage, there might be no advantage of a webapp over a local GUI (and it might actually be a bit more work to develop). But, if this project really flies, the potential of a webapp is huge, especially in the context of the BioBrillouin society, where we want to have a database of spectral data and the webapp could be used to explore it without downloading anything to your computer. My main point is that moving to a webapp now would not be too much work, but if we want to do it at a later stage it will basically entail re-writing everything from scratch.
Let me know what your thoughts are about it.
Best,
Carlo
On Wed, Feb 19, 2025 at 08:51, Pierre Bouvet <pierre.bouvet(a)meduniwien.ac.at> wrote:
Hi Carlo,
You’re right, the format is still a little fuzzy. The main idea is to have a data-based hierarchy for storage: each dataset is stored in its own group. From there, abscissas associated with the data are stored in the same group, and treated data are stored in sub-groups. The names of the groups and sub-groups are fixed (Data_i), as is the name of the raw data (Raw_data) and of the abscissas (Abscissa_i). Also, the measurement- and spectrometer-dependent parameters have a defined nomenclature. To differentiate groups, we preferably give them a name stored as an attribute, which is not used as an identifier; the identifier is either the path to the data/group from the file root (Data_0/Data_42/Data_2/Raw_data) or potentially a hash value (not implemented). This I think allows all possible configurations, and using a hierarchical approach we can also place common attributes and arrays (spectrometer parameters or abscissa arrays, for example) on the parent to reduce memory complexity.
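One way to picture that inheritance rule (an attribute set on a parent applies to everything below it unless redefined closer to the data) is an attribute lookup that walks from the dataset up to the file root. A purely hypothetical helper, for illustration only:

```python
# Hypothetical helper illustrating hierarchical attributes: an attribute set on a
# parent group applies to everything below it unless redefined closer to the data.
import h5py


def resolve_attr(h5file, path, key):
    """Return the value of `key` for the object at `path`, searching parent groups upward."""
    parts = path.strip("/").split("/")
    # Check the object itself first, then each ancestor group up to the file root.
    for depth in range(len(parts), -1, -1):
        node = h5file["/".join(parts[:depth])] if depth else h5file
        if key in node.attrs:
            return node.attrs[key]
    raise KeyError(f"{key!r} not found on {path!r} or any parent group")


# Usage (placeholder names): resolve_attr(f, "Data_0/Data_42/Data_2/Raw_data", "Wavelength_nm")
```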
Now regarding the use of a server-based GUI: first off, I’ve never used one, so I’m just making suppositions here, but if the platform is able to treat data online, it will have to store the spectra somewhere in memory, and that will not be local. My primary concern with this approach is safety, particularly in terms of intellectual property protection, ethical considerations, and potential security vulnerabilities. Additionally, managing storage space will be a headache.
Now, this doesn’t mean that we can’t have a server-based data plotter, or a server-based interface that does treatments locally, but I don’t really see the benefit of this over local software that can have multiple windows, which could at some point be multithreaded, and which could wrap C code to speed up regressions, for example (some of this might apply to Dash; I’m not super familiar with it). Now regarding memory complexity, having all the data we treat go to a server is a bad idea, as it will raise the question of cost, which will very rapidly become a real problem. Just an example: for a 100x100 map I need 10 GB of storage (1 MB/point) with my setup (1 TB of archive storage is approximately 1 euro/month), so this would get out of hand super fast assuming people use it.
Now maybe there are solutions I don’t see for these problems and someone can take care of them, but for me it’s just not worth the effort when a local GUI solves all of these issues at once. But if you find a solution, then I’ll be happy to migrate to Dash; it won’t be quick to translate every feature, but I can totally join you on the platform.
Best,
Pierre
Pierre Bouvet, PhD
Post-doctoral Fellow
Medical University Vienna
Department of Anatomy and Cell Biology
Währinger Straße 13, 1090 Wien, Austria
On 18/2/25, at 20:26, Carlo Bevilacqua <carlo.bevilacqua(a)embl.de> wrote:
Hi Pierre,
regarding the structure of the file, I agree that we should keep it simple. I am not suggesting to make it more complex, rather to have a document where we define it in a clear and structured way rather than as a general description, to make sure that we are all on the same page.
I am saying this because, to be honest, I am not sure I fully understand how you are structuring the HDF5 and I think it is important to agree on this now, rather than finding at a later stage that we had different ideas.
Ideally, if the GUI is to be part of a single application, we should write it using a unified framework.
The reasons why, after considering it for a bit, I lean towards Dash rather than Qt are:
- it can run in a web browser, so it will be easy to eventually move it to a website, and people can then use it without installing it (which will hopefully help in promoting its use)
- it is based on Plotly, a graphical library with very good plotting capabilities that is highly customizable, which would make the data visualization easier/more appealing
Let me know what you think about it and whether you see any advantage of Qt over a web app that I am not considering. Also, if we agree that Dash might be a better option, would you consider migrating to Dash? It might not be too much work if most of the code you wrote is for data processing rather than for the GUI itself, and I could help you with that. Alternatively, one workaround is to have a QtWebEngine in your Qt app and have the Dash app run inside that, but that would make only the Dash part portable to a server later on.
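A rough sketch of that workaround (an assumption of how it could be wired up, not a tested design): serve a trivial Dash app from a background thread and point a QWebEngineView at it.

```python
# Sketch: run a (trivial) Dash app on a daemon thread and embed it in a Qt window
# via QWebEngineView. Port and layout are placeholders; in practice the Dash part
# would be the actual analysis GUI.
import sys
import threading

from dash import Dash, html
from PySide6.QtCore import QUrl
from PySide6.QtWidgets import QApplication, QMainWindow
from PySide6.QtWebEngineWidgets import QWebEngineView

dash_app = Dash(__name__)
dash_app.layout = html.Div("Hello from Dash inside Qt")

# Serve Dash locally; a daemon thread stops together with the Qt application.
threading.Thread(
    target=lambda: dash_app.run(port=8050, debug=False, use_reloader=False),
    daemon=True,
).start()

qt_app = QApplication(sys.argv)
window = QMainWindow()
view = QWebEngineView()
view.load(QUrl("http://127.0.0.1:8050"))  # point the embedded browser at the local Dash server
window.setCentralWidget(view)
window.show()
sys.exit(qt_app.exec())
```

As noted above, only the Dash part of such an app would later be portable to a server.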
Best,
Carlo
On Tue, Feb 18, 2025 at 10:24, Pierre Bouvet <pierre.bouvet(a)meduniwien.ac.at> wrote:
Hi,
Thanks,
More than merging them later, just keep them separate in the process: rather than trying to build "one code to do it all", build one library and GUI that encapsulate the codes to do everything.
Having a structure is a must, but it needs to be kept as simple as possible, or else we’ll rapidly find situations where we need to change the structure. The middle ground, I think, is to force the use of hierarchies and impose nomenclatures where problems are expected to appear: each dataset has its own group, each group can encapsulate other groups, each parameter of a group applies to all its sub-groups if the sub-group does not change its value, each array of a group applies to all of its sub-groups if the sub-group does not redefine an array with the same name, the names of the groups are held as parameters, and their IDs are managed by the software.
For the Plotly interface, I don’t know how to integrate it to Qt but if you find a way to do it, that’s perfect with me :)
My vision of the GUI is something that unifies the format and can then execute scripts on the data stored in this format. I have pushed the application of the GUI to my data up to the point where I can do the whole information extraction process from it (add data, convert to PSD, fit curves). I think this could be super interesting if we implement it for all techniques.
I think we could formalize the project in milestones, in particular define the minimal common denominator that will allow us to have the paper. The milestone I reached for my spectrometer encompasses the following points:
- Being able to add data to the file easily (by dragging & dropping to the GUI)
- Being able to assign properties to these data easily (again by dragging & dropping)
- Being able to structure the added data in groups/folders/containers/however we want to call it
- Making it easy for new data types to be loaded
- Allowing data from same type but different structure to be added (e.g. .dat files)
- Execute scripts on the data easily and allowing parameters of these scripts to be defined from the GUI (e.g. select a peak on a curve to fit this peak)
- Make it easy to add scripts for treating or extracting PSD from raw data.
- Allow the export of Python code to access the data from the file (we can see these as “break points” in the treatment pipeline)
- Editing of properties inside the GUI
In any case I think we could build a spec sheet for the project with what we want to have in it based on what we want to advertise in the paper. We can always add things later on but if we agree on a strict minimum to have the project advertised, then that will set its first milestone on which we’ll be able to build later on.
Best,
Pierre
Pierre Bouvet, PhD
Post-doctoral Fellow
Medical University Vienna
Department of Anatomy and Cell Biology
Währinger Straße 13, 1090 Wien, Austria
On 17/2/25, at 14:53, Carlo Bevilacqua via Software <software(a)biobrillouin.org> wrote:
Hi Pierre, hi Sal,
thanks for sharing your thoughts about it.
@Pierre I am very sorry that Ren passed away :(
As far as I understood you are suggesting to work on the individual aspects separately and then merge them together at a later stage? I am fine with that but I still think it is very important to now agree on what should be in the file format we are defining, because that is kind of the core of the project and will affect a lot of the design choices.
I am happy to start working on the GUI for data visualization. In my idea, it will be something similar to this but written in Dash, so it is a web app that for now can run locally (so no transfer of data to an external server is required) but in the future can easily be uploaded to a website that people can just use without installing anything on their computer.
The question is how much of the data processing should be possible to do (or trigger) from the same GUI. I think at least a drop-down menu with different fitting functions should be there, so the user can quickly check the results on a spectrum from a specific point in the image.
@Pierre how are you envisioning the GUI you are working on? As far as I understood it is mainly to get the raw data and save it to our HDF5 format with some treatment on it.
One idea could be to have a shared list of features that we are implementing in the GUI as we work on it, to avoid having the same functionality duplicated (or worse, inconsistent) between the GUI for generating our file format and the GUI for visualization.
Let me know what you think about it.
Best,
Carlo
On Mon, Feb 17, 2025 at 11:04, Pierre Bouvet via Software <software(a)biobrillouin.org> wrote:
Hi everyone,
Sorry for the delayed response, I just went through the loss of Ren (my dog) and wasn’t at all able to answer.
First off, Sal, I am making progress and I should soon have everything you have made on your branch integrated into the GUI branch, at which point I will push everything to main.
Regarding the project, I think I align with Sal: keep things as simple as possible. My approach was to divide everything into 3 mostly independent layers (sketched as function stubs after the list):
- Store data in an organized file that anyone can use, and make it easy to do
- Convert these data into something that has physical significance: a Power Spectral Density (PSD)
- Extract information from this Power Spectral Density
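The function stubs below sketch these three layers as independent steps; the names, signatures, and the use of h5py here are illustrative assumptions only, not the agreed API.

```python
# Function stubs sketching the three layers as independent steps. Names, signatures
# and the group/dataset paths are illustrative assumptions, not the agreed API.
import h5py
import numpy as np


def store_raw_data(h5_path: str, raw: np.ndarray, attrs: dict) -> None:
    """Layer 1: put the raw data and its text attributes into the organized HDF5 file."""
    with h5py.File(h5_path, "a") as f:
        grp = f.require_group("Data_0")
        grp.create_dataset("Raw_data", data=raw)
        for key, value in attrs.items():
            grp.attrs[key] = str(value)


def to_psd(raw: np.ndarray) -> np.ndarray:
    """Layer 2: convert raw detector data to a Power Spectral Density (setup-specific)."""
    raise NotImplementedError("each spectrometer provides its own conversion")


def extract_parameters(psd: np.ndarray, frequency: np.ndarray) -> dict:
    """Layer 3: fit the PSD and return shift/linewidth (fitting model to be agreed on)."""
    raise NotImplementedError
```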
Each layer has its own challenge, but they are independent of the challenge of getting people to use it: personally, if someone came to me with a new piece of software to play with my measurements, I would most likely only look at it for 1 minute (or less, depending on who made it) with a lot of apprehension and then, based on how simple it looks and how much I understood of it, use it or - what is most likely - discard it. This is why Project.pdf is essentially meant to say: we put data arrays in groups and give them attributes written in text, but you can also create groups within groups to organize your data!
I believe most of the people in the BioBrillouin society will have the same approach and before having something complex that can do a lot, I think it’s best to have something simple that can just unify the format (which in itself is a lot) and that people can use right away with a minimal impact on their own individual pipelines for data processing.
To be honest, I don’t think people will blindly trust our software at first to treat their data; they will most likely use it at first to organize their raw spectra, and maybe add their own treated data if they are feeling particularly adventurous. But that is not a problem as long as we start a movement. From there, yes, we can add complexity and create custom file architectures, but I think that is already too much for this first project.
If we can all just import our data into the wrapper and specify their attributes, this will already be a success. Then, for the paper, if we can all add custom code to treat these data to obtain a PSD, I think this will be enough. As I said, the current API already allows you to add your data in a variety of formats, so it’s just a question of developing the PSD conversion before having something we can publish (and then tell people to use).
A few extra points:
- Working with my setup, I realized that if the user could choose between different algorithms to treat their own data, it would make the overall project much more usable (if an algorithm fails for whatever reason, you can use another one), so I created two bottlenecks in the form of functions (one for PSD conversion and one for treatment) that inspect modules dedicated to either PSD conversion or treatment and list the existing functions (a rough sketch of this discovery step follows below). It’s in between classical and modular programming, but it makes the development of new PSD-conversion and treatment code way, way easier.
- The GUI is developed using object-oriented programming. Therefore I have already made some low-level choices that affect the whole GUI. I’m not saying they are the best choices, I’m just saying that they work, so if you want to work on the GUI, I would recommend either getting familiar with these choices, or making sure that all the functionalities are preserved, particularly the ones that are invisible (logging of treatment steps, treatment errors…).
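As an illustration of that discovery step (the module and function names here are hypothetical), the bottleneck function could simply inspect a dedicated module and return whatever conversion functions it finds:

```python
# Sketch of the "bottleneck" idea: discover whatever conversion functions exist in a
# dedicated module and let the GUI list them. Module/function names are hypothetical.
import importlib
import inspect


def list_psd_converters(module_name: str = "psd_conversion"):
    """Return {name: callable} for every public function found in the given module."""
    module = importlib.import_module(module_name)
    return {
        name: func
        for name, func in inspect.getmembers(module, inspect.isfunction)
        if not name.startswith("_")
    }


# The GUI can then populate a drop-down from the keys and call the selected function:
# converters = list_psd_converters()
# psd = converters["vipa_to_psd"](raw_data)   # "vipa_to_psd" is a hypothetical entry
```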
I’ll try merging the branches on Git asap and will definitely send you all an email when it’s done :)
Best,
Pierre
Pierre Bouvet, PhD
Post-doctoral Fellow
Medical University Vienna
Department of Anatomy and Cell Biology
Währinger Straße 13, 1090 Wien, Austria
On 14/2/25, at 19:36, Sal La Cavera Iii via Software <software(a)biobrillouin.org> wrote:
Hi all,
I agree with the things enumerated and points made by Kareem/Carlo! I guess I would advocate for starting from a position of simplicity. We probably don't want to get weighed down by trying to include a bunch of bells and whistles from the start as these can be easily added once there is a minimally viable stable product.
As long as data can be loaded into local memory and then treated in various (initially simple) ways through the GUI (and maybe we maintain non-GUI compatibility for command-line warriors like myself), then the bh5 formatting stuff can be sorted on the back end? I definitely agree with Carlo's structure set out in the file format document; but all of that data will be floating around in memory in some shape or form and just needs to be wrapped up and packaged (according to Carlo's structure, e.g.) when the user is happy and presses "generate h5 filestore", etc. (?)
Definitely agree with the recommendation to create the alpha using mainly requirements that our 3 labs would find useful (import filetypes, treatments, etc), and then we can add on more universal functionality second / get some beta testers in from other labs etc.
I'm able to support on whatever jobs need doing and am free to meet in the beginning of March like you mentioned Kareem.
Hope you guys have a nice weekend,
Cheers,
Sal
---------------------------------------------------------------
Salvatore La Cavera III
Royal Academy of Engineering Research Fellow
Nottingham Research Fellow
Optics and Photonics Group
University of Nottingham
Email: salvatore.lacaveraiii(a)nottingham.ac.uk
ORCID iD: 0000-0003-0210-3102
Book a Coffee and Research chat with me!
From: Carlo Bevilacqua via Software <software(a)biobrillouin.org>
Sent: 12 February 2025 13:31
To: Kareem Elsayad <kareem.elsayad(a)meduniwien.ac.at>
Cc: sebastian.hambura(a)embl.de <sebastian.hambura(a)embl.de>; software(a)biobrillouin.org <software(a)biobrillouin.org>
Subject: [Software] Re: Software manuscript / BLS microscopy
Hi Kareem,
thanks for restarting this and sorry for my silence, I just came back from US and was planning to start working on this again.
Could you also add Sebastian (in CC) to the mailing list?
As you outlined, I would split the project into two parts: 1) getting from the raw data to some standard "processed" spectra and 2) doing data analysis/visualization on that. For the second part, the way I envision it is:
- the most updated definition of the file format from Pierre is this one, correct? In addition to this document, I think it would be good to have a more structured description of the file (like this), where each field is clearly defined in terms of data type and dimensions. Sebastian was also suggesting that the document should contain the reasoning behind each specific choice in the specs, and also things that we considered but decided had some issues (so in the future we can look back at it). I still believe that the file format should contain the "processed" spectra (i.e. after FT and baseline subtraction for impulsive, or after taking a line profile and possibly linearization with VIPA, ...) so we can apply standard data processing or visualization that is independent of the actual underlying technique (e.g. VIPA, FP, stimulated, time domain, ...)
- agree on an API to read the data from our file format (most likely a Python class; a rough sketch is given after this list). For that we should: 1) decide which information is important to extract (spectral information, spatial coordinates, hyper-parameters, metadata, ...) and 2) implement an interface to read the data (e.g. readSpectrumAtIndex(...), readImage(...), ...)
- build a GUI that uses the previously defined API to show and process the data.
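For illustration only, such a reader class might look like the sketch below; the method names follow the examples above, but the internal h5 paths are placeholder assumptions about the eventual layout.

```python
# Rough sketch of a reader API as a Python class. Method names follow the examples
# above (readSpectrumAtIndex, readImage); the internal paths ("Data_0/Raw_data",
# "Data_0/Shift") are placeholder assumptions about the eventual h5 layout.
import h5py
import numpy as np


class BrillouinFileReader:
    def __init__(self, path: str):
        self._file = h5py.File(path, "r")

    def readSpectrumAtIndex(self, y: int, x: int) -> np.ndarray:
        """Return the processed spectrum stored for one pixel of the map."""
        return self._file["Data_0/Raw_data"][y, x, :]

    def readImage(self, quantity: str = "Shift") -> np.ndarray:
        """Return a 2D map of a fitted quantity (e.g. shift or linewidth)."""
        return self._file[f"Data_0/{quantity}"][()]

    def close(self) -> None:
        self._file.close()
```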
I would say that, once we have a very solid description of the file format as in step 1, step 2 will come naturally and we can divide the actual implementation between us. Step 3 can also easily be implemented and divided between us once we have an API (and I am happy to work on the GUI myself).
The first half of the project is the trickiest, and I know Pierre has already done a lot of work in that direction. We should definitely agree on to what extent we can define a standard for storing the raw data, given the variability between labs (and probably we should do it for common techniques like FP or VIPA), and on how to implement the treatments, leaving the possibility to load custom code into the pipeline to do the conversion.
Let me know what you all think about this.
If you agree, I would start by making a document which clearly defines the file format in a structured way (as in my step 1 above). @Pierre could you write a new document or modify the document I originally made to reflect the latest structure you implemented for the file format? The metadata can still be defined in a separate Excel sheet, as long as the data types and format are well defined there.
Best regards,
Carlo