Hi Sal,

thanks a lot for writing the script. Actually the specs changed a bit, and here (https://github.com/prevedel-lab/Brillouin-standard-file/blob/main/docs/BLS_f...) is the most up-to-date version (nothing major, just some fields defined better). The main difference, though, is that we are now using Zarr files (as we discussed in our last meeting). In the meantime I wrote a library which exposes an intuitive interface to read from and write to our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's HDF5 file to ours. Also, Sebastian has progressed quite a bit on the GUI side. Shall we have a meeting to update each other on our status? Would Monday 26th in the afternoon (e.g. 3pm) work for you?

Best,
Carlo

On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii wrote:

Hi guys,

Hope everyone's doing well! Found some time to get this hierarchical-to-flat conversion script off the ground. In the WeTransfer link below (3 days until it expires, so let me know if you need me to send it again) there's a Python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document (https://github.com/bio-brillouin/HDF5_BLS/blob/main/Bh5_file_spec.md), but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. Additional datasets/attributes should be fairly easy to add, but this will probably require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc.). It will probably require some rigidity on behalf of the end user, e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f'; it will have to be 'Frequency'.
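The naming rigidity mentioned above could also be handled on the conversion side. The following is only a minimal sketch under stated assumptions: the canonical names (Raw_data, Frequency, Shift, Linewidth, *_std) come from the thread, while the alias table and the helper names `canonical_name` and `normalise_names` are hypothetical and not part of Sal's script or of any file spec.

```python
# Canonical dataset names are taken from the thread; the alias sets are
# illustrative assumptions, not part of any file specification.
ALIASES = {
    "Raw_data": {"raw_data"},
    "Frequency": {"frequency", "freq", "f"},
    "Shift": {"shift"},
    "Linewidth": {"linewidth"},
}
STD_SUFFIX = "_std"


def canonical_name(name):
    """Return the canonical spelling of a dataset name, or raise KeyError."""
    key = name.strip().lower()
    suffix = ""
    # Resolve the *_std variants through their base name.
    if key.endswith(STD_SUFFIX):
        key, suffix = key[: -len(STD_SUFFIX)], STD_SUFFIX
    for canonical, aliases in ALIASES.items():
        if key in aliases:
            return canonical + suffix
    raise KeyError(f"unrecognised dataset name: {name!r}")


def normalise_names(datasets):
    """Return a copy of `datasets` keyed by canonical names.

    Raises ValueError if two input names resolve to the same canonical name.
    """
    result = {}
    for name, value in datasets.items():
        target = canonical_name(name)
        if target in result:
            raise ValueError(f"{name!r} collides with another {target!r} entry")
        result[target] = value
    return result


converted = normalise_names({"f": [5.1, 5.2], "shift_std": [0.01, 0.02]})
print(sorted(converted))
```

Rejecting unknown names outright, rather than guessing, keeps the converter in line with the idea that only one spelling should ever reach the output file.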
Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straightforward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in.

Carlo/Sebastian, does this file format trend towards what you're looking for when handing over Pierre's h5 file to your plotting side of things?

https://we.tl/t-TszCgBzwfE

Any feedback etc., just let me know. Happy to chat further.

Cheers,
Sal
---------------------------------------------------------------
Salvatore La Cavera III
Royal Academy of Engineering Research Fellow
Nottingham Research Fellow
Optics and Photonics Group
University of Nottingham
Email: salvatore.lacaveraiii@nottingham.ac.uk
ORCID iD: 0000-0003-0210-3102 (https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...)
Book a Coffee and Research chat with me!
------------------------------------
From: Pierre Bouvet
Sent: 17 April 2025 12:35
To: Sebastian Hambura
Cc: Sal La Cavera iii (staff); Carlo Bevilacqua
Subject: Re: [EXTERN] Practice dataset

Hi Sebastian,

Was just writing to you to say that, in my hurry, I didn’t add the treated data for cell.h5. I’ll do that whenever I get a bit of time or when the GUI is able to do it (most likely option B, unless you want it fast, in which case I can spend a few minutes doing it by hand, you tell me :) ).

Your question is good. The way you’d do it for these data is to recover the index on the azimuthal and polar calibration datasets. It’s indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis with time-domain datasets). For now, however, I’d say that it’s sufficient since, as far as I know, I have the only spectrometer able to do that, and there are technique-specific ways to access this information.
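A minimal sketch of the index recovery described above, assuming the calibration group provides one azimuthal and one polar angle per detector pixel as 2D arrays, and that each treated point stores its own pair of angles. The function name `nearest_pixel` and the nested-list layout are hypothetical; the actual layout of angle_resolved_dataset.h5 may differ.

```python
import math


def nearest_pixel(azimuth_map, polar_map, azimuth, polar):
    """Return the (row, column) of the detector pixel whose calibrated
    angles are closest to the given (azimuth, polar) pair.

    `azimuth_map` and `polar_map` are 2D sequences of equal shape holding
    the calibrated angle of every pixel (assumed layout).
    """
    best_index = None
    best_distance = math.inf
    for i, (az_row, pol_row) in enumerate(zip(azimuth_map, polar_map)):
        for j, (az, pol) in enumerate(zip(az_row, pol_row)):
            # Squared Euclidean distance in angle space.
            distance = (az - azimuth) ** 2 + (pol - polar) ** 2
            if distance < best_distance:
                best_index, best_distance = (i, j), distance
    return best_index


# Toy 3x3 calibration: azimuth varies along columns, polar along rows.
azimuth_map = [[10.0 * j for j in range(3)] for _ in range(3)]
polar_map = [[5.0 * i for _ in range(3)] for i in range(3)]

index = nearest_pixel(azimuth_map, polar_map, azimuth=19.0, polar=4.0)
print(index)
```

The brute-force scan is quadratic in the detector size, which is acceptable for a one-off lookup on a 512x512 detector but would need a precomputed lookup structure if run for every treated point.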
Since I guess you also want to be able to convert this information into your data format and not have to create technique-specific access functions, one way to avoid this issue altogether could be to define all the treated datasets as 2D arrays of size (512x512) in this example. Then you would have shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I’m open to suggestions :)

Best,
Pierre

Pierre Bouvet, PhD
Post-doctoral Fellow
Medical University Vienna
Department of Anatomy and Cell Biology
Wahringer Straße 13, 1090 Wien, Austria

On 17/4/25, at 13:19, Sebastian Hambura wrote:

Hi Pierre,

OK, this clarifies the whole thing a bit. But then concretely, what is the spectrum (as intensity = f(frequency)) used for a given treated point? For example, if I wanted to see the underlying spectrum (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display?

Best,
Sebastian

Sebastian Hambura
Software Engineer
Robert Prevedel Group
Cell Biology and Biophysics
EMBL Heidelberg
Meyerhofstraße 1
69117 Heidelberg, Germany

On 17.04.2025 at 11:50, Pierre Bouvet wrote:

Hi Sebastian,

Sorry for the lack of details, I’ll try to explain it as best I can.

The Frequency and Intensity datasets are the size of my detector (for now 512x512, but hopefully soon 2048x2048 if we can get the money ^^). By nature, I am entangling 3 hyperparameters in my measurements: shift frequency, azimuthal angle of observation, and polar angle of observation (with errors on each of them that I would add later on, but let’s say my device is perfect for now). Now, the VIPA spectrometer creates constructive interference for specific frequencies, so after treatment I will only get treated values for the points where constructive interference happened.
For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when trying to represent things, use a color mesh with the azimuthal and polar abscissas as the x and y positions of the shift and linewidth.

For the calibration, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a dependence of sin(theta/2) on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for azimuthal and polar angles, as you can see in the calibration group).

That’s weird for cells.h5: H5Viewer told me nothing on VS. On the other hand, I might have my answer for why the file was super heavy; I just need to figure out how to delete groups and datasets without truncating the file. Here is the same file but made without deleting files from the HDF5 file (I tried it on myHDF5 and it seems to work): https://we.tl/t-WlmabmsXHo

Best,
Pierre

On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de> wrote:

Hi Pierre,

I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup, but I'm a bit confused:
- Frequency and Intensity are 512*512 for all 3 experiments/datagroups
- Treated for 'Mock anisotropic sample' is 3584(*1): are you somehow using the same row of your PSD for multiple treated datapoints? Or are you doing some 'subpixel' rows?
- Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data?

For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, and the tool refused to open it and reported this error:

(...) unable to read superblock
    major: File accessibility
    minor: Read failed
#006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152
    major: File accessibility
    minor: File has been truncated

I'll see if other tools might still be able to read it.

Cheers,
Sebastian

On 14.04.2025 at 17:56, Pierre Bouvet wrote:

Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn

For a more classic file, here are some measurements of cells (I only kept 2 cell types, 2 days of measurement and 2 samples per day of measurement, for a total of 8 spectra + their frequency/treated arrays). I treated them using custom code, so I didn’t export the treatment steps in the JSON file I mentioned @ last meeting. Some of the measurements are rubbish but I guess you don’t care about that ^^

One problem I observed, though, is that I created this file by deleting elements from a larger file, but it seems the space is still allocated to the HDF5 file. So we get a 500 MB file when a classic size for 8 spectra et al. is usually under 100 MB. I’ll dig into that when I get time.
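Deleting an object from an HDF5 file only unlinks it; the file itself does not shrink. The usual remedies are the `h5repack` command-line tool or copying the remaining objects into a fresh file. The following is a minimal sketch of the second approach with h5py, run on a throwaway file; the helper name `repack` and the group and dataset names are hypothetical, and this is not necessarily how cells.h5 was or should be regenerated.

```python
import os
import tempfile

import h5py


def repack(src_path, dst_path):
    """Copy root attributes and every top-level object into a new file."""
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        for key, value in src.attrs.items():
            dst.attrs[key] = value
        for name in src:
            # Group.copy recurses into sub-groups and keeps their attributes.
            src.copy(name, dst, name=name)


# Demo on temporary files (all names here are illustrative).
workdir = tempfile.mkdtemp()
bloated = os.path.join(workdir, "bloated.h5")
compact = os.path.join(workdir, "compact.h5")

with h5py.File(bloated, "w") as f:
    f.attrs["note"] = "demo"
    f.create_dataset("to_delete", data=[0.0] * 200_000)
    f.create_group("keep").create_dataset("Shift", data=[1.0, 2.0, 3.0])

with h5py.File(bloated, "a") as f:
    del f["to_delete"]  # unlinks the dataset; the space is not returned

repack(bloated, compact)
print(os.path.getsize(bloated), os.path.getsize(compact))
```

Copying into a new file leaves the original untouched, so the two can be compared before the larger one is discarded.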
Here is the link: https://we.tl/t-HUAf2lR9AF

Best,
Pierre

On 14/4/25, at 17:04, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote:

Hi both,

Pierre, I think that WeTransfer link expired! Any chance you can regenerate it?

Potentially the simulated angle-resolved dataset will be a good late-stage dev case to test the software's generalisability? In the first instance, should we just use a basic 2D scan of a cell or zebrafish, C. elegans, etc., e.g. from a 2-stage VIPA? Something we can treat like the MNIST or ImageNet dataset in the deep learning community.

Carlo, what about something from your 2019 zebrafish paper? (Unless you guys have something else you've both already been using for dev purposes.)

Safe travels guys! Enjoy that Colomba pasquale, Carlo! My favourite are the pistachio ones :)

Cheers,
Sal

From: Pierre Bouvet
Sent: 14 April 2025 10:04
To: Carlo Bevilacqua
Cc: Sal La Cavera iii (staff); Sebastian Hambura
Subject: Re: [EXTERN] Practice dataset

Hi,

That sounds awesome, wish I could come with you to Italy, must be wonderful at this time of year!!

Perfect for the legal department @ EMBL. To be clear, I think your local approach is robust by design, but I’m always scared of having things online (that’s why I never used GitHub before this project ^^).
In particular here, my concern is mainly about people somehow being able to access the data that will be opened or treated (I’m still traumatized by how stupidly complex it was, legally speaking, to work with human samples in France) and people being able to modify the code so it downloads whatever. I’ve looked a bit this weekend, and people @ HDF have done things with safety in mind, so there are normally no SQL-like attacks possible and no possibility of saving compiled programs. So I guess just adding a layer between your code and the user to filter all non-HDF5 files coming in or out, and adding hashing to the original data to confirm that no corruption occurs, is enough, but I’m no specialist in cybersecurity.

For third-party attacks it’s trickier. I looked into the H5Web GitHub page to try and understand how they did it, but I’m not super familiar with TypeScript, so aside from the HTTPS protocol, I don’t really know how they do it. I might be over-worrying here (kind of my specialty), but I think if this project is successful and BLS develops enough, it will become a real problem, so we might as well do it now.

Have a nice time in Italy!

Best,
Pierre

On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de> wrote:

Hi guys,

thanks for getting the process of generating a standard dataset started. Sorry for the late reply, but in the last few days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday. I will work from home when in Italy and I will try to generate a file with some experimental data.
@Pierre regarding your concerns about GDPR, we could for sure double-check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transferred to any external server; it is just being read by the browser (basically the browser is acting like your Qt GUI, and in principle it could run completely offline once the website is loaded). I think we should put a disclaimer on the website about this, and maybe the legal department could help with the exact phrasing.

Best,
Carlo

On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> wrote:

Hi everyone,

I’ve created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It’s a little heavy so I prefer to send it using WeTransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu

I’m sending you this one first because it is the most exotic BLS technique in terms of data storage needs (I think). It required me to revise the format a lot when I started working on it. The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collection, the polar angles of collection, and the positions on the detector, leading to the need for multiple abscissas defined on the same axis (I promise I'll try explaining it better with images and all when I start writing the article). This technique will at one point (when we get the green light to buy a new detector) be used for imaging, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I’m still brainstorming a lot about the ways to use this method effectively in biological samples.
I have to underline that with the few devices I’m working with here, I find that in general each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis; in TFP you never have raw data; in 1-VIPA you usually bin the signal on the detector; in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…), so I’ll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility. I’m currently doing two long acquisitions for two groups we collaborate with, so I don’t know how long it’ll take, but I am confident all other techniques will be much easier to implement in Carlo’s structure.

Another detail: I found an unexpected limitation in the HDF5_BLS library that doesn’t allow me to add attributes directly to the top group, so for now each group has its own set of attributes. I’ve added that to the list of things to do ^^

For the time-domain compatibility, I forgot my parents were coming to Vienna this weekend so I unfortunately couldn’t do it, but it’s close to the top of the list now so it’ll be done soon :)

Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you’re not only proposing to read the file but also to write to it using a web-based application, I’m scared we will at one point have to face the infamous GDPR, get a green light from an ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people who could either advise us on these safety points or, better, oversee them at EMBL?
This is one of the few computer things I’ve never really studied because I always had the magic “just don’t put it online” joker (and also because I kind of flee legislation stuff whenever I can), but I guess this is not an option anymore so we need to do it properly (plus if we don’t, from what I understand we can get into huge trouble). Tell me what you think :)

Best,
Pierre

On 7/4/25, at 10:39, Sal La Cavera Iii wrote:

Good morning guys,

Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably, Pierre, an .h5 with your preferred structure etc., even if it's a work in progress / likely to be modified a little bit in the future. If not, I can whip something up or repurpose something, but I would prefer to work with a dataset that is most representative and most useful for you guys!

Cheers,
Sal

This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham.
Email communications with the University of Nottingham may be monitored where permitted by law.
Thanks all 😊 Yes, indeed a quick update on how everything has evolved on different fronts might be nice. I could join quickly on Mon 26th @3pm if that works for everyone. Here’s Zoom link we can use… https://us02web.zoom.us/j/5191046969?pwd=alpzaldoZEd3N2ZEQ0hYZU1RR1dOdz09 Meeting ID: 519 104 6969 Passcode: jY3zH8 All the best, Kareem From: Carlo Bevilacqua via Software <software@biobrillouin.org> Reply to: Carlo Bevilacqua <carlo.bevilacqua@embl.de> Date: Thursday, 22. May 2025 at 13:08 To: <software@biobrillouin.org> Subject: [Software] Re: [EXTERN] Practice dataset Hi Sal, thanks a lot for writing the script. Actually the specs changed a bit and here is the most updated version (nothing major, just better defining some fields). The main difference though is that we are now using Zarr file (as we discussed in our last meeting). In the meanwhile I wrote a library which exposes an intuitive interface to read/write to our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's hdf5 to ours. Also Sebastian progressed quite a bit on the GUI side. Shall we have a meeting to update on each other's status? Would Monday 26th in the afternoon (e.g. 3pm) work for you? Best, Carlo On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii <salvatore.lacaveraiii@nottingham.ac.uk> wrote: Hi guys, Hope everyone's doing well! Found some time to get this hierarchical-to-flat conversion script off the ground. In the below wetransfer link (3 days until it expires, so let me know if you need me to send it again) there's a python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. 
Additional datasets/attributes should be fairly easy to add, but probably this will require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc). It will probably require some rigidity on behalf of the end-user, e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f' it'll have to be 'Frequency.' Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straight forward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in. Carlo/Sebastian does this file format trend towards what you're looking for for handing over Pierre's h5 file to your plotting side of things? https://we.tl/t-TszCgBzwfE Any feedback etc just let me know, happy to chat further, Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me! From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> Sent: 17 April 2025 12:35 To: Sebastian Hambura <sebastian.hambura@embl.de> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Carlo Bevilacqua <carlo.bevilacqua@embl.de> Subject: Re: [EXTERN] Practice dataset Hi Sebastian, Was just writing to you to say that in my hurry, I didn’t added the treated data for cell.h5. I’ll do that whenever I get a bit of time or when the GUI will be able to do it (most likely option B except if you want it fast then I can spend a few minutes doing that by hand, you tell me :) ). 
Your question is good, the way you’d do it for these data is to recover the index on the azimuthal and polar calibration datasets. It’s indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis with time-domain datasets). For now however, I’d say that it’s sufficient since - that I know of - I have the only spectrometer able to do that, and there are technique-specific ways to access this information. Since I guess you also want to be able to convert these information into your data format and not have to create technique-specific access function, one way to avoid altogether this issue could be to define all the treated datasets as 2D arrays with size - (512x512) in this example. Then you would have the shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I’m open to suggestions :) Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 17/4/25, at 13:19, Sebastian Hambura <sebastian.hambura@embl.de> wrote: Hi Pierre, Ok, this clarifies a bit the whole thing. But then concretly, what is the spectra (as intensity = f(frequency) ) used for a given treated point ? For example, if I wanted to see the underlaying spectra (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display ? Best, Sebastian Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany Am 17.04.2025 um 11:50 schrieb Pierre Bouvet: Hi Sebastian, Sorry for the lack of details, I’ll try to explain it as best I can For the Frequency and Intensity datasets, there are the size of my detector (for now a 512x512 but hopefully soon a 2048x2048 if we can get the money ^^). 
By nature, I am entangling 3 hyper parameters in my measures: shift frequency, azimuthal angle of observation, polar angle of observation (with errors on each of them that I would add later on but let’s say my device is perfect for now). Now the VIPA spectrometer creates constructive interferences for specific frequencies so after treatment, I will only get treated values for the points where constructive interference happened. For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when trying to represent things, use a color mesh with the azimuthal and polar abscissa as the x and y positions of the shift and linewidth. For the calibration now, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a dependence of sin(theta/2) on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for azimuthal and polar angles, as you can see in the calibration group). That’s weird for cells.h5, H5Viewer told me nothing on VS, but on the other hand I might have my answer for why the file was super heavy, just need to figure out how to delete groups and datasets without truncating the file. 
Here is the same file but made without deleting files from the HDF5 file (I tried it on myHDF5 it seems to work): https://we.tl/t-WlmabmsXHo Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de> wrote: Hi Pierre, I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup but I'm a bit confused: - Frequency and Intensity are 512*512 for all 3 experiments/datagroups - Treated for 'Mock anisotropic sample' is 3584(*1) : are you somehow using the same row of your PSD for multiple treated datapoints ? Or are you doing some 'subpixel' rows ? - Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data ? For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, and the tool refusing to open it and report this error: (...) unable to read superblock major: File accessibility minor: Read failed #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152 major: File accessibility minor: File has been truncated I'll see if other tools might still be able to read it Cheers, Sebastian Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany Am 14.04.2025 um 17:56 schrieb Pierre Bouvet: Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn For a more classic file, here are some measures of cells (I only kept 2 cell types, 2 days of measure and 2 samples per day of measure for a total of 8 spectra + their frequency/treated arrays). 
I treated them using a custom code so I didn’t export the treatment steps in the JSON file I mentioned @ last meeting. Some of the measures are rubish but I guess you don’t care about it ^^ One problem I observed though is that I created this file by deleting elements from a larger file but it seems the memory is still allocated to the HDF5 file. So we get a 500MB file when a classic size for 8 spectra et al is usually under 100Mb. I’ll dig into that when I get time. Here is the link: https://we.tl/t-HUAf2lR9AF Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 14/4/25, at 17:04, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote: Hi both, Pierre I think that WeTransfer link expired! Any chance you can regenerate it? Potentially the simulated angle-resolved dataset will be a good late-stage-dev to test the software's generalisability? In the first instance should we just use a basic 2D scan of a cell or zebrafish, c. elegans, etc? From a 2-stage VIPA e.g. Something we can treat as the MNIST or ImageNet dataset in the Deep Learning community. Carlo, what about something from your 2019 zebra fish paper? (unless you guys have something else you've already both been using for dev purposes). Safe travels guys! Enjoy that Colomba pasquale Carlo! My favourite are the pistachio ones :) Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 <Outlook-xarjscql.png>Book a Coffee and Research chat with me! 
From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> Sent: 14 April 2025 10:04 To: Carlo Bevilacqua <carlo.bevilacqua@embl.de> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Sebastian Hambura <sebastian.hambura@embl.de> Subject: Re: [EXTERN] Practice dataset Hi, That sounds awesome, wish I could come with you to Italy, must be wonderful at this time of year!! Perfect for legal department @ EMBL. To be clear, I think your local approach is robust by design but I’m always scared of having things online (that’s why I never used GitHub before this project ^^). In particular here, my concern is mainly on people being able to have somehow access to the data that will be opened or treated (I’m still traumatized by how stupidly complex it was to work with human samples in France legally speaking) and people being able to modify the code so it downloads whatever. I’ve looked a bit this weekend and people @ HDF have done things with safety in mind so there are normally no SQL-like attacks possible and no possibility of saving compiled programs so I guess just adding a layer between your code and the user to filter all non-HDF5 files coming in or out, and adding hashing to the original data to confirm that no corruption will occur is enough but I’m no specialist in cybersecurity. For third-party attacks it’s trickier. I looked into the H5Web GitHub page to try and understand how they did it, but I’m not super familiar with TypeScript so aside the HTTPS protocol, I don’t really know how they do it. I might be over-worrying here (kind of my specialty) but I think if this project is successful and BLS develops enough, it will become a real problem so might as well do it now. Have a nice time in Italy! 
Best, Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de> wrote:
Hi guys, thanks for getting the process of generating a standard dataset started. Sorry for the late reply, but in the last few days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday. I will do home office when in Italy and I will try to generate a file with some experimental data. @Pierre regarding your concerns about GDPR, we could for sure double check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transferred to any external server, it is just being read by the browser (basically the browser is acting like your Qt GUI and in principle it could run completely offline once the website is loaded). I think we should put a disclaimer on the website about this and maybe the legal department could help on the exact phrasing.
Best, Carlo
On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> wrote:
Hi everyone, I’ve created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It’s a little heavy so I prefer to send it using WeTransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu I’m sending you this one first because it is the most exotic BLS technique in terms of the needs for data storage (I think). It required me to revise the format a lot when I started working on it.
The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collections, the polar angle of collections, and the positions on the detector, leading to the need of multiple abscissa defined on the same axis (I promise I'll try explaining it better with images and all when I’ll start writing the article). This technique will at one point (when we get the green light to buy a new detector) be used for imagery, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I’m still brainstorming a lot the ways to use this method effectively in biological samples. I have to underline that with the few devices I’m working with here, I find that in general, each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis, in TFP you never have raw data, in 1-VIPA you usually bin the signal on the detector, in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…) so I’ll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility. I’m currently doing two long acquisitions for two groups we collaborate with so I don’t know how long it’ll take, but I am confident all other techniques will be much easier to implement in Carlo’s structure. Another detail : I found an unexpected limitation in the HDF5_BLS library that doesn’t allow me to add attributes directly to the top group so for now each group has its own set of attributes. 
I’ve added that to the list of things to do ^^ For the time-domain compatibility, I forgot my parents were coming to Vienna this weekend so I unfortunately couldn’t do it, but it’s close to the top of the list now so it’ll be done soon :) Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you’re not only proposing to read the file but also to write to it using a web-based application, I’m scared we will at some point have to face the infamous GDPR, get a green light from an Ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people that could either advise us on these safety points or, better, oversee them at EMBL? This is one of the few computer things I’ve never really studied because I always had the magic “just don’t put it online” joker - and also because I kind of flee legislation stuff whenever I can - but I guess this is not an option anymore so we need to do it properly (plus if we don’t, from what I understand we can get into big trouble). Tell me what you think :)
Best, Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 7/4/25, at 10:39, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote:
Good morning guys, Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably, Pierre, an .h5 with your preferred structure etc, even if it's a work in progress / likely to be modified a little bit in the future. If not I can whip something up, or repurpose something, but would prefer to work with a dataset that is most-representative and most useful for you guys!
Cheers, Sal
--------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me!
This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law.
_______________________________________________ Software mailing list -- software@biobrillouin.org To unsubscribe send an email to software-leave@biobrillouin.org
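Pierre's 14 April message in the thread above suggests two safeguards: a layer that filters out non-HDF5 files, and hashing of the original data to confirm nothing was corrupted. The following is a minimal sketch of both ideas using only the Python standard library. The function names `looks_like_hdf5` and `sha256_of` are hypothetical and are not part of HDF5_BLS or of the web application. The check relies on the 8-byte signature that opens every HDF5 superblock, which the HDF5 format places at byte offset 0, 512, 1024, 2048, and so on. It only screens file types; it is not a security review.

```python
import hashlib
from pathlib import Path

# 8-byte signature that opens every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path, max_offset=1 << 20):
    """Return True if the HDF5 signature sits at offset 0, 512, 1024, 2048, ...

    max_offset is an arbitrary cut-off chosen for this sketch.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset < size and offset <= max_offset:
            fh.seek(offset)
            if fh.read(8) == HDF5_SIGNATURE:
                return True
            # Candidate offsets are 0, then 512 and successive doublings.
            offset = 512 if offset == 0 else offset * 2
    return False


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A caller could record `sha256_of(path)` before opening a file and compare it afterwards; an unchanged digest indicates that the bytes on disk were not modified in between.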
Thanks for updates Carlo! If possible are people available at the same time on Tuesday or Wednesday the 27th/28th? 26th is a holiday in the UK so I'll be out of town but will be back in the office the next day.
--------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>
________________________________
From: Kareem Elsayad via Software <software@biobrillouin.org> Sent: 22 May 2025 13:52 To: software@biobrillouin.org Subject: [Software] Re: [EXTERN] Practice dataset
Thanks all 😊 Yes, indeed a quick update on how everything has evolved on different fronts might be nice. I could join quickly on Mon 26th @3pm if that works for everyone. Here’s Zoom link we can use… https://us02web.zoom.us/j/5191046969?pwd=alpzaldoZEd3N2ZEQ0hYZU1RR1dOdz09 Meeting ID: 519 104 6969 Passcode: jY3zH8
All the best, Kareem
From: Carlo Bevilacqua via Software <software@biobrillouin.org> Reply to: Carlo Bevilacqua <carlo.bevilacqua@embl.de> Date: Thursday, 22. May 2025 at 13:08 To: <software@biobrillouin.org> Subject: [Software] Re: [EXTERN] Practice dataset
Hi Sal, thanks a lot for writing the script. Actually the specs changed a bit and here <https://github.com/prevedel-lab/Brillouin-standard-file/blob/main/docs/BLS_f...> is the most updated version (nothing major, just better defining some fields). The main difference though is that we are now using Zarr file (as we discussed in our last meeting).
In the meanwhile I wrote a library which exposes an intuitive interface to read/write to our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's hdf5 to ours. Also Sebastian progressed quite a bit on the GUI side. Shall we have a meeting to update on each other's status? Would Monday 26th in the afternoon (e.g. 3pm) work for you? Best, Carlo On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii <salvatore.lacaveraiii@nottingham.ac.uk> wrote: Hi guys, Hope everyone's doing well! Found some time to get this hierarchical-to-flat conversion script off the ground. In the below wetransfer link (3 days until it expires, so let me know if you need me to send it again) there's a python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document<https://github.com/bio-brillouin/HDF5_BLS/blob/main/Bh5_file_spec.md> but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. Additional datasets/attributes should be fairly easy to add, but probably this will require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc). It will probably require some rigidity on behalf of the end-user, e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f' it'll have to be 'Frequency.' Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straight forward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in. Carlo/Sebastian does this file format trend towards what you're looking for for handing over Pierre's h5 file to your plotting side of things? 
https://we.tl/t-TszCgBzwfE Any feedback etc just let me know, happy to chat further,
Cheers, Sal
--------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>
________________________________
From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> Sent: 17 April 2025 12:35 To: Sebastian Hambura <sebastian.hambura@embl.de> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Carlo Bevilacqua <carlo.bevilacqua@embl.de> Subject: Re: [EXTERN] Practice dataset
Hi Sebastian, Was just writing to you to say that in my hurry, I didn’t add the treated data for cell.h5. I’ll do that whenever I get a bit of time or when the GUI is able to do it (most likely option B, unless you want it fast, in which case I can spend a few minutes doing that by hand, you tell me :) ). Your question is good, the way you’d do it for these data is to recover the index on the azimuthal and polar calibration datasets. It’s indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis with time-domain datasets). For now however, I’d say that it’s sufficient since - as far as I know - I have the only spectrometer able to do that, and there are technique-specific ways to access this information.
Since I guess you also want to be able to convert this information into your data format without having to create technique-specific access functions, one way to avoid this issue altogether could be to define all the treated datasets as 2D arrays, of size 512x512 in this example. Then you would have the shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I’m open to suggestions :)
Best, Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 17/4/25, at 13:19, Sebastian Hambura <sebastian.hambura@embl.de> wrote:
Hi Pierre, Ok, this clarifies the whole thing a bit. But then concretely, what is the spectrum (as intensity = f(frequency)) used for a given treated point? For example, if I wanted to see the underlying spectrum (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display?
Best, Sebastian
Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany
On 17.04.2025 at 11:50, Pierre Bouvet wrote:
Hi Sebastian, Sorry for the lack of details, I’ll try to explain it as best I can. The Frequency and Intensity datasets are the size of my detector (for now 512x512, but hopefully soon 2048x2048 if we can get the money ^^). By nature, I am entangling 3 hyperparameters in my measurements: shift frequency, azimuthal angle of observation, polar angle of observation (with errors on each of them that I would add later on, but let’s say my device is perfect for now). Now the VIPA spectrometer creates constructive interference for specific frequencies, so after treatment I will only get treated values for the points where constructive interference happened.
For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when trying to represent things, use a color mesh with the azimuthal and polar abscissa as the x and y positions of the shift and linewidth. For the calibration now, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a dependence of sin(theta/2) on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for azimuthal and polar angles, as you can see in the calibration group). That’s weird for cells.h5, H5Viewer told me nothing on VS, but on the other hand I might have my answer for why the file was super heavy, just need to figure out how to delete groups and datasets without truncating the file. Here is the same file but made without deleting files from the HDF5 file (I tried it on myHDF5 it seems to work): https://we.tl/t-WlmabmsXHo Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de><mailto:sebastian.hambura@embl.de> wrote: Hi Pierre, I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup but I'm a bit confused: - Frequency and Intensity are 512*512 for all 3 experiments/datagroups - Treated for 'Mock anisotropic sample' is 3584(*1) : are you somehow using the same row of your PSD for multiple treated datapoints ? Or are you doing some 'subpixel' rows ? 
- Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data?
For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, and the tool refuses to open it and reports this error:
(...) unable to read superblock major: File accessibility minor: Read failed #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152 major: File accessibility minor: File has been truncated
I'll see if other tools might still be able to read it.
Cheers, Sebastian
Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany
On 14.04.2025 at 17:56, Pierre Bouvet wrote:
Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn For a more classic file, here are some measurements of cells (I only kept 2 cell types, 2 days of measurement and 2 samples per day of measurement for a total of 8 spectra + their frequency/treated arrays). I treated them using custom code, so I didn’t export the treatment steps in the JSON file I mentioned at the last meeting. Some of the measurements are rubbish, but I guess you don’t care about that ^^ One problem I observed, though, is that I created this file by deleting elements from a larger file, but it seems the space is still allocated in the HDF5 file. So we get a 500 MB file when a typical size for 8 spectra et al. is usually under 100 MB. I’ll dig into that when I get time.
_______________________________________________ Software mailing list -- software@biobrillouin.org To unsubscribe send an email to software-leave@biobrillouin.org This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law.
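Pierre's suggestion in the 17 April exchange quoted above is to store every treated dataset on the detector grid, so that shift[i, j] lines up with PSD[i, j, :] instead of living in a 1D list that needs technique-specific index lookups. The sketch below illustrates that conversion under one loud assumption: that the detector (row, column) index of each treated point is already known. The helper name `scatter_to_grid` and the sample values are made up for illustration and do not come from Pierre's files or from HDF5_BLS.

```python
import math


def scatter_to_grid(values, rows, cols, shape):
    """Place 1D treated values on a 2D detector-shaped grid.

    Pixels without a treated value stay NaN, so they can be masked when plotting.
    """
    if not (len(values) == len(rows) == len(cols)):
        raise ValueError("values, rows and cols must have the same length")
    n_rows, n_cols = shape
    grid = [[math.nan] * n_cols for _ in range(n_rows)]
    for value, i, j in zip(values, rows, cols):
        grid[i][j] = value
    return grid


# Illustrative values only: three treated points on a 4x4 "detector".
shift_1d = [5.1, 5.3, 7.4]
rows = [0, 1, 3]
cols = [2, 2, 0]
shift_2d = scatter_to_grid(shift_1d, rows, cols, shape=(4, 4))
```

With NumPy the same scatter is a single fancy-indexing assignment, `grid[rows, cols] = values`, on an array created with `numpy.full(shape, numpy.nan)`. The trade-off is storage: a 512x512 grid holds 262144 entries even when only a few thousand points carry a treated value.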
Dear all, would be great to meet - however, I think there was a misunderstanding concerning dates when I talked to Carlo about this. I actually proposed Mon. June 2nd at 3pm. Since the 26th is anyway a holiday in the UK, would everyone be available then (or later that day)? Best, Robert
-- Dr. Robert Prevedel Group Leader Cell Biology and Biophysics Unit European Molecular Biology Laboratory Meyerhofstr. 1 69117 Heidelberg, Germany Phone: +49 6221 387-8722 Email: robert.prevedel@embl.de http://www.prevedel.embl.de
On 22.05.2025, at 15:00, Sal La_Cavera_Iii via Software <software@biobrillouin.org> wrote:
Thanks for updates Carlo! If possible are people available at the same time on Tuesday or Wednesday the 27th/28th? 26th is a holiday in the UK so I'll be out of town but will be back in the office the next day.
--------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>
From: Kareem Elsayad via Software <software@biobrillouin.org> Sent: 22 May 2025 13:52 To: software@biobrillouin.org Subject: [Software] Re: [EXTERN] Practice dataset
Thanks all 😊 Yes, indeed a quick update on how everything has evolved on different fronts might be nice. I could join quickly on Mon 26th @3pm if that works for everyone.
Here’s Zoom link we can use… https://us02web.zoom.us/j/5191046969?pwd=alpzaldoZEd3N2ZEQ0hYZU1RR1dOdz09 Meeting ID: 519 104 6969 Passcode: jY3zH8
All the best, Kareem
From: Carlo Bevilacqua via Software <software@biobrillouin.org <mailto:software@biobrillouin.org>> Reply to: Carlo Bevilacqua <carlo.bevilacqua@embl.de <mailto:carlo.bevilacqua@embl.de>> Date: Thursday, 22. May 2025 at 13:08 To: <software@biobrillouin.org <mailto:software@biobrillouin.org>> Subject: [Software] Re: [EXTERN] Practice dataset
Hi Sal, thanks a lot for writing the script.
Actually the specs changed a bit and here <https://github.com/prevedel-lab/Brillouin-standard-file/blob/main/docs/BLS_f...> is the most up-to-date version (nothing major, just better definitions for some fields). The main difference, though, is that we are now using Zarr files (as we discussed in our last meeting). In the meantime I wrote a library which exposes an intuitive interface to read/write our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's hdf5 to ours.
Also Sebastian progressed quite a bit on the GUI side.
Shall we have a meeting to update on each other's status? Would Monday 26th in the afternoon (e.g. 3pm) work for you?
Best, Carlo
On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii <salvatore.lacaveraiii@nottingham.ac.uk> wrote:
Hi guys,
Hope everyone's doing well!
Found some time to get this hierarchical-to-flat conversion script off the ground. In the WeTransfer link below (3 days until it expires, so let me know if you need me to send it again) there's a Python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document <https://github.com/bio-brillouin/HDF5_BLS/blob/main/Bh5_file_spec.md> but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. Additional datasets/attributes should be fairly easy to add, but this will probably require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc.). It will probably also require some rigidity on the part of the end-user: e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f'; it'll have to be 'Frequency'.
Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straightforward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in.
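To make the two points above concrete (fixed dataset names, and Analysis_{m} instead of a hardcoded Analysis_0), here is a minimal sketch. It is not the script from the WeTransfer link: it uses plain nested dicts as a stand-in for the h5 hierarchy so it runs without h5py, and the alias table, the choice of which datasets go under /Analysis_{m}, and the flat output paths are all assumptions made for illustration, not taken from the file spec.

```python
# Hypothetical alias table: names a user might have typed -> canonical name.
ALIASES = {
    "frequency": "Frequency", "freq": "Frequency", "f": "Frequency",
    "shift": "Shift", "linewidth": "Linewidth", "raw_data": "Raw_data",
}
# Assumption: these are the outputs of a signal processing run.
ANALYSIS_OUTPUTS = {"Shift", "Linewidth"}


def canonical_name(name):
    """Map a user-chosen dataset name to its canonical form (keeps a _std suffix)."""
    base, suffix = name, ""
    if name.lower().endswith("_std"):
        base, suffix = name[:-4], "_std"
    key = base.strip().lower()
    if key not in ALIASES:
        raise KeyError(f"unrecognised dataset name: {name!r}")
    return ALIASES[key] + suffix


def flatten(tree, analysis_index=0):
    """Walk a nested dict and return {flat_path: data} for every leaf dataset."""
    flat = {}

    def walk(node):
        for name, child in node.items():
            if isinstance(child, dict):      # a group: recurse into it
                walk(child)
                continue
            canon = canonical_name(name)     # a dataset: normalise its name
            base = canon[:-4] if canon.endswith("_std") else canon
            if base in ANALYSIS_OUTPUTS:
                path = f"/Analysis_{analysis_index}/{canon}"
            else:
                path = f"/{canon}"
            if path in flat:
                raise ValueError(f"two datasets map to {path}")
            flat[path] = child

    walk(tree)
    return flat
```

Calling flatten(tree, analysis_index=1) on a second processing run of the same data would then write its Shift and Linewidth under /Analysis_1 rather than overwriting /Analysis_0. The alias table is the lenient alternative to forcing users to type 'Frequency'; if rigidity is preferred, it could simply contain the canonical names only.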
Carlo/Sebastian, does this file format trend towards what you're looking for when handing over Pierre's h5 file to your plotting side of things?
https://we.tl/t-TszCgBzwfE
Any feedback etc just let me know, happy to chat further,
Cheers,
Sal
From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at>
Sent: 17 April 2025 12:35
To: Sebastian Hambura <sebastian.hambura@embl.de>
Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Carlo Bevilacqua <carlo.bevilacqua@embl.de>
Subject: Re: [EXTERN] Practice dataset
Hi Sebastian,
Was just writing to you to say that in my hurry, I didn't add the treated data for cell.h5. I'll do that whenever I get a bit of time or when the GUI is able to do it (most likely option B, unless you want it fast, in which case I can spend a few minutes doing it by hand, you tell me :) ).
Your question is good: the way you'd do it for these data is to recover the index from the azimuthal and polar calibration datasets. It's indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis of time-domain datasets). For now, however, I'd say that it's sufficient since, as far as I know, I have the only spectrometer able to do that, and there are technique-specific ways to access this information.
Since I guess you also want to be able to convert this information into your data format without having to create technique-specific access functions, one way to avoid this issue altogether could be to define all the treated datasets as 2D arrays, of size (512x512) in this example. Then you would have shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I'm open to suggestions :)
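As a rough illustration of the 2D proposal above, the sketch below scatters a 1D treated array onto a detector-shaped grid so that the entry at [i][j] lines up with the spectrum at PSD[i][j]. It assumes that a detector row and column index is already known for every treated point (in the real file these would have to be recovered from the azimuthal and polar calibration datasets, which is not shown here); the function names and the use of NaN for pixels without a treated value are illustrative choices, not part of any spec.

```python
import math


def scatter_to_grid(values, rows, cols, shape):
    """Place 1D treated values on a 2D grid; pixels without a value stay NaN."""
    if not (len(values) == len(rows) == len(cols)):
        raise ValueError("values, rows and cols must have the same length")
    n_rows, n_cols = shape
    grid = [[math.nan] * n_cols for _ in range(n_rows)]
    for value, i, j in zip(values, rows, cols):
        grid[i][j] = value
    return grid


def spectrum_for(psd, i, j):
    """Return the spectrum behind the treated point (i, j), i.e. PSD[i, j, :]."""
    return psd[i][j]
```

With this layout a viewer needs no technique-specific lookup: clicking on shift[i][j] means displaying spectrum_for(psd, i, j). The cost is that the grid is mostly NaN when only a few thousand points (3584 in the mock file) carry treated values on a 512x512 detector.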
Best,
Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 17/4/25, at 13:19, Sebastian Hambura <sebastian.hambura@embl.de> wrote:
Hi Pierre,
Ok, this clarifies the whole thing a bit. But then concretely, which spectrum (as intensity = f(frequency)) is used for a given treated point? For example, if I wanted to see the underlying spectrum (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display?
Best,
Sebastian
Sebastian Hambura
Software Engineer
Robert Prevedel Group
Cell Biology and Biophysics
EMBL Heidelberg
Meyerhofstraße 1
69117 Heidelberg, Germany

On 17.04.2025 at 11:50, Pierre Bouvet wrote:
Hi Sebastian,
Sorry for the lack of details, I'll try to explain it as best I can.
The Frequency and Intensity datasets are the size of my detector (for now 512x512, but hopefully soon 2048x2048 if we can get the money ^^). By nature, I am entangling 3 hyperparameters in my measurements: shift frequency, azimuthal angle of observation, and polar angle of observation (with errors on each of them that I would add later on, but let's say my device is perfect for now). Now, the VIPA spectrometer creates constructive interference for specific frequencies, so after treatment I will only get treated values for the points where constructive interference happened. For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when trying to represent things, use a color mesh with the azimuthal and polar abscissae as the x and y positions of the shift and linewidth.
For the calibration, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a dependence of sin(theta/2) on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for the azimuthal and polar angles, as you can see in the calibration group).
That's weird for cells.h5: H5Viewer didn't report anything on VS. On the other hand I might have my answer for why the file was super heavy; I just need to figure out how to delete groups and datasets without truncating the file. Here is the same file but made without deleting anything from the HDF5 file (I tried it on myHDF5 and it seems to work): https://we.tl/t-WlmabmsXHo
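The sin(theta/2) dependence used for the water calibration is the standard Brillouin shift relation, shift = (2 n v / wavelength) * sin(theta/2). A small sketch of how a 2D calibration image could be rebuilt from a per-pixel angle map follows. The default refractive index, sound velocity and 532 nm wavelength are assumed textbook-style values for water and are not taken from this thread, and theta_map is a hypothetical stand-in for the angle values stored in the calibration group.

```python
import math


def brillouin_shift(theta, n=1.33, v=1480.0, wavelength=532e-9):
    """Brillouin shift in Hz for angle theta (radians).

    n, v (m/s) and wavelength (m) are assumed values for water, for illustration.
    """
    return 2.0 * n * v / wavelength * math.sin(theta / 2.0)


def calibration_map(theta_map):
    """Build a 2D shift image from a 2D map of angles (one angle per pixel)."""
    return [[brillouin_shift(theta) for theta in row] for row in theta_map]
```

With these assumed values the shift at theta = pi (backscattering) comes out at about 7.4 GHz. Because the polar angle does not enter the formula, every pixel sharing the same theta gets the same shift, which is why a full 512x512 image can be filled in for the calibration group while the anisotropic sample only has values at isolated points.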
Best,
Pierre
On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de> wrote:
Hi Pierre,
I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup, but I'm a bit confused:
- Frequency and Intensity are 512*512 for all 3 experiments/datagroups
- Treated for 'Mock anisotropic sample' is 3584(*1): are you somehow using the same row of your PSD for multiple treated datapoints? Or are you doing some 'subpixel' rows?
- Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data?
For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, but the tool refuses to open it and reports this error:
(...)
unable to read superblock
    major: File accessibility
    minor: Read failed
#006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152
    major: File accessibility
    minor: File has been truncated
I'll see if other tools might still be able to read it.
Cheers,
Sebastian
On 14.04.2025 at 17:56, Pierre Bouvet wrote:
Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn
For a more classic file, here are some measurements of cells (I only kept 2 cell types, 2 days of measurement and 2 samples per day for a total of 8 spectra + their frequency/treated arrays). I treated them using custom code, so I didn't export the treatment steps in the JSON file I mentioned at the last meeting. Some of the measurements are rubbish but I guess you don't care about that ^^
One problem I observed, though, is that I created this file by deleting elements from a larger file, but it seems the space is still allocated in the HDF5 file. So we get a 500 MB file when a typical size for 8 spectra and their arrays is usually under 100 MB. I'll dig into that when I get time. Here is the link: https://we.tl/t-HUAf2lR9AF
Best,
Pierre
On 14/4/25, at 17:04, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote:
Hi both,
Pierre I think that WeTransfer link expired! Any chance you can regenerate it?
Potentially the simulated angle-resolved dataset will be a good late-stage development test of the software's generalisability? In the first instance should we just use a basic 2D scan of a cell or zebrafish, C. elegans, etc., e.g. from a 2-stage VIPA? Something we can treat like the MNIST or ImageNet dataset in the Deep Learning community.
Carlo, what about something from your 2019 zebrafish paper? (unless you guys have something else you've both already been using for dev purposes).
Safe travels guys! Enjoy that Colomba pasquale Carlo! My favourite are the pistachio ones :)
Cheers,
Sal
From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at>
Sent: 14 April 2025 10:04
To: Carlo Bevilacqua <carlo.bevilacqua@embl.de>
Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Sebastian Hambura <sebastian.hambura@embl.de>
Subject: Re: [EXTERN] Practice dataset
Hi,
That sounds awesome, wish I could come with you to Italy, must be wonderful at this time of year!!
Perfect for the legal department @ EMBL. To be clear, I think your local approach is robust by design, but I'm always scared of having things online (that's why I never used GitHub before this project ^^). In particular here, my concern is mainly about people somehow being able to access the data that will be opened or treated (I'm still traumatized by how stupidly complex it was, legally speaking, to work with human samples in France) and people being able to modify the code so it downloads whatever.
I've looked into it a bit this weekend, and people @ HDF have done things with safety in mind, so there are normally no SQL-like attacks possible and no possibility of saving compiled programs. So I guess that adding a layer between your code and the user to filter out all non-HDF5 files coming in or out, and adding hashing of the original data to confirm that no corruption occurs, is enough, but I'm no specialist in cybersecurity.
For third-party attacks it's trickier. I looked into the H5Web GitHub page to try and understand how they did it, but I'm not super familiar with TypeScript so, aside from the HTTPS protocol, I don't really know how they do it. I might be over-worrying here (kind of my specialty) but I think if this project is successful and BLS develops enough, it will become a real problem, so we might as well do it now.
Have a nice time in Italy!
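For what it's worth, the two safeguards mentioned above (rejecting non-HDF5 files and hashing the original data) can be sketched with the Python standard library alone. This is only an illustration of the idea, not a security review: the function names are made up, the signature check only looks at offset 0 (an HDF5 file with a user block carries the same 8-byte signature at a later offset such as 512 or 1024), and a matching signature says nothing about whether the rest of the file is well-formed.

```python
import hashlib

# The 8-byte signature that starts an HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file starts with the HDF5 signature (offset 0 only)."""
    with open(path, "rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A typical use would be to record sha256_of(path) when a file is first opened and compare it again before and after any write, so that an unexpected change to the original data is at least detected.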
Best,
Pierre
On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de> wrote:
Hi guys, thanks for getting the process of generating a standard dataset started.
Sorry for the late reply but in the last days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday. I will do home office when in Italy and I will try to generate a file with some experimental data.
@Pierre regarding your concerns about GDPR, we could for sure double-check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transferred to any external server; it is just being read by the browser (basically the browser is acting like your Qt GUI and in principle it could run completely offline once the website is loaded). I think we should put a disclaimer on the website about this, and maybe the legal department could help with the exact phrasing.
Best, Carlo
On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> wrote:
Hi everyone,
I've created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It's a little heavy so I prefer to send it using WeTransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu
I'm sending you this one first because it is (I think) the most exotic BLS technique in terms of data storage needs. It forced me to revise the format a lot when I started working on it. The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collection, the polar angles of collection, and the positions on the detector, leading to the need for multiple abscissae defined on the same axis (I promise I'll try explaining it better with images and all when I start writing the article). This technique will at some point (when we get the green light to buy a new detector) be used for imaging, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I'm still brainstorming a lot about the ways to use this method effectively in biological samples.
I have to underline that with the few devices I'm working with here, I find that in general each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis; in TFP you never have raw data; in 1-VIPA you usually bin the signal on the detector; in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…), so I'll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility.
I'm currently doing two long acquisitions for two groups we collaborate with so I don't know how long it'll take, but I am confident all other techniques will be much easier to implement in Carlo's structure.
Another detail: I found an unexpected limitation in the HDF5_BLS library that doesn't allow me to add attributes directly to the top group, so for now each group has its own set of attributes. I've added that to the list of things to do ^^
For the time-domain compatibility, I forgot my parents were coming to Vienna this weekend so I unfortunately couldn't do it, but it's close to the top of the list now so it'll be done soon :)
Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you're proposing not only to read the file but also to write to it using a web-based application, I'm scared we will at some point have to face the infamous GDPR, get a green light from an Ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people at EMBL who could either advise us on these safety points or, better, oversee them? This is one of the few computer things I've never really studied because I always had the magic "just don't put it online" joker, and also because I kind of flee legislation stuff whenever I can, but I guess this is not an option anymore so we need to do it properly (plus if we don't, from what I understand we can get into huge trouble). Tell me what you think :)
Best,
Pierre
On 7/4/25, at 10:39, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote:
Good morning guys,
Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably, Pierre, an .h5 with your preferred structure etc., even if it's a work in progress / likely to be modified a little bit in the future. If not I can whip something up, or repurpose something, but I would prefer to work with a dataset that is most representative and most useful for you guys!
Cheers,
Sal
This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law.
_______________________________________________
Software mailing list -- software@biobrillouin.org
To unsubscribe send an email to software-leave@biobrillouin.org
Ok by me

From: Robert Prevedel via Software <software@biobrillouin.org>
Reply to: Robert Prevedel <prevedel@embl.de>
Date: Thursday, 22. May 2025 at 16:12
To: Sal La_Cavera_Iii <Sal.La_Cavera_Iii@nottingham.ac.uk>
Cc: "software@biobrillouin.org" <software@biobrillouin.org>
Subject: [Software] Re: [EXTERN] Practice dataset
Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de> wrote: Hi guys, thanks for getting started the process of generating a standard dataset. Sorry for the late reply but in the last days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday. I will do home office when in Italy and I will try to generate a file with some experimental data. @Pierre regarding your concerns about GDPR, we could for sure double check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transfered to any external server, it is just being read by the browser (basically the browser is acting like your Qt GUI and in principle it could run completely offline once the website is loaded). I think we should put a disclaimer on the website about this and maybe the legal department could help on the exact phrasing. Best, Carlo On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> wrote: Hi everyone, I’ve created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It’s a little heavy so I prefer to send it using wetransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu I’m sending you this one first because it is the most exotic BLS technique in terms of the needs for data storage (I think). It asked me to revise the format a lot when I started working on it. 
The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collections, the polar angle of collections, and the positions on the detector, leading to the need of multiple abscissa defined on the same axis (I promise I'll try explaining it better with images and all when I’ll start writing the article). This technique will at one point (when we get the green light to buy a new detector) be used for imagery, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I’m still brainstorming a lot the ways to use this method effectively in biological samples. I have to underline that with the few devices I’m working with here, I find that in general, each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis, in TFP you never have raw data, in 1-VIPA you usually bin the signal on the detector, in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…) so I’ll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility. I’m currently doing two long acquisitions for two groups we collaborate with so I don’t know how long it’ll take, but I am confident all other techniques will be much easier to implement in Carlo’s structure. Another detail : I found an unexpected limitation in the HDF5_BLS library that doesn’t allow me to add attributes directly to the top group so for now each group has its own set of attributes. 
I’ve added that on the list of things to do ^^ For the time-domain compatibility, I forgot my parents where coming to Vienna this weekend so I unfortunately couldn’t do it, but it’s close to the top of the list now so it’ll be done soon :) Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you’re not only proposing to read the file but also write in it using a web-based application, I’m scared we will at one point have to face the infamous GDPR, get a green light from an Ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people that could either advise us on these safety points or better, overview them at EMBL? This is one of the few computer things I’ve never really studied because I always had the magic “just don’t put it online” joker - and also because I kind of flee legislation stuff whenever I can - but I guess this is not an option anymore so we need to do it properly (plus if we don’t, from what I understand we can get in huge troubles). Tell me what you think :) Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 7/4/25, at 10:39, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote: Good morning guys, Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably Pierre an .h5 with your preferred structure etc, even if it's a work in project / likely to be modified a little bit in the future. If not I can whip something up, or repurpose something, but would prefer to work with a dataset that is most-representative and most useful for your guys! 
Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 Book a Coffee and Research chat with me! This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law. _______________________________________________ Software mailing list -- software@biobrillouin.org To unsubscribe send an email to software-leave@biobrillouin.org
OK for me too Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 22/5/25, at 16:23, Kareem Elsayad via Software <software@biobrillouin.org> wrote:
Ok by me
From: Robert Prevedel via Software <software@biobrillouin.org>
Reply to: Robert Prevedel <prevedel@embl.de>
Date: Thursday, 22. May 2025 at 16:12
To: Sal La_Cavera_Iii <Sal.La_Cavera_Iii@nottingham.ac.uk>
Cc: "software@biobrillouin.org" <software@biobrillouin.org>
Subject: [Software] Re: [EXTERN] Practice dataset
Dear all,
would be great to meet - however, I think there was a misunderstanding concerning dates when I talked to Carlo about this. I actually proposed Mon. June 2nd at 3pm. Since the 26th is anyway a holiday in the UK, would everyone be available then (or later that day)?
Best,
Robert -- Dr. Robert Prevedel Group Leader Cell Biology and Biophysics Unit European Molecular Biology Laboratory Meyerhofstr. 1 69117 Heidelberg, Germany
Phone: +49 6221 387-8722 Email: robert.prevedel@embl.de http://www.prevedel.embl.de
On 22.05.2025, at 15:00, Sal La_Cavera_Iii via Software <software@biobrillouin.org> wrote:
Thanks for updates Carlo! If possible are people available at the same time on Tuesday or Wednesday the 27th/28th? 26th is a holiday in the UK so I'll be out of town but will be back in the office the next day.
---------------------------------------------------------------
Salvatore La Cavera III
Royal Academy of Engineering Research Fellow
Nottingham Research Fellow
Optics and Photonics Group
University of Nottingham
Email: salvatore.lacaveraiii@nottingham.ac.uk
ORCID iD: 0000-0003-0210-3102
Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>

From: Kareem Elsayad via Software <software@biobrillouin.org>
Sent: 22 May 2025 13:52
To: software@biobrillouin.org
Subject: [Software] Re: [EXTERN] Practice dataset
Thanks all 😊 Yes, indeed a quick update on how everything has evolved on different fronts might be nice. I could join quickly on Mon 26th @3pm if that works for everyone.
Here’s Zoom link we can use… https://us02web.zoom.us/j/5191046969?pwd=alpzaldoZEd3N2ZEQ0hYZU1RR1dOdz09 Meeting ID: 519 104 6969 Passcode: jY3zH8
All the best, Kareem
From: Carlo Bevilacqua via Software <software@biobrillouin.org>
Reply to: Carlo Bevilacqua <carlo.bevilacqua@embl.de>
Date: Thursday, 22. May 2025 at 13:08
To: <software@biobrillouin.org>
Subject: [Software] Re: [EXTERN] Practice dataset
Hi Sal, thanks a lot for writing the script.
Actually the specs changed a bit and here <https://github.com/prevedel-lab/Brillouin-standard-file/blob/main/docs/BLS_f...> is the most updated version (nothing major, just better defining some fields). The main difference though is that we are now using Zarr file (as we discussed in our last meeting). In the meanwhile I wrote a library which exposes an intuitive interface to read/write to our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's hdf5 to ours.
Also Sebastian progressed quite a bit on the GUI side.
Shall we have a meeting to update on each other's status? Would Monday 26th in the afternoon (e.g. 3pm) work for you?
Best, Carlo
On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii <salvatore.lacaveraiii@nottingham.ac.uk> wrote:
Hi guys,
Hope everyone's doing well!
Found some time to get this hierarchical-to-flat conversion script off the ground. In the WeTransfer link below (3 days until it expires, so let me know if you need me to send it again) there's a Python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document <https://github.com/bio-brillouin/HDF5_BLS/blob/main/Bh5_file_spec.md> but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. Additional datasets/attributes should be fairly easy to add, but this will probably require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc). It will probably also require some rigidity on the part of the end-user: e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f'; it'll have to be 'Frequency'.
Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straightforward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in.
Carlo/Sebastian, does this file format trend towards what you're looking for when handing over Pierre's h5 file to your plotting side of things?
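A minimal sketch of the two points above (strict dataset names and a parametrised /Analysis_{m} group instead of a hardcoded Analysis_0). This is not the script from the WeTransfer link: the alias table, the function names and the idea of placing Shift and Linewidth directly under /Analysis_{m} are assumptions made for illustration only, and the real layout should be taken from the file spec.

```python
# Hypothetical alias table: the canonical names on the right come from this
# thread; the accepted aliases on the left are illustrative only.
ALIASES = {
    "frequency": "Frequency",
    "freq": "Frequency",
    "f": "Frequency",
    "shift": "Shift",
    "linewidth": "Linewidth",
}


def canonical_name(name):
    """Map a user-supplied dataset name to its canonical name, or raise."""
    key = name.strip().lower()
    if key not in ALIASES:
        raise KeyError(f"unrecognised dataset name: {name!r}")
    return ALIASES[key]


def analysis_path(name, analysis_index=0):
    """Build a path under /Analysis_{m} (assumed layout, not checked against the spec)."""
    if analysis_index < 0:
        raise ValueError("analysis_index must be non-negative")
    return f"/Analysis_{analysis_index}/{canonical_name(name)}"


if __name__ == "__main__":
    print(canonical_name("freq"))
    print(analysis_path("shift", analysis_index=1))
```

Whether to silently normalise aliases like this or to reject anything other than the exact name is a design choice; rejecting is stricter but keeps files written by different tools consistent.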
Any feedback etc just let me know, happy to chat further,
Cheers,
Sal
---------------------------------------------------------------
Salvatore La Cavera III
Royal Academy of Engineering Research Fellow
Nottingham Research Fellow
Optics and Photonics Group
University of Nottingham
Email: salvatore.lacaveraiii@nottingham.ac.uk
ORCID iD: 0000-0003-0210-3102
Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>

From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at>
Sent: 17 April 2025 12:35
To: Sebastian Hambura <sebastian.hambura@embl.de>
Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Carlo Bevilacqua <carlo.bevilacqua@embl.de>
Subject: Re: [EXTERN] Practice dataset
Hi Sebastian,
Was just writing to you to say that in my hurry, I didn’t add the treated data for cell.h5. I’ll do that whenever I get a bit of time or when the GUI is able to do it (most likely option B, unless you want it fast, in which case I can spend a few minutes doing that by hand, you tell me :) ). Your question is good: the way you’d do it for these data is to recover the index on the azimuthal and polar calibration datasets. It’s indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis with time-domain datasets). For now however, I’d say that it’s sufficient since, as far as I know, I have the only spectrometer able to do that, and there are technique-specific ways to access this information. Since I guess you also want to be able to convert this information into your data format and not have to create technique-specific access functions, one way to avoid this issue altogether could be to define all the treated datasets as 2D arrays of size (512x512) in this example. Then you would have the shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I’m open to suggestions :)
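To make the shift[i, j] / PSD[i, j, :] association concrete, here is a small sketch with plain nested lists on a toy 2x2 "detector" instead of 512x512. It assumes a single frequency axis shared by all pixels and uses the position of the spectrum maximum as a stand-in for a real fit; both are simplifications for illustration and not how the treated data in the files were produced.

```python
def peak_frequency(frequency, spectrum):
    """Frequency at which one spectrum is maximal (placeholder for a real fit)."""
    k = max(range(len(spectrum)), key=lambda idx: spectrum[idx])
    return frequency[k]


def shift_map(frequency, psd):
    """Return shift such that shift[i][j] is derived from psd[i][j][:]."""
    return [[peak_frequency(frequency, spectrum) for spectrum in row] for row in psd]


# Toy data: 2x2 pixels, 3 spectral channels each (values are made up).
frequency = [4.0, 5.0, 6.0]  # GHz
psd = [
    [[0.1, 0.9, 0.2], [0.8, 0.1, 0.1]],
    [[0.0, 0.2, 0.7], [0.3, 0.6, 0.1]],
]

shift = shift_map(frequency, psd)

if __name__ == "__main__":
    # The spectrum behind any treated point is recovered with the same indices.
    i, j = 1, 0
    print(shift[i][j], psd[i][j])
```

With this layout a viewer needs no technique-specific lookup: the indices of a treated value are the indices of its spectrum.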
Best,
Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 17/4/25, at 13:19, Sebastian Hambura <sebastian.hambura@embl.de> wrote:
Hi Pierre,
Ok, this clarifies the whole thing a bit. But then concretely, which spectrum (as intensity = f(frequency)) is used for a given treated point? For example, if I wanted to see the underlying spectrum (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display? Best,
Sebastian
Sebastian Hambura
Software Engineer
Robert Prevedel Group
Cell Biology and Biophysics
EMBL Heidelberg
Meyerhofstraße 1
69117 Heidelberg, Germany

On 17.04.2025 at 11:50, Pierre Bouvet wrote:
Hi Sebastian,
Sorry for the lack of details, I’ll try to explain it as best I can.

The Frequency and Intensity datasets are the size of my detector (for now 512x512 but hopefully soon 2048x2048 if we can get the money ^^). By nature, I am entangling 3 hyperparameters in my measurements: shift frequency, azimuthal angle of observation and polar angle of observation (with errors on each of them that I would add later on, but let’s say my device is perfect for now). Now the VIPA spectrometer creates constructive interference for specific frequencies, so after treatment I will only get treated values for the points where constructive interference happened. For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when trying to represent things, use a color mesh with the azimuthal and polar abscissa as the x and y positions of the shift and linewidth.

For the calibration, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a dependence of sin(theta/2) on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for azimuthal and polar angles, as you can see in the calibration group).

That’s weird for cells.h5, H5Viewer told me nothing on VS, but on the other hand I might have my answer for why the file was super heavy. I just need to figure out how to delete groups and datasets without truncating the file. Here is the same file but made without deleting elements from the HDF5 file (I tried it on myHDF5 and it seems to work): https://we.tl/t-WlmabmsXHo
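As a rough illustration of why the water calibration can be stored as a full 2D image: for a homogeneous, isotropic medium the Brillouin shift scales as sin(theta/2) with the scattering angle theta, so a shift value can be computed for every detector pixel from its angle alone. The sketch below assumes that the angle called azimuthal in the message plays the role of theta, and the 7.5 GHz backscattering value and the 2x2 angle map are made-up illustrative numbers, not values from the file.

```python
import math


def isotropic_shift(theta_deg, backscatter_shift):
    """Shift at scattering angle theta, given the shift at theta = 180 degrees."""
    return backscatter_shift * math.sin(math.radians(theta_deg) / 2.0)


def calibration_image(theta_map_deg, backscatter_shift):
    """Per-pixel shift image from a per-pixel angle map (no polar dependence)."""
    return [
        [isotropic_shift(theta, backscatter_shift) for theta in row]
        for row in theta_map_deg
    ]


# Toy 2x2 angle map in degrees (hypothetical values).
theta_map = [
    [90.0, 120.0],
    [150.0, 180.0],
]

image = calibration_image(theta_map, backscatter_shift=7.5)

if __name__ == "__main__":
    for row in image:
        print([round(value, 3) for value in row])
```

For an anisotropic sample no such closed form is assumed, which is why the treated values there stay as 1D lists with their own angle values.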
Best,
Pierre
Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria
On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de> wrote:
Hi Pierre,
I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup, but I'm a bit confused:
- Frequency and Intensity are 512*512 for all 3 experiments/datagroups
- Treated for 'Mock anisotropic sample' is 3584(*1): are you somehow using the same row of your PSD for multiple treated datapoints? Or are you doing some 'subpixel' rows?
- Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data?

For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, and the tool refused to open it and reported this error:

(...) unable to read superblock
    major: File accessibility
    minor: Read failed
#006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152
    major: File accessibility
    minor: File has been truncated

I'll see if other tools might still be able to read it.

Cheers,
Sebastian
Sebastian Hambura
Software Engineer
Robert Prevedel Group
Cell Biology and Biophysics
EMBL Heidelberg
Meyerhofstraße 1
69117 Heidelberg, Germany

On 14.04.2025 at 17:56, Pierre Bouvet wrote:
> Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn
> For a more classic file, here are some measurements of cells (I only kept 2 cell types, 2 days of measurement and 2 samples per day of measurement for a total of 8 spectra + their frequency/treated arrays). I treated them using custom code so I didn’t export the treatment steps in the JSON file I mentioned at the last meeting. Some of the measurements are rubbish but I guess you don’t care about it ^^
> One problem I observed though is that I created this file by deleting elements from a larger file, but it seems the space is still allocated to the HDF5 file. So we get a 500 MB file when a typical size for 8 spectra et al. is usually under 100 MB. I’ll dig into that when I get time. Here is the link: https://we.tl/t-HUAf2lR9AF
>
> Best,
>
> Pierre
>
> Pierre Bouvet, PhD
> Post-doctoral Fellow
> Medical University Vienna
> Department of Anatomy and Cell Biology
> Wahringer Straße 13, 1090 Wien, Austria
>
>> On 14/4/25, at 17:04, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote:
>>
>> Hi both,
>>
>> Pierre I think that WeTransfer link expired! Any chance you can regenerate it?
>>
>> Potentially the simulated angle-resolved dataset will be a good late-stage-dev to test the software's generalisability? In the first instance should we just use a basic 2D scan of a cell or zebrafish, C. elegans, etc? From a 2-stage VIPA e.g. Something we can treat as the MNIST or ImageNet dataset in the Deep Learning community.
>>
>> Carlo, what about something from your 2019 zebrafish paper? (unless you guys have something else you've already both been using for dev purposes).
>>
>> Safe travels guys! Enjoy that Colomba pasquale Carlo!
>> My favourite are the pistachio ones :)
>>
>> Cheers,
>>
>> Sal
>>
>> ---------------------------------------------------------------
>> Salvatore La Cavera III
>> Royal Academy of Engineering Research Fellow
>> Nottingham Research Fellow
>> Optics and Photonics Group
>> University of Nottingham
>> Email: salvatore.lacaveraiii@nottingham.ac.uk
>> ORCID iD: 0000-0003-0210-3102
>> Book a Coffee and Research chat with me! <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...>
>>
>> From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at>
>> Sent: 14 April 2025 10:04
>> To: Carlo Bevilacqua <carlo.bevilacqua@embl.de>
>> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Sebastian Hambura <sebastian.hambura@embl.de>
>> Subject: Re: [EXTERN] Practice dataset
>>
>> Hi,
>>
>> That sounds awesome, wish I could come with you to Italy, must be wonderful at this time of year!!
>> Perfect for legal department @ EMBL. To be clear, I think your local approach is robust by design but I’m always scared of having things online (that’s why I never used GitHub before this project ^^). In particular here, my concern is mainly on people being able to have somehow access to the data that will be opened or treated (I’m still traumatized by how stupidly complex it was to work with human samples in France legally speaking) and people being able to modify the code so it downloads whatever.
>> I’ve looked a bit this weekend and people @ HDF have done things with safety in mind, so there are normally no SQL-like attacks possible and no possibility of saving compiled programs. So I guess just adding a layer between your code and the user to filter all non-HDF5 files coming in or out, and adding hashing to the original data to confirm that no corruption will occur, is enough, but I’m no specialist in cybersecurity. For third-party attacks it’s trickier. I looked into the H5Web GitHub page to try and understand how they did it, but I’m not super familiar with TypeScript so aside from the HTTPS protocol, I don’t really know how they do it. I might be over-worrying here (kind of my specialty) but I think if this project is successful and BLS develops enough, it will become a real problem so we might as well do it now.
>> Have a nice time in Italy!
>>
>> Best,
>>
>> Pierre
>>
>> Pierre Bouvet, PhD
>> Post-doctoral Fellow
>> Medical University Vienna
>> Department of Anatomy and Cell Biology
>> Wahringer Straße 13, 1090 Wien, Austria
>>
>>> On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de> wrote:
>>>
>>> Hi guys,
>>> thanks for getting started on the process of generating a standard dataset.
>>>
>>> Sorry for the late reply but in the last days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday.
>>> I will do home office when in Italy and I will try to generate a file with some experimental data.
>>>
>>> @Pierre regarding your concerns about GDPR, we could for sure double check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transferred to any external server, it is just being read by the browser (basically the browser is acting like your Qt GUI and in principle it could run completely offline once the website is loaded).
>>> I think we should put a disclaimer on the website about this and maybe the legal department could help on the exact phrasing.
>>>
>>> Best,
>>> Carlo
>>>
>>> On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> wrote:
>>>> Hi everyone,
>>>>
>>>> I’ve created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It’s a little heavy so I prefer to send it using wetransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu
>>>> I’m sending you this one first because it is the most exotic BLS technique in terms of the needs for data storage (I think). It asked me to revise the format a lot when I started working on it. The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collections, the polar angle of collections, and the positions on the detector, leading to the need of multiple abscissa defined on the same axis (I promise I'll try explaining it better with images and all when I’ll start writing the article). This technique will at one point (when we get the green light to buy a new detector) be used for imagery, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I’m still brainstorming a lot the ways to use this method effectively in biological samples.
>>>> I have to underline that with the few devices I’m working with here, I find that in general, each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis, in TFP you never have raw data, in 1-VIPA you usually bin the signal on the detector, in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…) so I’ll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility.
>>>> I’m currently doing two long acquisitions for two groups we collaborate with so I don’t know how long it’ll take, but I am confident all other techniques will be much easier to implement in Carlo’s structure.
>>>> Another detail: I found an unexpected limitation in the HDF5_BLS library that doesn’t allow me to add attributes directly to the top group, so for now each group has its own set of attributes. I’ve added that to the list of things to do ^^
>>>> For the time-domain compatibility, I forgot my parents were coming to Vienna this weekend so I unfortunately couldn’t do it, but it’s close to the top of the list now so it’ll be done soon :)
>>>> Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you’re not only proposing to read the file but also write to it using a web-based application, I’m scared we will at one point have to face the infamous GDPR, get a green light from an Ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people that could either advise us on these safety points or, better, oversee them at EMBL?
This is one of the few computer things I’ve never really studied because I always had the magic “just don’t put it online” joker - and also because I kind of flee legislation stuff whenever I can - but I guess this is not an option anymore so we need to do it properly (plus if we don’t, from what I understand we can get in huge troubles). Tell me what you think :) >>>> >>>> Best, >>>> >>>> Pierre >>>> >>>> Pierre Bouvet, PhD >>>> Post-doctoral Fellow >>>> Medical University Vienna >>>> Department of Anatomy and Cell Biology >>>> Wahringer Straße 13, 1090 Wien, Austria >>>> >>>> >>>> >>>> >>>> >>>>> On 7/4/25, at 10:39, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk <mailto:Salvatore.Lacaveraiii@nottingham.ac.uk>> wrote: >>>>> >>>>> Good morning guys, >>>>> >>>>> Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably Pierre an .h5 with your preferred structure etc, even if it's a work in project / likely to be modified a little bit in the future. If not I can whip something up, or repurpose something, but would prefer to work with a dataset that is most-representative and most useful for your guys! >>>>> >>>>> Cheers, >>>>> >>>>> Sal >>>>> >>>>> --------------------------------------------------------------- >>>>> Salvatore La Cavera III >>>>> Royal Academy of Engineering Research Fellow >>>>> Nottingham Research Fellow >>>>> Optics and Photonics Group >>>>> University of Nottingham >>>>> Email: salvatore.lacaveraiii@nottingham.ac.uk <mailto:salvatore.lacaveraiii@nottingham.ac.uk> >>>>> ORCID iD: 0000-0003-0210-3102 >>>>> <Outlook-fdl5qycq.png> <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> >>>>> Book a Coffee and Research chat with me! 
>>>>> This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law. >>>> >>>> >> >> >> This message and any attachment are intended solely for the addressee and may contain confidential information. If you have received this message in error, please contact the sender and delete the email and attachment. Any views or opinions expressed by the author of this email do not necessarily reflect the views of the University of Nottingham. Email communications with the University of Nottingham may be monitored where permitted by law. > >
_______________________________________________ Software mailing list -- software@biobrillouin.org To unsubscribe send an email to software-leave@biobrillouin.org
Sounds good, see you guys then --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk<mailto:salvatore.lacaveraiii@nottingham.ac.uk> ORCID iD: 0000-0003-0210-3102 [cid:327f3b58-38d4-416f-b877-487a67929ee3]<https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> Book a Coffee and Research chat with me! ________________________________ From: Pierre Bouvet via Software <software@biobrillouin.org> Sent: 22 May 2025 15:25 To: Kareem Elsayad <kareem.elsayad@meduniwien.ac.at> Cc: software@biobrillouin.org <software@biobrillouin.org> Subject: [Software] Re: [EXTERN] Practice dataset OK for me too Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 22/5/25, at 16:23, Kareem Elsayad via Software <software@biobrillouin.org> wrote: Ok by me From: Robert Prevedel via Software <software@biobrillouin.org<mailto:software@biobrillouin.org>> Reply to: Robert Prevedel <prevedel@embl.de<mailto:prevedel@embl.de>> Date: Thursday, 22. May 2025 at 16:12 To: Sal La_Cavera_Iii <Sal.La_Cavera_Iii@nottingham.ac.uk> Cc: "software@biobrillouin.org" <software@biobrillouin.org> Subject: [Software] Re: [EXTERN] Practice dataset Dear all, would be great to meet - however, I think there was a misunderstanding concerning dates when I talked to Carlo about this. I actually proposed Mon. June 2nd at 3pm. Since the 26th is anyway a holiday in the UK, would everyone be available then (or later that day)? Best, RObert -- Dr. Robert Prevedel Group Leader Cell Biology and Biophysics Unit European Molecular Biology Laboratory Meyerhofstr. 
1 69117 Heidelberg, Germany Phone: +49 6221 387-8722 Email: robert.prevedel@embl.de http://www.prevedel.embl.de On 22.05.2025, at 15:00, Sal La_Cavera_Iii via Software <software@biobrillouin.org> wrote: Thanks for updates Carlo! If possible are people available at the same time on Tuesday or Wednesday the 27th/28th? 26th is a holiday in the UK so I'll be out of town but will be back in the office the next day. --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk<mailto:salvatore.lacaveraiii@nottingham.ac.uk> ORCID iD: 0000-0003-0210-3102 <Outlook-y324snrw.png><https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> Book a Coffee and Research chat with me! ________________________________ From: Kareem Elsayad via Software <software@biobrillouin.org<mailto:software@biobrillouin.org>> Sent: 22 May 2025 13:52 To: software@biobrillouin.org<mailto:software@biobrillouin.org> <software@biobrillouin.org<mailto:software@biobrillouin.org>> Subject: [Software] Re: [EXTERN] Practice dataset Thanks all 😊 Yes, indeed a quick update on how everything has evolved on different fronts might be nice. I could join quickly on Mon 26th @3pm if that works for everyone. Here’s Zoom link we can use… https://us02web.zoom.us/j/5191046969?pwd=alpzaldoZEd3N2ZEQ0hYZU1RR1dOdz09 Meeting ID: 519 104 6969 Passcode: jY3zH8 All the best, Kareem From: Carlo Bevilacqua via Software <software@biobrillouin.org<mailto:software@biobrillouin.org>> Reply to: Carlo Bevilacqua <carlo.bevilacqua@embl.de<mailto:carlo.bevilacqua@embl.de>> Date: Thursday, 22. May 2025 at 13:08 To: <software@biobrillouin.org<mailto:software@biobrillouin.org>> Subject: [Software] Re: [EXTERN] Practice dataset Hi Sal, thanks a lot for writing the script. 
Actually the specs changed a bit and here<https://github.com/prevedel-lab/Brillouin-standard-file/blob/main/docs/BLS_f...> is the most updated version (nothing major, just better defining some fields). The main difference though is that we are now using Zarr file (as we discussed in our last meeting). In the meanwhile I wrote a library which exposes an intuitive interface to read/write to our file format (it is not online yet, but will be soon). I think the best way would be to use that to actually export Pierre's hdf5 to ours. Also Sebastian progressed quite a bit on the GUI side. Shall we have a meeting to update on each other's status? Would Monday 26th in the afternoon (e.g. 3pm) work for you? Best, Carlo On Wed, May 21, 2025 at 17:44, Sal La Cavera Iii <salvatore.lacaveraiii@nottingham.ac.uk<mailto:salvatore.lacaveraiii@nottingham.ac.uk>> wrote: Hi guys, Hope everyone's doing well! Found some time to get this hierarchical-to-flat conversion script off the ground. In the below wetransfer link (3 days until it expires, so let me know if you need me to send it again) there's a python script that should be run with Pierre's multiparameter cell imaging dataset (h5 file) in the same directory. It doesn't map to every sub-group in Carlo's file_spec document<https://github.com/bio-brillouin/HDF5_BLS/blob/main/Bh5_file_spec.md> but it does the basic ones like Raw_data, Frequency, Shift, Linewidth, *_std, etc. Additional datasets/attributes should be fairly easy to add, but probably this will require a different test file that contains some of these other parameters (e.g. PSD, Amplitude, IRF, Calibration, etc). It will probably require some rigidity on behalf of the end-user, e.g., when they're using Pierre's software to create an h5 file, the user shouldn't have the option to save the Frequency data as 'freq' or 'f' it'll have to be 'Frequency.' 
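The naming rule described above (a dataset saved as 'freq' or 'f' has to end up as 'Frequency') can be sketched as a small lookup step in a converter. This is a minimal sketch, not Sal's script: only 'freq', 'f', 'Frequency' and the *_std suffix come from the message; the function name `canonical_name` and every other alias in the table are hypothetical.

```python
# Hypothetical alias table. Only 'freq', 'f' and 'Frequency' are taken from
# the thread; the remaining entries are illustrative assumptions.
CANONICAL_NAMES = {
    "Frequency": {"frequency", "freq", "f"},
    "Shift": {"shift"},
    "Linewidth": {"linewidth"},
    "Raw_data": {"raw_data"},
}

# Reverse lookup: lower-case alias -> canonical dataset name.
_LOOKUP = {
    alias: canon
    for canon, aliases in CANONICAL_NAMES.items()
    for alias in aliases
}


def canonical_name(name: str) -> str:
    """Map a user-chosen dataset name to its canonical name.

    A trailing '_std' (as in Shift_std) is preserved. Unknown names raise
    ValueError instead of being guessed.
    """
    key = name.strip().lower()
    is_std = key.endswith("_std")
    base = key[: -len("_std")] if is_std else key
    if base not in _LOOKUP:
        raise ValueError(f"Unrecognised dataset name: {name!r}")
    canon = _LOOKUP[base]
    return canon + "_std" if is_std else canon
```

Rejecting unknown names, rather than passing them through, keeps the rigidity on the converter side: a file with an unexpected label fails loudly instead of producing a dataset the plotting side cannot find.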
Also, this version doesn't accommodate multiple signal processing runs of the same dataset (Carlo parametrised this as /Analysis_{m} in the file spec), but again this should be pretty straightforward to implement (having a test file that contained {m+1} would be useful). Right now Analysis_0 is hardcoded in. Carlo/Sebastian, does this file format trend towards what you're looking for when handing over Pierre's h5 file to your plotting side of things? https://we.tl/t-TszCgBzwfE Any feedback etc. just let me know, happy to chat further, Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> Book a Coffee and Research chat with me! ________________________________ From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at> Sent: 17 April 2025 12:35 To: Sebastian Hambura <sebastian.hambura@embl.de> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk>; Carlo Bevilacqua <carlo.bevilacqua@embl.de> Subject: Re: [EXTERN] Practice dataset Hi Sebastian, I was just writing to you to say that, in my hurry, I didn’t add the treated data for cell.h5. I’ll do that whenever I get a bit of time or when the GUI is able to do it (most likely option B, unless you want it fast, in which case I can spend a few minutes doing that by hand, you tell me :) ). Your question is good: the way you’d do it for these data is to recover the index on the azimuthal and polar calibration datasets.
It’s indeed impractical and very specific to this kind of data (I had the same problem with accessing the time axis with time-domain datasets). For now, however, I’d say that it’s sufficient since, as far as I know, I have the only spectrometer able to do that, and there are technique-specific ways to access this information. Since I guess you also want to be able to convert this information into your data format without having to create technique-specific access functions, one way to avoid this issue altogether could be to define all the treated datasets as 2D arrays with size (512x512) in this example. Then you would have the shift[i, j] associated with PSD[i, j, :]. If you have other ideas, I’m open to suggestions :) Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 17/4/25, at 13:19, Sebastian Hambura <sebastian.hambura@embl.de> wrote: Hi Pierre, OK, this clarifies the whole thing a bit. But then, concretely, what is the spectrum (as intensity = f(frequency)) used for a given treated point? For example, if I wanted to see the underlying spectrum (or acquired data + fit) of /Brillouin/Mock anisotropic sample/Treated/Shift[1000], what part of /Brillouin/Mock anisotropic sample/Frequency and /Brillouin/Mock anisotropic sample/Intensity would I need to display? Best, Sebastian Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany On 17.04.2025 at 11:50, Pierre Bouvet wrote: Hi Sebastian, Sorry for the lack of details, I’ll try to explain it as best I can. For the Frequency and Intensity datasets, they are the size of my detector (for now 512x512, but hopefully soon 2048x2048 if we can get the money ^^).
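The 2D layout Pierre suggests in his 12:35 message (shift[i, j] paired with PSD[i, j, :]) can be sketched as follows. This is only an illustration under stated assumptions: it supposes that each 1D treated value is stored together with the detector pixel (row, column) it was extracted from, and the function name `scatter_to_grid` and the NaN fill for pixels without a treated value are hypothetical, not part of either file spec.

```python
import math


def scatter_to_grid(values, rows, cols, shape):
    """Place 1D treated values on a 2D detector-sized grid.

    values, rows and cols are equally long sequences; pixel (rows[k], cols[k])
    receives values[k]. Pixels with no treated value stay NaN.
    """
    n_rows, n_cols = shape
    grid = [[math.nan] * n_cols for _ in range(n_rows)]
    for value, i, j in zip(values, rows, cols):
        grid[i][j] = value
    return grid


# Toy 3x4 "detector" with two treated points (made-up numbers, in GHz).
shift_1d = [7.41, 7.52]
pixel_rows = [0, 2]
pixel_cols = [1, 3]
shift = scatter_to_grid(shift_1d, pixel_rows, pixel_cols, (3, 4))
```

With this layout a viewer needs no technique-specific lookup: the spectrum behind shift[i][j] is whatever is stored at the same (i, j) in the PSD array.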
By nature, I am entangling 3 hyperparameters in my measurements: shift frequency, azimuthal angle of observation, polar angle of observation (with errors on each of them that I would add later on, but let’s say my device is perfect for now). Now the VIPA spectrometer creates constructive interference for specific frequencies, so after treatment I will only get treated values for the points where constructive interference happened. For each of these points I will therefore have a shift, a linewidth, a value for the azimuthal angle and a value for the polar angle. The most direct way of storing the results is therefore to keep everything 1D and then, when representing things, use a color mesh with the azimuthal and polar abscissas as the x and y positions of the shift and linewidth. For the calibration, this is different because we can consider that the angular response of water follows the theory for homogeneous and isotropic solutions, meaning that we will essentially have no dependence of the shift and linewidth on the polar angles but a sin(theta/2) dependence on the azimuthal angles. Therefore, it is possible to reconstruct a 2D image for the shift and linewidth on all the points of the detector (which are themselves associated with a value for azimuthal and polar angles, as you can see in the calibration group). That’s weird for cells.h5, H5Viewer told me nothing on VS, but on the other hand I might have my answer for why the file was super heavy; I just need to figure out how to delete groups and datasets without truncating the file.
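The sin(theta/2) dependence mentioned for the water calibration can be illustrated with the standard Brillouin relation, shift = (2 n v / lambda) sin(theta/2). This is a sketch under assumptions: theta is taken to be the scattering angle (how it maps onto the azimuthal angle of this particular setup is not specified in the thread), and the refractive index, sound velocity and wavelength below are typical textbook values for water at 532 nm, not numbers from the mock file.

```python
import math


def brillouin_shift_hz(theta_rad, n=1.33, v_sound=1480.0, wavelength=532e-9):
    """Brillouin shift in Hz for scattering angle theta_rad (radians).

    n: refractive index, v_sound: sound velocity in m/s, wavelength: vacuum
    wavelength in m. The defaults are assumed, illustrative values for water.
    """
    return 2.0 * n * v_sound / wavelength * math.sin(theta_rad / 2.0)


backscatter = brillouin_shift_hz(math.pi)      # theta = 180 degrees
right_angle = brillouin_shift_hz(math.pi / 2)  # theta = 90 degrees
```

For an isotropic liquid the shift depends only on this single angle, which is why a calibration image can be rebuilt for every detector pixel once the angle of each pixel is known.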
Here is the same file but made without deleting elements from the HDF5 file (I tried it on myHDF5 and it seems to work): https://we.tl/t-WlmabmsXHo Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 17/4/25, at 11:27, Sebastian Hambura <sebastian.hambura@embl.de> wrote: Hi Pierre, I'm currently looking at your angle_resolved_dataset.h5, and maybe it's because I'm not familiar with your specific setup but I'm a bit confused: - Frequency and Intensity are 512*512 for all 3 experiments/datagroups - Treated for 'Mock anisotropic sample' is 3584(*1): are you somehow using the same row of your PSD for multiple treated datapoints? Or are you doing some 'subpixel' rows? - Treated for '/Brillouin/Water spectrum for angle calibration' is 512*512: how does that work from your PSD data? For the cells.h5 file, I tried opening it with https://myhdf5.hdfgroup.org to explore it, and the tool refuses to open it and reports this error: (...) unable to read superblock major: File accessibility minor: Read failed #006: /__w/libhdf5-wasm/libhdf5-wasm/build/1.14.2/_deps/hdf5-src/src/H5Fsuper.c line 603 in H5F__super_read(): truncated file: eof = 309329920, sblock->base_addr = 0, stored_eof = 537790152 major: File accessibility minor: File has been truncated I'll see if other tools might still be able to read it. Cheers, Sebastian Sebastian Hambura Software Engineer Robert Prevedel Group Cell Biology and Biophysics EMBL Heidelberg Meyerhofstraße 1 69117 Heidelberg, Germany On 14.04.2025 at 17:56, Pierre Bouvet wrote: Sure, here is the link for the ar-VIPA mock H5 file: https://we.tl/t-Sd3ZC8wIXn For a more classic file, here are some measurements of cells (I only kept 2 cell types, 2 days of measurement and 2 samples per day of measurement for a total of 8 spectra + their frequency/treated arrays).
I treated them using a custom code so I didn’t export the treatment steps in the JSON file I mentioned at the last meeting. Some of the measurements are rubbish but I guess you don’t care about that ^^ One problem I observed though is that I created this file by deleting elements from a larger file, but it seems the space is still allocated in the HDF5 file. So we get a 500 MB file when a typical size for 8 spectra et al. is usually under 100 MB. I’ll dig into that when I get time. Here is the link: https://we.tl/t-HUAf2lR9AF Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 14/4/25, at 17:04, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote: Hi both, Pierre, I think that WeTransfer link expired! Any chance you can regenerate it? Potentially the simulated angle-resolved dataset will be a good late-stage-dev test of the software's generalisability? In the first instance, should we just use a basic 2D scan of a cell or zebrafish, C. elegans, etc., e.g. from a 2-stage VIPA? Something we can treat like the MNIST or ImageNet dataset in the Deep Learning community. Carlo, what about something from your 2019 zebrafish paper? (unless you guys have something else you've already both been using for dev purposes). Safe travels guys! Enjoy that Colomba pasquale Carlo! My favourite are the pistachio ones :) Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> Book a Coffee and Research chat with me!
________________________________ From: Pierre Bouvet <pierre.bouvet@meduniwien.ac.at<mailto:pierre.bouvet@meduniwien.ac.at>> Sent: 14 April 2025 10:04 To: Carlo Bevilacqua <carlo.bevilacqua@embl.de<mailto:carlo.bevilacqua@embl.de>> Cc: Sal La Cavera iii (staff) <ezzsl7@exmail.nottingham.ac.uk<mailto:ezzsl7@exmail.nottingham.ac.uk>>; Sebastian Hambura <sebastian.hambura@embl.de<mailto:sebastian.hambura@embl.de>> Subject: Re: [EXTERN] Practice dataset Hi, That sounds awesome, wish I could come with you to Italy, must be wonderful at this time of year!! Perfect for legal department @ EMBL. To be clear, I think your local approach is robust by design but I’m always scared of having things online (that’s why I never used GitHub before this project ^^). In particular here, my concern is mainly on people being able to have somehow access to the data that will be opened or treated (I’m still traumatized by how stupidly complex it was to work with human samples in France legally speaking) and people being able to modify the code so it downloads whatever. I’ve looked a bit this weekend and people @ HDF have done things with safety in mind so there are normally no SQL-like attacks possible and no possibility of saving compiled programs so I guess just adding a layer between your code and the user to filter all non-HDF5 files coming in or out, and adding hashing to the original data to confirm that no corruption will occur is enough but I’m no specialist in cybersecurity. For third-party attacks it’s trickier. I looked into the H5Web GitHub page to try and understand how they did it, but I’m not super familiar with TypeScript so aside the HTTPS protocol, I don’t really know how they do it. I might be over-worrying here (kind of my specialty) but I think if this project is successful and BLS develops enough, it will become a real problem so might as well do it now. Have a nice time in Italy! 
Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 13/4/25, at 13:18, Carlo Bevilacqua <carlo.bevilacqua@embl.de><mailto:carlo.bevilacqua@embl.de> wrote: Hi guys, thanks for getting started the process of generating a standard dataset. Sorry for the late reply but in the last days I was focusing more on lab work as I want to finish something before flying to Italy next Thursday. I will do home office when in Italy and I will try to generate a file with some experimental data. @Pierre regarding your concerns about GDPR, we could for sure double check with the legal department at EMBL at a later stage, but I want to highlight that the data is not being transfered to any external server, it is just being read by the browser (basically the browser is acting like your Qt GUI and in principle it could run completely offline once the website is loaded). I think we should put a disclaimer on the website about this and maybe the legal department could help on the exact phrasing. Best, Carlo On Tue, Apr 8, 2025 at 18:11, Pierre Bouvet <pierre.bouvet@meduniwien.ac.at><mailto:pierre.bouvet@meduniwien.ac.at> wrote: Hi everyone, I’ve created a mock file that would correspond to an angle-resolved measurement with calibration in angle and impulse response function. It’s a little heavy so I prefer to send it using wetransfer (all the data are simulated): https://we.tl/t-Om7OrG9Mnu I’m sending you this one first because it is the most exotic BLS technique in terms of the needs for data storage (I think). It asked me to revise the format a lot when I started working on it. 
The trickiest conceptual limitation is that I place the detector in the Fourier plane of the sample, meaning that I entangle the frequency axis, the azimuthal angles of collections, the polar angle of collections, and the positions on the detector, leading to the need of multiple abscissa defined on the same axis (I promise I'll try explaining it better with images and all when I’ll start writing the article). This technique will at one point (when we get the green light to buy a new detector) be used for imagery, by extracting the c_ii coefficients and plotting them, but I kept things as simple as possible for now as I’m still brainstorming a lot the ways to use this method effectively in biological samples. I have to underline that with the few devices I’m working with here, I find that in general, each type of spectrometer leads to a slightly different but specific representation of data that falls under the description of the format I made (for example, in time domain, you need an abscissa for the time axis, in TFP you never have raw data, in 1-VIPA you usually bin the signal on the detector, in 2-VIPA you usually have a dimensionality reduction to perform to pass from raw data to PSD…) so I’ll try to send you a file for each technique I use at the lab (plus time-resolved with the data you sent me) as soon as possible so you can cross-check compatibility. I’m currently doing two long acquisitions for two groups we collaborate with so I don’t know how long it’ll take, but I am confident all other techniques will be much easier to implement in Carlo’s structure. Another detail : I found an unexpected limitation in the HDF5_BLS library that doesn’t allow me to add attributes directly to the top group so for now each group has its own set of attributes. 
I’ve added that to the list of things to do ^^ For the time-domain compatibility, I forgot my parents were coming to Vienna this weekend so I unfortunately couldn’t do it, but it’s close to the top of the list now so it’ll be done soon :) Final point, this time more for Carlo & Sebastian: I realized that for some samples (human tissue for example), we might have to face some legal challenges. Since you’re not only proposing to read the file but also to write to it using a web-based application, I’m scared we will at some point have to face the infamous GDPR, get a green light from an Ethics committee or get some kind of legal validation from a lawyer. So before putting anything on the server, do you know people who could either advise us on these safety points or, better, oversee them at EMBL? This is one of the few computer things I’ve never really studied because I always had the magic “just don’t put it online” joker - and also because I kind of flee legislation stuff whenever I can - but I guess this is not an option anymore so we need to do it properly (plus if we don’t, from what I understand we can get into huge trouble). Tell me what you think :) Best, Pierre Pierre Bouvet, PhD Post-doctoral Fellow Medical University Vienna Department of Anatomy and Cell Biology Wahringer Straße 13, 1090 Wien, Austria On 7/4/25, at 10:39, Sal La Cavera Iii <Salvatore.Lacaveraiii@nottingham.ac.uk> wrote: Good morning guys, Hope you had a nice sunny weekend! Just wanted to follow up from Friday's meeting to see if you had a test dataset we can use as the standard for this software project. I think I heard "smilies" and then also one with cells? Preferably, Pierre, an .h5 with your preferred structure etc., even if it's a work in progress / likely to be modified a little bit in the future.
If not, I can whip something up or repurpose something, but I would prefer to work with a dataset that is most representative and most useful for you guys! Cheers, Sal --------------------------------------------------------------- Salvatore La Cavera III Royal Academy of Engineering Research Fellow Nottingham Research Fellow Optics and Photonics Group University of Nottingham Email: salvatore.lacaveraiii@nottingham.ac.uk ORCID iD: 0000-0003-0210-3102 <https://outlook.office.com/bookwithme/user/6a3f960a8e89429cb6fc693c01d10119@...> Book a Coffee and Research chat with me!
participants (5)
- Carlo Bevilacqua
- Kareem Elsayad
- Pierre Bouvet
- Robert Prevedel
- Sal La_Cavera_Iii