As we have often seen with guests who are founders, the inspiration for their business proposition is rooted in frustrations they have encountered in previous roles. Harry Rickerby, Co-Founder and CEO at Briefly Bio, is no exception. He talks about his new venture and the challenges that led him to set up the company earlier this year, and of course, as this is Data in Biotech, his views on Data, AI, and ML.
Harry originally studied biology at Imperial College, London, developing an interest in synthetic biology. From there, he joined LabGenius in 2014 as its first employee, focusing on data and machine learning to help with protein engineering and drug discovery. He remained with the company for eight years, moving into more of a leadership role during this time. Harry left the organization in 2022 to co-found Briefly Bio, where he is developing a platform to address the challenge of incomplete and inconsistent documentation of experiments.
Our conversation with Harry drilled down into the specific challenges he has encountered over almost a decade in the industry, the challenges that led him to set up Briefly Bio. From the issues that arise from incomplete, inconsistent documentation to the potential of Large Language Models (LLMs), here are the highlights:
Further Reading: For anyone wanting to hear more from Harry himself, he regularly publishes on Substack. He also recommends the blog of previous Data in Biotech guest Jesse Johnson, Scaling Biotech, for insights into building data teams and data systems within a biotech organization.
The highlights only scratch the surface of our conversation with Harry on how LLMs can be used in the biotech space; for more detail, you can listen to the podcast in full here.
For the host of Data in Biotech, Harry's vision for LLMs, and his view that they should be integrated into software with purpose and intent, really resonated in this conversation. There is a lot of noise around LLMs and Generative AI at the moment, which fits with Gartner's placement of the technology at the 'Peak of Inflated Expectations' on its AI Hype Cycle.
When we look realistically at how organizations can use Generative AI and LLMs, it is clear they need a considerable amount of guidance and direction to have a meaningful impact. Briefly is a great example of what is needed to move the conversation on LLMs from simple chatbots that add little value to powerful tools that can tackle some of the big challenges facing scientists who need to make better use of their data.
This naturally brings us back to a familiar challenge: how do we create better data? Data is our bread and butter at CorrDyn, so we think about this a lot. Following this podcast, let's look at where the role of a data science team begins.
Without tools, platforms, templates, or standardization for how experimental data is recorded, wet lab scientists have no hope of generating the consistent, complete data that data science teams are looking for. This is a problem for the data science team to address in partnership with the wet lab team. Neither can achieve long-term success unless the process begins by improving the quality of the data coming from the wet lab.
Any issues in the data generated move downstream through the entire organization and cannot be solved at a later date. Empowering scientists to create better data and metadata needs to be the starting point for any data team. Tackling the issue of data quality at the source is central to making the process of analyzing the data easier and more reliable. But how can we improve data quality?
Automation is one of the key tools in the arsenal for generating quality data. Using automation, LLMs, or any tool to remove the drudgery of proper documentation, and to remove the ambiguity around what consistently and accurately captured protocols look like, is at the heart of quality data. Data scientists cannot exist in isolation. They must collaborate with the wider science team to ensure they have datasets that are detailed enough to work with. Only by starting at the beginning, when the data is being generated, can they guarantee this.
The good news is that your team does not necessarily have to adopt a new tool for documentation and tracking to get the value of better metadata from your experiments. Regardless of how your metadata is generated, the goal should always be to extract it into a place where your data team can work with a copy of it, free from compliance or process concerns, such as a data warehouse. This sandbox is where your data science team can apply repeatable information extraction and cleaning approaches that address your most critical metadata needs while applying quality control to the data your wet lab team generates. All of this is typically achievable without altering the day-to-day experience of the wet lab team, while creating a meaningful feedback loop for the data science and wet lab teams to collaborate toward better experimental metadata.
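To make that idea a little more concrete, here is a minimal sketch of what a repeatable cleaning-and-QC pass over extracted experiment metadata might look like. This is an illustration, not CorrDyn's or Briefly Bio's actual tooling: the column names (experiment_id, protocol, operator, run_date), file paths, and QC rules are hypothetical examples of the kind of checks a data science team could run in its warehouse sandbox.

```python
# Minimal sketch of a repeatable cleaning-and-QC pass over experiment metadata
# exported from a wet lab system. Column names, paths, and rules are hypothetical.
import pandas as pd

REQUIRED_COLUMNS = ["experiment_id", "protocol", "operator", "run_date"]


def clean_and_qc(raw: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Return (rows ready for the warehouse sandbox, rows flagged for wet lab review)."""
    df = raw.copy()

    # Normalize free-text fields so downstream grouping and joins are consistent.
    for col in ["protocol", "operator"]:
        df[col] = df[col].astype("string").str.strip().str.lower()

    # Parse dates; unparseable values become NaT and are caught by the QC below.
    df["run_date"] = pd.to_datetime(df["run_date"], errors="coerce")

    # QC: flag rows with missing required metadata or duplicate experiment IDs.
    missing = df[REQUIRED_COLUMNS].isna().any(axis=1)
    duplicated = df["experiment_id"].duplicated(keep=False)
    flagged = df[missing | duplicated].assign(
        qc_reason=lambda d: d.apply(
            lambda row: "missing required field"
            if row[REQUIRED_COLUMNS].isna().any()
            else "duplicate experiment_id",
            axis=1,
        )
    )
    clean = df[~(missing | duplicated)]
    return clean, flagged


if __name__ == "__main__":
    raw = pd.read_csv("eln_export.csv")  # hypothetical export from the lab system
    clean, flagged = clean_and_qc(raw)
    clean.to_csv("warehouse_staging/experiments.csv", index=False)    # stand-in for a warehouse load
    flagged.to_csv("qc_review/flagged_experiments.csv", index=False)  # feedback loop for the wet lab team
```

Because the checks live in code rather than in someone's head, the same pass can be re-run on every export, and the flagged file gives the wet lab team a concrete, low-friction starting point for that feedback loop.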
Briefly Bio’s vision of better upfront metadata generation is a better long-term solution to the problem because it secures the wet lab team’s buy-in to experimental metadata generation from the outset. However, teams that remain committed to, and invested in, optimizing their existing tools and processes can still achieve many of the benefits without restructuring their R&D processes.
If you're interested in discovering how your organization can unlock the value of data and maximize its potential, get in touch with CorrDyn for a free SWOT analysis.
Want to listen to the full podcast? Listen here: