
Hopsworks Team

Hopsworks Experts


5-minute interview Laura Gutierrez Funderburk

Episode 21: Laura Gutierrez Funderburk, Senior Developer Advocate - Bytewax

July 12, 2024

5 min read


5-minute Interviews

TL;DR

I think, within the context of fine-tuning, one area that I'm very eager to explore is this idea of fine-tuning the embedding models or the embeddings, as opposed to the LLM itself, to optimize the retrieval mechanism within RAG.

Time to meet another face from Bytewax! This time we met with Laura Gutierrez Funderburk, Senior Developer Advocate. We talk about streaming analytics and interesting use cases such as RAG. Continue reading or watch the episode to learn more.

Tell us a little bit about yourself

My name is Laura Gutierrez Funderburk. I am based in Vancouver, Canada, but I’m originally from Mexico. I got into data science through mathematics, initially planning to become an applied mathematician. Along the way, I learned about programming, particularly Python, and how it was used to solve problems in bioinformatics. I became hooked on using Python and programming to tackle real-life problems. Fast forward a few years, and I now have the opportunity to apply both my mathematical skills and programming skills to solve problems in data science. I am also involved in community outreach as part of Bytewax.

One of the things that we're very excited about at Bytewax, being an open source community, is letting people leverage the Python ecosystem while bringing in real-time analytics. For us, the purpose is to make it as easy as possible for people to build efficient pipelines. One thing that I think is quite nice, and that the community likes about us, is that we're leveraging the Rust package Timely behind the scenes. We have exposed a Python API, so people can simply pip install and import Bytewax as they would any other Python package. From there you can combine other Python packages with our functionality to build pipelines for data that changes in real time.

What is the relationship between Streaming Analytics and AI?

So one of the use cases is Retrieval Augmented Generation (RAG). We've learned about fine-tuning as one of the mechanisms to ground an LLM. But the problem with something like fine-tuning is that your data may be constantly changing, or the fine-tuning process is not necessarily successful. You can then turn to RAG as a process or mechanism to ground the LLM. When you're curating a database that changes regularly, say, financial data, you have data coming in every single day, not only as market prices but also in the form of news, articles, or any kind of unstructured information. Leveraging Bytewax and Unstructured, or Bytewax together with Haystack, LangChain, or Hopsworks, allows you to ground the LLM with the latest data.
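To make the idea of grounding an LLM with the latest data concrete, here is a minimal sketch in plain Python. It is not Bytewax, Hopsworks, or any of the libraries mentioned above: `LiveIndex`, `embed`, `cosine`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" is a toy stand-in for a real embedding model, and the ACME documents are made-up sample data. It only shows that a document ingested a moment ago can already be retrieved and placed in the prompt.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LiveIndex:
    """Hypothetical in-memory index that a streaming job would keep up to date."""

    def __init__(self) -> None:
        self.docs: list[tuple[str, Counter]] = []

    def ingest(self, text: str) -> None:
        # Assumption: a stream processor would call this once per incoming event.
        self.docs.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 1) -> list[str]:
        # Rank every stored document by similarity to the query, best first.
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(index: LiveIndex, question: str) -> str:
    """Ground the question with the most similar document currently in the index."""
    context = "\n".join(index.retrieve(question, k=1))
    return f"Context:\n{context}\n\nQuestion: {question}"


question = "What did ACME announce about its earnings?"

index = LiveIndex()
index.ingest("Monday: ACME shares closed at 100 dollars.")
before = index.retrieve(question)  # only Monday's data is available

index.ingest("Tuesday: ACME announced record quarterly earnings.")
after = index.retrieve(question)  # the fresh document is retrievable right away

print(build_prompt(index, question))
```

No model is retrained when Tuesday's news arrives. Only the index changes, which is why retrieval suits data that changes daily.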

Do you see any particularly interesting use cases in the field?

I think, within the context of fine-tuning, one area that I'm very eager to explore is this idea of fine-tuning the embedding models or the embeddings, as opposed to the LLM itself, to optimize the retrieval mechanism within RAG. So, for instance, if I think about retrieving information from social media platforms where I want to optimize my search for specific hashtags or specific formats that the community is using online, leveraging platforms like Hopsworks or Bytewax is one of those compelling use cases. One of the tricky things about social media is that sometimes you have these phenomena of things blowing up, and you want to be in it. You want to leverage it, you want to capture it as it's happening. Not tomorrow or the week after; you want to catch it now. I think that's one of the compelling use cases that I see for sure.

Do you have any interesting resources to recommend?

One community that I found really helpful in getting started with AI and LLMs was the AI Makerspace community. They're an online community, and they offer workshops, free seminars, and webinars. Once a week they have community sessions where members can come in and talk about a topic. It’s a very broad community, but they have a very strong focus on building, shipping, and sharing.

