8VC Emerging Builders Spotlight: Daniel Isen (Yugabyte)
In supporting the industry-defining companies of the 8VC portfolio, we are fortunate to work with the brightest, most dedicated people in the world. We’re excited to feature some of the most promising engineering and product talent we have the pleasure of collaborating with, not only at 8VC, but within our broader network.
Today we are highlighting Daniel Isen, a Member of the Technical Staff at Yugabyte. He was one of three engineers who started the YugabyteDB Managed project, the company's fully managed database-as-a-service offering.
Daniel has been full-time at Yugabyte for two and a half years, and was a Yugabyte intern in the fall of 2019. He graduated from the University of Waterloo in the spring of 2020 with a Bachelor of Computer Science degree.
In your own words, what is Yugabyte as well as the concept of 'Distributed SQL'? What drove your initial excitement about joining?
Historically, applications were developed using relational databases, like Oracle, which used Structured Query Language (SQL) as the language for interacting with data. However, RDBMSs like Oracle had some severe shortcomings when it came to scalability.
As a result, NoSQL databases (e.g. document-style storage) came to address some of these scalability needs for large web-scale applications that needed higher throughput. The downside is that you lose the benefits of an RDBMS - the structured nature and the consistency.
Distributed SQL exists to try to marry the benefits of the traditional RDBMS world with that of the web-era NoSQL databases – providing scalability while maintaining data integrity. That’s what Yugabyte is building.
As far as how I got excited about Yugabyte, we’d have to go back to college. In 2018, I was taking a few relevant courses at Waterloo, including a database implementations course and distributed systems, where I was digging into Paxos, Raft, and other protocols.
We have a Co-Op program where we take six internships over the course of our five years in school. As I was thinking about my next internship, I was scrolling through the Waterloo job board and stumbled upon Yugabyte. It wasn’t a company I had heard of before but I read through the description and saw it was a perfect blend of the two courses that I enjoyed!
I applied, but unfortunately ended up pursuing a different internship. That said, I kept in touch with Yugabyte and ultimately ended up working there for my subsequent internship. I still remember those initial interviews with Karthik (our CTO) – it just seemed like a great experience and place to learn.
What did you get to work on during your internship?
I got a lot of responsibility to design and implement things independently. The team gave me a lot of interesting project opportunities, from building out our encryption-at-rest feature set in our enterprise control-plane, to integrations with third party KMS solutions. I got to roll out features to customers within those four months and learned a lot. Plus, I got to be around the rest of the team in the office every day.
When I graduated in the spring of 2020, I joined full time.
Yugabyte has many streams of work and incredibly complex products to build, from the core database to the control plane. Can you speak to the separation of concerns there and how the engineering organization is most effectively structured?
The core of Yugabyte is our open-source Distributed SQL database, YugabyteDB. As it’s open-source, we’ve built products around it to monetize various offerings and provide value-add services. There are two core products: YugabyteDB Anywhere, a self-managed version of YugabyteDB that customers can install and deploy on their own infrastructure and that comes with a control plane to enable ease of cluster management, and YugabyteDB Managed, a public cloud hosted offering.
Eight months into my time at Yugabyte, we started down the path of building a fully managed DBaaS cloud offering. I was one of the first engineers that jumped over to that team, which is what I’ve been building since early 2021. I now lead a team of four engineers.
What do you get to work on within that team?
I work on building out functionality for a lot of our Day 2 operations. A customer may have a handful of clusters in our cloud, but we (the Yugabyte team) need an easy way to manage all of the ongoing maintenance operations on these clusters, and to do so reliably and scalably across all of our customers at a fleet level. These include processes like upgrades (VM / OS / DB configuration, TLS certificates, etc.), which become super complex when factoring in that we want to give customers some control over when these fleet operations happen (i.e. maintenance windows).
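As a rough illustration of the maintenance-window idea described above, the sketch below selects which clusters in a fleet are currently eligible for an operation such as an upgrade. It is a minimal sketch under stated assumptions, not Yugabyte's implementation: the names `MaintenanceWindow` and `clusters_ready_for_upgrade`, the weekly same-day window shape, and the use of UTC are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Dict, List


@dataclass
class MaintenanceWindow:
    """Hypothetical weekly window chosen by a customer (times in UTC)."""
    weekday: int  # 0 = Monday ... 6 = Sunday
    start: time   # inclusive start of the window
    end: time     # exclusive end of the window, on the same day

    def contains(self, moment: datetime) -> bool:
        # A moment is inside the window if it falls on the chosen weekday
        # and between the start and end times.
        return (
            moment.weekday() == self.weekday
            and self.start <= moment.time() < self.end
        )


def clusters_ready_for_upgrade(
    windows: Dict[str, MaintenanceWindow], now: datetime
) -> List[str]:
    """Return the (sorted) names of clusters whose window is open at `now`."""
    return sorted(name for name, w in windows.items() if w.contains(now))
```

A fleet-level scheduler could call a check like this on a timer and only start an upgrade on the clusters it returns, deferring the rest until their own windows open.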
We’ve also built a lot of the observability stack. Recently, we rolled out cluster health overviews, which help customers understand if and why performance might be deviating from expectation, as well as other troubleshooting (e.g. understanding why a customer might not have been able to connect with a cluster, if they need to consider scaling up, etc).
What are you most looking forward to on the Yugabyte roadmap?
There are a handful of workstreams we’re in the midst of that are forward looking and quite fun.
YugabyteDB Managed is a multi-cloud offering. Right now we support AWS and GCP, with Azure on the roadmap. All of these cloud offerings have their own intricacies, so building a unified control plane that abstracts away all that complex logic is a fun process. There are various topologies we would also like to support to give customers more options for how to structure their clusters (read replicas, multi-region setups, xCluster, CDC, etc.) so they can adapt to a variety of situations.
Scalability is incredibly compelling as well. For the YugabyteDB Managed cloud offering, the scale that we might see from customers (as well as the complexity pushed to us) could be much larger than what individuals are self-hosting with YugabyteDB Anywhere. We’re looking at a lot of design decisions that we’ve made in the past and seeing how we can adapt those choices to be effective in YugabyteDB Managed. One such area is our metrics stack. We have a couple of million data points that we collect in the cloud on regular intervals across clusters, which is at least an order of magnitude greater than what we need to deal with in YugabyteDB Anywhere.
Let's dive deeper into the metrics stack -- can you elaborate more on how this is presented to users?
One set of data is telemetry on the health of different clusters that are deployed. The data that helps us come to conclusions on health checks includes high-level things like read and write operations, all the way down to individual tablet (shard) metrics, node usage, and disk usage.
We also monitor our cloud infrastructure in the aggregate. We want to make sure everything is up and available.
Another bucket is performance and analytics. We have started to build features that give customers insight into whether they’re leveraging YugabyteDB to the best of their abilities. For example, they might have created an index that is unused or be running queries in an un-optimized way.
This gets presented to the user through a performance analyzer. We will make explicit callouts on how performance can be improved (e.g. by leveraging an index) or when overhead is incurred when it shouldn’t be.
Lastly, we’re looking at metrics around load balancing and route optimization. In YugabyteDB Managed, we give customers a load balancer for every region that a cluster is in. However, GCP and AWS load balancers are not cluster-aware (meaning they don’t have knowledge of where data resides which results in extra internal hops). It would be interesting in the future to have a cluster-aware load balancer so that clients could use any PostgreSQL-compatible OSS client driver they prefer.
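To make the "extra internal hops" point concrete, here is a toy comparison of a round-robin load balancer with a cluster-aware router. It is a simplified sketch, not how YugabyteDB or any cloud load balancer actually places or routes data: the node names, the `owner_of` placement rule (a CRC32 hash modulo the node count), and the hop-counting model are all assumptions made for illustration.

```python
import zlib
from itertools import cycle
from typing import Iterable, List

# Hypothetical three-node cluster.
NODES: List[str] = ["node-1", "node-2", "node-3"]


def owner_of(key: str) -> str:
    """Hypothetical placement rule: a stable hash of the key picks the node."""
    return NODES[zlib.crc32(key.encode("utf-8")) % len(NODES)]


def hops_round_robin(keys: Iterable[str]) -> int:
    """Total hops when requests are spread across nodes without data awareness.

    A request costs 1 hop if it happens to land on the owning node,
    and 2 hops if the receiving node must forward it to the owner.
    """
    next_node = cycle(NODES)
    return sum(1 if next(next_node) == owner_of(key) else 2 for key in keys)


def hops_cluster_aware(keys: List[str]) -> int:
    """Total hops when every request is sent straight to the owning node."""
    return len(keys)
```

Under this model the cluster-aware router never pays for forwarding, while the round-robin balancer pays an extra hop whenever it picks a node that does not own the requested key.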
What are some of your personal learnings from the past couple of years at Yugabyte?
On the technical side, this is my first time working with an open-source product. I get to see how important community is and how people want to use the product and be engaged in the development process.
On the non-technical side of things, this is my first job out of university. Learning the importance of soft skills and interpersonal communication has been very enlightening. On the cloud team, we now have nearly forty engineers who interact with QA, product, and engineering managers. You need to have clear communication and roadmaps when the team is at scale because you need cohesion. The end product cannot look like an incoherent amalgam of features.
Do you get any exposure to the core database work from the Cloud team?
Yes! For example, change data capture (CDC) is a relatively new feature that YugabyteDB offers. This enables users to ingest deltas and changes to the database in real-time for downstream applications. This requires a tight collaboration between the Cloud team and the core database team.
We have been focusing a lot on moving data in and out of YugabyteDB through concepts like CDC, database migrations, and telemetry integrations. This past summer we launched the beta version of YugabyteDB Voyager and are gearing up for the GA early next year. Voyager enables easy migration from other relational database systems to YugabyteDB. We support many databases, such as Oracle and popular cloud-based databases, and we make it easy for potential customers to move their data onto YugabyteDB without incurring downtime or massive operational overheads.
To wrap up, any hot takes on startups or databases?
When I started, Yugabyte was 20-30 people and it’s now close to 450. Operating at a startup and seeing it grow as much as Yugabyte has is a really fun opportunity, which I’m grateful to have experienced. I hope I continue experiencing it for a while to come. There is so much energy, and there are incredible opportunities that you can stumble into. The impact that one can make in such an environment, I feel, is often much larger than at a more well-established company, which is super exciting.
On the macro level, the database market is an exciting and mission-critical field to be in. If you look at long-term trends spanning the last few decades, the amount of data being captured is growing superlinearly and I don’t believe it is going to stop anytime soon. Storing and organizing data is incredibly economically valuable, and there is a huge, ever-growing market opportunity.
Also, I’ve found a great group of people to work with. I really value learning from all the incredibly smart people here at Yugabyte, and there is always more to discover!