
A Java Performance Quest: Taming Unsafe Code, Embracing Idiomatic Style & Debugging the Linux Kernel

In this podcast, Jaromir Hamala, a seasoned Java engineer specialising in high-throughput data systems, shares his thoughts on how developers can tackle high-performance software development. He touches on the benefits of modern Java that allow writing idiomatic Java code while remaining "mechanically sympathetic", and also on his experience debugging a Linux kernel bug.

Key Takeaways

  • QuestDB was built on ideas from high-frequency trading, focused on high ingestion. For this, it has three tiers: one focused on ingestion (write-ahead log), one focused on queries, and one for archiving (based on Parquet files).
  • The initial version didn’t resemble normal Java, as it used off-heap data (writing directly in memory via Unsafe), but the codebase has now been migrated towards more recent LTSs like Java 17 and Java 21.
  • The "modern Java" (Vector API or project Valhalla) promises to enable developers to build more efficient, memory-safe code using idiomatic constructs.
  • Chasing performance issues all the way down to the Linux kernel can be a fun endeavour, but it's costly and doesn't bring much business value.
  • AI-native coding enables tinkerers to experiment faster, even with codebases they are not familiar with.

Transcript

Olimpiu Pop: Hello, everybody. I'm Olimpiu Pop, an InfoQ editor, and I have in front of me Jaromir Hamala. He says that he's a generalist coder, but you'll see during our discussion that he has a pattern of working on very high-intensity, very efficient pieces of technology. And I'll let Jaromir introduce himself.

Jaromir Hamala: Hello, Olimpiu and everyone. My name is Jaromir. I am a generalist, I'm a developer, and I've been coding since I can remember. I had a Czechoslovakian ZX Spectrum clone, so that was BASIC, and I've been coding ever since.

Olimpiu Pop: Okay. And you'll never stop because you'll always find something to tinker with.

Jaromir Hamala: Yes. To tinker, that's the right word, because that's how I define myself: a tinkerer. That's why I call myself a generalist. I'm not super specialized in any particular bit, because I get bored easily, but I can weaponize that to learn about bits and bobs of everything.

Analytical versus Timeseries Databases [01:33]

Olimpiu Pop: You are currently involved with QuestDB. And if you read the website or the Wikipedia definition, it says that it's a time-series database built for high ingestion rates and to close the gap between high-intensity sources of information and data lakes. Maybe you can give us a brief overview of what QuestDB actually does.

Jaromir Hamala: Yes, sure. I'm happy to do so. Let's start with the category: time series database. You can think of a time series database as an analytical database specialized in querying around the time axis. A general analytical database would be: okay, here is my massive data set, run some arbitrary aggregations over all the data you have. Maybe do not even apply any predicates, just scan everything and run these aggregations, the window functions, over everything. That's a slightly oversimplified description of an analytical database query.

Well, time series means it is a bit specialized in the sense that, okay, I do not want to run this aggregation over all my data, but maybe just the data I have recorded in the last 24 hours. Or maybe I do want to run it over all the data, but I want to bucket my data by time. It means that I want to run these aggregation functions maybe over all my data, but create windows one hour long each and then, on the rows within each time window, run this and that aggregation. That's the time series part, right? It is all about time; time is a first-class citizen and, not always but oftentimes, the queries treat time as special. Let's put it this way. So that's the time series database part.
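For illustration, here is a minimal Java sketch (not QuestDB code, names invented) of the kind of time-bucketed aggregation being described, computing a per-hour average:

```java
import java.util.TreeMap;

// Bucket rows into one-hour windows by truncating the timestamp, then average each bucket.
static TreeMap<Long, Double> hourlyAverages(long[] timestampsMillis, double[] values) {
    long hourMillis = 60L * 60L * 1000L;
    TreeMap<Long, double[]> buckets = new TreeMap<>();   // bucket start -> {sum, count}
    for (int i = 0; i < timestampsMillis.length; i++) {
        long bucketStart = (timestampsMillis[i] / hourMillis) * hourMillis;
        double[] agg = buckets.computeIfAbsent(bucketStart, k -> new double[2]);
        agg[0] += values[i];
        agg[1] += 1;
    }
    TreeMap<Long, Double> averages = new TreeMap<>();
    for (var e : buckets.entrySet()) {
        averages.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
    }
    return averages;
}
```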

And by the tiered part, we basically mean that we have three kinds of storage. Tier number one is our write-ahead log. It means that when we ingest data, we just write it to disk, append-only, super fast. This gives us ingestion rates of millions of rows per second, but it's hard to query, because it is written append-only, as the data are coming in. So to be able to query efficiently, we have the second tier, where we transform the received data into a shape which is suitable for this time series processing.
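As an illustration of the append-only idea, here is a minimal Java sketch of a write-ahead log writer; it is not QuestDB's implementation, just the general pattern of encoding rows and appending them to the end of a file as they arrive:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class WalWriter implements AutoCloseable {
    private final FileChannel channel;
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(16);

    WalWriter(Path path) throws IOException {
        this.channel = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    void append(long timestampMicros, double value) throws IOException {
        buffer.clear();
        buffer.putLong(timestampMicros).putDouble(value).flip();
        while (buffer.hasRemaining()) {
            channel.write(buffer);      // always appends at the end of the file
        }
    }

    @Override
    public void close() throws IOException {
        channel.force(true);            // flush to disk before closing
        channel.close();
    }
}
```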

In the second tier, the data is organized by time: you can imagine that the files on disk are literally, physically sorted by time. It means, in the extreme case, if I want to find the last row by time, it is literally the last row we have written to the second tier. So it makes these kinds of queries efficient. If I want the last hour, I just do some kind of binary search for where the last hour starts, and then I know: okay, if this row is the first one within the last hour, then everything from that row onwards is my last hour. And then we can partition by time, and tricks like this, for efficient querying. So that's the second tier.
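For illustration, a minimal Java sketch of the kind of binary search described here, assuming a timestamp column that is already sorted in ascending order (the names are made up):

```java
// Because the column is physically sorted by timestamp, a binary search finds the
// first row of the window; every row after it belongs to the window.
static int firstRowAtOrAfter(long[] sortedTimestamps, long windowStart) {
    int lo = 0;
    int hi = sortedTimestamps.length;     // exclusive upper bound
    while (lo < hi) {
        int mid = (lo + hi) >>> 1;
        if (sortedTimestamps[mid] < windowStart) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return lo;   // rows [lo, length) fall inside the requested window
}
```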

And the third tier is the cold storage, because QuestDB and time series databases in general are often used in high-ingestion scenarios: ingesting data from exchanges, IoT data, physical AI, things like that. So typically machine-generated data, and machine-generated means it's a lot of data, right? The total data set can grow quite fast, and also, as I said in the beginning, the queries typically need just a recent enough subset of the data. Oftentimes, not always, but oftentimes.

It means that we can afford to offload the older data, and we know where it is, because everything is sorted by time in the second tier. So we can upload it to some kind of cheap object storage, think of S3, but not only that. And because we do not want to hold your data hostage, we upload it to this object store in the Parquet format. So if you have tooling capable of processing Parquet, which means all tooling these days, then you can process these offloaded Parquet files without even going through QuestDB. So that's what we mean by three tiers or three layers inside QuestDB.

Olimpiu Pop: Okay. Thank you for the explanation. So just to summarize, there are three different lenses you can look through. One of them is the write-ahead log, and there you're just writing, writing, writing. It's just append-only, and that's the main reason for the high intensity, where you're able to ingest as much data as possible. Then, after that, you digest it asynchronously and put it into a time series format, meaning the rows are organized by time, and that's where the querying happens.

If I remember correctly, it's SQL queries, which makes things a lot easier, and if I have to do any kind of query, I'll do it there. And then, for integration purposes, but also for data further back in time, you use Parquet files, which we can call a standard these days; in terms of object storage, S3 became the standard at this point. So that means that if, for instance, I have a data lake, and obviously everything goes into Parquet now, I can see the data there without even interacting with the database, because the data is already in Parquet, and that makes things a lot easier.

Jaromir Hamala: That is exactly right. And maybe we could summarize it in one sentence per tier: the first tier is ingestion optimized, the middle tier is query optimized, and the last tier is archiving optimized.

Why Is Java Appropriate for High-Data-Ingestion Systems? [07:42]

Olimpiu Pop: Yes. Great summary, Jaromir. So where is Java in all of this? It's the old story, and we are old folks; we've been speaking about Java for at least a decade, and we interacted 10 years ago, when you were more focused on streaming, with another company. Everybody says that Java is not fast. Then why do you build a fast database in Java? Is it Java, or is it something else that just looks like Java?

Jaromir Hamala: That's a great question, because the core of QuestDB is technically Java. If you check our GitHub stats, then, I don't know by heart, but my guess is that 85%, if not more, of the total lines of code is Java. So nominally it's Java, but it's rather unorthodox Java, at least that's what people tend to tell us when they see our code base. And the reason is historical: the founder of the project has a background in high frequency trading, and he would use Java even in high frequency trading.

But these folks use a very special kind of Java. They tend to avoid allocations, because especially back then allocations implied garbage, and that in turn implied long pauses, which were not acceptable in the high frequency trading domain. So in the style of Java the high frequency trading people write, they heavily rely on off-heap data structures, they often use pooling and object reuse rather than allocating, and in general they go against a lot of so-called best practices in Java.
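As an illustration of the pooling and object-reuse style mentioned here, a minimal Java sketch (not QuestDB code, names invented) of a pre-allocated pool whose instances are reset and reused instead of creating garbage:

```java
import java.util.ArrayDeque;

final class OrderPool {
    static final class Order {
        long id;
        long priceMicros;
        long quantity;

        void reset() { id = 0; priceMicros = 0; quantity = 0; }
    }

    private final ArrayDeque<Order> free = new ArrayDeque<>();

    OrderPool(int capacity) {
        for (int i = 0; i < capacity; i++) {
            free.push(new Order());               // all allocation happens up front
        }
    }

    Order acquire() {
        Order o = free.poll();
        return o != null ? o : new Order();       // grow only if the pool runs dry
    }

    void release(Order o) {
        o.reset();                                // reuse instead of letting it become garbage
        free.push(o);
    }
}
```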

And the founder, who is now the CTO of the company, thought, well, maybe I could apply the same principles to write a full-blown database. Because these high frequency trading programs tend to be small, right? They're not as generalized as a database is. So he took it as a challenge: "Hey, let's start a database, but apply the same principles". And he picked Java because that was the language he knew best, and the QuestDB code base is older than, say, Rust 1.0.

So he started slowly, he hit a lot of dead ends, and he had to rewrite big subsystems from scratch many times. The end result is that the code base is now nominally Java, and we have a lot of extensions. We have parts in C, C++, assembler, Rust, but the heart is Java. And we mentioned the ingestion rate of millions of rows per second; take this as yet more evidence that Java can actually be pretty fast. The same goes for querying: in many independent benchmarks, we are often in the top positions.

Olimpiu Pop: Yes. It's in great majority Java, so it's 90.8% Java according to GitHub statistics. Then there are bits and pieces of C, C++, some Rust, but it's mostly Java. But what's the secret sauce of QuestDB? That's the main question. Is it old school Unsafe, or are you going into more fancy stuff like the Vector API and SIMD and other things like that?

Upgrading From "Old-School" Java [10:57]

Jaromir Hamala: The core right now, as it is, still relies on the old school Unsafe approach. One reason is that, up until recently, the Java client was part of the main QuestDB jar for historical reasons, and this was preventing us from upgrading the Java baseline version aggressively. We just recently split it, so now the core is on 17, or we just bumped it, or are about to bump it, to 21. And because of this client/server split, we are now in a position to do this very aggressively. So the long-term plan is to basically track new Java releases as closely as possible.

It is possible that if they completely cut off access to Unsafe, that will mean a delay for us, because we will have to adapt, but we are pretty excited about the new developments. So, we are definitely eyeing Valhalla and Panama for value objects and better interactions with the non-Java world. I was playing with the Vector API just last weekend, because it's super exciting for things like filtering, where you have a SQL predicate. Because we are often filtering over hundreds of millions or billions of rows, the machine code must be the most efficient machine code possible.

So what we do right now for filter execution is that we use a JIT, but for filters we don't use the JIT which is part of the Java runtime. We implemented our own JIT because of the predictability we need. The architecture is: you are a user, I am the QuestDB database. You send me SQL, it goes through the traditional parsing, we build some kind of abstract syntax tree, and then we see, oh, this part is a filter. Then we serialize this filter, this WHERE clause from the SQL, into some kind of intermediate representation stored off-heap.

We take a pointer, and then over JNI we call into our just-in-time C++ backend, which reads this intermediate representation of the filter from that off-heap memory and builds matching code for the given platform, Intel or ARM. For Intel it can be scalar or vectorized. It builds the most efficient machine code possible for that given filter and then returns a pointer.

And so the Java code knows that, okay, at this pointer there is a native function which represents that particular SQL filter. So when we are later executing the query and we need to filter the rows to find the ones matching the filter, there is another JNI call, which basically receives the address, again off-heap memory, where the data are stored, and the address of the filter function we generated before, which is applied to the data columns and returns only the matching row IDs.
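A hypothetical Java sketch of what such a JNI boundary could look like; the actual QuestDB class, method names, and signatures will differ:

```java
final class FilterJit {
    static {
        System.loadLibrary("filter-jit"); // assumed native library name
    }

    // Takes a pointer to the off-heap intermediate representation of the WHERE
    // clause and returns a pointer to freshly generated native machine code.
    static native long compileFilter(long irAddress, long irSizeBytes);

    // Applies the compiled filter to a column stored off-heap and writes the
    // matching row ids to an off-heap output buffer, returning the match count.
    static native long applyFilter(long filterFnAddress,
                                   long columnAddress,
                                   long rowCount,
                                   long outRowIdsAddress);

    // Releases the generated machine code once the query is done.
    static native void freeFilter(long filterFnAddress);
}
```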

The other reason why we would like to get rid of this C++ backend, or I would like to get rid of it, is that up until recently we supported only Intel. We didn't support ARM at all for JIT-compiled filters, because for ARM we would have had to write this backend from scratch. So my other motivation is that, okay, we could use the Java Vector API to do the same kind of efficient filtering that we do right now from the C++ backend with, say, AVX2 instructions.
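A minimal sketch of the kind of Vector API filtering being described, assuming a long column and a simple "greater than" predicate; it needs the incubator module (--add-modules jdk.incubator.vector) and is not QuestDB code:

```java
import jdk.incubator.vector.LongVector;
import jdk.incubator.vector.VectorMask;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

final class VectorFilterSketch {
    private static final VectorSpecies<Long> SPECIES = LongVector.SPECIES_PREFERRED;

    // Collects the row ids where column[row] > threshold, processing one vector at a time.
    static int filterGreaterThan(long[] column, long threshold, int[] matchingRows) {
        int matches = 0;
        int i = 0;
        int upper = SPECIES.loopBound(column.length);
        for (; i < upper; i += SPECIES.length()) {
            LongVector v = LongVector.fromArray(SPECIES, column, i);
            VectorMask<Long> mask = v.compare(VectorOperators.GT, threshold);
            for (int lane = 0; lane < SPECIES.length(); lane++) {
                if (mask.laneIsSet(lane)) {
                    matchingRows[matches++] = i + lane;
                }
            }
        }
        for (; i < column.length; i++) {          // scalar tail for the leftover rows
            if (column[i] > threshold) {
                matchingRows[matches++] = i;
            }
        }
        return matches;
    }
}
```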

So I was playing with this just last weekend, and the results are, let's say, promising. There are still some typical Java difficulties, like warmup time. What I ended up doing is consuming the same intermediate representation of the filter that the C++ backend consumes right now, and generating bytecode at runtime which represents that particular filter. The generated bytecode would go through the Java Vector API, and at runtime it would be just a regular Java function.

The problem is that it takes time to warm up, so it was not really that great for ad hoc queries. If each filter is slightly different, then we need to generate a new Java class for each filter, and that's not great, because it means that the first execution is slow. Even when we are filtering over millions of rows, the JIT in this case kicks in almost immediately, but the generated code is first interpreted by the JVM, and that's the usual Java tiering, C1, C2. When new code is generated while the filter is still running in that hot filtering loop, Java has this thing called on-stack replacement, which means the first execution of the hot loop will at some point upgrade to more efficient code, but it is still not as efficient as it could be, because it cannot do a full replacement while the loop is still running, and things like that.

So those results are promising, and with the Java Vector API, if we manage to solve some of the difficulties, I think it could be a nice way to, for example, get vectorised execution on ARM. Because right now our JIT on ARM emits scalar-only instructions, while the HotSpot JIT knows NEON and, I think, even the other extension, I forget the name, SVE or something like that, which is basically the vectorised extension of the ARM ISA.

The Vector API is a good one. Valhalla, and then value objects, would be another milestone. I heard some rumours that Valhalla might be integrated after the JDK 27 feature freeze, so maybe in JDK 28, in some form, preview or incubator. And yes, that would also be quite amazing, because one of our principles is to control the memory layout of our data structures, which right now in Java is not easy. So being able to use Valhalla value types and have fine control over memory layout, I think that's going to be fantastic, once it's there.
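For context, a speculative sketch of a Valhalla value class as previewed in the early-access builds; the syntax and semantics may well change before anything ships:

```java
// Speculative: Project Valhalla early-access syntax, not a shipping Java feature.
value class PricePoint {
    long timestampMicros;
    double price;

    PricePoint(long timestampMicros, double price) {
        this.timestampMicros = timestampMicros;
        this.price = price;
    }
    // A value class has no identity, so the JVM is free to flatten a PricePoint[]
    // into a contiguous (long, double) layout instead of an array of pointers
    // to heap objects, which is the memory-layout control discussed above.
}
```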

Olimpiu Pop: Yes. You mentioned Java 17 and targeting Java 21. These are the new epoch moments. I was discussing this the other week with Gunnar Morling, and he was saying the same: after 17, it's a lot easier, and it feels like the Java 8 moment of our generation, when things start moving differently. So that seems to be the case here too.

And then you are looking into the Vector API, which will allow you to move beyond just x86 and Intel towards ARM for these queries. And then obviously there is Valhalla, which will give you a nicer environment when you're looking into memory layout.

Modern Java Aligns Idiomatic Programming with Mechanical Sympathy [18:29]

Jaromir Hamala: It allows us to keep our mechanical sympathy and still write Java code, because right now there is a bit of tension between the two, right? Either you are writing idiomatic Java, where everything is an object with its own identity or a primitive, and if you embrace this, as I think most developers should, then you can write nice idiomatic Java, but sometimes this abstraction is too high and it takes away some of the control.

So when you don't want to prevent your computer from being as fast as possible, you need to have some sympathy for the hardware. And right now we have to decide what we value more: writing nice idiomatic code or having sympathy for the hardware. Well, we decided that we value the sympathy for the hardware more, but it doesn't mean that we don't value nice, readable, maintainable code. We would like to have that too. So my hope is that Valhalla will allow us to have both, hopefully.

Olimpiu Pop: How about Panama? Are you getting rid of the JNI?

Jaromir Hamala: I haven't played with that personally, but maybe we could, since we use mmap internally. Basically, we map a file into memory when we are querying it. Right now, we use a JNI call for that. We have a C library, and it is exposed over JNI. It means that when we want to read a data file, we call from Java to open the file, then map it into memory and return a pointer to that memory. And then we use this pointer for Unsafe access to what is in that file.

And yes, with Panama, we could get rid of JNI, because we could just use the Panama bindings to call mmap from the standard library, and we could also get rid of the Unsafe access. Then, hopefully, one day we will be able to have some kind of zero-cost abstraction to read this off-heap, mmapped memory without paying a price, or too much of a price, because right now, with Unsafe, it can be fiddly.
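A minimal sketch of what the Panama (FFM) route could look like, assuming JDK 22 or newer where FileChannel.map can return a MemorySegment; the file name and layout are made up:

```java
import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

static long readFirstTimestamp(Path columnFile) throws IOException {
    try (FileChannel channel = FileChannel.open(columnFile, StandardOpenOption.READ);
         Arena arena = Arena.ofConfined()) {
        // Memory-map the whole column file; the segment's lifetime is tied to the arena.
        MemorySegment segment =
                channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size(), arena);
        // Bounds- and lifetime-checked read, instead of Unsafe.getLong(address).
        return segment.get(ValueLayout.JAVA_LONG, 0);
    }
}
```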

I sometimes say that using Unsafe from Java, especially if the object is a bit more complex, is sort of like using a pointer to void in C. If you happen to know the C data type, you can have a pointer to that data type, or you can have a pointer to void, and then it can be anything; it's up to you. It gives you some flexibility, but you can also shoot yourself in the foot pretty badly, because if you are assuming that the memory behind this pointer has a certain layout, and that assumption turns out to be wrong, then very bad things can happen, and you have no type safety.
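To illustrate the "pointer to void" analogy, a small sketch using sun.misc.Unsafe; nothing stops the caller from reinterpreting the same bytes as whatever type they assume:

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafePeek {
    static final Unsafe UNSAFE;
    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            UNSAFE = (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        long addr = UNSAFE.allocateMemory(8);   // 8 raw bytes, no type attached
        UNSAFE.putLong(addr, 42L);
        // Reading the same bytes back as a double is accepted without complaint;
        // whether the result means anything is entirely up to the caller.
        double reinterpreted = UNSAFE.getDouble(addr);
        System.out.println(reinterpreted);
        UNSAFE.freeMemory(addr);
    }
}
```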

Olimpiu Pop: So all in all, Java is moving in the right direction to close the gap between having idiomatic Java and high-performing Java under the same roof.

Jaromir Hamala: Sorry, one thing I want to clarify: I do not want to sound like, if you don't use these techniques, your code is going to be slow. That's not the impression I would like to make. The techniques we use are not necessary for most developers. If you are just writing your Spring applications, you totally don't need to do this, because someone else did it for you already in the framework. If you're writing the framework and performance is your differentiator, then you need it right now. And the hope is that in the future it will be more straightforward, less risky, less fragile, easier to maintain, and easier to reason about.

Olimpiu Pop: Yes. In my opinion, it's for a narrow set of developers, where you have to focus on every second and every millisecond, and that's a very thin slice. As you mentioned, it's the financial side, where you deal with trading, then it's databases, and probably some other areas, perhaps in the robotics space, where each millisecond counts and you have real-time constraints. So I think that's where it's more targeted.

Linux Kernel Debugging One-On-One [22:56]

Some time ago, you wrote a post about playing in the Linux kernel, and that was a rabbit hole. I mean, we are already in a very small bucket of developers talking about mechanical sympathy, Unsafe and the memory model, but then discussing the operating system kernel, that's a whole different bucket. So maybe you can share something about that.

Jaromir Hamala: I was trying to reproduce a performance issue that one of our customers experienced. I attached async-profiler to QuestDB and my whole computer froze. What is going on? I restarted the computer, tried to attach the profiler again, and the same thing happened. It was completely frozen. I was totally perplexed about what was going on. First, I suspected a bug, maybe in QuestDB or async-profiler, but it was strange, because we have async-profiler embedded within QuestDB.

It's shipped; in every QuestDB installation, the profiler is there. I was the one who did the integration, so I was pretty sure that it used to work before. My computer was freezing, and at some point I realized that I had recently upgraded my Ubuntu version, which came with a new kernel. That was a clue. I started to Google around and I found that a virtual friend, Francesco Nigro from Red Hat, had reported very similar behaviour just a few hours or days earlier. So this helped me find a bug in the Linux kernel which, under some very specific circumstances, led to a deadlock inside the kernel. The kernel basically deadlocked itself, because internally it was trying to cancel a timer, but the task which was cancelling the timer could not cancel it, because the timer had actually triggered that task.

In order to cancel a timer, the kernel has to make sure that nobody else is executing the timer callback. Oh, there is someone executing the timer; it's me. So basically it waited until that callback was done, but the callback could not complete, because it was the one cancelling the timer. Luckily there was a workaround, so I could carry on with the customer work. But as I mentioned in the beginning, I like to know a lot about computers, and I don't see this kind of bug every day. So I had to look at the patch, because it was already fixed in the kernel tree. I checked how it worked, and then I tried to reproduce it in QEMU.

Eventually I managed to reproduce it, and it was like, well, if I already have a stable reproducer in an emulator, maybe I can poke at it a little bit. I'm not a kernel developer; I can reason about some small parts of the kernel, but by no means am I a kernel developer. But I'm curious. So I opened the kernel source tree to see how it works. And then I realized that I could attach a GDB debugger to that emulated machine and debug the kernel step by step, just as if you were debugging a Java program, for example. This was new to me; I had never done this before. So I was playing a little bit: you do one step, you observe how the registers are changing.

And then I thought, well, maybe I could trick the kernel into exiting that deadlock, and maybe I could revive that frozen computer somehow from within the kernel. At first I could not do it, and I was about to give up, but then, I like to poke at things, so I tried something else with GDB, and eventually I managed it through a series of super ugly tricks which involved lying to the kernel. I was stepping out of a function which was returning one value, but under the kernel's feet I replaced that value, so the rest of the code thought that the function had returned something other than what it actually returned.

And yes, I was able to unfreeze that frozen box, which is not practical at all, because we basically lied to the kernel, and I think once you lie to the kernel in situations like this, then, I don't know, maybe I violated some invariants, and I would not recommend doing this in production or anywhere outside of a laboratory. But it was a fun exercise. Maybe you could call it a party trick, but I learned how profilers work, how timers in the kernel work, I learned how to attach a debugger to a kernel, and bits and bobs. So it was fun. That was all it was: it was fun and it was a learning exercise, totally impractical. I would not recommend it. No, I would, I would, but don't expect that your employer will be amazed. They will not.

Olimpiu Pop: Okay. That was my question. Will you do it again?

Jaromir Hamala: Well, it depends. If my boss is listening to this podcast, then definitely no, I will always work on business-critical tasks. But if he's not listening, I will, because sometimes I just cannot resist. Sometimes it's too much to resist.

How to Get Java to Process 1B Records in Seconds [28:43]

Olimpiu Pop: I remember that point in time, because we have to mention it: three years ago, you got the bronze medal, and that was quite amazing given the number of submissions that challenge had.

Jaromir Hamala: But the rules played in my favour. In the One Billion Row Challenge, it was explicit in the rules that people were encouraged to take inspiration from each other, to borrow ideas. So if you joined late, it was not a big deal, because you could see what others were doing, what they were trying, where they had failed, and think about your own contribution. Of course, if you just copy what the best-placed person is doing, well, maybe you will be high on the scoreboard, but where's the fun in that?

I think I was in it for the last two weeks or something like that. I spent most of the evenings in front of the computer, just doing a lot of profiling and benchmarking, going down from Java to assembly again to see where the computer is wasting time, where maybe the algorithm is fast enough and it's just memory latency, and just trying to smooth everything out. And I think my own contribution, at least one of them, was the realization that CPUs these days are super parallel, out-of-order execution machines. And by parallel, I don't mean just that they have multiple cores; I mean that one core can do many things at the same time.

So even if you have just a single thread running on one specific core, even if you have a modern CPU with one core, if there is such a thing, it can still do many things in parallel. So my contribution was that I exploited this. Basically, each line looks like it was copy-pasted twice, just with different arguments, and that is to exploit the fact that the CPU has multiple arithmetic logic units, so it can do multiple logical or mathematical, algebraical, you name it, operations at the same time. My first version literally just copy-pasted every single line, doubled it, processed multiple things at once within the same thread, and hoped that the total runtime would be less than if you were just running it serially.
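A minimal Java sketch of the instruction-level-parallelism trick described here: two independent accumulators give the CPU two dependency chains it can execute in parallel on a single core:

```java
// Two independent accumulators let multiple ALUs work on two additions per iteration
// instead of waiting on one serial dependency chain.
static long sumWithTwoAccumulators(long[] data) {
    long acc0 = 0;
    long acc1 = 0;
    int i = 0;
    for (; i + 1 < data.length; i += 2) {
        acc0 += data[i];      // independent of acc1, can execute in parallel
        acc1 += data[i + 1];  // independent of acc0
    }
    if (i < data.length) {
        acc0 += data[i];      // odd-length tail
    }
    return acc0 + acc1;
}
```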

And it worked. It worked. At some point, I think 12 hours before the deadline, I was even in first place. But then the other people used the trick as well, because they saw it, it was public, and they were eager to copy it. And then they had some other tricks, and then I could not jump over Thomas and Artem, I believe, who was the second guy. But yes, that was pretty amazing.

And again, I learned. I improved my intuition about what is possible to do in, say, 80 milliseconds or 200 milliseconds. Because when this challenge started, the first good times were around five, six, seven seconds, something like that. And I was like, okay, maybe they'll halve it, so it will be two seconds, maybe that will be the winning time. And I was proven so wrong, proven wrong by orders of magnitude. So it was very time-consuming, but the lesson for me is, again, that computers are extremely fast, and if you are not sabotaging them, they can surprise you.

Olimpiu Pop: Okay. So for me, the lesson learned is that there is a big difference between beautiful code and fast code at certain points. If you know exactly how the hardware actually behaves, it might be the case that you can optimize for that, if the case requires it.

Jaromir Hamala: And again, it doesn't mean that in order to write fast code, it has to be ugly. This was really about squeezing out the very last drops, and that required some very ugly tricks. Again, I would not recommend this to anyone. We don't even use this level of trickery in QuestDB, because the One Billion Row Challenge had one massive advantage: the code didn't have to be maintained after the deadline, so you could do whatever you wanted.

We could go bananas, try the ugliest possible tricks, specialize for the input, so that at some point people realized that there was a certain distribution of city names, names with a certain shape, and this fact could be exploited to specialize hash maps for these particular shapes. Again, you can't do that in a general database, because it has to work with all shapes of input data. But yes, for building a better intuition about hardware, this was priceless. Yes.
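A heavily simplified sketch of the "specialize the hash map for the input shape" idea: keys are truncated to their first 8 bytes and stored in a flat, open-addressed table; a real implementation would also handle longer names, the zero-key sentinel, and full key comparison:

```java
final class StationMapSketch {
    private static final int CAPACITY = 1 << 16;          // power of two, open addressing
    private final long[] keys = new long[CAPACITY];       // first 8 bytes of the name
    private final double[] sums = new double[CAPACITY];
    private final int[] counts = new int[CAPACITY];

    void add(long first8BytesOfName, double value) {
        // Cheap mixing hash of the truncated key, mapped into the table.
        int slot = (int) (first8BytesOfName * 0x9E3779B97F4A7C15L >>> 48) & (CAPACITY - 1);
        while (keys[slot] != 0 && keys[slot] != first8BytesOfName) {
            slot = (slot + 1) & (CAPACITY - 1);            // linear probing on collision
        }
        keys[slot] = first8BytesOfName;
        sums[slot] += value;
        counts[slot]++;
    }
}
```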

Olimpiu Pop: Okay, great. Thank you for sharing. Is there anything else that I should have asked you, but I didn't?

Is AI Native Coding for Tinkerers? [34:03]

Jaromir Hamala: You didn't ask me if I use AI for coding. I totally expected this question.

Olimpiu Pop: So do you?

Jaromir Hamala: Yes, I do. I do. I have different use cases. Sometimes it's just investigation of an unknown code base. So when I was doing my Vector API exercise last weekend, I wanted to see how the mapping of the Vector API works all the way through the compilation pipeline in HotSpot down to specific vectorized instructions. And it's all there in the HotSpot code base, but I'm not a HotSpot developer, and the HotSpot C2 compiler is a complicated beast.

It doesn't use a normal control flow graph. It has its own thing, the sea of nodes, and I don't know it that well. So I use Codex and Claude to basically explain things to me. I was like, I'm doing this and that, and I have a hypothesis about how this works; please check it, validate it. So that's one use case. I used it for the Linux kernel too. If I want to see how something works in the kernel, I have a Linux kernel clone on my disk.

So I start Codex or Claude, ask a question, then validate the answer. I think that's the important bit people sometimes skip, but it's amazing for this kind of explorative work and as a learning tool, right? Because I can do things I wouldn't have been able to do in a reasonable time before, like this HotSpot exploration. It doesn't mean that I would not be able to trace it all the way down, but I wouldn't be able to trace it all the way down in one Saturday. So practically it would not be possible, because I have a family too.

Olimpiu Pop: Well, even if it's not that, my experience is that trying something new a couple of years back meant a lot of research. You did it, you tried something, and then life happened or business priorities happened, and then you forgot, and then you had to remember what you did. But now it's very easy to do, you have a lot of iterations, and you get some aha moments where normally you would have just done it differently.

And now it feels a lot faster, especially if you know how to fix things yourself, go deep into the code, and use your intuition. So I think that's important. For me, the concerning part is that it seems to work very well for people with more experience and better intuition, while the folks that are still at the beginning don't know what's wrong. But looking at the last couple of weeks, I think both Claude and Codex brought in a learning mode, and there were some interesting experiments where junior developers played with the framework, and those that actually used the learning mode scored a lot better on the quiz afterwards. So we have options, if we want to use these kinds of things.

Jaromir Hamala: It's amazing. Just the other day, I was thinking, okay, if I were starting with programming these days, what would it be like? Because on the one hand, these tools are amazing for learning, right? You can learn about a lot of complicated real-world pieces of software, the kernel being one example. So on one hand, it is an amazing learning exercise. On the other hand, knowing myself, there might be this "why bother" question, right? If I were starting coding these days and I could see what agents can do on their own, I don't know if I would have enough discipline to actually learn the craft.

I don't know what the answer is, and I guess we will learn in a few years how this works out. Because coding needs a lot of discipline, especially when you are starting, and even later. It needs deep concentration, and you need at least some level of goal orientation, because you need to invest a lot of time, right? And if you are investing a lot of time, and you invest six months, say, to learn something, and then you realize an agent can do it just from a few sentences, I think that's not very good for motivation.

Olimpiu Pop: I will just bring up two things against that, things you mentioned in our discussion. One of them is that you were shaped by experiments; just think about the Linux kernel side and something that made you curious. And the other thing is that there is a difference between code that you run in production, which needs to be maintained and puts you in a particular set of cases, and all the other code.

So three years ago, there was a presentation at QCon London by one of the folks working as an SRE for Anthropic, and it was very nice to see how the people at Anthropic are using Claude to dig through logs and things like that. My expectation is, and we already saw it with the launch of OpenAI three or four years ago, or whenever it happened, that the products are now built in front of your eyes, and that's what they did.

So I think we need this acceleration, but my fear is about superficiality, because you don't understand a lot of things in depth. On the other side, if you do, you actually get these powerful eye-openers. There was a security researcher from Anthropic who mentioned that he experimented with a couple of things using Claude, doing some red teaming, and he actually found a vulnerability in a project that has been available open source for 10 years now. It was the first vulnerability ever reported for it. It was in Ghost, the static content generator written in JavaScript, and that was quite interesting. So I think it's an interesting era, with its challenges and its cognitive load, but yes, let's see where this takes us.

Jaromir Hamala: Yes, yes, yes. Agree.

Olimpiu Pop: Thank you for your time, Jaromir. It was a great discussion.

Jaromir Hamala: Thank you for having me. Yes, I was having fun too.
