At Tallinn University of Technology, Information and Communication Technology PhD student Ian Erik Varatalu created a high-performance engine for searching text that has drawn attention in both academic circles and the software industry.

From Academic Interest to Technological Innovation
Varatalu began his studies at TalTech with a bachelor’s degree in Business Information Technology. He later moved on to the Computer Science master’s programme to deepen his technical knowledge, which eventually led him to doctoral research.
Over two and a half years, under the supervision of Margus Veanes (Microsoft Research, Redmond) and Juhan Ernits (TalTech), Varatalu developed RE#, a high-performance regular expression library that is currently the fastest in the world in terms of processing speed.
Recognised Research with Practical Applications
Varatalu presented his work at the Principles of Programming Languages 2025 conference in Denver. His paper was published in the January 2025 issue of the Proceedings of the ACM on Programming Languages.
RE# shows significant performance advantages over existing regex engines, which are widely used in data processing. Varatalu explains that regular expressions allow you to search for patterns in text—like email addresses or phone numbers. “A basic search is like reading a book with a magnifying glass. Regex lets you search the entire library.”
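To make the idea of pattern search concrete, here is a minimal sketch in Python. It uses Python's built-in `re` module as a stand-in, not RE# itself, and the two patterns, the `find_contacts` helper, and the sample sentence are illustrative assumptions rather than anything taken from Varatalu's work.

```python
import re

# Deliberately simplified patterns for illustration only; real-world email
# and phone formats are far more varied than these expressions cover.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def find_contacts(text):
    """Return every substring of text that matches the email or phone pattern."""
    return {"emails": EMAIL.findall(text), "phones": PHONE.findall(text)}


# Hypothetical sample input.
sample = "Write to info@example.com or call +372 555 0100 before Friday."
result = find_contacts(sample)
print(result)
```

The same two patterns could be run over millions of documents, which is why the speed of the underlying engine matters.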
The new tool is not just faster than the competition; it also has a well-designed internal architecture. Saving even a few milliseconds on an operation that is repeated millions of times per day can add up to large savings in both computation time and energy use.
Integrated into Microsoft’s Technology Stack
Beyond academia, RE# also contributed to improvements in the .NET 9 framework—one of the most widely used development platforms globally.
.NET powers systems used in enterprise software, finance, healthcare, and widely known applications like Microsoft Office, Stack Overflow, and Unity—the game engine behind nearly half of all video games.
Varatalu’s contributions were integrated directly into the .NET codebase, enhancing regex performance. His GitHub pull request, submitted under the username ieviev, was highlighted in Microsoft’s official .NET 9 blog.
While applications like Microsoft Excel may not use RE# directly, components improved by Varatalu’s work are widely used across Microsoft technologies. As Ernits describes it, .NET is like a set of building blocks: if your contribution becomes one of those blocks, it can support countless systems and developers around the world.
Impact on AI Infrastructure
Varatalu’s work also influenced the development of a component of Microsoft’s Artificial Intelligence Controller Interface (AICI). Based on the concepts from his research, Microsoft engineers developed an efficient tokenizer, which is now part of AICI’s preprocessing pipeline for large language models like ChatGPT.
“When you ask ChatGPT a question, the system first needs to break it down into tokens before it can process the meaning,” Varatalu says. “That step is directly connected to our work.”
The result is a behind-the-scenes contribution to billions of AI-generated responses around the world.
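As a rough illustration of what "breaking text into tokens" means, the sketch below splits a sentence into word and punctuation tokens with a single regular expression. This is a simplified assumption for illustration: production tokenizers for large language models use learned subword vocabularies, and neither the `TOKEN` pattern nor the `tokenize` function is taken from AICI or RE#.

```python
import re

# One alternation: a run of word characters, or any single character that is
# neither a word character nor whitespace (i.e. punctuation and symbols).
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into word and punctuation tokens, discarding whitespace."""
    return TOKEN.findall(text)


tokens = tokenize("How fast is RE#?")
print(tokens)
```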

A Flexible and Rigorous Academic Environment
TalTech’s master’s programme in Computer Science allows students to shape their studies through a wide range of elective subjects—spanning computer science, robotics, machine learning, cybersecurity, and more.
“You can choose the topics that interest you and explore them deeply,” says Varatalu. “Bachelor’s studies are broader, but at the master’s level, you can truly define your path.”
Ernits adds that the goal isn’t just to produce coders. “We aim to develop people who understand how systems work and how to improve them.”
Technical Depth in the Age of AI
With AI increasingly capable of writing code, is a degree in computer science still relevant?
“Absolutely,” Ernits says. “AI can handle simple tasks, but complex problems still require specialists who understand systems and know how to ask the right questions.”
Varatalu agrees. “AI can generate code, but knowing what’s correct requires expertise. That’s where deeper understanding comes in—and that’s the real value of a master’s degree.”
Research that Connects with the Real World
Students at TalTech are encouraged to define their own study path and choose their thesis topic. The research groups of the Department of Software Science support work in software engineering, natural language processing, computer vision, robotics, and theoretical computer science. They also support interdisciplinary projects, co-supervised by experts from other domains, that connect IT to fields like chemistry or astrophysics.
“We’ve had theses on topics from quantum chemistry to natural language processing,” says Ernits. “It’s about applying IT knowledge in real-world contexts.”
The Value of Community and Initiative
University life is more than just lectures and labs. For Varatalu, one of the most important aspects was connecting with peers and mentors.
“When you begin your master’s, you form connections. These relationships often lead to interesting projects and new opportunities,” he says. “It’s also a valuable social experience—you’re not working alone.”
“TalTech offers a lot,” he adds. “If you have the interest and motivation, there’s much to gain here.”
Ernits notes that students are expected to take initiative. “When you do, the world will become your oyster. Our role is to facilitate that.”
Admissions Open
TalTech provides the knowledge and tools to develop real solutions.
Admissions are currently open: www.taltech.ee/admissions
For more information, join the IT Master’s Programmes Info Evening on April 8 at 17:00, held at Mektory.