Unlocking New Frontiers: Associate Professor Peter Jansen Leads Research on AI and the Future of Scientific Discovery

Dec. 9, 2024

Can AI perform science like humans—and perhaps someday, better?

Peter Jansen

Peter Jansen, Associate Professor and Researcher, Allen Institute for Artificial Intelligence (Ai2). Photo courtesy Ai2.

At the University of Arizona and beyond, Peter Jansen is redefining how machines engage with science. As an associate professor in the College of Information Science and a researcher at the prestigious Allen Institute for Artificial Intelligence (Ai2), Jansen sits at the cutting edge of a field long viewed as a distant dream: automated scientific discovery. His latest project, DiscoveryWorld, developed with other Ai2 researchers, blends the charm of a video game with the rigor of scientific exploration. At its heart, this research asks an ambitious question: Can AI perform science like humans, and perhaps someday, better?

Jansen is an interdisciplinary artificial intelligence researcher with expertise in natural language processing, automated inference and virtual world simulators. His work spans automated scientific reasoning and knowledge representation, including pioneering projects such as ScienceWorld, a high-fidelity simulation of 30 science experiments; TextWorldExpress, capable of running up to a billion experiments per day; and DiscoveryWorld. He also leads initiatives to make AI systems capable of producing explanations for their reasoning, exemplified by resources like WorldTree and EntailmentBank. With a joint background in computer science, cognition, physics and engineering, Jansen’s career bridges the gap between theory and application, integrating fields that traditionally operate in silos.
 

DiscoveryWorld

DiscoveryWorld contains 120 different challenge tasks, organized as parametric variations of eight discovery topics spanning three levels of difficulty. Graphic courtesy Peter Jansen.

His diverse research interests reflect an equally diverse professional journey. Jansen’s projects have received international recognition, including his widely celebrated open-source science “tricorder”—a device inspired by Star Trek that allows users to sense and analyze the world around them. In 2015, the tricorder earned a place on permanent exhibit at the German Museum of Technology in Berlin. His work has also been featured in outlets like Reuters, WIRED and The Washington Post, and he has delivered a TEDx talk on accessible science at TEDxBrussels.

Yet his passion for discovery doesn’t stop at hardware. Through his collaborative work at Ai2—a research institute renowned for developing transformative AI tools—Jansen explores how artificial intelligence can augment scientific progress. “The idea is, can we build AI models that actually do science?” he says.

This ambitious question drives DiscoveryWorld, a virtual environment designed to test whether AI systems can engage in genuine scientific discovery—hypothesizing, experimenting and refining results to solve complex problems.

Inspired by old-school role-playing games, DiscoveryWorld sets itself apart by offering more than just theoretical puzzles: it immerses AI agents in the practical chaos of research. The game presents 120 tasks across fields like biology, rocket science and epidemiology, all with varying difficulty levels and all set on a fictional Planet X where the laws of physics differ just enough to prevent AI from relying on what it memorized from its training data. Players, whether human or machine, must navigate scenarios that require planning and experimentation, such as diagnosing foodborne illnesses or calculating the propellant needed for a rocket launch.

“It looks like a game,” Jansen jokes, “but it’s really the least fun game you’ll ever play. You need spreadsheets, lab notebooks and statistics software open—because it’s real work, not entertainment.”

The challenge? Human scientists solve 60% to 70% of the tasks on average, and some solve nearly all of them, though performance varies widely. Yet even today’s most advanced AI agents, built on OpenAI’s GPT-4o model, manage about 20%, and mostly on the simplest tasks.

Jansen emphasizes that this gap between human and machine intelligence, while stark, is exactly what makes DiscoveryWorld so valuable. “The very nature of discovery means you’re working with unknowns, so you can’t use standard evaluation metrics,” he explains. The game provides a sandbox for researchers to assess how well AI can handle the full scientific process, from forming hypotheses and designing experiments to drawing meaningful conclusions.
  

DiscoveryWorld tasks

DiscoveryWorld demands all the key facets of end-to-end scientific discovery: ideation, forming hypotheses, planning and executing experiments, and drawing conclusions to solve challenge tasks developed for this virtual world. Graphic courtesy Peter Jansen.

The project’s broader aim is outlined in the research paper, “DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents,” which will be presented at the NeurIPS Annual Conference, held December 10-15, 2024. DiscoveryWorld provides a vital benchmark, enabling researchers to evaluate AI systems’ ability to engage in open-ended scientific inquiry, a far cry from the rote tasks most AI systems excel at today. “If we can crack this,” Jansen says, “we could genuinely accelerate the pace of scientific discovery.” The hope is that other researchers will take up the challenge, using DiscoveryWorld as a springboard to push the boundaries of what AI can achieve. “We’ve released this as a challenge to the broader community,” he continues. “Our hope is that others will develop clever solutions that push the field forward.”

With the project set to be showcased this week at NeurIPS, DiscoveryWorld has already earned its place on the AI research map, including a recent story in New Scientist. Yet, for Jansen, this work is just the beginning. Whether it’s creating AI systems capable of performing scientific discovery, developing virtual environments that test and expand the boundaries of automated reasoning or designing open-source sensing tools to empower science education, his career as a researcher reflects a relentless curiosity and a drive to make science more accessible and impactful.

That drive is similarly reflected in his teaching. At the University of Arizona, for example, Jansen’s applied natural language processing course includes students from across disciplines—from linguistics and engineering to policy analysis—equipping them with the tools to navigate the fast-changing landscape of AI and natural language processing.

“This is the most exciting time to be in the field,” he says, thinking of his students. But with his pioneering research at Ai2 in mind, and considering what’s next for automated discovery in the AI domain, he expands on that: “It feels like the best work of my life.”

Learn more about Peter Jansen on his InfoSci faculty page or his personal website, CognitiveAI.org.