Recent master’s graduate Simeon Krastnikov has received one of two 2021 Huawei Prizes for Best Research Paper by a Mathematics Graduate Student. The prestigious annual award comes with a prize of $4,000 and is conferred to recognize exceptional papers that present original results with the potential to make a significant and lasting impact in the field.
“Simeon’s master’s research is truly extraordinary and it meets the criteria of this award perfectly,” said Cheriton School of Computer Science Professor Florian Kerschbaum, who along with Combinatorics and Optimization Professor Douglas Stebila co-supervised Simeon. “By designing an efficient algorithm for the equi-join database operation in a way that provably satisfies an important property related to data privacy, Simeon managed to solve a long-standing problem.”
Simeon presented the research at VLDB 2020, the 46th International Conference on Very Large Data Bases, which was held virtually in 2020 from August 31 to September 4. Titled Efficient oblivious database joins, the research was authored by Simeon Krastnikov and Professors Kerschbaum and Stebila.
“Congratulations
to
Simeon
for
winning
this
year’s
Huawei
Prize
for
Best
Research
Paper,”
said
Raouf
Boutaba,
Professor
and
Director
of
the
Cheriton
School
of
Computer
Science.
“The
amount
of
data
being
stored
in
the
cloud
is
only
increasing.
Simeon
has
developed
an
algorithm
that
allows
computation
on
encrypted
data
in
the
cloud
while
preserving
privacy.
This
research
is
highly
significant
with
tremendous
potential
for
future
impact.”
 
Dan
Ursu,
a
PhD
student
in
the
Department
of
Pure
Mathematics
supervised
by
Professor
Matthew
Kennedy,
also
won
a
Huawei
Prize
for
Best
Research
Paper
for
his
work
titled
Characterizing
traces
on
crossed
products
of
noncommutative
C*-algebras. 
The Huawei Prize for Best Research Paper by a Mathematics Graduate Student is made possible by generous support from Huawei.
About Simeon’s research
Most graduate students are hopeful their research will contribute to their discipline, but few will crack a long-standing problem and in doing so develop a solution that’s a major advance in the field. Simeon Krastnikov, a recent master’s graduate from the Cheriton School of Computer Science, has done exactly that.
He developed an oblivious algorithm — a type of algorithm whose behaviour is independent of its input data unlike a typical algorithm designed for the same problem — that solves how to join two tables securely in cloud databases.
Governments, businesses and organizations rely on the cloud to store large amounts of data securely — everything from financial records to government documents to sensitive healthcare data. And along with this growth in cloud-based storage has been the demand for service providers to allow computation to occur remotely on the encrypted data they host while preserving data privacy.
A major algorithmic challenge in designing applications for secure computation in the cloud is to combat side channels by ensuring such applications are oblivious to their inputs — that their memory access patterns do not leak sensitive information to the server, Simeon explained. “You may think that computation on your encrypted data in the cloud is happening completely in secret, but depending, for example, on the memory accesses a program makes the server may be able to learn certain information about what’s happening and what the encrypted data looks like, thereby breaking the encryption.”
Various mechanisms have been developed to preserve privacy, among them dedicated secure cryptographic coprocessors and hardware enclaves such Intel’s Software Guard Extensions or SGX, which provide security through a dedicated set of processor instructions. These approaches provide good cryptographic guarantees, ensuring a user’s data remains encrypted while computation is performed on it remotely, but on their own these mechanisms provide no guarantee to address various types of subtle information leakages.
“As the program reads and writes to specific addresses of the server’s memory, these memory access patterns can reveal information about the user’s data if the program’s control flow depends on its input,” said Professor Florian Kerschbaum, Simeon’s co-supervisor and a Professor at the Cheriton School of Computer Science. “This is why we need to develop oblivious algorithms, ones that do not leak sensitive information to the server through such side channels.”
Making database operations oblivious does not pose a great algorithmic challenge in many instances, but one common database operation — the join operator — has proven difficult to make oblivious. In fact, since the first paper describing this problem was published in 2006, no solution has been satisfactory until Simeon’s work that not only cracked the problem but did so elegantly.
“Making the sort-merge algorithm secure meant it had to be made oblivious, ensuring its decisions about which memory locations to access do not depend on the contents of its input data,” said Professor Douglas Stebila, Simeon’s co-supervisor and a Professor in the Department of Combinatorics and Optimization. “This is difficult to do without introducing substantial computational overhead. Yet Simeon developed an oblivious algorithm for binary database joins with a run time that matches that of the standard non-oblivious sort-merge join algorithm. Any model of computation that can support a sorting network on encrypted data can also support his algorithm.”
The research team of cryptographers has tested a working prototype of the oblivious algorithm, which consists of just around 600 lines of C++ code, as well as a version that uses Intel SGX.
A future application will be to apply the logic behind this algorithm to purely cryptographic problems that require similar kinds of operations. A number of database operations are very similar to joins in their computational complexity and the newly introduced oblivious primitives used in the join algorithm could potentially be applied to those as well.
“Both Douglas and I are in awe of Simeon’s breakthrough,” Professor Kerschbaum said. “His solution is scalable to the optimal complexity, it is provably leakage free, it supports the full functionality of equi-joins, and requires no random access memory. All that and its practical efficiency is already extremely good, much better than anything else. It’s an impressive solution to a long-standing problem.”