Celebrating 25 years of coding :)

Professional Projects

SAP SE (2007-ongoing)

Worked on the SAP HANA database, which more than 30,000 SAP customers rely on to run mission-critical ERP and analytics systems. These customers include most of the largest companies in the world.

The code of the HANA database server meets the high standards required for enterprise server deployments: it is robust against malfunctions, free of leaks, fully tested, and verified by static and dynamic analysis tools. This enables continuous uptime.

Some of the technologies used:

Engineering Productivity Improvements (2015-ongoing)

Technologies: C/C++, Python, binutils, GCC and clang toolchains
Role: Lead

Semi-Cube Execution Engine for SAP Analytics Cloud (2019-2021)
Implementation of HyperLogLog operator.
Technologies: C/C++17, LLVM, gtest, Python
Role: Senior Developer
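
For illustration, a minimal Python sketch of the HyperLogLog idea behind such an operator. This is not the production C++ code: the class name, the SHA-1-based hash and the default register count are assumptions chosen for this example only.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate distinct count using 2**p small registers (illustrative only)."""

    def __init__(self, p=12):
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16 for this sketch")
        self.p = p
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch).
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)     # top p bits select a register
        rest_bits = 64 - self.p
        rest = h & ((1 << rest_bits) - 1)
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = rest_bits - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other):
        # Two sketches of the same size combine by taking the register-wise maximum.
        if other.p != self.p:
            raise ValueError("cannot merge sketches with different p")
        self.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small cardinalities: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

With p=12 the sketch keeps 4096 registers, and the expected relative error is roughly 1.04/sqrt(4096), about 1.6%. Because sketches merge by register-wise maximum, partial results computed on separate data partitions can be combined without rescanning the data.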

Snapper: A custom, highly-optimized cube creation tool, enabling fast queries. Acquires data from SQL sources, and produces optimized in-memory data structures that can be quickly persisted and loaded back from disk.

Technologies: C/C++17, SQL, AVX, JSON, gtest, Python
Role: Lead Developer

Various quality topics (2015-ongoing)

Role: Contributor/Lead

Memory Leak Checker (2018-2020)
Custom memory leak detector able to find memory leaks in a live process. In contrast to valgrind and ASan, it does not require instrumentation but uses base functionality of SAP HANA's custom allocators. Lots of optimizations were performed to make it fast, since the working set of a HANA instance is >8 GB and can reach into the TBs. As a result, leak detection is much faster than with valgrind. The tool can be used in testing, so memory usage errors can be found before the software is shipped.
Technologies: C/C++17
Role: Lead

Hardening SAP HANA (2016-2018)
Introduced state-of-the-art hardening to the SAP HANA product (e.g. stack protector, RELRO/BINDNOW, ASLR) by upgrading our build and by testing the adherence of the third-party components that were consumed. Also introduced hardening to those components where possible.
Technologies: binutils, GCC, Python
Role: Lead
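
As an illustration of what an adherence check for a component can look like, here is a minimal Python sketch. It is not the tooling used in the project: the function name `check_hardening` and the heuristics are assumptions for this example. It inspects text produced by GNU `readelf -W -h -l -d --dyn-syms <binary>`.

```python
import re


def check_hardening(readelf_output):
    """Heuristically detect hardening features in readelf output (hypothetical helper)."""
    text = readelf_output
    # A GNU_RELRO program header means parts of the data segment become read-only.
    relro = "GNU_RELRO" in text
    # Immediate binding is flagged as a BIND_NOW entry, in FLAGS, or as NOW in FLAGS_1.
    bind_now = bool(
        re.search(r"\(BIND_NOW\)|\(FLAGS\).*\bBIND_NOW\b|\(FLAGS_1\).*\bNOW\b", text)
    )
    return {
        "relro": relro,
        "bind_now": bind_now,
        # Full RELRO needs both, so the GOT can be made read-only after startup.
        "full_relro": relro and bind_now,
        # ELF type DYN: shared library or position-independent executable,
        # which lets ASLR randomize the load address of the image.
        "pie_or_shared": bool(re.search(r"^\s*Type:\s+DYN\b", text, re.MULTILINE)),
        # Heuristic: protected code references __stack_chk_fail.
        "stack_protector": "__stack_chk_fail" in text,
    }
```

The checks are deliberately coarse. For example, a statically linked or stripped binary may use the stack protector without exposing `__stack_chk_fail` as a dynamic symbol, so a negative result is a prompt for a closer look rather than proof.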

AddressSanitizer Testing (2013-ongoing)
Ported the codebase to build with the clang compiler, which made the usage of additional QA tools possible. Established AddressSanitizer testing for SAP HANA. In the past 6 years, 1300 bugs were detected, leading to a markedly improved quality of the product and much time saved on debugging issues.
Technologies: C/C++, clang toolchain, AddressSanitizer
Role: Lead

Trusted Code Reviewer (2015-ongoing)
Reviewing of critical fixes destined for release branches, across the whole engine, to avoid regressions.
Technologies: C/C++, Gerrit, Bugzilla

Loadster (2013-2015)
Custom loading tool for large-scale data warehouse migrations from other databases to SAP HANA. This tool ran on a scale-out blade landscape and migrated petabyte-scale data to the internal format of HANA. Speed was crucial in order to migrate large warehouses with minimal downtime.
Technologies: C/C++, VTune, bash
Role: Lead developer for C++ part

BW In-Memory-Optimized Cube Conversion (2011-2013)
Custom procedure to very quickly convert data warehouse cubes from the old format to the in-memory format of SAP HANA. Lots of optimizations made to be able to do this with minimal disruption to the customer. Also lots of tests written to make sure this conversion is safe, since low-level functionality on persistency level had to be used to make it fast.
Technologies: C/C++, Python
Role: Lead developer for C++ part

Column Store Dictionary (2007-2011)
Implementation of central dictionary data structures and algorithms for the SAP HANA column store. Focus was on optimal compression for small representation in memory, and fast access. Interesting takeaway (at that time ;): The smallest representation in memory often is the fastest, too.
Technologies: C/C++, Python, VTune
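
To illustrate the general technique, here is a minimal Python sketch of dictionary encoding with bit-packed value IDs. It is not SAP HANA's implementation: the function names and the use of a single Python integer as the bit vector are assumptions for this example.

```python
from bisect import bisect_left


def dictionary_encode(column):
    """Replace each value by its index in a sorted dictionary and bit-pack the indexes."""
    dictionary = sorted(set(column))
    # Each value ID needs only enough bits to address the dictionary.
    bits = max(1, (len(dictionary) - 1).bit_length())
    packed = 0
    for row, value in enumerate(column):
        value_id = bisect_left(dictionary, value)   # binary search in the sorted dictionary
        packed |= value_id << (row * bits)
    return dictionary, packed, bits


def value_at(dictionary, packed, bits, row):
    """Decode one row: extract its value ID and look it up in the dictionary."""
    value_id = (packed >> (row * bits)) & ((1 << bits) - 1)
    return dictionary[value_id]
```

A column with three distinct city names needs only 2 bits per row instead of a full string per row, so more rows fit into each cache line during a scan. This is one plausible reason why the smallest representation is often also the fastest.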

OLAP Engine
Implemented features in the OLAP engine of SAP HANA, including the count distinct operator and high-performance expression evaluation during aggregation based on SAP HANA's language.
Technologies: C/C++, Python, VTune
Role: Senior Developer

BO Explorer Integration (2011-2013)
Implemented the integration of SAP HANA with the Business Objects Explorer frontend and provided optimized query processing functionality.
Technologies: C/C++, Python
Role: Senior Developer

Various topics


Side Projects

Glacier Deep Archive Backup (2021-2022)
Backup solution that uses S3 Glacier Deep Archive for storage. This is the most cost-effective cloud backup storage I'm aware of ($1/TiB/month) and also very reliable (99.999999999% data durability, data spread over 3 Availability Zones). The use case is full, off-site, encrypted backups that are only retrieved after a catastrophe (e.g. your house burns down, and RAID and local backups are gone). Available on GitHub: mrichtarsky/glacier_deep_archive_backup. Featured on Hacker News.
Technologies: Python, bash, ZFS, AWS, AWS CLI, gpg, zstd
Role: Lead Developer

Ergodox Tools (2022)
ergodox-compress-keymap: Compress keymaps to fit more code onto the keyboard
ergodox-heatmap: Generate heatmaps to see which keys are used most often and optimize the layout.
Technologies: Python, SVG, bash
Role: Lead Developer

mailbox.my - Your personalized email address (2021-)
mailbox.my is a service where you can get an email address on any of 150+ domains, so that you can have a nice address like john@mymy.email instead of john233212332@gmail.com.
Technologies: Python, Linux, Apache, postfix, SASL, Dovecot, Flask, MySQL, Payment APIs (Braintree, Stripe), certbot, REST, SPF, DKIM, DMARC, SRS, Fomantic UI, HTML, CSS, JavaScript, DNS, bash, video editing, Text-to-Speech, Google Cloud
Role: Lead Developer

filedb - File deduplication tool (2020)
The tool indexes provided file system locations, checksums the files and is able to show and prune duplicate files and directories, while also being able to drill up to the uppermost common directory. Available on GitHub.
Technologies: Rust
Role: Lead Developer
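
The core idea of checksum-based duplicate detection can be sketched in a few lines. The following is a Python illustration, not the Rust code of filedb: the function names, the SHA-256 checksum and the size pre-filter are assumptions for this example, and duplicate directories are not handled.

```python
import hashlib
import os
from collections import defaultdict


def file_checksum(path, chunk_size=1 << 20):
    """Checksum a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(roots):
    """Return groups of paths with identical content below the given directories."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    by_content = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        for path in paths:
            by_content[(size, file_checksum(path))].append(path)

    return [sorted(paths) for paths in by_content.values() if len(paths) > 1]
```

Grouping by file size first avoids reading files that cannot have a duplicate, which matters on large file systems where checksumming dominates the run time.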

PostgreSQL Median Extension (2019)
Implemented an aggregate function that can calculate the median for most of PostgreSQL's data types, including mean calculation. Properly works for large values that PostgreSQL cannot handle in its AVG() function. Supports window aggregates and large data sets via tuple sorter. Large test suite.
Technologies: C, PostgreSQL, SQL, make
Role: Lead Developer

High-Frequency Trading Analysis (2018-)
Project consisting of several parts:

Technologies: Python, Flask/Dash, WebSockets
Role: Lead Developer

University Projects

FemFind (1998-2001)
SMB/FTP file search engine for a university campus network, featuring a web interface and a Windows client.
Technologies: HTML/CSS, Perl, CGI, Linux, Win32 API, Socket API, FTP, SMB protocols

allen.iverson.net (1998)
Basketball fansite. My first public website. Archived version available. This includes my first real use of client-side JavaScript for "interactive" calculation of player statistics.
Technologies: HTML/CSS, JavaScript, Web Design

School Projects

Galton Board (1997)
Graphical simulation of a Galton board, with added features like background graphics. Ran on an Intel 80286.
Technologies: Turbo Pascal with inline assembly
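
For readers unfamiliar with the Galton board, here is a minimal text-mode sketch of the simulation in Python. It is not the original Turbo Pascal program: the function names and the ASCII rendering are assumptions for this example.

```python
import random
from collections import Counter


def simulate_galton(rows, balls, seed=None):
    """Drop balls through `rows` rows of pegs and count how many land in each bin."""
    rng = random.Random(seed)
    bins = Counter()
    for _ in range(balls):
        # At every peg the ball goes right with probability 1/2;
        # the number of right turns is the index of the final bin.
        position = sum(rng.random() < 0.5 for _ in range(rows))
        bins[position] += 1
    return [bins[i] for i in range(rows + 1)]


def render(bins, width=40):
    """Draw the bin counts as a horizontal bar chart."""
    peak = max(bins) or 1
    return "\n".join(
        f"{i:2d} | {'#' * round(count * width / peak)}" for i, count in enumerate(bins)
    )


if __name__ == "__main__":
    print(render(simulate_galton(rows=10, balls=5000, seed=1)))
```

The bin counts follow a binomial distribution, so with enough balls the bars approximate the familiar bell curve.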

K-B00M (1997)
Video game that runs on an Intel 80286. Atomic Bomberman clone. Features a short video of a fish. Used compression to fit in the limited memory of the computer/on the floppy disk.
Technologies: Turbo Pascal with inline assembly
Source | Video

Kaospaint (1996)
Drawing program comparable to MS Paint. Code for drawing primitives optimized in assembly to make it run smoothly on slow computers.
Technologies: Turbo Pascal with inline assembly