Projects
All in no particular order.
Projects - IT
Theses
Bachelor
My Bachelor’s thesis (2019) was about native language identification in tweets, that is, given an English-language tweet written by a non-native English speaker, detect their first (native) language.
My Kyiv Polytechnic Institute Bachelor’s thesis was about using Shamir’s Secret Sharing to hide information in the whitespace of multiple HTML pages, with a browser extension that recovers the data once the user has browsed through enough of the pages.
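For readers unfamiliar with Shamir’s Secret Sharing, here is a minimal sketch of the general idea: a secret is split into shares so that any `threshold` of them recover it. This is not the thesis code. The names `split_secret` and `recover_secret`, the choice of prime, and the integer encoding of the secret are all assumptions made for illustration, and the whitespace-encoding and browser-extension parts are not shown.

```python
import secrets

# Assumption for this sketch: arithmetic in the field of integers modulo
# the Mersenne prime 2**127 - 1. The secret must be an integer below it.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares, prime=PRIME):
    """Split `secret` into `num_shares` points; any `threshold` recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must be in [0, prime)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

In a setup like the one described above, each page would carry one share, so with `split_secret(s, 3, 5)` any three of five pages would be enough for `recover_secret` to return `s`.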
Eval-UA-tion: a Benchmark for Evaluating Ukrainian Language Models
A benchmark for evaluating Ukrainian language models, my Master’s thesis.
- Repository: https://github.com/pchr8/eval-UA-tion
- Thesis
- PDF: On GitHub
- Thesis defense presentation: https://serhii.net/F/MA/presentation/
- Paper
- Python packages born along the way:
- pchr8/pymorphy-spacy-disambiguation: A package that picks the correct pymorphy2 morphological analysis based on morphology data from spaCy
- pchr8/ukr_numbers: Converts numbers (3) to Ukrainian numerals (третій/три/третьому/третьої) of the correct type (ordinal/cardinal) and in the correct inflection
- pchr8/up_crawler: Script that downloads articles from Ukrainska Pravda
Other stuff
The work log on this website is my longest-running and actually most useful project: it makes my life much better and has hopefully helped some other people.
Otherwise I did a lot of very random IT-adjacent stuff, mostly to solve problems I myself had (such as a mirrored left-hand Dvorak keyboard layout).
I once created a shorthand system, then tried to optimize it with genetic algorithms, then thought about how interesting it would be to come up with an algorithm that generates alphabets based on requirements/medium/messages, which became my first published paper.
Here are the (Russian) slides from a very informal privacy talk I once gave to a very lay audience, about why one should care about privacy at all.
Projects - not IT
From time to time I wrote stuff, sometimes I wrote poetry, sometimes I translated poetry.
I also like drawing (pencil and vector graphics).