Took me two 160-page spiral-bound notepads to get through my first four terms, as well as countless leads for my mechanical pencil. (I love that pencil, and it’s got two elements of irony attached to it: It was originally owned by my wife’s first husband, who was an engineer, and it’s called “PhD”. He didn’t have a PhD. I do. But I am only now using it when I’m no longer active in the profession I have that PhD in. Which is history. Pun intended.)
Where was I? Oh yeah, notepads. I bought my third in the middle of the fifth term. My practicum partner warned me I would no longer need it. She was right. For two years I had used the notepad and the pencil more than my laptop computer. All those math and formal CS courses with their countless exercises, tests, practicum assignments, and mock exams. All those modelling exercises in programming and software engineering, drawing domain models or class and sequence diagrams. All those study projects done in groups with other students. They all needed sketches, lists, scribbled drafts to illustrate ideas and get on the same page. Mind you, I’m not even using my notepad for actually taking lecture notes–those go on digital cribs or flashcards. Yet in all those years I could have forgotten to bring my computer without any great harm, but I couldn’t have done without my notepad. And now I must actively remember to pack it. In the past four months or so I used maybe two pages.
Things have changed, truly. We’re hardly here anymore. The term began nine days ago and I almost don’t notice. In fact, right now I’m sitting in the cafeteria with a cup of coffee, as I did so many times before, and know that, with a full term left to go, it will be one of the last times I’m doing this. I have a lecture on Thursday and a compulsory choice module, lecture plus practicum, every other Tuesday. Both start at 8:30 a.m., so no quiet study mornings in the cafeteria. The only thing that’s taking place in the afternoon is the IT security practicum, four times on Monday. Other than that I’m here only three days in two weeks. In fact, right now I’m only here because I have a meeting with my thesis supervisor in a couple of hours.
Speaking of the thesis, one might say it’s going well, after a fashion. I have been busy the past couple of weeks collecting the data I’m going to use and making them programmatically available. I.e., I have been writing the basic blocks to get started, small tools to download values from public APIs, or to read data I have downloaded as spreadsheets or CSV files (comma-separated values, i.e. text files), or from a PostgreSQL database into in-memory data structures.
That was a veritable crash course not primarily in data science basics, but rather in real-world Scala. Doing small exercises in a new programming language is a piece of cake after the first 10 languages. Scala is officially my 13th language, and after two weeks my core Scala was probably better than my Java was after two years. But as soon as you get your hands dirty with those necessities without which you can’t do any real stuff, such as I/O, HTTP, REST, database access, asynchronous calls, and so on, you stumble into the underworld of various sparsely documented libraries and frameworks that all do things slightly differently from each other and from other languages. Not to mention, in the case of Scala, they force you to wrap your head around doubtlessly highly useful and idiomatic, but nevertheless confusing and intuitively rather impenetrable language constructs such as Futures and Options. And mind you, I haven’t actually worked with a database since Programming II, and haven’t written any SQL since the databases exam, both nearly two years ago.
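For readers who have not met them, here is a minimal sketch of what Options and Futures look like in plain Scala, using only the standard library. It is an illustration only, not code from the thesis project: the object name `OptionFutureSketch`, the `rates` map, and the functions `lookup` and `lookupAsync` are all made up for the example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object OptionFutureSketch {
  // Hypothetical in-memory data standing in for values fetched from an API.
  val rates: Map[String, Int] = Map("EUR" -> 100, "USD" -> 110)

  // Option: the result is either Some(value) or None, never null.
  def lookup(currency: String): Option[Int] = rates.get(currency)

  // Future: the lookup runs asynchronously and the value arrives later.
  def lookupAsync(currency: String): Future[Option[Int]] =
    Future(lookup(currency))

  def main(args: Array[String]): Unit = {
    // map only touches the value if there is one; getOrElse supplies a default.
    val known = lookup("USD").map(_ * 2).getOrElse(0)
    val unknown = lookup("XYZ").map(_ * 2).getOrElse(0)
    println(s"known=$known unknown=$unknown")

    // Blocking with Await keeps the demo short; real code would rather
    // chain map/flatMap on the Future instead of waiting for it.
    val result = Await.result(lookupAsync("EUR"), 2.seconds)
    println(result)
  }
}
```

The confusing part is mostly that both types wrap the value you actually want, so the wrapper has to be dealt with (via `map`, `getOrElse`, pattern matching, or waiting) before anything useful can be done with the content.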
In short, I’ve been making headway surely, but rather slowly. Even properly parsing a JSON string took me almost a day. But I think with a few more days I’ll be able to actually have all the data available and start simulating the “synthetic” (fake) data I’m going to use for the actual machine learning model, in lieu of the real data I didn’t get. (Mind you, the saving grace of that failure is that I won’t need a confidentiality agreement between UAS and my employer.) And that means I will have concluded one of the four or so major steps of my project.
The “rest”–which means all I originally meant to do–is still plenty and scary, but will be manageable. I do think. Step two is setting up the tech stack, and I’m at least clear about the basics there. I have already once successfully connected Scala with Spark and Spark with a database at least on my local machine. And I have run a small neural network example with my chosen library (DL4J) in Scala. (I emphasize in Scala because as the ‘J’ indicates, DL4J is a Java library, and I failed signally to get the WIP-Scala wrapper to run. The current version is a snapshot my build tool refused to load from the repository, and the latest stable version is worlds apart from the code examples that are the only documentation provided. But since you can call Java libraries from Scala projects, it’s just as well. Only not quite so elegant.)
It will still be a challenge to get all this wired together using the cloud. I haven’t yet used Spark from DL4J, although they do provide wrapper classes for doing so, and I am still waiting for access to my Kubernetes (container framework) cluster so I can use it for distributed calculations in Spark. I expect a tough time getting all this to work, particularly since I’m by no means a dev-ops person. In fact containers still scare me by their being so remote. I am constantly toying with the thought that I may just be making this way too complicated and should rather just rely on familiar Python with its plethora of intuitive data science libraries and do it all on my local machine again. But a major incentive of the project, for me, is in fact combining my machine learning experience with the superior performance of the Java Virtual Machine and the resources of cloud computation. At least that’s what makes it at all attractive to my employer. We’re not data scientists, but consultants. If machine learning, let alone artificial intelligence, is ever going to be interesting to our customers, it has to be industrial-strength, reliable, and fast. That’s the Java ecosystem, not Python. So there.
Step three is going to be the actual machine learning–using the data on the tech stack to come up with an accurate neural network. Step four will be making that network available for predictions, probably as a web service. But since the network will be trained on fake data, there is very little point in actually implementing this. I might as well just hint at the ways of doing so, in the thesis.
For in the end, the main purpose of the project is actually writing the thesis. I must not lose sight of that. All the programming is just a means of finding something to write about, and of satisfying my employer that investing all that money into my project is, at least in principle, worthwhile.
And in fact on that front my betters at work dropped a couple of bombs in the past few days. First my boss’s boss came to tell me I’d better consider right now how the expertise I am about to gain in my thesis project could eventually be “monetized” (his actual word) by our company. If we–that is, I–didn’t have a plan for that well before I completed the thesis and started working full-time for them, I might just get snatched up by the first CR (sales) person who managed to match one of my areas of technical expertise to a customer’s requirement. That sounded scary, but then I felt, two weeks into my six-month thesis, quite overwhelmed by his suggestion that I had better prepare a complete concept for establishing a broadly-based machine-learning competence in our company, complete with ideas about which customers to offer it to and with which arguments. Hello? I’m just the student. Even if I do know a little machine learning, what makes me an expert in selling it?
And he wasn’t kidding. Barely a couple of days later, an enthusiastic colleague from sales stood at my desk trying to send me to Hannover (of all places; I’d rather be buried in Eastern Siberia) because he had seen the magical word Scala on my internal profile. I told him I had other priorities right now, and for the next five months as well. He was not deterred. And after? Let’s see what my superiors have in store for me, I said. Oh, I’ll have a say in that, he promised, or threatened? Not so, said my bosses; CR could make suggestions, but it was still the teams and business units who decided. A small relief.
And then last Friday my team manager announced that he was quitting the company in June to work closer to home (he has a family quite a distance from Hamburg). Remember, he was supposed to be the second, but in fact actual supervisor for my thesis. Now that’s not the primary problem, because coincidentally on the same day UAS figured out that our study regulations were in conflict with state law and we could no longer have external supervisors (from the industry rather than university) at all. So I’ll get an internal second supervisor, and in fact that’s what my meeting with my professor today is about.
But the real shock for me is that after just over half a year I am losing the guy who recruited me into the company and who has always been extremely loyal and supportive, smoothing my start in a real-world profit enterprise and shielding me from tries by other managers to employ me for their projects (rather than my bachelor thesis). I will sorely miss that protection.
To make up for it, I think, on the next day I was called into my boss’s boss’s office and handed a rather surprising and rather early contract offer for next fall. My company intends to employ me as a full (rather than junior) consultant, in recognition of my life experience rather than my professional experience, and with a corresponding salary that’s really OK for a career entrant, albeit a full third lower than what I earned last in my previous job. But that’s alright because that salary (for a research position!) was out of touch with the real world.
I was flattered, but also intimidated. Trying to imagine the company with a different boss, or none at all (if they don’t find a replacement in time we’ll be under the business unit leader directly for the time being, which is never a good idea–there’s a reason there’s a hierarchy), faced with the prospect of finding my way alone in a yet rather unfamiliar professional world, and with the bachelor thesis hardly started. In fact not even knowing yet whether I even want to be a consultant. And then with the request to come up, on my own, with a concept for an entirely new area of expertise for the company. Come on, guys, I’m just a scared undergraduate student, and a PhD that’s old enough to vote in an irrelevant subject isn’t going to help me in figuring this out. It’ll take me quite a while to get used to the size of this.
And that’s what’s on my mind right now. Sitting in a classroom with a few of my remaining co-students, as I did yesterday, hearing for the third time the principles of software testing (our compulsory choice module is a preparation for the Certified Tester certification exam), feels like a digression in comparison. Which is a pity, because I loved this course of studies and would have loved a chance to have a relaxed final term. Instead my boss’s boss made it crystal clear that they wanted me earlier rather than later, so I had better get on with the thesis. Get it out of the way, in fact. Oh, but of course, it still should be excellent, he added as an afterthought.
I should have worn my favorite T-shirt that day. It says “People Under Pressure Don’t Think Faster.”