- UCLA M.S. Computer Science
- Android Developer
- Freelance IT
or, thoughts from a random graduate student
I've been working at Google this summer as an intern, which is part of the reason why there haven't been any updates to any of my things. (This blog, MicDroid, etc) They say Google is a place that engineers disappear into, and are not heard from again, and that seems somewhat true for me this summer. The other part of it is simply that my life has become busy, and I've had to make some sacrifices in what I spend my time on.
If there's one thing that's happened fairly frequently, it's that people are interested in what goes on at Google. Today I'll attempt to talk about some of the things I found interesting.
Please note: these are my personal views, they are not meant to be official announcements of any sort, nor are they in any way endorsed by Google.
Let's begin with what I've learned this summer at Google.
Next let's talk about how things are done at Google.
As expected of the engineering-driven culture at Google, the toolchain for working in the main Google source tree is pretty heavily developed.
Infrastructure at Google is actually quite interesting too. Having to deal with a lot of machines also equates to having a good amount of tools to deal with them as well.
In addition to being known for engineering prowess, Google is also known for being an interesting place to work.
I'm definitely going to miss working at Google (and not just for the free food!).
There are a few things that I did find irritating while working at Google though. Here are some.
Overall, Google has been great to me this summer, and I really think I would enjoy working there in the future (provided this post doesn't disqualify me, oops).
I'd like to take a moment to talk about doing research. I'm currently working towards obtaining my MS degree, which entails either completing a thesis, or passing a (difficult) test. I think the thesis route is generally more useful as a whole, even though it's so much more scary, which is pretty much why I'm headed that direction. I'm theoretically doing research in multicore systems/parallel languages, although it really is difficult to come up with a testable, unique idea. Of course, if that wasn't the case then everyone would be doing it.
In any case, recently I've been reading a lot about PGAS languages, particularly UPC, and the runtimes associated with it. The main component that enables performance on distributed systems is RDMA functionality, which in the Berkeley UPC case, is supplied by GASnet, a communication library that abstracts the underlying communication fabric (Infiniband, Myrinet, Ethernet, etc) for high performance remote memory access. Now, where things get interesting is that the HPC community seems to have agreed to some extent that the SPMD/PGAS model is an interesting candidate for future computation, most prominently due to CUDA, as NVIDIA's graphics cards have enabled some incredible numbers on the latest TOP500 listings. Since today's commodity chips actually derive quite a bit (more than you expect) from yesterday's supercomputers, any sort of interesting functionality enabled by the communication network of today's supercomputers should be applicable to a future many-core (32+) chip. The key point here is that although the appearance of the system is quite different (think Jaguar vs TILE64), the architecture is really quite similar, because as the number of cores on chip increases, the need for a sophisticated communication network, very much like those found in today's massive supercomputers, becomes much more apparent. It is here that my research efforts are currently directed.
While all this is well and good, my problem is that I'm still lacking "the idea", although I have a suspicion that is due to me searching for some grand unifying theory of many-core systems as it were, instead of concentrating on finding some salient point of interest and digging from there.
I didn't expect to be this busy with stuff this quarter, class projects and such have been taking up all my time, and I STILL haven't gotten around to finishing the resample/background instrumentals functionality for MicDroid.
So here's a short list of what I HAVE been working on this quarter:
1. UCLA CS 259 project: using high level synthesis tools like Autopilot to generate a custom FPGA for accelerating medical imaging benchmarks.
2. Summer internships and jobs: I've interviewed with (or am still interviewing with) Google, Intel, Amazon, Adconion, Quantcast, Riverbed, and a startup out of downtown LA.
3. UCLA CS 180: yes, the algorithms class I didn't do so well in as an undergraduate class is back!
As an undergrad majoring in Computer Science, the one class that will get you a job is going to be your algorithms class. It's also one of the few classes I didn't do so well in as an undergrad, which is why I've got to take the course again as a graduate student. That said it doesn't help that it feels like a remedial course this time around. Still, the timing was incredibly useful since there is a lot of interviewing to be done this quarter, so reviewing things like hash tables, stacks, tree traversal, sorting, and big-O notation in class was a great help.
Big-O notation surprisingly is by far the subject that comes up the most, since every interviewer ever will ask for the (typically time, sometimes space too) complexity of every algorithm you design during an interview. Reviewing average-case complexity for things like hash tables and quicksort is also worthwhile, since those questions do (and did for me) come up.
Next up is tree traversal, which is surprisingly common, most typically in the context of search, pointers, and recursion. The most common questions I've heard this year ask for either tree comparison or search. Usually this involves designing a tree structure, including nodes using pointers, or references, followed by implementing a recursive search implementation of depth-first search. The best advice I have for this is to keep things simple, and be very familiar with your language of choice.
Sorting is a perennial favorite for interviews, since sorting algorithms cover the entire range in time and complexity. But for an interview, there's no need to know all the sorting algorithms in existence, just pick one simple one with linear time complexity, and another with O(log n) complexity, and that should be enough. It's important to be able to re-implement your chosen algorithms from scratch on the fly. I've had a few interviews where I had to do this as part of a larger question, so make sure to practice!
The last, most interesting point I've noticed about interview questions is the frequent occurrence of hash tables. There are many cases where the question asked has structure such that comparison and lookup are linked very tightly, a perfect application of hash tables. A typical example question will be something like finding duplicate elements in a list, or anything that is attempting to uniquely identify some subset of data. This is where the comparison+lookup structure of the program lies. When the interviewer is unsatisfied with the time complexity of the problem, I have found they are frequently looking to speed up some part of the algorithm through use of a hash table. Just be certain that it is completely clear a hash table lookup is average case constant time, and worst case linear time.
My interview experience had 90% of the questions covered by the topics I've posted about above. The other 10% of questions tended to be more in-depth questions that link a few different topics together. I'm looking forward to finishing up my interviews and finding out where I'll be headed this summer. But for now I plan to enjoy my break, and tentatively finish up some long overdue work.
In the undergraduate UCLA CS curriculum, the most notorious courses are the OS course and the compilers course. As the most difficult undergraduate courses offered by UCLA, everyone tries to avoid them as well. That's a rather sad state of affairs now, especially since compilers is no longer a required course for CS undergraduates. In any case, it's the OS course I'd like to talk about today.
The meat of the undergraduate OS course at UCLA involves writing a basic shell, ramdisk driver and filesystem for use with a (minimal) Linux environment. A lot of the boilerplate code that is responsible for the kernel interface is written for you (the student), so students are expected to provide the main functionality of the driver/filesystem, while the basic shell is almost implemented from scratch using fork() in combination with execvp(). I have to say that this was one of the best courses I ever took at UCLA, and I learned quite a lot from it, even if my grades didn't always reflect the situation. It's not just that I enjoyed the course content, it's also the fact that the course content will teach you why your system behaves the way it does. For those of us who are in CS to learn how to program so they can go on to get a nice Java or C# job somewhere, it pays to know this, because despite all the abstractions in between your web app and the operating system, eventually those abstractions will fail, and the underlying system will be exposed, and you will need to know how to solve that problem. It's very similar to the whole concept of owning a car (computer) and not knowing how the car (computer) functions. Most of the time you will be fine; when you turn the wheel left, the car will follow, when you brake, the car slows, and so on. But when you subject the car to heavy braking and are surprised the pedal kicks back at you because you are not aware how ABS works, then you are in trouble. For that reason, CS students, just like professional drivers need to know how the underlying system works.
I'm now taking the graduate level OS course, and it is a world of difference from the undergraduate course. So far the centerpiece of the course has been the virtual memory system, and the x86 memory layout. The coursework expects students to be responsible for far more extended functionality (debuggers, multi-stage bootloaders, etc) in addition to the required functionality.
It's a lot of time and work, and I wouldn't have it any other way.