A happy 7-day break for the Chinese New Year of the Ox

This year’s Chinese New Year (CNY) break was a little different for us. Because Beijing, and the government in general, discouraged people from returning home (due to COVID) and promoted “celebration in situ / 就地过年”, many people, including me, decided to postpone our trips home.

I ended up putting the time into some good reading, coding, and video watching. So here’s a short note on what I did.

CS193p

I learned about CS193p from Hacker News one day before the break, and immediately put it on my to-do list. I learned SwiftUI last year, but since it was still in beta, I found myself unable to do many things as easily as in UIKit (maybe due to a lack of good documentation). The tooling also sucked: I had to switch between Xcode 11 and the Xcode 12 beta, which was a huge pain when you had to download 20 GiB of binaries and start over whenever a download failed.

The instructor, Paul Hegarty, keeps a very low profile on the Internet, as mentioned in the comments. But he is indeed an OG when it comes to Apple’s ecosystem (a VP at NeXT?). I found his teaching style very sharp and enjoyable. He speaks very clearly and at a moderate pace. (I nevertheless turned on YouTube’s 1.75x speed, which was only possible because of my general familiarity with the material.) The video recording of his live coding feels unbelievably fluent and natural: you can relate even when he makes mistakes or typos, the same ones you would have made on your own, and that moment of exploring the error is gold; it is when and how you actually learn. It’s basically like having a great coder as a teacher inside your head, overseeing you as you translate thoughts into code.

I confirmed with a friend from Stanford that he was the same instructor during the iOS and UIKit era, which makes this popular course even more valuable. You can feel it when an instructor truly loves the subject and tries to keep up with the latest developments: as of April 2020, he could fluently explain not only the best practices of SwiftUI during its beta phase, but also some quirks and bugs that were soon to be fixed. From this I’ve also learned what craftsmanship means.

Chamath Palihapitiya

After CS193p, which took around 3–4 days, I spent the rest of the week on video clips. Chamath gave a talk at GSB in 2017, where he explained some of his philosophical ideas and his opinions on social networks and their impact on the new generation. I found him a candid and controversial person, and the latter isn’t necessarily a bad thing. I guess in the venture world, being an outlier is a gift or even a must.

The point in his talk that resonated with me most was that he sees money as an instrument of change, something one can amass in order to project one’s ideology and point of view onto the world. He also said “fuck” too many times in that auditorium; I hope that’s fine with GSB.

Readings

There are some finished and ongoing readings that I’ve found interesting.

  • RAND’s research on China’s grand strategy is really interesting in that a) I found translations like “paramount leaders” really amusing (and only after translation!), and b) this type of institution can provide a view into China that you wouldn’t usually get from Chinese materials and social circles
  • 基于深度学习的生命科学 / Deep Learning for the Life Sciences from O’Reilly, but the translation was really bad: it seems the translator didn’t even know the meaning of “basic” when it appears alongside “acid”, and rendered it as “fundamental” or “primary” in Chinese; nevertheless the material is good and the authors are really well informed, and it led to my interest in Deeplearning.ai’s AI for medicine specialisations

Clubhouse

Lastly, I’ve been listening to conversations on Clubhouse on and off these days. However, due to its low information density I’ll usually just put my AirPods on while walking the dog. Most of the conversations are poorly organized (understandably so), though there were indeed some good ones.

Naval did one spontaneously where he discussed philosophical topics with people like Marc Andreessen, and to conclude he said that he only did this to be on par with Elon Musk, which he was, by reaching 5k listeners. He later explained on Twitter that he failed to properly record it because his iPhone was on vibrate. Nevertheless people uploaded the clips to YouTube, and now it’s available in a very inefficient format: unlike with video, you can’t preview and jump to a specific location in an audio clip.

I guess that’s an interesting problem to solve, since the technology, i.e. speech-to-text, is ready. It’s just the product that isn’t in place yet.

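Should athletes get the COVID vaccine first?

These past two days happen to mark one year since the first reported COVID cases. The Moderna and Pfizer vaccines have entered the final sprint, with efficacy of basically over 90%, and both have started applying for expedited authorization in the UK and the US.

I’ve been listening to Dithering, a paid podcast whose two hosts, John Gruber and Ben Thompson, are both big names in tech. In the previous episode they touched on whether athletes (for example, football team members) should get priority for the vaccine. Ben said it’s a no-brainer: definitely yes.

The show comes out every couple of days, and in this episode they came back to explain and clarify, because the remark had drawn a lot of pushback.

Ben restated his reasoning: given the projected scale and speed of vaccine production and the size of the teams in the few major leagues, producing the doses those teams need takes only a few minutes, so it would not delay healthcare workers by much. Healthcare workers are indeed important, and nobody is saying they shouldn’t have priority, but consider the benefits if athletes get vaccinated. First, major sporting events could at least resume without spectators, so many people would have something to do at home watching games, which helps them stay home. Second, many athletes taking the lead in getting vaccinated would encourage those who are skeptical of vaccines. John laughed and said you didn’t really need to explain that much, because I have confidence in our listeners.

Whether people in general have good judgment on this I don’t know, but at least those willing to pay for their podcast are a filtered group, and are probably willing to hear Ben out. In other settings, though, I suspect it wouldn’t be so simple. Behind this small topic lies a profound one: should we pursue fairness, or efficiency?

From something as small as buying train tickets (say, buying them online) to something as large as the constitution or governing platform of a country or a party, this deep tension shows up everywhere. Often you can’t have both, and different people will even disagree about what counts as fair and what counts as efficient: what you consider fair, I may not, and the truly efficient option may be one neither of us can see.

Suppose there is an efficient plan that would eventually benefit everyone, but that requires sacrificing some people’s interests in the short term and being less fair for a while, “letting some people get rich first”. Would you accept it? If some leader or group steps up and offers to carry out the plan, can you trust them? What if they change course midway? What if the plan they believed to be more efficient turns out to fail, or the risks were known in advance: would people still be willing to take the gamble? If this sounds abstract, think about the tension between climate change and building nuclear power plants, and it becomes easy to understand.

I don’t want to expand on my own views here, because I haven’t read or thought about this deeply enough. What I want to think about is why this problem is so hard.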

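I’ve recently been studying reinforcement learning (materials: Sutton & Barto’s book, the Coursera course, and DeepMind’s course). One very important concept is the discount factor γ, which roughly describes how we trade off the reward now against the expected reward in the future. (Reward here means the payoff for an action, similar to a return or a score.)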

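Don’t underestimate this γ. Many problems that reinforcement learning handles are episodic games such as Go or maze navigation, which have a clear start and end and can be replayed once finished. But many real-world problems have no end, and we need to maximize return over a very long or even infinite timeline (does this remind you of Finite and Infinite Games?). To handle returns over an infinite horizon, the discount factor must be less than 1, otherwise the problem does not converge (on the other hand, in an episodic game γ can simply be set to 1). The smaller γ is, the more we care only about immediate gains and the more “myopic” our planning algorithm becomes; the larger it is, the further ahead we look, but convergence may be slower and the demand on compute higher (because there is much more to look back over).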
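
A minimal sketch of the convergence point above, assuming a made-up constant reward of 1 per step (the numbers are not from any of the courses mentioned). With γ < 1 the discounted sum settles near 1 / (1 - γ) as the horizon grows, while with γ = 1 it simply keeps growing with the horizon. The function name `discounted_return` is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite list of rewards."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


if __name__ == "__main__":
    # Assumed toy setting: a reward of 1 at every step, cut off at
    # longer and longer horizons to imitate a task with no end.
    for horizon in (10, 100, 1000):
        rewards = [1.0] * horizon
        discounted = discounted_return(rewards, 0.9)
        undiscounted = discounted_return(rewards, 1.0)
        print(horizon, discounted, undiscounted)
```

The discounted column stops changing in any meaningful way once the horizon is long enough, which is what makes the infinite-horizon problem well defined; the undiscounted column never settles.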

But γ often has no preset value. It is more of a “hyperparameter”, meaning it takes repeated experiments and adjustments to find a reasonable, efficient value.

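A simple example: a robot has to choose between the road on the left and the road on the right. With a small γ it goes left (live in the moment); otherwise it goes right (delayed gratification). You can even compute the critical value of γ.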
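
The left/right choice can be sketched with made-up numbers, since the rewards in the original example are not given here. Assume the left road pays 1 immediately and the right road pays 3 after 4 steps; all names and values below are hypothetical. The critical γ is the value at which both roads are worth the same today.

```python
# Hypothetical rewards and delays; the original example's numbers are not given.
LEFT = (1.0, 0)   # (reward, steps until the reward arrives)
RIGHT = (3.0, 4)


def path_value(reward, delay, gamma):
    """Value today of a single reward received `delay` steps from now."""
    return (gamma ** delay) * reward


def choose(gamma, left=LEFT, right=RIGHT):
    """Pick the road with the higher discounted value (ties go left)."""
    left_value = path_value(left[0], left[1], gamma)
    right_value = path_value(right[0], right[1], gamma)
    return "right" if right_value > left_value else "left"


def critical_gamma(left=LEFT, right=RIGHT):
    """Gamma at which both roads are worth the same.

    Solves left_reward * g**left_delay == right_reward * g**right_delay,
    assuming the right road pays more and arrives later.
    """
    extra_delay = right[1] - left[1]
    return (left[0] / right[0]) ** (1.0 / extra_delay)


if __name__ == "__main__":
    print("critical gamma:", critical_gamma())
    for gamma in (0.5, 0.7, 0.8, 0.95):
        print(gamma, choose(gamma))
```

Below the critical value the robot takes the immediate reward on the left; above it, the delayed but larger reward on the right wins.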

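So never mind countries, societies, and people: even a tiny robot like this, facing an imaginary game with fixed rules, needs many attempts to settle the trade-off between “now” and “the future”. Human values are not so easy to change, and different people, and even the same person now and in the future, are not the same. Life cannot be replayed, and many major decisions have no way back. How can we face them better?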


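Recent talks

I recently gave two talks at my company. Since they were internal, I can’t share the videos, so I’m posting the written materials here.

The first was about autonomous driving, mainly the problems that need to be solved and the challenges it faces.

The Notion link is https://www.notion.so/jiayul/Autonomous-Driving-Quick-Tour-3caaac13aa64430fa82f9f29b0660bfa

The second was at the Air Reading Group, where I presented a book I read a few years ago, “Crossing the Chasm”.

The slides are at https://crossing-the-chasm.now.sh/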

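Career development and personal growth for engineers

On Friday (July 31) I took part in a public online talk organized by a colleague, which was also live-streamed on Bilibili.

The topic was “career development and personal growth for engineers”, but I approached it mainly from a meta-thinking angle, because I don’t believe one person can hand another particularly correct conclusions about their growth and development. Those conclusions and decisions have to be made by each person themselves.

I’ve put the slides for the talk at https://growth.jiayul.me for future reference.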

How I passed Google’s TensorFlow certificate

I passed the test for Google’s TensorFlow developer certificate back in April and got the certificate a few weeks later. This is a relatively new addition to Google’s developer certification program, and I would love to share a bit more context on it without too many spoilers, hoping it’ll be useful to anyone who’s interested in taking it.

Motivation 

As an engineering manager, neither my own nor my team’s daily work involves machine learning per se, so learning TensorFlow and deep learning in general is motivated purely by personal interest. In my career so far, I have not yet been in a position where I had to personally train a model to tackle a problem, but I’ve done some work with Kubernetes that empowers TensorFlow, as well as Coursera courses that involve using deep learning to empower self-driving cars, which explains where my interest comes from.

I’ve also been reading textbooks, taking courses, running open-source Colab notebooks, etc. for the last few years, so I felt I could challenge myself and get my skills tested. Reading the test FAQ, I could see that this is by no means a test of SOTA knowledge and skills, but fundamentals are important, right?

Preparation

Since this is a test, like all other types of test in life, getting prepared is important. The most relevant information is on the official website, specifically the public FAQ document. You should read it at least twice, because it basically covers what to expect in the exam, and also gives a major hint about its shape and form (e.g. the number of questions).

If, like me, you have taken the TensorFlow in Practice specialization by deeplearning.ai, the test coverage should look familiar. If not, I strongly suggest taking that specialization as preparation, or just as an introduction to the new TensorFlow 2.0 API. I spent about 8 hours in total going through the Colab notebooks in the course materials, and that turned out to be the most helpful part.

Taking the test

The test is taken on your own computer (i.e. remotely) with PyCharm and a plugin. I cannot go too much into the details, but an important tip is that basically all the rules for taking tests and interviews apply here: find a quiet and comfortable place with AC power and easy access to water and a toilet, etc.

My initial estimate was that I wouldn’t need the full 5-hour time window, because I was pretty confident that training a few toy neural nets wouldn’t take that long. I say toy neural nets because, if I wore the test designer’s hat, it would be unnecessarily difficult to evaluate people’s work based on very complex models or on small improvements made to basic models, i.e. getting everything working is more important than fine-tuning one model. (This actually applies to real production work too, since establishing a baseline, even if you use just an SVM or logistic regression, is also the first thing to do.)

But I was wrong. It turned out that I had underestimated all the possible places where my code could go wrong. I believe I wasted more than 3 hours on things like:

  1. Fixing a cryptic exception thrown from TensorFlow internals because I didn’t correctly specify the input dtype (or do proper image reshaping); the error message wasn’t helpful at all
  2. Failing to supply the correct combination of params to model.fit with an image data generator, so each epoch took too much time (I was using my MacBook Pro, so there was no GPU)
  3. Trying to upgrade my model from a single LSTM layer to two, only to see my score go down from 3/5 to 2/5 (there’s no other feedback during the test, so you can’t do error analysis), after which I could never get back to the original score for some reason
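
One generic thing worth checking when epochs over a batch generator take too long is the number of steps per epoch, which Keras `Model.fit` accepts as `steps_per_epoch`. The sketch below is not taken from the test, and the text does not say that this was the actual cause; the dataset sizes, the batch size, and the helper name `steps_for` are all made-up assumptions. It only shows the arithmetic for covering every sample once per epoch.

```python
import math


def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample exactly once."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    # Round up so the last, partially filled batch is still counted.
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    # Hypothetical dataset sizes, for illustration only.
    train_samples, val_samples, batch_size = 2000, 1000, 32
    print("steps_per_epoch:", steps_for(train_samples, batch_size))
    print("validation_steps:", steps_for(val_samples, batch_size))
```

A step count far above this value means each epoch loops over the data several times, which is one plausible way for an epoch to become slow on a machine without a GPU.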

Luckily, after using up all the time, I was able to pass the test! I really enjoyed the process, since all the test content was within my expectations and there were no surprises, meaning all my past learning and preparation really meant something. I did run into several problems, but they taught me a good lesson on how things can actually go wrong in real ML work.

What’s next

After I finished the test, the result came in within an hour or two, and I was then prompted to list my information in the developer directory. I imagine this would be useful for people hunting for ML jobs, but for me it serves more of a social purpose. The certificate arrived more than two weeks later, but it has no other meaning than being a reminder to yourself.

This is by no means the end of the journey, as the criteria and content covered in this test are really just fundamentals. It serves as a checkpoint, and as a starting point for learning something more SOTA. For me the single biggest takeaway is that, during my preparation, I found out how useful Colab is and got used to working in it by default! In fact, I believe most people would be well served by working on Google Colab without buying any physical GPUs: you get data-center-level network speed with zero-setup GPU/TPU support, basically for free. Unless you want to train your models for days, e.g. retrain BERT from scratch, which you most definitely shouldn’t, it is the more efficient choice.

So, is it really worth it?

To wrap up, I want to explore the question of whether it is really worth it; after all, there’s a US$100 fee.

For people looking for ML jobs: you can think of it as similar to buying a LinkedIn premium account. You’ll get more exposure and save some market discovery time and cost, but then again it’s you that they are looking for, the whole package. So getting the certificate won’t buy you a job, merely a foot in the door, maybe.

Other than that, I find it a good excuse to push yourself to study the materials, which is the truly meaningful part. If you have already done that, I wouldn’t worry about not taking the test.