This year's Chinese New Year (CNY) break was a little different for us. Because Beijing, and the government in general, discouraged people from returning home (due to COVID) and promoted "celebrating in place / 就地过年", many people, including me, decided to postpone our trips home.
I eventually put the time toward some good reading, coding, and video watching. So here's a short note on what I did.
I learned about CS193p from Hacker News one day before the break and immediately put it on my to-do list. I learned SwiftUI last year, but since it was still in beta, I found myself unable to do many things as easily as in UIKit (maybe due to the lack of good documentation); also the tooling sucked (switching between Xcode 11 and the 12 beta was a huge pain when you had to download 20 GiB of binaries and restart after failures).
The instructor, Paul Hegarty, as mentioned in the comments, keeps a very low profile on the Internet, but he is indeed an OG of Apple's ecosystem (a VP at NeXT?). I found his teaching style very sharp and enjoyable. He speaks very clearly and at a moderate pace. (I nevertheless turned on YouTube's 1.75x speed because of my general familiarity with the material, which is only possible exactly because of that clarity.) The recordings of his live coding feel unbelievably fluent and natural: you can totally resonate even when he makes mistakes or typos, the same ones you would have made on your own, and those moments of exploring an error are gold; that is when and how you actually learn. It's basically like having a great coder as a teacher in your head, overseeing you translate thoughts into code.
I confirmed with a friend from Stanford that he was the same instructor during the iOS/UIKit era, which makes this popular course even more valuable. You can feel it when an instructor truly loves the subject and tries to keep up with the latest developments: as of April 2020, he could fluently explain not only the best practices of SwiftUI during its beta phase but also some quirks and bugs soon to be fixed. From this, I also learned what craftsmanship means.
CS193p took around 3 to 4 days, and I spent the rest of the week on video clips. Chamath gave a talk at Stanford GSB in 2017, where he explained some of his philosophical ideas and his opinions on social networks and their impact on the new generation. I found him a candid and controversial person, the latter of which isn't necessarily a bad thing. I guess in the venture world, being an outlier is a gift or even a must.
The point in his talk that resonated most with me was that he saw money as a method or instrument of change, something one can amass in order to project one's ideology and point of view onto the world. He also said "fuck" quite a few times in that auditorium; I hope that was fine with GSB.
There are some finished and ongoing readings that I’ve found interesting.
Lastly, I've been listening to conversations on Clubhouse on and off these days. Due to its low information density, though, I usually just put my AirPods in while walking the dog. Most of the conversations are poorly organized (but understandably so), while there were indeed some good ones.
Naval did one spontaneously, discussing philosophical topics with people like Marc Andreessen, and concluded by saying he only did it to be on par with Elon Musk, which he achieved by reaching 5k listeners. He later explained on Twitter that he had failed to record it properly because his iPhone was on vibrate. Nevertheless, people uploaded the clips to YouTube, and now it's available only in a very inefficient format: unlike video, you can't preview and jump to a specific location in an audio clip.
I guess that's an interesting problem to solve, since the technology, i.e. speech-to-text, is ready. It's just the product that's missing.
These days mark roughly one year since COVID-19 was first reported. Moderna's and Pfizer's vaccines have entered the final sprint, both reaching above 90% efficacy, and applications for emergency authorization are already underway in the UK and the US.
I've been listening to Dithering, a paid podcast whose two hosts, John Gruber and Ben Thompson, are both big names in tech. In a recent episode they discussed whether professional athletes (e.g. football team members) could be vaccinated first. Ben said it was a no-brainer: of course they should.
Sure enough, in the very next installment of the every-other-day show, they came back to explain and clarify, because the take had drawn a lot of objections.
Ben restated his reasoning: given the projected scale and speed of vaccine production, and the size of the teams in a few major leagues, producing the doses the latter need is a matter of minutes and wouldn't take much away from healthcare workers. Healthcare workers are indeed important, and nobody is saying they shouldn't come first; but consider what good would come from vaccinating athletes. First, major sporting events could resume, at least without audiences, so many people would have games to watch at home, reinforcing the effect of staying in. Second, athletes leading the way on vaccination would encourage people who are skeptical of vaccines. John laughed and said you really didn't need to explain all this, because I have faith in our listeners.
Whether people at large have that kind of judgment, I don't know; but those willing to pay for this podcast are certainly a self-selected group, presumably willing to hear Ben's reasoning out. In other settings, I suspect it wouldn't be so simple. Behind this small topic lies a profound one: should we pursue fairness, or efficiency?
From something as small as buying train tickets (say, online ticketing) to something as large as a country's or a party's constitution or governing platform, this deep tension is everywhere. Often the two cannot both be had, and people even disagree over what counts as fair and what counts as efficient: what you find fair, I may not, and the truly efficient option may be invisible to us both.
Suppose there is an efficient plan that would eventually benefit everyone, but it requires sacrificing some people's interests for a while and being less than fair in the short term, "letting some people get rich first." Would you accept it? If some person or group steps up to carry it out, can you trust them? What if they change course midway? What if the plan they believed more efficient turns out to fail, or the risk was known in advance: would people still take the gamble? If this sounds abstract, think of the tension between climate change and building nuclear power plants, and it becomes easy to grasp.
I don't want to lay out my own views here, since my reading and thinking on this are not yet deep enough. What I want to think about is why this problem is so hard.
I've recently been studying reinforcement learning (using Sutton & Barto's book, the Coursera course, and DeepMind's lectures). One of its most important concepts is the discount factor γ, which roughly captures how we trade off rewards now against expected rewards in the future. (A reward here is the return for an action, something like a payoff or a score.)
Don't underestimate this γ. Many problems tackled by reinforcement learning are episodic games (Go, maze-solving, etc.) with a clear start and end, where you can restart after each episode; but many real-world problems have no terminal state, and we must maximize returns over a very long or even infinite horizon (remind you of finite and infinite games?). Handling infinite-horizon returns requires a discount factor strictly less than 1, otherwise the problem does not converge (in an episodic game, by contrast, you can simply set γ to 1). The smaller γ is, the more we focus on immediate payoffs and the more "myopic" our planning algorithm becomes; the larger it is, the further ahead we look, but convergence tends to be slower and the demand on computational resources higher (since much more has to be accounted for).
But γ often has no preset value; it is more of a "hyperparameter," meaning it takes repeated experiments and continual tuning to find a reasonable, effective value.
A simple example: a robot must choose between a left path and a right path. With a small γ it goes left (living in the moment); with a large one it goes right (delayed gratification). You can even compute the critical value of γ.
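To make this concrete, here is a small sketch (the rewards, horizon, and the two paths are made-up illustrations, not from any actual course exercise) of how γ flips such a decision:

```javascript
// A myopic vs. far-sighted choice: reward 1 now (left) or reward 2
// after a 3-step detour (right). Discounted return G = sum_k gamma^k * r_k.
function discountedReturn(rewards, gamma) {
  return rewards.reduce((g, r, k) => g + Math.pow(gamma, k) * r, 0);
}

const left = [1, 0, 0, 0];  // immediate reward
const right = [0, 0, 0, 2]; // delayed, larger reward

function prefersRight(gamma) {
  return discountedReturn(right, gamma) > discountedReturn(left, gamma);
}

console.log(prefersRight(0.5)); // false: goes left, lives in the moment
console.log(prefersRight(0.9)); // true: goes right, delays gratification
```

The critical point here solves 2γ³ = 1, i.e. γ = (1/2)^(1/3) ≈ 0.794: below it the robot goes left, above it right.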
So never mind countries, societies, and people: even a tiny robot, facing an imaginary game with fixed rules, needs many trials to strike the balance between "now" and "the future." Human values do not change so easily, and different people differ, as does the same person now versus in the future; life cannot be replayed, and many major decisions allow no turning back. How do we face that better?
I recently gave two talks at work. Since they were internal, I cannot share the videos, so I'm posting the written materials here.
The first is about autonomous driving: mainly the problems that need solving and the challenges it faces.
The Notion link is https://www.notion.so/jiayul/Autonomous-Driving-Quick-Tour-3caaac13aa64430fa82f9f29b0660bfa
The second, given at the Air Reading Group, is about a book I read a few years ago called Crossing the Chasm.
The slides are at https://crossing-the-chasm.now.sh/
On Friday (July 31) I took part in a public online sharing session organized by a colleague, which was also livestreamed on Bilibili.
The topic was "career development and personal growth for technical people," but I approached it mostly from a meta-thinking angle, because I don't believe one person can hand another person the right conclusions about their growth and development; those conclusions and decisions need to be made firsthand.
I've put the slides at https://growth.jiayul.me for future reference.
I passed the test for Google's TensorFlow developer certificate back in April and received the certificate a few weeks later. This is a relatively new addition to Google's developer certificate program, and I'd love to share a bit more context on it, without too many spoilers, hoping it'll be useful to anyone interested in taking it.
As an engineering manager, neither my own nor my team's daily job involves machine learning per se, so learning TensorFlow, and deep learning in general, is motivated purely by personal interest. Throughout my career so far I have never been in a position where I had to personally train a model to tackle a problem, but I've done some work with Kubernetes powering TensorFlow workloads, as well as Coursera courses that involve using deep learning for self-driving cars, so that explains where my interest comes from.
I've also been reading textbooks, taking courses, and running open-source Colab notebooks for the last few years, so I felt I could challenge myself and get my skills tested. Reading the test FAQ, I can totally see that this is by no means a test of SOTA knowledge and skills, but fundamentals are important, right?
Since this is a test, like all other tests in life, getting prepared is important. The most relevant information is on the official website, specifically the public FAQ document; you should read it at least twice, because it basically covers what to expect in the exam, plus a major hint on its shape and form (e.g. the number of questions).
If, like me, you have taken the TensorFlow in Practice specialization by deeplearning.ai before, the test coverage should feel familiar. If not, I strongly suggest taking that specialization as preparation, or simply as an introduction to getting on board with the new TensorFlow 2.0 API. I spent about 8 hours in total going through the Colab notebooks in the course materials, and that turned out to be most helpful.
The test is taken on your own computer (i.e. remotely) with PyCharm and a plugin. I cannot go too much into the details, but an important tip: basically all the usual rules for taking tests and interviews apply here: find a quiet and comfortable place with AC power and easy access to water and a toilet, etc.
My initial estimate was that I wouldn't need the full 5-hour window, because I was pretty confident that training a few toy neural nets wouldn't take that long. I say toy neural nets because, wearing the test designer's hat, it would be unnecessarily difficult to evaluate people's work on very complex models and on small improvements to basic ones; i.e. getting everything working matters more than finely tuning one model. (This applies to real production work too, since establishing a baseline, even with just an SVM or logistic regression, is also the first thing to do.)
But then I was wrong. It turns out that I underestimated all the possible places where my code could go wrong. I believe I wasted more than 3 hours on things like:

- getting the `dtype` wrong (or not doing proper image reshaping), where the error message wasn't helpful at all
- calling `model.fit` with an image data generator such that each epoch took too much time (I was using my MacBook Pro, so there's no GPU)
Luckily, after using up all the time, I was able to pass the test! I really enjoyed the process, since all the test content was within my expectations and there were no surprises, meaning all my past learning and preparation really meant something. I did run into several problems, but that taught me a good lesson on how things can actually go wrong in real ML work.
After finishing the test, the result came within an hour or two, and you are then prompted to list your information in the developer directory. I imagine this would be useful for people hunting for ML jobs, but for me it serves more of a social purpose. The certificate itself came more than two weeks later, by which point it is little more than a reminder to yourself.
This is by no means the end of the journey, as the criteria and content covered in this test are really just fundamentals. It serves as a checkpoint, and a starting point for learning something more SOTA. For me, the single biggest takeaway (from my preparation) is that I found out how useful Colab is and got used to working in it by default! In fact, I believe Colab should suffice for most people without buying any physical GPUs: you get data-center-level network speed and zero-setup GPU/TPU support basically for free. Unless you want to train your models for days, e.g. retraining BERT from scratch, which you most definitely shouldn't, it is definitely the more efficient choice.
To wrap up, I want to explore whether it is really worth it; after all, there's a US$100 fee.
For people looking for ML jobs: you can think of it like buying a LinkedIn premium account: you'll get more exposure and save some market-discovery time and cost, but in the end it's you, the whole package, that they're looking for. So the certificate won't buy you a job, merely an entrance, maybe.
Other than that, I find it a good excuse to push yourself to study the materials, which is the truly meaningful part. If you've already done that, I wouldn't worry about skipping the test.
In the past few months, life has turned upside down for more than half of the people around the globe, yet we here in Beijing are close to the end of it (hopefully).
It's Sunday afternoon, so let me try to recap some of the moments that went by…
Staying at home, watching and observing
After the (abruptly interrupted) Spring Festival holiday, we returned home to Beijing in early February, and for the next four months we were basically locked up at home, working or not.
I can still remember how eagerly we followed the situation early on as it unfolded, first in China and then spreading across the world (Italy, Japan, the U.S., etc.). Then, I think, the video from 和之梦 (【感染者为0的城市—南京】日本导演镜头下最真实的南京防疫现场) blew up and marked that transitional moment (or transitional week) when things were stabilizing in China while the world outside started to turn upside down.
From there, we also saw a wave of blame directed at China from outside, as well as some influential voices criticizing the blame itself, among them Daniel Dumbrill and Stratechery. I encountered Daniel's talks via a strange path: 和之梦 interviewed the bar owner Ben, who speaks only Chongqing dialect, and YouTube then recommended his interview with Daniel Dumbrill. I especially enjoyed the sit-down interview in "American PhD Student in China - A Discussion about China & More", where I was amazed at how well-read that PhD student was. Ben Thompson has always been an inspiration (and I was recently sold on his Dithering.fm with John Gruber), and in that very episode he and James did a great job giving an independent voice on what Beijing, Taiwan, and the US each did right and wrong.
Sometimes it gets a bit overwhelming to care about the virus 24x7, so while jogging or walking my dog I mostly listen to other podcasts. I continue to find Lex's Artificial Intelligence Podcast a gold mine; he has interviewed in person (among many others) Jack Dorsey, Andrew Ng, Ilya Sutskever, David Silver, Donald Knuth, Michael Jordan, etc. The topics are all very interesting, but what I find most unique are the things you don't usually get to hear on other occasions, anecdotes that only surface in a fireside-chat style conversation: e.g. Andrew Ng mentioned that some of the early videos on Coursera (the Machine Learning course) were recorded when he returned to the lab after 9pm, having finished weekend dinners with friends; and Michael I. Jordan did meet the other famous Jordan, etc. It was also a pleasant surprise when Lex interviewed Melanie Mitchell only a week after I finished her book on AI: perfect timing, and you feel lucky to get to shadow-talk to the author with the questions you formed while reading her book.
I've also done a ton of binge-watching on Wang Gang's channel, Liziqi's channel, and even the funny TechLead. Many people watch their videos for the soothing music and aesthetics, but what I enjoy most is how people outside China talk about seeing many things for the first time while totally understanding the culture and food and relating to them. Well, not the TechLead videos; those I watched just for the abundant "as a millionaire" humor.
Layoff, and how that affects work
Airbnb did a large layoff, among many other companies (Uber has done two so far, before and after ours, and there were also Lyft and Cruise, etc.), and many people were affected. Airbnb China was equally affected. Luckily I'm not one of them, but I did see many talented people have to be let go.
Saying goodbye is hard, and being put in that situation is harder, but as someone who once had a startup and went through a layoff (one we carried out ourselves), this experience refreshed that memory, and this time I could focus more on the non-emotional part, i.e. the learnings.
I believe Airbnb's and Brian's handling of the layoff was well executed, if not impeccable. Sure, there were some miscommunications that could have been handled better, but in general I would rate the execution humane, considerate, and meaningful. Being part of the organization, I surely don't see the full picture, but people around me did say that the severance package, the fact that our recruiting team went the extra mile to set up a talent directory, and the smaller things like letting people keep their computers were good choices. What's more, you can't erase all the negativity (people who were impacted were impacted), but the co-founders put a lot of effort into cheering people up and adjusting the company's focus, and that is indeed what matters more after a layoff. One of my colleagues was right on point: it takes a layoff to see the difference between companies run by founders and those run by hired executives. It really does.
Things are still unfolding, and there will be more repercussions as the world gradually gets back on its feet, but I think it was a good lesson and a hard-to-get experience.
Work life balance is shifted, tilted, and redefined
Throughout the lockdown I've probably gulped down 3 kilograms of ground black coffee, bought in 4 to 5 batches, each bagged in 150g portions, some dark roast and some lighter. I learned somewhere that coffee is better filtered, so I've been making pour-over (drip coffee) instead of using, e.g., my old French press. I think my palate has at least improved to the point where I can taste the difference between my good days and bad ones.
There were many discussions on, e.g., the a16z and Stratechery podcasts about remote work, new startup ideas, and the new paradigm of work. Now that companies like Facebook, Google, Shopify, and Box are releasing policies that allow employees to work from home permanently or for the next few years, I'm seeing more of those "I told you so" tweets from, e.g., Basecamp's DHH and Jason, and from digital-nomad lifestyle promoters. I personally think this is a wonderful thing to happen (though of course it is sad that many lives were lost to the virus): the world can now embrace this new paradigm with at least less doubt. Of course, WFH isn't for everyone, as some types of work simply can't be done that way; if we restrict the discussion to tech-related jobs, at least it is much more approachable.
I personally think people need to treat this trend more carefully: you have to test the water first and decide whether this is or isn't working for you. Our Airbnb Beijing office has been partially open since last week, and as a cautious measure we are split into A/B groups, each allowed in the office every other week. I went to the office every single day last week, partly because I had been away for too long (4 months! and magically my rubber tree is still alive!), and more importantly because Zoom's bandwidth just isn't comparable to meeting people in person. People should really be taught that long-term working from home requires tremendous communication skills as well as regular breaks where you do meet in person; after all, people are social animals, and no technology is immersive enough to give you all of that. On that front, I think Facebook got it right in granting more freedom only to senior levels and above, while people early in their careers are still required to come to the office.
Having said all that, I'm glad I get to work from home this coming week and go back to the office the week after; every other week is a good balance. Being a manager, though, I'm still a bit worried about how my team members will adjust to this new pattern. Work-life balance was never well defined (it was even ridiculed in China's 996 context), but now it is not so much being redefined as being tilted and reshaped; the whole concept is no longer the same.
Please note that this is a back-ported post from 2016.
Recently I discovered some slowness in my express app.
A bit of background: for the platform we are building at Madadata, we use an external service for user authentication and registration. But in order to test against its API without incurring the pain of requests traveling from California (where our CI servers are) to Shanghai (where our service provider's servers are), I wrote a simple fake version of their API service using Express and Mongoose.
We didn't realize the latency of my service until our recently started load testing showed that more than half of the requests didn't return within 1 second, thus failing the load test. For a simple Express app using Mongoose, there is hardly any way to get it wrong, at least not anywhere near 1 second of latency.
The screenshot above, from running the mocha tests locally, revealed that there is indeed a problem with the API service!
From the screenshot I can tell that not all the APIs are slow: the one where users log out, and the one showing the current profile, are reasonably fast. Also, judging from the dev logs printed with morgan, the slow APIs' response times as collected by Express show a consistent level of slowness (i.e. for the red-flagged ones, you are seeing roughly the sum of the latencies of the two requests above them, respectively).
This actually rules out the possibility that the slowness comes from the connection rather than from within Express. So my next step was to look at my Express app. (N.B. this is something worth ruling out first, and I personally suggest trying one or two other tools besides mocha, e.g. `curl` or even `nc`, before moving on, because they almost always prove more reliable than the test code you wrote.)
Express is a great framework for building web servers in Node, and it has come a long way in terms of speed and reliability. So I thought the problem was more likely due to the plugins and middleware I used with Express.
To use MongoDB as the session store, I used connect-mongo to back my express-session. I also used the same MongoDB instance as my primary credential and profile store (why not? it is a service for CI testing, after all), with Mongoose as the ODM.
At first I suspected the built-in Promise library shipped by default with Mongoose, but swapping in the ES6 built-in one didn't solve the problem.
Then I figured it was worth checking the schema serialization and validation part. There is only one model, and it is fairly simple and straightforward:
const mongoose = require('mongoose')
const Schema = mongoose.Schema
const isEmail = require('validator/lib/isEmail')
const isNumeric = require('validator/lib/isNumeric')
const passportLocalMongoose = require('passport-local-mongoose')

mongoose.Promise = Promise

const User = new Schema({
  email: {
    type: String,
    required: true,
    validate: {
      validator: isEmail,
      message: '{VALUE} is not a valid email address'
    }
  },
  phone: {
    type: String,
    required: true,
    validate: {
      validator: isNumeric
    }
  },
  emailVerified: {
    type: Boolean,
    default: false
  },
  mobilePhoneVerified: {
    type: Boolean,
    default: false
  },
  turbineUserId: {
    type: String
  }
}, {
  timestamps: true
})

User.virtual('objectId').get(function () {
  return this._id
})

const fields = {
  objectId: 1,
  username: 1,
  email: 1,
  phone: 1,
  turbineUserId: 1
}

User.plugin(passportLocalMongoose, {
  usernameField: 'username',
  usernameUnique: true,
  usernameQueryFields: ['objectId', 'email'],
  selectFields: fields
})

module.exports = mongoose.model('User', User)
Mongoose has a nice feature where you can use `pre` and `post` hooks to interact with and investigate the document validation and saving process. Using `console.time` and `console.timeEnd`, we can actually measure the time spent in these stages.
User.pre('init', function (next) {
  console.time('init')
  next()
})

User.pre('validate', function (next) {
  console.time('validate')
  next()
})

User.pre('save', function (next) {
  console.time('save')
  next()
})

User.pre('remove', function (next) {
  console.time('remove')
  next()
})

User.post('init', function () {
  console.timeEnd('init')
})

User.post('validate', function () {
  console.timeEnd('validate')
})

User.post('save', function () {
  console.timeEnd('save')
})

User.post('remove', function () {
  console.timeEnd('remove')
})
and then we get this more detailed information from the mocha run:
Apparently, document validation and saving don't take up large chunks of the latency at all. This also rules out the likelihood (a) that the slowness comes from a connection problem between our Express app and the MongoDB server, or (b) that the MongoDB server itself is running slow.
Turning my focus away from Mongoose itself, I started to look at the passport plugin I used: passport-local-mongoose.
The name is a bit long, but it basically tells you what it does: it adapts Mongoose as a local strategy for passport, which handles session management and the registration and login boilerplate.
The library is fairly small and simple, so I started directly editing the `index.js` file in my `node_modules/` folder. Since `#register(user, password, cb)` calls `#setPassword(password, cb)`, specifically at this line, I focused on the latter. After adding some more `console.time` and `console.timeEnd` calls, I confirmed that the latency is mostly due to this function call:
pbkdf2(password, salt, function (pbkdf2Err, hashRaw) {
  // omitted
})
The name itself suggests a call to a cryptography library, and a second look at the README shows that the library uses 25,000 iterations.
Like bcrypt, `pbkdf2` is a slow hashing algorithm, meaning it is intended to be slow, and that slowness is adjustable via the number of iterations, in order to keep pace with ever-increasing computation power. This concept is called key stretching.
As noted in the wiki, the initially proposed iteration count was 1,000 when the scheme first came out, and some recent recommendations reach as high as 100,000. So the default of 25,000 is in fact reasonable.
After reducing the iterations to 1,000, my mocha test output now looks like:
and the latency is finally acceptable, with security that is still fine for a test application, after all! N.B. I made this change for my testing app; it does not mean your production app should decrease the iterations. Also, setting the count too high can render the app vulnerable to DoS attacks.
I thought it would be meaningful to share some of my debugging experience here, and I'm glad it wasn't due to an actual bug (right, a feature in disguise).
Another point worth mentioning: for developers who are not experts in computer security or cryptography, it is usually a good idea not to hand-roll code related to session/key/token management. Starting with good open-source libraries like passport is the better idea.
And as always, you never know what kind of rabbit hole you'll run into while debugging a web server; that is really the fun part of it!
Please note that this is a back-ported post from 2016.
This post is a translation of Julia Evans' How to ask good questions. I personally like it a lot and thought it would be a good idea to share it in translation. She has acknowledged the translation but not endorsed or verified it.
Asking good questions is a super important skill in software development. I've gotten steadily better at it over the past few years (to the point that my coworkers often comment on it). Here are a few guidelines that have worked well for me!
To start with, I actually fully agree that it's totally fine to ask "dumb" or not-so-great questions. I ask people potentially dumb questions all the time, questions I could have answered myself with Google or by searching the codebase. I mostly try to avoid it, but sometimes I still ask silly questions, and I don't think it's the end of the world.
So the strategies below aren't about "you must do all these things before asking a question, or else you are a bad person and should feel bad." They're simply things that have helped me ask better questions and get the answers I wanted.
The goal here is to ask questions about technical concepts that are easy to answer. I often run into someone who has a bunch of knowledge I'd like to learn, but who doesn't always know how best to explain it to me.
But if I can ask a series of good questions, I can help that person efficiently explain what they know and steer them toward what I'm interested in. So let's talk about how!
This is one of my favorite questioning techniques! The basic form of this kind of question is:
For example, I was recently discussing computer networking with someone (a very good question-asker)! He said: "So, my understanding is that there is a series of recursive DNS servers…". But that was wrong! In fact there is no "series" of recursive DNS servers (when you talk to a recursive DNS server, only one recursive server is involved). Because he had stated his current understanding upfront, it was easy for us to clear up how it actually works.
I was interested in rkt a while back, and I didn't understand why rkt used so much more disk space than Docker when running containers.
But "why does rkt use more disk space than Docker" didn't feel like the right question to ask; I more or less understood how the code worked, just not why they wrote it that way. So I wrote to the `rkt-dev` mailing list: "Why does rkt store container images differently from Docker?"
Me:
The answers I got were super, super helpful, exactly what I wanted. It took me a while to phrase the question in a way I was happy with, but I'm glad I spent the time, because it helped me understand the whole story much better.
Stating your understanding is not easy at all (it takes time to think about what you know and sort out your thoughts!!), but it is very useful, and it lets the person answering help you better.
Many of my initial questions are a bit vague, like "how do joins work in SQL?" That question isn't great, because there are many parts to how joins work! How could the other person know which part I want to learn about?
I like asking questions whose answer is a simple, direct fact. In our SQL join example, some fact-based questions might be:
Is the time complexity of joining these two tables `O(NM)`? Or `O(N log N) + O(M log M)`?
When I ask super-specific questions like these, the other person doesn't always know the answer (which is fine!!), but at least they understand what kind of question I'm interested in: clearly I'm not interested in how to use joins; I want to understand the implementation and the algorithms.
When someone is explaining something to me, they often say things I don't understand. For example, someone explaining databases might say: "Okay, we use optimistic locking in MySQL, and then…". I have no idea what "optimistic locking" is. So that's a perfect moment to ask a question! :-)
Learning to interrupt and say "hey, what does that mean?" is a super important skill. I see it as one of the traits of a confident engineer, and a great thing to do. I often see senior engineers ask for concepts to be explained clearly; I think as you grow more confident in your skills, this also gets easier.
The more I ask, the more natural it feels to ask people to explain things. In fact, when I'm explaining something, I worry the other person isn't really listening if they don't ask questions.
It also creates more opportunities for the person answering to admit they've exhausted what they know. I often run into cases where the person I ask doesn't know the answer, and the people I ask are usually pretty good at saying "no, I don't know that!"
When I started my current job, I was on the data team. While learning my new responsibilities, I was surrounded by these words! Hadoop, Scalding, Hive, Impala, HDFS, Zoolander, and so on. I had maybe heard of Hadoop, but basically didn't know what any of them meant. Some were internal projects, some were open source. So I started asking people to help me understand what each concept meant and how they related to each other. Along the way I probably asked questions like:
Because there were so many, I actually wrote a "dictionary" for all these concepts. Understanding them helped me find my bearings and ask better questions later on.
When I typed out those SQL questions above, I first googled "how are SQL joins implemented." After clicking through some of the links I thought, "oh, I see, sometimes there's sorting, sometimes there's a hash join, I've heard of both," and then wrote down my more specific questions. Googling first helped me come up with slightly better questions.
That said, I think some people are too insistent on "never ask before googling it yourself." Sometimes I'm at lunch with someone and am curious about their work, and I'll ask fairly basic questions. That's totally fine!
But doing your own research is genuinely useful, and coming up with a great list of questions after doing your homework is actually quite fun.
Here I'm mostly talking about asking your coworkers questions, since that's what I do most of the time.
Before asking a coworker a question, I weigh things like:
I don't always get these considerations right either, but thinking them over has definitely helped me a lot.
Also, I tend to spend more time asking the people closest to me: they interact with me the most every day, it's convenient to ask them, they already have context on the work I'm doing, and they can easily give constructive answers.
"How To Ask Questions The Smart Way" by ESR (Eric Steven Raymond) is a popular but pretty mean document (it opens with terrible lines like "we call people like that losers"). It's about asking strangers questions on the internet. Asking strangers on the internet is a super useful skill that can get you very useful information, but it is also "hard mode" for asking questions. The people you ask don't know your situation, so you need to be doubly patient in stating what you want to learn. I don't like ESR's document, but it does say some valuable things, and its section on how to answer questions in a helpful way is actually quite good.
Asking questions to surface hidden assumptions or knowledge is an advanced questioning technique. Such questions serve two purposes: to get an answer (there may be information one person knows and others don't), and to point out that hidden information exists and is worth sharing.
The "art of asking questions" part of Etsy's Debriefing Facilitation Guide is a wonderful introduction to this, in the context of incident reviews. Here are some questions from it:
- What would you look for when you suspect this kind of failure is happening?
- How do you tell what a "normal" situation looks like?
- How did you know the database was down?
- How did you know which team to escalate to?
These (seemingly basic, yet far from obvious) questions are especially effective when asked by someone with a measure of authority. I especially love it when a manager or senior engineer asks a basic but important question like "how did you know the database was down?", because it gives people with less authority room to ask the same questions later.
My favorite part of André Arko's "How to Contribute to Open Source" is:
Now that you've read all the issues and pull requests, start looking for questions you can answer. Before long you'll notice someone asking a question that has been answered before, or that is answered in the documents you just read. Answer the questions you can.
If you're getting started with a small project, answering other people's questions is a great way to consolidate your knowledge. Whenever I answer a question on a topic for the first time, I feel "oh no, what if my answer is wrong?" But usually I can answer correctly, and then I feel I understand the topic a bit better.
Good questions can also be a big contribution to a community! I once asked a bunch of questions about CDNs on Twitter and then wrote up a summary in a post. Many people told me they loved that post, and I think those questions helped not only me but many others.
Lots of people really enjoy answering questions! I think asking good questions is also a wonderful thing you can do for a community, not merely "asking well enough to take the other person from very annoyed to only slightly annoyed."
Please note that this is a back-ported post from 2016.
Today our front-end engineer told me that a small bug/feature in postcss cost him a lot of time to figure out.
Postcss is a front-end framework we use to process CSS, and it can be used together with webpack, which we also use. The latter bundles JavaScript, CSS, images, and other assets; the former's main job, before that happens, is to run the CSS through a pass of preprocessing.
For example, a webpack config file might contain a snippet like this:
// webpack.config.js
module.exports = {
  // omitted
  postcss: [require("autoprefixer"), require("precss")]
  // omitted
};
The `postcss` part is an array, and every element in it is a postcss plugin. Postcss is built to be very modular: most tasks are delegated to all kinds of plugins, and you combine different plugins to get different functionality. "Do one thing and do it well": how very UNIX.
But the problem our front-end engineer ran into was that he didn't know the order of this list matters; he assumed he only needed to put the desired feature modules there. I told him it works exactly like a UNIX pipe, and that if you had used gulp you would recognize even more clearly how such a streaming/pipeline system works: order matters a great deal. He concluded it was because he had only ever used webpack. But thinking it over, I realized I have run into this situation myself again and again.
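The pipe analogy can be shown in a few lines of JavaScript (a toy sketch; the two "plugins" below only mimic what a vendor prefixer and a minifier do, they are not the real autoprefixer or any real minifier):

```javascript
// Compose stages left-to-right, like a UNIX pipe: each stage
// transforms the output of the previous one.
const pipeline = (...stages) => input => stages.reduce((acc, f) => f(acc), input);

const prefix = css =>
  css.replace(/display: flex/g, 'display: -webkit-flex;\n  display: flex');
const minify = css => css.replace(/\s+/g, ' ').trim();

const css = '.box {\n  display: flex;\n}';

// Running the same stages in a different order produces different output,
// which is exactly why the postcss array is order-sensitive.
console.log(pipeline(prefix, minify)(css));
console.log(pipeline(minify, prefix)(css));
```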
I remember reading Joel Spolsky's The Law of Leaky Abstractions back in college. Roughly, it says every abstraction layer leaks: there will always be a situation where you must go around the abstraction and understand the details beneath it. The article uses TCP as an example: many programs rely on TCP to wrap the error-prone IP layer so that the data you send always arrives in order, complete, and without duplicates. But there are always cases, say a rat chewing through a network cable, where you observe the abstraction break (TCP stops working) and have to handle the unexpected.
Abstraction is a super powerful tool; there is in fact a famous line: "In computer science, there is no problem that cannot be solved by adding another layer of abstraction." The webpack config above is itself a wrapper and abstraction over streaming code. And abstraction is not just a computer science concept. It is one of humanity's most fundamental thinking tools, helping the brain reduce the amount of information handled at once, filtering out unimportant, repeatable details to focus on what matters most right now. Most of the time abstraction is a sharp instrument of thought, and a skill every programmer must master.
Everything is wonderful, until you hit the wall for the first time.
Recently a good friend of mine has been running a programming course for complete beginners. The curriculum is compact but substantial: over three weeks, he teaches every student to first build a static web page with HTML, then learn basic CSS and polish the page with Twitter Bootstrap, and finally learn JavaScript to add some dynamic elements.
Like many other trainings aimed at beginners, he hopes to let people understand, in a short time, what programming and programming thinking are, and along the way learn how to communicate and collaborate with programmer colleagues and friends. I find this goal very practical and sound, in contrast with the many over-hyped "programmer bootcamps" that promise quick mastery of data mining, AR/VR, human-computer interaction, and other "cool" skills, which at best give you a whirlwind tour (though that in itself can also be a good thing).
Indeed, many of today's programming languages (Python, JavaScript, Swift) are far easier to learn than older ones (C, C++): no manual memory management, no file-system knowledge required, sometimes even drag-and-drop operations to complete a basic "program." I personally love these user-friendly languages, and both code.org and Codecademy are valuable precisely because they aim to spark interest in programming, and programming is almost certainly an essential skill of the future (at most it will change form; computational thinking is the core). Without exception, they tell their audience: programming is easy, come learn!
But that's not the whole story: programming is actually hard. It starts easy, then gets harder; once you push through, it gets easier again, and you come out of the whole process genuinely enriched. A large part of the reason, I believe, lies in that ubiquitous, always-leaky abstraction layer.
For example, while preparing his course, my friend discussed several questions with me:
You see, even with an extremely simplified setup (what you see is what you get, no compilation) and an extremely forgiving platform (the browser), "leaky abstractions" are still everywhere. If someone concludes from this that they have mastered front-end programming and brims with confidence, then the next time they switch machines, switch browsers, or even just forget the `/` in `</script>`, they will feel thwarted and discouraged.
在过去的编程经历里,我自己也无数次遇到过这样的「有漏洞的抽象」,而以往的经历往往是这样的:
Seen from this angle, the process of learning to program can perhaps be described analogously:
It's easy to see that step 2 above is the longest. It is indeed the most drawn-out phase, and it often means cycling through steps 1, 2, and 3 many times over.
But perhaps the conclusion is also this: everything is wonderful, until you hit the wall for the first time; then you feel your way along the wall, and gradually, you find the door.
Last weekend I finally completed the University of Toronto's specialization on self-driving cars. It has been a rewarding journey, so let me take a moment to review and reflect.
Generally I would recommend it to people with an engineering background and anyone interested in this field. The specialization has four courses, covering introductions and a kinodynamic vehicle model; sensing and localization (LIDAR and IMU); visual perception and 3D modeling; and lastly motion planning and actuation. Most of the topics are useful and necessary for getting into this field, and together they give a good overall feel for what it takes to build a self-driving car.
For the 1st course, the bicycle model of a car appears amazingly simple yet powerful when it comes to modeling a car's trajectory. I liked how it shows that one can approach an engineering problem through layering and abstraction, e.g. by simplifying four wheels into two without losing generality. As an introductory course, it also features many real-world engineers' and entrepreneurs' opinions on the industry, the project, the problem domain, and future expectations. I especially liked Paul Newman's take on the industry, and on why it is necessary to build a self-driving car that can adapt to all varieties of infrastructure, rather than building infrastructure to some specific spec.
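To give a flavor of how simple the model is, here is a minimal sketch of the kinematic bicycle model with Euler integration. This is my own illustration (not code from the course); the state and parameter names are assumptions: position (x, y), heading theta, speed v, steering angle delta, and wheelbase L.

```javascript
// One Euler step of the kinematic bicycle model:
//   x' = v * cos(theta), y' = v * sin(theta), theta' = v * tan(delta) / L
function stepBicycle(state, v, delta, L, dt) {
  return {
    x: state.x + v * Math.cos(state.theta) * dt,
    y: state.y + v * Math.sin(state.theta) * dt,
    theta: state.theta + (v * Math.tan(delta) / L) * dt,
  };
}

// Drive straight (zero steering) for 1 s at 10 m/s, in 100 small steps.
let s = { x: 0, y: 0, theta: 0 };
for (let i = 0; i < 100; i++) {
  s = stepBicycle(s, 10, 0, 2.5, 0.01);
}
// With zero steering the car simply advances ~10 m along the x axis.
```

Turning the front wheels (nonzero delta) makes the heading change at a rate proportional to speed and inversely proportional to wheelbase, which is all it takes to trace out realistic low-speed trajectories.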
For the 2nd course, coming from a CS background I found it challenging yet very rewarding to understand and implement Kalman filtering and its variations. I even went and compared the slides on the Extended Kalman Filter against real open-sourced code in Baidu's Apollo project, and you might be surprised that the formulas translate almost line by line into the C++ code.
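For readers who haven't seen it, the core predict/update loop is short. Below is a minimal 1-D scalar Kalman filter sketch, my own illustration of the textbook equations (not code from the course or from Apollo); Q and R are the assumed process and measurement noise variances.

```javascript
// One Kalman filter iteration for a scalar state assumed constant over time.
function kalmanStep(est, z, Q, R) {
  // Predict: the state model is "stay the same", so only uncertainty grows.
  const P = est.P + Q;
  // Update: the Kalman gain K blends the prediction with the measurement z.
  const K = P / (P + R);
  return { x: est.x + K * (z - est.x), P: (1 - K) * P };
}

let est = { x: 0, P: 1 }; // poor initial guess, high uncertainty
for (const z of [5.1, 4.9, 5.0, 5.2, 4.8]) {
  est = kalmanStep(est, z, 0.0001, 0.1); // tiny process noise, known sensor noise
}
// The estimate moves toward the true value (~5) and the variance P shrinks.
```

The EKF adds a linearization (Jacobian) step for nonlinear motion and measurement models, but the blend-by-gain structure above is exactly what survives, nearly line by line, in real C++ code.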
The 3rd course was the easiest one for me, because most of the deep-learning part was already covered in deeplearning.ai's specialization. What was new to me was mainly the OpenCV part, where some legendary algorithms like SIFT still play a shining role. I also found this course the most comprehensive and thorough: it starts with a classical pinhole camera model and works all the way up to a recent solution to the image semantic-segmentation problem using VGG-net.
The 4th course takes a top-down approach: the high-level route-planning problem is solved using Dijkstra's algorithm, then a mid-level behavior-planning problem is tackled using finite state machines, and finally a local-level maneuver-planning problem is solved using parametric curves. The highlight is the final project, where all the pieces are put together to successfully drive the vehicle through obstacle avoidance, lead-vehicle tracking, and stop-sign handling. It almost gives you a feeling of how a real self-driving car performs in action.
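At the route-planning level, the graph search really is just classic Dijkstra over a road graph. Here is a minimal sketch on a toy graph of my own making (not course code); nodes are intersections and edge weights are travel costs.

```javascript
// Dijkstra's shortest-path distances from a start node.
// graph[u] maps each neighbor v to the edge cost u -> v.
function dijkstra(graph, start) {
  const dist = { [start]: 0 };
  const done = new Set();
  while (true) {
    // Pick the unsettled node with the smallest known distance (linear scan).
    let u = null;
    for (const n in dist) {
      if (!done.has(n) && (u === null || dist[n] < dist[u])) u = n;
    }
    if (u === null) return dist; // all reachable nodes settled
    done.add(u);
    // Relax every outgoing edge of u.
    for (const [v, w] of Object.entries(graph[u] || {})) {
      if (!(v in dist) || dist[u] + w < dist[v]) dist[v] = dist[u] + w;
    }
  }
}

const roads = {
  A: { B: 2, C: 5 },
  B: { C: 1, D: 4 },
  C: { D: 1 },
  D: {},
};
const dist = dijkstra(roads, "A");
// Best route A -> B -> C -> D has total cost 4, beating the direct A -> C -> D.
```

A production planner would use a priority queue and a far richer cost model (lane changes, traffic, etc.), but the relaxation loop is the same idea the course builds on.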
I feel the programming assignments could have been more polished: there were occasional bugs in the provided utility functions, a Python version mismatch broke the JupyterHub, and the feedback on wrong submissions was very minimal. A better-prepared assignment would include more intermediate submission checkpoints so learners could sanity-check their progress along the way.
Also, the teaching staff were not responsive enough when it came to answering questions in the forum. I feel the forums were used only by students helping each other (though that was very useful as well).
Outside the scope of this specialization, I also found OpenCV's documentation relatively poor. For many functions, the Python version documents the wrong return type and/or gives little explanation of the algorithm's background.
This area has become almost white-hot in recent years. I can count a number of high-profile startups as well as big names (Pony.ai, Tesla, Zoox, drive.ai, Waymo, Momenta, TuSimple, Baidu's Apollo, Uber's ATG, Cruise, comma.ai, Mobileye, etc., in no particular order), each with a different approach and focus area. They are also taking in huge amounts of investment money and resources, racing to build ever-larger fleets of autonomous cars.
I think this specialization gives a glimpse of what the autonomous-driving future will be like. Indeed, "any sufficiently advanced technology is indistinguishable from magic". But I think that magical future is not yet near. As Kai-Fu Lee and Rodney Brooks have argued, it is nowhere near 2020. My (unqualified) opinion is that it will not arrive within the next five years, but in the mid-to-long term it will be possible in our lifetime (and hopefully much sooner).
My short-term pessimism comes from an understanding of the problem and design domains we need to tackle, as learned from the course material. These are (like anything else) threefold: technology, talent, and capital.
Technology-wise, I believe the current boom in the industry is largely driven by the software and hardware upgrades brought by the spread of deep learning (near-realtime object-detection algorithms like YOLO, and cheaper GPUs/TPUs). That certainly moves us toward the goal of L4/L5 autonomy, but it is not enough to actually get there. Tesla hasn't fully convinced everyone that LIDAR is optional, nor has the accumulated number of vehicles across all fleets driven its price down significantly enough (I think). In behavior planning, the use of reinforcement learning is still at an early stage, and it has to deal with the explainability of the agent's policy (to pass regulatory and media scrutiny). As for access to and sharing of data, I haven't yet seen the "ImageNet moment" (e.g. what BERT is widely perceived to be in the NLP community) for high-precision maps and many other areas.
Predictions like these can easily be wrong, but it stands that deep learning by itself hasn't moved everything in the industry just yet, and there are plenty of such technicalities that need to be solved, which will be hard to do within the next five years.
Talent, both in engineering and in management, is lacking. I think some startup founders can raise tons of money on the strength of their track records in the Internet and software industries, but tackling this problem takes more than that. Managing hardware supply, OEMs, risk assessment, quality control, etc. is (probably) harder than pushing an accuracy metric up 1% or fixing a software continuous-delivery pipeline. Good engineers are also hard to find. The earlier definition of a full-stack engineer (in the Internet industry) might span CSS all the way down to databases and dev-ops, but building a self-driving car takes this to a whole new level (you would need to know how to calibrate cameras, analyze point-cloud data, understand gears and transmissions, and also program RL agents). Even if we don't need that kind of full-stack-ness in one person, we still need engineering leaders who can cover the whole lot. Training such people takes time and a lot of failures; the progress will probably require many companies to die while training talent along the way. Hopefully it is slow but steady. I have little authority in this area, and I will happily be proven wrong.
Given that the timespan will likely exceed five years, it is also a challenge for venture capital. Large corporations like Google can be backed by their boards to invest in this kind of moonshot project, but for financial VCs, their LPs might not be that patient. Some startups may have to pivot and refocus on something more achievable than L4/L5 autonomy, or else persuade their investors or acquirers. For high-profile startups, both paths become exponentially more difficult once you have already raised hundreds of millions of USD.
Having written all that, I still think the problem is hard but solvable. It might take another "PayPal mafia" story, where companies appear, grow, burst, and then disseminate seeds of talent and industry know-how, and then, just beyond the horizon, a new level of advance can be made. Society at large is excited about this area, and both the US and China (and they are not the only ones) are pouring in financial and policy resources. Maybe a new form of venture capitalism will emerge, adapted to this industry, that can foster better cross-industry cooperation. I don't know. Coming from an engineering background, I think this might just be the Apollo project of our time, and that is exciting.
Starting a new blog is both an exciting and a daunting experience for me, because the real commitment involved and the (less likely) future serendipity are hard to materialize and weigh. So maybe setting expectations low is the better strategy.
It's the end of 2018, and many things have changed, especially over the last four years. I don't usually care for looking back, and this time is no different: in this piece I will only try to lay out my expectations for the next year and for this blog.
In the past few years I've been reading a lot, but to be honest, most of it has been random articles on the web that I forget within an hour. To change that, I think writing down my thoughts and ruminations on an article or subject can be useful. (This applies to books as well.) So to write is to remember.
Making myself write longer blog articles can also, hopefully, help build my ability to articulate, and then to be terse again. I read an article by someone who had been blogging since sixteen; he shared that writing more than two thousand words per article can help with SEO as well, another virtue of longer articles. So in this sense, to write is to articulate and to oil the vehicle of expressiveness, because expressiveness limits one's thoughts as well.
Also mentioned in that article was the importance of picking an audience: to write is to express oneself and spread ideas, and people are busy and won't care who you are or what you did, so it has to be relevant to them. I hope this helps me build a sense of meaningfulness, as well as a sharper sense of what really matters to people in their daily and professional lives (within a technology context, of course).
Hopefully that lays out the background and the why of this new blog.