Monday, May 27, 2024

How to become a committer

Wearing three different hats over the years, I have received three different versions of the same question, to which I always respond, "You are asking the wrong question."

  1. What do I need to do to get an A in your class?
  2. What do I need to do to get promoted?
  3. What do I need to do to get commit?
This post is about question number 3, but the basic concept in all three is the same: you need to change "get" to "earn" or "become a good candidate for" to focus on the right question.

So what does it mean to be a good candidate for commit?

  1. Strongly net positive energy flow.  Not everything you do in an OSS community adds energy.  Sometimes you will ask stupid questions, submit bad PRs, take offense, offend others, lick a cookie, or do other things that the community would be better without.  These things need to be balanced by good questions, helpful PRs, correct answers and other community-helpful things.  We all have our bad days and stupid moments, so don't obsess over always being "right," but do try to make sure that when you ask yourself "Am I really being useful in this community?" the answer is a strong "yes."
  2. Real mastery of some aspect of the project.  Commit means you are trusted to merge PRs.  In some projects, commit may be limited to certain branches or docs or whatever; but the basic idea is the same in every case: you are trusted by the community as a steward for the project's assets.  To earn that trust you have to demonstrate real mastery of some part of the code, documentation or other non-code assets of the project.
  3. Understanding and following the ways of the project.  OSS communities vary widely in how they work.  This overlaps with 1: if you don't understand and follow the written and unwritten rules of the project, you will end up being an energy sink, as people will have to correct you all the time.  Of course, healthy OSS communities are always open to new ideas about how to do things, so if you don't like the way things work in a project initially, you may be able to drive change later.  But you will never be able to do that or anything else useful unless you first take the time to learn how things are done and initially adjust your personal style as necessary.
For the remainder of this post, I am going to focus on practical strategies to achieve number 2 as a code contributor.  But it is really important that you also achieve 1 and 3, and most importantly, you have to really want to achieve all three.

Many of the best contributors to open source projects start off as users of the software.  This is usually the best way to start.  The ideal scenario is that you are using the software as part of your day job, or some component of something you work with uses something from the project.  If that is not the case,  you should try to find a work or personal project that uses software from the project in some way. 

Start by really mastering the code in your own project that touches the OSS.  For example, suppose that you are interested in getting involved in Apache Kafka and you have a project at work that uses it. Look carefully at the code that uses the Kafka client APIs and the configuration of the Kafka system components.  These things may be hidden from you by an abstraction layer somewhere.  If so, go find that code.  Start by understanding why working code using the project works.  Make sure you understand why the specific APIs being used are the right ones to use for what the code is doing.  Or if you are starting something that uses the project, get it to work and make sure you can explain exactly why it works.  Confirm this with tests in your own project.

At first, confirm your understanding of how your app works just using the documentation, other online sources and your own testing.  Then take the leap to look at some code inside the project.  Sometimes the code that your own code interacts with directly is not very enlightening or it may be difficult to understand. That's OK.  Go find some other code that your code is likely exercising and that looks more interesting or understandable. Look at its documentation, unit tests and recent commit history.

After poking around for some time, making changes to your code and watching what happens when you play with release sources and binaries, you can take the next step, which is to build the software.  Depending on the project, this may require some patience and even some special tools or access to a special environment. OSS communities die if it is not possible for newbies to figure out how to build the software, so there has to be a way.  You need to figure it out. First look for build docs.  Most projects have them.  Try your best to get the build to work, but don't spend many, many hours stuck.  When you do get stuck, go back over everything you think you know about the build, look through project archives and docs and if you are still stuck, come up with the simplest possible question the answer to which is likely to get you unstuck.  Ask the community that question.  Often some script, doc, test or main code is either misleading or broken and that simple question can be very helpful - especially if you get a simple answer and your first contribution is to fix whatever is misleading or broken so the next newbie does not get similarly stuck.  That is being net positive.  Trying once and asking for help immediately is not.

Once you can build the software, you can make changes to it and watch what happens.  A fun game to play is to see if you can do things that won't break the build but will make an observable difference in your application.  Even adding log messages or debug print statements can help build understanding. If the project has good tests, breaking them will be easy.  Intentionally breaking tests and explaining why your change breaks them is a very good way to learn the stated and unstated invariants in the code.

The play steps above may not seem like a direct path to mastery, but if you skip them and try to go directly to attacking issues or conceiving a great contribution, you will end up stuck and frustrated.   A new codebase is like a new neighborhood.  If you just use GPS all the time to go as fast as possible to chosen destinations it will take you a long time to actually learn the place.  If you allow yourself to walk around a bit you will not need the GPS as much and you will end up always knowing not just one, but several ways to get to where you want to go.

A good place to start contributing is in tests and / or documentation.  Assuming that you have found and penetrated an area of the code to the point where you have a decent understanding of its behavior, you can ask yourself if the existing docs and unit tests fully explain and confirm the behavior.  Almost always, you will be able to find some things that you are not sure about or that seem vague or misleading in the documentation.  Write tests to first discover, then confirm the behavior.  Then ask the community if in fact this is the desired / expected behavior.  If it is, create a PR that includes the unit test and a patch to the documentation that clarifies the contract of the code.  Make sure that your PR passes all tests and works with whatever CI system the project uses.  Keep things as simple as possible and don't try to combine too many things into one PR.  Simple PRs that improve documentation and tests tend to be thankfully accepted.  Focusing on tests and docs initially also helps deepen your understanding of the code.
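Here is a minimal sketch of what "write tests to first discover, then confirm the behavior" can look like.  It is only an illustration: it assumes Python and uses the standard library's str.split as a stand-in for the project code you are studying, and the names SplitContractTest and run_contract_tests are made up for this example.  Each test pins down one behavior that you might first have found by experiment and that the docs could state more plainly.

```python
import unittest


class SplitContractTest(unittest.TestCase):
    """Characterization tests: each one records a behavior found by experiment.

    str.split is only a stand-in here for whatever project code you are studying.
    """

    def test_default_separator_collapses_whitespace_runs(self):
        # With no separator argument, a run of whitespace is one delimiter.
        self.assertEqual("a  b".split(), ["a", "b"])

    def test_explicit_separator_keeps_empty_fields(self):
        # With an explicit separator, adjacent delimiters produce empty strings.
        self.assertEqual("a  b".split(" "), ["a", "", "b"])

    def test_empty_input_depends_on_separator(self):
        # The empty string gives different results in the two modes.
        self.assertEqual("".split(), [])
        self.assertEqual("".split(" "), [""])


def run_contract_tests():
    """Run the tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SplitContractTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_contract_tests() runs all three tests.  In a real project you would put tests like these next to the project's existing tests and use its own test framework and conventions, not this stand-alone runner.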

Another good place to start when the opportunity presents itself is on straightforward, labor-intensive tasks.  Upgrading dependencies, replacing deprecated methods, adding annotations, fixing linter errors, or carrying out other boring, but useful refactoring or code improvement tasks are all things that in some cases can be done without deep knowledge of the code, but which can be very helpful.  Make sure to pay careful attention to tests and carefully review anything that you generate with refactoring or AI tools if you take on this kind of task.  When upgrading dependencies also make sure to review release notes and test coverage for uses of the dependent code.  Break things into small PRs and make sure not to mix formatting or other kinds of changes with the specific improvements that your PRs claim to make.  Take extra time to make sure that your changes are correct, taking advantage of the opportunity to deepen your understanding of the code and tests.

Different communities use different forms of communication.  Make sure to subscribe to all relevant channels and try to follow as much as you can.  At first, a lot of the conversation, issues and PRs will be hard to follow, but over time more of it will make sense.  Start by paying special attention to your chosen area of focus.  If you see a question that you can answer or a problem that you might be able to solve, go for it.  Remember the net positive energy rule though: it's OK if you don't get everything exactly right immediately, but you need to be net positive - more useful ideas and contributions than distraction.

As you learn more about the code and community, you can start attacking bigger issues or bringing new ideas of your own.  Don't focus on impressing people or talking about what you have done.  Self-promotion does not work in OSS communities and is in general not necessary because everything happens in public, visible to the whole community.  What matters is what you contribute and how you work in the community.  If you consistently contribute high-quality PRs and participate positively in the community, one day you will be surprised to learn that you have been voted in as a committer.  







Tuesday, January 23, 2024

How to read

When I was first starting to read research papers in mathematics, I got some great advice from one of my professors.  He said, "Always have paper and pencil with you when you read a paper.  Read a little bit and then try to write the next part yourself.  Look at what is written in the paper as a sequence of hints.  That's all you are going to get.  You need to fill in the details yourself and if you can't do that, you have not understood the paper."   Over the years, I have realized that while research mathematics is kind of an extreme case, the same actually applies to any challenging text. So here is "the method":

  1. Read a sentence or paragraph or however much you need to get an idea.
  2. Write or say to yourself what you think is going to come next.
  3. Start from the beginning and read through the next chunk.  Compare your continuation to what actually came next. 
  4. Go to 2
You end up re-reading the whole piece many times this way.  For long things, use major breaks like chapters or whatever to limit the look back.

If you are trying to really learn the material, you can do the whole process repeatedly.  In that case, the checking in step 3 should start to show less and less divergence, mostly just style or sequencing.  You need to be careful though not to devolve into memorization.  You want to actually come up with the ideas that come next, not the words.

I do this kind of thing when I read hard material of any kind - not rigidly and sometimes changing chunks around.  If I go slowly enough, unless the material is really over my head or I am lacking needed background or something, I always end up feeling like I have had the ideas that the author was trying to convey.  That usually means that I can start to apply the ideas myself.

Sunday, January 14, 2024

Why I love mathematics

Millie and Al's https://www.flickr.com/photos/27480193@N05/

I love mathematics because it never says one thing and does something else. If it ever seems to do that, it is always because I am missing some idea. I never stay mad at mathematics.

I love mathematics because it is always there, waiting for me. It will always be there even if I don't jump on it right now. It won't run away or turn into some not fun thing. When I go back over mathematics that I haven't looked at in a while, it's like going back to the old neighborhood and having that warm and happy feeling you get when nothing has changed.

I love mathematics because it loves me. Mathematics has infinite patience for me. I can be arbitrarily stupid for arbitrarily long. Mathematics keeps the light on for me.

I love mathematics because it surprises me and makes me think differently all the time. I feel like Aeneas in the world of mathematics, constantly meeting monstra mirabile dictu, but without the carnage.