Q&A #3: Sophisticated vs trivial problems, when to learn coding, how I learned SQL

by Justin Skycak (@justinskycak) on December 28, 2024

Link to Podcast

What it means for a problem to be sophisticated, not made trivial by foundational knowledge. When is the best time to learn coding, at an early age or after you have some university-level math under your belt? How I learned to write, organize, and debug big-ass SQL queries.

Want to get notified about new posts? Join the mailing list and follow on X/Twitter.

The transcript below is provided with the caveat that there may be occasional typos and light rephrasings. Typos can be introduced by the process of converting audio to a raw word-for-word transcript, and light rephrasings can be introduced by the process of smoothing out natural speech patterns to be more readable as text.

∗ ∗ ∗

Intro

Hey, I’m back for another Q&A.

Last time I mentioned that I would do the next Q&A session on coding questions in particular, so that’s what I’ll focus on here.

Let’s just jump right to it.

First question, what do you mean by sophisticated problem, not made trivial by foundational knowledge?

For a bit of background, this had to do with a post that I made on how to get yourself into a full-time software job during college. I made a roadmap for that, and the first two steps were: step one, learn a bunch of skills ahead of time, and step two, demonstrate those skills on interesting projects.

And in particular, I mentioned that the problems that you work on should be sophisticated, not stuff that’s made trivial by foundational knowledge.

What does that mean? What does it mean for a type of problem to be sophisticated and not made trivial by foundational knowledge?

Basically, what I’m saying here is that you don’t want to spend a ton of time putting together some kind of hacky solution to a problem that could be solved using basic algebra or basic calculus.

What’s not impressive would be if you spend a month essentially finding the area under a curve, except you don’t know calculus, you don’t know Riemann sums, and you come up with this janky method that sort of works in your use case. Then you tell somebody about it and about how you came up with this complicated method, and here are the shortcomings, here’s how you resolved some of the shortcomings, and then they just ask you, well, why don’t you just use Riemann sums? I mean, anyone who took basic calculus knows what a Riemann sum is.
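To make the point concrete, here is a minimal sketch of a Riemann sum in Python. The function name `riemann_sum`, the midpoint rule, and the default of 1000 rectangles are choices made for this illustration, not something specified in the talk.

```python
def riemann_sum(f, a, b, n=1000):
    """Approximate the area under f on [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * width  # sample f at the middle of each strip
        total += f(midpoint) * width      # area of one thin rectangle
    return total


# Area under y = x^2 from 0 to 3 (the exact answer from calculus is 9).
area = riemann_sum(lambda x: x ** 2, 0.0, 3.0)
print(area)
```

A dozen lines cover what the hypothetical month-long janky method was trying to do, which is the sense in which foundational knowledge makes the problem trivial.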

Or the same thing for solving a system of linear equations. Maybe you came up with some kind of personally interesting way of doing it, but it turns out that if you just knew some basic linear algebra, this would be trivial. You just dump it in a matrix, row reduce the matrix, and bam, now you know whether there is a solution, whether it’s unique, whether there are infinitely many solutions, what they are. It’s just easy if you know your basic linear algebra.
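The "dump it in a matrix and row reduce" idea can be sketched as follows. This is an illustrative implementation under stated assumptions: the names `row_reduce` and `classify` are made up for this example, the input is an augmented matrix [A | b] given as a list of rows, and exact `Fraction` arithmetic is used to sidestep floating-point issues.

```python
from fractions import Fraction


def row_reduce(augmented):
    """Return (reduced row echelon form, pivot columns) of an augmented matrix [A | b]."""
    rows = [[Fraction(x) for x in row] for row in augmented]
    n_rows, n_cols = len(rows), len(rows[0])
    pivot_row, pivot_cols = 0, []
    for col in range(n_cols - 1):  # the last column holds b, so it never gets a pivot
        # Find a row at or below pivot_row with a nonzero entry in this column.
        candidate = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if candidate is None:
            continue
        rows[pivot_row], rows[candidate] = rows[candidate], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * p for x, p in zip(rows[r], rows[pivot_row])]
        pivot_cols.append(col)
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows, pivot_cols


def classify(augmented):
    """Report whether the system has no, one, or infinitely many solutions."""
    rows, pivot_cols = row_reduce(augmented)
    n_unknowns = len(augmented[0]) - 1
    for row in rows:
        # A row reading 0 = nonzero means the system is inconsistent.
        if all(x == 0 for x in row[:-1]) and row[-1] != 0:
            return "no solution", None
    if len(pivot_cols) < n_unknowns:
        return "infinitely many solutions", None
    return "unique solution", [rows[i][-1] for i in range(n_unknowns)]


# x + y = 3 and x - y = 1
print(classify([[1, 1, 3], [1, -1, 1]]))
```

For the infinitely-many case this sketch only reports the classification; reading off the free variables from the reduced rows is left out to keep it short.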

Basically, it’s not impressive to solve an easy problem in a complicated way. What’s impressive is to solve a problem that’s actually hard, and when you’re explaining it to somebody, they don’t know already how to solve the problem. Or maybe they think they do, but then you mention some kind of edge cases that they’re not thinking about, and then they’re like, oh, okay, this is actually a very hard problem to solve. I’m not sure how I’d do it. And then you tell them how you did it, and then they’re like, wow, that was a good idea.

That’s the effect that you’re going for. You want to stay away from reinventing a worse wheel. Because think about what kind of impression that makes. If you show off reinventing a worse wheel, you’re basically taking an easy problem that could be resolved really quickly and elegantly if you just had your foundational knowledge in place, and then coming up with some personal solution that is just worse. It’s more complicated. It took you a long time to put together. And maybe it doesn’t solve the problem as well.

The impression that gives is that you’re somebody who gets satisfaction out of creating problems that don’t need to be there. And that is not the impression that you want to make on a potential employer. You don’t want to come across as somebody who creates these issues. You want to come across as somebody who just drills into the core of a real problem, finds all the things that can be solved easily, and then just solves them easily using foundational skills. Just gets those out of the way. Then you either fully solve the problem or you hit this core of the problem that can’t be solved easily using well-known foundational knowledge and skills. And then that’s where you start bringing some of your original intellect and problem-solving skills to just chip away at it.

And that’s what you want to communicate: that there are legitimately hard problems, and you’re able to make progress in destroying them.

So it’s really two parts. Part one is you want to show that you can get all the easy parts of the problem out of the way. They’re not a distraction to you. You obliterate them quickly, elegantly, with your foundational knowledge. You don’t get bogged down in them. Part two is that you hit this rock-solid core of legitimately hard problem, and then you’re able to grapple with that and make progress. It’s sophisticated; it’s not made trivial by your foundational knowledge or your potential employer’s foundational knowledge. It’s something that everyone who looks at it can point to and say, wow, yeah, that’s a challenging problem. It’s not obvious how to go about solving it. And that’s when you can say how you went about solving it. And that’s when you can impress with your solution to it.

Next question, I would love it if you would sketch out more of your ideas on how to learn coding. I've read your FAQ on your website, but as a non-coder, it's still mystifying to me where coding sits properly in the educational sequence. Should it be started as young as possible? After multivariate calculus? I don't understand how it works together with other subjects.

This is a question about a post where I was talking about these coding courses that I developed and taught to students in Math Academy’s original in-school program in Pasadena. Basically, they learn high school math in middle school and university math in high school.

In eighth grade, they would pass the AP Calc BC exam. In ninth grade, they get a ton of linear algebra and multivariable calculus under their belt, some differential equations, and probability and statistics too. In tenth grade, they could join the lowest course in my coding sequence, which would leverage all of that background math and pull it together into doing some serious quantitative computer science. Scaffolding it up all the way through basic data structures and algorithms and then building a machine learning library from scratch.

I mean, not even using NumPy or some linear algebra library, but actually building a matrix class, building determinant and inverse and row reduction and matrix multiplication methods within that class, and then using those matrices to fit linear regressions. Then also building decision trees and logistic regressions, neural nets, implementing backprop by hand, not just offloading it to some other off-the-shelf library that does all the hard work. They would have to do all this by hand. They would have to code this up themselves.
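For a sense of what building this from scratch can look like, here is a small sketch of a matrix class used to fit a linear regression. It is not the course's actual code: the class layout, the method names, the Gauss-Jordan approach to the inverse, and the `fit_line` helper are all assumptions made for illustration.

```python
class Matrix:
    """A tiny dense matrix backed by a list of row lists (illustrative only)."""

    def __init__(self, rows):
        self.rows = [list(map(float, row)) for row in rows]
        self.n_rows = len(self.rows)
        self.n_cols = len(self.rows[0])

    def transpose(self):
        return Matrix([list(col) for col in zip(*self.rows)])

    def __matmul__(self, other):
        # Entry (i, j) is the dot product of row i of self with column j of other.
        other_cols = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(row, col)) for col in other_cols]
                       for row in self.rows])

    def inverse(self):
        """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
        n = self.n_rows
        aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
               for i, row in enumerate(self.rows)]
        for col in range(n):
            # Partial pivoting: use the row with the largest entry in this column.
            pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
            if abs(aug[pivot][col]) < 1e-12:
                raise ValueError("matrix is singular")
            aug[col], aug[pivot] = aug[pivot], aug[col]
            scale = aug[col][col]
            aug[col] = [x / scale for x in aug[col]]
            for r in range(n):
                if r != col:
                    factor = aug[r][col]
                    aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
        return Matrix([row[n:] for row in aug])


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x via the normal equations."""
    X = Matrix([[1.0, x] for x in xs])
    y = Matrix([[v] for v in ys])
    Xt = X.transpose()
    coeffs = (Xt @ X).inverse() @ Xt @ y
    return coeffs.rows[0][0], coeffs.rows[1][0]


intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

The regression step is where the linear algebra background pays off: the fit is just the normal equations written with the class's own transpose, multiplication, and inverse methods.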

Eventually, they got into reproducing research papers in machine learning and AI. In particular, the research papers in the Blondie24 research program, which was in the 90s. It was centered around evolving neural nets to play games, starting out with tic-tac-toe, but then moving on to more advanced games like checkers.

Back to the question. Personally, I started these students on coding after multivariable calculus and linear algebra. That was necessary in order to, for instance, build a machine learning library from scratch, implementing the matrix class from scratch, like determinant, inverse, row reduction, and implementing backpropagation and neural nets from scratch, which requires using the multivariable chain rule.
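To illustrate why the chain rule is needed, here is a minimal sketch of backpropagation done by hand on the smallest possible network: one input, one tanh hidden unit, one output, and a squared-error loss. The network shape, the variable names, and the finite-difference check are assumptions chosen for this example. In wider or deeper networks, the same idea applies, with contributions summed over every path from a weight to the loss.

```python
import math


def forward(w1, w2, x):
    """Tiny network: h = tanh(w1 * x), y_hat = w2 * h."""
    h = math.tanh(w1 * x)
    return h, w2 * h


def loss(w1, w2, x, y):
    _, y_hat = forward(w1, w2, x)
    return (y_hat - y) ** 2


def gradients(w1, w2, x, y):
    """Backpropagation by hand: apply the chain rule from the loss backward."""
    h, y_hat = forward(w1, w2, x)
    d_y_hat = 2.0 * (y_hat - y)        # dL/dy_hat
    d_w2 = d_y_hat * h                 # y_hat = w2 * h, so dy_hat/dw2 = h
    d_h = d_y_hat * w2                 # and dy_hat/dh = w2
    d_w1 = d_h * (1.0 - h ** 2) * x    # d tanh(u)/du = 1 - tanh(u)^2, with u = w1 * x
    return d_w1, d_w2


def numeric_gradients(w1, w2, x, y, eps=1e-6):
    """Central finite differences, used only to sanity-check the hand-derived gradients."""
    d_w1 = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
    d_w2 = (loss(w1, w2 + eps, x, y) - loss(w1, w2 - eps, x, y)) / (2 * eps)
    return d_w1, d_w2


print(gradients(0.5, -1.2, 0.8, 0.3))
print(numeric_gradients(0.5, -1.2, 0.8, 0.3))
```

Comparing the two printed pairs is a common way to check a from-scratch backprop implementation before scaling it up.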

But I don’t think that coding has to be postponed until there. I just think this particular, very heavily quantitative computer science—building a machine learning library from scratch, reproducing research papers in machine learning and AI—you need some heavy math chops to even get your arms around those things.

There’s plenty of coding you can do aside from that before you’ve learned all the math. Sorting algorithms, hash tables, breadth-first and depth-first search, some of the more basic data structures and algorithms, you don’t need a whole lot of advanced math to do that. Things related to front-end design, like designing a game board with the game cells having the right margins and being arranged into a hexagonal board. There are plenty of interesting projects you can do without knowing university-level math, just some basic algebra.
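As one example of how little math these topics need, here is a short sketch of breadth-first search. The function name `shortest_path`, the adjacency-list dictionary, and the sample graph are made up for illustration. Note that the `parents` dictionary is itself a hash table.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency-list graph; returns a shortest path or None."""
    parents = {start: None}  # also serves as the set of visited nodes
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links backward to reconstruct the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
print(shortest_path(graph, "A", "E"))
```

Swapping the queue for a stack (popping from the same end that items are pushed to) turns the same skeleton into a depth-first search, though the path found is then no longer guaranteed to be shortest.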

For instance, one category of those projects would be reimplementing board games, making a class for every piece on the board that has its own properties, making a class for each player, and then an overarching game class that asks the players to make their moves. The players make the moves, tell the game what to do, and the game arranges its internal state.
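A rough sketch of that structure for tic-tac-toe might look like the following. The class names `RandomPlayer` and `TicTacToe`, the flat nine-cell board, and the random move strategy are assumptions for illustration. Tic-tac-toe pieces are simple enough that this sketch stores them as plain symbols rather than giving each piece its own class.

```python
import random


class RandomPlayer:
    """A player that picks any empty cell at random."""

    def __init__(self, symbol, seed=None):
        self.symbol = symbol
        self.rng = random.Random(seed)

    def choose_move(self, board):
        empty = [i for i, cell in enumerate(board) if cell is None]
        return self.rng.choice(empty)


class TicTacToe:
    """The overarching game: asks players for moves and updates its internal state."""

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

    def __init__(self, players):
        self.players = players
        self.board = [None] * 9

    def winner(self):
        for a, b, c in self.LINES:
            if self.board[a] is not None and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def play(self):
        """Alternate turns until someone wins or the board is full; return the winner or None."""
        turn = 0
        while self.winner() is None and None in self.board:
            player = self.players[turn % 2]
            move = player.choose_move(list(self.board))  # pass a copy so players can't edit state
            if self.board[move] is not None:
                raise ValueError("illegal move")
            self.board[move] = player.symbol
            turn += 1
        return self.winner()


game = TicTacToe([RandomPlayer("X", seed=1), RandomPlayer("O", seed=2)])
print(game.play())
```

Because the game only talks to players through `choose_move`, a smarter strategy or a human-input player can be dropped in without touching the game class.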

This hits on more of the software engineering, object-oriented programming, elegantly dealing with complexity, dipping your toe into managing a code base that’s larger than just a LeetCode problem. That can all be started earlier, well before your university-level math. All you really need for that is arithmetic and basic algebra knowledge. You could probably do this with a bright middle schooler.

I actually got a similar question a while ago from a parent who was asking about, in general, recommendations on how to teach kids to code. I wrote up a little response to that. My recommendation was to start with the simplest version of the real thing, something like Python or Node, as opposed to Scratch.

Start with a language that’s easy, where the low-level details like memory management are abstracted away. It reads almost like plain English, but it’s actually used by professional programmers. It’s the minimum viable version of the real thing.

I think leaning on something that tries to be more scaffolded for novices but isn’t the real thing is similar to when you’re teaching a kid arithmetic. You can use manipulatives like counting blocks. Those might be helpful at the beginning if the kid is really struggling to wrap their head around the idea of numbers and addition. But you want to wean yourself off the counting blocks as quickly as possible. You don’t want to stay in counting block world and get so used to using these manipulatives that they don’t generalize well in terms of efficiency.

You don’t want to let a kid stay in finger-counting land for too long. Otherwise, they’ll get trapped there, become comfortable with finger counting for everything, and it will turn from a scaffold into a crutch. It’s the same way with programming. You want to keep things simple, but at the same time, use tools that are fine to continue using into the future. If you’re using a real programming language like Python or Node, you can also use a real environment like VS Code or some kind of emulator and use a real debugger.

In terms of curriculum, what I recommended was to start off with some intro course that covers the very basics of coding. I saw a Python course on Codecademy that’s free and covers the very, very basics. After that, you can jump right into building progressively more complicated board games from scratch.

Maybe start with tic-tac-toe, then move on to Connect Four, then checkers. It doesn’t require a lot of math knowledge, but it forces you to get good at software development principles, like wrangling complexity, getting your class structures right, and debugging a system.

Keep the focus on building the core functionality of the game and doing it from scratch. I would always be wary about getting lost in side questions. Keep the focus on moving the needle in terms of actual coding ability. Focus on building the board game from scratch, printing out the board into the terminal.
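Printing the board to the terminal can be as plain as the sketch below. It assumes, purely for illustration, that a tic-tac-toe board is stored as a flat list of nine cells holding "X", "O", or None, and the function name `render_board` is made up.

```python
def render_board(board):
    """Return a 3x3 tic-tac-toe board as text; empty cells show their index."""
    cells = [cell if cell is not None else str(i) for i, cell in enumerate(board)]
    rows = [" | ".join(cells[r:r + 3]) for r in range(0, 9, 3)]
    return "\n---------\n".join(rows)


board = ["X", None, None,
         None, "O", None,
         None, None, "X"]
print(render_board(board))
```

Showing the index in empty cells doubles as a prompt for which number to type when a human player is asked for a move.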

If you open the door to all sorts of colors and fonts and other design settings, kids are prone to go overboard spending all their time messing around with the design when they should just be working on things that actually level up their coding skills. When they go overboard on the design, it often doesn’t even turn out good. It’s just some monstrosity of yellow text on a purple background with invisible buttons and heart sound triggers for laughs.

Keeping things limited to the terminal removes those distractions. It’s important to scaffold the games up. Start with a simple game. Don’t go after the most complicated game off the bat. Don’t start by trying to build your own World of Warcraft. The kid will bite off more than they can chew, spend a bunch of time not getting anywhere, and get frustrated. Very little learning will happen.

Start with simple games but progressively ramp them up in complexity. Never take too big a step in complexity.

I should also say this recommendation assumes that you actually know how to code and can coach your kid through this process of building these board games. If you don’t know how to code and you’re just looking for some kind of online course to offload all this work of teaching the kid through this process of building out these board games, then I don’t know. I haven’t really seen anything like that that works as effectively as I would hope online.

I mean, there are very basic coding courses, but beyond that, the level of scaffolding typically isn’t what it needs to be to keep kids on the rails. That’s something we’re working on at Math Academy. After we release this upcoming machine learning course, we’re going to look at an intro to coding course that is really highly scaffolded.

It will go beyond just the basics like what is an if statement, what is a while loop, what is a variable, and give students practice combining all these coding constructs in progressively more complicated ways. It will smooth out the transition into doing bigger projects. But that’s in the future.

For now, if you don’t know how to code and you want your kid to learn coding beyond a simple intro to Python, what’s a variable, I don’t know that there’s currently another way other than having an actual tutor or somebody who knows how to code work with them and coach them through these more advanced skills and projects.

Next question: I was talking about how it's worth learning serious SQL during school, during university, and not just the simplest queries, but actually writing, organizing, and debugging very large, complicated queries. The question is, how do you actually do that? There's another question that's similar: how do you learn SQL after landing a job? Any courses or whatever?

Honestly, I’ve not seen any amazing SQL courses that expose you to these big, complicated queries. Though I haven’t been looking lately, and it’s been years since I’ve really taken a proper look. If anyone knows of a good course out there, feel free to leave a comment. Let me know. Let everyone else know.

Personally, my SQL learning just came from taking on bigger and bigger projects and making the transition from doing one-off pieces of work to building and owning a subsystem that had to expand in capability as our product got more and more complicated.

That kind of had a scaffolding effect. When I first started working on our task selection model, the queries were a lot simpler because they could be simpler since I was building a V1 of it. The system as a whole wasn’t as complicated. Gradually over time, they had to be refined, pulling in more data, having heavier logic within them as we kept layering more capabilities onto the product.
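The talk does not say how these queries were organized, so the following is only one common approach, shown under loudly stated assumptions: breaking a larger query into named common table expressions (CTEs) so each step can be inspected on its own. The `attempts` table, its columns, and the sample rows are entirely hypothetical and are not Math Academy's schema. The example uses Python's built-in `sqlite3` module so it runs without a database server.

```python
import sqlite3

# Hypothetical data: one row per practice attempt.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attempts (student_id INTEGER, topic TEXT, correct INTEGER);
    INSERT INTO attempts VALUES
        (1, 'fractions', 1), (1, 'fractions', 0), (1, 'decimals', 1),
        (2, 'fractions', 1), (2, 'decimals', 0), (2, 'decimals', 0);
""")

QUERY = """
WITH topic_accuracy AS (
    -- Step 1: accuracy per student per topic.
    SELECT student_id, topic, AVG(correct) AS accuracy
    FROM attempts
    GROUP BY student_id, topic
),
weakest AS (
    -- Step 2: each student's lowest accuracy.
    SELECT student_id, MIN(accuracy) AS accuracy
    FROM topic_accuracy
    GROUP BY student_id
)
-- Step 3: recover which topic that lowest accuracy belongs to.
SELECT t.student_id, t.topic, t.accuracy
FROM topic_accuracy AS t
JOIN weakest AS w
  ON w.student_id = t.student_id AND w.accuracy = t.accuracy
ORDER BY t.student_id
"""

rows = conn.execute(QUERY).fetchall()
print(rows)
```

When a query like this misbehaves, one way to debug it is to temporarily replace the final SELECT with `SELECT * FROM topic_accuracy` (or any other named step) and check the intermediate result before moving on to the next step.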

Outro

Alright, I’m getting kind of tired. It’s getting around that time where I’ve got to wrap it up for today. I didn’t get through all the coding questions that I was hoping to. Only got about halfway there.

That’s alright. I’ll just save the rest for another Q&A session in the future.

Signing off, till next time.

Prompt

The following prompt was used to generate this transcript.

You are a grammar cleaner. All you do is clean grammar, remove single filler words such as “yeah” and “like” and “so”, remove any phrases that are repeated consecutively verbatim, and make short paragraphs separated by empty lines. Do not change any word choice, or leave any information out. Do not summarize or change phrasing. Please clean the attached document and deliver it to me one section at a time. Again, do not summarize. It should be almost exactly verbatim. Keep all the original phrasing.

After first response:

No, you are changing my words. Don’t change my words. Only remove single filler words such as “yeah” and “like” and “so”, remove any phrases that are repeated consecutively verbatim, and make short paragraphs separated by empty lines. Do not change any word choice, or leave any information out. Do not summarize or change phrasing. It should be almost exactly verbatim. Keep all the original phrasing.

After each section:

Next. Remember: You are a grammar cleaner. All you do is clean grammar, remove single filler words such as “yeah” and “like” and “so”, remove any phrases that are repeated consecutively verbatim, and make short paragraphs separated by empty lines. Do not change any word choice, or leave any information out. Do not summarize or change phrasing. It should be almost exactly verbatim. Keep all the original phrasing.
