This is a good method if you are stuck and you don't know what you need to do. It also helps explore a project with a specific task in mind.
It is not very useful in giving you confidence your changes would not cause unexpected side effects, which is usually the main problem working with legacy code.
If you want confidence when working with legacy code, your best bet is the strangler fig pattern: find the boundaries of the module you want to work on, rewrite the module (or clone it and make your changes), run both at the same time in shadow mode, monitor and verify that the new module behaves the same as the old one, then switch over and eventually delete the old module.
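The shadow-mode step above can be sketched roughly like this (a minimal sketch; `legacy_total`/`new_total` are hypothetical stand-ins for the old and rewritten modules):

```python
import logging

log = logging.getLogger("shadow")

def legacy_total(items):   # stand-in for the existing module
    return sum(items)

def new_total(items):      # stand-in for the rewritten module
    return sum(items)

def total(items):
    """Serve from the legacy path; run the rewrite in shadow and compare."""
    old = legacy_total(items)
    try:
        new = new_total(items)
        if new != old:
            log.warning("shadow mismatch: old=%r new=%r input=%r", old, new, items)
    except Exception:
        log.exception("shadow path failed for input=%r", items)
    return old  # callers still get the battle-tested result
```

Once the mismatch log stays quiet under real traffic for long enough, you flip `total` to return the new result and delete the old path.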
Write tests. Most likely those 300k lines of code contain a TESST folder with 4 unit tests written by an intern who retired to become a bonsai farmer in the 1990s, and none of them pass anymore. Things become much less stressful if you have something basic telling you you're still good.
The problem with complex legacy codebases is that you don’t know about the myriads of edge cases the existing code is covering, and that will only be discovered in production on customer premises wreaking havoc two months after you shipped the seemingly regression-free refactor.
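One way to hedge against those unknown edge cases is to pin down the current behavior with characterization ("golden master") tests before touching anything. A minimal sketch, with a hypothetical `legacy_discount` standing in for the gnarly real logic:

```python
def legacy_discount(order_value, customer_years):
    # stand-in for opaque legacy logic (hypothetical)
    if order_value <= 0:
        return 0
    rate = 0.05 * min(customer_years, 4)
    return round(order_value * rate, 2)

# Record whatever the code does today -- including behavior that looks
# like a bug -- so a refactor that changes any output fails loudly.
golden = {
    (100, 1): legacy_discount(100, 1),
    (100, 10): legacy_discount(100, 10),
    (-5, 3): legacy_discount(-5, 3),   # weird input? record it anyway
}

def check(fn):
    for args, expected in golden.items():
        assert fn(*args) == expected, (args, expected, fn(*args))

check(legacy_discount)  # passes by construction; run it against the refactor too
```

This doesn't prove the old behavior is *correct*, only that the new code preserves it, which is usually what "regression-free" actually means for legacy systems.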
I agree. This is one area I'm hoping that AI tools can help with. Given a complex codebase that no one understands, the ability to have an agent review the code change is at least better than nothing at all.
Is it possible in practice to control the side effects of making changes in a huge legacy code base?
Maybe the software crashes when you write 42 in some field and you're able to tell it's due to a missing division-by-zero check deep down in the code base. Your gut tells you you should add the check but who knows if something relies on this bug somehow, plus you've never heard of anyone having issues with values other than 42.
At this point you decide to hard code the behavior you want for the value 42 specifically. It's nasty and it only makes the code base more complex, but at least you're not breaking anything.
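That kind of scoped workaround might look like this (a sketch; `parse_quantity` is a hypothetical stand-in for the deep legacy code path):

```python
def parse_quantity(raw):
    # stand-in for deep legacy parsing logic (hypothetical)
    return 1000 // (raw - 42)   # latent division-by-zero bug at 42

def parse_quantity_safe(raw):
    # Special-case the one known-bad input instead of adding a general
    # check that might change behavior something else relies on.
    if raw == 42:
        return 0   # chosen fallback for this field; everything else untouched
    return parse_quantity(raw)
```

The general fix would be cleaner, but the special case keeps the blast radius to exactly the input you know is broken.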
Does anyone have experience with this mindset of embracing the mess?
For me nowadays it goes like this:
- try to locate the relevant files
- now build a prompt: explain the use case or the purpose of the refactor. Mention the relevant files, describe how they interact, and how you understand they work together. Also explain how you think the code needs to be refactored. Instruct the model to analyze the code and propose different solutions for a complete refactor. Tell it not to implement anything, just to plan.
Then you’ll get several paths of action.
Choose one and tell the model to write it into a file you'll keep around while the implementation is ongoing, so you won't pollute the context and can start over on each chunk of work in a clean prompt.
Name the file refactor-<name>-plan.md and tell it to write the plan step by step, plus a todo list that takes dependencies into account, for tracking progress.
Review the plans and make fixes if needed. You need some sort of table resembling a todo list so the model can track progress as it goes.
Open a new prompt, tell it to analyze the plan file, go to the todo list section, and proceed with the next task. Verify it's done, and update the plan. Repeat until done.
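A plan file along these lines (all names hypothetical) might look something like:

```markdown
# refactor-billing-plan.md  (hypothetical example)

## Goal
Split invoice rendering out of the OrderService god class.

## Steps
1. Characterize current invoice output with snapshot tests.
2. Extract InvoiceRenderer behind the existing interface.
3. Route OrderService through InvoiceRenderer.

## Todo (respects dependencies: 1 -> 2 -> 3)
- [ ] 1. Snapshot tests for invoice output
- [ ] 2. Extract InvoiceRenderer
- [ ] 3. Rewire OrderService
```

Because the plan lives in a file rather than the chat transcript, each fresh prompt only needs "read the plan, do the next unchecked item, tick it off."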
While great in theory, I think it almost always fails when there's no testing structure that reliably covers the areas you're modifying. I change something, and if there's no immediate build or compile error, this (depending on the system) usually does not mean you're safe. A lot of issues happen at the interfaces (data in/out of the system) and in certain advanced states and contexts. I wouldn't know how Mikado helps here.
In other words, I'd reword this to: use the Mikado method to understand large codebases, or to get a first glimpse of how things are connected and wired up. But to say it allows for _safe_ changes is stretching it a bit much.
Yes, most of the time such spaghetti code projects don't have any tests either. You may have to take the time to develop them, working at a high level first and then developing more specific tests. Hopefully you can use some coverage tools to determine how much of the code you are exercising. Again this isn't always feasible. Once you have a decent set of tests that pass on the original code base, you can start making changes.
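The high-level-first approach plus coverage can be sketched with just the standard library (a minimal sketch; `legacy_price` is a hypothetical stand-in, and the stdlib `trace` module plays the role of a coverage tool):

```python
import trace

def legacy_price(qty, tier):
    # stand-in for untested legacy logic (hypothetical)
    if tier == "gold":
        return qty * 8
    if qty > 100:
        return qty * 9
    return qty * 10

# Coarse, high-level test first: does the common path still work?
assert legacy_price(5, "std") == 50

# Then measure which lines that test actually exercised.
tracer = trace.Trace(count=True, trace=False)
tracer.runfunc(legacy_price, 5, "std")
results = tracer.results()
# results.counts maps (filename, lineno) -> hit count; lines with no
# entry were never exercised, so target those with more specific tests.
```

In practice you'd use a real coverage tool over the whole test suite, but the loop is the same: run broad tests, find unexercised branches, write narrower tests until the numbers are good enough to refactor behind.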
Working with old code is tough, no real magic to work around that.
The things that always get me with tasks like this is that there are *always* clear, existing errors in the legacy code. And you know if you fix those, all hell will break loose!
Using a Mikado-style graph for planning any large work in general has been really useful to me. I used it a lot at both Telia back in 2019 and Mentimeter in 2022.
It gives a great way to visualise the work needed to achieve a goal, without ever mentioning time.
I’ve been using a form of the Mikado Method based on a specific ordering of git commits (by message prefix) along with some pre-commit hook scripts, governed by a document: https://docs.eblu.me/how-to/agent-change-process
I have this configured to feed in to an agent for large changes. It’s been working pretty well, still not perfect though… the tricky part is that it is very tempting (and maybe even sometimes correct) to not fully reset between mikado “iterations”, but then you wind up with a messy state transfer. The advantage so far has been that it’s easy to make progress while ditching a session context “poisoned” by some failure.
I think there are similar methods, such as nested todo-lists. But DAGs are exceptionally good for this use case of visualising work (Mikado graphs are DAGs).
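Since Mikado graphs are DAGs, the "do the leaves first" ordering falls straight out of a topological sort. A minimal sketch with Python's stdlib `graphlib` (task names hypothetical; edges point at what must be done first):

```python
from graphlib import TopologicalSorter

# Hypothetical Mikado graph: node -> set of prerequisites.
graph = {
    "extract billing module": {"add seam around invoicing", "pin behavior with tests"},
    "add seam around invoicing": {"pin behavior with tests"},
    "pin behavior with tests": set(),
}

# TopologicalSorter takes node -> predecessors, so static_order() yields
# the leaves (no prerequisites) first -- the smallest safe steps --
# and the overall goal last.
order = list(TopologicalSorter(graph).static_order())
print(order)  # leaves first, goal last
```

Nested todo lists flatten this into an outline, but the DAG form also catches cycles (graphlib raises `CycleError`), which in Mikado terms means you've discovered two prerequisites that block each other and need to be split further.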
1. take a well known method for problem solving basically any programmer/human knows
2. slap a cool word from the land of the rising sun
3.???
4. profit!
This article is painfully pretentious and stealth marketing for a book
So you do things one step at a time and timebox as you go? This method probably doesn't need its own name. In fact I think that's just what timeboxing is.
FWIW Mikado seems to be the name of that game where you pick up one stick at a time from a pile, while trying to not disturb the pile. (I forget the exact rules). So it isn’t as if somebody is trying to name this method after themselves or something, it is just an attempt at an evocative made up term. Timeboxing is also, right? I mean, timeboxing is not recognized by my spell checker (I’d agree that it is more intuitive though).
Mikado is the name of an opera (by Gilbert and Sullivan) in which someone is deemed to have been executed without actually having been executed. Sounds like an ideal test strategy to me: yes, all the tests were executed, just not actually run.
There are important additions beyond timeboxing, at least according to the post. Notably, reverting your changes if you weren't able to complete the chosen task in the time box and starting over on a chosen subset of that task. I can imagine that part has benefits, though I haven't tried it myself.
It goes a lot further than plan mode though, in fact I would say the key difference of mikado refactors from waterfall refactors is that you don’t do all the planning up front with mikado. If anything you try to do as little planning as possible.
Then by definition you have the smallest safest step you can take. It would be the leaf nodes on your graph?
Is that the Mikado method?
Edit: I thought I read it was of Scandinavian origin, hence my comment. But Wikipedia says European origin. Well well.
Using a programming language that has a compiler, lucky.