Refactoring Algorithmic Code using a Golden Master Record

18.12.2017 | 9 minutes of reading time

Introduction

There are days when I find a piece of code I simply have to refactor. Sometimes because I actually have to for project-related reasons, sometimes because it’s easy to, and sometimes because I just want to. One definition of a nice day is when all these reasons meet.

Enter the Portuguese tax number verification.

For those who don’t know, in most countries the tax number has one or more check digits which are calculated according to some algorithm the country’s legislators think is up for the task. If you have a frontend where customers enter tax numbers, it’s a good first step to actually check the validity of the number according to the check digit in order to provide fast feedback to the user.

Usually I don’t do the research on the algorithms we implement for this myself. Someone else on my team calls a colleague from that country within the same organization, and we code monkeys usually get code snippets, such as this one. In this case I got curious. The Portuguese Wikipedia contains a nice explanation of the check digit algorithm, which is a variation of what is (according to Google) called the modulus 11 check digit algorithm.

Implementing this from scratch probably would have been straightforward, but I decided to refactor for several reasons:
Sometimes web sources can be wrong. Take, for example, this community wiki: they seem to have forgotten a part of the algorithm. If this had been my first source, I’d have had a bug report. Thus I usually try to stay close to the scripts my customers provide me with.
Refactoring this very isolated piece of code would be easy.
Sometimes the scripts we get from our colleagues contain a little bit of extra logic which makes sense in the context where we use them. In this one, for example, I was told to “just don’t worry about the extra cases at the beginning. We are not interested in these and it’s okay to delete them.”

Side note: For the purposes of this exercise, I used ES6 transpiled with Babel. For my tests, I use mocha and chai.

Enough introduction! Let’s have a look at:

The Code

I admit I did a tiny bit of untested refactoring first: returning true or false instead of showing an alert, exporting the function, and deleting the lines that were unnecessary by our definition.

export function validaContribuinte(contribuinte) {
    // NIF validation algorithm according to
    // http://pt.wikipedia.org/wiki/N%C3%BAmero_de_identifica%C3%A7%C3%A3o_fiscal
    let comparador;
    var temErro = 0;

    var check1 = contribuinte.substr(0, 1) * 9;
    var check2 = contribuinte.substr(1, 1) * 8;
    var check3 = contribuinte.substr(2, 1) * 7;
    var check4 = contribuinte.substr(3, 1) * 6;
    var check5 = contribuinte.substr(4, 1) * 5;
    var check6 = contribuinte.substr(5, 1) * 4;
    var check7 = contribuinte.substr(6, 1) * 3;
    var check8 = contribuinte.substr(7, 1) * 2;

    var total = check1 + check2 + check3 + check4 + check5 + check6 + check7 + check8;
    var divisao = total / 11;
    var modulo11 = total - parseInt(divisao) * 11;
    if (modulo11 == 1 || modulo11 == 0) {
        comparador = 0;
    } // exception
    else {
        comparador = 11 - modulo11;
    }

    var ultimoDigito = contribuinte.substr(8, 1) * 1;
    if (ultimoDigito != comparador) {
        temErro = 1;
    }

    if (temErro == 1) {
        return false;
    }
    return true;
}
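To make the arithmetic concrete, here is a minimal sketch that spells out the intermediate values of the check for a single number. The helper name `explainCheckDigit` is made up for this illustration and is not part of the script I received; the weights and the exception for remainders 0 and 1 are simply read off the function above.

```javascript
// Hypothetical helper (not part of the original script): returns the
// intermediate values of the modulus 11 check for a 9-digit string.
function explainCheckDigit(taxNumber) {
    // The first eight digits are weighted 9, 8, ..., 2 and summed up.
    const weightedSum = taxNumber
        .slice(0, 8)
        .split('')
        .reduce((sum, digit, index) => sum + Number(digit) * (9 - index), 0);
    const remainder = weightedSum % 11;
    // Remainders 0 and 1 both map to check digit 0; otherwise 11 - remainder.
    const expectedCheckDigit = remainder < 2 ? 0 : 11 - remainder;
    const actualCheckDigit = Number(taxNumber[8]);

    return { weightedSum, remainder, expectedCheckDigit, actualCheckDigit };
}

console.log(explainCheckDigit('504917951'));
// 5*9 + 0*8 + 4*7 + 9*6 + 1*5 + 7*4 + 9*3 + 5*2 = 197 and 197 % 11 = 10,
// so the expected check digit is 11 - 10 = 1, which matches the last digit.
```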

Where to start

The first thing you want when you refactor code is unit tests for that code. Since most code that needs refactoring is hard to understand, a lot of people prefer not to even try, and create a Golden Master test instead. A detailed explanation as well as a walkthrough in Java can be found here.

Creating a Golden Master

So the steps to creating a Golden Master test are:

Create a number of random inputs for your testee
Use these inputs to generate a number of outputs
Record the inputs and outputs.
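The three steps can be sketched in a few lines. This is only an illustration under assumptions: the helper names `recordGoldenMaster` and `findMismatches` and the toy `legacy` function are invented here, and the random inputs are unseeded to keep it short.

```javascript
// Hypothetical helpers, not taken from the original project.

// Steps 2 and 3: run the function under test on every input and keep
// each input together with the output it produced.
function recordGoldenMaster(fn, inputs) {
    return inputs.map(input => ({ input, result: fn(input) }));
}

// Later: replay the recorded inputs and collect every record whose
// current output differs from the recorded one.
function findMismatches(fn, records) {
    return records
        .map(({ input, result }) => ({ input, expected: result, actual: fn(input) }))
        .filter(({ expected, actual }) => expected !== actual);
}

// Toy stand-in for "legacy code we do not want to read yet".
const legacy = n => n % 3 === 0 || n % 5 === 0;

// Step 1: random inputs (unseeded here, to keep the sketch short).
const inputs = Array.from({ length: 1000 }, () => Math.floor(Math.random() * 1000));
const goldenMaster = recordGoldenMaster(legacy, inputs);

// The unchanged function always conforms to its own record.
console.log(findMismatches(legacy, goldenMaster).length); // 0
```

A refactoring that changes behavior for any recorded input shows up as a non-empty mismatch list, together with the input that triggered it.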

Why are we doing this?

If the number of random inputs is high enough, it’s very probable that we have all test cases in there somewhere. If we capture the behavior of the testee before we start changing anything, we will notice as soon as a later change breaks it.

There’s one thing I want to say now: A Golden Master record should in most cases be only a temporary solution. You do not really want files or databases full of randomly generated crap to clog your server, and you don’t want long-running tests with way too many redundant test cases on your CI server.

Step 1: Create A Number Of Random Inputs

For this, we have to actually look at the code to be refactored. A quick glance says: “This function takes strings of length 9 which contain only digits as valid input”.

My first instinct was to try and calculate all of them. After a few frustrating minutes which I spent discussing with my computer’s memory, I did a small back-of-an-envelope calculation (16 Bit x 9 x 899999999 > 15 GB). So this turned out to be a Bad Idea™.
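The estimate is quick to reproduce. The sketch below assumes 16 bits per character and counts only the raw characters of every 9-digit candidate; it ignores any per-string overhead the engine adds, so the real number would be higher.

```javascript
// Rough memory estimate for holding every 9-digit candidate as a string.
const candidates = 999999999 - 100000000 + 1; // 900,000,000 numbers
const charsPerNumber = 9;
const bytesPerChar = 2;                       // assumption: 16 bits per character

const totalBytes = candidates * charsPerNumber * bytesPerChar;
console.log((totalBytes / 1e9).toFixed(1) + ' GB'); // 16.2 GB
```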

The next best thing was to create some random numbers between 100000000 and 999999999. After a bit of experimentation, because I “have no idea of the algorithm” for the purpose of this exercise, I settled on 10000 random fake tax numbers, which corresponded to three seconds overall test runtime on my machine. The code to generate these is wrapped in a testcase for easy access (remember, this is temporary):

describe('validatePortugueseTaxNumber', () => {
    describe('goldenMaster', () => {
        it('should generate a golden master', () => {
            const gen = random.create('My super Golden Master seed'),
                expectedResultsAndInputs = [ ...new Array(1000000) ].map(() => {
                    const input = gen.intBetween(100000000, 999999999),
                        ...
                });
        }).timeout(10000);
    });
});

Side note: It is often recommended to use a seedable random generator. Since at that point I was not sure whether I wanted to actually save the inputs or not, I ended up using this PRNG. It’s not strictly necessary for this exercise, though.
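What a seed buys you is reproducibility: the same seed yields the same inputs on every run. The sketch below shows this with mulberry32, a small well-known PRNG that I am swapping in here so the example has no dependencies; it is not the library used in this post. The `createGenerator` wrapper and its inclusive `intBetween(min, max)` bounds are assumptions that merely mimic the call shape used above.

```javascript
// mulberry32: a small 32-bit PRNG, used here as a stand-in for the
// seedable generator from the post (which is a different library).
function mulberry32(seed) {
    let state = seed >>> 0;
    return function next() {
        state = (state + 0x6D2B79F5) | 0;
        let t = Math.imul(state ^ (state >>> 15), 1 | state);
        t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // float in [0, 1)
    };
}

// Hypothetical wrapper mimicking the intBetween(min, max) call shape;
// both bounds are treated as inclusive here.
function createGenerator(seed) {
    const next = mulberry32(seed);
    return {
        intBetween: (min, max) => min + Math.floor(next() * (max - min + 1)),
    };
}

const firstGen = createGenerator(42);
const secondGen = createGenerator(42);
const firstRun = Array.from({ length: 5 }, () => firstGen.intBetween(100000000, 999999999));
const secondRun = Array.from({ length: 5 }, () => secondGen.intBetween(100000000, 999999999));

// Same seed, same sequence of fake tax numbers.
console.log(JSON.stringify(firstRun) === JSON.stringify(secondRun)); // true
```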

Step 2: Use These Inputs To Generate A Number Of Outputs.

Just call the function.

...
    const input = gen.intBetween(100000000, 999999999),
        result = validaContribuinte(input.toString(10));

    return { input, result };
...

Step 3: Record The Inputs And Outputs

This also was pretty straightforward. I used the built-in mechanisms of node.js to write the output to a ~3.5MB file.

fs.writeFileSync('goldenMaster.json', JSON.stringify(expectedResultsAndInputs));

And just like that, a Golden Master was created.

Create a test based on the Golden Master

The next step is to use the Golden Master in a test case. For each recorded input, the function’s output has to match the output stored in the file.
My test looks like this:

it('should always conform to golden master test', () => {
    // Load the recorded input/output pairs.
    const buffer = fs.readFileSync('goldenMaster.json'),
        data = JSON.parse(buffer);

    // Every recorded input must still produce the recorded result.
    data.forEach(({ input, result }) => {
        expect(validaContribuinte(input.toString(10))).to.equal(result);
    });
}).timeout(10000);

Side note: I stopped running the Golden Master generation every time; even though it would never produce different results unless the seed changed, it would’ve been a waste of resources to run every time.

I ran this a couple of times just for the heck of it. Then I started playing around with the code under test, deleting a line here, changing a number there, until I was confident that my Golden Master was sufficiently capturing all the cases. I encourage you to do this: it’s one of the very few times that you get to be happy about red tests.
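This kind of sabotage can also be scripted. The sketch below is an illustration, not what I actually ran: `makeValidator` is a made-up factory that rebuilds the check digit rule in compact form, the inputs are a fixed range rather than random numbers, and the "mutant" deliberately forgets the exception for remainder 1.

```javascript
// Hypothetical factory: builds a validator from a rule that maps the
// remainder of the weighted sum to the expected check digit.
function makeValidator(comparatorFor) {
    return taxNumber => {
        const remainder = taxNumber
            .slice(0, 8)
            .split('')
            .reduce((sum, digit, index) => sum + Number(digit) * (9 - index), 0) % 11;
        return Number(taxNumber[8]) === comparatorFor(remainder);
    };
}

// Same rule as the code under test: remainders 0 and 1 map to 0.
const original = makeValidator(r => (r > 1 ? 11 - r : 0));
// Deliberately broken: forgets the exception for remainder 1.
const mutant = makeValidator(r => (r > 0 ? 11 - r : 0));

// Fixed inputs instead of random ones, so the outcome is reproducible.
const inputs = Array.from({ length: 1000 }, (_, i) => String(100000000 + i));
const goldenMaster = inputs.map(input => ({ input, result: original(input) }));

// Replay the record against the mutant and collect what it gets wrong.
const caught = goldenMaster.filter(({ input, result }) => mutant(input) !== result);
console.log(caught.length > 0); // true, e.g. '100000100' is only valid with the exception
```

If a mutation like this one slips through without any red test, the record does not cover that case yet and needs more inputs.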

I was not really satisfied with the output yet. “expected false to equal true” in which case, exactly? Again, in this simple case it would probably not have been necessary, but sometimes it can be useful to also record the failing input. So, after some refactoring, this happened:

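it('should always conform to golden master test', () => {
    const buffer = fs.readFileSync('goldenMaster.json'),
        data = JSON.parse(buffer);

    data.forEach(expectedResult => {
        const { input } = expectedResult;
        const result = validatePortugueseTaxNumber(input.toString(10));

        // Comparing whole records puts the failing input into the assertion message.
        expect({ input, result }).to.deep.equal(expectedResult);
    });
}).timeout(10000);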

Refactoring

The refactoring itself was pretty straightforward. For the sake of brevity, most of the steps are skipped in this post.
Renaming the function and a few variables:

export function validatePortugueseTaxNumber(taxNumber) {
    // NIF validation algorithm according to
    // http://pt.wikipedia.org/wiki/N%C3%BAmero_de_identifica%C3%A7%C3%A3o_fiscal
    let comparator;
    let checkDigitWrong = 0;

    const check1 = taxNumber.substr(0, 1) * 9;
    const check2 = taxNumber.substr(1, 1) * 8;
    const check3 = taxNumber.substr(2, 1) * 7;
    const check4 = taxNumber.substr(3, 1) * 6;
    const check5 = taxNumber.substr(4, 1) * 5;
    const check6 = taxNumber.substr(5, 1) * 4;
    const check7 = taxNumber.substr(6, 1) * 3;
    const check8 = taxNumber.substr(7, 1) * 2;

    const total = check1 + check2 + check3 + check4 + check5 + check6 + check7 + check8;
    const divisao = total / 11;
    const modulo11 = total - parseInt(divisao) * 11;
    if (modulo11 == 1 || modulo11 == 0) {
        comparator = 0;
    }
    else {
        comparator = 11 - modulo11;
    }

    const ultimoDigito = taxNumber.substr(8, 1) * 1;
    if (ultimoDigito != comparator) {
        checkDigitWrong = 1;
    }

    if (checkDigitWrong == 1) {
        return false;
    }
    return true;
}

Simplifying (a lot):

export function validatePortugueseTaxNumber(taxNumber) {
    const checkSumMod11 = taxNumber.substr(0, 8)
            .split('')
            .map((digit, index) => parseInt(digit, 10) * (9 - index))
            .reduce((a, b) => a + b) % 11,
        comparator = checkSumMod11 > 1 ? 11 - checkSumMod11 : 0;

    return parseInt(taxNumber.substr(8, 1), 10) === comparator;
}

This is where I stopped.

Writing unit tests

By now I had a better understanding of what my piece of code did. And, as was said above, it’s a good idea to get rid of a golden master, so the time had come to think about valid test inputs.

Apparently a remainder of 0 and 1 was important. To this, I added the edge case of remainder 10, and some remainder in the middle range just to be sure. As for generating the corresponding inputs, I cheated a little:

...
if (checkSumMod11 === 0 && lastDigit === comparator) {
    console.log(taxNumber);
}
...
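The same cheat can be written as a small standalone search. This is a sketch under assumptions: `checkSumMod11`, `isValid` and `findExample` are names invented for this illustration, and the search simply walks upward from 100000000, so it finds the smallest matching number rather than a random one.

```javascript
// Remainder of the weighted digit sum (weights 9 down to 2).
function checkSumMod11(taxNumber) {
    return taxNumber
        .slice(0, 8)
        .split('')
        .reduce((sum, digit, index) => sum + Number(digit) * (9 - index), 0) % 11;
}

// Same rule as the refactored validator.
function isValid(taxNumber) {
    const remainder = checkSumMod11(taxNumber);
    const comparator = remainder > 1 ? 11 - remainder : 0;
    return Number(taxNumber[8]) === comparator;
}

// Hypothetical search helper: first 9-digit number at or after `start`
// with the requested remainder and the requested validity.
function findExample(remainder, valid, start = 100000000) {
    for (let n = start; n <= 999999999; n++) {
        const candidate = String(n);
        if (checkSumMod11(candidate) === remainder && isValid(candidate) === valid) {
            return candidate;
        }
    }
    return null;
}

// One valid and one invalid example for each interesting remainder.
for (const remainder of [0, 1, 2, 10]) {
    console.log(remainder, findExample(remainder, true), findExample(remainder, false));
}
```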

Using this generator function, I created the final unit tests for the portugueseTaxNumberValidator:

describe('validatePortugueseTaxNumber', () => {
    it('should return false for 520363144 (case checkSum % 11 === 0)', () => {
        expect(validatePortugueseTaxNumber('520363144')).to.equal(false);
    });

    it('should return false for 480073977 (case checkSum % 11 === 1)', () => {
        expect(validatePortugueseTaxNumber('480073977')).to.equal(false);
    });

    it('should return false for 291932333 (case checkSum % 11 === 2)', () => {
        expect(validatePortugueseTaxNumber('291932333')).to.equal(false);
    });

    it('should return false for 872711478 (case checkSum % 11 === 10)', () => {
        expect(validatePortugueseTaxNumber('872711478')).to.equal(false);
    });

    it('should return true for 523755600 (case checkSum % 11 === 0)', () => {
        expect(validatePortugueseTaxNumber('523755600')).to.equal(true);
    });

    it('should return true for 998757039 (case checkSum % 11 === 2)', () => {
        expect(validatePortugueseTaxNumber('998757039')).to.equal(true);
    });

    it('should return true for 504917951 (case checkSum % 11 === 10)', () => {
        expect(validatePortugueseTaxNumber('504917951')).to.equal(true);
    });
});

Conclusion

Creating a Golden Master and using it during refactoring feels like you’re wrapped in a big, fluffy cotton ball. If the Golden Master record is detailed enough, nothing can go wrong. Or rather, if it does, you will notice in an instant. There are no qualms about deleting code, replacing it with something you think will do the same, because it’s a safe experiment. It was a fun exercise and I would do it again in an instant.

Blog author

Stefanie Hasler
