Brute Force Search with Linear-Encoding Cryptography

by Justin Skycak on

Brute force search involves trying every single possibility.

This post is a chapter in the book Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. Suggested citation: Skycak, J. (2021). Brute Force Search with Linear-Encoding Cryptography. Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. https://justinmath.com/brute-force-search-with-linear-encoding-cryptography/


An encoding function maps a string to a sequence of numbers. For example, you have already implemented the trivial encoding function, defined as follows:


' ' --> 0
'a' --> 1
'b' --> 2
...
'z' --> 26

There are many different types of encoding functions, but here we will focus on linear encoding functions. For example, using a linear encoding function $f(x) = 2x+3,$ we can encode the message 'a cat' as follows:


Original message: 'a cat'
Trivial encoding: [1, 0, 3, 1, 20]
Linear encoding:  [2*1+3, 2*0+3, 2*3+3, 2*1+3, 2*20+3]
                  = [5, 3, 9, 5, 43]

Recovering a Message

If we want to recover our message from the linear encoding, we first solve for the inverse of the encoding function, i.e. the decoding function:

$\begin{align*} f(x) &= 2x+3 \\[5pt] x &= 2 f^{-1}(x)+3 \\[5pt] f^{-1}(x) &= \dfrac{x-3}{2} \end{align*}$


Then, we apply the decoding function to the encoded message:


Linear encoding:  [5, 3, 9, 5, 43]
Trivial encoding: [(5-3)/2, (3-3)/2, (9-3)/2, (5-3)/2, (43-3)/2]
                  = [1, 0, 3, 1, 20]
Original message: 'a cat'

Recovering a Message with Multiple Possible Encoding Functions

Now, suppose we have a message [3, 1, 9, 31, 15] and we know that this message was encoded using a linear encoding function $f(x) = ax+b$ with $a \in \lbrace 1, 2 \rbrace$ and $b \in \lbrace 1, 2 \rbrace.$ Can we break the code and recover the initial message?

The simplest way to do this is through brute-force search, which involves trying every single possibility. Here, there are $4$ possible encoding functions:

$\begin{align*} f_{ab}(x) &= ax + b \\ \\ f_{11}(x) &= 1x + 1 \\ f_{12}(x) &= 1x + 2 \\ f_{21}(x) &= 2x + 1 \\ f_{22}(x) &= 2x + 2 \end{align*}$


By inverting these encoding functions, we obtain the following decoding functions:

$\begin{align*} f_{ab}^{-1}(x) &= \dfrac{x - b}{a} \\ \\ f_{11}^{-1}(x) &= \dfrac{x - 1}{1} \\[5pt] f_{12}^{-1}(x) &= \dfrac{x - 2}{1} \\[5pt] f_{21}^{-1}(x) &= \dfrac{x - 1}{2} \\[5pt] f_{22}^{-1}(x) &= \dfrac{x - 2}{2} \end{align*}$


Let’s apply each of these decoding functions to our encoded message [3, 1, 9, 31, 15] and see what they come up with. Remember that in order to represent a message, the results must all be integers between $0$ and $26$ inclusive.


[3, 1, 9, 31, 15]

a, b
    [(3-b)/a, (1-b)/a, (9-b)/a, (31-b)/a, (15-b)/a]

a=1, b=1
    [(3-1)/1, (1-1)/1, (9-1)/1, (31-1)/1, (15-1)/1]
    [2, 0, 8, 30, 14]
    30 is too big
    does not represent a message
 
a=1, b=2
    [(3-2)/1, (1-2)/1, (9-2)/1, (31-2)/1, (15-2)/1]
    [1, -1, 7, 29, 13]
    -1 is too small, 29 is too big
    does not represent a message

a=2, b=1
    [(3-1)/2, (1-1)/2, (9-1)/2, (31-1)/2, (15-1)/2]
    [1, 0, 4, 15, 7]
    message: $\'$a dog$\'$

a=2, b=2
    [(3-2)/2, (1-2)/2, (9-2)/2, (31-2)/2, (15-2)/2]
    [0.5, -0.5, 3.5, 14.5, 6.5]
    contains non-integer entries
    does not represent a message

We conclude that the original message was 'a dog' and that it was encoded using $a=2$ and $b=1.$

Exercises

As usual, be sure to include a variety of tests.

  1. Write a function encode_string(string, a, b) that encodes the string using the linear encoding function $f(x) = ax+b$ and returns the resulting array of numbers.
  2. Write a function decode_numbers(numbers, a, b) that attempts to decode the numbers array under the assumption that the encoding function was $f(x) = ax+b.$ If the numbers represent a message, then return the string corresponding to that message. Otherwise, return False.
  3. Write a script to decode the message [377, 717, 71, 513, 105, 921, 581, 547, 547, 105, 377, 717, 241, 71, 105, 547, 71, 377, 547, 717, 751, 683, 785, 513, 241, 547, 751], given that it was encoded with a linear encoding function $f(x)=ax+b$ where $a$ and $b$ are both integers between $0$ and $100$ inclusive. Be sure to print out all valid messages along with the values of $a$ and $b$ that generated them. Note that although there may be more than one valid message, only one will contain real words.


This post is a chapter in the book Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. Suggested citation: Skycak, J. (2021). Brute Force Search with Linear-Encoding Cryptography. Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. https://justinmath.com/brute-force-search-with-linear-encoding-cryptography/