Python/Prime factorization

Topics covered: prime factorization, flowcharts, Python dictionaries

The flowchart for a simple prime factorization algorithm

Here we learn how a Python dictionary can help express an integer as a product of prime numbers. There are two things each student should know about the code we are about to examine:

Lacking the wherewithal to seize upon either of these opportunities, I googled "python prime factorization", and found many[3] codes, and selected the one we shall examine for two reasons:[4]

  1. I wanted a program that counts repeated factors to later write the factored form in wikitext with exponents.[5]
  2. I didn't understand the Python source code and wanted to master it to fully understand how python dictionaries are used.

A good student exercise would be to modify this code so that it runs approximately twice as fast. This is accomplished by considering only odd numbers in the search for prime factors of N.[6] The table below illustrates the present version of the source code's "wasted" efforts to see if 4 and 6 might be prime factors of 1550. The table also illustrates the efficiency of dividing a candidate integer into the number , which soon becomes much smaller than It took only eight (8) steps to factor 1550 into prime numbers.

n k k^2 k^2<n? n/k mod replace factor *
1550 2 4 TRUE 775 0 n: n/k 2
775 2 4 TRUE 387 1 k: k+1
775 3 9 TRUE 258 1 k: k+1
775 4 16 TRUE 193 3 k: k+1
775 5 25 TRUE 155 0 n: n/k 5
155 5 25 TRUE 31 0 n: n/k 5
31 5 25 TRUE 6 1 k: k+1
31 6 36 FALSE - - - 31


Python code edit

## In a python code, the function definitions precede the code. We need only 1 function:

def prime_factorization(N):
##      12=(2**2)*(2**1)*(5**1), or equivalently, 12 = 2**2 * 3 * 5
## The dictionary, dict, expresses k**v as a "key-value" pair.  In other
## words, if we look up the key, k=5, the dictionary will return the
## value, v=1.  In other words: 
## k**v <=> key**value <=> base**exponent (for the factored form of N)
## Begin function:    
    n=N #--------------------# n will change in the while loop (getting smaller.)
    dict = {} #--------------# Start with an empty dictionary
    if n==1: #---------------# The factorization of 1 is a special case.
        return {1: 1} # In other words 1 = 1**1 (but recall that 1 is not prime.)
    k = 2 #------------------# Our first candidate for factorizing any integer.
    while k**2 <= n: #-------# Do the following unless k is "too big":
        if n % k: #----------# . Equivalent to: "If k is not a factor of n:"
            k += 1 #---------# .. increase the value of k by 1 on next attempt.       
        else: #--------------# . Equivalent to "If k is a factor of n:"
            n /= k #---------# .. Replace n by n/k (for next attempt)
            try: #-----------# .. If k is in the dictionary, this attempts to: 
                dict[k] += 1 # ... increase the value of k by 1.
            except KeyError: # .. KeyError occurs if k is not in the dictionary
                dict[k] = 1  # ... Add k to dictionary and set exponent v=1
    # end(while)#------------# (exits while k**2 <= n:)
    if n > 1: #--------------# It can be shown that n is either 1 or prime.
        try: #---------------# . Try to enter n into dictionary   
            dict[n] += 1 #---# .. increase exponent v->v+1 if k in dictinary  
        except KeyError: #---# . If n is not in the dictionary,
            dict[n] = 1      # .. then enter n into dictionary with v=1            
    return dict #------# python talk for ending function

################   End def prime_factorization  |  Begin program  ##########  
#import sys
again=True
while again == 1:
    text="\nEnter a positive integer you wish to factor into prime numbers:\n"
    N=int(input(text))
    dict=prime_factorization(N)
    print("\n"+str(int(N))+"=")
    text="("
    for b in dict.keys():
        text+=str(int(b))+"**"+str(int(dict.get(b)))+") * ("
    print(text.rstrip(" * (")+"\n")
    print("The dictionary is:\n")
    print(dict)
    again=int(input("\nType 1 for another integer, 0 to terminate."))

Discussion edit

With one extra line, a print command will display the dictionary. From the fact that Python decided to print 31 as 31.0, we know that this dictionary is taking the keys (2, 5, 31) to be floating point numbers. Python permits such a casual attitude regarding the mixing number types.[7] This makes it easier to program python, but perhaps more difficult to sort out such misunderstandings after the program has been written.

print(dict)
   {2: 1, 5: 2, 31.0: 1}

Comments edit

Line 15: while k**2 <= n: edit

This is an important line because it permits the termination of the search before one might think. I leave it as an exercise for the student to learn why any value of k larger than the square root of n will never be the next factor of N. It is also left to the student to understand the fact that the time to find the factors can be cut nearly in half if we modify line 17 and increase k by 2 (instead of 1), and why this larger step cannot be taken until after k=2 has been considered.

Lines in the code that might seem confusing edit

The code uses two clever tricks that might seem confusing. I lack the formal training in programming to know whether such cleverness would be perceived as a flaw. I know that programmers are allowed to be clever, and I know that clear programs are better than confusing programs. The fact that these tricks confused me does not necessarily mean they should be avoided.

And the nice thing about both sources of confusion is that they taught me something about python.

Line 16: if n%k might seem backwards edit

Python follows a common practice of interpreting the number 1 to be True and 0 to be False. But Python also accepts any non-zero number as True. Here n%k denotes nmod k, which is the remainder when n is divided by k. If there is no remainder the "if" condition is not satisfied and the code stipulates that the next value of k is attempted. Any number not equal to zero is interpreted as the "if" condition being satisfied, meaning that k is a divisor of n (in that case, we replace n with n/k.)

This might be good coding, but it seems to me that it would have been better to construct it along the lines of "if n%k == 0:" or even "if n%k != 0:".

Lines 21-23 and 29-31: A failed attempt to change the value of a key edit

In this code fragment, we ask if a given value of k has already been identified as a factor. But instead of "asking" if k has been declared a key in the dictionary, an attempt is made to change the value of k. If that attempt fails (because k is not among the dictionary's keys), then k is added to the list of keys, with a value of 1 (a value of 1 indicates that at this point, k, but not k2 has been established to be a factor of n.)

Again, there might be nothing wrong with using a key error like this. But the code would have been easier for me to follow if the if statement simply called the question of whether k was a key in dict.

Sample output with a large number edit

Sometimes you have to wait a few minutes to factor a number this large. But usually, it comes out in just a few seconds. I would imagine that a number that was the product of only two roughly equal prime numbers would take forever because k=k+1 must be repeated a virtually infinite number of times. This number came quickly:

Enter a positive integer you wish to factor into prime numbers:
99283428376482768916467328912876456416375474272263423467896879187289499287374899
 
99283428376482768916467328912876456416375474272263423467896879187289499287374899=
(619**1) * (688**1) * (768**3) * (1024**17) * (64853**1) * (5302548928**1)
 
The dictionary is:
 
 {619: 1, 688: 1, 768: 3, 1024: 17, 64853: 1, 5302548928.0: 1}
 
Type 1 for another integer, 0 to terminate.


DO NOT TRY TO FACTOR A NUMBER THIS LARGE UNLESS YOU ARE WILLING TO INTERUPT YOUR PYTHON PROGRAM.
Keep in mind that the universe is only about   old.[8]

The number shown rounds up to 1080. It is not likely that any computer will ever count that high. If the number to be factored consists of two prime numbers that are close to 1080, the code will iterate up to k coming close to 1040. The inability of any computer to perform that many iterations helps explain how cryptocurrency works. The owner of a cryptocurrency can have all of their information made public by posting something with the properties of the product of two very large prime numbers. The product is made public and anybody with knowledge of one of the products can claim ownership of the funds.

On the other hand, if you lose that information, nobody on earth can retrieve it for you.

  1. Outside of Python a "dictionary" is also called an w:Associative array.
  2. A working Python script is at w:Special:Permalink/1086485306
  3. I suspect that the abundance of such examples is the result of so many educators tasking students with writing source code that performs prime factorization.
  4. The code that follows is discussed at StackOverflow question 15347174 and scientific-python-101.readthedocs
  5. In the wikitext version, I had to add a few lines so that exponents of 1 would not appear. Example:   should instead read   I omitted the feature here to keep the lesson simple.
  6. All prime numbers are odd, except for 2
  7. "Casualness", I believe, lies at the heart of pythonic philosophy.
  8. http://astronomy.nmsu.edu/geas/lectures/lecture02/slide04.html