Python Phrontistery

Friday, May 2, 2008

Project Euler - Problem 002

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

Find the sum of all the even-valued terms in the sequence which do not exceed four million.

Fibonacci is always entertaining, but I detest how most programming courses use it as an example for explaining recursion. It's a waste of resources for something that can be done pretty simply, and even more simple in Python.

One of Python's incredibly magnificent data structures is the list. It's sort of like an array, in which you can get the value of an indexed position in the list by providing the index, but at the same time, it's not as inflexible as an array. And though most languages include a data structure that represents a dynamic list, Python's version is just way beyond cool.

Here's an example of a list:
randomlist = ['A', 5, "llamas"]

The list randomlist now contains three elements; the index of the first element is 0 (just because that's how computers count).

>>> randomlist[0]
'A'
>>> randomlist[1]
5
>>> randomlist[2]
"llamas"

(Notice how Python lets you quote strings using either single or double quotes? Also, it doesn't restrict what you dump into a list, like some other languages where you have to declare the type of the items you'll add to the list.)

So let's see how our professors taught us Fibonacci:

function fib(n):
if n = 1 or n = 2
return n
else
return fib(n-1) + fib(n-2)
end if

So to calculate fib(5), you have to calculate fib(4) and fib(3). But to calculate fib(4), you have to calculate fib(3) and fib(2), so you've wasted time calculating fib(3) twice. The same happens with fib(2), calculated in fib(4) and fib(3) (Yes, fib(2) is just 2, but we're sticking to principle here.)

And as if this weren't bad enough, every time you call a function, you're taking up memory. A function call will reside in an area of memory called the stack until it has returned something. So aside from calculating fib(3) twice, you're taking up twice the memory as well, until those calls get their result.

Now, let's use a much more reasonable way of calculating this:

fiblist = [1, 2]
if n <>
How does Python improve on this? Well, to get the last element of a list, you could get the value of the length of the list, subtract 1 and use that as the index. But Python already anticipated this as something that it should provide, so it takes the hard work out of it. To get the last element of fiblist, use fiblist[-1]. To get the second-to-last element, use fiblist[-2].

Now, how about adding something to the end of a list? Simple. Just use fiblist.append(newelement) to add newelement to the end of the list.

The above rigmarole gives you the list of Fibonacci elements, but how do you get just the even ones, like the problem requires? Well, you still need all of the elements of the sequence to calculate the next one, but we can capture just the even ones as we're generating the list, and add them to a running total, like we did in Problem 001:

while x <>

This will start a block of code that will continue until the value of x is greater than 4 million.

x = fiblist[-1] + fiblist[-2]

To add the last two elements in the list.

if x % 2 = 0:
total = total + x

If x is even, add it to the running total (remember to zero out the total at the beginning!)

fiblist.append(x)

Add x to the end of the list, ready for the next calculation to use it.

It's interesting that the Fibonacci sequence has a pattern in the parity of its elements:

1, 2, 3, 5, 8, 13, 21, 34, 55, ...
O E O O E O O E O

How come? Well, the sum of two odd numbers or two even numbers is always even, and the sum of an even number and an odd number is always odd. See how the even number sticks around just long enough to get the next two numbers to be odd, guaranteeing that just as it leaves the two-element window, the next element to be added is even.

Ain't math grand?

Thursday, May 1, 2008

Project Euler - Problem 001

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 1000.

Well, this is an easy one, but it's a great way to introduce the concept of modulo.

Remember having to do long division in elementary school? Lots of multiplication, subtracting, etc. to get a quotient and a remainder in the end. Most people focused on the quotient as the answer to the problem, while the remainder didn't get a second glance. Well, the modulo reverses that situation, only focusing on the remainder, and pretty much ignoring the quotient.

Most programming languages have a function for modulo, and Python is no different; it uses "%" as the modulo operator. Also, different from your pocket calculator, we want to get the integer portion of the division, chopping off the decimal component. Python's operator for integer division is "//". So let's play with some examples:

>>> 10 // 2
5
>>> 9 // 2
4
>>> 10 % 2
0
>>> 9 % 2
1

So, for any number a, where a // b = c and a % b = d, a = b * c + d. c is the quotient, and d is the modulo.

So how does this solve our problem? Well, if you think about it, if x is a multiple of y, then x % y = 0, meaning there's no remainder when you divide x by y.

Let's write some (pseudo)code:

total = 0
for each number X between 1 and 1000:
if X % 3 = 0 or X % 5 = 0:
total = total + X
end if
end for
print total

Now to convert that to Python, we need to know a few things about the language. I'll provide lines from my solution as we go along:

total = 0

Python uses what's known as "duck typing", from the phrase "If it walks like a duck, and quacks like a duck, it's a duck". So, if it's a number, it won't worry if it's an integer, a float (decimal), a long (really large integers), etc.; in terms of operations, it'll use the best type for the result so that it won't lose information. Also, it's loosely typed, meaning that you don't have to declare a variable as a certain type before storing a value in it. Python automatically assumes you know what you want to do with your variables, so it lets you get away with almost anything; however, this also means that it's your responsibility to remember what kind of value you put into a variable.

for x in xrange(1000):

The looping structure here is a for loop, and every time it enters the loop, it'll take the next number between 1 and 999. The xrange function generates all the required numbers as they are needed. (The range and xrange statements always stop one before the argument, so xrange(1000) will offer 999 as its last value.)

if x % 3 == 0 or x % 5 == 0:

The math has been explained, and the == is not assigning a value to x % 3, but asking the question "Is the result of x % 3 equal to 0?". It'll either be true or false, and based on that, it will continue into the body of the if statement, or jump over it. The or in the middle just establishes that if either of the questions is true, then the body of the if statement should be executed.

total = total + x

This just creates a running total. The new total is equal to the previous total plus the value of x.

print total

This just outputs the value of the total after having run through all values of the range. Hey, we need to know what the result is to input it into Project Euler!

Another particularity of Python is that indentation is used to see what statements are included in the loop and if conditional. So the above looks like:


total = 0
for x in xrange(1000):
    if x % 3 == 0 or x % 5 == 0:
        total = total + x
print total

That may be weird for some people coming from other languages, and it takes some getting used to. I keep wanting to use braces a la Java and C++, but after a while it makes sense, sort of. It's also one of the more endearing qualities of Python; in other languages, the indentation isn't a requirement, so even though indentation helps understand what's going on, some programmers can mess it up without any adverse effects in the code. In Python, you *have* to indent to make your code work, so there's no getting around it.

Since this one is pretty much solved off the bat, no separate solution is provided.

Introduction

Hello! I'm Tony, and I'm going to be playing with Python.

This blog is intended to document my experiments in dealing with Python and learning its intricacies. I'll be talking about math, programming, and pretty much anything that comes to mind during these posts.

Also, I plan on documenting solutions in Python to the problems in Project Euler without explicitly revelaing them. How? The idea is to encrypt the Python code in a zip file, but using a not commonly known, but easily discoverable password. I expect this to deter anyone looking for a cheap and easy way to get a solution, while those who really have the need for one can do a little searching and potentially open the zip file if they need to, and get exposure to some digits in a different language.

For now, the passwords for each file will be the individual digits of the problem number, padded with zeroes up to three digits, written one after another with no spaces in between. So what's the catch? The names for the digits will be in a different written language for each problem, or set of problems. Since the concept of zero was unknown for many dead languages (and my reference for digits in other languages doesn't include zeros), I'll use the English word for "zero", but just backwards ("orez") anytime a zero shows in a puzzle number.

For example, if I say that the password for the zip file for problem 094 is in Swedish, the password would be "orezniofyra", with nio and fyra being the Swedish names for the digits 4 and 9 respectively.

This first post will contain references to the language used for each problem, so it will be updated frequently.

Let's get to Pythoning!!