Pair Distance

You are not logged in.

Please Log In for full access to the web site.
Note that this link will take you to an external site (https://shimmer.mit.edu) to authenticate, and then you will be redirected back to this page.

In this problem, you'll be working with an array representing people's locations on a 2-D square. The array is, as usual, a list of lists. Each element is a string: either the name of a person, or the empty string "".

For example, the array

array = [["Olek", ""],["Raj", ""]]

represents a square in which Olek is standing in the top left, Raj is standing at the bottom left, and no one is standing on the right:

Olek
Raj

Your task is to print the average distance between pairs of different people in the square. "Different" people means that a pair of people like "Raj and Raj" is not valid.

Implementation Notes

  • You may assume that if a person is standing in a cell, they're always standing in the exact center of it1.
  • Each cell in the square has height and width 1 unit. So the above square, for example, has height and width 2.
  • You may assume all string names are unique2.
  • You should use the Euclidian distance (think distance formula).
  • The first line of your program should set up a variable array to hold the input array.
  • You may assume that there are at least two people in the square.3

Examples

In the above example, since Raj and Olek are the only two people on the board, the pair containing them is the only pair of different people. That pair has distance 1 apart, since Raj and Olek stand at the center of cells which are 1 unit apart vertically and 0 units apart horizontally. So the average distance is 1. Your program should print this value.


array = [["Juan", "", "Melinda"],
         ["", "", ""],
         ["Valerie", "", ""]]

represents the square:

JuanMelinda
Valerie

The pairs of different people in this square are: Juan and Melinda, Juan and Valerie, Melinda and Valerie. Their distances are 2, 2, and \sqrt{2^2+2^2} = 2\sqrt{2}, respectively. So the average distance your program should print is \frac{2+2+2\sqrt{2}}{3} = 2.27614237492. (You need only print the result accurate to within 10^{-6}.)

You may upload your file for testing:

A Python Error Occurred:

Error on line 5 of question tag.
    csq_soln="""

NameError: name 'logging' is not defined. Did you forget to import 'logging'

Solutions

Here is a commented solution:

array = [["Juan", "", "Melinda"],
         ["", "", ""],
         ["Valerie", "", ""]]
<p>length = len(array)</p>
<h1>1. Make a list of all the locations, which I'll store as tuples like (row_ix, col_ix)</h1>
<p>locations = []
for row_ix in range(length):
for col_ix in range(length):
if array[row_ix][col_ix] != "": # If this array element isn't ""...
# ...then a person is there, so store their location
locations.append((row_ix, col_ix))</p>
<h1>2. Sum up the pairwise distances between all different locations</h1>
<p>distance_sum = 0
num_pairs = 0
for loc1 in locations:
for loc2 in locations:
if loc1 != loc2: # It's not a pair of two people if both people are the same
dist = ((loc1[0]-loc2[0])**2 + (loc1[1]-loc2[1])**2)**0.5
distance_sum = distance_sum + dist
num_pairs = num_pairs + 1</p>
<h1>3. Compute the average.</h1>
<p>average_dist = distance_sum / num_pairs
print(average_dist)</p>

If you step through this solution carefully, you'll notice that it actually overcounts the <code>distance_sum</code> by a factor of two. That is, for every pair of people Alice and Bob, this solution adds the distance between them twice: once for the pair "Alice and Bob," and once for the pair "Bob and Alice." But since the number of pairs count is also overcounted by this same factor of two, the average is unaffected.

-------

You may have been tempted to explicitly keep track of which pairs you've seen, in order to avoid double counting, like this:

<p>array = [["Juan", "", "Melinda"], ["", "", ""], ["Valerie", "", ""]]</p> <p>length = len(array)</p> <h1>1. Make a list of all the locations</h1> <p>locations = [] for row_ix in range(length): for col_ix in range(length): if array[row_ix][col_ix] != "": locations.append((row_ix, col_ix))</p> <h1>2. Sum up the pairwise distances between all locations</h1> <p>seen_pairs = [] distance_sum = 0 num_pairs = 0 for loc1 in locations: for loc2 in locations: if loc1 != loc2 and not ((loc1, loc2) in seen_pairs or (loc2, loc1) in seen_pairs): dist = ((loc1[0]-loc2[0])**2 + (loc1[1]-loc2[1])**2)**0.5 distance_sum = distance_sum + dist num_pairs = num_pairs + 1 seen_pairs.append((loc1, loc2)) # *</p> <h1>3. Compute the average.</h1> <p>average_dist = distance_sum / num_pairs print(average_dist)</p>

Here the variable `seen_pairs` stores seen pairs in the form `(loc1, loc2)`, where `loc1` and `loc2` are themselves tuples of the form `(row_ix, col_ix)`. Notice that, on the line marked with a star, adding the tuple to the list in either order would achieve the same result, since the conditional above checked for membership of the pair in the list in either order.

This approach is fine, although a bit unnecessarily tedious.

-----
And, finally, here's a solution which is much slower:

<p>array = [["Juan", "", "Melinda"], ["", "", ""], ["Valerie", "", ""]]</p> <p>length = len(array) distance_sum = 0 num_pairs = 0</p> <p>for row_ix1 in range(length): for col_ix1 in range(length):</p>
    # Got one element
    elem1 = array[row_ix1][col_ix1]
    for row_ix2 in range(length):
        for col_ix2 in range(length):

            # Got another element
            elem2 = array[row_ix2][col_ix2]
            if elem1 != "" and elem2 != "": 
                # We've found two people
                if elem1 != elem2: 
                    dist = ((row_ix2-row_ix1)**2 + (col_ix2-col_ix1)**2)**0.5
                    distance_sum = distance_sum + dist
                    num_pairs = num_pairs + 1
<p>average_dist = distance_sum / num_pairs print(average_dist)</p>
Wow, a quadruple for loop! 

</showhide>
<div id="cs_footnotes" style="display:none;"></div>