
Question

import argparse
import pandas as pd
import numpy as np
import random 
import pickle
from pathlib import Path
from collections import defaultdict

 Problem 1 - K Means Clustering

A sample dataset has been provided to you in the './data/sample_dataset_kmeans.pickle' path. The centroids are in './data/sample_centroids_kmeans.pickle' and the sample result is in './data/sample_result_kmeans.pickle' path. You can use these to test your code.

Here are the attributes for the dataset. Use this dataset to test your functions.

  • Dataset should load the points in the form of a list of lists where each list item represents a point in the space.
  • An example dataset will have the following structure. If there are 3 points in the dataset, this would appear as follows in the list of lists.

dataset = [ [5,6], [3,5], [2,8] ]

Note:

  • A sample dataset to test your code has been provided in the location "data/sample_dataset_kmeans.pickle". Please maintain this as it would be necessary while grading.
  • Do not change the variable names of the returned values.
  • After calculating each of those values, assign them to the corresponding value that is being returned.
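For testing, the three sample files can be read back with `pickle`. The following is a minimal sketch, assuming each file holds a single pickled Python object (a list of lists for the dataset and centroids, a dictionary for the result). The helper name `load_pickle` and the three path constants are illustrative names, not part of the assignment.

```python
import pickle
from pathlib import Path

# Paths as given in the problem statement.
DATASET_PATH = Path("./data/sample_dataset_kmeans.pickle")
CENTROIDS_PATH = Path("./data/sample_centroids_kmeans.pickle")
RESULT_PATH = Path("./data/sample_result_kmeans.pickle")


def load_pickle(path):
    """Return the object stored in the pickle file at `path` (hypothetical helper)."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)
```

With the files in place, `dataset = load_pickle(DATASET_PATH)` and `centroids = load_pickle(CENTROIDS_PATH)` give the inputs, and `load_pickle(RESULT_PATH)` gives the dictionary to compare against.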

Here is the dataset:

[[46, 33], [26, 21], [23, 96], [82, 20], [25, 42], [29, 99], [30, 64], [57, 51], [12, 68], [25, 9]]
 
Here are the centroids:
 
[[12, 68], [46, 33], [25, 42]]
 
Here are the sample results:
 
{'1': {'cluster1': [[23, 96], [29, 99], [30, 64], [12, 68]], 'cluster2': [[46, 33], [82, 20], [57, 51], [25, 9]], 'cluster3': [[26, 21], [25, 42]], 'centroids': [[23.5, 81.75], [52.5, 28.25], [25.5, 31.5]]}, '2': {'cluster1': [[23, 96], [29, 99], [30, 64], [12, 68]], 'cluster2': [[46, 33], [82, 20], [57, 51]], 'cluster3': [[26, 21], [25, 42], [25, 9]], 'centroids': [[23.5, 81.75], [61.666666666666664, 34.666666666666664], [25.333333333333332, 24.0]]}}
 
This is the function I need to complete:
 

def k_means_clustering(centroids, dataset):

#   Description: Perform k means clustering for 2 iterations given as input the dataset and centroids.
#   Input:
#       1. centroids - A list of lists containing the initial centroids for each cluster. 
#       2. dataset - A list of lists denoting points in the space.
#   Output:
#       1. results - A dictionary where the key is iteration number and store the cluster assignments in the 
#           appropriate clusters. Also, update the centroids list after each iteration.

    result = {
        '1': { 'cluster1': [], 'cluster2': [], 'cluster3': [], 'centroids': []},
        '2': { 'cluster1': [], 'cluster2': [], 'cluster3': [], 'centroids': []}
    }
    
    centroid1, centroid2, centroid3 = centroids[0], centroids[1], centroids[2]
    
    for iteration in range(2):
        pass  # your code here
        
    return result
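
One possible way to fill in the loop is sketched here; it is not the graded reference solution. It assumes squared Euclidean distance for the assignment step, which reproduces the sample results shown earlier. Two further assumptions are not stated in the problem: a tie goes to the lowest-numbered cluster, and a cluster that receives no points keeps its previous centroid. The sketch builds the result dictionary from scratch and leaves the caller's `centroids` list unmodified.

```python
def k_means_clustering(centroids, dataset):
    # Copy the centroids so the caller's list is not modified (assumption).
    current = [list(c) for c in centroids]
    result = {}

    for iteration in range(2):
        clusters = [[] for _ in current]

        # Assignment step: each point joins the cluster of its nearest centroid.
        for point in dataset:
            sq_dists = [
                sum((p - c) ** 2 for p, c in zip(point, centroid))
                for centroid in current
            ]
            # index(min(...)) picks the first minimum, so ties go to the
            # lowest-numbered cluster (assumption).
            clusters[sq_dists.index(min(sq_dists))].append(point)

        # Update step: each centroid becomes the mean of its assigned points.
        # Assumption: an empty cluster keeps its previous centroid.
        current = [
            [sum(coord) / len(members) for coord in zip(*members)] if members else old
            for members, old in zip(clusters, current)
        ]

        entry = {f"cluster{i + 1}": members for i, members in enumerate(clusters)}
        entry["centroids"] = [list(c) for c in current]
        result[str(iteration + 1)] = entry

    return result
```

Calling `k_means_clustering([[12, 68], [46, 33], [25, 42]], dataset)` on the ten sample points can then be compared directly against the sample result dictionary.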

 
Follow-up Question

Unfortunately, I am only allowed to use the following packages in my answer:

import argparse
import pandas as pd
import numpy as np
import pickle
from pathlib import Path
from collections import defaultdict
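
Since NumPy is on the allowed list, the two iterations can be written with array operations alone. The sketch below is one possible approach under stated assumptions: squared Euclidean distance, ties resolved by `np.argmin` (lowest index wins), and an empty cluster keeping its previous centroid. None of these are specified in the problem, and the `n_iterations` parameter is added only for illustration.

```python
import numpy as np


def k_means_clustering(centroids, dataset, n_iterations=2):
    points = np.asarray(dataset, dtype=float)
    current = np.asarray(centroids, dtype=float)
    result = {}

    for iteration in range(1, n_iterations + 1):
        # Squared distances between every point and every centroid,
        # shape (n_points, n_centroids), computed via broadcasting.
        diffs = points[:, None, :] - current[None, :, :]
        labels = np.argmin((diffs ** 2).sum(axis=2), axis=1)

        entry = {}
        updated = current.copy()
        for k in range(len(current)):
            mask = labels == k
            # Store the original points (in dataset order) for this cluster.
            entry[f"cluster{k + 1}"] = [dataset[i] for i in np.flatnonzero(mask)]
            # Assumption: an empty cluster keeps its previous centroid.
            if mask.any():
                updated[k] = points[mask].mean(axis=0)

        current = updated
        entry["centroids"] = current.tolist()
        result[str(iteration)] = entry

    return result
```

The function keeps the `(centroids, dataset)` call signature from the template, so it can be checked against the sample dataset, centroids, and results in the same way.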

 

 

 
