Data cleaning using Python for the Apriori algorithm and association rules
Description
In this assignment, you will use Python’s mlxtend.frequent_patterns module to find the association rules that satisfy given lift, confidence, and support thresholds for a list of transactions.
- Step 1 (35 points): Clean the data by
  - removing rows whose StockCode or Invoice values contain non-digit characters
  - removing rows whose Price values are less than 10
  - removing rows whose Country value is not one of “United Kingdom”, “Italy”, “France”, “Germany”, “Norway”, “Finland”, “Austria”, “Belgium”, “European Community”, “Cyprus”, “Greece”, “Iceland”, “Malta”, “Netherlands”, “Portugal”, “Spain”, “Sweden”, or “Switzerland”
  - removing rows whose Quantity values are negative
  - trimming leading and trailing whitespace from the Description values using Python’s str.strip method
- Step 2 (30 points): Find the frequent itemsets with min_support = 0.01.
- Step 3 (35 points): Find the association rules with confidence greater than 10%. Among them, which rule(s) have the highest lift?