ML.Frequent Pattern Mining
-
class
ddf_library.functions.ml.fpm.
AssociationRules
(confidence=0.5, max_rules=-1) Bases:
ddf_library.bases.ddf_model.ModelDDF
Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It identifies strong rules, here those whose confidence reaches a minimum threshold.
Example: >>> rules = AssociationRules(confidence=0.10).fit_transform(item_set)
Set up all AssociationRules parameters.
Parameters: - confidence – Minimum confidence (default, 0.5);
- max_rules – Maximum number of output rules, -1 for all (default).
-
check_fitted_model
()
-
fit_transform
(data, col_item='items', col_freq='support') Fit the model.
Parameters: - data – DDF;
- col_item – Column with the frequent item set (default, ‘items’);
- col_freq – Column with its support (default, ‘support’);
Returns: DDF with ‘Pre-Rule’, ‘Post-Rule’ and ‘confidence’ columns.
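The confidence filter described above can be sketched in plain Python. This is a minimal illustration that does not use ddf_library: the function name `association_rules`, the dict-based input, and the choice to sort rules by confidence before applying `max_rules` are all assumptions made for the example, not the library's confirmed behavior. It takes frequent item sets with their support and keeps each rule whose confidence, support(items) / support(pre-rule), reaches the threshold.

```python
from itertools import combinations


def association_rules(itemsets, min_confidence=0.5, max_rules=-1):
    """Derive (pre_rule, post_rule, confidence) tuples.

    itemsets: dict mapping frozenset of items -> support (a fraction).
    Hypothetical helper for illustration; not part of ddf_library.
    """
    rules = []
    for items, support in itemsets.items():
        if len(items) < 2:
            continue  # a rule needs a non-empty pre-rule and post-rule
        for size in range(1, len(items)):
            for combo in combinations(sorted(items), size):
                pre = frozenset(combo)
                pre_support = itemsets.get(pre)
                if not pre_support:
                    continue  # pre-rule is not itself a frequent item set
                confidence = support / pre_support
                if confidence >= min_confidence:
                    rules.append((sorted(pre), sorted(items - pre), confidence))
    # Assumption: strongest rules first, then truncate when max_rules >= 0.
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules if max_rules < 0 else rules[:max_rules]


frequent = {
    frozenset({"bread"}): 0.8,
    frozenset({"butter"}): 0.5,
    frozenset({"bread", "butter"}): 0.4,
}
rules = association_rules(frequent, min_confidence=0.6)
```

With this input, only the rule butter -> bread survives, because 0.4 / 0.5 meets the 0.6 threshold while bread -> butter (0.4 / 0.8) does not.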
-
load_model
(filepath) Load a machine learning model from a binary file in storage.
Parameters: filepath – The absolute path name;
Returns: self
Example: >>> ml_model = KMeans().load_model('hdfs://localhost:9000/model')
-
save_model
(filepath, overwrite=True) Save a machine learning model as a binary file in storage.
Parameters: - filepath – The output absolute path name;
- overwrite – Overwrite if file already exists (default, True);
Returns: self
Example: >>> cls = KMeans().fit(dataset, input_col=['col1', 'col2'])
>>> cls.save_model('hdfs://localhost:9000/trained_model')
-
set_max_rules
(count)
-
set_min_confidence
(confidence)
-
class
ddf_library.functions.ml.fpm.
FPGrowth
(min_support=0.5) Bases:
ddf_library.bases.ddf_model.ModelDDF
FPGrowth implements the FP-growth algorithm described in Han et al., Mining frequent patterns without candidate generation, where “FP” stands for frequent pattern; Li et al. (cited below) describe a parallel version. Given a data set of transactions, the first step of FP-growth is to calculate item frequencies and identify frequent items.
Li, Haoyuan et al. PFP: parallel FP-growth for query recommendation. In: Proceedings of the 2008 ACM Conference on Recommender Systems. ACM, 2008. p. 107-114.
Example: >>> fp = FPGrowth(min_support=0.10)
>>> item_set = fp.fit_transform(ddf1, input_col='col_0')
Set up all FPGrowth parameters.
Parameters: min_support – Minimum support value (default, 0.5).
-
check_fitted_model
()
-
fit_transform
(data, input_col) Fit the model and transform the data.
Parameters: - data – DDF;
- input_col – Transactions feature name;
Returns: DDF
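The first step of FP-growth mentioned in the class description, counting item frequencies and keeping the frequent items, can be sketched in plain Python. This is only an illustration under stated assumptions: `frequent_items` is a hypothetical helper, not a ddf_library function, and it assumes `min_support` is a fraction of the number of transactions, which the default of 0.5 suggests but the documentation does not state.

```python
from collections import Counter


def frequent_items(transactions, min_support=0.5):
    """Return {item: support} for items whose support >= min_support.

    Hypothetical helper for illustration; support is assumed to be the
    fraction of transactions that contain the item.
    """
    total = len(transactions)
    if total == 0:
        return {}
    # set(t) ensures an item is counted at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    return {
        item: count / total
        for item, count in counts.items()
        if count / total >= min_support
    }


transactions = [
    ["bread", "butter"],
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["milk"],
]
items = frequent_items(transactions, min_support=0.75)
```

Here bread and milk each appear in 3 of 4 transactions and are kept, while butter (2 of 4) falls below the threshold. The full algorithm would then build an FP-tree from the surviving items, which this sketch does not attempt.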
-
load_model
(filepath) Load a machine learning model from a binary file in storage.
Parameters: filepath – The absolute path name;
Returns: self
Example: >>> ml_model = KMeans().load_model('hdfs://localhost:9000/model')
-
save_model
(filepath, overwrite=True) Save a machine learning model as a binary file in storage.
Parameters: - filepath – The output absolute path name;
- overwrite – Overwrite if file already exists (default, True);
Returns: self
Example: >>> cls = KMeans().fit(dataset, input_col=['col1', 'col2'])
>>> cls.save_model('hdfs://localhost:9000/trained_model')
-