nestor.datasets
Helper function to load excavator toy dataset.
Hodkiewicz, M., and Ho, M. (2016) "Cleaning historical maintenance work order data for reliability analysis" in Journal of Quality in Maintenance Engineering, Vol 22 (2), pp. 146-163.
BscStartDate | Asset | OriginalShorttext | PMType | Cost |
---|---|---|---|---|
initialization of MWO | which excavator this MWO concerns (A, B, C, D, E) | natural language description of the MWO | repair (PM01) or replacement (PM02) | MWO expense (AUD) |
Parameters:
Name | Type | Description | Default |
---|---|---|---|
cleaned | bool | whether to return the original dataset (False) or the dataset with keyword extraction rules applied (True), as described in Hodkiewicz and Ho (2016) | False |
Returns:
Type | Description |
---|---|
pandas.DataFrame | raw data for use in testing nestor and subsequent workflows |
Source code in nestor/datasets/excavators.py
def load_excavators(cleaned=False):
    """
    Helper function to load excavator toy dataset.

    Hodkiewicz, M., and Ho, M. (2016)
    "Cleaning historical maintenance work order data for reliability analysis"
    in Journal of Quality in Maintenance Engineering, Vol 22 (2), pp. 146-163.

    BscStartDate| Asset | OriginalShorttext | PMType | Cost
    --- | --- | --- | --- | ---
    initialization of MWO | which excavator this MWO concerns (A, B, C, D, E)| natural language description of the MWO| repair (PM01) or replacement (PM02) | MWO expense (AUD)

    Args:
        cleaned (bool): whether to return the original dataset (False) or the dataset with
            keyword extraction rules applied (True), as described in Hodkiewicz and Ho (2016)

    Returns:
        pandas.DataFrame: raw data for use in testing nestor and subsequent workflows
    """
    csv_filename = _download_excavators(cleaned=cleaned)
    df = (
        pd.read_csv(
            csv_filename, parse_dates=["BscStartDate"], sep=",", escapechar="\\"
        )
        .astype(
            {
                "Asset": AssetType,
                "OriginalShorttext": pd.StringDtype(),
                "PMType": PMType,
                "Cost": float,
            }
        )
        .rename_axis("ID")
    )
    return df
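
The parsing steps in this function can be tried without downloading the dataset. The following is a minimal sketch, not nestor's own code. It assumes `AssetType` and `PMType` are pandas categorical dtypes with the categories listed in the table above (nestor defines its own versions, which may differ), and it reads two made-up rows from an in-memory CSV instead of the file returned by `_download_excavators`.

```python
import io

import pandas as pd

# Assumption: stand-ins for nestor's AssetType and PMType, modelled here as
# categorical dtypes using the category labels from the column table.
AssetType = pd.CategoricalDtype(categories=["A", "B", "C", "D", "E"])
PMType = pd.CategoricalDtype(categories=["PM01", "PM02"])

# Two illustrative rows in the documented column layout.
# These are invented for the sketch and are not records from the dataset.
sample_csv = io.StringIO(
    "BscStartDate,Asset,OriginalShorttext,PMType,Cost\n"
    "2010-01-05,A,replace hydraulic hose,PM02,150.0\n"
    "2010-02-11,C,inspect bucket teeth,PM01,80.5\n"
)

df = (
    # Same read_csv arguments as in load_excavators.
    pd.read_csv(sample_csv, parse_dates=["BscStartDate"], sep=",", escapechar="\\")
    # Cast each column to the dtype the loader assigns.
    .astype(
        {
            "Asset": AssetType,
            "OriginalShorttext": pd.StringDtype(),
            "PMType": PMType,
            "Cost": float,
        }
    )
    # Name the integer row index "ID".
    .rename_axis("ID")
)

print(df.dtypes)
```

Casting `Asset` and `PMType` to categorical dtypes keeps the set of valid labels attached to the column, so values outside that set become missing rather than passing through silently.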