[HTM Algorithm] A First Example: Online Prediction Framework (OPF)
HTM Python
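This is the first example to run once NuPIC is installed. For the installation tutorial, see the separate post.
Official documentation: http://nupic.docs.numenta.org/1.0.5/quick-start/opf.html
The example uses three files, laid out relative to each other as follows: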
opf
    data
        - gymdata.csv
    examples
        - test.py
    params
        - model.yaml
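As a rough sketch of how this layout turns into file paths, the snippet below resolves the data and parameter files relative to the examples folder, the same way the example script does with os.pardir. The helper name resolve_example_paths is hypothetical and is not part of NuPIC; the snippet only builds path strings and does not touch the disk.

```python
import os


def resolve_example_paths(example_dir):
    """Return the input and params paths relative to the examples folder.

    Hypothetical helper: mirrors the os.pardir logic of the example script.
    """
    # Step up from opf/examples to opf, then down into data/ and params/.
    root = os.path.normpath(os.path.join(example_dir, os.pardir))
    return {
        "input": os.path.join(root, "data", "gymdata.csv"),
        "params": os.path.join(root, "params", "model.yaml"),
    }


paths = resolve_example_paths(os.path.join("opf", "examples"))
for name, path in sorted(paths.items()):
    print("{}: {}".format(name, path))
```

If test.py is moved to another folder, only the example_dir argument changes; the data and params folders are always looked up one level above it.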
Python code
Create a new .py file and copy in the code for this example from the official documentation:
import csv
import datetime
import os
import yaml
from itertools import islice

from nupic.frameworks.opf.model_factory import ModelFactory

_NUM_RECORDS = 3000
_EXAMPLE_DIR = os.path.dirname(os.path.abspath(__file__))
_INPUT_FILE_PATH = os.path.join(_EXAMPLE_DIR, os.pardir, "data", "gymdata.csv")
_PARAMS_PATH = os.path.join(_EXAMPLE_DIR, os.pardir, "params", "model.yaml")


# Create the model from the parameters in params/model.yaml.
def createModel():
    with open(_PARAMS_PATH, "r") as f:
        modelParams = yaml.safe_load(f)
        return ModelFactory.create(modelParams)


# Run the hot gym example on at most numRecords rows.
def runHotgym(numRecords):
    model = createModel()
    model.enableInference({"predictedField": "consumption"})
    # Open the CSV file as fin.
    with open(_INPUT_FILE_PATH) as fin:
        reader = csv.reader(fin)
        # Read the header row (field names), then skip the two metadata rows.
        headers = next(reader)
        next(reader)
        next(reader)
        results = []
        # Process each data row, up to numRecords rows.
        for record in islice(reader, numRecords):
            # Create a dictionary with field names as keys, row values as values.
            modelInput = dict(zip(headers, record))
            # Convert string consumption to float value.
            modelInput["consumption"] = float(modelInput["consumption"])
            # Convert timestamp string to Python datetime.
            modelInput["timestamp"] = datetime.datetime.strptime(
                modelInput["timestamp"], "%m/%d/%y %H:%M")
            # Push the data into the model and get back results.
            result = model.run(modelInput)
            bestPredictions = result.inferences["multiStepBestPredictions"]
            allPredictions = result.inferences["multiStepPredictions"]
            # Confidence values are keyed by prediction value in multiStepPredictions.
            oneStep = bestPredictions[1]
            oneStepConfidence = allPredictions[1][oneStep]
            fiveStep = bestPredictions[5]
            fiveStepConfidence = allPredictions[5][fiveStep]
            result = (oneStep, oneStepConfidence * 100,
                      fiveStep, fiveStepConfidence * 100)
            print("1-step: {:16} ({:4.4}%)\t 5-step: {:16} ({:4.4}%)".format(*result))
            results.append(result)
        return results


if __name__ == "__main__":
    runHotgym(_NUM_RECORDS)
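The lookup of confidence values can be hard to picture without a model at hand, so here is a minimal standalone sketch of it. It assumes the structure implied by the code above: the best predictions are keyed by step, and the full predictions are keyed by step and then by predicted value. The inferences dictionary is a hand-written stand-in with illustrative numbers, not real model output, and best_with_confidence is a hypothetical helper.

```python
# Hand-written stand-in for result.inferences (illustrative values only).
inferences = {
    "multiStepBestPredictions": {1: 43.8, 5: 37.8},
    "multiStepPredictions": {
        1: {43.8: 0.46, 41.5: 0.31, 36.6: 0.23},
        5: {37.8: 0.87, 45.4: 0.13},
    },
}


def best_with_confidence(inferences, step):
    """Return (best prediction, confidence in percent) for one step."""
    best = inferences["multiStepBestPredictions"][step]
    # The confidence is stored under the predicted value itself.
    confidence = inferences["multiStepPredictions"][step][best]
    return best, confidence * 100


for step in (1, 5):
    value, pct = best_with_confidence(inferences, step)
    print("{}-step: {} ({:.1f}%)".format(step, value, pct))
```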
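Add a YAML file
Note: unless you modify the code above, you need to create a folder named params in the parent directory of the .py file, then create a YAML file named model.yaml inside that params folder with the following content: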
model: HTMPrediction
version: 1
aggregationInfo:
  fields:
  - [consumption, mean]
  microseconds: 0
  milliseconds: 0
  minutes: 0
  months: 0
  seconds: 0
  hours: 1
  days: 0
  weeks: 0
  years: 0
predictAheadTime: null
modelParams:
  inferenceType: TemporalMultiStep
  sensorParams:
    verbosity: 0
    encoders:
      consumption:
        fieldname: consumption
        name: consumption
        resolution: 0.88
        seed: 1
        type: RandomDistributedScalarEncoder
      timestamp_timeOfDay:
        fieldname: timestamp
        name: timestamp_timeOfDay
        timeOfDay: [21, 1]
        type: DateEncoder
      timestamp_weekend:
        fieldname: timestamp
        name: timestamp_weekend
        type: DateEncoder
        weekend: 21
    sensorAutoReset: null
  spEnable: true
  spParams:
    inputWidth: 946
    columnCount: 2048
    spVerbosity: 0
    spatialImp: cpp
    globalInhibition: 1
    localAreaDensity: -1.0
    numActiveColumnsPerInhArea: 40
    seed: 1956
    potentialPct: 0.85
    synPermConnected: 0.1
    synPermActiveInc: 0.04
    synPermInactiveDec: 0.005
    boostStrength: 3.0
  tmEnable: true
  tmParams:
    verbosity: 0
    columnCount: 2048
    cellsPerColumn: 32
    inputWidth: 2048
    seed: 1960
    temporalImp: cpp
    newSynapseCount: 20
    initialPerm: 0.21
    permanenceInc: 0.1
    permanenceDec: 0.1
    maxAge: 0
    globalDecay: 0.0
    maxSynapsesPerSegment: 32
    maxSegmentsPerCell: 128
    minThreshold: 12
    activationThreshold: 16
    outputType: normal
    pamLength: 1
  clParams:
    verbosity: 0
    regionName: SDRClassifierRegion
    alpha: 0.1
    steps: '1,5'
    maxCategoryCount: 1000
    implementation: cpp
  trainSPNetOnlyIfRequested: false
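Add a CSV file
Note: unless you modify the code above, you need to create a folder named data in the parent directory of the .py file, then create a CSV file named gymdata.csv inside that data folder with the following content: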
timestamp,consumption
datetime,float
T,
7/2/10 0:00,21.2
7/2/10 1:00,16.4
7/2/10 2:00,4.7
7/2/10 3:00,4.7
7/2/10 4:00,4.6
7/2/10 5:00,23.5
7/2/10 6:00,47.5
7/2/10 7:00,45.4
7/2/10 8:00,46.1
7/2/10 9:00,41.5
7/2/10 10:00,43.4
7/2/10 11:00,43.8
7/2/10 12:00,37.8
7/2/10 13:00,36.6
7/2/10 14:00,35.7
7/2/10 15:00,38.9
7/2/10 16:00,36.2
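The file starts with three non-data rows: the field names, then (as far as I understand the NuPIC file format) the field types and the special flags. The sketch below shows how such a file is read and converted, using the same csv, islice and strptime calls as the example script. It is standalone and needs no NuPIC; read_records and SAMPLE are hypothetical names, and the sample is the first few rows of the data above.

```python
import csv
import datetime
import io
from itertools import islice

# First rows of gymdata.csv, inlined so the sketch is self-contained.
SAMPLE = u"""timestamp,consumption
datetime,float
T,
7/2/10 0:00,21.2
7/2/10 1:00,16.4
7/2/10 2:00,4.7
"""


def read_records(stream, num_records):
    """Parse up to num_records data rows into typed dictionaries."""
    reader = csv.reader(stream)
    headers = next(reader)  # row 1: field names
    next(reader)            # row 2: skipped (assumed to hold field types)
    next(reader)            # row 3: skipped (assumed to hold special flags)
    records = []
    for row in islice(reader, num_records):
        record = dict(zip(headers, row))
        record["consumption"] = float(record["consumption"])
        record["timestamp"] = datetime.datetime.strptime(
            record["timestamp"], "%m/%d/%y %H:%M")
        records.append(record)
    return records


records = read_records(io.StringIO(SAMPLE), 2)
for record in records:
    print(record["timestamp"].isoformat(), record["consumption"])
```

Because islice stops after num_records rows, the third data row of the sample is never read here.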
Run
From the directory containing the .py file, run the following command (adjust the file name to match yours):
$ sudo python test.py
Results
After running the script, the console shows output like the following:
1-step: 21.2 (100.0%) 5-step: 21.2 (100.0%)
1-step: 16.4 (99.8%) 5-step: 16.4 (99.8%)
1-step: 4.7 (99.6%) 5-step: 4.7 (99.6%)
1-step: 4.7 (99.6%) 5-step: 4.7 (99.6%)
1-step: 4.6 (99.4%) 5-step: 4.6 (99.4%)
1-step: 23.5 (99.4%) 5-step: 23.5 (99.4%)
1-step: 47.5 (99.21%) 5-step: 47.5 (99.21%)
1-step: 45.4 (99.06%) 5-step: 45.4 (99.06%)
1-step: 46.1 (98.87%) 5-step: 46.1 (98.87%)
1-step: 41.5 (98.87%) 5-step: 41.5 (98.87%)
1-step: 43.4 (98.68%) 5-step: 43.4 (98.68%)
1-step: 43.8 (33.89%) 5-step: 43.8 (90.87%)
1-step: 43.8 (46.41%) 5-step: 37.8 (87.24%)
1-step: 36.6 (78.89%) 5-step: 45.61 (46.18%)
1-step: 45.61 (40.95%) 5-step: 47.5 (46.79%)
1-step: 45.61 (87.64%) 5-step: 43.8 (95.56%)
1-step: 38.9 (50.13%) 5-step: 43.8 (16.84%)
Common errors and fixes
- Make sure the Python version is 2.x
If you see the following error:
Traceback (most recent call last):
File "test.py", line 4, in <module>
import yaml
ImportError: No module named yaml
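The yaml package may simply not be installed, but it is more likely that the command needs sudo. For example, if running
python test.py
produces the error above, try sudo python test.py instead.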
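To tell a missing package apart from an environment problem, one option is to check which interpreter is running and whether it can see the modules. The sketch below is only a diagnostic aid under that assumption; module_available is a hypothetical helper. Running it once with and once without sudo should show whether the two environments differ.

```python
import sys


def module_available(name):
    """Return True if this interpreter can import the named module."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


print("interpreter: {}".format(sys.executable))
for name in ("yaml", "nupic"):
    status = "found" if module_available(name) else "MISSING"
    print("{}: {}".format(name, status))
```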
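A small modification and test
Reading the Python source, I noticed that it only prints the 1-step and 5-step predictions, so I wanted to print the 3-step prediction as well.
Edit the params/model.yaml file: find the 1,5 entry and change it to 1,3,5. The relevant section then reads: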
  clParams:
    verbosity: 0
    regionName: SDRClassifierRegion
    alpha: 0.1
    steps: '1,3,5'
    maxCategoryCount: 1000
    implementation: cpp
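Then edit the Python code: find oneStep and fiveStep and define a variable threeStep in the same way. The procedure is identical; only the index (the dictionary key) changes, and the output line needs a matching change. The resulting code is: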
oneStep = bestPredictions[1]
oneStepConfidence = allPredictions[1][oneStep]
threeStep = bestPredictions[3]
threeStepConfidence = allPredictions[3][threeStep]
fiveStep = bestPredictions[5]
fiveStepConfidence = allPredictions[5][fiveStep]
result = (oneStep, oneStepConfidence * 100,
          threeStep, threeStepConfidence * 100,
          fiveStep, fiveStepConfidence * 100)
print("1-step: {:16} ({:4.4}%)\t 3-step: {:16} ({:4.4}%)\t 5-step: {:16} ({:4.4}%)".format(*result))
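Adding one variable pair per step gets repetitive, so a possible generalization is to derive everything from the steps string itself. The sketch below is one way to do that, not the approach used in the official example: parse_steps, build_line_format and flatten_result are hypothetical helpers, and the prediction dictionaries are hand-written stand-ins with illustrative values.

```python
def parse_steps(steps):
    """Turn a steps string such as '1,3,5' into [1, 3, 5]."""
    return [int(s) for s in steps.split(",")]


def build_line_format(steps):
    """Build the output format string with one column pair per step."""
    return "\t ".join(
        "{}-step: {{:16}} ({{:4.4}}%)".format(step) for step in steps)


def flatten_result(best_predictions, all_predictions, steps):
    """Return (value, confidence %, value, confidence %, ...) in step order."""
    row = []
    for step in steps:
        best = best_predictions[step]
        row.extend([best, all_predictions[step][best] * 100])
    return tuple(row)


# Stand-ins for bestPredictions / allPredictions (illustrative values only).
best = {1: 43.8, 3: 36.6, 5: 37.8}
everything = {1: {43.8: 0.5}, 3: {36.6: 0.25}, 5: {37.8: 0.75}}

steps = parse_steps("1,3,5")
row = flatten_result(best, everything, steps)
print(build_line_format(steps).format(*row))
```

With this shape, changing the steps string is the only edit needed on the Python side, provided it matches the steps value in model.yaml.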
The test output is as follows:
The analysis will be written up in a separate blog post.
Smileyan
November 8, 2019, 20:58