Sometimes pure shell feels inadequate for the job, and with a little help from Python things become much simpler.
format json
Use the following command:
echo '{"foo": "bar"}' | python -m json.tool
analyze results from timer
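Performance-related work on a system often requires measurements, for example the time spent in a particular part of the code. The timers set up for this print a large amount of formatted data into the log. I used to filter the relevant log lines with shell and then process them in Excel, but that was inefficient, and Excel is not convenient to use in a Linux environment, so I switched to combining shell with Python.
For example, the log contains lines such as: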
timer1: the time 1.0
The data can first be filtered out with the following bash command:
cat testdata.txt | cut -d " " -f 4
The filtered data is then sent through a pipe to a Python script, for example:
cat testdata.txt | cut -d " " -f 4 | python3 analysis.py
The Python script is shown below. Whatever computation is needed on the list of values can be added flexibly; it is simple and direct.
import sys
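Only the first line of the script appears here. As a minimal sketch of what analysis.py might contain, assuming one numeric value per line arrives on standard input (the output of the cut command), the following uses only the standard library. The names `read_values`, `summarize`, and `main` are hypothetical and not taken from the original script.

```python
import io
import statistics
import sys


def read_values(stream):
    """Parse one float per non-empty line, e.g. the output of `cut`."""
    return [float(line) for line in stream if line.strip()]


def summarize(values):
    """Return basic descriptive statistics for a non-empty list of floats."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # The sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


def main(stream):
    """Read values from `stream` and print one statistic per line."""
    values = read_values(stream)
    if not values:
        print("no numeric input", file=sys.stderr)
        return
    for name, value in summarize(values).items():
        print(f"{name}: {value}")


# Demonstration on an in-memory stream standing in for the piped data.
# In analysis.py itself, the last line would instead be: main(sys.stdin)
main(io.StringIO("1.0\n2.5\n4.0\n"))
```

Any further operation on the list (percentiles, outlier filtering, and so on) can be added to `summarize` without touching the shell side of the pipeline.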
In fact, Python's statistical libraries already provide functions that compute these values at once:
import numpy as np
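The listing stops after the import, so the exact function the author used is not shown. As one possible sketch, assuming NumPy is installed, the usual descriptive values can be computed in a few calls. The helper name `describe` and the sample timings are made up for illustration.

```python
import numpy as np


def describe(values):
    """Compute common descriptive statistics with NumPy."""
    data = np.asarray(values, dtype=float)
    # One call returns the 25th, 50th, and 75th percentiles together.
    p25, p50, p75 = np.percentile(data, [25, 50, 75])
    return {
        "count": int(data.size),
        "mean": float(data.mean()),
        "std": float(data.std(ddof=1)),  # ddof=1 gives the sample standard deviation
        "min": float(data.min()),
        "p25": float(p25),
        "median": float(p50),
        "p75": float(p75),
        "max": float(data.max()),
    }


# Hypothetical timer values in seconds.
for name, value in describe([1.0, 1.2, 0.9, 1.1, 3.0]).items():
    print(f"{name}: {value}")
```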
Beyond these basic descriptive statistics, another way to describe the data is a statistical significance test; this video gives a good theoretical background on hypothesis testing.
Assuming the collected values are execution times, you may observe that 50% of them complete within 1 second. However, it is more rigorous to support such a claim with a hypothesis test. For example, take as the null hypothesis that the execution time is larger than a specific threshold, and reject it when the p-value falls below a significance level of 0.01.
A Python library can run the t-test quickly, and it offers flexible parameters for choosing a left-tailed, right-tailed, or two-sided test.
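As a hedged illustration of such a test, the sketch below uses `scipy.stats.ttest_1samp`, whose `alternative` parameter (available in SciPy 1.6 and later) selects the tail. The text does not name the library, so SciPy is an assumption here, as are the helper name `faster_than`, the sample timings, and the threshold. A one-sample t-test also assumes the timings are roughly normally distributed.

```python
from scipy import stats


def faster_than(samples, threshold, alpha=0.01):
    """Left-tailed one-sample t-test.

    H0: the mean execution time is >= threshold.
    H1: the mean execution time is <  threshold.
    Returns (reject_h0, t_statistic, p_value).
    """
    result = stats.ttest_1samp(samples, popmean=threshold, alternative="less")
    return bool(result.pvalue < alpha), float(result.statistic), float(result.pvalue)


# Hypothetical execution times in seconds.
times = [0.90, 0.95, 1.00, 0.92, 0.97, 0.93, 0.96, 0.94]
reject, t_stat, p_value = faster_than(times, threshold=2.0)
print(f"reject H0: {reject}, t = {t_stat:.3f}, p = {p_value:.3g}")
```

Passing `alternative="greater"` gives the right-tailed test, and `alternative="two-sided"` (the default) gives the two-sided test.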