Time Series Classification with Random Forest (Part 1 - Update)

About seven months ago, I started this thread in response to the comments we received on our paper submitted to Data Mining and Knowledge Discovery. I now realize that the earlier blog post (here) missed a point about feature extraction schemes. That post listed three kinds of features that can be used to train random forests (or supervised learning approaches in general). Here is a fourth item (I may add more in the future):

4- Learned features: There is a large body of work on learning low (or high) dimensional representations of the observations. The simplest approach to learning features is principal component analysis (PCA). Using PCA, we can obtain a low-dimensional representation of our data that preserves the variance to a certain extent. The first few principal components can then be used as features for learning approaches; this is common practice (see the rotation forest paper [1] for an example). PCA in its basic form is linear, so nonlinear extensions such as kernel PCA are also considered. More importantly, deep learning is popular nowadays. The features learned by these approaches can work quite well for time series data mining in general.
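To make the PCA idea concrete, here is a minimal sketch in Python with NumPy. It is only an illustration under my own assumptions, not the method from the paper: the synthetic two-class data, the helper names pca_fit and pca_transform, and the choice of two components are all made up for this example. Each time series is treated as one row, and the scores on the first few principal components become the learned features.

```python
import numpy as np

def pca_fit(X, n_components):
    """Estimate the mean and the top principal directions of X (one series per row)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal directions,
    # ordered by decreasing explained variance.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]

def pca_transform(X, mean, components):
    """Project series onto the principal directions to get low-dimensional features."""
    return (X - mean) @ components.T

# Synthetic data (an assumption for illustration): two classes of noisy
# sine waves with different frequencies, 20 series each, 50 time points.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
class0 = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((20, 50))
class1 = np.sin(4 * np.pi * t) + 0.1 * rng.standard_normal((20, 50))
X = np.vstack([class0, class1])
y = np.array([0] * 20 + [1] * 20)

# Learn the representation and keep the first two components as features.
mean, components, explained = pca_fit(X, n_components=2)
features = pca_transform(X, mean, components)
print(features.shape)  # (40, 2)
```

The features array (with the labels y) could then be passed to any supervised learner, a random forest included. In practice the mean and components should be estimated on the training series only and then applied to the test series with pca_transform, so that no test information leaks into the learned features.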

I will try to continue with Part 2 of this series (which will cover the details of each feature extraction scheme) as soon as I can. Unfortunately, I have been pretty busy lately.

[1] Rodriguez, J.J.; Kuncheva, L.I.; Alonso, C.J., "Rotation Forest: A New Classifier Ensemble Method," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1619-1630, Oct. 2006

Copyright © 2014 mustafa gokce baydogan
